Prediction of high-responding peptides for targeted protein assays by mass spectrometry

Fusaro, Vincent A; Mani, D R; Mesirov, Jill P; Carr, Steven A

doi:10.1038/nbt.1524

Article
Published: 25 January 2009


Nature Biotechnology volume 27, pages 190–198 (2009)


Abstract

Protein biomarker discovery produces lengthy lists of candidates that must subsequently be verified in blood or other accessible biofluids. Use of targeted mass spectrometry (MS) to verify disease- or therapy-related changes in protein levels requires the selection of peptides that are quantifiable surrogates for proteins of interest. Peptides that produce the highest ion-current response (high-responding peptides) are likely to provide the best detection sensitivity. Identification of the most effective signature peptides, particularly in the absence of experimental data, remains a major resource constraint in developing targeted MS–based assays. Here we describe a computational method that uses protein physicochemical properties to select high-responding peptides and demonstrate its utility in identifying signature peptides in plasma, a complex proteome with a wide range of protein concentrations. Our method, which employs a Random Forest classifier, facilitates the development of targeted MS–based assays for biomarker verification or any application where protein levels need to be measured.
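As a rough illustration of the kind of input such a classifier might consume, the sketch below turns a peptide sequence into a small vector of physicochemical features. It is not the authors' ESP code. The Kyte-Doolittle hydropathy scale, the feature names, the function `peptide_features`, and the two example sequences are all assumptions chosen for illustration; the paper's own property set is the one listed in Supplementary Table 1.

```python
from statistics import mean
from typing import Dict

# Kyte-Doolittle hydropathy values per amino acid. This scale is an
# illustrative assumption, not a property set taken from the paper.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(peptide: str) -> Dict[str, float]:
    """Return a few simple physicochemical features for one peptide.

    Hypothetical helper: the feature names are made up for this sketch.
    """
    residues = peptide.strip().upper()
    if not residues:
        raise ValueError("peptide sequence is empty")
    unknown = sorted(set(residues) - set(KYTE_DOOLITTLE))
    if unknown:
        raise ValueError(f"unsupported residue(s): {''.join(unknown)}")

    hydropathy = [KYTE_DOOLITTLE[aa] for aa in residues]
    return {
        "length": float(len(residues)),
        "mean_hydropathy": mean(hydropathy),
        "total_hydropathy": sum(hydropathy),
        # Count of basic residues (Lys, Arg, His).
        "n_basic": float(sum(aa in "KRH" for aa in residues)),
    }


if __name__ == "__main__":
    # Arbitrary example sequences, used only to show the output shape.
    for seq in ("LVNEVTEFAK", "AEFAEVSK"):
        print(seq, peptide_features(seq))
```

Feature vectors of this shape, paired with labels marking which peptides gave a high response in training data, are what a classifier such as a Random Forest would be fit on. The trained model could then rank the candidate peptides of a new protein.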


**Figure 1: ESP application and model development overview.**

**Figure 2: ESP predictor validation and method comparison.**

**Figure 3: ESP predictions translate into experimentally validated MRM peptides.**

**Figure 4: Analysis of important physicochemical properties in predicting high-responding peptides.**



Acknowledgements

We thank the National Cancer Institute (NCI) Clinical Proteomic Technology Assessment in Cancer Program (NCI-CPTAC, http://proteomics.cancer.gov/programs/CPTAC/) for providing samples of yeast lysate and raw MS data generated by the CPTAC centers. We thank Rushdy Ahmad, Kathy Do, Amy Ham, Emily Rudomin, and Shao-En Ong for MS data generation, and Hasmik Keshishian and Terri Addona for generating the lists of validated MRM peptides. We also thank Shao-En Ong, Jacob Jaffe, Karl Clauser, Eric Kuhn, Pablo Tamayo, and Nick Patterson for helpful discussions. We would like to thank the reviewers for their insightful comments. This work was supported in part by grants to S.A.C. from the National Institutes of Health (grant 1U24 CA126476, as part of the NCI's Clinical Proteomic Technologies Assessment in Cancer Program), the National Heart, Lung, and Blood Institute (U01-HL081341), and The Women's Cancer Research Fund; to J.P.M. from the National Science Foundation and the National Institutes of Health (NIGMS and NCI); and to D.R.M. from the National Institutes of Health (grant R01 CA126219, as part of NCI's Clinical Proteomic Technologies for Cancer Program).

Author information

Authors and Affiliations

Broad Institute of Massachusetts Institute of Technology and Harvard, 7 Cambridge Center, Cambridge, 02142, Massachusetts, USA
Vincent A Fusaro, D R Mani, Jill P Mesirov & Steven A Carr
Bioinformatics Program, Boston University, 24 Cummington Street, Boston, 02215, Massachusetts, USA
Vincent A Fusaro


Corresponding authors

Correspondence to Jill P Mesirov or Steven A Carr.

Supplementary information

Supplementary Figures 1–7, Methods, Data (PDF 483 kb)

Supplementary Table 1

Ranked list of 550 physicochemical properties (XLS 77 kb)

Supplementary Table 2

Validated MRM peptides (XLS 49 kb)

Supplementary Source Code (ZIP 84009 kb)


About this article

Cite this article

Fusaro, V., Mani, D., Mesirov, J. et al. Prediction of high-responding peptides for targeted protein assays by mass spectrometry. Nat Biotechnol 27, 190–198 (2009). https://doi.org/10.1038/nbt.1524


Received: 17 October 2008
Accepted: 03 January 2009
Published: 25 January 2009
Issue Date: February 2009
DOI: https://doi.org/10.1038/nbt.1524
