Calculating absolute and relative protein abundance from mass spectrometry-based protein expression data

Abstract

Mass spectrometry (MS)-based shotgun proteomics allows protein identifications even in complex biological samples. Protein abundances can then be estimated from the counts of tandem MS (MS/MS) spectra attributable to each protein, provided one accounts for differential MS detectability of contributing peptides. We developed a method, APEX, which calculates Absolute Protein EXpression levels based upon learned correction factors, MS/MS spectral counts and each protein's probability of correct identification. This protocol describes APEX-based calculations in three parts. (i) Using training data, peptide sequences and their sequence properties, a model is built to estimate MS detectability (Oi) for any given protein. (ii) Absolute protein abundances are calculated from spectral counts, identification probabilities and the learned Oi-values. (iii) Simple statistics allow calculation of differential expression in two distinct biological samples, i.e., measuring relative protein abundances. APEX-based protein abundances span 3–4 orders of magnitude and are applicable to mixtures of 100s to 1,000s of proteins.
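Parts (ii) and (iii) can be illustrated with a minimal sketch. The abstract does not state the formulas, so the forms below are assumptions: each protein's spectral count n_i is weighted by its identification probability p_i and divided by its detectability O_i, the results are normalized across all proteins, and the fractions are scaled by an assumed total number of molecules. The differential-expression test is shown as a two-proportion Z-score on spectral-count fractions, which is one common choice of "simple statistics" and may differ in detail from the protocol. The function names, argument names and example values are hypothetical.

```python
import math


def apex_abundances(proteins, total_molecules=1.0):
    """Estimate abundances from spectral counts (assumed APEX-style form).

    proteins: dict mapping protein name -> (n_i, p_i, O_i), where
      n_i = observed MS/MS spectral count,
      p_i = probability that the protein identification is correct,
      O_i = learned MS detectability (expected peptide observations).
    total_molecules: assumed scaling constant (e.g. molecules per cell);
      the default of 1.0 returns fractional abundances.
    """
    # Correct each count for identification confidence and detectability.
    weighted = {name: n * p / o for name, (n, p, o) in proteins.items()}
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("no spectra observed for any protein")
    # Normalize across the sample, then scale to the chosen total.
    return {name: total_molecules * w / total for name, w in weighted.items()}


def z_score(n1, total1, n2, total2):
    """Two-proportion Z-score for one protein in two samples (assumed test).

    n1, n2: spectral counts of the protein in samples 1 and 2.
    total1, total2: total spectral counts in samples 1 and 2.
    """
    f1 = n1 / total1
    f2 = n2 / total2
    # Pooled fraction under the null hypothesis of equal abundance.
    f0 = (n1 + n2) / (total1 + total2)
    se = math.sqrt(f0 * (1 - f0) / total1 + f0 * (1 - f0) / total2)
    if se == 0:
        return 0.0  # protein absent (or all spectra) in both samples
    return (f1 - f2) / se


if __name__ == "__main__":
    # Hypothetical inputs: (spectral count, identification probability, O_i).
    sample = {"protA": (10, 1.0, 2.0), "protB": (30, 0.5, 3.0)}
    print(apex_abundances(sample, total_molecules=1000))
    print(z_score(30, 100, 10, 100))
```

In this sketch a protein with many spectra but high detectability can end up with the same estimated abundance as a protein with fewer spectra and low detectability, which is the correction the abstract describes.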

Figure 1: APEX pipeline—overview.
Figure 2: Preparation of input files.
Figure 3: Use of WEKA.

Acknowledgements

C.V. acknowledges support by the International Human Frontier Science Program. We thank John Braisted and Srilatha Kuntumalla from JCVI for many useful discussions regarding the APEX calculations. This work was supported by grants from the Welch (F-1515) and Packard Foundations, the National Science Foundation and National Institutes of Health.

Author information

Corresponding authors

Correspondence to Christine Vogel or Edward M Marcotte.

About this article

Cite this article

Vogel, C., Marcotte, E. Calculating absolute and relative protein abundance from mass spectrometry-based protein expression data. Nat Protoc 3, 1444–1451 (2008). https://doi.org/10.1038/nprot.2008.132

  • DOI: https://doi.org/10.1038/nprot.2008.132
