  • Letter

A probability-based approach for high-throughput protein phosphorylation analysis and site localization

Abstract

Data analysis and interpretation remain major logistical challenges when attempting to identify large numbers of protein phosphorylation sites by nanoscale reverse-phase liquid chromatography/tandem mass spectrometry (LC-MS/MS) (Supplementary Figure 1 online). In this report we address challenges that are often only addressable by laborious manual validation, including data set error, data set sensitivity and phosphorylation site localization. We provide a large-scale phosphorylation data set with a measured error rate as determined by the target-decoy approach, we demonstrate an approach to maximize data set sensitivity by efficiently distracting incorrect peptide spectral matches (PSMs), and we present a probability-based score, the Ascore, that measures the probability of correct phosphorylation site localization based on the presence and intensity of site-determining ions in MS/MS spectra. We applied our methods in a fully automated fashion to nocodazole-arrested HeLa cell lysate where we identified 1,761 nonredundant phosphorylation sites from 491 proteins with a peptide false-positive rate of 1.3%.
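The target-decoy estimate mentioned in the abstract can be illustrated with a minimal sketch. It assumes a composite search in which each accepted PSM is flagged as a target or decoy hit, and that incorrect matches fall on target and decoy sequences about equally often, so the false-positive count is roughly twice the decoy count. The `PSM` class, its fields, and the score threshold are hypothetical names chosen for illustration and are not taken from the article.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PSM:
    """Hypothetical record for one peptide spectral match."""
    peptide: str
    score: float
    is_decoy: bool  # True if the match came from the decoy half of the database


def estimate_false_positive_rate(psms: Iterable[PSM], score_threshold: float) -> float:
    """Estimate the false-positive rate among PSMs at or above a score threshold.

    Assumption: wrong matches land on decoy and target sequences with equal
    probability, so total errors are approximated as 2 * (decoy hits).
    """
    accepted = [p for p in psms if p.score >= score_threshold]
    if not accepted:
        return 0.0
    decoy_hits = sum(1 for p in accepted if p.is_decoy)
    # Cap at 1.0 because the doubling can overshoot on very small samples.
    return min(1.0, 2 * decoy_hits / len(accepted))


if __name__ == "__main__":
    example = [
        PSM("PEPTIDEA", 3.1, False),
        PSM("PEPTIDEB", 2.8, False),
        PSM("PEPTIDEC", 2.5, False),
        PSM("EDITPEPA", 2.2, True),
        PSM("EDITPEPB", 1.0, True),
    ]
    print(estimate_false_positive_rate(example, score_threshold=2.0))
```

Raising the threshold trades sensitivity for a lower estimated error, which is the balance the filtering criteria in a large-scale data set have to strike.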

Figure 1: Composite target/decoy database searching strategy provides an accurate estimate of false-positive rates for large data sets by knowingly distracting fifty percent of the error.
Figure 2: Establishing a low false-positive rate for large-scale phosphorylation data sets.
Figure 3: Resolving ambiguity in phosphorylation site localization.
Figure 4: Sequest and Mascot can fail to provide proper phosphorylation site placement.
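A probability-based localization score of the kind the abstract describes can be sketched as follows. This is not the published Ascore implementation: it only illustrates one common way to turn matched site-determining ions into a score, assuming a cumulative binomial model in which each predicted ion matches a peak by chance with probability `p`, and taking the localization score as the difference between the two best candidate sites. The function names, parameters, and example values are all assumptions made for illustration.

```python
import math


def binomial_tail(matched: int, trials: int, p: float) -> float:
    """Return P(X >= matched) for X ~ Binomial(trials, p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0 <= matched <= trials:
        raise ValueError("matched must lie between 0 and trials")
    tail = sum(
        math.comb(trials, k) * p**k * (1.0 - p) ** (trials - k)
        for k in range(matched, trials + 1)
    )
    return min(1.0, tail)  # guard against floating-point overshoot


def match_score(matched: int, trials: int, p: float) -> float:
    """Convert the chance probability of the observed matches to -10*log10(P)."""
    return -10.0 * math.log10(binomial_tail(matched, trials, p))


def localization_score(matched_best: int, matched_runner_up: int,
                       n_site_ions: int, p: float) -> float:
    """Score difference between the best and second-best candidate sites.

    Only site-determining ions (those that differ between the two candidate
    placements) are counted; n_site_ions is assumed equal for both candidates.
    """
    return (match_score(matched_best, n_site_ions, p)
            - match_score(matched_runner_up, n_site_ions, p))


if __name__ == "__main__":
    # Hypothetical case: 4 site-determining ions per candidate; the best site
    # matches all 4, the runner-up matches 1; chance match probability 0.05.
    print(round(localization_score(4, 1, 4, 0.05), 2))
```

A larger difference means the spectrum supports one placement much more strongly than the alternative; a difference near zero means the site-determining ions do not distinguish the two candidate sites.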

Acknowledgements

We thank David Chiang, James Candlin and the software developers at Sage-N-Research for early access to Sequest-Sorcerer and on-the-fly peptide reversal within the Sequest algorithm. We thank P. Everley, C. Bakalarski and B. Faherty for in-house software development and D. Schwartz for motif analysis. HeLa cell lysate was generously provided by M. Rape. This work was supported in part by grants from the National Institutes of Health (HG03456 and GM67945).

Author information

Contributions

S.A.B. conducted all experiments, developed and implemented the algorithms, and performed the data analysis. J.V. and S.A.G. provided analytical expertise for SCX chromatography and mass spectrometry. J.R. provided synthetic peptide libraries and immunoprecipitation data. S.P.G. provided overall experimental design and support.

Corresponding author

Correspondence to Steven P. Gygi.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Fig. 1

General strategy for the large-scale analysis of protein phosphorylation. (PDF 757 kb)

Supplementary Fig. 2

Determining an effective search space using the target/decoy strategy. (PDF 863 kb)

Supplementary Fig. 3

Residue composition from data sets of known phosphorylation sites. (PDF 1768 kb)

Supplementary Fig. 4

Comparison of the Ascore versus Sequest scoring criteria for site localization. (PDF 3070 kb)

Supplementary Fig. 5

Comparison of the Ascore versus Mascot scoring criteria for site localization. (PDF 3256 kb)

Supplementary Fig. 6

Sequence logos (http://weblogo.berkeley.edu/) of the motifs identified in this large-scale data set, shown in Supplementary Table 1, with an Ascore > 15. (PDF 978 kb)

Supplementary Fig. 7

Biological processes of 362 of the 491 proteins identified in this experiment. (PDF 622 kb)

Supplementary Fig. 8

SDS-PAGE gel used in this experiment. (PDF 5115 kb)

Supplementary Table 1

Filtering criteria for the entire experiment. (PDF 1053 kb)

Supplementary Table 2

2,836 identified phosphopeptides from the entire experiment. (XLS 1649 kb)

Supplementary Table 3

Synthetic peptide libraries. (XLS 815 kb)

Supplementary Table 4

Immunoprecipitation experiments. (XLS 495 kb)

About this article

Cite this article

Beausoleil, S., Villén, J., Gerber, S. et al. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat Biotechnol 24, 1285–1292 (2006). https://doi.org/10.1038/nbt1240
