A probability-based approach for high-throughput protein phosphorylation analysis and site localization

Beausoleil, Sean A; Villén, Judit; Gerber, Scott A; Rush, John; Gygi, Steven P

doi:10.1038/nbt1240

Letter
Published: 10 September 2006

Nature Biotechnology volume 24, pages 1285–1292 (2006)

Abstract

Data analysis and interpretation remain major logistical challenges when attempting to identify large numbers of protein phosphorylation sites by nanoscale reverse-phase liquid chromatography/tandem mass spectrometry (LC-MS/MS) (Supplementary Figure 1 online). In this report we address challenges that are often only addressable by laborious manual validation, including data set error, data set sensitivity and phosphorylation site localization. We provide a large-scale phosphorylation data set with a measured error rate as determined by the target-decoy approach, we demonstrate an approach to maximize data set sensitivity by efficiently distracting incorrect peptide spectral matches (PSMs), and we present a probability-based score, the Ascore, that measures the probability of correct phosphorylation site localization based on the presence and intensity of site-determining ions in MS/MS spectra. We applied our methods in a fully automated fashion to nocodazole-arrested HeLa cell lysate where we identified 1,761 nonredundant phosphorylation sites from 491 proteins with a peptide false-positive rate of 1.3%.
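The target-decoy error estimate mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used convention for composite target/decoy searches, in which an incorrect match is equally likely to hit a target or a decoy sequence, so the number of false positives is taken as twice the number of decoy hits. The function name and the counts in the usage note are hypothetical and are not taken from the article.

```python
def estimate_false_positive_rate(n_target: int, n_decoy: int) -> float:
    """Estimate the false-positive rate of accepted peptide spectral matches.

    Assumption: the search used a composite target/decoy database, and an
    incorrect match lands on a decoy sequence about half of the time. Each
    decoy hit therefore stands in for one undetected incorrect target hit.
    """
    if n_target < 0 or n_decoy < 0:
        raise ValueError("counts must be non-negative")
    total = n_target + n_decoy
    if total == 0:
        raise ValueError("no accepted matches to evaluate")
    # Twice the decoy count approximates all incorrect matches in the set.
    return min(1.0, 2.0 * n_decoy / total)
```

For example, a hypothetical filtered set of 9,935 target and 65 decoy matches gives `estimate_false_positive_rate(9935, 65)` of 0.013, that is 1.3%. These counts are chosen only to illustrate the arithmetic.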

**Figure 1: Composite target/decoy database searching strategy provides an accurate estimate of false-positive rates for large data sets by knowingly distracting fifty percent of the error.**

**Figure 2: Establishing a low false-positive rate for large-scale phosphorylation data sets.**

**Figure 3: Resolving ambiguity in phosphorylation site localization.**

**Figure 4: Sequest and Mascot can fail to provide proper phosphorylation site placement.**
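As a rough illustration of how a probability-based localization score of this kind can be computed, the sketch below scores two candidate phosphorylation sites by the cumulative binomial probability of matching their site-determining ions by chance, then reports the difference between the two scores. It is a simplified sketch, not the authors' implementation: the function names, the -10·log10(P) transformation, and the chance-match probability of peak depth / 100 are assumptions that are not stated in the text above.

```python
import math


def cumulative_binomial(n_matched: int, n_ions: int, p: float) -> float:
    """Return P(X >= n_matched) for X ~ Binomial(n_ions, p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0 <= n_matched <= n_ions:
        raise ValueError("n_matched must be between 0 and n_ions")
    prob = sum(
        math.comb(n_ions, k) * p**k * (1.0 - p) ** (n_ions - k)
        for k in range(n_matched, n_ions + 1)
    )
    # Guard against floating-point sums that drift slightly above 1.
    return min(prob, 1.0)


def ion_score(n_matched: int, n_ions: int, p: float) -> float:
    """Convert the chance-match probability into a -10*log10(P) score."""
    return -10.0 * math.log10(cumulative_binomial(n_matched, n_ions, p))


def localization_score(matched_best: int, matched_second: int,
                       n_site_ions: int, peak_depth: int) -> float:
    """Score difference between the two best candidate sites.

    Assumption: peak_depth is the number of peaks kept per 100 m/z window,
    so the chance of a random match per ion is taken as peak_depth / 100.
    """
    p = peak_depth / 100.0
    best = ion_score(matched_best, n_site_ions, p)
    second = ion_score(matched_second, n_site_ions, p)
    return best - second
```

Under these assumptions, two candidate sites that match the same number of site-determining ions score a difference of zero, which reflects an ambiguous localization, whereas a larger difference indicates stronger evidence for the first site.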

Acknowledgements

We thank David Chiang, James Candlin and the software developers at Sage-N-Research for early access to Sequest-Sorcerer and on-the-fly peptide reversal within the Sequest algorithm. We thank P. Everley, C. Bakalarski and B. Faherty for in-house software development and D. Schwartz for motif analysis. HeLa cell lysate was generously provided by M. Rape. This work was supported in part by grants from the National Institutes of Health (HG03456 and GM67945).

Author information

Authors and Affiliations

Department of Cell Biology, Harvard Medical School, 240 Longwood Ave., Boston, 02115, Massachusetts, USA
Sean A Beausoleil, Judit Villén & Steven P Gygi
Department of Genetics and Norris Cotton Cancer Center, Lebanon, 03755, New Hampshire, USA
Scott A Gerber
Cell Signaling Technology, Inc., Beverly, 01915, Massachusetts, USA
John Rush

Contributions

S.A.B. conducted all experiments, carried out algorithm development and implementation, and data analysis. J.V. and S.A.G. provided analytical expertise for SCX chromatography and mass spectrometry. J.R. provided synthetic peptide libraries and immunoprecipitation data. S.P.G. provided overall experimental design and support.

Corresponding author

Correspondence to Steven P Gygi.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Fig. 1

General strategy for the large-scale analysis of protein phosphorylation. (PDF 757 kb)

Supplementary Fig. 2

Determining an effective search space using the target/decoy strategy. (PDF 863 kb)

Supplementary Fig. 3

Residue composition from data sets of known phosphorylation sites. (PDF 1768 kb)

Supplementary Fig. 4

Comparison of the Ascore versus Sequest scoring criteria for site localization. (PDF 3070 kb)

Supplementary Fig. 5

Comparison of the Ascore versus Mascot scoring criteria for site localization. (PDF 3256 kb)

Supplementary Fig. 6

Sequence logos (http://weblogo.berkeley.edu/) of the motifs identified in this large-scale data set shown in Supplementary Table 1 with an Ascore > 15. (PDF 978 kb)

Supplementary Fig. 7

Biological processes of 362 of the 491 proteins identified in this experiment. (PDF 622 kb)

Supplementary Fig. 8

SDS-PAGE gel used in this experiment. (PDF 5115 kb)

Supplementary Table 1

Filtering criteria for the entire experiment. (PDF 1053 kb)

Supplementary Table 2

2,836 identified phosphopeptides from the entire experiment. (XLS 1649 kb)

Supplementary Table 3

Synthetic peptide libraries. (XLS 815 kb)

Supplementary Table 4

Immunoprecipitation experiments. (XLS 495 kb)

About this article

Cite this article

Beausoleil, S., Villén, J., Gerber, S. et al. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat Biotechnol 24, 1285–1292 (2006). https://doi.org/10.1038/nbt1240

Received: 25 May 2006
Accepted: 13 July 2006
Published: 10 September 2006
Issue Date: October 2006
DOI: https://doi.org/10.1038/nbt1240

