  • Resource

A large synthetic peptide and phosphopeptide reference library for mass spectrometry–based proteomics

Abstract

We present a peptide library and data resource of >100,000 synthetic, unmodified peptides and their phosphorylated counterparts with known sequences and phosphorylation sites. Analysis of the library by mass spectrometry yielded a data set that we used to evaluate the merits of different search engines (Mascot and Andromeda) and fragmentation methods (beam-type collision-induced dissociation (HCD) and electron transfer dissociation (ETD)) for peptide identification. We also compared the sensitivities and accuracies of phosphorylation-site localization tools (Mascot Delta Score, PTM score and phosphoRS), and we characterized the chromatographic behavior of peptides in the library. We found that HCD identified more peptides and phosphopeptides than did ETD, that phosphopeptides generally eluted later from reversed-phase columns and were easier to identify than unmodified peptides and that current computational tools for proteomics can still be substantially improved. These peptides and spectra will facilitate the development, evaluation and improvement of experimental and computational proteomic strategies, such as separation techniques and the prediction of retention times and fragmentation patterns.
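Because every peptide in the library has a known sequence and phosphorylation site, localization errors can be counted directly against the ground truth instead of being estimated. The following is a minimal sketch of that idea, not the authors' implementation: the `SiteCall` record, its field names and the score threshold are hypothetical, and the score stands in for any localization score (for example Mascot Delta Score, PTM score or a phosphoRS probability).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SiteCall:
    """One phosphosite assignment for a synthetic peptide (hypothetical record)."""
    score: float        # localization score reported by a tool; higher is better
    reported_site: int  # 1-based residue position assigned by the tool
    true_site: int      # 1-based residue position known from the synthesis


def empirical_flr(calls: Iterable[SiteCall], threshold: float) -> float:
    """Fraction of accepted site assignments that disagree with the known site.

    Assignments with score >= threshold are accepted. Returns 0.0 when nothing
    passes the threshold, because no false localization was accepted.
    """
    accepted = [c for c in calls if c.score >= threshold]
    if not accepted:
        return 0.0
    wrong = sum(1 for c in accepted if c.reported_site != c.true_site)
    return wrong / len(accepted)


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    demo = [
        SiteCall(score=30.0, reported_site=3, true_site=3),
        SiteCall(score=25.0, reported_site=5, true_site=3),
        SiteCall(score=10.0, reported_site=2, true_site=2),
        SiteCall(score=5.0, reported_site=4, true_site=1),
    ]
    for cutoff in (0.0, 20.0, 28.0):
        print(cutoff, empirical_flr(demo, cutoff))
```

Scanning the threshold over the observed score range gives a score-versus-FLR curve for each tool, which is one way the sensitivity of different localization tools could be compared at a fixed error rate.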

Figure 1: Library design and synthesis.
Figure 2: Peptide library identification rate.
Figure 3: FDR determination for peptide identification.
Figure 4: FLR determination for phosphorylated peptides.
Figure 5: Retention time analysis.

Acknowledgements

This research was supported in part by the Deutsche Forschungsgemeinschaft International Research Training Group 'Regulation and Evolution of Cellular Systems' (RECESS, GRK 1563) and in part by the PRIME-XS project with the grant agreement number 262067 funded by the European Union 7th Framework Program. H.M. acknowledges the support of the Graduate School at the Technische Universität München, and the authors thank A. Hubauer for expert technical assistance.

Author information

Contributions

H.M., J.E.S. and B.K. designed the study. S.L., J.E.S., L.M. and S.M. performed experiments. H.M., S.L., L.M. and J.C. analyzed data. H.M., S.L., A.J.R.H., M.M. and B.K. wrote the manuscript.

Corresponding author

Correspondence to Bernhard Kuster.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–23 (PDF 3226 kb)

Supplementary Table 1

Sequence, site of phosphorylation within the sequence, length and GRAVY score (hydrophobicity) of the 851 representative sample peptides derived from the consensus of three out of the five publicly available human phosphorylation data sets used in this study (XLSX 47 kb)

Supplementary Table 2

Peptide sequence, position of the phosphorylation site in the sequence and GRAVY score of the seed peptides used to synthesize the libraries in this study. For each seed peptide sequence, the final number of peptides in the library is given (XLSX 15 kb)

Supplementary Table 3

Search and classification results of HCD data acquired on an Orbitrap Velos (XLSX 85450 kb)

Supplementary Table 4

Search and classification results of ETD-FT data acquired on an Orbitrap Velos (XLSX 58136 kb)

Supplementary Table 5

Number of peptide identifications and phosphorylation site localizations at a given global or local false discovery rate (Mascot) (XLSX 526 kb)

Supplementary Table 6

Coefficients for the computation of local and global FDRs and FLRs (XLSX 551 kb)
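The GRAVY (grand average of hydropathy) score listed in Supplementary Tables 1 and 2 is conventionally the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The sketch below shows that calculation under two assumptions: it uses the standard Kyte-Doolittle scale, and it scores the unmodified sequence only, since the scale has no value for phosphorylated residues. The function name `gravy` is illustrative and is not taken from the article.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy of a peptide sequence.

    The sequence is given in one-letter code. Modifications are ignored,
    so a phosphopeptide gets the same score as its unmodified counterpart.
    """
    residues = sequence.strip().upper()
    if not residues:
        raise ValueError("sequence must contain at least one residue")
    unknown = sorted(set(residues) - set(KYTE_DOOLITTLE))
    if unknown:
        raise ValueError(f"non-standard residues: {''.join(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


if __name__ == "__main__":
    # Arbitrary example sequences, not peptides from the library.
    for peptide in ("AAAA", "IK", "SPEK"):
        print(peptide, round(gravy(peptide), 3))
```

Positive scores indicate a more hydrophobic peptide and negative scores a more hydrophilic one.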

About this article

Cite this article

Marx, H., Lemeer, S., Schliep, J. et al. A large synthetic peptide and phosphopeptide reference library for mass spectrometry–based proteomics. Nat Biotechnol 31, 557–564 (2013). https://doi.org/10.1038/nbt.2585

  • DOI: https://doi.org/10.1038/nbt.2585
