  • Review Article

Analysis and validation of proteomic data generated by tandem mass spectrometry

Abstract

The analysis of the large amount of data generated in mass spectrometry–based proteomics experiments represents a significant challenge and is currently a bottleneck in many proteomics projects. In this review we discuss critical issues related to data processing and analysis in proteomics and describe available methods and tools. We place special emphasis on the elaboration of results that are supported by sound statistical arguments.

Figure 1: Peptide identification by MS/MS database searching.
Figure 2: Statistical analysis of large-scale datasets of peptide assignments.
Figure 3: Quantitative proteomics workflows.

Acknowledgements

This work was supported in part by US National Institutes of Health (NIH) National Cancer Institute Grant R01 CA126239 to A.I.N. and with federal funds from the National Heart, Lung, and Blood Institute of the NIH under contract no. N01-HV-28179 to R.A.

Author information

Corresponding author

Correspondence to Ruedi Aebersold.

Supplementary information

Supplementary Text and Figures

Supplementary Notes (PDF 104 kb)

About this article

Cite this article

Nesvizhskii, A., Vitek, O. & Aebersold, R. Analysis and validation of proteomic data generated by tandem mass spectrometry. Nat Methods 4, 787–797 (2007). https://doi.org/10.1038/nmeth1088

  • DOI: https://doi.org/10.1038/nmeth1088
