Proteogenomics: concepts, applications and computational strategies

Nesvizhskii, Alexey I

doi:10.1038/nmeth.3144

Review Article
Published: 30 October 2014

Alexey I Nesvizhskii^1,2

Nature Methods volume 11, pages 1114–1125 (2014)

Abstract

Proteogenomics is an area of research at the interface of proteomics and genomics. In this approach, customized protein sequence databases generated using genomic and transcriptomic information are used to help identify novel peptides (not present in reference protein sequence databases) from mass spectrometry–based proteomic data; in turn, the proteomic data can be used to provide protein-level evidence of gene expression and to help refine gene models. In recent years, owing to the emergence of new sequencing technologies such as RNA-seq and dramatic improvements in the depth and throughput of mass spectrometry–based proteomics, the pace of proteogenomic research has greatly accelerated. Here I review the current state of proteogenomic methods and applications, including computational strategies for building and using customized protein sequence databases. I also draw attention to the challenge of false positive identifications in proteogenomics and provide guidelines for analyzing the data and reporting the results of proteogenomic studies.
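The notion of a "novel" peptide used above (one that is present in a customized database but absent from the reference database) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline reviewed in the article: the function names `tryptic_peptides` and `novel_peptides`, the toy sequences, and the simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages) are all hypothetical choices made for this example. The sketch covers only the database comparison, not spectrum searching or error rate estimation.

```python
def tryptic_peptides(protein, min_len=6):
    """In-silico digest: cleave after K or R unless the next residue is P.

    Simplified trypsin rule assumed for illustration; missed cleavages
    are not modeled. Returns the set of peptides of at least min_len residues.
    """
    peptides, start = set(), 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            if i + 1 - start >= min_len:
                peptides.add(protein[start:i + 1])
            start = i + 1
    # Keep the C-terminal peptide if the protein does not end in K or R.
    if len(protein) - start >= min_len:
        peptides.add(protein[start:])
    return peptides


def novel_peptides(reference_proteins, custom_proteins, min_len=6):
    """Return, per custom entry, the peptides absent from the reference digest."""
    known = set()
    for seq in reference_proteins:
        known |= tryptic_peptides(seq, min_len)
    novel = {}
    for name, seq in custom_proteins.items():
        hits = tryptic_peptides(seq, min_len) - known
        if hits:
            novel[name] = hits
    return novel


# Toy data (hypothetical): the custom entry carries one S->P substitution,
# as might be predicted from a sequence variant seen in RNA-seq data.
reference = ["MAGTSLLKEDPQVWRTTNGHSAEK"]
custom = {"variant_1": "MAGTSLLKEDPQVWRTTNGHPAEK"}

print(novel_peptides(reference, custom))
```

In this toy case only the peptide spanning the substitution is reported, because the other two tryptic peptides are shared with the reference. In a real study the candidate novel peptides would still have to be matched to spectra and assessed for false positives, which is the challenge the abstract draws attention to.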

**Figure 1: Peptide and protein identification in shotgun proteomics.**

**Figure 2: The concept of proteogenomics.**

**Figure 3: Type of peptides identified in proteogenomics.**

**Figure 4: Statistical assessment of peptide identifications in proteogenomics.**

Acknowledgements

This work has been funded in part with US National Institutes of Health grant R01-GM-094231. I thank A. Kong, B. Veeneman, A. Shanmugam and G. Omenn for useful discussions.

Author information

Authors and Affiliations

Department of Pathology, University of Michigan, Ann Arbor, Michigan, USA
Alexey I Nesvizhskii
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
Alexey I Nesvizhskii

Corresponding author

Correspondence to Alexey I Nesvizhskii.

Ethics declarations

Competing interests

The author declares no competing financial interests.

Supplementary information

Supplementary Table

Supplementary Table 1 (PDF 87 kb)

About this article

Cite this article

Nesvizhskii, A. Proteogenomics: concepts, applications and computational strategies. Nat Methods 11, 1114–1125 (2014). https://doi.org/10.1038/nmeth.3144

Received: 21 May 2014
Accepted: 22 September 2014
Published: 30 October 2014
Issue Date: November 2014
DOI: https://doi.org/10.1038/nmeth.3144
