Article

A draft map of the human proteome

Abstract

The availability of the human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome, with direct measurements of proteins and peptides, does not yet exist. Here we present a draft map of the human proteome using high-resolution Fourier-transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary haematopoietic cells, resulted in the identification of proteins encoded by 17,294 genes, accounting for approximately 84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, including translated pseudogenes, non-coding RNAs and upstream open reading frames. This large human proteome catalogue (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease.

Acknowledgements

We thank the National Development and Research Institutes for providing some of the tissues. We thank V. Sandhya, V. Puttamallesh, U. Guha and B. Cole for help with the analysis of some of the samples, and L. Lane and B. Amos for their assistance with the list of missing genes. This work was supported by an NIH roadmap grant for Technology Centers of Networks and Pathways (U54GM103520), NCI’s Clinical Proteomic Tumor Analysis Consortium initiative (U24CA160036), a contract (HHSN268201000032C) from the National Heart, Lung and Blood Institute and the Sol Goldman Pancreatic Cancer Research Center. The authors acknowledge the joint participation by the Adrienne Helis Malvin Medical Research Foundation and the Diana Helis Henry Medical Research Foundation through its direct engagement in the continuous active conduct of medical research in conjunction with The Johns Hopkins Hospital and the Johns Hopkins University School of Medicine and the Foundation’s Parkinson’s Disease Programs. The analysis work was partially supported by the National Resource for Network Biology (P41GM103504). A.Mah., S.K.Sh., P.S. and T.S.K.P. are supported by DBT Program Support on Neuroproteomics (BT/01/COE/08/05) to IOB and NIMHANS. H.G. is a Wellcome Trust-DBT India Alliance Early Career Fellow. We thank the Council of Scientific and Industrial Research, the University Grants Commission and the Department of Science and Technology, Government of India, for research fellowships for S.M.P., R.S.N., A.R., M.K., G.J.S., S.C., P.R., J.S., S.S.M., D.S.K., S.R., S.K.Sr., K.K.D., Y.S., A.S., S.D.Y., N.S., S.A. and G.D.

Author information

Author notes

    • Candace L. Kerr

    Present address: Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA.

Affiliations

  1. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA

    • Min-Sik Kim, Derese Getnet, Raghothama Chaerkady, Pamela Leal-Rojas, Samarjeet Prasad, Tai-Chung Huang, Jun Zhong, Xinyan Wu, Patrick G. Shaw, Donald Freed, Christopher J. Mitchell, Steven D. Leach & Akhilesh Pandey
  2. Department of Biological Chemistry, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA

    • Min-Sik Kim, Raghothama Chaerkady, Xinyan Wu, Muhammad S. Zahari & Akhilesh Pandey
  3. Institute of Bioinformatics, International Tech Park, Bangalore 560066, India

    • Sneha M. Pinto, Raja Sekhar Nirujogi, Srikanth S. Manda, Anil K. Madugundu, Dhanashree S. Kelkar, Joji K. Thomas, Babylakshmi Muthusamy, Praveen Kumar, Nandini A. Sahasrabuddhe, Lavanya Balakrishnan, Jayshree Advani, Bijesh George, Santosh Renuse, Lakshmi Dhevi N. Selvan, Arun H. Patil, Vishalakshi Nanjappa, Aneesha Radhakrishnan, Tejaswini Subbannayya, Rajesh Raju, Manish Kumar, Sreelakshmi K. Sreenivasamurthy, Arivusudar Marimuthu, Gajanan J. Sathe, Sandip Chavan, Keshava K. Datta, Yashwanth Subbannayya, Apeksha Sahu, Soujanya D. Yelamanchi, Savita Jayaram, Pavithra Rajagopalan, Jyoti Sharma, Krishna R. Murthy, Nazia Syed, Renu Goel, Aafaque A. Khan, Sartaj Ahmad, Gourav Dey, Aditi Chatterjee, Ravi Sirdeshmukh, T. S. Keshava Prasad, Harsha Gowda & Akhilesh Pandey
  4. Adrienne Helis Malvin Medical Research Foundation, New Orleans, Louisiana 70130, USA

    • Derese Getnet & Akhilesh Pandey
  5. The Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada

    • Ruth Isserlin, Shobhit Jain & Gary D. Bader
  6. Department of Pathology, Universidad de La Frontera, Center of Genetic and Immunological Studies-Scientific and Technological Bioresource Nucleus, Temuco 4811230, Chile

    • Pamela Leal-Rojas
  7. School of Medicine, Imperial College London, South Kensington Campus, London SW7 2AZ, UK

    • Keshav Mudgal
  8. Department of Neurosurgery, Postgraduate Institute of Medical Education & Research, Chandigarh 160012, India

    • Kanchan K. Mukherjee
  9. Department of Internal Medicine, Armed Forces Medical College, Pune 411040, India

    • Subramanian Shankar
  10. Department of Neuropathology, National Institute of Mental Health and Neurosciences, Bangalore 560029, India

    • Anita Mahadevan & Susarla Krishna Shankar
  11. Human Brain Tissue Repository, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences, Bangalore 560029, India

    • Anita Mahadevan & Susarla Krishna Shankar
  12. Department of Chemical and Biomolecular Engineering and Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong

    • Henry Lam
  13. Department of Neurology, National Institute of Mental Health and Neurosciences, Bangalore 560029, India

    • Parthasarathy Satishchandra
  14. Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21224, USA

    • John T. Schroeder
  15. The Sol Goldman Pancreatic Cancer Research Center, Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA

    • Anirban Maitra, Marc K. Halushka, Ralph H. Hruban, Christine A. Iacobuzio-Donahue & Akhilesh Pandey
  16. Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA

    • Anirban Maitra, Charles G. Drake, Ralph H. Hruban, Christine A. Iacobuzio-Donahue & Akhilesh Pandey
  17. Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA

    • Steven D. Leach & Christine A. Iacobuzio-Donahue
  18. Departments of Immunology and Urology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, Maryland 21231, USA

    • Charles G. Drake
  19. Department of Obstetrics and Gynecology, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA

    • Candace L. Kerr
  20. Diana Helis Henry Medical Research Foundation, New Orleans, Louisiana 70130, USA

    • Akhilesh Pandey

Authors

  1. Min-Sik Kim
  2. Sneha M. Pinto
  3. Derese Getnet
  4. Raja Sekhar Nirujogi
  5. Srikanth S. Manda
  6. Raghothama Chaerkady
  7. Anil K. Madugundu
  8. Dhanashree S. Kelkar
  9. Ruth Isserlin
  10. Shobhit Jain
  11. Joji K. Thomas
  12. Babylakshmi Muthusamy
  13. Pamela Leal-Rojas
  14. Praveen Kumar
  15. Nandini A. Sahasrabuddhe
  16. Lavanya Balakrishnan
  17. Jayshree Advani
  18. Bijesh George
  19. Santosh Renuse
  20. Lakshmi Dhevi N. Selvan
  21. Arun H. Patil
  22. Vishalakshi Nanjappa
  23. Aneesha Radhakrishnan
  24. Samarjeet Prasad
  25. Tejaswini Subbannayya
  26. Rajesh Raju
  27. Manish Kumar
  28. Sreelakshmi K. Sreenivasamurthy
  29. Arivusudar Marimuthu
  30. Gajanan J. Sathe
  31. Sandip Chavan
  32. Keshava K. Datta
  33. Yashwanth Subbannayya
  34. Apeksha Sahu
  35. Soujanya D. Yelamanchi
  36. Savita Jayaram
  37. Pavithra Rajagopalan
  38. Jyoti Sharma
  39. Krishna R. Murthy
  40. Nazia Syed
  41. Renu Goel
  42. Aafaque A. Khan
  43. Sartaj Ahmad
  44. Gourav Dey
  45. Keshav Mudgal
  46. Aditi Chatterjee
  47. Tai-Chung Huang
  48. Jun Zhong
  49. Xinyan Wu
  50. Patrick G. Shaw
  51. Donald Freed
  52. Muhammad S. Zahari
  53. Kanchan K. Mukherjee
  54. Subramanian Shankar
  55. Anita Mahadevan
  56. Henry Lam
  57. Christopher J. Mitchell
  58. Susarla Krishna Shankar
  59. Parthasarathy Satishchandra
  60. John T. Schroeder
  61. Ravi Sirdeshmukh
  62. Anirban Maitra
  63. Steven D. Leach
  64. Charles G. Drake
  65. Marc K. Halushka
  66. T. S. Keshava Prasad
  67. Ralph H. Hruban
  68. Candace L. Kerr
  69. Gary D. Bader
  70. Christine A. Iacobuzio-Donahue
  71. Harsha Gowda
  72. Akhilesh Pandey

Contributions

A.P., H.G., R.C. and M.-S.K. designed the study; A.P., H.G. and M.-S.K. managed the study; D.G., C.L.K., C.A.I.-D. and K.R.M. collected human cells/tissues; M.-S.K., R.C. and D.G. developed the experimental and analysis pipeline; D.G., M.-S.K., S.M.P., K.M., R.C., S.R., J.Z., X.W., P.G.S., M.S.Z. and T.-C.H. prepared peptide samples for LC-MS/MS; M.-S.K., R.S.N., S.M.P., R.C., D.S.K., S.R. and G.J.S. performed LC-MS/MS; M.-S.K., S.M.P., S.P., S.S.M., C.J.M., J.A. and A.K.M. processed and managed the MS data; A.K.M., S.S.M., B.G., A.H.P., Y.S. and M.-S.K. performed the comparison analysis with PeptideAtlas, neXtProt and GPMDB; R.I., S.Jai. and G.D.B. performed the interaction and complex analysis; M.-S.K., S.M.P., S.S.M., P.K., A.K.M., N.A.S., R.S.N., L.B., L.D.N.S., D.S.K., V.N., A.R., T.S., M.K., S.K.Sr., G.D., A.Mar., R.R., S.C., K.K.D., A.S., S.D.Y., S.Jay., P.R., A.H.P., B.G., J.S., N.S., R.G., G.J.S., A.A.K., S.A., D.F., T.S.K.P., H.G. and A.P. performed the proteogenomic analysis; A.C., H.L., R.S., J.T.S., K.K.M., S.S., A.Mah., S.K.Sh., P.S., S.D.L., C.G.D., A.Mai., M.K.H., R.H.H., C.L.K. and C.A.I.-D. assisted with analysis of the data; M.-S.K., S.M.P., T.-C.H. and P.L.-R. performed western blot experiments; M.-S.K., J.K.T., A.K.M., B.M., S.P. and S.M.P. designed the Human Proteome Map web portal; M.-S.K., A.K.M. and J.K.T. generated the selected reaction monitoring (SRM) database; M.-S.K., K.M., G.D., S.M.P. and S.S.M. illustrated the figures with the help of other authors; A.P., M.-S.K. and H.G. wrote the manuscript with input from other authors.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Harsha Gowda or Akhilesh Pandey.

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository with the dataset identifier PXD000561.

Supplementary information

PDF files

  1. Supplementary Information

    This file contains a Supplementary Discussion and additional references.

  2. Supplementary Data

    This file contains Supplementary Data.

Excel files

  1. Supplementary Table 1

    This file contains a summary of results from proteogenomics analysis; a list of peptides indicating novel signal peptide cleavage sites; and a draft map of the human proteome.