A HUPO test sample study reveals common problems in mass spectrometry–based proteomics

Bell, Alexander W; Deutsch, Eric W; Au, Catherine E; Kearney, Robert E; Beavis, Ron; Sechi, Salvatore; Nilsson, Tommy; Bergeron, John J M

doi:10.1038/nmeth.1333

Analysis
Published: 17 May 2009

Alexander W Bell¹, Eric W Deutsch², Catherine E Au¹, Robert E Kearney³, Ron Beavis⁴, Salvatore Sechi⁵, Tommy Nilsson⁶, John J M Bergeron¹ & HUPO Test Sample Working Group

Nature Methods volume 6, pages 423–430 (2009)

A Corrigendum to this article was published on 01 July 2009

This article has been updated

Abstract

We performed a test sample study to try to identify errors leading to irreproducibility, including incompleteness of peptide sampling, in liquid chromatography–mass spectrometry–based proteomics. We distributed an equimolar test sample, comprising 20 highly purified recombinant human proteins, to 27 laboratories. Each protein contained one or more unique tryptic peptides of 1,250 Da to test for ion selection and sampling in the mass spectrometer. Of the 27 labs, members of only 7 labs initially reported all 20 proteins correctly, and members of only 1 lab reported all tryptic peptides of 1,250 Da. Centralized analysis of the raw data, however, revealed that all 20 proteins and most of the 1,250 Da peptides had been detected in all 27 labs. Our centralized analysis determined missed identifications (false negatives), environmental contamination, database matching and curation of protein identifications as sources of problems. Improved search engines and databases are needed for mass spectrometry–based proteomics.
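The peptide-selection idea in the abstract can be illustrated with a minimal sketch, assuming a simplified trypsin rule (cleave after K or R unless the next residue is P), standard monoisotopic residue masses, and no missed cleavages or modifications. The function names, the example sequence and the ±5 Da default window are hypothetical choices for illustration, not the procedure used in the study.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def peptides_near_mass(sequence, target=1250.0, tol=5.0):
    """Tryptic peptides whose mass lies within tol Da of target.

    The 5 Da default window is an assumption made for this sketch.
    """
    return [p for p in tryptic_digest(sequence)
            if abs(peptide_mass(p) - target) <= tol]


if __name__ == "__main__":
    example = "AGKRPLKMSTWYEDFR"  # hypothetical sequence, not a study protein
    for pep in tryptic_digest(example):
        print(pep, round(peptide_mass(pep), 4))
```

Running `peptides_near_mass` over each protein in a sequence database and keeping only peptides that occur in a single protein would give a list of candidate "unique" peptides near the target mass; the uniqueness check is left out here to keep the sketch short.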

**Figure 1: Number of tandem mass spectra assigned to tryptic peptides.**

**Figure 2: Discrepancies between reported data and centralized analysis identify erroneous reporting.**

Change history

29 June 2009
NOTE: In the version of this article initially published, the author name Steven A Carr was spelled incorrectly, and the name of an organization described in the text, the HUPO Proteomics Standards Initiative (PSI), was given incorrectly. These errors have been corrected in the PDF and HTML versions of this article.

Acknowledgements

Supported in part by Canadian Institutes of Health Research to the HUPO Head Quarters (S. Ouellette) for coordination of this HUPO test sample initiative. A.W.B. and C.E.A. were supported by Genome Quebec and McGill University. We thank D. Juncker, G. Temple, J. van Oostrum, G. Omenn, K. Colwill, J. Langridge and M. Hallett for their comments on the manuscript, and D.M. Desiderio for helpful comments on the manuscript. This test sample effort builds on pioneering efforts from several other groups and especially Association of Biomolecular Resource Facilities. This study is a HUPO test sample initiative and HUPO welcomes collaborative efforts to benefit proteomics. We acknowledge the following sources of grant support: E.W.D. is supported by the National Heart, Lung and Blood Institute, National Institutes of Health (NIH), under contract N01-HV-28179; the University of California, Los Angeles Burnham Institute for Medical Research NIH grant number RR020843; University of California, Los Angeles (National Heart, Lung and Blood Institute P01-008111); University of Michigan, NIH P41RR018627; Beijing Proteome Research Center, affiliated with The Beijing Institute of Radiation Medicine for National Key Programs for Basic Research grant 2006CB910801 and Hi-Tech Research grant 2006AA02A308. We acknowledge access and use of The University College Dublin Conway Mass Spectrometry Resource instrumentation, supported by Science Foundation, Ireland grant 04/RPI/B499. PRIDE, J.A.V. is a postdoctoral fellow of the “Especialización en Organismos Internacionales” program from the Spanish Ministry of Education and Science. L.M. is supported by the “ProDaC” grant LSHG-CT-2006-036814 of the EU. Samuel Lunenfeld Research Institute, Mount Sinai, Toronto is supported by Genome Canada through Ontario Genomics Institute. J.A.V. and L.M. thank H. Hermjakob and R. Apweiler for their support. A.W.B. thanks L. Roy and Z. Bencsath-Makkai for help in data submission and analysis.

Author information

Authors and Affiliations

Department of Anatomy and Cell Biology, McGill University, Montreal, Canada
Alexander W Bell, Catherine E Au & John J M Bergeron
The Institute for Systems Biology, Seattle, Washington, USA
Eric W Deutsch
Department of Biomedical Engineering, McGill University, Montreal, Canada
Robert E Kearney
Biomedical Research Centre, University of British Columbia, Vancouver, Canada
Ron Beavis
Division Diabetes, Endocrinology and Metabolic Diseases, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, USA
Salvatore Sechi
The Research Institute of the McGill University Health Centre and the Department of Medicine, McGill University, Montreal, Canada
Tommy Nilsson
Verdezyne, Inc., Carlsbad, California, USA
Thomas A Beardslee
BioGrammatics Incorporated, Carlsbad, California, USA
Thomas Chappell
Invitrogen Corporation, Carlsbad, California, USA
Gavin Meredith, Mahbod Hajivandi, Marshall Pope & Paul Predki
Allergan, Irvine, California, USA
Peter Sheffield
Ambry Genetics, Aliso Viejo, California, USA
Phillip Gray
Barnett Institute and Department of Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts, USA
Majlinda Kullolli, Marina Hincapie & William S Hancock
State Key Laboratory of Proteomics, Beijing Proteome Research Center, Changping District, Beijing, China
Wei Jia, Lina Song, Lei Li, Junying Wei, Bing Yang, Jinglan Wang, Wantao Ying, Yangjun Zhang, Yun Cai, Xiaohong Qian & Fuchu He
Bochum University, Ruhr-Universitaet Bochum, Bochum, Germany
Helmut E Meyer, Christian Stephan, Martin Eisenacher, Katrin Marcus, Elmar Langenfeld & Caroline May
Proteomics, Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, Massachusetts, USA
Steven A Carr & Rushdy Ahmad
Burnham Institute for Medical Research, La Jolla, California, USA
Wenhong Zhu & Jeffrey W Smith
Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
Samir M Hanash, Jason J Struthers, Hong Wang & Qing Zhang
Department of Oncology, Georgetown University, Washington, DC, USA
Yanming An & Radoslav Goldman
Göteborg Proteomics Centre: The Proteomics Core Facility, Sahlgrenska Academy, University of Göteborg, Göteborg, Sweden
Elisabet Carlsohn & Sjoerd van der Post
Harvard Partners Center for Genetics and Genomics, Cambridge, Massachusetts, USA
Kenneth E Hung, Kenneth Parker & Raju Kucherlapati
Thermo-Fisher BRIMS Center, Cambridge, Massachusetts, USA
David A Sarracino & Bryan Krastins
Proteomics Platform, Quebec Genomic Center, Laval University Medical Research Center, Quebec, Canada
Sylvie Bourassa
Health and Environment Unit, Laval University Medical Research Center, Quebec, Canada
Guy G Poirier
Joint Proteomics Laboratory, Ludwig Institute for Cancer Research and The Walter & Eliza Hall Institute for Medical Research, Parkville, Australia
Eugene Kapp, Heather Patsiouras, Robert Moritz & Richard Simpson
Genizon BioSciences Incorporated, Saint Laurent, Canada
Benoit Houle
McGill University and Genome Quebec Innovation Centre, Montréal, Canada
Sylvie LaBoissiere
Ontario Cancer Biomarker Network, MaRS Centre, Toronto, Canada
Pavel Metalnikov
Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Canada
Vivian Nguyen & Tony Pawson
Department of Chemical Physiology, The Scripps Research Institute, La Jolla, California, USA
Catherine C L Wong, Daniel Cociorva & John R Yates III
Department of Biochemistry, University of Alberta, Edmonton, Canada
Michael J Ellison, Ana Lopez-Campistrous & Paul Semchuk
Departments of Physiology, Medicine and Division of Cardiology, David Geffen School of Medicine, University of California, Los Angeles, California, USA
Yueju Wang & Peipei Ping
Proteome Research Centre, Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
Giuliano Elia, Michael J Dunn & Kieran Wynne
Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, USA
Angela K Walker, John R Strahler, Philip C Andrews & Jayson A Falkner
Clinical Proteomics Facility, University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania, USA
Brian L Hood, William L Bigbee & Thomas P Conrads
Department of Pharmacology and Chemical Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
Brian L Hood & Thomas P Conrads
Department of Pathology, University of Pittsburgh School of Medicine, Magee-Womens Research Institute, Pittsburgh, Pennsylvania, USA
William L Bigbee
University of Victoria, Victoria, Canada
Derek Smith & Christoph H Borchers
Department of Biochemistry, University of Western Ontario, London, Ontario, Canada
Gilles A Lajoie & Sean C Bendall
The Wistar Institute, Philadelphia, Pennsylvania, USA
Kaye D Speicher & David W Speicher
Department of Biochemistry and Functional Proteomics, Yamaguchi University Graduate School of Medicine, Ube, Yamaguchi, Japan
Masanori Fujimoto & Kazuyuki Nakamura
Yonsei Proteome Research Center, Yonsei University, Sudaemoon-ku, Seoul, Korea
Young-Ki Paik, Sang Yun Cho, Min-Seok Kwon, Hyoung-Joo Lee, Seul-Ki Jeong & An Sung Chung
Agilent Technologies Incorporated, Santa Clara, California, USA
Christine A Miller & Rudolf Grimm
Applied Biosystems, Foster City, California, USA
Katy Williams
Waters Corporation, Milford, Massachusetts, USA
Craig Dorschel
EMBL Outstation, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
Lennart Martens & Juan Antonio Vizcaíno

Consortia

HUPO Test Sample Working Group

Thomas A Beardslee, Thomas Chappell, Gavin Meredith, Peter Sheffield, Phillip Gray, Mahbod Hajivandi, Marshall Pope, Paul Predki, Majlinda Kullolli, Marina Hincapie, William S Hancock, Wei Jia, Lina Song, Lei Li, Junying Wei, Bing Yang, Jinglan Wang, Wantao Ying, Yangjun Zhang, Yun Cai, Xiaohong Qian, Fuchu He, Helmut E Meyer, Christian Stephan, Martin Eisenacher, Katrin Marcus, Elmar Langenfeld, Caroline May, Steven A Carr, Rushdy Ahmad, Wenhong Zhu, Jeffrey W Smith, Samir M Hanash, Jason J Struthers, Hong Wang, Qing Zhang, Yanming An, Radoslav Goldman, Elisabet Carlsohn, Sjoerd van der Post, Kenneth E Hung, David A Sarracino, Kenneth Parker, Bryan Krastins, Raju Kucherlapati, Sylvie Bourassa, Guy G Poirier, Eugene Kapp, Heather Patsiouras, Robert Moritz, Richard Simpson, Benoit Houle, Sylvie LaBoissiere, Pavel Metalnikov, Vivian Nguyen, Tony Pawson, Catherine C L Wong, Daniel Cociorva, John R Yates III, Michael J Ellison, Ana Lopez-Campistrous, Paul Semchuk, Yueju Wang, Peipei Ping, Giuliano Elia, Michael J Dunn, Kieran Wynne, Angela K Walker, John R Strahler, Philip C Andrews, Brian L Hood, William L Bigbee, Thomas P Conrads, Derek Smith, Christoph H Borchers, Gilles A Lajoie, Sean C Bendall, Kaye D Speicher, David W Speicher, Masanori Fujimoto, Kazuyuki Nakamura, Young-Ki Paik, Sang Yun Cho, Min-Seok Kwon, Hyoung-Joo Lee, Seul-Ki Jeong, An Sung Chung, Christine A Miller, Rudolf Grimm, Katy Williams, Craig Dorschel, Jayson A Falkner, Lennart Martens & Juan Antonio Vizcaíno

Contributions

A.W.B. coordinated all steps of the study. C.E.A., T.N. and J.J.M.B. coordinated data analysis and the final manuscript. E.W.D., R.B. and R.K. did the centralized analysis of the collective data retrieved from the raw data supplied from each lab to Tranche. S.A.C., P.P., L.M., E.K., C.D., S.S., X.Q., K.W., T.P.C., K.P. and T.A.B. provided comments. Invitrogen prepared, designed and distributed the test sample proteins.

Corresponding author

Correspondence to John J M Bergeron.

Ethics declarations

Competing interests

There is a potential to market the test samples used in this study.

Additional information

A full list of authors appears at the end of this paper.

About this article

Cite this article

Bell, A., Deutsch, E., Au, C. et al. A HUPO test sample study reveals common problems in mass spectrometry–based proteomics. Nat Methods 6, 423–430 (2009). https://doi.org/10.1038/nmeth.1333

Received: 18 December 2008
Accepted: 03 April 2009
Published: 17 May 2009
Issue Date: June 2009
DOI: https://doi.org/10.1038/nmeth.1333

Supplementary information

Supplementary Text and Figures

Supplementary Table 6

Supplementary Table 12

Supplementary Table 13

Supplementary Table 14

Supplementary Table 15
