  • Research Briefing

A neural network for large-scale clustering of peptide mass spectra

Repository-scale analysis of hundreds of millions to billions of mass spectra is challenging because of the complexity and volume of the data. This Research Briefing presents a deep neural network embedding method that enables large-scale investigation of mass spectra that are observed repeatedly yet remain consistently unidentified.

Fig. 1: GLEAMS deep neural network for clustering hundreds of millions of mass spectra.

References

  1. Perez-Riverol, Y. et al. The PRIDE database and related tools in 2019: improving support for quantification data. Nucleic Acids Res. 47, D442–D450 (2019). This paper describes the increase in publicly available proteomics data in the PRIDE database.

  2. Frank, A. M. et al. Clustering millions of tandem mass spectra. J. Proteome Res. 7, 113–122 (2008). This paper describes MS-Cluster, the first large-scale clustering algorithm for mass spectra.

  3. Griss, J. et al. Recognizing millions of consistently unidentified spectra across hundreds of shotgun proteomics datasets. Nat. Methods 13, 651–656 (2016). This paper describes a commonly used spectral clustering algorithm.

  4. Wang, M. et al. Assembling the community-scale discoverable human proteome. Cell Syst. 7, 412–421.e5 (2018). This paper describes the MassIVE-KB resource that provided training data for GLEAMS.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This is a summary of: Bittremieux, W., May, D. H., Bilmes, J. & Noble, W. S. A learned embedding for efficient joint analysis of millions of mass spectra. Nat. Methods https://doi.org/10.1038/s41592-022-01496-1 (2022).

About this article

Cite this article

A neural network for large-scale clustering of peptide mass spectra. Nat Methods 19, 658–659 (2022). https://doi.org/10.1038/s41592-022-01497-0

  • DOI: https://doi.org/10.1038/s41592-022-01497-0
