Repository-scale analysis of hundreds of millions to billions of mass spectra is a challenging endeavor due to the complexity and volume of associated data. A deep neural network embedding method is presented that enables large-scale investigation of repeatedly observed yet consistently unidentified mass spectra.

This is a summary of: Bittremieux, W., May, D. H., Bilmes, J. & Noble, W. S. A learned embedding for efficient joint analysis of millions of mass spectra. Nat. Methods https://doi.org/10.1038/s41592-022-01496-1 (2022).
Cite this article
A neural network for large-scale clustering of peptide mass spectra. Nat Methods 19, 658–659 (2022). https://doi.org/10.1038/s41592-022-01497-0
DOI: https://doi.org/10.1038/s41592-022-01497-0