Abstract
Comprehensive metabolome analyses are essential for biomedical, environmental, and biotechnological research. However, current MS1- and MS2-based acquisition and data analysis strategies in untargeted metabolomics result in low identification rates of metabolites. Here we present HERMES, a molecular-formula-oriented and peak-detection-free method that uses raw LC/MS1 information to optimize MS2 acquisition. Investigating environmental water, Escherichia coli, and human plasma extracts with HERMES, we achieved an increased biological specificity of MS2 scans, leading to improved mass spectral similarity scoring and identification rates when compared with a state-of-the-art data-dependent acquisition (DDA) approach. Thus, HERMES improves sensitivity, selectivity, and annotation of metabolites. HERMES is available as an R package with a user-friendly graphical interface for data analysis and visualization.
Data availability
Input mzML/mzXML mass spectrometry data files, molecular formula databases, and an RMarkdown script are available at Zenodo with accession number 4985839.
Code availability
The source code of RHermes is offered to the public as a freely accessible software package under the GNU GPL, version 3 license, and is available at https://github.com/RogerGinBer/RHermes and at Zenodo with accession number 5504163.
Acknowledgements
We gratefully acknowledge financial support from the Ministerio de Educación y Formación Profesional (Spanish Government) to R.G. (2020-COLAB-00552). O.Y. was supported by Ministerio de Economía y Competitividad (MINECO) (BFU2014-57466-P), Spanish Biomedical Research Centre in Diabetes and Associated Metabolic Disorders (CIBERDEM), an initiative of Instituto de Investigación Carlos III (ISCIII), and the European Union’s Horizon 2020 program (MSCA-ITN-2015; 675610). We thank members of the Mil@b for helpful comments.
Author information
Contributions
R.G. and O.Y. designed the research. R.G., J.C., J.M.B., T.A., M.V., and O.Y. developed the computational method. D.V. and M.S.-H. performed LC–MS and MS2 experiments. All authors applied and evaluated the method on biological samples. R.G. and O.Y. wrote the manuscript, in cooperation with all authors.
Ethics declarations
Competing interests
A patent application for the method has been filed by R.G., J.C., and O.Y. (P202030061). G.J.P. serves on the scientific advisory board of Cambridge Isotope Laboratories. The other authors declare no competing interests.
Additional information
Peer review information Peer reviewer reports are available. Nature Methods thanks Tao Huan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Arunima Singh was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Resolution-based isotopic envelope calculation.
a) The MS resolution is inversely proportional to the peak width of the acquired signals. When preprocessing raw MS1 data, a centroidization algorithm performs peak picking on a continuous profile signal (blue), yielding discrete, centroided signals (red). As resolution increases, the minimal distance needed to distinguish two adjacent peaks decreases (d2 < d1). b) In practice, this implies that, when acquiring data at lower resolutions, certain isotope signals are masked by close, more intense signals. c) By calculating a resolution-based parameter d, HERMES can estimate which close isotopologues can be distinguished in the acquired profile MS1 data and are therefore present in the centroided data (see Algorithm 1).
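As a rough illustration of the idea in panel c, the sketch below assumes that two centroids can only be told apart when they are separated by at least d = (m/z) / R, where R is the resolving power at that m/z (FWHM definition), and that unresolved signals collapse into one intensity-weighted centroid. The formula, the merging rule, the function names, and the example peak values are assumptions made for illustration; the actual procedure is the paper's Algorithm 1, which is not reproduced here.

```python
def min_distance(mz, resolution):
    """Smallest m/z gap assumed to be resolvable at resolving power R (FWHM)."""
    return mz / resolution


def merge_unresolved(peaks, resolution):
    """Merge peaks closer than d into a single intensity-weighted centroid.

    peaks: list of (mz, intensity) tuples, in any order.
    """
    merged = []
    for mz, inten in sorted(peaks):
        if merged and mz - merged[-1][0] < min_distance(mz, resolution):
            prev_mz, prev_int = merged[-1]
            total = prev_int + inten
            merged[-1] = ((prev_mz * prev_int + mz * inten) / total, total)
        else:
            merged.append((mz, inten))
    return merged


# Hypothetical M+1 region: a 13C and a 15N isotopologue about 0.0063 m/z apart.
m1_region = [(223.0462, 10.8), (223.0399, 1.1)]

low = merge_unresolved(m1_region, resolution=17_500)    # d is about 0.0127: one peak
high = merge_unresolved(m1_region, resolution=120_000)  # d is about 0.0019: two peaks
```

Under these assumptions the two isotopologues appear as a single centroid at the lower resolving power and as two centroids at the higher one, which is the behaviour the caption describes.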
Extended Data Fig. 2 Schematic workflow of the different filtering steps in HERMES.
a) Artificial neural network (ANN) for blank subtraction. b) Adduct and isotopologue grouping according to the similarity of their elution profiles. c) In-source fragment annotation, by using publicly available low-energy MS2 data.
Extended Data Fig. 3 Continuous MS2 acquisition resolves co-eluting ionic species by comparing their fragment elution profile.
a) All fragment ions from continuous MS2 scans are grouped according to their m/z. b) A loose peak-picking algorithm is applied and the resulting peaks are grouped according to their elution profiles, generating a similarity network that is split by a greedy clustering algorithm. c) This grouping yields a curated MS2 spectrum for each coeluting species (see Algorithm 3). (*) The shaded slice shows the impact of the algorithm on the resulting MS2 spectral quality. The delineated fragments in blue have a different elution pattern from the rest and would contaminate the MS2 spectrum if only one scan were acquired at the top of the peak. The grouping performed by HERMES confidently removes the contaminant ions and separates each group of fragments according to their elution.
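The grouping step in panels b and c can be sketched as follows. This is a simplified stand-in, not the paper's Algorithm 3: it assumes cosine similarity between fragment elution profiles, a hypothetical threshold of 0.9, and a simple greedy rule in which each fragment joins the first group whose most intense member it resembles. The fragment m/z values and intensities are invented.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equally long intensity profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_groups(profiles, threshold=0.9):
    """Group fragments whose elution profiles resemble a group's seed fragment.

    profiles: dict mapping fragment m/z to intensities over consecutive MS2 scans.
    Fragments are visited from most to least intense; the first member of each
    group acts as its seed.
    """
    order = sorted(profiles, key=lambda mz: max(profiles[mz]), reverse=True)
    groups = []
    for frag in order:
        for group in groups:
            if cosine(profiles[frag], profiles[group[0]]) >= threshold:
                group.append(frag)
                break
        else:
            groups.append([frag])
    return groups


# Two hypothetical coeluting species: one peaking at scan 3, one at scan 1.
profiles = {
    85.03: [1, 5, 9, 5, 1],
    127.04: [2, 10, 18, 10, 2],
    69.03: [9, 5, 1, 0, 0],
    97.03: [16, 9, 2, 0, 0],
}
groups = greedy_groups(profiles)
```

Each resulting group would correspond to one curated MS2 spectrum; here the fragments at 127.04 and 85.03 fall into one group and those at 97.03 and 69.03 into another.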
Extended Data Fig. 4 HERMES R Graphical User Interface (GUI).
a) Point-and-click selection of SOI detection parameters, with detailed explanations on their usage and optimal values. b) Visualization of isotopic profiles of different adducts of the same formula. Formulas can be inputted directly or inferred from the name of a selected compound. c) Isotopic fidelity exploration of selected SOIs. d) Visualization of the continuous MS2 deconvolution step. Users can check the fragment ion elution profiles from each inclusion list entry and how they are interconnected in the corresponding profile similarity network.
Extended Data Fig. 5 Discrimination of SOIs based on isotopic fidelity.
a) [M + H]+ ion of chloridazon and b) [M + K]+ ion of 2-Amino-alpha-carboline overlapping at 0.27 ppm. The arrows indicate the characteristic [37Cl] isotopologue present in chloridazon and the [41K] isotopologue absent in 2-Amino-alpha-carboline. The absence of characteristic isotopologue signals (Cl, Br, K, etc.) in an intense SOI results in a low isotopic fidelity score and its removal.
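The 0.27 ppm overlap can be checked with a short calculation. The sketch below assumes the standard molecular formulas C10H8ClN3O for chloridazon and C11H9N3 for 2-amino-alpha-carboline (neither is stated in the caption) and uses standard monoisotopic masses; the helper names are illustrative only.

```python
# Monoisotopic masses in Da (standard reference values).
MONO = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "Cl": 34.968852682,   # 35Cl
    "K": 38.9637064864,   # 39K
}
ELECTRON = 0.000548579909


def ion_mz(formula, adduct_element):
    """m/z of a singly charged [M + X]+ ion; formula maps element to atom count."""
    neutral = sum(MONO[element] * count for element, count in formula.items())
    return neutral + MONO[adduct_element] - ELECTRON


def ppm_difference(mz_a, mz_b):
    """Absolute m/z difference expressed in parts per million of mz_b."""
    return abs(mz_a - mz_b) / mz_b * 1e6


chloridazon = {"C": 10, "H": 8, "Cl": 1, "N": 3, "O": 1}   # assumed formula
amino_carboline = {"C": 11, "H": 9, "N": 3}                 # assumed formula

mz_chloridazon = ion_mz(chloridazon, "H")       # [M + H]+
mz_carboline = ion_mz(amino_carboline, "K")     # [M + K]+
overlap_ppm = ppm_difference(mz_chloridazon, mz_carboline)

print(f"{mz_chloridazon:.4f} vs {mz_carboline:.4f}: {overlap_ppm:.2f} ppm")
```

Both ions fall at about m/z 222.043, so the monoisotopic signal alone cannot separate the two annotations at typical mass accuracy; the distinguishing evidence is the heavier isotopologue (37Cl or 41K) described in the caption.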
Extended Data Fig. 6 Venn-like diagram of the distribution of negative ionization LC/MS1 data points in different steps of the HERMES workflow and XCMS peak-associated points.
a) E. coli and b) human plasma extract. Database: data points that match any m/z from the ionic formula database (including isotopes). SOI: monoisotopic (M0)-annotated data points that are present in an unfiltered SOI list. Inclusion List: data points present in a filtered SOI list (including blank subtraction, isotopic filter and ISF removal steps). Percentages refer to the total number of LC/MS1 data points.
Extended Data Fig. 7 Comparing LC/MS1 annotation performance with CAMERA and CliqueMS.
Positive ionization data. a) and b) E. coli, c) and d) human plasma extract. Percentages refer to the set of data points that were annotated by HERMES and also detected as a peak by XCMS (see Methods for parameters used). The isotope annotation overlap (a and c) was high, due primarily to M0 annotations. In contrast, adduct annotation overlap (b and d) was markedly low (<20% of data points matched the annotation).
Extended Data Fig. 8 13C-enrichment distribution according to the precursor intensity.
a) and b) 13C-enriched metabolites (FC and MIRS > 0.5) are mainly associated with abundant ions (intensity > 10^5), while unlabeled precursors (FC and MIRS < 0.5) relate more frequently to low-abundance ions (intensity between 10^4 and 10^5). c) 13C-labeled precursors in iterative DDA corresponded to highly abundant ions that were also covered by HERMES. However, 56% of labeled low-abundance ions were not covered by the iterative DDA.
Extended Data Fig. 9 Identified inclusion list entries according to the MS1 precursor intensity in negative ionization data.
An inclusion list entry is considered identified if at least one MS2 scan associated with it has a compound hit in the reference MS2 database with either cosine score > 0.8 (in-house database from MassBankEU, MoNA, Riken and NIST14 spectra), or Match > 90 and Confidence > 30 (mzCloud). a) E. coli extract. b) Human plasma extract.
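The cosine-score branch of this identification rule can be sketched as below. It is a minimal illustration, assuming a plain (unweighted) cosine score and a hypothetical fragment matching tolerance of 0.01 m/z; the paper's exact scoring settings are not given in this caption, and the vendor-specific mzCloud Match and Confidence scores are not reproduced. All spectra are invented.

```python
import math


def spectral_cosine(query, reference, tol=0.01):
    """Cosine score between two spectra given as lists of (mz, intensity).

    Each query fragment is paired with the closest unused reference fragment
    within tol (an assumed tolerance); unmatched fragments only add to the norms.
    """
    used = set()
    dot = 0.0
    for mz_q, int_q in query:
        best = None
        for j, (mz_r, _) in enumerate(reference):
            if j in used or abs(mz_q - mz_r) > tol:
                continue
            if best is None or abs(mz_q - mz_r) < abs(mz_q - reference[best][0]):
                best = j
        if best is not None:
            used.add(best)
            dot += int_q * reference[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_r = math.sqrt(sum(i * i for _, i in reference))
    return dot / (norm_q * norm_r) if norm_q and norm_r else 0.0


def is_identified(scans, library, min_cosine=0.8):
    """True if at least one scan scores above min_cosine against any reference."""
    return any(
        spectral_cosine(scan, ref) > min_cosine for scan in scans for ref in library
    )


reference = [(85.03, 100.0), (127.04, 50.0)]
matching_scan = [(85.031, 90.0), (127.039, 55.0), (60.0, 5.0)]
unrelated_scan = [(200.1, 100.0)]

entry_with_hit = is_identified([unrelated_scan, matching_scan], [reference])
entry_without_hit = is_identified([unrelated_scan], [reference])
```

The "at least one scan" rule means a single good spectrum is enough to count an inclusion list entry as identified, even when its other scans match nothing.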
Extended Data Fig. 10 Injection time effect in spectral quality (35 ms vs 1,500 ms).
Experimental MS2 spectra (black) of a) NADH, b) Biopterin and c) NADPH against library spectra (red). All precursor ions had an intensity below 10^5. A higher injection time resulted in richer spectra, with more matching fragments against the reference spectra and overall better matching scores.
Supplementary information
Supplementary Information
Supplementary Figures 1–4, Supplementary Tables 1 and 2, and Supplementary Algorithms 1–4
Cite this article
Giné, R., Capellades, J., Badia, J.M. et al. HERMES: a molecular-formula-oriented method to target the metabolome. Nat Methods 18, 1370–1376 (2021). https://doi.org/10.1038/s41592-021-01307-z
DOI: https://doi.org/10.1038/s41592-021-01307-z