HERMES: a molecular-formula-oriented method to target the metabolome

Giné, Roger; Capellades, Jordi; Badia, Josep M.; Vughs, Dennis; Schwaiger-Haber, Michaela; Alexandrov, Theodore; Vinaixa, Maria; Brunner, Andrea M.; Patti, Gary J.; Yanes, Oscar

doi:10.1038/s41592-021-01307-z

Article
Published: 01 November 2021

Nature Methods volume 18, pages 1370–1376 (2021)

Abstract

Comprehensive metabolome analyses are essential for biomedical, environmental, and biotechnological research. However, current MS1- and MS2-based acquisition and data analysis strategies in untargeted metabolomics result in low identification rates of metabolites. Here we present HERMES, a molecular-formula-oriented and peak-detection-free method that uses raw LC/MS1 information to optimize MS2 acquisition. Investigating environmental water, Escherichia coli, and human plasma extracts with HERMES, we achieved an increased biological specificity of MS2 scans, leading to improved mass spectral similarity scoring and identification rates when compared with a state-of-the-art data-dependent acquisition (DDA) approach. Thus, HERMES improves sensitivity, selectivity, and annotation of metabolites. HERMES is available as an R package with a user-friendly graphical interface for data analysis and visualization.

**Fig. 2: Venn-like diagram of the distribution of LC/MS1 data points in different steps of the HERMES workflow and XCMS peak-associated points.**

**Fig. 3: Distribution of MS2 scans acquired by HERMES and iterative DDA.**

**Fig. 4: ¹³C-enrichment analysis in the labeled *E. coli* sample.**

**Fig. 5: Identified inclusion list entries according to the MS1 precursor intensity.**

Data availability

Input mzML/mzXML mass spectrometry data files, molecular formula databases, and an RMarkdown script are available at Zenodo with accession number 4985839.

Code availability

The source code of RHermes is offered to the public as a freely accessible software package under the GNU GPL, version 3 license, and is available at https://github.com/RogerGinBer/RHermes and at Zenodo with accession number 5504163.

Acknowledgements

We gratefully acknowledge financial support from the Ministerio de Educación y Formación Profesional (Spanish Government) to R.G. (2020-COLAB-00552). O.Y. was supported by Ministerio de Economía y Competitividad (MINECO) (BFU2014-57466-P), Spanish Biomedical Research Centre in Diabetes and Associated Metabolic Disorders (CIBERDEM), an initiative of Instituto de Investigación Carlos III (ISCIII), and the European Union’s Horizon 2020 program (MSCA-ITN-2015; 675610). We thank members of the Mil@b for helpful comments.

Author information

Authors and Affiliations

Universitat Rovira i Virgili, Department of Electronic Engineering & IISPV, Tarragona, Spain
Roger Giné, Jordi Capellades, Josep M. Badia, Maria Vinaixa & Oscar Yanes
CIBER de Diabetes y Enfermedades Metabólicas Asociadas (CIBERDEM), Instituto de Salud Carlos III, Madrid, Spain
Jordi Capellades, Josep M. Badia, Maria Vinaixa & Oscar Yanes
KWR Water Research Institute, Nieuwegein, the Netherlands
Dennis Vughs & Andrea M. Brunner
Department of Chemistry, Washington University, St. Louis, MO, USA
Michaela Schwaiger-Haber & Gary J. Patti
Department of Medicine, Washington University, St. Louis, MO, USA
Michaela Schwaiger-Haber & Gary J. Patti
Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
Theodore Alexandrov
Molecular Medicine Partnership Unit, European Molecular Biology Laboratory, Heidelberg, Germany
Theodore Alexandrov
Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA
Theodore Alexandrov

Contributions

R.G. and O.Y. designed the research. R.G., J.C., J.M.B., T.A., M.V., and O.Y. developed the computational method. D.V. and M.S.-H. performed LC–MS and MS2 experiments. All authors applied and evaluated the method on biological samples. R.G. and O.Y. wrote the manuscript, in cooperation with all authors.

Corresponding author

Correspondence to Oscar Yanes.

Ethics declarations

Competing interests

A patent application for the method has been filed by R.G., J.C., and O.Y. (P202030061). G.J.P. serves on the scientific advisory board of Cambridge Isotope Laboratories. The other authors declare no competing interests.

Additional information

Peer review information Peer reviewer reports are available. Nature Methods thanks Tao Huan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Arunima Singh was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Resolution-based isotopic envelope calculation.

a) The MS resolution is inversely proportional to the peak width of the acquired signals. When preprocessing raw MS1 data, a centroidization algorithm performs peak picking on a continuous profile signal (blue), yielding discrete, centroided signals (red). As resolution increases, the minimal distance needed to distinguish two adjacent peaks decreases (d2 < d1). b) In practice, this implies that, when acquiring data at lower resolutions, certain isotope signals are masked by close, more intense signals. c) By calculating a resolution-based parameter d, HERMES can estimate which close isotopologues can be distinguished in the acquired profile MS1 data and are therefore present in the centroided data (see Algorithm 1).
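
As a rough illustration of the idea in panel c, the following Python sketch estimates a minimal resolvable distance d from the instrument resolution and merges theoretical isotopologue peaks that lie closer than d. It is not the paper's Algorithm 1. The relation d = (m/z) / R, the constant resolution across the mass range, the function names, the intensity-weighted merge, and the example peak values are all assumptions made for illustration.

```python
def min_resolvable_distance(mz, resolution):
    """Approximate peak width at a given m/z, assuming d = mz / R."""
    return mz / resolution


def merge_unresolved(peaks, resolution):
    """Merge isotopologue peaks closer than d into a single centroid.

    peaks: list of (mz, relative_intensity) tuples.
    Returns a list of (mz, intensity) tuples sorted by m/z.
    """
    merged = []
    for mz, inten in sorted(peaks):
        d = min_resolvable_distance(mz, resolution)
        if merged and mz - merged[-1][0] < d:
            prev_mz, prev_int = merged[-1]
            total = prev_int + inten
            # Intensity-weighted centroid of the two unresolved signals.
            merged[-1] = ((prev_mz * prev_int + mz * inten) / total, total)
        else:
            merged.append((mz, inten))
    return merged


# Hypothetical envelope: M0 plus two M+1 isotopologues about 6.4 mDa apart.
example_peaks = [(200.0000, 100.0), (200.9970, 0.4), (201.0034, 6.5)]

low_res = merge_unresolved(example_peaks, resolution=20_000)
high_res = merge_unresolved(example_peaks, resolution=120_000)
print(len(low_res), len(high_res))
```

Under these assumptions the two M+1 signals collapse into one centroid at a resolution of 20,000 and stay separate at 120,000, which mirrors the masking effect described in panel b.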

Extended Data Fig. 2 Schematic workflow of the different filtering steps in HERMES.

a) Artificial neural network (ANN) for blank subtraction. b) Adduct and isotopologue grouping according to the similarity of their elution profiles. c) In-source fragment annotation, by using publicly available low-energy MS2 data.

Extended Data Fig. 3 Continuous MS2 acquisition resolves co-eluting ionic species by comparing their fragment elution profile.

a) All fragment ions from continuous MS2 scans are grouped according to their m/z. b) A loose peak-picking algorithm is applied and the resulting peaks are grouped according to their elution profiles, generating a similarity network that is split by a greedy clustering algorithm. c) This grouping yields a curated MS2 spectrum for each co-eluting species (see Algorithm 3). (*) The shaded slice shows the impact of the algorithm on the resulting MS2 spectral quality. The delineated fragments in blue have a different elution pattern from the rest and would contaminate the MS2 spectrum if only one scan were acquired at the top of the peak. The grouping performed by HERMES confidently removes the contaminant ions and separates each group of fragments according to their elution.
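
The grouping in panels b and c can be sketched in Python as follows. This is a simplified stand-in, not the paper's Algorithm 3: the use of cosine similarity between elution profiles, the 0.9 threshold, the seed-selection rule, the function names, and the example intensities are assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def group_fragments(profiles, threshold=0.9):
    """Greedily cluster fragments whose elution profiles are similar.

    profiles: dict mapping fragment m/z -> intensities over the same scans.
    Returns a list of groups, each a sorted list of fragment m/z values.
    """
    mzs = list(profiles)
    # Similarity network: an edge links two fragments above the threshold.
    neighbours = {
        m: {n for n in mzs
            if n != m and cosine(profiles[m], profiles[n]) >= threshold}
        for m in mzs
    }
    unassigned = set(mzs)
    groups = []
    while unassigned:
        # Seed: most unassigned neighbours, ties broken by apex intensity.
        seed = max(
            unassigned,
            key=lambda m: (len(neighbours[m] & unassigned), max(profiles[m])),
        )
        members = {seed} | (neighbours[seed] & unassigned)
        groups.append(sorted(members))
        unassigned -= members
    return groups


# Hypothetical fragment traces from two co-eluting precursors.
example_profiles = {
    85.03: [0, 10, 50, 100, 50, 10, 0],
    127.04: [0, 5, 24, 52, 26, 4, 0],
    99.04: [40, 80, 40, 8, 0, 0, 0],
    141.05: [26, 48, 25, 6, 0, 0, 0],
}
fragment_groups = group_fragments(example_profiles)
print(fragment_groups)
```

Each resulting group would correspond to one curated spectrum. Fragments with a different elution pattern end up in a separate group instead of contaminating the spectrum of the co-eluting species.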

Extended Data Fig. 4 HERMES R Graphical User Interface (GUI).

a) Point-and-click selection of SOI detection parameters, with detailed explanations on their usage and optimal values. b) Visualization of isotopic profiles of different adducts of the same formula. Formulas can be inputted directly or inferred from the name of a selected compound. c) Isotopic fidelity exploration of selected SOIs. d) Visualization of the continuous MS2 deconvolution step. Users can check the fragment ion elution profiles from each inclusion list entry and how they are interconnected in the corresponding profile similarity network.

Extended Data Fig. 5 Discrimination of SOIs based on isotopic fidelity.

a) [M + H]⁺ ion of chloridazon and b) [M + K]⁺ ion of 2-Amino-alpha-carboline overlapping at 0.27 ppm. The arrows indicate the characteristic [³⁷Cl] isotopologue present in chloridazon and the [⁴¹K] isotopologue absent in 2-Amino-alpha-carboline. The absence of characteristic isotopologue signals (Cl, Br, K, etc.) in an intense SOI results in a low isotopic fidelity score and its removal.

Extended Data Fig. 6 Venn-like diagram of the distribution of negative ionization LC/MS1 data points in different steps of the HERMES workflow and XCMS peak-associated points.

a) E. coli and b) human plasma extract. Database: data points that match any m/z from the ionic formula database (including isotopes). SOI: monoisotopic (M0)-annotated data points that are present in an unfiltered SOI list. Inclusion List: data points present in a filtered SOI list (including blank subtraction, isotopic filter and ISF removal steps). Percentages refer to the total number of LC/MS1 data points.

Extended Data Fig. 7 Comparing LC/MS1 annotation performance with CAMERA and CliqueMS.

Positive ionization data. a) and b) E. coli, c) and d) human plasma extract. Percentages refer to the set of data points that were annotated by HERMES and detected as a peak by XCMS (see Methods for parameters used). The isotope annotation overlap (a and c) was high, due primarily to M0 annotations. On the other hand, adduct annotation overlap (b and d) was markedly low (<20% of data points matched the annotation).

Extended Data Fig. 8 ¹³C-enrichment distribution according to the precursor intensity.

a) and b) ¹³C-enriched metabolites (FC and MIRS > 0.5) are mainly associated with abundant ions (intensity >10⁵), while unlabeled precursors (FC and MIRS < 0.5) relate more frequently to low-abundance ions (intensity between 10⁴ and 10⁵). c) ¹³C-labeled precursors in iterative DDA corresponded to highly abundant ions that were also covered by HERMES. However, 56% of labeled low-abundance ions were not covered by the iterative DDA.

Extended Data Fig. 9 Identified inclusion list entries according to the MS1 precursor intensity in negative ionization data.

An inclusion list entry is considered identified if at least one MS2 scan associated with it has a compound hit in the reference MS2 database with either cosine score > 0.8 (in-house database from MassBankEU, MoNA, Riken and NIST14 spectra), or Match > 90 and Confidence > 30 (mzCloud). a) E. coli extract. b) Human plasma extract.
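
The cosine-score criterion above can be illustrated with a minimal Python sketch. The caption does not specify the exact cosine variant, so the greedy peak matching, the 0.01 Da tolerance, the unweighted intensities, the function names, and the example spectra are assumptions. Only the > 0.8 threshold comes from the caption.

```python
import math


def cosine_score(query, reference, tol=0.01):
    """Cosine similarity between two centroided MS2 spectra.

    query, reference: lists of (mz, intensity) tuples.
    Peaks are matched greedily, most intense query peak first,
    within an absolute tolerance of `tol` Da (assumed value).
    """
    used = set()
    dot = 0.0
    for q_mz, q_int in sorted(query, key=lambda p: -p[1]):
        best = None
        for i, (r_mz, r_int) in enumerate(reference):
            if i in used or abs(q_mz - r_mz) > tol:
                continue
            if best is None or r_int > reference[best][1]:
                best = i
        if best is not None:
            used.add(best)
            dot += q_int * reference[best][1]
    nq = math.sqrt(sum(i * i for _, i in query))
    nr = math.sqrt(sum(i * i for _, i in reference))
    return dot / (nq * nr) if nq and nr else 0.0


def is_identified(scores, threshold=0.8):
    """An entry counts as identified if any scan scores above the threshold."""
    return any(s > threshold for s in scores)


# Hypothetical spectra for one inclusion list entry.
library = [(85.03, 100.0), (127.04, 45.0), (145.05, 20.0)]
scan_a = [(85.03, 90.0), (127.04, 50.0), (145.05, 15.0)]
scan_b = [(60.02, 100.0), (99.04, 30.0)]

scores = [cosine_score(scan_a, library), cosine_score(scan_b, library)]
print(is_identified(scores))
```

Because a single matching scan is enough, the entry in this example counts as identified even though the second scan shares no fragments with the library spectrum.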

Extended Data Fig. 10 Injection time effect in spectral quality (35 ms vs 1,500 ms).

Experimental MS2 spectra (black) of a) NADH, b) Biopterin and c) NADPH against library spectra (red). All precursor ions had an intensity below 10⁵. A higher injection time resulted in richer spectra, with more matching fragments against the reference spectra and overall better matching scores.

About this article

Cite this article

Giné, R., Capellades, J., Badia, J.M. et al. HERMES: a molecular-formula-oriented method to target the metabolome. Nat Methods 18, 1370–1376 (2021). https://doi.org/10.1038/s41592-021-01307-z

Received: 11 April 2021
Accepted: 22 September 2021
Published: 01 November 2021
Issue Date: November 2021
DOI: https://doi.org/10.1038/s41592-021-01307-z
