HERMES: a molecular-formula-oriented method to target the metabolome

Abstract

Comprehensive metabolome analyses are essential for biomedical, environmental, and biotechnological research. However, current MS1- and MS2-based acquisition and data analysis strategies in untargeted metabolomics result in low identification rates of metabolites. Here we present HERMES, a molecular-formula-oriented and peak-detection-free method that uses raw LC/MS1 information to optimize MS2 acquisition. Investigating environmental water, Escherichia coli, and human plasma extracts with HERMES, we achieved an increased biological specificity of MS2 scans, leading to improved mass spectral similarity scoring and identification rates when compared with a state-of-the-art data-dependent acquisition (DDA) approach. Thus, HERMES improves sensitivity, selectivity, and annotation of metabolites. HERMES is available as an R package with a user-friendly graphical interface for data analysis and visualization.

Fig. 1: The HERMES workflow.
Fig. 2: Venn-like diagram of the distribution of LC/MS1 data points in different steps of the HERMES workflow and XCMS peak-associated points.
Fig. 3: Distribution of MS2 scans acquired by HERMES and iterative DDA.
Fig. 4: 13C-enrichment analysis in the labeled E. coli sample.
Fig. 5: Identified inclusion list entries according to the MS1 precursor intensity.

Data availability

Input mzML/mzXML mass spectrometry data files, molecular formula databases, and an RMarkdown script are available at Zenodo with accession number 4985839.

Code availability

The source code of RHermes is offered to the public as a freely accessible software package under the GNU GPL, version 3 license, and is available at https://github.com/RogerGinBer/RHermes and at Zenodo with accession number 5504163.

Acknowledgements

We gratefully acknowledge financial support from the Ministerio de Educación y Formación Profesional (Spanish Government) to R.G. (2020-COLAB-00552). O.Y. was supported by Ministerio de Economía y Competitividad (MINECO) (BFU2014-57466-P), Spanish Biomedical Research Centre in Diabetes and Associated Metabolic Disorders (CIBERDEM), an initiative of Instituto de Investigación Carlos III (ISCIII), and the European Union’s Horizon 2020 program (MSCA-ITN-2015; 675610). We thank members of the Mil@b for helpful comments.

Author information

Contributions

R.G. and O.Y. designed the research. R.G., J.C., J.M.B., T.A., M.V., and O.Y. developed the computational method. D.V. and M.S.-H. performed LC–MS and MS2 experiments. All authors applied and evaluated the method on biological samples. R.G. and O.Y. wrote the manuscript, in cooperation with all authors.

Corresponding author

Correspondence to Oscar Yanes.

Ethics declarations

Competing interests

A patent application for the method has been filed by R.G., J.C., and O.Y. (P202030061). G.J.P. serves on the scientific advisory board of Cambridge Isotope Laboratories. The other authors declare no competing interests.

Additional information

Peer review information Peer reviewer reports are available. Nature Methods thanks Tao Huan and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Arunima Singh was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Resolution-based isotopic envelope calculation.

a) The MS resolution is inversely proportional to the peak width of the acquired signals. When preprocessing raw MS1 data, a centroidization algorithm performs peak picking on a continuous profile signal (blue), yielding discrete, centroided signals (red). As resolution increases, the minimal distance needed to distinguish two adjacent peaks decreases (d2 < d1). b) In practice, this implies that, when acquiring data at lower resolutions, certain isotope signals are masked by close, more intense signals. c) By calculating a resolution-based parameter d, HERMES can estimate which close isotopologues can be distinguished in the acquired profile MS1 data and are therefore present in the centroided data (see Algorithm 1).
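The relationship between resolution and the minimal distance d can be illustrated with a small Python sketch. It is not the Algorithm 1 referenced in the caption: it assumes resolving power is defined as R = m/Δm and is constant across m/z, and it models masking by merging unresolved signals into one intensity-weighted centroid. The function names and the example peak list are hypothetical.

```python
def min_resolvable_distance(mz, resolution):
    """Approximate minimal m/z distance d between two distinguishable peaks.

    Assumption: resolving power R = m / delta_m, taken as constant over m/z.
    """
    return mz / resolution


def merge_unresolved(peaks, resolution):
    """Merge (mz, intensity) peaks that lie closer than d.

    Merged peaks become one intensity-weighted centroid, which is a
    simplification of what centroiding does to overlapping profile signals.
    """
    merged = []
    for mz, intensity in sorted(peaks):
        if merged and mz - merged[-1][0] < min_resolvable_distance(mz, resolution):
            prev_mz, prev_int = merged[-1]
            total = prev_int + intensity
            merged[-1] = ((prev_mz * prev_int + mz * intensity) / total, total)
        else:
            merged.append((mz, intensity))
    return merged


# Illustrative M+1 isotopologues of a hypothetical ion with M0 at m/z 200:
# the 13C and 15N mass shifts differ by about 0.0063 Da.
peaks = [(201.0033548, 10.0), (200.9970349, 1.0)]

low = merge_unresolved(peaks, resolution=30_000)    # d is about 0.0067: merged
high = merge_unresolved(peaks, resolution=120_000)  # d is about 0.0017: resolved
print(len(low), len(high))
```

In this toy case the weak 15N signal disappears into the 13C signal at the lower resolution, so only the higher-resolution setting would justify expecting both isotopologues in the centroided data.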

Extended Data Fig. 2 Schematic workflow of the different filtering steps in HERMES.

a) Artificial neural network (ANN) for blank subtraction. b) Adduct and isotopologue grouping according to the similarity of their elution profiles. c) In-source fragment annotation, by using publicly available low-energy MS2 data.

Extended Data Fig. 3 Continuous MS2 acquisition resolves co-eluting ionic species by comparing their fragment elution profile.

a) All fragment ions from continuous MS2 scans are grouped according to their m/z. b) A loose peak-picking algorithm is applied and the resulting peaks are grouped according to their elution profiles, generating a similarity network that is split by a greedy clustering algorithm. c) This grouping yields a curated MS2 spectrum for each coeluting species (see Algorithm 3). (*) The shaded slice shows the impact of the algorithm on the resulting MS2 spectral quality. The delineated fragments in blue have a different elution pattern from the rest and would contaminate the MS2 spectrum if only one scan were acquired at the top of the peak. The grouping performed by HERMES confidently removes the contaminant ions and separates each group of fragments according to their elution.
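The idea of separating coeluting species by fragment elution profile can be sketched as follows. This is a simplified stand-in, not the Algorithm 3 referenced in the caption: it assumes cosine similarity between intensity traces as the profile similarity, a fixed threshold of 0.9, and a seed-based greedy grouping. All names and intensity values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equally long intensity traces."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_groups(profiles, threshold=0.9):
    """Greedily group fragments whose elution profiles resemble a seed.

    The most intense unassigned fragment seeds each group; every remaining
    fragment with similarity >= threshold to the seed joins that group.
    """
    unassigned = sorted(profiles, key=lambda k: sum(profiles[k]), reverse=True)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        group, rest = [seed], []
        for name in unassigned:
            if cosine(profiles[seed], profiles[name]) >= threshold:
                group.append(name)
            else:
                rest.append(name)
        unassigned = rest
        groups.append(group)
    return groups


# Toy fragment traces over eight consecutive MS2 scans (made-up values):
# f1/f2 peak at scan 4, f3/f4 peak at scan 6.
profiles = {
    "f1": [0, 10, 50, 100, 50, 10, 0, 0],
    "f2": [0, 5, 25, 50, 25, 5, 0, 0],
    "f3": [0, 0, 0, 10, 40, 80, 40, 10],
    "f4": [0, 0, 0, 5, 20, 40, 20, 5],
}
groups = greedy_groups(profiles)
print(groups)
```

Each resulting group would correspond to one curated MS2 spectrum. A single scan taken at scan 4 would have mixed f3 and f4 into the spectrum of the first species.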

Extended Data Fig. 4 HERMES R Graphical User Interface (GUI).

a) Point-and-click selection of SOI detection parameters, with detailed explanations on their usage and optimal values. b) Visualization of isotopic profiles of different adducts of the same formula. Formulas can be inputted directly or inferred from the name of a selected compound. c) Isotopic fidelity exploration of selected SOIs. d) Visualization of the continuous MS2 deconvolution step. Users can check the fragment ion elution profiles from each inclusion list entry and how they are interconnected in the corresponding profile similarity network.

Extended Data Fig. 5 Discrimination of SOIs based on isotopic fidelity.

a) [M + H]+ ion of chloridazon and b) [M + K]+ ion of 2-Amino-alpha-carboline overlapping at 0.27 ppm. The arrows indicate the characteristic [37Cl] isotopologue present in chloridazon and the [41K] isotopologue absent in 2-Amino-alpha-carboline. The absence of characteristic isotopologue signals (Cl, Br, K, etc.) in an intense SOI results in a low isotopic fidelity score and its removal.
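A minimal Python sketch of such a characteristic-isotopologue check is given below. It is not the isotopic fidelity score used by HERMES: it assumes singly charged ions, a 5 ppm m/z tolerance, and a 50% relative tolerance on the expected isotopologue-to-M0 intensity ratio, and the spectra are made-up values rather than measured data. The isotope mass shifts and natural abundances are standard values.

```python
# Mass shifts (Da) and natural-abundance ratios for two characteristic isotopes.
CL_SHIFT = 36.96590259 - 34.96885268   # 37Cl - 35Cl
CL_RATIO = 0.2424 / 0.7576             # about 0.32 per Cl atom
K_SHIFT = 40.96182526 - 38.96370649    # 41K - 39K
K_RATIO = 0.067302 / 0.932581          # about 0.072 per K atom


def ppm_error(observed, theoretical):
    """Signed m/z error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def has_isotopologue(spectrum, m0_mz, shift, expected_ratio,
                     ppm_tol=5.0, ratio_tol=0.5):
    """Check whether a characteristic isotopologue accompanies the M0 signal.

    spectrum: list of (mz, intensity); assumes a singly charged ion.
    """
    m0_int = max((i for mz, i in spectrum
                  if abs(ppm_error(mz, m0_mz)) <= ppm_tol), default=0.0)
    if m0_int == 0.0:
        return False
    target = m0_mz + shift
    for mz, intensity in spectrum:
        if abs(ppm_error(mz, target)) <= ppm_tol:
            ratio = intensity / m0_int
            if abs(ratio - expected_ratio) <= ratio_tol * expected_ratio:
                return True
    return False


# Illustrative spectra sharing the same M0 m/z (values are hypothetical).
m0 = 222.0429
with_cl = [(m0, 1.0e6), (m0 + 1.99705, 3.2e5)]
without_k = [(m0, 1.0e6)]

cl_ok = has_isotopologue(with_cl, m0, CL_SHIFT, CL_RATIO)
k_ok = has_isotopologue(without_k, m0, K_SHIFT, K_RATIO)
print(cl_ok, k_ok)
```

Under these assumptions the chlorinated annotation is supported, while the potassium-adduct annotation lacks its expected isotopologue and would be a candidate for removal.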

Extended Data Fig. 6 Venn-like diagram of the distribution of negative ionization LC/MS1 data points in different steps of the HERMES workflow and XCMS peak-associated points.

a) E. coli and b) human plasma extract. Database: data points that match any m/z from the ionic formula database (including isotopes). SOI: monoisotopic (M0)-annotated data points that are present in an unfiltered SOI list. Inclusion List: data points present in a filtered SOI list (including blank subtraction, isotopic filter and ISF removal steps). Percentages refer to the total number of LC/MS1 data points.
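The "Database" category above, raw data points that match an m/z from the ionic formula database, can be illustrated with a short Python sketch. It assumes a 3 ppm tolerance and a database held as a list sorted by m/z. The function name, the tolerance, and the example data points are hypothetical and are not taken from the HERMES implementation.

```python
from bisect import bisect_left, bisect_right


def annotate_points(points, ionic_db, ppm=3.0):
    """Match raw LC/MS1 data points against ionic formula m/z values.

    points:   list of (rt, mz, intensity) tuples
    ionic_db: list of (mz, label) tuples sorted by mz
    Returns one (rt, mz, intensity, label) tuple per match.
    """
    db_mz = [mz for mz, _ in ionic_db]
    hits = []
    for rt, mz, intensity in points:
        delta = mz * ppm * 1e-6
        lo = bisect_left(db_mz, mz - delta)
        hi = bisect_right(db_mz, mz + delta)
        for _, label in ionic_db[lo:hi]:
            hits.append((rt, mz, intensity, label))
    return hits


# Two ionic formulas derived from C6H12O6 (theoretical m/z, sorted).
ionic_db = [
    (181.070666, "C6H12O6 [M+H]+"),
    (203.052608, "C6H12O6 [M+Na]+"),
]
# Made-up data points: (retention time in s, m/z, intensity).
points = [
    (60.0, 181.0708, 1.0e5),
    (60.0, 190.0000, 1.0e4),
    (61.0, 203.0526, 5.0e4),
]
hits = annotate_points(points, ionic_db)
matched_fraction = len({(rt, mz) for rt, mz, _, _ in hits}) / len(points)
print(hits, matched_fraction)
```

The binary search keeps the lookup cost logarithmic in the database size per data point, which matters when every raw data point is queried rather than a reduced list of detected peaks.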

Extended Data Fig. 7 Comparing LC/MS1 annotation performance with CAMERA and CliqueMS.

Positive ionization data. a) and b) E. coli, c) and d) human plasma extract. Percentages refer to the set of data points that were annotated by HERMES and detected as a peak by XCMS (see Methods for parameters used). The isotope annotation overlap (a and c) was high, due primarily to M0 annotations. On the other hand, adduct annotation overlap (b and d) was markedly low (<20% of data points matched the annotation).

Extended Data Fig. 8 13C-enrichment distribution according to the precursor intensity.

a) and b) 13C-enriched metabolites (FC and MIRS > 0.5) are mainly associated with abundant ions (intensity > 10^5), while unlabeled precursors (FC and MIRS < 0.5) relate more frequently to low abundant ions (intensity between 10^4 and 10^5). c) 13C-labeled precursors in iterative DDA corresponded to highly abundant ions that were also covered by HERMES. However, 56% of labelled low abundant ions were not covered by the iterative DDA.

Extended Data Fig. 9 Identified inclusion list entries according to the MS1 precursor intensity in negative ionization data.

An inclusion list entry is considered identified if at least one MS2 scan associated with it has a compound hit in the reference MS2 database with either cosine score > 0.8 (in-house database from MassBankEU, MoNA, Riken and NIST14 spectra), or Match > 90 and Confidence > 30 (mzCloud). a) E. coli extract. b) Human plasma extract.
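The cosine-score criterion can be illustrated with a simplified Python sketch. It assumes fragments are matched one-to-one within an absolute tolerance of 0.01 m/z and uses unweighted intensities. The scoring details of the in-house database search are not given here, so the function, tolerance, and spectra below are hypothetical.

```python
import math


def spectral_cosine(query, reference, tol=0.01):
    """Cosine score between two centroided spectra, lists of (mz, intensity).

    Fragments are paired greedily, largest intensity product first, and
    each fragment is used at most once.
    """
    pairs = sorted(
        ((q_int * r_int, qi, ri)
         for qi, (q_mz, q_int) in enumerate(query)
         for ri, (r_mz, r_int) in enumerate(reference)
         if abs(q_mz - r_mz) <= tol),
        reverse=True,
    )
    used_q, used_r = set(), set()
    dot = 0.0
    for product, qi, ri in pairs:
        if qi in used_q or ri in used_r:
            continue
        used_q.add(qi)
        used_r.add(ri)
        dot += product
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_r = math.sqrt(sum(i * i for _, i in reference))
    return dot / (norm_q * norm_r) if norm_q and norm_r else 0.0


def is_identified(scans, library, threshold=0.8):
    """An entry counts as identified if any scan exceeds the threshold."""
    return any(spectral_cosine(scan, ref) > threshold
               for scan in scans for ref in library)


# Made-up experimental scan and reference spectrum.
scan = [(85.03, 30.0), (127.04, 100.0), (145.05, 60.0)]
reference = [(85.03, 25.0), (127.04, 100.0), (145.05, 70.0), (163.06, 10.0)]
unrelated = [(50.00, 100.0), (60.00, 40.0)]

score = spectral_cosine(scan, reference)
identified = is_identified([scan], [unrelated, reference])
print(round(score, 3), identified)
```

The `any` in `is_identified` mirrors the "at least one MS2 scan" wording of the caption: a single scan with a library hit above the threshold is enough for the inclusion list entry to count as identified.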

Extended Data Fig. 10 Injection time effect in spectral quality (35 ms vs 1,500 ms).

Experimental MS2 spectra (black) of a) NADH, b) Biopterin and c) NADPH against library spectra (red). All precursor ions had an intensity below 10^5. A higher injection time resulted in richer spectra, with more matching fragments against the reference spectra and overall better matching scores.

Supplementary information

Supplementary Information

Supplementary Figures 1–4, Supplementary Tables 1 and 2, and Supplementary Algorithms 1–4

Reporting Summary

Peer Review Information

About this article

Cite this article

Giné, R., Capellades, J., Badia, J.M. et al. HERMES: a molecular-formula-oriented method to target the metabolome. Nat Methods 18, 1370–1376 (2021). https://doi.org/10.1038/s41592-021-01307-z

  • DOI: https://doi.org/10.1038/s41592-021-01307-z
