A cheminformatics approach to characterize metabolomes in stable-isotope-labeled organisms

Tsugawa, Hiroshi; Nakabayashi, Ryo; Mori, Tetsuya; Yamada, Yutaka; Takahashi, Mikiko; Rai, Amit; Sugiyama, Ryosuke; Yamamoto, Hiroyuki; Nakaya, Taiki; Yamazaki, Mami; Kooke, Rik; Bac-Molenaar, Johanna A.; Oztolan-Erol, Nihal; Keurentjes, Joost J. B.; Arita, Masanori; Saito, Kazuki

doi:10.1038/s41592-019-0358-2

Brief Communication
Published: 28 March 2019

A cheminformatics approach to characterize metabolomes in stable-isotope-labeled organisms

Hiroshi Tsugawa ORCID: orcid.org/0000-0002-2015-3958^1,2^na1,
Ryo Nakabayashi¹^na1,
Tetsuya Mori¹,
Yutaka Yamada¹,
Mikiko Takahashi¹,
Amit Rai ORCID: orcid.org/0000-0002-5715-8541³,
Ryosuke Sugiyama¹,
Hiroyuki Yamamoto⁴,
Taiki Nakaya³,
Mami Yamazaki³,
Rik Kooke⁵,
Johanna A. Bac-Molenaar⁵,
Nihal Oztolan-Erol⁵,
Joost J. B. Keurentjes ORCID: orcid.org/0000-0001-8918-0711⁵,
Masanori Arita^1,6 &
…
Kazuki Saito ORCID: orcid.org/0000-0001-6310-5342^1,3

Nature Methods volume 16, pages 295–298 (2019)Cite this article

12k Accesses
165 Citations
81 Altmetric
Metrics details

Subjects

A Publisher Correction to this article was published on 16 April 2019

This article has been updated

Abstract

We report a computational approach (implemented in MS-DIAL 3.0; http://prime.psc.riken.jp/) for metabolite structure characterization using fully ¹³C-labeled and non-labeled plants and LC–MS/MS. Our approach facilitates carbon number determination and metabolite classification for unknown molecules. Applying our method to 31 tissues from 12 plant species, we assigned 1,092 structures and 344 formulae to 3,604 carbon-determined metabolite ions, 69 of which were found to represent structures currently not listed in metabolome databases.

Access through your institution

Buy or subscribe

This is a preview of subscription content, access via your institution

Access options

Access through your institution

Buy this article

Purchase on Springer Link
Instant access to full article PDF

Buy now

Prices may be subject to local taxes which are calculated during checkout

**Fig. 1: Cheminformatics strategy for comprehensive metabolite characterization.**

**Fig. 2: Structures characterized with computational MS and human curation.**

Metabolite discovery through global annotation of untargeted metabolomics data

Article 28 October 2021

Li Chen, Wenyun Lu, … Joshua D. Rabinowitz

SIMPEL: using stable isotopes to elucidate dynamics of context specific metabolism

Article Open access 12 February 2024

Shrikaar Kambhampati, Allen H. Hubbard, … Doug K. Allen

Standardized multi-omics of Earth’s microbiomes reveals microbial and metabolite diversity

Article Open access 28 November 2022

Justin P. Shaffer, Louis-Félix Nothias, … the Earth Microbiome Project 500 (EMP500) Consortium

Code availability

All program packages and source codes are freely available at http://prime.psc.riken.jp/.

Data availability

All data resources are freely available at http://prime.psc.riken.jp/. The DOI in Metabolomics Workbench is https://doi.org/10.21228/M8XM40.

Change history

16 April 2019
In the originally published Supplementary Information for this paper, the files presented as Supplementary Tables 3, 4, and 7 were duplicates of Supplementary Tables 5, 6, and 9, respectively. All Supplementary Table files are now correct online.

References

Tsugawa, H. Curr. Opin. Biotechnol. 54, 10–17 (2018).
Article CAS Google Scholar
Harvey, A. L., Edrada-Ebel, R. & Quinn, R. J. Nat. Rev. Drug. Discov. 14, 111–129 (2015).
Article CAS Google Scholar
Banerjee, P. et al. Nucleic Acids Res. 43, D935–D939 (2015).
Article CAS Google Scholar
Wang, M. et al. Nat. Biotechnol. 34, 828–837 (2016).
Article CAS Google Scholar
Mahieu, N. G. & Patti, G. J. Anal. Chem. 89, 10397–10406 (2017).
Article CAS Google Scholar
Giavalisco, P. et al. Plant J. 68, 364–376 (2011).
Article CAS Google Scholar
Chokkathukalam, A., Kim, D. H., Barrett, M. P., Breitling, R. & Creek, D. J. Bioanalysis 6, 511–524 (2014).
Article CAS Google Scholar
Sumner, L. W. et al. Metabolomics. 3, 211–221 (2007).
Article CAS Google Scholar
Tsugawa, H. et al. Anal. Chem. 88, 7946–7958 (2016).
Article CAS Google Scholar
Kim, S.-Y. & Volsky, D. J. BMC Bioinformatics 6, 144 (2005).
Article Google Scholar
Falkner, J. A., Falkner, J. W., Yocum, A. K. & Andrews, P. C. J. Proteome Res. 7, 4614–4622 (2008).
Article CAS Google Scholar
Maciel, E. et al. Chem. Phys. Lipids 174, 1–7 (2013).
Article CAS Google Scholar
Nakabayashi, R. et al. Anal. Chem. 85, 1310–1315 (2013).
Article CAS Google Scholar
Mütsch-Eckner, M., Erdelmeier, C. A., Sticher, O. & Reuter, H. D. J. Nat. Prod. 56, 864–869 (1993).
Article Google Scholar
Kooke, R. et al. Plant Physiol. 170, 2187–2203 (2016).
Article CAS Google Scholar
Krasensky, J. & Jonak, C. J. Exp. Bot. 63, 1593–1608 (2012).
Article CAS Google Scholar
Tsugawa, H., Ikeda, K. & Arita, M. Biochim. Biophys. Acta Mol. Cell. Biol. Lipids 1862, 762–765 (2017).
Article CAS Google Scholar
Warth, B. et al. Anal. Chem. 89, 11505–11513 (2017).
Article CAS Google Scholar
Scheubert, K. et al. Nat. Commun. 8, 1494 (2017).
Article Google Scholar
Markley, J. L. et al. Curr. Opin. Biotechnol. 43, 34–40 (2017).
Article CAS Google Scholar
Saito, K. et al. Plant Cell Rep. 20, 267–271 (2001).
Article CAS Google Scholar
Udomsom, N. et al. Front. Plant Sci. 7, 1–14 (2016).
Article Google Scholar
Bac-Molenaar, J. A., Vreugdenhil, D., Granier, C. & Keurentjes, J. J. B. J. Exp. Bot. 66, 5567–5580 (2015).
Article CAS Google Scholar
Bac-Molenaar, J. A., Granier, C., Keurentjes, J. J. B. & Vreugdenhil, D. Plant Cell Environ. 39, 88–102 (2016).
Article CAS Google Scholar
Begley, P. et al. Anal. Chem. 81, 7038–7046 (2009).
Article CAS Google Scholar
Huang, Z. et al. J. Med. Chem. 61, 1833–1844 (2018).
Article CAS Google Scholar
Han, L. K. & Saito, M. Japan Patent JP5507437B2 (2010).
Tsugawa, H. et al. Nat. Methods 12, 523–526 (2015).
Article CAS Google Scholar
Domingo-Almenara, X., Montenegro-Burke, J. R., Benton, H. P. & Siuzdak, G. Anal. Chem. 90, 480–489 (2018).
Article CAS Google Scholar
Djoumbou Feunang, Y. et al. J. Cheminform. 8, 1–20 (2016).
Article Google Scholar
Guijas, C. et al. Anal. Chem. 90, 3156–3164 (2018).
Article CAS Google Scholar
Horai, H. et al. J. Mass. Spectrom. 45, 703–714 (2010).
Article CAS Google Scholar
Sawada, Y. et al. Phytochemistry 82, 38–45 (2012).
Article CAS Google Scholar
Qiu, F., Fine, D. D., Wherritt, D. J., Lei, Z. & Sumner, L. W. Anal. Chem. 88, 11373–11383 (2016).
Article CAS Google Scholar
Lai, Z. et al. Nat. Methods 15, 53–56 (2018).
Article CAS Google Scholar
Subramanian, A. et al. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).
Article CAS Google Scholar
Morreel, K. et al. Plant Cell 26, 929–945 (2014).
Article CAS Google Scholar

Download references

Acknowledgements

This work was partially supported by KAKENHI (JSPS 15K01812 (H.T.), 15H05897 (M.A.), 16H06454 (M.Y.), 17H03621 (M.A.), 18H02432 (H.T.), 18K19155 (H.T.); MHLW 30190401 (M.A.)), MAFF Science and Technology Research Promotion Program for Agriculture, Forestry, Fisheries and Food Industry (R.N.), JST the Strategic International Research Cooperative Program (R.N.), JST National Bioscience Database Center (M.A.), JST the Strategic International Collaborative Research Program (K.S.), Chiba University the GP Program (K.S.) and the Japan Advanced Plant Science Network (K.S.). We thank the HMDB, LipidMAPS and PMN consortiums for providing the SDF files; ChemAxon for a free research license for the Marvin and JChem cheminformatics tools; K. Yonekura-Sakakibara, Y. Higashi, Y. Sawada and K. Mochida for discussions on plant metabolomics; M. Yokota Hirai for providing the place for compound synthesis (RIKEN CSRS); S. Tsugawa for data curation (RIKEN IMS); and U. Petralia for English editing. This work was also supported by the Centre of BioSystems Genomics (no. AA3-WU-PL) (R.K.), the ‘Learning from Nature’ program funded by the Dutch Technology Foundation (STW grant number 10996), which is part of the Netherlands Organization for Scientific Research (J.A.B.-M.), the European Plant Phenotyping Network (EPPN, grant agreement no. 284443) funded by the FP7 Research Infrastructures Program of the European Union (J.A.B.-M.) and the NUE-CROPS project (grant agreement no. 222645) funded by the FP7-CP-IP Research Infrastructures Program of the European Union (N.O.-E.).

Author information

These authors contributed equally: Hiroshi Tsugawa, Ryo Nakabayashi.

Authors and Affiliations

RIKEN Center for Sustainable Resource Science, Yokohama, Japan
Hiroshi Tsugawa, Ryo Nakabayashi, Tetsuya Mori, Yutaka Yamada, Mikiko Takahashi, Ryosuke Sugiyama, Masanori Arita & Kazuki Saito
RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Hiroshi Tsugawa
Graduate School of Pharmaceutical Sciences, Chiba University, Chiba, Japan
Amit Rai, Taiki Nakaya, Mami Yamazaki & Kazuki Saito
Human Metabolome Technologies, Inc, Tsuruoka, Japan
Hiroyuki Yamamoto
Laboratory of Genetics, Wageningen University & Research, Wageningen, the Netherlands
Rik Kooke, Johanna A. Bac-Molenaar, Nihal Oztolan-Erol & Joost J. B. Keurentjes
National Institute of Genetics, Mishima, Japan
Masanori Arita

Authors

Hiroshi Tsugawa
View author publications
You can also search for this author in PubMed Google Scholar
Ryo Nakabayashi
View author publications
You can also search for this author in PubMed Google Scholar
Tetsuya Mori
View author publications
You can also search for this author in PubMed Google Scholar
Yutaka Yamada
View author publications
You can also search for this author in PubMed Google Scholar
Mikiko Takahashi
View author publications
You can also search for this author in PubMed Google Scholar
Amit Rai
View author publications
You can also search for this author in PubMed Google Scholar
Ryosuke Sugiyama
View author publications
You can also search for this author in PubMed Google Scholar
Hiroyuki Yamamoto
View author publications
You can also search for this author in PubMed Google Scholar
Taiki Nakaya
View author publications
You can also search for this author in PubMed Google Scholar
Mami Yamazaki
View author publications
You can also search for this author in PubMed Google Scholar
Rik Kooke
View author publications
You can also search for this author in PubMed Google Scholar
Johanna A. Bac-Molenaar
View author publications
You can also search for this author in PubMed Google Scholar
Nihal Oztolan-Erol
View author publications
You can also search for this author in PubMed Google Scholar
Joost J. B. Keurentjes
View author publications
You can also search for this author in PubMed Google Scholar
Masanori Arita
View author publications
You can also search for this author in PubMed Google Scholar
Kazuki Saito
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

H.T., R.N., M.A. and K.S. designed the research. H.T. developed all computational programs for metabolome annotations. R.N. and T.M. analyzed the biological samples. R.S. performed the synthesis of compounds. A.R., T.N. and M.Y. prepared the O. pumila material. H.T. suggested FSEA, and H.Y. provided fruitful suggestions. Y.Y. created the PlaSMA website. H.T., R.N., T.M., A.R. and M.T. contributed to the curation of structure elucidations. R.K., J.A.B.-M., N.O.-E. and J.J.B.K. shared the seeds of A. thaliana accessions and performed the phenotyping experiments. H.T. wrote the manuscript and H.T., R.N., A.R., J.J.B.K., M.A. and K.S. thoroughly discussed this project and helped to improve the manuscript.

Corresponding authors

Correspondence to Hiroshi Tsugawa or Kazuki Saito.

Ethics declarations

Competing interests

H.Y., who provided the suggestions for FSEA, is a researcher at Human Metabolome Technologies Inc.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Summary of plant specialized metabolome (PlaSMA) database.

a, The diagram presents coverage in mass spectral databases when CHNOSP elements are considered. b, Principal component analysis (PCA) was performed with PubChem fingerprints for the metabolite structures to clarify the diversity of chemical properties in PlaSMA metabolites. The x- and y-axes show the score values of the first and second principal components. The sizes of spectral records in PlaSMA, MassBank, ReSpect, GNPS, and MetaboBASE are 406, 1,768, 642, 2,501, and 258, respectively, where unique structures were defined by the first layer of InChIKey. c, The retention time and the exact mass of PlaSMA metabolites were mapped. Colors indicate the metabolite class defined by the superclass of ClassyFire.

Supplementary Figure 2 Evaluation of labeling rates in 12 plant species.

a, Labeling rates were examined by a total of 954 metabolite features assigned as levels 2.1, 2.2 and 3.1. The rate was calculated by the equation ¹³C intensity/(¹²C + ¹³C intensities), where the ion abundances were confirmed in plant tissue with maximum intensity among the samples described in Supplementary Table 6. The bar chart describes the average and standard deviation for labeling rates of metabolites in each plant tissue. b, The relationship of m/z and ion abundance with the labeling rate was examined. The x- and y-axes show the m/z value and its labeling rate; the bubble size denotes the scale of ion abundance.

Supplementary Figure 3 Evaluation of MS-FINDER structure elucidations.

MS-FINDER accuracies were evaluated for the formula, the metabolite class, and the structure predictions in publicly available databases containing 4,430 and 1,741 high-quality tandem mass spectral (HQ-MS/MS) records, respectively, in positive and negative ion mode. The search for molecular structures yielded ontology predictions for the second panel; FSEA returned ontology predictions without requiring information from the structural database. The x- and y-axes show the ranking of correct structures and the accuracy of the characterized features.

Supplementary Figure 4 Methodology for the creation of the fragment database for FSEA and molecular fingerprints.

High-quality mass spectral records were extracted on the basis of ion abundance and mass accuracy criteria. Duplicate records for the same metabolite were merged by fragment formula consistency; the ion abundance was simply summed. Substructures were assigned for the product ions and neutral losses; fragment ion structures were managed using the neutralized form with the rearranged hydrogen value. All structures and substructures were managed with the first layer of InChIKey, followed by conversion into the ontology term. Fragments were recorded as ‘frequently observed’ when ion or neutral loss exceeded 0.2% among the merged MS/MS data. Frequently observed fragments were used in the set subjected to FSEA; the query was managed using m/z or the neutral loss formula, and InChIKey and ontology candidates.

Supplementary Figure 5 FSEA concept and methodology.

a, The concepts underlying FSEA and gene set enrichment analysis (GSEA) are compared using phosphatidylcholine (PC). When significant characteristic fragments used to recommend the PC class are detected, the unknown MS/MS spectrum is recommended as a PC class; the estimated P value is derived with the Fisher exact test (one-sided). b, The significance of peak features is currently defined as a more than 5% relative abundance. The peaks are annotated using the fragment ontology database; formula and ontology candidate information is used for querying m/z and neutral losses. A 2 × 2 cross-tabulation table is prepared with the aid of annotated information; insignificant peak features are defined using several methods whose principles involve different specificities and precisions. Finally, using the table, P values are estimated with the Fisher exact test (one-sided).

Supplementary Figure 6 Further explanation of how to annotate C₂₀H₃₃N₃O₁₃S as N-fructosyl S-(2-carboxypropyl) glutathione.

a, Neutral losses of 90 and 120 Da are frequently observed in the spectrum of C-linked flavonoids, whereas the loss of 162 Da is observed in the spectrum of O-linked flavonoids. b, Neutral losses of 90, 120, and 162 Da are commonly observed in the mass fragmentation of N-hexosyl compounds; such losses were also found in the MS/MS spectrum of C₂₀H₃₃N₃O₁₃S. c, The FSEA algorithm recommended the metabolite classes as ‘peptides’ (P value = 1.86 × 10^–5, one-sided) and ‘amino acid derivatives’ (P value = 0.00276, one-sided). d, The spectrum of S-(2-carboxylpropyl)glutathione, which was linked to C₂₀H₃₃N₃O₁₃S by the MS/MS similarity in the integrated molecular network, is shown. The formula and InChIKey were assigned to product ions that were also observed in the spectrum of C₂₀H₃₃N₃O₁₃S. e, The discovery of N-fructosyl amino acids isolated from the bulbs of the Allium genus has been reported on the literature. On the basis of such findings, the chemical structure of C₂₀H₃₃N₃O₁₃S was predicted as N-fructosyl S-(2-carboxypropyl) glutathione.

Supplementary Figure 7 Total synthesis of N-fructosyl S-(2-carboxypropyl) glutathione, and confirmation by LC-MS/MS.

The results were confirmed three times independently.

Supplementary Figure 8 Metabolome analysis of Arabidopsis thaliana accessions using the enriched spectral database.

a, A. thaliana accessions (n = 25) were mapped by the latitude and longitude of their collection site. The color shows the well-defined post-germination FT trait. b, OPLS-DA score plots were prepared to classify early/late FT phenotypes with the shoot and root metabolomes; 131 and 185 metabolites were annotated using the accumulated spectral library we curated. c, The plot presents details of shoot and root metabolomes and their correlation with various traits. Circuit A shows the metabolites were categorized into 11 classes; circuit B is the heat map of the metabolite profiles in 25 accessions. The cycle size in circuit C indicates the fold change between maximum and minimum ion abundances among the accessions. The –log₁₀ value of the Q-value, adjusted with the Benjamini–Hochberg method in the Mann–Whitney U-test (two-sided) for early/late FT traits, is presented as a bar chart in circuit D. The estimated values for other traits, including the phenotypes elicited by nitrogen depletion and drought stress, and the morphological traits of rosette branching and plant height, obtained with the same method, are presented in line charts in circuit E. The associated metabolites, defined by a greater than 0.85 correlation coefficient, are linked in circuit F. Asterisks indicate that the metabolites were annotated as level 2.2 with the assistance of the workflow presented in this study.

Supplementary information

Supplementary Information

Supplementary Figures 1–8, Supplementary Note 1

Reporting Summary

Supplementary Table 1

Summary of our ten-step cheminformatics methodology.

Supplementary Table 2

Details on PlaSMA standard chemicals.

Supplementary Table 3

Details of the PubChem fingerprint used to describe Supplementary Fig. 1b.

Supplementary Table 4

Details for fragment database and FSEA sets.

Supplementary Table 5

Summary of metabolite class information to confirm biological significance.

Supplementary Table 6

Summary of 2,382 ESI(+) and 1,731 ESI(–) metabolite features including m/z, the retention time (min), MS/MS spectrum, and ion abundances among plants.

Supplementary Table 7

Summary of metabolome analysis for Arabidopsis thaliana accessions.

Supplementary Table 8

Detail of extracted SNPs relevant to the flowering time trait.

Supplementary Table 9

The catalog and lot numbers for 31 tissues in 12 plants.

Supplementary Table 10

Detail of extracted SNPs relevant to the methylthiobutyl glucosinolate profile.

Supplementary Table 11

Details for 1,475 molecular fragment fingerprints used in MS-FINDER.

Supplementary Table 12

Detail for the biotransformation dictionary.

Source data

Source Data Fig. 2

Rights and permissions

Reprints and permissions

About this article

Cite this article

Tsugawa, H., Nakabayashi, R., Mori, T. et al. A cheminformatics approach to characterize metabolomes in stable-isotope-labeled organisms. Nat Methods 16, 295–298 (2019). https://doi.org/10.1038/s41592-019-0358-2

Download citation

Received: 13 August 2018
Accepted: 19 February 2019
Published: 28 March 2019
Issue Date: April 2019
DOI: https://doi.org/10.1038/s41592-019-0358-2

This article is cited by

Neighbour-induced changes in root exudation patterns of buckwheat results in altered root architecture of redroot pigweed
- Çağla Görkem Eroğlu
- Alexandra A. Bennett
- Aurélie Gfeller
Scientific Reports (2024)
The phosphorylated pathway of serine biosynthesis affects sperm, embryo, and sporophyte development, and metabolism in Marchantia polymorpha
- Mengyao Wang
- Hiromitsu Tabeta
- Masami Yokota Hirai
Communications Biology (2024)
Ultraviolet exposure regulates skin metabolome based on the microbiome
- Vijaykumar Patra
- Natalie Bordag
- Peter Wolf
Scientific Reports (2023)
Spatial Metabolomics Reveals the Multifaceted Nature of Lamprey Buccal Gland and Its Diverse Mechanisms for Blood-Feeding
- Meng Gou
- Xuyuan Duan
- Yonghui Dong
Communications Biology (2023)
Phagocytosis in the retina promotes local insulin production in the eye
- J. Iker Etchegaray
- Shannon Kelley
- Kodi S. Ravichandran
Nature Metabolism (2023)