Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online

Forsberg, Erica M; Huan, Tao; Rinehart, Duane; Benton, H Paul; Warth, Benedikt; Hilmers, Brian; Siuzdak, Gary

doi:10.1038/nprot.2017.151

Protocol
Published: 01 March 2018

Erica M Forsberg¹,², Tao Huan¹, Duane Rinehart¹, H Paul Benton¹, Benedikt Warth¹,³, Brian Hilmers¹ & Gary Siuzdak¹ (ORCID: orcid.org/0000-0002-4749-0014)

Nature Protocols volume 13, pages 633–651 (2018)

Abstract

Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)–mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5–10 min, depending on user experience; data processing typically takes 1–3 h, and data analysis takes ∼30 min.

**Figure 1: Upload of mass spectrometry data (Steps 2–5).**

**Figure 2: Predictive pathway analysis parameter settings (Step 14).**

**Figure 3: Predictive metabolic pathway results (Steps 20–24).**

**Figure 4: Overlapping metabolite information (Steps 24–29).**

**Figure 5: Predictive metabolites results (Steps 36–40).**

**Figure 6: Pathway cloud plot (Steps 42–46).**

**Figure 7: Multi-omics integration (Steps 47–50).**

**Figure 8: Multi-omics results (Steps 51–59).**

Acknowledgements

The authors thank the following for funding assistance: Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA), a Scientific Focus Area Program at Lawrence Berkeley National Laboratory for the US Department of Energy, Office of Science, Office of Biological and Environmental Research under contract number DE-AC02-05CH11231 (G.S.); and the National Institutes of Health (grants R01 GM114368 (G.S.) and PO1 A1043376-02S1 (G.S.)).

Author information

Authors and Affiliations

Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA
Erica M Forsberg, Tao Huan, Duane Rinehart, H Paul Benton, Benedikt Warth, Brian Hilmers & Gary Siuzdak
Department of Chemistry and Biochemistry, San Diego State University, San Diego, California, USA
Erica M Forsberg
Department of Food Chemistry and Toxicology, University of Vienna, Vienna, Austria
Benedikt Warth

Contributions

E.M.F. and T.H. contributed equally to writing the manuscript. E.M.F., T.H., D.R., H.P.B., B.H. and G.S. contributed to platform development, and H.P.B., B.W. and G.S. contributed to manuscript writing.

Corresponding author

Correspondence to Gary Siuzdak.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Methods and Supplementary Table 1. (PDF 852 kb)

Supplementary Data 1

Demonstration transcriptomics data set. (ZIP 5 kb)

Supplementary Data 2

Significant protein data set. (ZIP 0 kb)

About this article

Cite this article

Forsberg, E., Huan, T., Rinehart, D. et al. Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online. Nat Protoc 13, 633–651 (2018). https://doi.org/10.1038/nprot.2017.151

Issue Date: April 2018
