  • Protocol

Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online

Abstract

Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)–mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5–10 min, depending on user experience; data processing typically takes 1–3 h, and data analysis takes 30 min.
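As a rough illustration of what scoring a pathway for dysregulation can involve, the sketch below computes a one-sided hypergeometric (over-representation) p-value for a single pathway. This is a generic enrichment test chosen for illustration and is not claimed to be the algorithm XCMS Online uses; the function name, its parameters, and the example counts are all hypothetical.

```python
from math import comb  # requires Python 3.8+


def pathway_enrichment_p(total_metabolites, pathway_size, hits_total, hits_in_pathway):
    """One-sided hypergeometric p-value, P(X >= hits_in_pathway).

    total_metabolites: size of the background metabolite set (assumed)
    pathway_size:      number of background metabolites in the pathway
    hits_total:        number of significantly changed metabolites overall
    hits_in_pathway:   number of those that fall in the pathway
    """
    upper = min(pathway_size, hits_total)
    denominator = comb(total_metabolites, hits_total)
    # Sum the probability of observing hits_in_pathway or more hits by chance.
    tail = sum(
        comb(pathway_size, i) * comb(total_metabolites - pathway_size, hits_total - i)
        for i in range(hits_in_pathway, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical counts: 1,000 background metabolites, a 40-member pathway,
    # 50 dysregulated metabolites overall, 8 of them in the pathway.
    p_value = pathway_enrichment_p(1000, 40, 50, 8)
    print(f"enrichment p-value: {p_value:.3g}")
```

A small p-value here would only suggest that the pathway contains more changed metabolites than expected by chance under these assumed counts; with many pathways tested, a multiple-testing correction would still be needed.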

Figure 1: Upload of mass spectrometry data (Steps 2–5).
Figure 2: Predictive pathway analysis parameter settings (Step 14).
Figure 3: Predictive metabolic pathway results (Steps 20–24).
Figure 4: Overlapping metabolite information (Steps 24–29).
Figure 5: Predictive metabolites results (Steps 36–40).
Figure 6: Pathway cloud plot (Steps 42–46).
Figure 7: Multi-omics integration (Steps 47–50).
Figure 8: Multi-omics results (Steps 51–59).

Acknowledgements

The authors thank the following for funding assistance: Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA), a Scientific Focus Area Program at Lawrence Berkeley National Laboratory for the US Department of Energy, Office of Science, Office of Biological and Environmental Research under contract number DE-AC02-05CH11231 (G.S.); and the National Institutes of Health (grants R01 GM114368 (G.S.) and PO1 A1043376-02S1 (G.S.)).

Author information

Contributions

E.M.F. and T.H. contributed equally to writing the manuscript. E.M.F., T.H., D.R., H.P.B., B.H. and G.S. contributed to platform development, and H.P.B., B.W. and G.S. contributed to manuscript writing.

Corresponding author

Correspondence to Gary Siuzdak.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Methods and Supplementary Table 1. (PDF 852 kb)

Supplementary Data 1

Demonstration transcriptomics data set. (ZIP 5 kb)

Supplementary Data 2

Significant protein data set. (ZIP 0 kb)

About this article

Cite this article

Forsberg, E., Huan, T., Rinehart, D. et al. Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online. Nat Protoc 13, 633–651 (2018). https://doi.org/10.1038/nprot.2017.151

  • DOI: https://doi.org/10.1038/nprot.2017.151
