Protocol | Published:

Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online

Nature Protocols volume 13, pages 633–651 (2018)

Abstract

Systems biology is the study of complex living organisms, and as such, analysis on a systems-wide scale involves the collection of information-dense data sets that are representative of an entire phenotype. To uncover dynamic biological mechanisms, bioinformatics tools have become essential to facilitating data interpretation in large-scale analyses. Global metabolomics is one such method for performing systems biology, as metabolites represent the downstream functional products of ongoing biological processes. We have developed XCMS Online, a platform that enables online metabolomics data processing and interpretation. A systems biology workflow recently implemented within XCMS Online enables rapid metabolic pathway mapping using raw metabolomics data for investigating dysregulated metabolic processes. In addition, this platform supports integration of multi-omic (such as genomic and proteomic) data to garner further systems-wide mechanistic insight. Here, we provide an in-depth procedure showing how to effectively navigate and use the systems biology workflow within XCMS Online without a priori knowledge of the platform, including uploading liquid chromatography (LC)–mass spectrometry (MS) data from metabolite-extracted biological samples, defining the job parameters to identify features, correcting for retention time deviations, conducting statistical analysis of features between sample classes and performing predictive metabolic pathway analysis. Additional multi-omics data can be uploaded and overlaid with previously identified pathways to enhance systems-wide analysis of the observed dysregulations. We also describe unique visualization tools to assist in elucidation of statistically significant dysregulated metabolic pathways. Parameter input takes 5–10 min, depending on user experience; data processing typically takes 1–3 h, and data analysis takes 30 min.
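The statistical step named above (comparing feature intensities between sample classes) can be illustrated with a minimal sketch. This is not XCMS Online's implementation: the function names and the intensity values are hypothetical, and Welch's unequal-variance t-test is assumed here only as one common choice for a two-class comparison.

```python
from math import log2, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    na, nb = len(a), len(b)
    # Squared standard error of each class mean (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def log2_fold_change(a, b):
    """Log2 ratio of mean intensities (class a relative to class b)."""
    return log2(mean(a) / mean(b))


# Hypothetical intensities of one feature in two sample classes.
control = [100.0, 110.0, 90.0, 105.0]
treated = [200.0, 220.0, 190.0, 210.0]

t, df = welch_t(treated, control)
fc = log2_fold_change(treated, control)
print(f"t = {t:.2f}, df = {df:.2f}, log2 fold change = {fc:.2f}")
```

In practice the t statistic would be converted to a p-value and corrected for multiple testing across all detected features; both steps are omitted here to keep the sketch dependency-free.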

Acknowledgements

The authors thank the following for funding assistance: Ecosystems and Networks Integrated with Genes and Molecular Assemblies (ENIGMA), a Scientific Focus Area Program at Lawrence Berkeley National Laboratory for the US Department of Energy, Office of Science, Office of Biological and Environmental Research under contract number DE-AC02-05CH11231 (G.S.); and the National Institutes of Health (grants R01 GM114368 (G.S.) and PO1 A1043376-02S1 (G.S.)).

Author information

Affiliations

  1. Center for Metabolomics and Mass Spectrometry, The Scripps Research Institute, La Jolla, California, USA.

    • Erica M Forsberg
    • Tao Huan
    • Duane Rinehart
    • H Paul Benton
    • Benedikt Warth
    • Brian Hilmers
    • Gary Siuzdak
  2. Department of Chemistry and Biochemistry, San Diego State University, San Diego, California, USA.

    • Erica M Forsberg
  3. Department of Food Chemistry and Toxicology, University of Vienna, Vienna, Austria.

    • Benedikt Warth

Authors

  1. Erica M Forsberg

  2. Tao Huan

  3. Duane Rinehart

  4. H Paul Benton

  5. Benedikt Warth

  6. Brian Hilmers

  7. Gary Siuzdak

Contributions

E.M.F. and T.H. contributed equally to writing the manuscript. E.M.F., T.H., D.R., H.P.B., B.H. and G.S. contributed to platform development, and H.P.B., B.W. and G.S. contributed to manuscript writing.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Gary Siuzdak.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Methods and Supplementary Table 1.

Zip files

  1. Supplementary Data 1

    Demonstration transcriptomics data set.

  2. Supplementary Data 2

    Significant protein data set.

About this article

Publication history

Published

DOI

https://doi.org/10.1038/nprot.2017.151
