Uncovering the molecular context of dysregulated metabolites is crucial for understanding pathogenic pathways. However, system-level analysis of these metabolites has been limited by the difficulty of global metabolite identification: most metabolite features detected by untargeted liquid chromatography-mass spectrometry metabolomics cannot be uniquely identified without additional, time-consuming experiments. We report a network-based approach, the prize-collecting Steiner forest algorithm for integrative analysis of untargeted metabolomics (PIUMet), that infers molecular pathways and components through integrative analysis of metabolite features, without requiring their identification. We demonstrated PIUMet by analyzing changes in the metabolism of sphingolipids, fatty acids and steroids in a Huntington's disease model. PIUMet also enabled us to elucidate putative identities of altered metabolite features in diseased cells and to infer experimentally undetected, disease-associated metabolites and dysregulated proteins. Finally, we established PIUMet's ability to integrate untargeted metabolomics data with proteomics data, demonstrating that this approach reveals disease-associated metabolites and proteins that cannot be inferred by analyzing either data set alone.
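For orientation, the prize-collecting Steiner forest problem named in the abstract is commonly posed as minimizing an objective of the general form below. This is the formulation found in the general PCSF literature, shown as a sketch; the abstract does not state PIUMet's exact objective, and the symbols used here are notation assumed for illustration.

```latex
\[
f(F) \;=\; \beta \sum_{v \notin V_F} p(v) \;+\; \sum_{e \in E_F} c(e) \;+\; \omega\,\kappa
\]
```

Here F is a forest with node set V_F and edge set E_F, p(v) is the prize assigned to node v, c(e) is the cost of edge e, κ is the number of trees in F, and β and ω are tuning parameters. Minimizing f trades off leaving prized nodes out of the solution against paying for the edges needed to connect them.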
- Supplementary Figure 2: Comparison of the disease-specific score distributions from real and random results.
The scores for the real results are significantly higher than those for the random results (P value = 4.238×10−37).
- Supplementary Figure 4: A Venn diagram comparing the potential metabolites matching disease features based on mass with the putative ones inferred by PIUMet.
The metabolites inferred by PIUMet are significantly enriched for the metabolites detected by a targeted metabolomic platform (hypergeometric test P value = 6.00×10−4). All of the metabolites identified by the targeted platform were dysregulated in diseased cells.
- Supplementary Figure 5: Altered sphingolipids in STHdh Q111 cells compared to STHdh Q7 cells. These sphingolipids were measured in nine biological replicates.
Two-sided Student's t-test analysis showed significant changes in C24:0 Ceramide (d18:1) with P value = 1.2×10−11 (a), C16:1 Ceramide (d18:1) with P value = 7.74×10−5 (b), C24:1 Ceramide (d18:1) with P value = 0.02 (c), and C24:1 sphingomyelin (SM) with P value = 2.07×10−8 (d). The height of each bar shows the average level of the sphingolipid, and the error bars show the standard deviation.
- Supplementary Figure 6: PIUMet inferred sphingosine-1-phosphate (S1P) as a disease-modifying hidden component of the dysregulated sphingolipid pathway.
(a) S1P was significantly downregulated (P value = 1.99×10−4, two-sided Student's t-test) in STHdh Q111 cells compared to STHdh Q7 cells. The bar plot shows the average levels of S1P measured in nine biological replicates. The error bars show the standard deviation of S1P levels. (b) Treatment of diseased cells with an analogue of S1P (FTY720-P) significantly decreased apoptosis (P value = 7.98×10−5, two-sided Student's t-test). The bar plot shows the average percentage of cell death, and the error bars show the standard deviation from two independent experiments with twenty replicates each. (c) Calcein (green) detects viable cells and monitors changes in cell shape and morphology, while propidium iodide (PI, red) stains late apoptotic cells with damaged membranes.
- Supplementary Figure 7: Western blot results showed that the DHCR7-encoded protein was significantly downregulated (P value = 0.025, two-sided Student's t-test) in the STHdh Q111 cells compared to STHdh Q7 cells.
The boxplot shows the measured protein levels in six biological replicates (black dots).
- Supplementary Figure 8: Altered fatty acids in STHdh Q111 cells compared to STHdh Q7 cells.
These fatty acids were measured in nine biological replicates. Two-sided Student's t-test analysis showed significant changes in eicosapentaenoic acid (EPA, P value = 7.8×10−6, a) and dihomo-gamma-linolenic acid (DHGLA, P value = 0.01, b). The height of each bar shows the average level of the fatty acid, and the error bars show the standard deviation.
- Supplementary Figure 9: Western blot results showed that the fatty acid synthase enzyme, encoded by the FASN gene, was significantly upregulated (P value = 0.025, two-sided Student's t-test) in the STHdh Q111 cells compared to STHdh Q7 cells.
The boxplot shows the measured protein levels in six biological replicates (black dots).
- Supplementary Figure 10: Western blot measurements of RASA1 and ERCC6 protein levels.
(a) Western blot results showed a significant increase (P value = 0.008, two-sided Student's t-test) in RASA1 protein levels. The boxplot shows the measured protein levels in three biological replicates (black dots). (b) Western blot results showed a significant increase (P value = 0.028, two-sided Student's t-test) in ERCC6 protein levels. The boxplot shows the measured protein levels in six biological replicates (black dots).
- Supplementary Figure 11: Comparison of the degree distributions of features from the terminal and detectable metabolite feature (DMF) sets.
The degree of a feature is the number of metabolites corresponding to that feature.
- Supplementary Text and Figures
Supplementary Figures 1–11 and Supplementary Table 1
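
The enrichment reported for Supplementary Figure 4 relies on a hypergeometric test. As a minimal sketch of how such an upper-tail P value can be computed, the Python below uses only the standard library. The function name `hypergeom_enrichment_p`, the mapping of the four arguments onto the Venn diagram sets, and all counts are assumptions made for illustration; they are not the values or the code behind the reported P value.

```python
from math import comb


def hypergeom_enrichment_p(overlap, population, successes, draws):
    """Upper-tail hypergeometric P value: P(X >= overlap).

    X is the number of 'successes' seen when `draws` items are taken
    without replacement from `population` items, of which `successes`
    carry the property of interest.
    """
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("successes and draws must lie between 0 and population")
    upper = min(successes, draws)
    total = comb(population, draws)
    # Sum the probability mass from the observed overlap up to the maximum.
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to illustrate the calculation.
candidates_by_mass = 20    # metabolites matching disease features by mass
detected_by_targeted = 5   # of those, detected by a targeted platform
inferred = 4               # of those, selected by the network method
overlap = 2                # inferred metabolites that were also detected

p_value = hypergeom_enrichment_p(
    overlap, candidates_by_mass, detected_by_targeted, inferred
)
print(f"enrichment P value = {p_value:.4f}")
```

A small P value would indicate that the inferred set overlaps the independently detected metabolites more than expected from drawing the same number of candidates at random.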