The human gut microbiome plays a key role in human health¹, but 16S characterization lacks quantitative functional annotation². The fecal metabolome provides a functional readout of microbial activity and can be used as an intermediate phenotype mediating host–microbiome interactions³. In this comprehensive description of the fecal metabolome, examining 1,116 metabolites from 786 individuals from a population-based twin study (TwinsUK), the fecal metabolome was found to be only modestly influenced by host genetics (heritability (H²) = 17.9%). One replicated locus at the NAT2 gene was associated with fecal metabolic traits. The fecal metabolome largely reflects gut microbial composition, explaining on average 67.7% (±18.8%) of its variance. It is strongly associated with visceral-fat mass, thereby illustrating potential mechanisms underlying the well-established microbial influence on abdominal obesity. Fecal metabolic profiling thus is a novel tool to explore links among microbiome composition, host phenotypes, and heritable complex traits.
The study was funded by the Wellcome Trust and the European Community's Seventh Framework Programme (FP7/2007-2013). The study also received support from the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility, and Biomedical Research Centre based at Guy's and St Thomas' NHS Foundation Trust in partnership with King's College London, from the Chronic Disease Research Foundation, and from the Denise Coates Foundation. HLI, Inc., collaborated with King's College London to produce the metabolomics data from Metabolon, Inc. C.M. was funded by the MRC AIM HY (MR/M016560/1) project grant. We thank J. Goodrich and R. Ley for support in sequencing the fecal samples.
Integrated Supplementary Information
We found three fecal metabolites and one metabolite ratio significantly associated with genetic loci. Each panel shows one of these associations with the respective lead SNP. 3-hydroxyhexanoate was detected in fewer than 80% of all samples and was therefore analyzed as a dichotomous trait. The other metabolites were detected in at least 80% of the samples and were analyzed as continuous traits.
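The detection-rate rule in this caption can be sketched as a small helper. This is a minimal illustration, not the authors' code: the function name `trait_type` and the label `"excluded"` are hypothetical, and the 20% lower bound is taken from the Supplementary Table 1 description rather than from this caption.

```python
def trait_type(n_detected: int, n_samples: int) -> str:
    """Classify a metabolite by the fraction of samples in which it was detected."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    fraction = n_detected / n_samples
    if fraction >= 0.8:
        # Detected in at least 80% of samples: treat levels as continuous.
        return "continuous"
    if fraction > 0.2:
        # Detected in fewer than 80% (but more than 20%): present/absent trait.
        return "dichotomous"
    # Assumption: metabolites at or below 20% detection are not modelled.
    return "excluded"
```

For example, a metabolite detected in 590 of 786 samples (about 75%) would be classified as `"dichotomous"` under these assumptions.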
Each panel shows the Q-Q plot for one of the three metabolites (a–c) or the metabolite ratio (d) that have a genome-wide significant association with a genetic locus in the discovery cohort (n = 739). P-values were calculated using the score test implemented in GEMMA.
(a) Caffeine metabolism pathway showing the generation of the two metabolites that form the fecal metabolite ratio associated with NAT2 genetic variants. (b) Box plot showing the relationship between circulating caffeine (bottom versus top tertile) and the metabolite ratio 1,3-dimethylurate/5-acetylamino-6-amino-3-methyluracil (n = 444). (c) Box plot showing the relationship between coffee intake (from food frequency questionnaires) and 1,3-dimethylurate/5-acetylamino-6-amino-3-methyluracil (n = 676). P-values were calculated from linear mixed models.
We used a Gaussian graphical model to illustrate multivariate dependencies of fecal metabolite levels and gut microbes (n = 644). Microbes on the y-axis are ordered by taxonomy, metabolites on the x-axis by pathway and hierarchical clustering of the partial correlation matrix. Connections indicate significant (FDR < 5%) shrinkage partial correlations of fecal metabolites and microbial OTUs, given all other metabolites and microbes in the model. Edges are colored by the metabolic pathway.
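To make "partial correlation given all other variables" concrete, here is a minimal sketch for the simplest case of three variables. It uses the textbook first-order formula on plain correlations; it is not the shrinkage estimator used for the figure, and the function name `partial_corr` is hypothetical.

```python
import math

def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y after removing the linear effect of z."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denominator
```

With illustrative inputs, `partial_corr(0.25, 0.5, 0.5)` is zero: the marginal correlation between x and y is fully accounted for by z, so a graphical model would draw no edge between them.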
To assess the effect of storage (a) in the participants’ fridge before being stored in the TwinsUK Biobank and (b) in the freezer at −80 °C before being analyzed, we calculated linear regression models of fecal metabolites against both measures (n = 786). Here we present Q-Q plots in which the dashed lines indicate the Bonferroni cutoff.
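The two ingredients of such a plot, the Bonferroni cutoff and the expected-versus-observed quantiles, can be sketched as follows. This is an illustration only: the significance level of 0.05, the plotting position rank/(n + 1), and the function names are assumptions not stated in the caption.

```python
import math

def bonferroni_cutoff(n_tests: int, alpha: float = 0.05) -> float:
    """P-value threshold after Bonferroni correction for n_tests tests."""
    return alpha / n_tests

def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, smallest p-value first."""
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected = rank / (n + 1)  # quantile expected under a uniform null
        points.append((-math.log10(expected), -math.log10(p)))
    return points
```

Assuming one test per metabolite, `bonferroni_cutoff(1116)` gives a threshold of roughly 4.5 × 10⁻⁵. Points lying close to the diagonal of the plot indicate that storage time has little systematic effect on metabolite levels.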
Supplementary Figures 1–5 and Supplementary Table 1
Complete list of all analyzed metabolites with the proportion of samples in which each metabolite was observed (n), and the relative standard deviation (RSD) if the metabolite was present in at least 90% of quality control samples. Values in columns 6–8 indicate the proportion of variance attributed to additive genetic factors (A, or heritability), common/shared environmental factors (C), and unique environmental factors (E), estimated with structural equation modelling on 148 MZ and 155 DZ twin pairs (n = 606). Similarly, values in column 9 indicate the proportion of variance explained by gut microbial composition (M), estimated from UniFrac beta diversities using linear mixed models (n = 644). Finally, subsequent columns indicate each metabolite's association with age, gender, BMI (n = 786), microbial alpha diversity (n = 644), and visceral fat mass (n = 647), obtained using linear regression models for metabolites present in more than 80% of the samples and logistic regression models for metabolites present in less than 80% (but more than 20%) of samples. Green cells indicate significant results at FDR = 5%.
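As a rough intuition for the A, C, and E columns, the classical Falconer approximation derives the three components directly from twin correlations. This is a simplified stand-in, not the structural equation modelling used for the table; the function name and the example correlations are hypothetical.

```python
def falconer_ace(r_mz: float, r_dz: float):
    """Approximate ACE variance components from MZ and DZ twin correlations."""
    a = 2 * (r_mz - r_dz)  # additive genetic: MZ pairs share twice the DZ genetic similarity
    c = 2 * r_dz - r_mz    # shared environment: similarity not explained by genetics
    e = 1 - r_mz           # unique environment: whatever makes MZ twins differ
    return a, c, e
```

For illustrative correlations of 0.5 (MZ) and 0.3 (DZ), this yields approximately A = 0.4, C = 0.1, and E = 0.5. The three components always sum to 1, although the approximation can produce negative values for C, which model-based estimation avoids by constraining the parameters.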
Genome-wide association studies were conducted for 428 heritable metabolites and 31,226 heritable metabolite ratios using GEMMA (n = 739). This table lists all associations of genetic variants with fecal metabolites passing a p-value threshold of 10⁻⁵ and associations of metabolite ratios passing 10⁻⁸. Metabolite annotation is given in Supplementary Table 7.
Associations between fecal metabolite levels and gut microbial operational taxonomic units (OTUs) and higher taxonomic levels were calculated using linear mixed regression models correcting for Shannon alpha diversity, age, sex, BMI, storage time, and family relationships (n = 644). Nominally significant associations are flagged with *, FDR-significant associations with **, and Bonferroni-significant associations with ***. The annotation of metabolites and OTUs can be found in Supplementary Tables 6 and 7, respectively.
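The three-level flagging scheme can be sketched as below. The description does not name the FDR procedure, so this sketch assumes Benjamini-Hochberg; the significance level of 0.05 and the function name `significance_flags` are likewise assumptions.

```python
def significance_flags(p_values, alpha: float = 0.05):
    """Flag each p-value: '***' Bonferroni, '**' FDR (BH), '*' nominal, '' none."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Benjamini-Hochberg: find the largest rank k with p_(k) <= (k / m) * alpha;
    # every test ranked at or below k is FDR-significant.
    bh_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            bh_rank = rank
    fdr_hits = set(order[:bh_rank])

    flags = []
    for i, p in enumerate(p_values):
        if p <= alpha / m:      # strictest threshold is checked first
            flags.append("***")
        elif i in fdr_hits:
            flags.append("**")
        elif p < alpha:
            flags.append("*")
        else:
            flags.append("")
    return flags
```

Each association receives only its strictest flag, so a Bonferroni-significant result is shown as *** rather than also being listed under the weaker thresholds.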
Gaussian graphical models combining fecal metabolites and microbial OTUs were calculated using GeneNet (n = 644). The table contains the shrinkage partial correlation and the corresponding false discovery rate for each edge. Annotation of fecal metabolites and bacterial OTUs can be found in Supplementary Tables 6 and 7, respectively.
The table lists the IDs, biochemical names, and pathway annotations for all analyzed fecal metabolites.
Microbial sequencing data were clustered into operational taxonomic units (OTUs) using the de novo approach. The table contains all analyzed OTUs along with their representative sequences and the corresponding taxonomic annotation from the Greengenes database.
Regional association plots were created for all significant associations of genetic loci with fecal metabolites using the web tool SNiPA (http://snipa.helmholtz-muenchen.de/). Colors indicate the strength of linkage disequilibrium (LD) with the sentinel SNP. The chromosomal positions are based on GRCh37, and Ensembl v82 was used for gene annotations.