Hippurate as a metabolomic marker of gut microbiome diversity: Modulation by diet and relationship to metabolic syndrome

Reduced gut microbiome diversity is associated with multiple disorders including metabolic syndrome (MetS) features, though metabolomic markers have not been investigated. Our objective was to identify blood metabolite markers of gut microbiome diversity, and explore their relationship with dietary intake and MetS. We examined associations between Shannon diversity and 292 metabolites profiled by the untargeted metabolomics provider Metabolon Inc. in 1529 females from TwinsUK using linear regressions adjusting for confounders and multiple testing (Bonferroni: P < 1.71 × 10−4). We replicated the top results in an independent sample of 420 individuals as well as discordant identical twin pairs and explored associations with self-reported intakes of 20 food groups. Longitudinal changes in circulating levels of the top metabolite, were examined for their association with food intake at baseline and with MetS at endpoint. Five metabolites were associated with microbiome diversity and replicated in the independent sample. Higher intakes of fruit and whole grains were associated with higher levels of hippurate cross-sectionally and longitudinally. An increasing hippurate trend was associated with reduced odds of having MetS (OR: 0.795[0.082]; P = 0.026). These data add further weight to the key role of the microbiome as a potential mediator of the impact of dietary intake on metabolic status and health.

days increased abundance of 11 operational taxonomic units (OTUs), and elevated faecal levels of secondary bile acids and metabolites derived from microbial modifications of plant-derived components 11 .
To our knowledge, the role of a diverse gut microbiome in humans as a potential mediator of the impact of dietary intake on metabolic status and health has not been robustly addressed. Therefore, the aims of this study are: to (i) identify blood metabolites correlated with gut microbiome diversity, (ii) examine the impact of food intake on these metabolites and to, (iii) examine if longitudinal changes in these metabolites are predictive of future development of the metabolic syndrome.

Results
Supplementary Table S1 provides the study population characteristics and subject numbers.
OTU and Taxa associated with both hippurate and the hippurate diet score. Five OTUs and five taxa associated with hippurate were also associated with the diet score in the same direction (Table 2) in the whole sample. Specifically, reduced abundances of one OTU assigned to the genus Actinomyces were associated with an increased diet score, and similar to hippurate this trend was significant at both the family and order levels. At the genus level, reduced abundances of Ruminococcus (including 2 OTUs) and Eubacterium were associated with increasing diet scores. Moreover, increased diet scores were associated with increased abundances of OTUs assigned to the species Faecalibacterium prausnitzii and the genus Clostridiales. Most associations appeared to be to be primarily related to intakes of fruit and whole grains ( Table 2). Though higher abundances of one Clostridiales OTU were associated with increased coffee intake. Table 3 provides the clinical characteristics of the subsample of individuals studied. Longitudinal hippurate trajectories were significantly associated with Shannon diversity, independently of diet and covariates in a subsample of 1032 individuals (15.736 [1.96]; P = 4.95 × 10 −15 ), moreover the hippurate trend accounted for 6.5% of the variance in Shannon diversity. Figure 2 shows the results of the analysis for associations between diversity, the hippurate trend and MetS. Higher Shannon diversity and an increasing hippurate trend were associated with a reduced risk of having MetS (Fig. 2a). 61.1% of the effect of the hippurate trend on MetS was accounted for by the association between Shannon diversity and MetS. Five collapsed taxa and 3 of the OTUs that were associated with both diet and hippurate were associated with MetS, including: the Actinomycetaceae family, and Actinomycetales order and Actinomyces genus within Actinomycetaceae, the Eubacterium and Ruminococcus genera (plus one OTU), which were positively associated; and OTUs assigned to the order Clostridiales and Faecalibacterium prausnitzii that were inversely associated. The percentage variance in the metabolite trend and MetS that was accounted for by the MetS association with these associated OTUs/taxa is shown in Fig. 2b.

Relationship of diversity, the hippurate trend and diet to MetS and its components. Supplementary
Confirmation of results in discordant twins. We identified 55 MZ twin pairs who were discordant (1 SD apart) for Shannon diversity. Associations between Shannon diversity and hippurate were significant (P < 0.05)  Table 2. List of taxa associated with hippurate, the hippurate diet score and foods 1 . * Statistically significant: P < 0.0017; WG: whole grain products. 1 Microbiome OTUs and collapsed taxa significantly associated with both hippurate and the hippurate diet score (quartile-ranked, scored and summed intakes of coffee, fruit and whole grains) are shown. Associations were adjusted for covariates (age, Shannon Index, metabolite batch, BMI, sex and family relatedness) and multiple testing using Bonferroni correction. Hippurate diet score associations were also adjusted for hippurate. 2 OTU or collapsed taxonomy. 3 All foods included in the hippurate diet score were fitted into a backwards stepwise linear regression using P < 0.05 as the cut-off threshold with each taxa associated to both hippurate and the diet score. Results displayed are the betas with standard errors of foods at least nominally associated (P < 0.05). Statistical significance was defined as P < 0.0017 (Bonferroni: 0.05/[10 taxa × 3 foods]).

Discussion
Here we have identified hippurate, through its relationship with microbiome diversity, as a potential modulator of MetS, independently of diet. Specifically, we find that circulating levels of five blood metabolites, including hippurate, are associated with greater gut microbiome diversity both cross-sectionally and longitudinally and higher intakes of fruit and whole grains. Hippurate and intakes of foods including coffee, fruit and whole grains also correlated with 5 OTUs and 5 collapsed taxa independent of diversity. In particular, hippurate was positively associated with an OTU within the order Clostridiales and the species Faecalibacterium prausnitzii and negatively with the Actinomycetaceae family, and the genera Eubacterium and Ruminococcus. Both an increasing Shannon diversity score and increasing hippurate levels were associated with a lower risk of MetS. We identified and validated in an independent sample 5 metabolites associated with gut microbiome diversity: hippurate, p-cresol sulfate, phenylacetylglutamine and 3-phenylpropionate which were positively associated; and a negative association with hyodeoxycholate. P-cresol sulfate (also p-cresyl sulfate) and phenylacetylglutamine are potentially toxic uremic solutes formed from the putrefication of undigested dietary proteins by colonic bacteria (and, in the case of p-cresol sulfate, subsequent modification by the liver). Our group previously showed higher p-cresol sulfate and phenylacetylglutamine levels to be strongly associated with early kidney dysfunction and gut microbiome OTUs 12 . Hyodeoxycholate is a secondary bile acid, produced from intestinal bacterial metabolism. Hyodeoxycholate is metabolised by glucuronidation in the human liver and kidneys, a pathway for toxin elimination 13 . Both 3-phenylpropionate and hippurate are derived from gut microbial metabolism of polyphenols to benzoates, though hippurate accounted for a larger portion of the variance in diversity. Hippurate is a glycine conjugate of benzoic acid formed in the mitochondria of the liver and kidneys 14 , and also through gut bacterial metabolism of dietary components, primarily polyphenols 15,16 . Although hippurate is also derived from the metabolism of quinic acid and/or shikimic acid 17 .
Overall, increased Shannon diversity and the hippurate trend were associated with a reduced odds of having MetS. Interestingly, 61.1% of the effect of the hippurate trend on MetS was mediated by the association between Shannon diversity and MetS. A previous study has shown reduced gut bacterial diversity of MetS features 6 . We believe the relationship between MetS and blood hippurate to be a novel finding, though studies have shown reduced urinary hippuric acid excretion in obesity [18][19][20] . We have previously found increasing circulating levels of hippurate to be associated with adipose tissue gene expression levels of neuroglobin, a type of globin 21 . Neuroglobin expressed in neurons and some endocrine tissues acts to protect cells against hypoxia and oxidative stress 22 .
Overall increased abundances OTUs/taxa of the Actinomycetaceae family and the genera Eubacterium and Ruminococcus were associated with reduced hippurate and the diet and increased MetS risk, and increased abundances of OTUs of the order Clostridiales and the species Faecalibacterium prausnitzii were associated with increased hippurate and diet score and reduced MetS risk.
The Actinomycetaceae family are typical commensals present within the oral cavity. In rare cases, an Actinomyces overgrowth contributes to an infection within the gut through forming filamentous branches that grow through damaged mucosal tissue penetrating the gut barrier, forming abscesses and fistula 23 . The relationship between Actinomycetaceae and the foods forming the hippurate diet score is not entirely clear. We found increased abundance of the genus Eubacterium to be associated with lower hippurate and the diet score, particularly whole grain intake. Contrary to our findings, feeding of whole grains 24 and switching from a Western to plant-based diet 25 in 10 humans have enriched abundances of species Eubacterium rectale. Another species, Eubacterium dolichum was found to be elevated in mice fed a Western-style diet 26 . Higher abundances of the genus Ruminococcus (one OTU and collapsed taxonomy) were associated with higher MetS risk. Increased Ruminococcus abundances were associated with lower fruit intake. Ruminococcus abundances were reduced following 12-week feeding of schisandra chinensis fruit, which is high in flavonoids 27 .
Increased abundances of OTUs within Clostridiales and Faecalibacterium prausnitzii were associated with increased hippurate, dietary components and reduced risk of MetS. The Clostridiales OTU was strongly and positively associated with coffee intake. Significantly elevated levels of the Clostridium coccoides-Eubacterium rectale group have been shown following the incubation of human faecal microbiota with coffee samples 28 . Faecalibacterium prausnitzii has been shown to correlate with microbiome gene count and predict weight loss over time 6 . Faecalibacterium prausnitzii has been shown to be depleted in 239 MetS subjects and partially restored following a 2 year Mediterranean diet intervention 29 . Increased abundances of the Faecalibacterium prausnitzii OTU were mildly associated with higher whole grain intake. Whole grains appear to allow Faecalibacterium prausnitzii to flourish 30 .
There were a number of limitations to this study. Importantly, the metabolomics methods utilised in our study do not yield absolute concentrations. Ideally a targeted method that could provide robust validation of the quantitative results. These data however offer important insights that can be tested by other groups using targeted metabolomics methods. The use of targeted assays would have been an ideal way to validate our findings. Unfortunately it is not a viable option for our cohort. We included few males therefore these results may only apply to women. As the FFQ relies on subject reporting, the accuracy of this recall data is always to some extent problematic. If misclassification of intakes had occurred, it would appear as error and likely have obscured any real findings and not strengthened them. Furthermore, we replicated our findings in MZ twin pairs and from previous feeding studies which have shown hippurate excretion to be increased following the consumption of these foods. We have not considered the influence of freezing and transit time in these analyses. Whilst these may influence results we do not expect this to be a large effect as variations in sample collection would be distributed randomly in relation to the phenotypes assessed. However, future analyses could be improved by taking these into consideration in the experimental design and analysis stages. Moreover, it will be of interest to carry out further research to identify how the bacteria interact and how that might influence microbiome diversity, related metabolites and be affected by diet. Finally, the range of time differences between sample collections is an issue throughout this paper. Though likely our results would have been stronger if closer time points were used between data collections.
There were also many advantages. We had a large number of twin subjects with the unique combination of metabolomics profiling, dietary information and gut microbiome profiling. We also had access to a unique longitudinal metabolomics dataset in order to evaluate the influence of changes in hippurate levels on MetS risk.
In conclusion, we identified novel diet-microbiome-metabolite relationships including five specific metabolites that are related to a diverse microbiome profile. Hippurate in particular was strongly associated to increased gut microbiome diversity and consumption of polyphenol-rich foods including coffee, whole grains and fruit and reduced odds of MetS. The potential of hippurate as a marker of alpha-diversity and the interplay between a diet rich in these foods, gut microbes and hippurate production should now be established in dynamic mechanistic studies using a dietary intervention setting.

Materials and Methods
The study population included twins enrolled in the TwinsUK registry, which is a national register of healthy adult twins residing throughout the UK 31 . Twins included in the dataset all had completed food intake questionnaires, faecal microbiome and blood metabolomics profiling (Fig. 3a shows the dataset information).  Table S5). The sample was divided into discovery and validation groups according to when FFQs were completed (before or in/after 2014). Female twins who completed FFQs collected between 1995 and 2001 and in 2007 were used in the discovery analysis (n = 1529). New FFQ data was collected between 2014 and 2015; 484 additional individuals had microbiome and metabolomics data and completed FFQs during this time, 420 of these individuals who had no co-twin in the discovery sample were used as a validation sample. Quality control of the FFQ dataset has been outlined previously 33 . The study has been approved by the local St. Thomas' Figure 3. Overview of the study datasets and flow chart of study design. (a) Provides an overview of the study datasets. There were 5 different datasets used in the study. The colors and outline of the boxes indicate the datasets used; color: blue, whole; green, discovery; orange, validation; outline: solid, whole; dashed, subsample. All individuals included in the study had FFQ, blood metabolomics and microbiome data available. For part 1 cross-sectional analyses the whole sample was divided into discovery and validation groups based on when FFQs were completed. A subsample of individuals from the discovery group were used to examine baseline diet associations with longitudinal blood metabolomics. For part 2 analysis a subsample of individuals from the whole dataset were used to examine MetS associations with longitudinal metabolomics and cross-sectional diet and microbiome. (b) Shows the study outline for part 1 of the analysis where metabolite markers of microbiome diversity were identified and their relationship to diet examined. The flow chart is numbered in the order the analysis was conducted. On the left side of the figure the datasets used for each analysis step are indicated. (c) Shows the study outline for part 2 analysis where longitudinal levels of the top metabolite marker were examined for its relationship with MetS. Shannon diversity, the diet score, and metabolite-and diet-associated OTUs/taxa were investigated for their association with MetS status cross-sectionally.Abbreviations: MetS, metabolic syndrome; NA, not applicable.
Hospital Research Ethics Committee and was performed in accordance with the approved guidelines. All study participants provided written informed consent.
Faecal microbiome composition. Faecal samples were collected at follow-up between 2012 and 2016, and the composition of the gut microbiome was determined by 16 S rRNA gene sequencing carried out as previously described 34 . Details for the microbiome quality control are described in Supplementary Text S1. Shannon diversity was the primary index considered for analysis, this index is normally distributed and most commonly used. There were on 5.46 (3.83) years between FFQ and faecal sample collection.
Metabolomic profiling. Non-targeted mass spectroscopy-based metabolomic profiling was undertaken in three batches all within 2 years by the metabolomics provider Metabolon, Inc. (Durham, NC) on 6056 fasting blood samples (batch 1 in serum and batches 2 and 3 in plasma), as previously outlined 35 and detailed in Supplementary Text S2. The Metabolon platform identified 292 structurally named biochemicals that belong to the following broad categories: amino acids, carbohydrates, vitamins, lipids, nucleotides, peptides, and xenobiotics (Supplementary Table S6). The associations between metabolite and batch adjusted for age, BMI and family relatedness can be found in Supplementary Table S7. Quality control on the metabolomics dataset was performed as previously described 35 (Supplementary Text S2). Briefly, raw data were median-normalised by dividing the metabolite concentrations by the daily median for that metabolite. Metabolite concentrations did not follow a normal distribution and were therefore inverse-normalized. There were on average 3.13 (SD: 3.10) years between the FFQ and the blood sample collection.
Classification of the metabolic syndrome. Clinical visits were undertaken where a trained research nurse or assistant collected blood samples, waist circumference and blood pressure (Supplementary Text S3 for details).
MetS status was determined by the International Diabetes Federation and the American Heart Association/ National Heart, Lung, and Blood Institute criteria 36 .
Data Availability. TwinsUK omics and phenotypic data are publicly available upon request. Details can be found on the departmental website (http://www.twinsuk.ac.uk/data-access/accessmanagement/).

Statistical analysis.
Statistical analysis was carried out using Stata version 12. The statistical analysis was undertaken in two parts. First, a metabolomic marker of microbiome diversity was identified, its relationship to food intake explored and associations with microbiome OTUs/collapsed taxonomies identified. In the second part, the relationship between longitudinal levels of the diversity metabolite marker, diet, diversity and associated OTUs/taxa with the risk of MetS was explored.
Part 1: Metabolomics associations with diversity, relationship to food intake and associations with microbiome OTUs/taxa. Figure 3b shows the study outline for this section.
Microbiome diversity-metabolite associations. A linear mixed regression model with Shannon diversity as a predictor of the metabolite level was undertaken in a group of 1529 females, adjusted for age, batch, BMI and family relatedness as random intercept and multiple testing (Bonferroni: 0.05/292 = 1.71 × 10 −4 ). Significant metabolites from the discovery sample were then evaluated against Shannon diversity using the same linear regression (additionally including sex as a covariate) in the validation sample (including n = 113 males), associations passing the 5% level of significance were considered validated.
Food intake associations with diversity-associated metabolites. In the discovery sample, reported intakes of 20 food groups were used to predict levels of the validated metabolites (Bonferroni: 0.05/[5 metabolites × 20 food groups] = 5.00 × 10 −4 ), adjusted for the same covariates as above. In a subsample of individuals from the discovery group (n = 788) longitudinal metabolomics data was available (n = 705 3 time points, n = 83 2 time points) as well as reported food group intake at the same time or 5 years before the first blood sample. Trajectories of change (indicated as 'trend' from here forward) in the top diversity-associated metabolite (years after baseline: 2 nd time point: 7.4 [range: 2.3-12.8]; 3 rd time point: 13.9 [range: 8.3-17.9]) were determined by empirical Bayes predictions (adjusted for age and BMI) which estimates the rate of change in standard deviations/year 37 . Using this method, point estimates were calculated and a slope of change determined. Food group intake at or before baseline metabolite levels was then used to predict the metabolite trends (Bonferroni: 0.05/20 food groups = 0.0025).
A dietary score predictive of the metabolite trend was created from quartile (Q)-ranking significantly associated foods, scoring according to the direction of association (i.e. positive association: Q1 = 0-Q4 = 3; negative association: Q1 = 3-Q4 = 0) and summing. The heritability of the score was determined by structural equation modelling using Mx (Supplementary Text S4 for details).
To establish the association between the metabolite and diet score with richness (defined here as number of observed OTUs) and additional diversity metrics (Simpson and Chao1), these associations were run using the same linear regression as for the Shannon diversity discovery analysis, though in the whole sample.
Food-microbiome-metabolite axis. To identify associations between the metabolite and the microbiome, the discovery and validation samples were combined (n = 2013). Associations with the microbiome were evaluated by calculating a linear regression model using the OTUs and OTUs collapsed at each taxonomic level (phylum, class, order, family, genus) as predictors of the metabolite adjusted for covariates (age, BMI, batch, sex and family relatedness), Shannon diversity and multiple-testing (Bonferroni cut-off; Supplementary Table S8 shows the significance threshold for the OTUs and each taxonomic level). To determine the total variance of both Shannon diversity and the metabolite explained by the metabolite-associated OTUs, we included all associated OTUs in a backwards stepwise linear regression using P < 0.05 as the threshold cut-off and we report theR 2 for each model. We then calculated linear mixed models of metabolite-associated OTUs/taxa using the diet score as a predictor of abundances adjusted for covariates, the metabolite and multiple testing (assigned at each taxonomic level; Supplementary Table S8). To investigate if any of the foods forming the score were driving associations, we ran a multivariate regression model including all metabolite-associated foods and the same covariates. Part 2: The relationship between longitudinal levels of the diversity metabolite marker, diet, diversity and associated OTUs/taxa with the risk of MetS. Figure 3c shows the analysis pipeline for this section.
Relationship between microbiome diversity, longitudinal metabolite and the MetS. A subsample of 1032 individuals (primarily female) had longitudinal levels of the blood metabolite (n = 533 with 2 time points, n = 499 with 3 time points; range: 2.4-17.9 years). To evaluate whether longitudinal levels of the top diversity metabolite predicted MetS status, we ran independent linear regression models each for Shannon diversity, the metabolite trend (determined using empirical Bayes predictions as above), the diet score, and metabolite/diet-associated OTUs/ taxa to predict MetS status adjusting for age, sex, and family relatedness.
Longitudinal Metabolite association with MetS mediated by Shannon diversity, the diet score and specific taxa. The aim of this analysis was to determine how much of the variance in the metabolite trend and MetS was accounted for by the MetS association with Shannon diversity, the diet score or specific OTUs/taxa. Details are provided in Supplementary Text S5.