Hyocholic acid species as novel biomarkers for metabolic disorders

Hyocholic acid (HCA) is a major bile acid (BA) species in the BA pool of pigs, a species known for its exceptional resistance to spontaneous development of diabetic phenotypes. HCA and its derivatives are also present in human blood and urine. We investigate whether human HCA profiles can predict the development of metabolic disorders. We find in the first cohort (n = 1107) that both obesity and diabetes are associated with lower serum concentrations of HCA species. A separate cohort study (n = 91) validates this finding and further reveals that individuals with pre-diabetes are associated with lower levels of HCA species in feces. Serum HCA levels increase in the patients after gastric bypass surgery (n = 38) and can predict the remission of diabetes two years after surgery. The results are replicated in two independent, prospective cohorts (n = 132 and n = 207), where serum HCA species are found to be strong predictors for metabolic disorders in 5 and 10 years, respectively. These findings underscore the association of HCA species with diabetes, and demonstrate the feasibility of using HCA profiles to assess the future risk of developing metabolic abnormalities.

Metabolism-related disorders, such as obesity, type 2 diabetes mellitus (T2DM), nonalcoholic fatty liver disease (NAFLD), stroke, and cardiovascular disease (CVD), have reached epidemic proportions and have thus become key areas of clinical and translational research over recent decades [1][2][3]. However, common biomarkers used for the early detection and differential diagnosis of these metabolic diseases have been challenged due to high inter-individual variability. For example, not all individuals with obesity develop subsequent metabolic abnormalities, and ~25-40% of them can remain healthy throughout their lives [4][5][6][7]. Therefore, the identification of high-risk individuals at an early stage is critical for the prevention and control of metabolic diseases.
Bile acids (BAs) have been linked predominantly to cholesterol metabolism in the liver and to the stimulation of the absorption of cholesterol, fat-soluble vitamins, and lipids from the intestines. Recently, BAs have gained increasing recognition as important signaling molecules that regulate triglycerides (TG), cholesterol, glucose, and energy homeostasis [8][9][10][11][12]. For example, dietary supplements of BA increased energy expenditure and prevented the development of high fat-induced obesity and insulin resistance in mice [13][14][15]. Plasma chenodeoxycholic acid (CDCA), cholic acid (CA) (primary BAs), and deoxycholic acid (DCA, a secondary BA) concentrations were found to be positively associated with homeostasis model assessment of insulin resistance (HOMA-IR) and negatively associated with glucose infusion rate in healthy volunteers, patients with T2DM, and non-diabetic individuals with obesity 16. Our group has found that CDCA% (percentage of CDCA relative to total BA) was significantly higher in diabetic individuals with obesity compared to those with normal glucose tolerance and was positively correlated with body mass index (BMI), hemoglobin A1c (HbA1c), TG, and low-density lipoprotein cholesterol (LDL-c), while negatively correlated with high-density lipoprotein cholesterol (HDL-c) and diabetes duration. Obese diabetics with higher baseline CDCA% were more prone to achieve remission 2 years after gastric bypass surgery 17.
Pigs are a unique mammalian species with exceptionally strong resistance to the development of metabolic diseases such as T2DM, NAFLD, and CVD, despite being raised under diabetes-inducing conditions. The BA profiles of pigs and humans differ distinctly: pigs have a large proportion of hyocholic acid (HCA) species (>75%), whereas humans have a much lower percentage (~3%) 18,19. Given the important physiological and metabolic roles of BAs, we were motivated to assess the association of circulating HCA species with the development of metabolic disorders in humans.
Here, we show both obesity and diabetes are associated with lower serum and fecal concentrations of HCA species in two clinical cohorts. Serum HCA levels increase in the patients after gastric bypass surgery and could predict the remission of diabetes 2 years after surgery. Two independent prospective cohorts show that serum HCA species are strong predictors for metabolic disorders in 5 or 10 years.

Results
Obesity and diabetes were associated with lower serum concentrations of HCA species. To evaluate the association between HCA species and diabetes, we conducted targeted serum BA profiling analyses in a cohort consisting of 1107 participants (610 men and 497 women) selected from the Shanghai Obesity Study (SHOS) 20 . The participants were separated into three groups: healthy lean (HL, n = 585), healthy overweight/obese (HO, n = 419), and overweight/obese with T2DM (newly diagnosed, drug naive) (OD, n = 103). Key clinical metabolic markers were significantly different between any two of the three groups (Supplementary Table 1). The BA measurement was carried out in the laboratory of Shanghai Jiao Tong University Affiliated Sixth People's Hospital (Lab 1), and a total of 23 BAs met the quality control criteria and were quantified. The three groups had similar total BA levels in all, male and female subjects (Supplementary Fig. 1). The alterations of the different BA species were then compared, the levels of which were the concentration summation of individual unconjugated and conjugated BAs as shown in Supplementary Table 2. The term "HCA species" in this case, indicated HCA, hyodeoxycholic acid (HDCA), glycohyocholic acid (GHCA), and glycohyodeoxycholic acid (GHDCA). The results showed that the HCA species and DCA species were significantly decreased in HO and OD relative to HL. Pairwise Spearman correlation analysis ( Supplementary Fig. 2) showed that HCA species inversely correlated with BMI, fasting and postload glucose, insulin levels and insulin resistance shown by HOMA-IR. DCA species were also correlated with BMI, postload glucose and insulin levels. These results implied that the levels of HCA species and DCA species were related to glucose metabolism and regulation.
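The definition of "HCA species" above (the concentration summation of HCA, HDCA, GHCA, and GHDCA) can be illustrated with a minimal sketch. The function name, the dictionary layout, the treatment of missing species as zero, and the example concentrations are all assumptions made for illustration; they are not taken from the study's data or software.

```python
# Minimal sketch (hypothetical helper, illustrative values only):
# total serum HCA species as the sum of the four species named in the text.
SERUM_HCA_SPECIES = ("HCA", "HDCA", "GHCA", "GHDCA")

def total_hca_species(concentrations, species=SERUM_HCA_SPECIES):
    """Sum the concentrations of the listed species.

    Assumption: a species absent from the dictionary contributes 0.
    """
    return sum(concentrations.get(name, 0.0) for name in species)

# Made-up concentrations for one sample (arbitrary units).
sample = {"HCA": 12.0, "HDCA": 8.5, "GHCA": 20.0, "GHDCA": 9.5, "CA": 100.0}
total = total_hca_species(sample)
print(total)
```

Other BAs in the profile (here CA) are ignored by the sum, which mirrors how the species-level totals are described as summations of the unconjugated and conjugated forms of one BA family.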
From HL to HO to OD, the participants had increasingly older age, higher BMI, and a lower ratio of male/female (although the sex ratios were not significantly different among groups) (Supplementary Table 1). To eliminate the confounding effects of age, sex, and BMI, we selected 103 older participants with higher BMI, and more women from the HL and HO groups to better match the 103 participants in the OD group. After this selection, all three groups had matched age and sex ratios, and HO and OD also had matched BMI (Table 1). These matched samples showed higher levels of total BAs in the HO group compared to the HL group (Fig. 1a). Among BA species, HCA species remained at low levels in HO and OD relative to HL (Supplementary Table 3, Fig. 1b), while the levels of DCA species showed no significant difference between groups. Although CDCA species and lithocholic acid (LCA) species showed consistent changes in HO and OD relative to HL, the individuals of each species changed inconsistently (Fig. 1b). Thus, HCA species were distinctive among these BA species. Consistent with the observations in all samples from this cohort, the levels of HCA species still showed an inverse correlation with clinical markers in these matched samples, while levels of DCA species did not (Fig. 1c). These results highlighted the importance of HCA species among BA species in obesity and diabetes.  Fig. 4 and Fig. 1i). The results suggested that metabolic disorders including obesity and diabetes were associated with lower concentrations of HCA species in serum.
Pre-diabetes and diabetes were associated with lower concentrations of HCA species in serum and feces. We then confirmed the findings above in a separate cohort that focused on diabetes development. The second cohort consisted of 91 participants (35 men and 56 women) including 26 healthy, 30 prediabetic and 35 newly diagnosed diabetic individuals (drug naive). In addition to serum samples, fecal samples were collected to evaluate the association between HCA species and metabolic disorder. The clinical markers showed that the HbA1c and fasting and post-load blood glucose levels of patients with pre-diabetes and diabetes were significantly higher than those of healthy controls (Supplementary Table 4). The BA results showed that no significant group differences were found in serum and fecal total BAs (Fig. 2a, b). Compared with healthy controls, the patients with pre-diabetes and diabetes had lower levels of total HCA species in both serum and feces for all, male and female subjects (Fig. 2c, d). The group differences were greater in feces than in serum. As expected, individual HCA species showed similar group differences ( Fig. 2e-j, and Supplementary Tables 5, 6). The concentrations of fecal GHCA and GHDCA are not shown as they were below the detection limit. Total and individual HCA species in serum and feces had strong inverse correlations with fasting and post-load blood glucose levels (Fig. 2k, l).
Gastric bypass surgery increased serum concentrations of HCA species in the patients with metabolic disorders. We further studied the changes of HCA species in patients with obesity and diabetes after Roux-en-Y gastric bypass (RYGB) surgery. Thirty-eight patients who received RYGB were examined before and at 1, 3, 6, and 12 months post-surgery (Table 2). The serum concentration of total BAs gradually increased after RYGB surgery and became significantly higher than baseline at 12 months post-operation (Fig. 3a). The concentrations of total and individual HCA species in the serum increased drastically 1 month after the surgery (fold change (FC) = 2.66, 2.75, 3.85, 2.27, and 2.30 for total HCA species, HCA, HDCA, GHCA, and GHDCA, respectively) and maintained minor increases afterward (Fig. 3b, c and Supplementary Table 7). Improvements in BMI, fasting and post-load blood glucose levels, HbA1c, and insulin resistance occurred throughout the post-surgical 12 months (Fig. 3d). The receiver operating characteristic (ROC) analysis showed that the areas under the curve (AUCs) for the 12-month changes of total HCA species, HCA, HDCA, GHCA, and GHDCA were 0.82, 0.70, 0.76, 0.67, and 0.73, respectively (Fig. 3e), providing evidence that HCA levels had potential prediction capability for the metabolic outcome of RYGB surgery.
Two years after surgery, a total of 26 individuals remained in remission, while 12 had diabetes recurrence. The remission group had lower HbA1c levels and fasting blood glucose levels at baseline compared to the non-remission group (Supplementary Table 8). Regarding serum BAs (Supplementary Table 9), total BAs were comparable between both groups (Fig. 3f). The levels of total HCA species and most individual species (HCA, GHCA, and GHDCA) were significantly higher in the remission group compared to the non-remission group (Fig. 3g-k). The level of HDCA was also relatively higher in the remission group (Fig. 3i), although this difference did not reach statistical significance due to intra-group variations. The binary logistic regression models indicated that higher levels of total HCA species before surgery were associated with a higher probability of maintaining diabetes remission 2 years after RYGB. The odds ratio was 0.74 (95% CI, 0.58-0.95) after adjustment for age, gender, HbA1c level, and fasting blood glucose level. However, some individuals achieved remission at 6 or 12 months but relapsed at 2 years. We therefore divided the individuals into four groups based on their clinical indices of remission or non-remission at 6, 12, and 24 months post-surgery. The four groups were (1) remission at 6 months post-surgery without recurrence of diabetes; (2) remission at 12 months without recurrence of diabetes; (3) remission at 6 or 12 months but relapse at 12 or 24 months; and (4) non-remission after surgery. The results (Supplementary Fig. 5) showed that the initial remitters who relapsed at 12 or 24 months had relatively low levels of HCA species pre-surgery, similar to the non-remitters. In contrast, the individuals in group (2) had relatively high levels of HCA species at baseline, similar to the remitters. Thus, pre-surgery levels of HCA species may be useful in predicting long-term rather than short-term remission after surgery.
Serum HCA species were strong predictors for future metabolic outcomes in healthy individuals. To evaluate the association between HCA species and future metabolic health, we selected 132 subjects (36 men and 96 women) from the Shanghai Diabetes Study 21. All of them were metabolically healthy (MH, defined in the "Methods" section) at their enrollment. After 10 years, 86 participants became metabolically unhealthy (MU, defined in the "Methods" section), and 46 remained MH. At baseline, the future MU group was older, had higher BMI, and included more men than the future MH group (although the group difference in sex ratio did not reach statistical significance); however, the major metabolic markers were similar between the two groups (Supplementary Table 10). To eliminate the confounding effects of age, sex, and BMI, we chose 46 younger participants with lower BMI, including more women, from the MU group to match the 46 participants in the MH group (Supplementary Table 11). When samples from all participants were considered, the concentrations of total BAs in serum were comparable between the MH and MU groups, but the concentrations of total and individual HCA species were significantly lower in the MU than the MH group (Supplementary Fig. 6 and Supplementary Table 12). Age-, sex-, and BMI-matched samples yielded similar results as all group samples did ( Fig. 4a- Fig. 7). The ROC curve analysis showed that the total HCA species (red line in Fig. 4g) had the highest AUC of 0.92, and the AUCs of individual HCA species ranged from 0.62 to 0.89, providing supporting evidence for using total and individual HCA species as predictors for future metabolic outcome.
Validation study of HCA species as predictive biomarkers for future metabolic outcome. To validate the predictive capability of serum HCA species for future metabolic outcome, we further collected serum samples from an independent cohort from Beijing, China. BA profiling was performed in an independent laboratory in Shenzhen, China (Lab 2), and the assessment instruments and methods were slightly different from those applied to the previous four cohorts in Lab 1. A total of 207 subjects (117 men and 90 women) were selected. These participants were metabolically healthy in 2011, and after 5 years, 90 of them (60 men and 30 women) became MU, while 117 (57 men and 60 women) remained metabolically healthy. The major metabolic markers were similar between the two groups in 2011 (Supplementary Table 14). We further selected 87 samples (57 men and 30 women) from each of the MH and MU groups, with matched age and metabolic markers, as a matched cohort (Supplementary Table 15).
A total of 27 BAs were quantified in Lab 2, of which 20 were also detected in Lab 1; 3 BAs (3-ketoCA, 7-ketoLCA, and 12-ketoCDCA) were not measured for lack of chemical standards. Notably, THCA and THDCA were quantified in Lab 2 owing to the high sensitivity of the mass spectrometry, whereas these two metabolites did not pass the quality control in Lab 1 (>40% lower than the limit of quantification). Thus, the HCA species in this cohort were defined as the concentration summation of HCA, HDCA, GHCA, GHDCA, THCA, and THDCA.
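As a reminder of what an AUC such as those reported above measures, the following minimal sketch computes it through its rank interpretation: the probability that a randomly chosen case from one outcome group has a higher marker value than a randomly chosen case from the other, with ties counted as one half. The function name, the group labels, and the numbers are hypothetical, and the study's own ROC software and outcome coding are not reproduced here.

```python
# Minimal sketch, assuming "higher marker value favours the positive group".
# Values are invented for illustration and are not study data.
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical changes in total HCA species for two outcome groups.
favourable = [3.0, 2.5, 1.8, 0.9]
unfavourable = [1.0, 0.7, 2.0]
auc = roc_auc(favourable, unfavourable)
print(auc)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation; if lower marker values indicate the positive group, the two arguments are simply swapped.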
The samples from all participants (Supplementary Table 16 and Supplementary Fig. 8) and the matched cohort (Supplementary Table 17 and Fig. 4h-m) showed similar results, that is, the concentration of total BAs was comparable between the MH and MU groups. The concentrations of total HCA species and four individual BAs including HCA, HDCA, GHCA, and GHDCA were significantly lower in the MU than the MH group. The levels of THCA and THDCA were also lower in the MU group compared to the MH group, although without significant difference (Supplementary Fig. 9). The ROC analysis showed that the total HCA species (the summation of HCA, HDCA, GHCA, GHDCA, THCA, and THDCA; Fig. 4n) had the highest AUC of 0.71, and the AUCs of the four individual HCA species ranged from 0.63 to 0.66. We further carried out the ROC analysis in the summation of HCA, HDCA, GHCA, and GHDCA, and the AUC was 0.72, higher than that of total HCA species of six BAs (Supplementary Fig. 10).
Fig. 2 Performance of HCA species in the second cross-sectional study. a-d Total BAs and total HCA species in serum and feces of healthy control (C, n = 26), pre-diabetes (preDM, n = 30) and diabetes (DM, n = 35) groups. e-j Individual HCA species in the three groups in all (n = 91), male (n = 35) and female (n = 56) samples. k, l Heatmaps of Spearman correlation coefficients of total and individual HCA species in serum and feces with representative metabolic markers. Data are expressed as median with interquartile range (a-j). In the bar plots (a-j), * and # indicate the statistical significance (p < 0.05) between two groups and among three groups, respectively, based on two-sided Kruskal-Wallis test. In the heatmaps (k, l), r value indicates Spearman correlation coefficient, and * indicates the statistical significance (p < 0.05) based on Spearman correlation. HCA species in serum is the concentration summation of HCA, HDCA, GHCA, and GHDCA. HCA species in feces is the concentration summation of HCA and HDCA.
These results validated our previous results that total and individual HCA species had robust predictive capability for future metabolic outcome. It also suggested that the concentrations of the four individual HCA species, HCA, HDCA, GHCA, and GHDCA, as well as their summation were good predictors in humans, while THCA and THDCA were not.

Discussion
BAs play a crucial role in dietary lipid digestion and absorption, and also act as signaling molecules to regulate glucose and lipid metabolism through activation of several receptors including the nuclear hormone receptor farnesoid X receptor (FXR) and the G protein-coupled receptor TGR5. In recent years, BA derivatives have been extensively evaluated as potential therapeutic agents for metabolic diseases 22,23. Consistent with published studies 24, our data showed that total BA concentrations were higher in obese patients and positively correlated with BMI. We also observed that serum total BAs were decreased with increases in blood glucose levels, and the ratio of 12α-hydroxy- to non-12α-hydroxy-BA was inversely associated with insulin sensitivity in T2DM patients [25][26][27][28][29]. Activation of the BA alternative synthetic pathway leading to increased production of non-12α-hydroxy-BA resulted in beneficial effects on glucose and lipid metabolism 15,28,[30][31][32]. It is worth noting, however, that the levels of total BAs and BA pool composition in T2DM patients differed among studies. Some other studies have shown that there was no change in fasting total BAs in T2DM compared with non-diabetic controls, while postprandial total and glycol-BAs increased in T2DM 25. The differences in BA results among studies may be due to the differences in methodological parameters (e.g., feeding state and time of T2DM onset) 16. Thus, it is necessary to select specific BA species, such as HCA species, which are closely associated with glycemic control for comparisons between patient groups.
A recent 5-year prospective study in patients with pre-diabetes reported similar results regarding the association between low HCA levels and the risk of new-onset diabetes 33 . We found in this study that lower serum concentrations of HCA species were associated with diabetes and were closely related to glycemic markers. We aimed to evaluate the clinical performance of HCA species in not only T2DM patients but also those with obesity and metabolic syndrome. In this study, we enlarged the sample size in several independent cohorts from different cities in China.
In the 5-year longitudinal cohort, THCA and THDCA were detected at baseline but were below detection limits in future MU samples. Meanwhile, these two BAs were not as significantly decreased as the four HCA species. However, THCA and THDCA showed similar stimulation effects on GLP-1 secretion as the other four HCA species in our studies (data were not shown). Thus, taurine conjugated HCA species might be good candidates for pharmacological research, while they could not be clinical biomarkers due to their low serum concentrations.
T2DM is inherently associated with obesity and aging 24 , so we tried to eliminate the confounding effects of BMI and age when evaluating the role of HCA species in T2DM. By matching age and/or BMI between the groups in comparison, we demonstrated that HCA species had direct correlations with glycemic markers and future metabolic outcomes. These results provided evidence that HCA species play critical roles in regulating glucose homeostasis and are protective against the development of T2DM in humans.
We also showed that, compared with healthy controls, patients with pre-diabetes and diabetes had only ~27% lower serum levels of HCA species, but strikingly ~57% lower HCA species in feces, although these patients had similar levels of total BAs in feces as controls. Notably, the patients with pre-diabetes and diabetes had higher BMIs than the healthy controls, which suggests that they may also have altered gut microbiota 34 as previously discussed.
RYGB surgery is considered a rapid resolution of metabolic disorders. Total and individual HCA species were found significantly increased after RYGB 19 ; and among all BAs, the increases in HCA species were the most pronounced and consistent. Our results further highlight the potential predictive value of HCA species for long-term post-operative metabolic outcomes irrespective of the initial remission. Future follow-up studies with large sample sizes and long duration (3 years or more) are necessary to provide more convincing evidence to validate the prognostic value of blood HCA levels for RYGB surgery.
In conclusion, obesity and diabetes were associated with significantly lower levels of HCA species in serum. There were
Table 2 Metabolic markers of patients with diabetes at baseline (0 m) and 1, 3, 6, and 12 months after surgery in the gastric bypass surgery intervention study.

Methods
Human study 1: cross-sectional study 1. This is a nested case-control study, performed within the SHOS 20, which was designed to investigate the occurrence and development of the metabolic syndrome and its related diseases. Beginning in 2009, the SHOS recruited participants from four communities in Shanghai, China. The participants that were selected satisfied the criteria for overweight/obese and diabetes. The exclusion criteria were: type 1 diabetes, pregnancy, severe diabetic complications (diabetic retinopathy, diabetic neuropathy, diabetic nephropathy, and diabetic foot); severe hepatic diseases including chronic persistent hepatitis, liver cirrhosis or the co-occurrence of positive hepatitis and abnormal hepatic transaminase; severe organic disease, including cancer, coronary heart disease, renal disease, thyroid disease, myocardial infarction, or cerebral apoplexy; infectious disease; alcoholism; and continuous medication (including weight loss or psychotropic medication) for over 3 days prior to enrollment. According to these criteria, a total of 1107 subjects with fasting serum samples were selected prior to BA analysis. This cohort included 585 HL (329 men and 256 women), 419 HO (229 men and 190 women) and 103 OD (52 men and 51 women) participants (all samples from cohort 1). We further selected 103 subjects from each group with matched age, sex, and BMI as a matched cohort (matched samples from cohort 1).
Human study 2: cross-sectional study 2. A group of 91 subjects including 26 healthy controls (9 men and 17 women), 30 individuals with pre-diabetes (10 men and 20 women) and 35 patients with diabetes (16 men and 19 women) were recruited for this study. The exclusion criteria were the same as in human study 1. Fasting sera and fecal samples of 91 participants were collected and stored for later analysis.
Human study 3: Gastric bypass surgery intervention study. A total of 38 patients with obesity and diabetes who received RYGB surgery were enrolled in the study 17 . Any patient with a history of open abdominal surgery, a serious disease (such as heart or lung insufficiency) that was incompatible with surgery, an acute T2DM complication, severe alcohol or drug dependency, a mental disorder, type 1 diabetes, secondary diabetes, an unstable psychiatric illness, or who was at a relatively high surgical risk (such as a patient with an active ulcer) was excluded. The fasting serum specimens of these subjects were collected and stored for future analysis before (baseline) and 1, 3, 6, and 12 months after the surgery.
Human study 4: a 10-year longitudinal study. A group of 132 subjects (36 men and 96 women) were selected from the Shanghai Diabetes Study, which was intended to assess the prevalence of diabetes and diabetes-associated metabolic disorders in urban Shanghai 21.
Criteria for lean, overweight/obesity, pre-diabetes, diabetes, metabolically healthy, and unhealthy status. Individuals with BMI < 25 kg/m2 were considered lean and those with BMI ≥ 25 kg/m2 were classified as overweight/obese. Individuals with 6.1 mmol/L ≤ fasting blood glucose < 7.0 mmol/L or 7.8 mmol/L ≤ oral glucose tolerance test (OGTT) (2 h) < 11.1 mmol/L were classified as pre-diabetic. Subjects with fasting blood glucose ≥ 7.0 mmol/L and/or OGTT (2 h) ≥ 11.1 mmol/L were classified as diabetic. Subjects were considered "metabolically healthy" if they met all of the following criteria: fasting blood glucose < 6.1 mmol/L, OGTT (2 h) < 7.8 mmol/L and no previous history of diabetes; systolic blood pressure (SP)/diastolic blood pressure (DP) < 140/90 mmHg and no previous history of high blood pressure; fasting plasma TG < 1.7 mmol/L and fasting plasma high-density lipoprotein cholesterol (HDL-c) ≥ 0.9 mmol/L (men) or ≥ 1.0 mmol/L (women), and no previous history of high cholesterol (total cholesterol (TC) < 5.18 mmol/L); no history of cardiovascular or endocrine disease 7. Those who failed to meet all criteria above were classified as "metabolically unhealthy".
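The BMI and glycemic thresholds above can be written as a short classification sketch. The function names and the "normal" label are hypothetical, and checking the diabetic thresholds before the pre-diabetic ones is an assumption about precedence that the text does not spell out (a subject can satisfy one pre-diabetic and one diabetic condition at the same time). The blood pressure, lipid, and medical-history criteria for metabolically healthy status are not covered here.

```python
# Minimal sketch of the stated cut-offs (glucose in mmol/L, BMI in kg/m2).
# Names and the order of the checks are illustrative assumptions.
def bmi_category(bmi):
    """Lean if BMI < 25, otherwise overweight/obese."""
    return "lean" if bmi < 25 else "overweight/obese"

def glycemic_status(fasting_glucose, ogtt_2h):
    """Classify by fasting glucose and 2-h OGTT glucose."""
    # Assumption: diabetic thresholds take precedence over pre-diabetic ones.
    if fasting_glucose >= 7.0 or ogtt_2h >= 11.1:
        return "diabetes"
    if 6.1 <= fasting_glucose < 7.0 or 7.8 <= ogtt_2h < 11.1:
        return "pre-diabetes"
    return "normal"  # hypothetical label for values below all thresholds

print(bmi_category(23.4), glycemic_status(5.2, 6.9))
print(bmi_category(27.0), glycemic_status(6.5, 7.0))
print(bmi_category(31.2), glycemic_status(6.5, 12.0))
```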
Clinical measurements and sample collection. All human samples were collected, stored and measured following the standard operating protocol of the hospital. Participants were given a standard 75-g OGTT after an overnight fast of more than 8 h. Venous blood samples were drawn at 0 (fasting), 30, 60, and 120 min. Fasting and postprandial plasma glucose and insulin levels, fasting serum lipid profiles (TC, TG, HDL-c, LDL-c), blood pressure (SP and DP), waist circumference, BMI, liver, and kidney function tests were determined as previously described 7,35 . The blood samples were centrifuged for plasma or serum collection, and then divided into aliquots and delivered on dry ice to the study laboratory. Fecal samples from the recruited subjects were collected in the sterile feces containers (Cat# 80.9924.014, SARSTEDT). Each sample was either frozen immediately at −80°C or briefly stored in personal −20°C freezers before transport to the laboratory within 24 h. All samples were stored in a −80°C freezer until analysis.
Quantitative analysis of BAs. BAs were quantified using in-house established methods with minor modifications to improve accuracy ( Supplementary Fig. 11, 12, and Supplementary Table 18).
Methods of Lab 1. For sample pretreatment, an aliquot of 50 µL serum sample was mixed with 300 µL acetonitrile-methanol (8:2 v/v) containing six internal standards (IS) (D4-GCA, D4-GDCA, D4-CA, D4-UDCA, D4-LCA, and D4-GCDCA, 50 nM for each). The mixture was allowed to stand at 20°C for 30 min, and was then centrifuged at 13,000 × g at 4°C for 30 min. An aliquot of 300 µL of supernatant was transferred to another tube and then vacuum-dried. A 25 µL volume of acetonitrile-methanol (9:1 v/v) containing 0.01% formic acid was added, and the sample was re-vortexed at 1500 × g, 10°C for 10 min followed by addition of 25 µL water containing 0.01% formic acid. The sample was vortexed again at 1500 × g, 10°C for 10 min, and then centrifuged at 13,000 × g, 4°C for 15 min. The supernatant was used for ultra-performance liquid chromatography-tandem mass spectrometry ( . The column was maintained at 45°C and the injection volume for all samples was 5 μL. The mass spectrometer was operated in negative ion mode with a 2.5 kV capillary voltage. The source and desolvation gas temperature were 150 and 450°C, respectively. The data were collected with multiple reaction monitoring (MRM), and the cone and collision energy for each BA used the optimized settings from QuanOptimize application manager (Waters Corp., Milford, MA).
For instrumental analysis, an aliquot of standard stock solution was prepared by mixing BA standards for a final concentration of 5 µM each. A series of standard calibration solutions were diluted with 50% methanol for the calibration curve. The calibration curve and the corresponding regression coefficients were obtained using internal standard (D4-LCA, D4-DCA, D4-GDCA, D4-GCDCA, D4-CA, D4-GCA, D4-TCA) adjustment. A Waters ACQUITY UPLC system equipped with a binary solvent delivery manager and a sample manager (Waters, Milford, MA) was used throughout the study. The mass spectrometer was a Waters XEVO TQS instrument with an ESI source (Waters Corp., Milford, MA). The entire LC−MS system was controlled by MassLynx 4.1 software. All chromatographic separations were performed with a CORTECS UPLC C18 column (1.6 μm, 100 mm × 2.1 mm internal dimensions) (Waters Corp., Milford, MA). The mobile phase consisted of water with 5 mM ammonium acetate (mobile phase A) and acetonitrile/methanol (80/20, v/v, mobile phase B). The flow rate was 0.40 mL/min with the following mobile phase gradient: 0−0.5 min (5% B), 0.  16.5−17 min (5% B). The column was maintained at 30°C and the injection volume for all samples was 5 μL. The mass spectrometer was operated in negative ion mode with a 2.5 kV capillary voltage. The source and desolvation gas temperature were 150 and 450°C, respectively. The data were collected with MRM, and the cone and collision energy for each BA used the optimized settings from QuanOptimize application manager (Waters Corp., Milford, MA).
Method validation. The two methods for BA quantification were verified independently using commercially available standard human plasma (NIST 1950) and cross-validated between the two labs (Lab 1 and Lab 2). The limit of detection (LOD) and limit of quantification (LOQ) are shown in Supplementary Tables 19 and 20. The results (Supplementary Table 21) showed that the intra-batch and inter-batch precision CVs were lower than 20%.
Statistical analysis. The BA profile raw data acquired using UPLC-MS were processed and quantified using TargetLynx software (Waters Corp., Milford, MA). Manual checking and corrections were carried out in order to ensure data quality. The sample sizes were predetermined by IP4M 2.0 (http://ip4m.cn) 36. The power was set as 0.85 (significance level 0.05, two-tailed) and the effect size was set as 0.5. The sample distribution was determined using a Kolmogorov-Smirnov normality test. For statistical comparisons, the Mann-Whitney U test was used to compare two groups, and the Kruskal-Wallis test followed by pairwise comparisons was used to compare more than two groups. The Wilcoxon signed-rank test was carried out for comparisons of paired sample groups. Spearman's rank correlation coefficients were calculated to examine the association of BAs and typical clinical measurements. Two-tailed p values smaller than 0.05 were considered significant. All the p values were corrected for multiple testing by the Benjamini-Hochberg false discovery rate procedure. ROC analysis was used to test the sensitivity and specificity of total and individual HCA species in the group separation. The ratio of molar sums of BA (Fig. 1b) was the ratio of the level of specific BA for each individual in HO and OD groups to the mean level in HL group. SPSS (V19, IBM, USA), GraphPad Prism (6.0, GraphPad, USA), and IP4M 2.0 were used for statistical analyses and graphic generation.
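The Benjamini-Hochberg correction mentioned above can be sketched in a few lines. This is a generic textbook version of the step-up adjustment, written for illustration with invented p values; it is not the implementation used by SPSS, GraphPad Prism, or IP4M, and the function name is hypothetical.

```python
# Minimal sketch of Benjamini-Hochberg adjusted p values (generic procedure).
def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the same order as the input."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented raw p values for four hypothetical comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
print(adj)
```

Each raw p value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing in the order of the raw values; an adjusted value below 0.05 would then be called significant at a 5% false discovery rate.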
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The metabolomics data have been deposited in the MetaboLights repository under accession code MTBLS2343. Other data supporting the findings of this study are available from the corresponding authors upon reasonable request. Source data are provided with this paper.