Prediction and causal inference of hyperuricemia using gut microbiota

Hyperuricemia (HUA) is a condition of high blood uric acid (UA) levels, which causes disorders such as gout and renal urinary calculus. Prolonged HUA is often associated with hypertension, atherosclerosis, diabetes mellitus, and chronic kidney disease. Studies have shown that gut microbiota (GM) affect these chronic diseases. This study aimed to determine the relationship between HUA and GM. The microbiome of 224 men and 254 women aged 40 years or older was analyzed through next-generation sequencing and machine learning. We obtained GM data through 16S rRNA-based sequencing of the fecal samples, finding that alpha-diversity by Shannon index was significantly lower in the HUA group. Linear discriminant analysis effect size detected a high abundance of the genera Collinsella and Faecalibacterium in the HUA and non-HUA groups, respectively. Based on light gradient boosting machine learning, we propose that HUA can be predicted with high AUC using four clinical characteristics and the relative abundance of nine bacterial genera, including Collinsella and Dorea. In addition, analysis of causal relationships using a direct linear non-Gaussian acyclic model indicated a positive effect of the relative abundance of the genus Collinsella on blood UA levels. Our results suggest that abundant Collinsella in the gut can increase blood UA levels.

phenotypes 10. Therefore, it has frequently been applied in GM research in recent years. Studies on GM and UA in humans have been epidemiological and examine only associations; they do not assess causality. One causal inference method that has been proposed to assess the causal structure of variables is the linear non-Gaussian acyclic model (LiNGAM) 11. The aim of this study was to use LiNGAM to infer the causal relationship between GM and UA in Japanese adults.
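For orientation, the LiNGAM model referred to above is conventionally written as follows. The notation here ($x_i$, $b_{ij}$, $e_i$, $k(\cdot)$) is the standard one from the LiNGAM literature and is an assumption, since the text does not define symbols.

```latex
% Each observed variable is a linear function of variables earlier in the
% causal order k(.), plus an independent non-Gaussian disturbance e_i.
x_i = \sum_{k(j) < k(i)} b_{ij}\, x_j + e_i, \qquad i = 1, \dots, p
% Matrix form: B can be permuted to a strictly lower triangular matrix
% (acyclicity), and the e_i are mutually independent and non-Gaussian.
\mathbf{x} = \mathbf{B}\mathbf{x} + \mathbf{e}
```

Non-Gaussianity of the disturbances is what allows the causal order, and hence the direction of each edge, to be identified from observational data alone.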

Clinical background
Two of the 488 participants who submitted fecal samples were excluded because they had fewer than 5000 sequences in the NGS analysis. Forty-one were taking UA-lowering drugs, antibiotics, steroids, bowel regulators, biocides, antibacterials, or proton pump inhibitors, and five were undergoing cancer treatment. Ten had missing health examination data, and 30 did not fast before blood collection. After these exclusions, a total of 400 participants (176 men and 224 women) were included in the analysis.
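The participant flow above can be checked with simple arithmetic. The following is a minimal sketch; the dictionary labels are illustrative and the counts are those reported in the text.

```python
# Participant flow as reported in the text; the labels are illustrative.
submitted = 488
exclusions = {
    "fewer than 5000 sequences": 2,
    "taking excluded medications": 41,
    "undergoing cancer treatment": 5,
    "missing health examination data": 10,
    "did not fast before blood collection": 30,
}

# Participants remaining after all exclusions.
analyzed = submitted - sum(exclusions.values())

# Reported composition of the analyzed set.
men, women = 176, 224

print(analyzed, men + women)
```

Both printed values should agree with the 400 participants reported.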

Differences in GM composition by HUA
Table 1 shows the clinical characteristics of the HUA (UA > 7.0 mg/dL in the blood) and non-HUA groups. There were significant differences in BMI, waist circumference, UA, S-Cre, eGFR, and frequency of alcohol consumption between the two groups. The genus-level composition of the top 30 intestinal bacterial genera in the two groups is shown in Fig. 1A. The Shannon index was significantly reduced in the HUA group (Fig. 1B, P = 0.027, ANCOVA), whereas non-metric multidimensional scaling analysis using the Bray-Curtis distance (β-diversity) showed no significant difference in gut bacterial composition between the two groups (Fig. 1C).
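The two diversity measures used above can be sketched in a few lines. This is an illustrative implementation, not the QIIME2 pipeline used in the study; the natural logarithm is an assumption (the log base is not stated in the text), and the sample counts are hypothetical.

```python
import math

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator

# Hypothetical genus-level read counts for two samples (not study data).
sample_a = [50, 30, 20, 0]
sample_b = [25, 25, 25, 25]

print(shannon_index(sample_a), shannon_index(sample_b))
print(bray_curtis(sample_a, sample_b))
```

A perfectly even community of four genera gives the maximum Shannon index for four taxa, ln 4, while a sample dominated by fewer genera scores lower.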

GM associated with HUA
LEfSe analysis of all 436 bacteria (Fig. 2) showed that 11 and 15 were significantly more abundant in the HUA and non-HUA groups, respectively. The bacterium with the highest linear discriminant analysis (LDA) score in the HUA group was the genus Collinsella (LDA score = 3.569, P = 0.013), and the bacterium with the highest LDA score in the non-HUA group was Faecalibacterium (LDA score = 4.138, P = 0.033).

Correlation between UA levels and GM
Figure 4 shows the correlation between serum UA levels and the relative abundance of the nine intestinal bacteria selected in the LGBM; as the heatmap shows, the genera Collinsella and Dorea were significantly correlated with serum UA levels. No significant correlation was observed between these two intestinal genera and serum renal function indices other than UA, such as eGFR and S-Cre (Supplementary Fig. 1).
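The Spearman correlation underlying this heatmap can be sketched as a Pearson correlation computed on ranks. The function names and the data below are hypothetical and are not taken from the study.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))

# Hypothetical serum UA (mg/dL) and relative abundance of one genus.
serum_ua = [4.1, 5.3, 6.0, 7.4, 8.2]
abundance = [0.01, 0.03, 0.02, 0.05, 0.08]
rho = spearman(serum_ua, abundance)
print(rho)
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distributions typical of relative abundance data.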

Causal relationship between serum UA levels and intestinal bacteria using Direct LiNGAM
The Direct LiNGAM algorithm was used to infer the causal relationship between serum UA levels and intestinal bacterial abundance ratios. The inferred causal diagrams, causal ranks, and partial regression coefficients for the serum

Discussion
Statistical analysis and machine learning revealed associations between specific gut bacteria and HUA and inferred a causal relationship. The genera Collinsella, Dorea, and Lachnospiraceae FCS020 group were identified as characteristic bacteria involved in HUA. Direct LiNGAM suggested that the genera Collinsella and Lachnospiraceae FCS020 group may alter serum UA levels. The presence or absence of HUA can be accurately predicted using general laboratory information and gut bacteria data. In particular, the genus Collinsella is presumed to have a direct causal relationship with UA. Collinsella aerofaciens, a representative species of the genus Collinsella, is abundant in the intestinal flora of Asians 12 and produces butyric, formic, lactic (LA), and acetic acids 13. C. aerofaciens has been reported to affect host health and disease 14, but there are currently no reports of Collinsella spp. increasing or decreasing in subjects with HUA or affecting serum UA levels to become a risk factor for HUA.
There are three possible mechanisms by which Collinsella spp. modulate host serum UA levels. First, Collinsella spp. directly produce UA. Second, Collinsella spp. indirectly inhibit UA degradation by other bacteria. Finally, the metabolites produced by Collinsella spp. reduce renal and intestinal excretion of UA.
Collinsella spp. harbor gene sequences for hypoxanthine, the precursor of UA, and xanthine dehydrogenase, which converts xanthine into UA (NCBI database). A known bacterial catabolic pathway for UA is the allantoin pathway, which proceeds in three steps via 5-hydroxyisouric acid, 2-oxo-4-hydroxy-4-carboxy-5-ureidoimidazoline, and allantoin, the last of which is readily degraded to ammonia 15. Many bacterial species utilize this metabolic pathway. Lactobacillus brevis (DM9218) and Lactobacillus gasseri (PA-3) also have potential as probiotics to improve HUA by degrading intermediates of purine metabolism 16. Lactobacillus gasseri (PA-3) is a bacterium recently found in yogurt and other products, suggesting that dietary habits may affect UA levels via GM 17. If Collinsella spp. can inhibit the activity and growth of enterobacteria that degrade intestinal UA, this may constitute a mechanism to increase UA levels in the host.
Serum UA is excreted from the kidneys and intestinal tract. Thus, indole and LA from Collinsella spp. may impair renal UA excretion. Indole and LA have been found to inhibit serum UA excretion when the blood UA levels increase 18-20. Collinsella spp. possess tryptophanase, which metabolizes tryptophan to indole. The indole produced is transferred to the liver, where it is converted to indoxyl sulfate and is thought to be responsible for the aggravation of renal and vascular diseases 18,19. Kurihara et al. reported that C. aerofaciens produces sufficient LA 20. However, bacterial species such as Enterococcus faecalis and Bacteroides intestinalis have been reported to produce particularly high lactate levels, suggesting that Collinsella spp. may not be the only cause of HUA via this mechanism.
Loss of function of the ATP-binding cassette transporter G2 (ABCG2), which is abundantly expressed in the intestinal tract, mainly in the ileum, has been reported to cause HUA and gout 21,22; ABCG2 excretes not only UA but also the aforementioned indoxyl sulfate 23,24. The mechanisms by which Collinsella spp. regulate host UA levels require further study. In addition to Collinsella spp., Lachnospiraceae FCS020 group also showed the potential to reduce serum UA levels; Lachnospiraceae FCS020 group was significantly reduced in the HUA group, similar to Lachnospiraceae bacteriaceae 25. Bacterial species in this family may act as protective factors against HUA.
In general, seafood, soy products, and beer, when consumed in excess, tend to increase uric acid levels 26. In addition, consumption of probiotic-containing foods and beverages such as yogurt and Yakult may affect the intestinal bacteria associated with HUA. Therefore, using information about participants' daily eating habits may improve the accuracy and reliability of models predicting HUA.
This study has certain limitations. First, the sample size was small. Overall, 400 samples were included in the analysis, of which 31 were from patients with HUA. In addition to the small sample size, the lack of data from another population did not allow us to conduct an external validation to assess the performance of the LGBM prediction model. Second, analysis was performed at the bacterial genus level in the 16S rRNA V3-V4 region. A more detailed classification at the bacterial species level, rather than at the genus level, would reveal changes in serum UA levels, which would be more beneficial for clinical applications such as dietary and other probiotic interventions. Finally, although this predictive and inferential study was based on observational data and took confounding factors into account to the greatest extent possible, the influence of potential confounding factors cannot be completely ruled out.
In conclusion, we found that the genus Collinsella may be the GM most causally related to serum UA levels in the present population. This suggests that maintaining a low ratio of certain gut bacteria may help maintain serum UA levels, reducing the risk of HUA. In the future, it may be possible to identify GM compositions that improve UA metabolism and contribute to the prevention of HUA. Discovering prebiotics that affect Collinsella spp. and increasing the number of gut bacteria that antagonise Collinsella spp. could be a new therapeutic strategy for patients with HUA. Further studies are required to elucidate the detailed mechanisms of action of the GM in HUA.

Participants
The participants were 488 residents (224 men and 254 women) aged 40 years or older, of Shika-machi, Hakui-gun, Ishikawa Prefecture, Japan, whose fecal samples were collected during a health checkup in January 2018 (n = 254: 115 men, 131 women, 8 unknown) and January 2020 (n = 234: 109 men, 123 women, 2 unknown). The participants were divided into HUA and non-HUA groups based on an established criterion 1: the HUA group had serum UA > 7.0 mg/dL. We excluded participants (1) who had been taking UA-lowering drugs, antibiotics, steroids, bowel regulators, biocides, antibacterial agents, or proton pump inhibitors; (2) who had been undergoing any treatment for cancer; (3) who had eaten within 10 h at the time of blood collection; and (4) whose diagnostic data were missing.

Data source
Data from the Shika-machi Super Preventive Health Examination, a population survey aimed at establishing preventive methods for lifestyle-related diseases, were used. The survey was conducted twice, in January 2018 and January 2020. The four model districts selected from the Shika area were Horimatsu, Higashimasuho, Tsuchida, and Higashiki 27,28.

Ethical considerations
This study was approved by the Kanazawa University Hospital Human Research Ethics Committee (approval number: 1491) and conducted in accordance with the principles of the Declaration of Helsinki and the Kanazawa University Microbial Safety Management Regulations. After an overview of the study was provided to all participants at the time of physical examination, written informed consent was obtained prior to GM collection. The fecal samples were processed in a non-proliferation level 2 (P2) laboratory.

Data collection
The Super-Preventive Health Checkup data (Shika Town) regarding parameters such as age, sex, medical history, medication status, and alcohol consumption/smoking status were collected using a questionnaire. The body mass index (BMI) was calculated by dividing the current weight (kg) by the square of the height (m²). After fasting for 12 h, venous blood was collected, and serum UA (s-UA) and serum creatinine (S-Cre) levels were measured. Estimated glomerular filtration rate (eGFR) was calculated from S-Cre as in previous articles 28.
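The BMI calculation and the HUA grouping criterion (serum UA > 7.0 mg/dL, from the Participants section) can be expressed as a short sketch. The function names and the example values are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2

def is_hua(serum_ua_mg_dl, threshold=7.0):
    """HUA grouping: serum UA strictly above 7.0 mg/dL."""
    return serum_ua_mg_dl > threshold

# Hypothetical participant (not study data).
print(round(bmi(70.0, 1.75), 1))
print(is_hua(7.4), is_hua(7.0))
```

Note that the criterion is a strict inequality, so a value of exactly 7.0 mg/dL falls in the non-HUA group.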

Fecal sample collection and DNA extraction
Fecal samples were collected from 488 participants using the method described previously 29. The stool surface samples were collected independently by the participants using clean paper (AZ-ONE, Osaka, Japan) and a clean spatula with a plastic tube (AZ-ONE, Japan). The collected fecal samples were kept on ice and transported to the laboratory. The samples were stored at −80 °C until DNA extraction. Total DNA was extracted using the NucleoSpin® DNA Stool kit (Macherey-Nagel, Düren, Germany).

Next-generation sequencing
The DNA extracted from the GM was processed for identification of the 16S rRNA gene sequence by NGS, using a previously described method 28. The 16S rRNA gene was amplified using the first-round PCR primers (F: 5′-TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCT ACG GGN GGC WGC AG-3′; R: 5′-GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA CTA CHV GGG TAT CTA ATC C-3′) 11 (Hokkaido System Science Co., Ltd., Osaka, Japan). Ex Taq® hot-start version (TaKaRa Bio Inc., Shiga, Japan) and TaKaRa PCR Thermal Cycler Dice® Gradient (TaKaRa Bio Inc., Shiga, Japan) were used to amplify the V3-V4 region of the 16S rRNA gene. Polymerase chain reaction (PCR) products were purified using Agencourt AMPure XP magnetic beads (Beckman Coulter, Inc., CA, USA). The concentrations of the resultant PCR products were measured using the Qubit® dsDNA HS Assay Kit and Qubit® 3.0 Fluorometer (Thermo Fisher Scientific). All the purified PCR products were indexed and sequenced using MiSeq (Illumina, Inc., CA, USA) with MiSeq Reagent Kit version 3 and PhiX Control v3 (Illumina).
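The primers above contain IUPAC degenerate bases (N, W, H, V), so each primer stands for a small pool of concrete oligonucleotides. The following sketch counts that pool size; the function name is hypothetical, and the lookup table is the standard IUPAC nucleotide code.

```python
# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct concrete sequences a degenerate primer represents."""
    sequence = primer.replace(" ", "").upper()
    count = 1
    for base in sequence:
        count *= len(IUPAC[base])
    return count

# Primer sequences as given in the text (spaces are ignored).
forward = "TCG TCG GCA GCG TCA GAT GTG TAT AAG AGA CAG CCT ACG GGN GGC WGC AG"
reverse = "GTC TCG TGG GCT CGG AGA TGT GTA TAA GAG ACA GGA CTA CHV GGG TAT CTA ATC C"

print(degeneracy(forward), degeneracy(reverse))
```

The forward primer has one N (4 options) and one W (2 options); the reverse primer has one H and one V (3 options each).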

Microbiome analysis
For microbiome analysis, QIIME2 software was used 30. Demultiplexed paired-end sequence data were denoised with DADA2, and a naïve Bayes classifier trained on the Silva 16S rRNA database (release 132) 31 was used for ASV classification. Samples with fewer than 5000 sequences were removed from the analysis.
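The read-depth filter and the conversion to relative abundance can be sketched as follows. This is an illustration in plain Python rather than the QIIME2 commands used in the study; the function name, table layout, and counts are hypothetical.

```python
MIN_READS = 5000  # samples with fewer sequences are removed, as in the text

def filter_and_normalize(count_table, min_reads=MIN_READS):
    """Drop low-depth samples, then convert counts to relative abundances."""
    result = {}
    for sample, counts in count_table.items():
        depth = sum(counts.values())
        if depth < min_reads:
            continue  # sample fails the sequencing-depth threshold
        result[sample] = {taxon: n / depth for taxon, n in counts.items()}
    return result

# Hypothetical genus-level count table (not study data).
table = {
    "S1": {"Collinsella": 600, "Faecalibacterium": 5400},
    "S2": {"Collinsella": 100, "Faecalibacterium": 900},
}
relative = filter_and_normalize(table)
print(relative)
```

Here sample S2 is dropped because its total of 1000 reads is below the threshold, and S1 is rescaled so that its abundances sum to 1.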

Figure 1. Differences in gut microbiota between HUA and non-HUA groups. (A) Comparison of relative abundance ratios at the phylum and genus level for the top 30 bacterial genera. (B) The difference in α-diversity calculated using the Shannon index (P = 0.027, Quade's nonparametric ANCOVA). (C) Plot of β-diversity analysis calculated by NMDS ordination based on the Bray-Curtis distance matrix. Red: HUA, blue: non-HUA. Ellipses represent 95% confidence intervals for each genus used in the analysis (P = 0.888, PERMANOVA). NMDS non-metric multidimensional scaling, ANCOVA analysis of covariance, HUA hyperuricemia, PERMANOVA permutational multivariate analysis of variance.

Figure 2. Identification of the intestinal bacteria involved in HUA. LEfSe analysis of the top 436 bacterial species, with LDA score = 2.0 as the cutoff value. HUA hyperuricemia, non-HUA non-hyperuricemia, LEfSe linear discriminant analysis effect size.

Figure 3. Receiver operating characteristic (ROC) curves for models predicting the presence or absence of HUA. The performance of the model using 13 characteristics, including bacterial genera, is shown in red. The performance of the model using only four variables (age, BMI, waist circumference, and frequency of alcohol consumption) is shown in blue. The ROC curve of the model with the median AUC out of 50 cross-validations is shown.
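The AUC reported for each ROC curve, and the choice of the run with the median AUC, can be sketched as follows. This is not the LGBM pipeline used in the study; the function name, labels, scores, and per-run AUC values are all hypothetical.

```python
from statistics import median

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = HUA) and predicted probabilities.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]
auc = roc_auc(labels, scores)

# Hypothetical AUCs from repeated cross-validation runs; pick the run
# whose AUC is closest to the median as the representative curve.
run_aucs = [0.80, 0.90, 0.85]
median_auc = median(run_aucs)
representative = min(range(len(run_aucs)), key=lambda i: abs(run_aucs[i] - median_auc))
print(auc, median_auc, representative)
```

With an even number of runs, such as the 50 cross-validations here, the median is the mean of the two middle values, so selecting the run closest to it is one reasonable convention; the text does not state which was used.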

Figure 4. Correlation between serum UA levels and bacterial genus abundance ratios. The bacteria shown are the nine genera used in the LGBM model with the highest AUC. Spearman's correlation coefficient determines the color intensity of the heatmap. Red: positive correlation, blue: negative correlation (*P < 0.05). The correlation matrix was visualized as a heatmap using "pheatmap" in R.

Figure 5. Causal inference between serum UA levels and GM by LiNGAM. Arrows indicate the direction of causality between the two indices. Values are standardized partial regression coefficients. Red: bacteria with an inferred causal relationship with UA; blue: serum UA level. Numerical values are absolute values of the partial regression coefficients.

Table 1. Characteristics of study participants. The P-values were calculated by analysis of covariance (ANCOVA or Quade's non-parametric ANCOVA). ANCOVA analysis of covariance, BMI body mass index, S-Cre serum creatinine, eGFR estimated glomerular filtration rate, DM diabetes mellitus, DL dyslipidemia, HT hypertension, CVD cardiovascular disease.

Table 2. List of features used in the HUA prediction algorithm showing the highest AUC in LGBM. BMI body mass index.