Distinguishing feature of gut microbiota in Tibetan highland coronary artery disease patients and its link with diet

The prevalence of coronary artery disease (CAD) in Tibetan Highlanders is lower than that in plain-living individuals, but the mechanism is still unclear. Gut microbiota (GM) disorder is considered one of the potential factors involved in the pathogenesis of CAD, but the GM characteristics of Tibetan Highlanders suffering from CAD are unknown. We sequenced the V3-V4 region of the 16S ribosomal RNA of gut bacteria from fecal samples from Tibetan and Han CAD patients and healthy individuals inhabiting the Qinghai-Tibet Plateau, as well as from Han CAD patients and healthy individuals living at sea level, and we analyzed the GM characteristics of these subjects by bioinformatics analysis. The results showed that Tibetan Highlanders suffering from CAD had higher GM α-diversity and a distinctly distributed cluster compared with healthy Tibetan Highlanders and Han CAD patients living at high and low altitudes. The genera Catenibacterium, Clostridium_sensu_stricto, Holdemanella, and Ruminococcus 2 were enriched in Tibetan Highlanders suffering from CAD compared with healthy Tibetan Highlanders and Han CAD patients living at high and low altitudes. Prevotella was enriched in Tibetan Highlanders suffering from CAD compared with Han CAD patients living at high and low altitudes. Moreover, Catenibacterium was positively correlated with Prevotella. Additionally, Catenibacterium, Holdemanella, and Prevotella were positively correlated with fermented dairy product, carbohydrate and fiber intake by the subjects, while Clostridium_sensu_stricto was negatively correlated with protein intake by the subjects. In conclusion, our study indicated that Tibetan Highlanders suffering from CAD showed a distinct GM, which was linked to their unique dietary characteristics and might be associated with CAD.


Results
Participant general characteristics and diet structure. Age, sex, body mass index (BMI), smoking status, systolic blood pressure (SBP) and diastolic blood pressure (DBP) on admission, total cholesterol (TC), triglycerides (TG), low-density lipoprotein cholesterol (LDL-c), alanine aminotransferase (ALT), aspartate aminotransferase (AST), urea nitrogen (BUN), creatinine, fasting blood glucose, and C-reactive protein (CRP) levels were noted for each participant. Cardiac function and geometry, including left atrium diameter (LA), right atrium diameter (RA), interventricular septal thickness (IVS), left ventricular end-diastolic diameter (LVEDD), right ventricle diameter (RV), and ejection fraction (EF%), were collected. For CAD patients, coronary anatomy and time since last chest pain were also collected. Age, sex, BMI, smoking status, SBP, DBP, TG, ALT, AST, BUN, creatinine, fasting blood glucose, CRP levels and cardiac function and geometry (including LA, RA, IVS, LVEDD, RV and EF%) showed no significant differences among the 6 groups (continuous variables: one-way analysis of variance (ANOVA), categorical variables: Kruskal-Wallis test, probability (p) > 0.05, Table 1). The number of stenosed vessels and time since last chest pain also showed no significant differences within the CAD subgroups (continuous variables: ANOVA, categorical variables: Kruskal-Wallis test, p > 0.05, Table 1). The HHC group showed lower TC and LDL-c levels than the HHN group (two-tailed Student's t-test, TC: t = 3.217, false discovery rate (FDR)-adjusted p = 0.009; LDL-c: t = 2.803, FDR-adjusted p = 0.063), and the LHC group showed lower TC levels than the HHN group (two-tailed Student's t-test, t = 3.664, FDR-adjusted p = 0.009), which might be attributed to statin use by CAD patients. We also analyzed the diet composition of subjects from different groups. 
Tibetan Highlanders consumed more fermented dairy products, carbohydrates and fiber but less protein and fat than high-altitude and low-altitude Han individuals (Kruskal-Wallis test, p < 0.05, Table 2).

Multivariable statistical analysis of GM diversity and distribution among different groups. First, the Shannon rarefaction curves based on the operational taxonomic unit (OTU) profiles for all samples reached a plateau, revealing that the sequencing depth sufficiently captured the GM variation in these populations (Fig. 1a). Second, to demonstrate the diversity of the GM, α-diversity based on the Shannon, Ace, and Chao indexes was compared among the 6 groups. The multivariable analysis showed that there were statistical differences in α-diversity among the 6 groups (one-way ANOVA, Shannon: p = 5.5e−06, Ace: p = 1e−12, Chao: p = 3.9e−13). Pairwise group comparisons showed that the Shannon index differed significantly between HTC and HTN (two-tailed Student's t-test, FDR-adjusted p = 0.038) and between LHC and LHN (two-tailed Student's t-test, FDR-adjusted p = 0.033) (Fig. 1b-d). Among the CAD subgroups, the Shannon, Ace, and Chao indexes of the HTC subgroup were significantly higher than those of the HHC and LHC subgroups (two-tailed Student's t-test, FDR-adjusted p < 0.005, Fig. 1b-d). Among the healthy individual subgroups, the Shannon, Ace, and Chao indexes of the HTN subgroup were significantly higher than those of the HHN and LHN subgroups (two-tailed Student's t-test, FDR-adjusted p ≤ 0.005, Fig. 1b-d). However, no difference was shown between the HHC and LHC subgroups or between the HHN and LHN subgroups (two-tailed Student's t-test, FDR-adjusted p > 0.05, Fig. 1b-d). In conclusion, Tibetans living at high altitude showed higher α-diversity than Han individuals living at high or low altitude. Moreover, Tibetan CAD patients living at high altitude showed a higher Shannon index than healthy Tibetan individuals living at high altitude. Furthermore, weighted UniFrac distance matrix-based principal coordinate analysis (PCoA) was used to estimate the β-diversity. 
In general, the bacterial community distributions of the 6 groups were different (adonis test, R² = 0.106, p = 0.001, Table 2). Within the altitude- and ethnicity-matched subgroups, the bacterial community of HTC was different from that of HTN (adonis test: R² = 0.034, FDR-adjusted p = 0.078), and the bacterial community of LHC was different from that of LHN (adonis test: R² = 0.033, FDR-adjusted p = 0.057), but no significant difference was shown between HHC and HHN (adonis test: R² = 0.016, FDR-adjusted p = 0.763, Fig. 1e and Table 3). Within the CAD subgroups, the bacterial communities of HTC, HHC and LHC formed distinct clusters (adonis test, HTC vs. HHC: R² = 0.102, FDR-adjusted p = 0.002; HTC vs. LHC: R² = 0.137, FDR-adjusted p = 0.002; HHC vs. LHC: R² = 0.078, FDR-adjusted p = 0.005, Fig. 1e and Table 3). Within the healthy individual subgroups, the bacterial communities of HTN, HHN and LHN also formed distinct clusters (adonis test, HTN vs. HHN: R² = 0.038, FDR-adjusted p = 0.002; HTN vs. LHN: R² = 0.088, FDR-adjusted p = 0.002; HHN vs. LHN: R² = 0.076, FDR-adjusted p = 0.002, Fig. 1e and Table 3).
Finally, we identified the independent factors associated with CAD by binary logistic regression analysis. First, univariate binary logistic regression analysis was used to compare clinical characteristics, laboratory characteristics, diet composition and specific bacterial read counts between CAD patients and healthy individuals. Group membership was the dependent variable (0 = healthy individual, 1 = CAD). No significant differences were detected in sex, smoking, BMI, DBP, fasting blood glucose, ethnicity, living altitude, fermented dairy product, fat, fiber, and protein intake, or Clostridium_sensu_stricto, Holdemanella, Ruminococcus 2, and Prevotella read counts between the two groups (p > 0.05, Table 4 left column). 
Age, SBP, CRP, carbohydrate intake and Catenibacterium read counts were significantly different between the two groups (p < 0.05, Table 4 left column). Then, we performed multivariate binary logistic regression analysis to identify the risk factors for CAD. All parameters showing significant differences between the CAD and healthy individual groups in univariate regression analysis were selected for multivariate regression analysis. The results revealed that age (OR = 1.073, 95% CI 1.019-1.130, p = 0.007) and SBP (OR = 1.071, 95% CI 1.014-1.132, p = 0.014) were independent risk factors for CAD (Table 4 right column). Catenibacterium read counts were also independently associated with CAD (OR = 1.000, 95% CI 1.000-1.001, p = 0.021, Table 4 right column). These results showed that, compared with altitude- and ethnicity-matched healthy individuals, the composition of the GM in patients with CAD had its own characteristics. The composition of the GM was more distinctive in CAD patients, especially in the HTC group.
Catenibacterium might be involved in the development of CAD in Tibetan highlanders.
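The odds ratios in the regression above are expressed per one unit of each predictor, which is why a per-read-count OR such as that of Catenibacterium can sit very close to 1.000 and still be significant. As a minimal sketch, assuming the reported intervals are Wald intervals on the log-odds scale (the text does not state this) and that age was entered in years, the following Python back-calculates a coefficient and standard error from a reported OR and CI and rescales the OR to a larger unit. The function names are illustrative and are not part of the study's analysis.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def beta_se_from_or_ci(odds_ratio, ci_low, ci_high, z=Z_95):
    """Back-calculate a logistic coefficient and its standard error
    from a reported odds ratio and Wald confidence interval (assumed)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def wald_or_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence interval from a coefficient and SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def rescale_or(beta, units):
    """Odds ratio for a change of `units` in the predictor."""
    return math.exp(units * beta)


# Age, as reported in the text: OR = 1.073, 95% CI 1.019-1.130.
beta_age, se_age = beta_se_from_or_ci(1.073, 1.019, 1.130)
or_age, low_age, high_age = wald_or_ci(beta_age, se_age)

# Assumption: age in years, so 10 units corresponds to one decade.
or_per_decade = rescale_or(beta_age, 10)
print(round(or_per_decade, 2))
```

The same rescaling applies to read counts: an OR that rounds to 1.000 per single read can correspond to a noticeably larger OR per thousand reads, so the unit of the predictor should be kept in mind when reading Table 4.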

Discussion
CAD is a disease affected by multiple factors, and exploring the pathogenesis of CAD is still the goal of many researchers. Recently, studies have shown that the GM and its metabolite, TMAO, are closely related to CAD 14,15,18. GM disorder is considered one of the potential factors involved in the pathogenesis of CAD. Because of the unique genetic background, living environment and diet of Tibetan Highlanders, the incidence and mortality of CAD in this population differ from those in other populations 4-6. Interestingly, the incidence of CAD in Tibetan Highlanders is lower than that in Japanese individuals living at sea level, which suggests that unknown contributors play protective roles against CAD in Tibetan Highlanders. Disorders of the GM may lead to the occurrence of diseases, but the optimization of the GM may prompt the body to adapt to the environment. Several studies have shown that Tibetan Highlanders have different GM characteristics compared with other individuals, which might be because stable and balanced gut ecosystems play an important role in human self-protection in harsher environments 16,19,20. In this study, our results showed that α-diversity, measured as microbial richness and evenness by the Shannon, Ace, and Chao indexes, was higher in the HTN group than in the HHN and LHN groups. β-diversity, measured as microbial distribution by weighted UniFrac distance matrix-based PCoA, also differed between the HTN group and the HHN and LHN groups. These results are in accordance with what Liu et al. reported 21. Moreover, we also found that the Shannon, Ace, and Chao indexes were higher in the HTC group than in the HHC and LHC groups, and that in the weighted UniFrac distance matrix-based PCoA the HTC group formed a cluster distinct from those of the HHC and LHC groups. 
These results indicated that host genetic and environmental factors shaped the diversity and distribution of the GM in both healthy individuals and patients suffering from CAD. Moreover, this might be one of the reasons why CAD was not prevalent in Tibetan Highlanders.
To date, data suggest that both Prevotella and Catenibacterium are closely correlated with dietary habits, living environments and/or ethnicities. Wu et al. 22 reported that enterotypes dominated by Prevotella and Catenibacterium were strongly associated with long-term diets containing high carbohydrates but little protein and animal fat. Several studies have demonstrated that Prevotella is one of the core microbiota genera of the Tibetan population, and its high abundance is associated with high carbohydrate and low fat and protein intake 16,23,24. Dehingia et al. 25 confirmed that the abundance of Prevotella was positively correlated with the industrialization level of the living environment. He et al. 26 reported that supplementing the diet with oat bran, a high-fiber food, increased the abundance of Prevotella and Catenibacterium in the gut of pigs. Catenibacterium is also correlated with ethnicity. A higher abundance of Catenibacterium was noted in Indian adults than in Chinese adults 27. Catenibacterium was also found to be enriched in the gut of Egyptian children and Bangladeshi children compared to US children 28,29. By analyzing the diet structure of our subjects, we also confirmed that Tibetan Highlanders consumed more carbohydrates, fiber and fermented dairy products. The industrialization level of the QTP is much lower than that of Wuhan, which is a large modern city. Moreover, both Prevotella and Catenibacterium were enriched in Tibetan Highlanders compared with health condition-matched Han individuals (HTC vs. HHC and LHC; HTN vs. HHN and LHN). In addition, Prevotella and Catenibacterium were closely and positively correlated with each other. Furthermore, we also confirmed that Prevotella and Catenibacterium were positively correlated with altitude and with fermented dairy product, carbohydrate and fiber intake by the subjects. 
Therefore, our study confirmed that the enrichment of Prevotella and Catenibacterium in the gut of Tibetan Highlanders is probably associated with the unique dietary habits and living environment of this population. More importantly, we found that the abundance of Catenibacterium was higher in the HTC group than in the HTN group, which suggested that Catenibacterium is related to the progression of CAD. Accumulating evidence indicates that Catenibacterium might play protective roles in cardiovascular diseases. It is known that some carbohydrates and dietary fiber can escape digestion in the upper gastrointestinal tract and are fermented by the GM in the caecum and colon 30. The most abundant metabolites produced when carbohydrates and fiber are broken down by the GM are SCFAs, which exert beneficial effects by regulating inflammation and slowing the development of atherosclerosis 31-33. One study reported that Catenibacterium could improve gut health and nutrient utilization by enhancing the fermentation of fiber to produce SCFAs 26. Fu et al. 34 reported that Catenibacterium was inversely correlated with host BMI. Kelly et al. 35 showed that enrichment of Catenibacterium was associated with a decreased lifetime CAD risk. Therefore, it can be assumed that Catenibacterium is a special genus induced by the unique dietary habits of Tibetan Highlanders and might play a role in CAD in Tibetan Highlanders.
Clostridium_sensu_stricto is another fermentative bacterium that benefits the host by producing SCFAs 36, but the role of Clostridium_sensu_stricto in CAD is still unclear. Fan et al. 37 confirmed that feeding pigs a low-protein diet decreased the proportion of Clostridium_sensu_stricto in the colon. Here we found that Clostridium_sensu_stricto was enriched in the HTC group compared with the HTN group, as well as with the HHC and LHC groups. Moreover, enriched Clostridium_sensu_stricto was negatively correlated with protein intake by the subjects. Clostridium_sensu_stricto might therefore be another potentially important bacterium associated with HTC.
The gut ecosystem is linked to many factors, including general characteristics (such as age, sex, and BMI), health status, dietary habits, ethnicity, geography, altitude and civilization. Tibetans are a unique ethnic group with a tough living environment and distinct dietary habits. Studies have demonstrated that the GM is involved in the progression of CAD, but little is known about the specific features of the GM in CAD patients of different ethnic groups from different geographical locations. In this study, we demonstrated that the GM of Tibetan Highlanders suffering from CAD showed higher α-diversity and a distinct cluster compared with healthy Tibetan Highlanders and Han CAD patients living at high and low altitudes. The beneficial genera Catenibacterium and Clostridium_sensu_stricto were enriched in Tibetan Highlanders suffering from CAD compared with healthy Tibetan Highlanders and Han CAD patients living at high or low altitude. Moreover, Prevotella and Catenibacterium were positively correlated with each other and were core genera in the genus co-network. Additionally, Catenibacterium, Holdemanella, and Prevotella were positively correlated with fermented dairy product, carbohydrate and fiber intake by the subjects, while Clostridium_sensu_stricto was negatively correlated with protein intake by the subjects. In conclusion, our study indicated that Tibetan Highlanders suffering from CAD showed a distinguishing GM, which was linked to their unique dietary characteristics and might be associated with CAD. However, our study still has some limitations. First, the sample size of the HTC group is small. There are two reasons for this limitation. On the one hand, the CAD inclusion criteria were very strict in this study. CAD is a disease affected by multiple factors. Hypertension, diabetes, and obesity are major risk factors for CAD. However, many studies have reported that GM dysbiosis is correlated with the development of hypertension, diabetes and/or obesity 38-40. 
Moreover, there are several types of CAD, and the status of the GM might also differ among these subtypes. One study has confirmed that there were no significant differences in the diversity of the GM between healthy control subjects and patients with stable angina 10. In the acute phase, patients with myocardial infarction often passively change their diet, daily life, and defecation habits, and these factors are closely linked to GM diversity. To avoid the confounding caused by these comorbidities and different types of CAD, all CAD patients recruited in this study had unstable angina (UA), and UA patients with comorbidities, including hypertension, diabetes, obesity, heart failure, renal failure, stroke, peripheral artery diseases, or any other acute or chronic inflammatory diseases, were excluded. On the other hand, Tibetan Highlanders have a lower incidence of CAD than plain-living individuals, so it takes a long time for a single center to recruit a large number of HTC patients. Second, medication is a major confounding factor in this study. Ideally, the comparison should be between CAD patients and healthy controls before medication. But from an ethical point of view, we must give medication to patients diagnosed with CAD. In fact, CAD patients were medicated in several similar studies. In the study by Jie et al. 41, patients with CAD used several medicines, including acarbose and atorvastatin. They showed that CAD status, rather than the drugs used, caused the major distinguishing features of the GM in CAD patients. In the study conducted by Emoto et al., medication was also not matched between the CAD and control groups 42. In this study, the recruited CAD patients were administered aspirin, statins, angiotensin-converting enzyme inhibitor/angiotensin II receptor blocker and β blocker, which are all essential drugs for secondary prevention of CAD. 
Therefore, we assume that these medications might weaken the disease signal, which means that an even more significant difference would be expected if the study were free of medication. Because of these limitations, multicenter, larger-scale, drug-free studies are needed to verify our results. The HTC-specific genera Catenibacterium and Clostridium_sensu_stricto are both SCFA-producing bacteria. Next, we will study whether Catenibacterium and Clostridium_sensu_stricto are involved in Tibetan Highland CAD development by producing SCFAs and mediating the inflammatory response in a mouse model of atherosclerosis that simulates the Tibetan living environment and dietary habits.
Bioinformatics analysis. The original image data files obtained by Illumina MiSeq™ were converted into raw reads by base calling. First, the primer adapter sequences were removed; then the paired-end reads were merged into single sequences; next, the sequences were identified and distinguished by their barcodes; and finally quality control filtering was performed to obtain valid data for each sample. All high-quality sequences were clustered into OTUs at 97% sequence similarity, and the OTU table was generated using USEARCH. The Shannon, Ace, and Chao indexes and rarefaction curves were calculated by mothur software (v.1.43.0) 44 and plotted in RStudio. The weighted UniFrac distance matrices were calculated using mothur and visualised by PCoA in R (v.3.6.0), and the statistical significance of differences in bacterial distribution was evaluated by the adonis test in R. The barplot was drawn using the circlize package in R. LEfSe analysis was performed using the LEfSe software (v.1.1.0) 45. The genera enrichment of each group was compared at the read count level, and the box plot was drawn using SPSS (version 26). The correlation network and corrplot diagram were used to show the link between every two genera. 
The correlation coefficients were calculated with SparCC, and the diagrams were drawn using the igraph package (v.1.0.1) and the corrplot package in RStudio, respectively. Only bacteria with abundance above 1% were analyzed. The correlation matrix heat map was drawn in Excel (v.2013), and the correlation coefficients were calculated by SPSS.
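To make the α-diversity indexes concrete, the following minimal Python sketch computes the Shannon index and a Chao1 richness estimate from a single sample's OTU read counts. It is an illustration only: the study used mothur, the bias-corrected Chao1 form and the natural logarithm used here are assumptions about the exact variant, and the example counts are made up.

```python
import math


def shannon_index(otu_counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum(
        (c / total) * math.log(c / total) for c in otu_counts if c > 0
    )


def chao1_index(otu_counts):
    """Bias-corrected Chao1: S_obs + n1 * (n1 - 1) / (2 * (n2 + 1)),
    where n1 and n2 are the numbers of singleton and doubleton OTUs."""
    observed = sum(1 for c in otu_counts if c > 0)
    singletons = sum(1 for c in otu_counts if c == 1)
    doubletons = sum(1 for c in otu_counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical read counts for one sample (one entry per OTU).
even_sample = [10, 10, 10, 10]
skewed_sample = [5, 3, 1, 1, 2, 0]

print(shannon_index(even_sample))  # equals ln(4) for four equally abundant OTUs
print(chao1_index(skewed_sample))
```

Shannon rises with both the number of OTUs and the evenness of their abundances, whereas Chao1 (like Ace) estimates richness by extrapolating from rare OTUs, which is why the two kinds of index can disagree between groups.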

Methods
Statistical analysis. Data were analyzed by SPSS (v.26) and R software (v.3.6.0). The Shapiro-Wilk test was used to determine whether the measurement data were normally distributed. For normally distributed measurement data, one-way ANOVA and the two-tailed Student's t-test were used to test differences among more than two groups and between two groups, respectively. For non-normally distributed measurement data and categorical variables, the Kruskal-Wallis test was used to test the significance of differences among different groups. For pairwise comparison of genus read counts, the Wilcoxon rank-sum test was used. To discover the link between two parameters, Spearman's correlation analysis was used, and Spearman's correlation coefficient was calculated. Levels of the genera Prevotella, Catenibacterium, and Escherichia_Shigella were adjusted for traditional CAD risk factors (including age, sex, SBP, DBP, fasting blood glucose, BMI, smoking status, and CRP), ethnicity, and altitude by quadratic logistic regression analysis. For all tests, a value of p < 0.05 was considered statistically significant. For multiple comparisons, p-values were adjusted by FDR, and an adjusted p < 0.05 or < 0.08 was considered statistically significant. For LEfSe analysis, p < 0.05 and a logarithmic LDA score > 2 were considered statistically significant.
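The FDR adjustment mentioned above can be illustrated with a short Python sketch. It assumes the Benjamini-Hochberg procedure, which is the most common FDR method but is not named in the text; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is multiplied by m / rank (rank in ascending order), and
    a running minimum from the largest rank downward keeps the adjusted
    values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four pairwise comparisons.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)

# Flag comparisons at the two thresholds used in the text.
significant_005 = [p < 0.05 for p in adjusted_p]
significant_008 = [p < 0.08 for p in adjusted_p]
print(adjusted_p)
```

Under this procedure an adjusted p-value between 0.05 and 0.08, such as several reported in the Results, passes only the more permissive of the two thresholds.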