## Introduction

The gut microbiota has been widely implicated in many host diseases, such as diabetes1, inflammatory bowel disease2, and irritable bowel syndrome3, and has been shown to provide potential microbial targets for therapeutic development. Probiotics are live microorganisms that can enhance host health by modulating the gut microbiome when administered in adequate amounts (FAO/WHO)4,5. Dietary administration of probiotics6 has been accepted as one of the most important strategies to modulate the gut microbiota for human health.

The ecological effect of probiotic administration on gut microbiota composition has been well documented in previous clinical microbiome studies and animal models7,8,9,10. However, probiotic administration may also exert evolutionary pressures that lead to genetic changes, such as single-nucleotide variants (SNVs), in the indigenous gut microbial community, thereby altering the functional potential of the gut microbiome. Many studies have implied that a small number of genetic mutations, or even a single SNV, in a microbial genome can significantly alter the pathogenic behavior of gut bacteria and affect host health11,12,13. Chen et al.1 found that specific SNVs in the genome of Bacteroides coprocola were correlated with type 2 diabetes (T2D). Zou et al.12 reported that BlcE84-encoding bacteria with a distinctive SNV in the genome disrupted the worm and mouse epithelial barrier and caused immune activation. Bacterial genetic variations at specific locations can even promote the longevity of the host14.

A previous study highlighted that the evolution and transfer of genetic information in host-associated microbiota could enable resilience to biotic and abiotic perturbations15. In Bacteroides fragilis, for example, many cases of parallel evolution were found in genes related to cell-envelope biosynthesis and polysaccharide utilization16. Notably, probiotics such as E. coli Nissle17 can impose persistent selective pressures on host gut bacteria, accumulating mutations related to carbohydrate utilization and acid tolerance within the mouse gut microbiome, and this driving force may provide the potential for genomic variation in gut resident species16. The indigenous gut microbiota, including competitors and collaborators, rapidly evolves to adapt to the ecological invasion of probiotics18. However, these in vivo genetic processes of the gut microbiota in response to probiotic consumption remain poorly characterized across the wide array of human and animal models in use. Understanding the in vivo evolution of the indigenous gut microbiota and of probiotics would help to leverage these gut selective forces for the genetic engineering of probiotics. Hence, a comprehensive analysis of genomic alterations in gut commensal bacteria after probiotic exposure is important for evaluating the safety of probiotics and for investigating their long-term effect on the functional dynamics of host gut commensal bacteria. Adaptive evolution of gut microbes can be confirmed by parallel or convergent evolution, or by an increase in mutation frequency that is inconsistent with neutral drift16. Shotgun metagenomic sequencing provides access to the entire genetic information of the gut microbiome and thus enables a thorough investigation of the adaptive SNVs arising from probiotic ingestion, together with the microbial composition and functional genes involved.

This meta-analysis aims to systematically evaluate the effect of probiotic administration on strain-level variation in the gut microbiome, and the association of adaptive SNVs with host species, probiotic strain, intervention duration, and probiotic dose. Furthermore, we sought to identify and characterize the universal adaptive mutations arising from probiotic ingestion across a wide range of shotgun metagenomic studies.

## Results

### Probiotic intake commonly altered the genetic composition of gut microbial residents

To comprehensively understand the adaptive mutations in the resident gut microbiota due to probiotic intake, we first collected and curated publicly available metagenomic studies related to probiotics according to the following criteria. (1) The study has a longitudinal design with at least a baseline and an end time point of probiotic consumption for each human or animal host. (2) The study does not use probiotics in combination with any other substance, such as medications, prebiotics, minerals, or vitamins. (3) The study’s raw data were published with detailed metadata. (4) The quality of the study’s sequencing data allows analysis of at least the species-level composition of the gut microbiome. (5) The study provided clear information on the probiotic species, strains, or product. (6) The study clearly states the dose and duration of probiotic intake. Finally, 11 high-quality metagenomic studies were included, among which seven were human cohorts (U.S., N = 1; Israel, N = 1; New Zealand, N = 2; China, N = 3), and five were animal cohorts (dog, N = 1; rat, N = 1; mice, N = 3). In total, 224 probiotic-treated individuals and 197 placebo controls (Tables 1 and 2) were included. The duration of probiotic administration ranged from 1 week to 2 years, with a median of 4 weeks. The median dose of probiotic administration was 910 CFU/day, ranging from $10^{8}$ to $10^{10}$. Next, MetaPhlAn2 was employed to identify the microbial species with a relative abundance >0.5% for SNV profiling (Supplementary Data 1). The metagenomic reads were then mapped to the reference genomes of these selected species for SNV identification. We compared the SNVs against each reference genome for each host before and after probiotic treatment. A total of 16,901 SNVs were associated with probiotic administration (Supplementary Data 2).
We first asked how diverse the resident gut microbes that spontaneously mutated after probiotic consumption were, and whether this diversity differed from that of the control group. Interestingly, the number of gut resident species harboring SNVs significantly decreased in hosts after dietary intervention with Probio-Fit, Lactobacillus rhamnosus GG (L. rhamnosus GG), and Bifidobacterium lactis HN019, as well as in mice given Lactobacillus plantarum HNU082 (L. plantarum HNU082) (Wilcoxon rank-sum test, Fig. 1a). However, raw SNV frequencies may not be comparable across studies because of the inevitable sample- and study-level disparity in metagenome sequencing depth. Specifically, raw SNV frequency was positively correlated with sequencing depth in both human and mouse populations (Fig. 1b and Supplementary Fig. 1). To reduce this technical bias across studies, a sequencing-depth-normalized number of SNVs (nSNVs) was used for the following cross-study comparisons.

$$\mathrm{nSNVs}=\frac{\text{number of SNVs}}{\text{sequencing depth per sample}} \tag{1}$$
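
As a minimal sketch of Eq. (1), assuming per-sample SNV counts and sequencing depths are already available, the normalization could be computed as follows. The function name, the sample identifiers, and the depth unit (gigabases) are illustrative assumptions and are not taken from the original pipeline.

```python
def normalized_snv_count(snv_count: int, sequencing_depth: float) -> float:
    """Return nSNVs = number of SNVs / sequencing depth per sample (Eq. 1)."""
    if sequencing_depth <= 0:
        raise ValueError("sequencing depth must be positive")
    return snv_count / sequencing_depth


# Hypothetical samples: (raw SNV count, sequencing depth in gigabases).
samples = {"host_A": (1200, 6.0), "host_B": (800, 4.0)}

# Two samples with different raw counts can share the same normalized value.
nsnvs = {name: normalized_snv_count(n, depth) for name, (n, depth) in samples.items()}
```

Whatever unit is chosen for depth, it has to be the same for every sample and study, otherwise the normalized values are not comparable.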

The consumption of the probiotics L. plantarum HNU082, L. rhamnosus GG, and Bifidobacterium lactis HN019 significantly reduced the total frequency of SNVs (nSNVs) in the gut residents (Wilcoxon rank-sum test, Fig. 1c). Overall, these results suggest that probiotic intake can significantly change the genetic composition of a wide range of indigenous gut microbes, an effect that is often overlooked.

### The nSNVs introduced by probiotics consumption were strain-specific

We next compared the nSNVs before and after probiotic intake in each of the studies. Overall, probiotic intake caused more SNVs in the gut microbiota than were seen in the control group without any probiotic consumption (Fig. 2a). Furthermore, SNVs in the native gut microbiome of mice outnumbered those in humans. Next, alpha diversities (Shannon and Simpson indices) and beta diversity were calculated for each sample based on the profile of species-level SNVs. PERMANOVA was used to measure the effect size of probiotic intake on the species-level SNV profiles (p < 0.001) (Fig. 2b−d and Supplementary Data 3). Our results suggest that the overall pattern of SNVs induced by probiotics was highly specific to the probiotic strain. Furthermore, we observed no significant correlation between nSNVs and experimental factors such as probiotic dose and intervention duration (Fig. 2e).
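
The two alpha-diversity indices mentioned above can be sketched as follows for a single sample, taking the per-species SNV counts as input. The source does not state which Simpson variant was used, so the Gini-Simpson form, 1 − Σp², is assumed here; the function names and the example counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p * ln p) over species with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson_index(counts):
    """Gini-Simpson index 1 - sum(p^2); assumed variant, see lead-in."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical sample: SNV counts for four gut species.
snv_counts = [10, 10, 10, 10]
shannon = shannon_index(snv_counts)   # equals ln(4) for four equally frequent species
simpson = simpson_index(snv_counts)   # equals 1 - 4 * 0.25**2
```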

To mitigate the potential effect of confounding factors, such as individuality in the gut microbiome, we focused on six probiotic studies, covering Bifidobacterium longum AH1206 (B. longum AH1206)7, Supherb Bio-25 19, L. rhamnosus GG20, Probio-Fit21, L. plantarum HNU082 22 and Lactobacillus casei Zhang (L. casei Zhang), in which host participants had paired/repeated microbiome measurements before and after the probiotic intervention. The correlation pattern between nSNVs and the beta diversity of the gut microbiota was highly specific to the probiotic strain consumed (Fig. 2f). We found that nSNVs caused by B. longum AH1206 and L. plantarum HNU082 consumption were positively correlated with the Bray−Curtis distance of the gut microbiota between baseline and post-probiotic intervention (B. longum AH1206, R = 0.43; L. plantarum HNU082, R = 0.607), whereas those caused by L. rhamnosus GG and L. casei Zhang were negatively correlated (L. rhamnosus GG, R = −0.346; L. casei Zhang, R = −0.367). No correlation was found between the nSNVs caused by mixed probiotics and the Bray−Curtis distance of the gut microbiota. These results suggest that the overall pattern of nSNVs induced by probiotics was highly probiotic-strain-specific.
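
To make the comparison above concrete, the sketch below computes, for each host, the Bray−Curtis distance between the baseline and post-intervention species profiles, and then correlates those distances with nSNVs. The source reports R values without naming the coefficient, so Pearson's r is an assumption here, as are all function names and example values.

```python
import math


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


def pearson_r(x, y):
    """Pearson correlation coefficient (assumed; the source only reports 'R')."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical hosts: (baseline profile, post-intervention profile, nSNVs).
hosts = [
    ([0.6, 0.4, 0.0], [0.5, 0.5, 0.0], 120.0),
    ([0.7, 0.2, 0.1], [0.3, 0.4, 0.3], 310.0),
    ([0.5, 0.5, 0.0], [0.5, 0.4, 0.1], 150.0),
]
distances = [bray_curtis(before, after) for before, after, _ in hosts]
r = pearson_r(distances, [n for _, _, n in hosts])
```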

### Universal adaptive mutations in indigenous gut microbes in response to probiotic intervention

We identified three bacterial gut residents that accumulated convergent genetic changes in response to probiotic consumption in six human metagenomic studies: Faecalibacterium prausnitzii (F. prausnitzii), Eubacterium rectale, and Roseburia intestinalis (Supplementary Data 2 and Fig. 3a−c). Interestingly, the probiotic interventions did not significantly alter the relative abundance of any of these three species (Wilcoxon rank-sum test, p > 0.05), except in the LGG cohort (Wilcoxon rank-sum test, p < 0.05, F. prausnitzii and Eubacterium rectale). While ecological alterations in the gut microbiome were limited, the probiotic intervention led to widespread shifts in the genetic composition (detectable SNVs) of these individual gut residents (Fig. 4a, F. prausnitzii, p = 0.026). These results suggest that an evolutionary response might precede ecological changes in microbial communities under selection pressure.

To investigate whether different probiotic interventions can lead to similar genomic variations, we explored candidate adaptive SNVs, defined as those found in at least three of the six probiotic-intervention studies. Remarkably, F. prausnitzii ATCC 27768 had the most shared SNVs (N = 19) across independent studies (Supplementary Data 4), while Eubacterium rectale and Roseburia intestinalis each had two shared SNVs. We next checked whether these candidate adaptive SNVs could also occur in the control group (null model, Supplementary Data 5). The four SNVs from Eubacterium rectale and Roseburia intestinalis were also identified in the Israeli control cohort (null model). Two SNVs from F. prausnitzii in the probiotics group were detected in the control group as well. Therefore, we pinpointed a total of 17 adaptive SNVs in F. prausnitzii that were specifically associated with probiotic intake and could be validated across distinct host cohorts (Fig. 4a).
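
The selection logic described in this paragraph can be sketched as a counting step followed by a set difference: keep SNVs seen in at least three studies, then drop any that also appear in the control (null model) group. The SNV representation as a `(species, position, ref, alt)` tuple, the study labels, and the function name are hypothetical choices for illustration.

```python
from collections import Counter


def universal_adaptive_snvs(snvs_by_study, control_snvs, min_studies=3):
    """SNVs found in >= min_studies studies and absent from the control set."""
    # Count each SNV at most once per study.
    counts = Counter(snv for snvs in snvs_by_study.values() for snv in set(snvs))
    shared = {snv for snv, n in counts.items() if n >= min_studies}
    return shared - set(control_snvs)


# Hypothetical SNVs: (species, genome position, reference base, alternate base).
snv_a = ("F. prausnitzii", 1001, "A", "G")
snv_b = ("F. prausnitzii", 2050, "G", "T")
snv_c = ("E. rectale", 777, "A", "C")

snvs_by_study = {
    "study_1": [snv_a, snv_b, snv_c],
    "study_2": [snv_a, snv_c],
    "study_3": [snv_a, snv_b, snv_c],
    "study_4": [snv_b],
    "study_5": [],
    "study_6": [snv_a],
}
control_snvs = [snv_c]  # also observed without probiotics, so not adaptive

adaptive = universal_adaptive_snvs(snvs_by_study, control_snvs)
```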

### Functional annotation of SNV-related genes of F. prausnitzii induced by probiotic intervention

Among those 17 adaptive SNVs due to probiotic consumption, 13 (76.5%) occurred in the coding regions of functional genes. Seven were non-synonymous mutations, while six were synonymous mutations. These mutations involved nine functional proteins: 30S ribosomal protein S5, phosphohydrolase, sensor histidine kinase KdpD, ferritin, fprA family A-type flavoprotein, nitroreductase family protein, ribonucleotide-diphosphate reductase subunit beta, peptidase S24, and Type II toxin-antitoxin system PemK/MazF family toxin (Fig. 4a and Table 3). The 17 SNVs comprised four types of mutation: A > G (n = 6), A > C (n = 6), G > A (n = 3), and G > T (n = 2) (Fig. 4b and Table 3). Given that six protein-coding genes contained non-synonymous mutations, Phyre2 was employed to predict the protein structures before and after probiotic intake, and EZMOL was used to visualize how these non-synonymous mutations changed the protein structure. The predicted structures of the nitroreductase family protein and the fprA family A-type flavoprotein were substantially modified (Fig. 4c), suggesting significant changes in the functional potential of the gut microbiome after probiotic exposure. The structures and amino acid sequences of the other proteins are provided in Supplementary Fig. 2.

To investigate how different functional genes responded to the gut selective pressure due to probiotic intake, the ratio of non-synonymous to synonymous substitutions (dN/dS) was calculated. A dN/dS ratio <0.25 indicates purifying selection acting on a gene, while a ratio >1 suggests that a gene is under positive selection for adapting to a new or changing habitat23,24. In our study, the dN/dS ratios in different probiotic interventions ranged from 0.15 to 2.0 or from 0.25 to 1 (Fig. 4d). This suggests that different functional genes of a gut microbial strain can follow diverse evolutionary trends. Moreover, the same gene may show parallel evolutionary trends under different probiotic interventions. Specifically, the dN/dS ratio of the nitroreductase family protein was >1 in the B. longum AH1206, L. plantarum HNU082, and L. casei Zhang groups. Phosphohydrolase was positively selected during probiotic treatment with both L. plantarum HNU082 and mixed probiotics (Probio-Fit). Also, peptidase S24 and the type II toxin-antitoxin system PemK/MazF family toxin showed the same dN/dS pattern for mixed probiotics (Probio-Fit) and a single-strain probiotic (L. plantarum HNU082). Nonetheless, different probiotic products may still have distinct evolutionary effects on a given functional gene of gut residents. Under intervention with the probiotic strain L. plantarum HNU082, the dN/dS ratios of ferritin and the fprA family A-type flavoprotein were >1, whereas the opposite was observed under the mixed-strain intervention. Notably, only one gene, sensor histidine kinase KdpD, was under purifying selection (dN/dS < 0.25). This suggests that most genes in F. prausnitzii tend to evolve neutrally in the new gut environment shaped by probiotic ingestion. The above results illustrate the distinct evolutionary changes in the intestinal microbiota under the environmental pressure of different probiotic interventions.
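
The thresholds quoted above (dN/dS < 0.25 for purifying selection, > 1 for positive selection) can be applied with a small helper such as the one below. It assumes that per-gene dN and dS values have already been estimated elsewhere; the function name, the label "intermediate" for ratios between the two cutoffs, and the example values are illustrative assumptions.

```python
def classify_dn_ds(dn, ds, purifying_cutoff=0.25, positive_cutoff=1.0):
    """Return (dN/dS ratio, label) using the cutoffs quoted in the text."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    ratio = dn / ds
    if ratio < purifying_cutoff:
        label = "purifying"
    elif ratio > positive_cutoff:
        label = "positive"
    else:
        label = "intermediate"  # hypothetical label for 0.25 <= ratio <= 1
    return ratio, label


# Hypothetical per-gene (dN, dS) estimates under one intervention.
genes = {"gene_1": (0.02, 0.10), "gene_2": (0.30, 0.15), "gene_3": (0.05, 0.10)}
labels = {gene: classify_dn_ds(dn, ds)[1] for gene, (dn, ds) in genes.items()}
```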

### The heritability of adaptive SNVs induced by probiotic intervention

To investigate whether, and for how long, such adaptive mutations accumulated in key gut residents, such as F. prausnitzii, can be inherited, an independent longitudinal microbiome study of probiotic intervention was conducted using L. plantarum HNU082 as a model strain (Fig. 5a). All six human participants in this validation study successfully completed two experimental phases: (I) continuous probiotic intervention for 7 days; (II) a long-term follow-up microbiome study (6 months after phase I). They volunteered to provide stool samples throughout all experimental phases as requested. First, we identified 610 SNVs of F. prausnitzii at the end point of phase I, while a total of 1828 SNVs were identified at phase II. Among the 610 SNVs identified in phase I, 317 (51.97%) were transient mutations that were not detectable at phase II, while 293 (48.03%) were retained on the F. prausnitzii genome at phase II (Fig. 5b). These results suggest that probiotic intervention led to long-lasting yet often overlooked genetic changes in the gut residents. Across the 293 heritable SNVs and 317 transient SNVs, 129 functional genes were identified. Of these 129 functional genes, 39 contained only heritable SNVs, 43 contained only transient SNVs, and 47 contained both (Fig. 5c and Supplementary Data 6).
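
The split into heritable and transient SNVs described above amounts to two set operations on the phase I and phase II call sets. The sketch below uses integer genome positions as stand-ins for SNVs; the function name and the toy data are hypothetical.

```python
def split_by_persistence(phase1_snvs, phase2_snvs):
    """Partition phase I SNVs into those retained at phase II and those lost."""
    phase1, phase2 = set(phase1_snvs), set(phase2_snvs)
    heritable = phase1 & phase2   # detected at both phases
    transient = phase1 - phase2   # detected at phase I only
    return heritable, transient


# Hypothetical SNV positions called at each phase.
phase1 = {101, 205, 333, 480}
phase2 = {205, 480, 900, 1200, 1500}

heritable, transient = split_by_persistence(phase1, phase2)
heritable_pct = 100 * len(heritable) / len(phase1)
```

SNVs that appear only at phase II fall into neither group, since the partition is defined relative to the phase I calls.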

We next characterized the functional genes whose SNVs induced by the probiotic intervention were either entirely inherited or entirely transient from phase I to phase II. We first identified sixteen entirely SNV-inherited proteins (Fig. 5d), each of which contained at least two SNVs that were consistent at both phases I and II. We then functionally annotated 20 protein products that had at least two transient SNVs at phase I that were no longer detectable at phase II (Fig. 5e and Supplementary Data 6). For example, one of the entirely SNV-inherited proteins, the FprA family A-type flavoprotein, carries ten SNVs induced by probiotic HNU082 that can be inherited over an extraordinarily long period. Intriguingly, most proteins with transient SNVs are involved in carbohydrate transport and metabolism, such as carbohydrate ABC transporter permease, carbohydrate ABC transporter substrate-binding protein, and carbohydrate-binding protein. These results suggest that residents of gut microbial communities tend to adaptively evolve carbohydrate-related proteins in response to short-term probiotic invasion.

## Discussion

It has been widely recognized that probiotics can modulate the composition and function of the gut microbiota25. SNVs and structural variants in the gut microbiota have also long been noted26. However, evolutionary changes in gut microbes due to probiotic intervention remain poorly characterized. Increasing attention has been paid to the high strain-specificity of probiotics19,27 and to host individuality in probiotic efficacy28, which motivated us to prioritize this meta-analysis. Hence, from the perspective of adaptive SNVs, our study assessed the effect of probiotic intake on the genomic stability of indigenous gut microbes, and we characterized the specific and common evolutionary changes of gut microbes under the selection pressure of a variety of probiotics.

The indigenous gut microbiome experienced increased intestinal selection pressure upon the invasion of probiotics. Notably, probiotics caused more adaptive mutations in the gut microbiota than were seen in the control group, and more mutations were observed in mice than in humans. This suggests strong antagonistic relationships between probiotics and indigenous gut microbes, which were more intense in mice. This is consistent with the results of previous studies22. However, correlation analysis revealed that the number and magnitude of local adaptive SNVs depended strongly on the host environment and on which probiotic strain(s) had been supplemented. Accordingly, we believe that more studies with specific probiotic strains and larger, more varied populations are needed to further explore the complicated relationships between probiotics and the indigenous gut microbiota at the single-nucleotide level.

Collectively, we found that probiotics increased the instability of gut microbial genomes and that genomic responses to probiotic intake were highly divergent between humans and mice. With respect to functional modules, the presence and absence of SNVs in carbohydrate-related proteins suggest intensive competition between probiotics and gut microbes for carbon sources. This meta-analysis substantially extends our understanding of the adaptive evolution of the gut microbiota under the selection pressure of probiotics.

## Methods

### Sequence data collection and curation

A total of 1499 literature records were identified through extensive database searching in PubMed and ISI Web of Science, and two records were kindly provided by peers. Next, 433 studies were retained after the removal of duplicates. These records were screened using keywords, titles, and abstracts, and 415 citations were excluded. This left 18 studies for which the full article was accessible and in which shotgun metagenomic sequencing was successfully performed on stool samples collected from hosts that consumed probiotics. Among these 18 studies, seven were further filtered out because the corresponding sequencing data were not publicly accessible or because their quality or sample size did not meet the minimum standard for re-analysis. Following the data curation process above (Supplementary Fig. 3), we finally pinpointed 11 probiotic studies, with a total of 421 fecal samples, 224 probiotic-treated individuals, and 197 placebo controls, that were included in our meta-analysis7,8,10,18,19,20,22,36,37,38 (Tables 1 and 2). The study was approved by the Ethical Committee, and human participants provided informed consent before they enrolled in the study. Host models included mice, dogs, and rats, as well as human cohorts spanning four countries (the U.S., Israel, New Zealand, and China), with the administration of a single (Lactobacillus or Bifidobacterium) or mixed probiotic strains. All studies specifically aimed to understand gut microbiome changes due to probiotic interventions, and none involved combined treatments with prebiotics, drugs, etc. In particular, data for L. rhamnosus GG and L. plantarum HNU082 were collected from both animal and human cohorts.

For an unpublished cohort (probiotic L. casei Zhang), the sequence data have been deposited in the NCBI database (metagenomic sequencing data: PRJNA762428). In this study, we recruited volunteers (ten females and ten males, BMI 18.98−21.54) with or without a history of allergy. Allergy was defined as having suffered a severe allergic reaction to one or more foods and still being allergic to them. The study was approved by the Ethical Committee of Hainan University, and informed consent was obtained from all volunteers before they enrolled in the study. They were asked to take probiotic tablets ($10^{10}$ CFU/day) for 28 days, and we collected their feces at baseline and at 28 days for metagenomic sequencing. Whole-genome shotgun sequencing of the samples was carried out on an Illumina HiSeq 2500 instrument. Libraries were generated with a fragment length of approximately 300 bp. Paired-end reads of 150 bp were generated in the forward and reverse directions.

### Quality control of the raw data and the removal of host DNA

Raw SRA files were converted into paired or single FASTQ files using sratoolkit 2.10.7 (https://github.com/ncbi/sra-tools). The raw reads were trimmed using Sickle (https://github.com/najoshi/sickle) and subsequently aligned to the host genome (human: GRCh38; mouse: GRCm38.p6; dog: GCA_000002285.2; rat: GCA_000001895.4) using Bowtie2 39 with default settings to remove host DNA fragments.

### Identification of microbial taxonomy and SNV annotation

First, MetaPhlAn2 was employed to identify microbes and estimate their abundances in each stool sample from the shotgun metagenomic sequencing reads40. The overall metagenomic sequencing depth and the sequencing coverage of each microbial strain can directly affect the identification of intestinal microbial SNVs. Therefore, based on the species-level profiles from MetaPhlAn2, we pre-selected microbial species whose average relative abundance was greater than 0.5% for SNV annotation. The reference or representative strains from NCBI for all selected species, and their GenBank accessions, are listed in Supplementary Data 1. Next, MIDAS41 (Metagenomic Intra-Species Diversity Analysis System) was employed to profile the species-level SNV frequency and gene content of the gut microbiota. Briefly, a genome database of the high-abundance reference bacteria was constructed. Then, the shotgun metagenomic sequencing reads were mapped to the database using Bowtie2 39 for SNV calling, with a minimum read depth of 100. Candidate SNVs were identified and filtered with a minimum quality of 60 using SAMtools42 and Bcftools (https://github.com/samtools/bcftools). For more details, refer to the code in the GitHub repository: https://github.com/HNUmcc/Probiotics-SNV-meta.
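
The species pre-selection step can be sketched as a filter on mean relative abundance across samples. The sketch assumes that species-level profiles have already been parsed into dictionaries of percentages and that a species missing from a sample counts as 0%; the function name, the data layout, and the example profiles are hypothetical and do not reproduce the MetaPhlAn2 output format.

```python
def select_species(profiles, min_mean_abundance=0.5):
    """Return species whose mean relative abundance (%) exceeds the cutoff.

    profiles maps sample ID -> {species: relative abundance in percent}.
    Species absent from a sample are treated as 0% (an assumption).
    """
    if not profiles:
        return []
    totals = {}
    for sample_profile in profiles.values():
        for species, abundance in sample_profile.items():
            totals[species] = totals.get(species, 0.0) + abundance
    n_samples = len(profiles)
    return sorted(sp for sp, total in totals.items()
                  if total / n_samples > min_mean_abundance)


# Hypothetical species-level profiles for three stool samples.
profiles = {
    "sample_1": {"species_A": 1.0, "species_B": 0.2},
    "sample_2": {"species_A": 0.4, "species_B": 0.6},
    "sample_3": {"species_A": 0.7},
}
selected = select_species(profiles)
```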

### Limited influence of different reference genomes on SNVs annotation

F. prausnitzii is one of the most common human gut microbes. To investigate the impact of the choice of reference genome on our results, SNVs were annotated using multiple F. prausnitzii genomes with the methods described above. F. prausnitzii ATCC 27768 (NZ_CP030777, Assembly ID: GCF_003312465.1) was selected in our study as it is the top-ranked representative reference genome recommended by NCBI. Three additional NCBI-recommended reference genomes were then included for this species: F. prausnitzii A2165 (Assembly ID: GCF_002734145.1), F. prausnitzii JCM31915 (Assembly ID: GCF_010509575.1), and F. prausnitzii Indica (Assembly ID: GCF_002586945.1). First, we identified the presence of all four strains in the gut of the BH1206 cohort and examined the accumulated coverage (Supplementary Fig. 4). The coverage (%) of each reference genome in each sample was calculated, and the relationship between the cumulative coverage and the number of metagenome samples included in a study was visualized. Both ×1 (blue) and ×100 (orange) minimum sequencing depths were considered for the genome coverage calculation. We found that the coverage of these genomes increased rapidly as more samples were included, and the accumulated coverage was almost saturated after fewer than ten metagenome samples were included. These results suggest that all of the included reference genomes can be detected and extensively covered by stool metagenome reads from most samples (Supplementary Data 4a).

Next, the genome-wide distance between these four genomes was compared using average nucleotide identity (ANI) values (http://enve-omics.ce.gatech.edu/ani/index) (Supplementary Table 1). Typically, microorganisms that belong to the same species share over 95% ANI. However, the ANI values between NZ_CP030777 and the newly selected genomes were far below this conventional species boundary (Supplementary Data 4b). We first tested whether, and to what percentage, these four genomes can be covered by shotgun metagenomic reads from stool samples in a human cohort (e.g., BH1206).

Again, MIDAS was employed to profile the species-level SNV frequency and gene content. We next compared the SNV annotation results obtained with different reference genomes in the six cohorts: BH1206, Bio-25, LGG, Probio-Fit, HNU082, and Zhang (Supplementary Figs. 5, 6). First, with our SNV-calling pipeline, no significant difference in nSNVs was found between the different F. prausnitzii reference genomes at the T0 and T1 time points in the vast majority of studies, and only slight differences were found in the BH1206 and Probio-Fit cohorts (Supplementary Data 4c). Second, the gene functions affected by SNV changes between the T0 and T1 time points (that is, due to the probiotic intervention) were largely similar (Supplementary Data 4d). This indicates that selecting ATCC 27768 as the reference genome is reasonable and sufficient.

### Definition of adaptive SNV induced by probiotic intervention

In this study, only SNV profiles were investigated in the native gut microbiome; insertions and deletions were not our focus. Variants with a base quality (as reported by Bcftools) below 60 were excluded. We focused on paired data (baseline and end point of probiotic consumption) in each population cohort. Unmatched animal studies and human cohorts (N = 6) were not included in the meta-analysis (starting at Fig. 2b). As illustrated in Supplementary Data 5, adaptive mutations that occurred after probiotic consumption do not necessarily relate to the nucleotides on the reference genome. We mapped the metagenome reads from the same host at different time points to the same reference genomes and identified single-nucleotide changes (adaptive SNVs) between before and after probiotic consumption. Candidate adaptive SNVs due to probiotic consumption were required to meet the following criteria. (1) For a given microbial species (genome), a single-nucleotide difference is identified between the baseline and end point of a host, regardless of any nucleotide difference between the reference genome and either of them (Supplementary Fig. 7). (2) Such a genetic change is observed in at least 30/50% of hosts in a study (Supplementary Fig. 8). (3) Such a genetic change does not appear within a period that is unrelated to any probiotic treatment of a host. Ideally, we could further exclude SNVs that are not adaptive, namely those that meet requirement (1) but also appear at time points before the host consumed the probiotic. However, most studies did not sample time points before probiotic treatment, except for the Israeli cohort (Bio-25). Therefore, for this study, we specifically removed such SNVs, as they are less likely to be related to probiotic consumption. These excluded SNVs are mainly located in Megamonas rupellensis, Roseburia inulinivorans, Roseburia intestinalis, Eubacterium rectale, etc.
After correction with the null model, a reasonable set of adaptive SNVs was obtained (Supplementary Data 2). We then finalized the set of universal adaptive SNVs, which were detected in at least three (50%) of the six studies under the technical requirements described above.
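
The three requirements above can be sketched for a single study as follows: compare the baseline and end-point alleles within each host, keep changes that recur in a sufficient fraction of hosts, and remove changes that also occur without probiotic treatment. The allele dictionaries, the `(site, end-point allele)` representation, the function names, and the default threshold of 0.5 are illustrative assumptions; the text gives the prevalence requirement as "30/50%".

```python
from collections import Counter


def changed_sites(baseline, endpoint):
    """Sites covered at both time points whose allele differs (requirement 1).

    baseline and endpoint map genome position -> allele in that host.
    The reference allele is deliberately not consulted.
    """
    shared = baseline.keys() & endpoint.keys()
    return {(site, endpoint[site]) for site in shared
            if baseline[site] != endpoint[site]}


def candidate_adaptive_snvs(hosts, min_host_fraction=0.5, null_model=frozenset()):
    """Changes seen in enough hosts (2) and absent from the null model (3)."""
    if not hosts:
        return set()
    counts = Counter()
    for baseline, endpoint in hosts:
        counts.update(changed_sites(baseline, endpoint))
    needed = min_host_fraction * len(hosts)
    recurrent = {snv for snv, n in counts.items() if n >= needed}
    return recurrent - set(null_model)


# Hypothetical (baseline, end point) allele calls for three hosts.
hosts = [
    ({101: "A", 205: "G"}, {101: "G", 205: "G"}),
    ({101: "A", 205: "G"}, {101: "G", 205: "T"}),
    ({101: "A"}, {101: "A", 205: "T"}),
]
candidates = candidate_adaptive_snvs(hosts)
```

Sites covered at only one time point are skipped rather than counted as changes, which is one conservative way to avoid treating coverage gaps as mutations.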

### Statistics and reproducibility

All statistical analyses were performed using R software. Differences in the abundance of various profiles were tested with the Wilcoxon rank-sum test, and differences were considered significant at a nominal level of p < 0.05. Alpha diversity analysis was performed with in-house R code. Beta-diversity analysis was conducted using the “vegan” and “plyr” packages, and PCoA based on the Bray−Curtis dissimilarity matrix was used to visualize sample clustering based on gut microbial composition. The package “ggplot” was used to generate boxplots, barplots, violin plots, and fitted curves. The heatmap was constructed using the “pheatmap” package. The packages “circlize”, “ComplexHeatmap”, and “grid” were used for the SNV genome circle map. Protein structures were predicted and displayed using Phyre2 43 (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) and EZMOL44 (http://www.sbg.bio.ic.ac.uk/ezmol/). The Venn diagram was rendered using InteractiVenn45 (http://www.interactivenn.net/).
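
For readers who want to see the mechanics of the Wilcoxon rank-sum test, the following is a minimal sketch using the large-sample normal approximation, with average ranks for ties and no continuity or tie correction. It is not the R routine used in the study, which may compute exact p-values for small samples; the function names and example values are hypothetical.

```python
import math


def rank_with_ties(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Return (z, two-sided p) from the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    w = sum(ranks[:n1])                          # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))         # 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical nSNVs in probiotic-treated and control hosts.
treated = [1.0, 2.0, 3.0]
control = [4.0, 5.0, 6.0]
z, p = wilcoxon_rank_sum(treated, control)
significant = p < 0.05
```

With groups this small the normal approximation is rough, so an exact test would be preferable in practice.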