Meta-analysis of gut microbiome studies identifies disease-specific and shared responses

Hundreds of clinical studies have demonstrated associations between the human microbiome and disease, yet fundamental questions remain on how we can generalize this knowledge. Results from individual studies can be inconsistent, and comparing published data is further complicated by a lack of standard processing and analysis methods. Here we introduce the MicrobiomeHD database, which includes 28 published case–control gut microbiome studies spanning ten diseases. We perform a cross-disease meta-analysis of these studies using standardized methods. We find consistent patterns characterizing disease-associated microbiome changes. Some diseases are associated with over 50 genera, while most show only 10–15 genus-level changes. Some diseases are marked by the presence of potentially pathogenic microbes, whereas others are characterized by a depletion of health-associated bacteria. Furthermore, we show that about half of genera associated with individual studies are bacteria that respond to more than one disease. Thus, many associations found in case–control studies are likely not disease-specific but rather part of a non-specific, shared response to health and disease.

significantly enriched in Roseburia, Phascolarctobacterium, and an unclassified genus in the family Veillonellaceae (multivariate linear model, q <= 0.25). Patients with UC showed significantly higher levels of Clostridiaceae (multivariate linear model, q <= 0.25). In our re-analysis, we did not find any genera that were significantly enriched in IBD patients. We found that healthy patients had significantly greater abundances of Ruminococcus and Gemmiger relative to both UC and CD patients (q <= 0.05, KW tests). Additionally, CD patients were depleted in Clostridium IV relative to healthy controls (q <= 0.05, KW tests).

non-IBD controls. 13 Non-IBD controls were patients with symptoms such as constipation, abdominal pain, gastroesophageal reflux, poor weight gain, diarrhea, blood in stool and oropharyngeal dysphagia. At the genus level, they found that controls were enriched in Alistipes, Subdoligranulum, Anaerovorax, Oscillibacter, Parabacteroides, Odoribacter, Ruminococcus, Butyricicoccus, Akkermansia, Anaerotruncus, Sporobacter, Phascolarctobacterium, Lawsonia, Ethanoligenens, and Peptococcus relative to IBD patients (KW, q < 0.01). The only genus that was found to be enriched in IBD patients was Escherichia-Shigella. In our re-analysis, we also found Escherichia-Shigella and Cronobacter to be enriched in patients with IBD (q <= 0.05, KW tests). When comparing healthy controls with UC patients, we also found an enrichment of Haemophilus in the UC patients. Control patients showed higher abundances of Phascolarctobacterium, Butyricicoccus, Ruminococcus II, Oscillibacter, Ruminococcus, Gemmiger, Subdoligranulum, Clostridium IV, Odoribacter, Alistipes, and Parabacteroides relative to all IBD patients (q <= 0.05, KW tests). Additionally, control patients were enriched in Clostridium XIVa, Flavonifractor, and Akkermansia relative to UC patients. Overall, our results closely match those of the original paper.
14 The authors reported variable, and sometimes opposing, shifts in the microbiomes of patients with UC, ileal CD and colonic CD at different taxonomic resolutions. We found no significant differences between IBD and healthy patients in our re-analysis. When comparing healthy controls with CD cases only, we found an enrichment of Butyricicoccus and Oscillibacter in the control patients (q <= 0.05, KW tests).
In summary, there are certain consistencies across IBD studies. IBD patients tend to be depleted in butyrate-producing clostridia: Ruminococcus and Lachnospiraceae.
The organisms that are enriched in CD and UC patients tend to vary across studies.
One consistency is that organisms associated with the upper gut, like Lactobacillus and Enterobacteriaceae, appear to be enriched in IBD patients. 5 This result fits with the reduced stool transit times associated with IBD (i.e. diarrhea).

overweight, and 185 obese. 15 The authors report higher levels of Lactobacillaceae, Eggerthella, and Lachnospiraceae (Blautia and Dorea) in obese individuals (q < 0.05, FDR-corrected T-test). They showed enrichment for Christensenellaceae, Dehalobacterium, Lachnospira, Mogibacteriaceae, Rikenellaceae, Methanobre, Coriobacteriaceae, Peptococcaceae, Oscillospira, Ruminococcaceae, and Sarcina in healthy BMI individuals (q < 0.05, FDR-corrected T-test). In our re-analysis, we found higher levels of Streptococcus, Weissella, Roseburia, Blautia, Clostridium XIVb, and Mogibacterium in obese individuals, while Robinsoniella, Ruminococcaceae (Oscillibacter, Pseudoflavonifractor, Sporobacter, and Anaerofilum), and Anaerovorax were more abundant in low-BMI individuals (q <= 0.05, KW tests). Our results only partially agree with the authors' original findings, which may be because we used a different statistical test and OTU-calling method and binned the data at the genus level.

varying BMIs. 16 They found a significant positive correlation between the abundance of Collinsella and BMI (i.e. enriched in obese individuals), while Lachnobacterium, Anaerotruncus, Faecalibacterium, and Clostridium were negatively correlated with BMI (i.e. enriched in lean individuals) (p < 0.001, Spearman correlation). We found no significant differences in the proportion of genera between lean and obese individuals in our re-analysis.

Turnbaugh et al. (2008) looked at differences in gut microbial community structure between 31 monozygotic and 23 dizygotic twin pairs concordant for leanness or obesity. 17 The authors report a reduction in alpha diversity in obese individuals.
They also report a significant decrease in Bacteroidetes and an increase in Actinobacteria in obese twins. In our re-analysis of these data, we did not see a significant reduction in alpha diversity (Supplementary Figure 6). We found significant increases in Cateni-

They found no significant differences between patients with high and low BMIs within their 63 patient cohort, but identified several significant differences between their patient population and the HMP data set. However, it is unclear whether these differences were related to obesity, so we do not discuss them here. Our re-analysis of these results also found no significant differences in the relative abundances of bacterial genera between high- and low-BMI subjects.

19 For obesity, the authors found that Prevotella was enriched in high-BMI patients, while healthy controls showed significantly greater relative abundances of Bifidobacterium, Blautia, and Faecalibacterium (p <= 0.05, ANOVA with post-hoc Tukey's tests).
Overall, we found several differences between lean and obese patients that were consistent across at least two studies. Roseburia and Mogibacterium were enriched in obese individuals in more than one study. Pseudoflavonifractor, Oscillibacter, Anaerovorax and Clostridium IV were enriched in the controls across more than one study. However, no genera showed consistent differences across three or more studies.
Our results are largely consistent with a recent meta-analysis of obesity studies, which found no universal signature of human obesity.

And healthy patients were also enriched in Odoribacter, Anaerostipes, and Parasutterella. Many of the significant genera from the Lozupone study were shown to be strongly associated with sexual behavior in the Noguera-Julian study (i.e. these genera were significantly different in men who have sex with men versus other subjects; see below) and may not necessarily be related to HIV status.

The authors also note a reduced alpha diversity in autistic children. After reprocessing these data, we found no significant differences in alpha diversity or genera abundances between autistic and control children (Figure 1; q > 0.05, Kruskal-Wallis). The original conclusion that Prevotella and Veillonellaceae were different was based on q-values of 0.04, which is only moderately convincing evidence against the null hypothesis. Therefore, the loss of this marginal significance (for q <= 0.05) is unsurprising when using a different statistical test.
In a more recent study, Son et al. (2015) found no significant differences in microbial community diversity or composition between autistic and neurotypical children (n = 59 ASD and 44 neurotypical). 24 One genus, representing chloroplast sequences, was associated with ASD children with functional constipation, but this signal appeared to be due to dietary intake of chia seeds. Similar to the authors' findings, we did not detect any significant differences in genera abundances between ASD children and neurotypical children in the reprocessed data (q > 0.05, Kruskal-Wallis).
Taken together, we find no evidence for changes in the composition or diversity of the gut microbiome in response to ASD. However, we cannot discount subtle dysbiosis (i.e. small effect size) in response to ASD due to the small number of patients in each study.
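The genus-level comparisons reported throughout these notes (Kruskal-Wallis tests with FDR-corrected q-values) can be illustrated with a minimal sketch. It assumes relative-abundance tables with samples in rows and genera in columns, and it assumes a Benjamini-Hochberg correction; the text only states that q-values are FDR-corrected, so the specific correction, the function names, and the simulated data are illustrative rather than the authors' pipeline.

```python
import numpy as np
from scipy.stats import kruskal


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg q-values for a 1-D array of p-values."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q


def genus_level_kw(case_abund, control_abund):
    """Kruskal-Wallis test per genus (column), followed by FDR correction.

    Both inputs are (n_samples, n_genera) arrays of relative abundances.
    Note: kruskal can fail on a genus whose values are all identical,
    so such genera should be filtered out beforehand.
    """
    n_genera = case_abund.shape[1]
    pvals = np.array([
        kruskal(case_abund[:, j], control_abund[:, j]).pvalue
        for j in range(n_genera)
    ])
    return pvals, benjamini_hochberg(pvals)


# Simulated (hypothetical) data: 30 cases, 30 controls, 5 genera,
# with only the first genus shifted in cases.
rng = np.random.default_rng(0)
controls = rng.random((30, 5))
cases = rng.random((30, 5))
cases[:, 0] += 2.0
pvals, qvals = genus_level_kw(cases, controls)
significant = np.flatnonzero(qvals <= 0.05)
```

With small cohorts, a genus with a modest true shift can easily fall above the q <= 0.05 cutoff, which is the situation described for the ASD and T1D datasets.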

Type 1 Diabetes (T1D; 2 studies)
Alkanani et al. (2015) compared 23 healthy patients to 35 early-onset T1D patients and 21 seropositive T1D patients. 25 The authors report higher relative abundances of Lactobacillus, Prevotella and Staphylococcus genera in healthy patients (p < 0.05, Wilcoxon). T1D patients showed higher levels of Bacteroides (p < 0.05, Wilcoxon). In our re-analysis, we found no significant differences in bacterial genera across healthy and diseased patients. They also found higher levels of Acidaminococcus and Megamonas genera (in the Veillonellaceae family) in the controls (p < 0.05, ANOVA, Tukey-Kramer test). We saw no significant differences in our re-analysis of these data.
Overall, the original authors report a consistent increase in Bacteroides and depletion in Prevotella genera associated with T1D. However, our re-analysis found that these differences did not pass our significance threshold. Thus, we cannot yet conclude that there is a consistent dysbiosis associated with T1D.

They found that control patients were enriched in Faecalibacterium and Anaerosporobacter genera, while NASH patients showed significantly higher levels of Parabacteroides and Allisonella genera (p < 0.05, t-test). In our re-analysis of these data, we saw no significant differences.
In summary, there were not many consistencies between the two NASH studies analyzed here. The original studies consistently report a depletion in Faecalibacterium in NASH patients. Thus, the overall influence of NASH on the microbiome is difficult to assess without further study.
1.9 Minimal hepatic encephalopathy and liver cirrhosis (LIV; 1 study)
Zhang et al. (2013) looked at the microbiomes of 26 healthy patients, 26 patients with MHE, and 25 patients with CIRR. 28 The original paper reported several genera that differed between diseased and control patients. Odoribacter, Flavonifractor, and Coprobacillus were all enriched in MHE patients relative to controls, while Eubacterium, Lachnospira, Parasutterella, and an unclassified Erysipelotrichaceae genus were enriched in healthy patients (p < 0.01, Mann-Whitney). The authors also reported depletion in Prevotella in non-MHE patients with cirrhosis (CIRR), relative to controls. When we re-processed and re-analyzed these data, the only difference we found was an enrichment in Veillonella in case (MHE and CIRR) patients (q < 0.05, KW test). When comparing controls with MHE patients alone, we also saw an enrichment of Faecalibacterium in healthy controls relative to MHE cases.
1.10 Rheumatoid and psoriatic arthritis (ART; 1 study)
Scher et al. (2013) investigated the impacts of arthritis on a cohort of 86 arthritic and 28 healthy patients. 29 The authors report that greater abundances of Prevotella copri can predict susceptibility to arthritis. There were three types of arthritic conditions studied, but only new-onset untreated rheumatoid arthritis (NORA) showed a strong association with multiple Prevotella OTUs among others (q < 0.01, LEfSe).
The other RA groups were not easily distinguishable from controls. Indeed, when grouping all arthritis patients together for our re-analysis as well as comparing RA and psoriatic arthritis patients separately, we did not find any genera that were significantly different between arthritic patients and controls.
1.11 Parkinson's disease (PAR; 1 study)
Scheperjans et al. (2014) looked for differences in the gut microbiome between 72 neurotypical patients and 72 Parkinson's (PAR) patients. 30 They found a small handful of significant differences at the family level. Control patients showed higher relative abundances of Prevotellaceae, while PAR patients were enriched in Lactobacillaceae, Verrucomicrobiaceae, Bradyrhizobiaceae, and Clostridiales Incertae Sedis (q < 0.05, Mann-Whitney). In our re-analysis, we found significantly higher relative abundances of Lactobacillus (within Lactobacillaceae) and Alistipes (within Rikenellaceae) in PAR patients (q < 0.05, KW tests).
Supplementary Note 2: Stratifying heterogeneous case groups shows consistent disease-specific signals
In our main analyses, we combined Crohn's disease (CD) and ulcerative colitis (UC) patients together as IBD cases. We also performed separate analyses on these individual patient groups. All four IBD studies included CD cases and three included UC cases (all except Gevers et al. (2014) 11). We performed the same analysis as in Figure 1 for these stratified groups, and found that both CD and UC patients are charac-

disease microbiome signal that is being identified even across different diseases. These results also indicate that models for each disease group are predictive of cases and controls for other datasets within that group, since the leave-one-dataset-out classifier, which included datasets of the test disease group in the training set, performed better than the leave-one-disease-out classifier, which did not.

Supplementary Note 4: Shared microbial response is robust to different definitions
Our simple heuristic defined non-specific microbes as those which were significantly enriched or depleted in two diseases. To ensure that this definition was not being dominated by the diarrhea datasets and that we were indeed identifying microbes which respond non-specifically to multiple diseases, we re-defined the non-specific genera as those which were significantly enriched or depleted in two diseases, excluding 22 ). 31

We combined each dataset's FDR-corrected q-values with scipy.stats.combine_pvalues(method='stouffer'), using the square root of each study's sample size as the weights. Genera with a combined q-value less than 0.05 were considered non-specific responders. Overall, these results did not conflict with the heuristic definition (i.e. only two genera, Porphyromonas and Gemmiger, were "health-associated" with one method and "disease-associated" with the other; Supplementary Figure 10). Stouffer's method is less conservative than the heuristic definition, identifying 111 genera in the non-specific response (60 health-associated and 51 disease-associated). In addition, using Stouffer's method does not allow for the identification of mixed genera (i.e. those which respond in both health- and disease-associated directions across multiple diseases). Finally, combining q-values with Stouffer's method does not ensure that identified microbes are responding non-specifically to multiple diseases: one highly significant genus in a large study can dominate other q-values and be flagged as a non-specific responder, despite only being associated with one disease. Thus, the heuristic definition is more conservative and more directly related to the biological question of identifying shared microbial responses to disease.
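The weighted Stouffer combination described in this note can be sketched for a single genus as follows. The call to scipy.stats.combine_pvalues with method='stouffer' and square-root sample-size weights follows the text; the wrapper function, the example q-values, and the sample sizes are hypothetical, and the sketch does not show how the direction of effect (health- versus disease-associated) was handled.

```python
import numpy as np
from scipy.stats import combine_pvalues


def combine_across_studies(qvalues, sample_sizes):
    """Combine one genus's per-study q-values with weighted Stouffer's method.

    Weights are the square root of each study's sample size.
    """
    weights = np.sqrt(np.asarray(sample_sizes, dtype=float))
    _statistic, combined = combine_pvalues(
        qvalues, method="stouffer", weights=weights
    )
    return combined


# Hypothetical genus: very strong signal in one large study,
# no signal in two small studies.
dominated = combine_across_studies([1e-8, 0.9, 0.9], [400, 50, 50])
is_nonspecific = dominated < 0.05
```

In this hypothetical case the combined value falls below 0.05 even though the genus is significant in only one study, which illustrates the caveat above that a single large, highly significant study can dominate the combined result.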
We tested whether the overall number of non-specific responders we observed was greater than we would expect to see due to chance. We built an empirical null distribution of the number of each type of non-specific responder. We shuffled q-values within each dataset, re-defined non-specific responders, and counted how many health-associated, disease-associated, and mixed genera were found, repeating this process 1000 times. When we considered significance in two diseases as the threshold for our heuristic (as presented in the main text), we did not find a significantly larger number of non-specific responses than would be expected by chance (Supplementary Figure 11). When we raised the heuristic threshold to three diseases our results became more significant, but there was a large reduction in the number of identified non-specific genera. Thus, there is currently not enough information to fully distinguish microbes that are sporadically detected across multiple diseases from those that may be consistently associated with general health or disease.
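The shuffling procedure above can be sketched as a permutation test. This is a simplified illustration under several assumptions that are not stated in the text: q-values and effect directions are stored as genus-by-dataset arrays, a genus counts as significant in a disease if it is significant in at least one dataset of that disease, q-values are shuffled together with their directions, and "mixed" means significant in both directions with at least the threshold number of disease-level hits in total. All function and variable names are hypothetical.

```python
import numpy as np


def count_nonspecific(q, direction, diseases, alpha=0.05, min_diseases=2):
    """Return (health-associated, disease-associated, mixed) genus counts.

    q, direction: (n_genera, n_datasets) arrays; direction is +1 when a
    genus is higher in cases and -1 when it is higher in controls.
    diseases: length n_datasets sequence of disease labels.
    """
    q = np.asarray(q, dtype=float)
    direction = np.asarray(direction)
    diseases = np.asarray(diseases)
    sig = q < alpha
    labels = np.unique(diseases)
    # Per disease: significant in at least one dataset, by direction.
    up = np.stack([(sig & (direction > 0))[:, diseases == d].any(axis=1)
                   for d in labels], axis=1)
    down = np.stack([(sig & (direction < 0))[:, diseases == d].any(axis=1)
                     for d in labels], axis=1)
    n_up, n_down = up.sum(axis=1), down.sum(axis=1)
    disease_assoc = (n_up >= min_diseases) & (n_down == 0)
    health_assoc = (n_down >= min_diseases) & (n_up == 0)
    mixed = (n_up >= 1) & (n_down >= 1) & (n_up + n_down >= min_diseases)
    return int(health_assoc.sum()), int(disease_assoc.sum()), int(mixed.sum())


def permutation_null(q, direction, diseases, n_perm=1000, seed=0, **kwargs):
    """Shuffle genera within each dataset and recount responders."""
    rng = np.random.default_rng(seed)
    q = np.asarray(q, dtype=float)
    direction = np.asarray(direction)
    n_genera, n_datasets = q.shape
    null = np.empty((n_perm, 3), dtype=int)
    for i in range(n_perm):
        q_perm = np.empty_like(q)
        d_perm = np.empty_like(direction)
        for j in range(n_datasets):
            perm = rng.permutation(n_genera)
            q_perm[:, j] = q[perm, j]
            d_perm[:, j] = direction[perm, j]
        null[i] = count_nonspecific(q_perm, d_perm, diseases, **kwargs)
    observed = np.array(count_nonspecific(q, direction, diseases, **kwargs))
    # Empirical one-sided p-value with the usual +1 correction.
    pvals = (1 + (null >= observed).sum(axis=0)) / (1 + n_perm)
    return observed, null, pvals
```

Raising `min_diseases` from 2 to 3 corresponds to the stricter heuristic threshold discussed above: fewer genera qualify, but chance co-occurrence across diseases also becomes rarer in the shuffled data.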
Future meta-analyses that include many more datasets for each of many conditions might be able to distinguish microbes that are consistently associated with health or disease from those that are sporadically associated with different conditions. Although the number of non-specific microbes did not reach statistical significance, we identified multiple lines of evidence for a coherent microbial response to health and disease. First, the healthy vs. disease classifiers successfully classified case patients across a variety of diseases even when the disease being tested was not in the training set, indicating that some aspects of disease-associated microbiome shifts can generalize across diseases (Supplementary Figure 9). Second, the statistical significance of the number of non-specific responders increased as we increased the threshold on the number of diseases (Supplementary Figure 11). Thus, future meta-analyses which include many more studies and disease states may be able to more robustly identify bacteria which respond across a broader variety of disease states.
Third, we saw a coherent phylogenetic signal in the non-specific response (e.g. Proteobacteria and Lactobacillaceae associated with disease and Ruminococcaceae and Lachnospiraceae associated with health), which points to potential mechanisms (e.g. shorter stool transit time or inflammation) for a shared response to health or disease (Figure 3A). Thus, we expect that future meta-analyses that include more studies and diseases will identify a consistent set of bacteria that form a general microbial response to health and disease in the gut.

Singh 2015, EDD       fasta   n/a   n/a   n/a   n/a   200
Papa 2012, IBD        fasta   n/a   n/a   n/a   n/a   200
Zhu 2013, NASH        fasta   n/a   n/a   n/a   n/a   200
Turnbaugh 2009, OB    fasta   n/a   n/a   n/a   n/a   200
Zhu 2013, OB          fasta   n/a   n/a   n/a   n/a   200

we assigned reads to samples by their barcodes (Yes) or if the files were already de-multiplexed (No).
Primers column indicates whether we removed the primers from sequences. Quality filtering and Quality cutoff columns indicate the type of quality filtering we performed on the data. Length trim is the length to which all sequences were truncated before clustering into OTUs. In the case of -fastq_truncqual quality filtering, reads were length trimmed after quality truncation. In the case of -fastq_maxee quality filtering, reads were length trimmed before quality filtering.

across all studies. Rows are genera, ordered phylogenetically (as in Figure 3A). Columns are datasets, grouped by disease and ordered according to total sample size (decreasing from left to right). The first and second heatmap panels from the left are the same as in Figure 3A.

Figure 15: Heatmap of log-fold change between cases and controls (i.e. $\log_2\left(\frac{\text{mean abundance in cases}}{\text{mean abundance in controls}}\right)$) for all genera which were significant (q < 0.05) in at least one dataset, across all studies. Rows are genera, ordered phylogenetically (as in Figure 3A). Columns are datasets, grouped by disease and ordered according to total sample size (decreasing from left to right). The first and second heatmap panels from the left are the same as in Figure 3A. Values are colored according to directionality of the effect, where red indicates higher mean abundance in patients relative to controls and blue indicates higher mean abundance in controls. Opacity indicates fold change and ranges from 1300 to 0, where fold changes greater than