Observational studies have shown that the composition of the human gut microbiome in children diagnosed with Autism Spectrum Disorder (ASD) differs significantly from that of their neurotypical (NT) counterparts. Thus far, reported ASD-specific microbiome signatures have been inconsistent. To uncover reproducible signatures, we compiled 10 publicly available raw amplicon and metagenomic sequencing datasets alongside new data generated from an internal cohort (the largest ASD cohort to date), unified them with standardized pre-processing methods, and conducted a comprehensive meta-analysis of all taxa and variables detected across multiple studies. By screening metadata to test associations between the microbiome and 52 variables in multiple patient subsets and across multiple datasets, we determined that differentially abundant taxa in ASD versus NT children were dependent upon age, sex, and bowel function, thus marking these variables as potential confounders in case–control ASD studies. Several taxa, including the strains Bacteroides stercoris t__190463 and Clostridium M bolteae t__180407, and the species Granulicatella elegans and Massilioclostridium coli, exhibited differential abundance in ASD compared to NT children only after subjects with bowel dysfunction were removed. Adjusting for age, sex and bowel function resulted in adding or removing significantly differentially abundant taxa in ASD-diagnosed individuals, emphasizing the importance of collecting and controlling for these metadata. We have performed the largest (n = 690) and most comprehensive systematic analysis of ASD gut microbiome data to date. Our study demonstrated the importance of accounting for confounding variables when designing statistical comparative analyses of ASD- and NT-associated gut bacterial profiles. 
Mitigating these confounders identified robust microbial signatures across cohorts, signifying the importance of accounting for these factors in comparative analyses of ASD and NT-associated gut profiles. Such studies will advance the understanding of different patient groups to deliver appropriate therapeutics by identifying microbiome traits germane to the specific ASD phenotype.
The prevalence of Autism Spectrum Disorder (ASD) continues to rise1, and mounting evidence implies a potential role of the gut microbiome in ASD symptomatology. The core behavioral features of ASD are often accompanied by multiple comorbidities such as gastrointestinal (GI) and immune dysfunction2. A highly metabolically active entity, the gut microbiome resides at the intersection of numerous communication axes in the body3. Its involvement in immune development4,5, mood disorders6, and other extra-GI disorders has been established. Numerous observational studies have found significant variability in the bacterial7,8,9,10,11,12,13,14,15,16,17,18,19,20 and fungal19 populations in ASD-diagnosed v. neurotypical children, but signals have not been robust across cohorts. Attempts to consistently link specific features or signatures to the ASD phenotype have proven futile to date21,22,23,24.
Due to the varying degree of behavioral symptoms and unique set of clinical features each individual will display, the ASD phenotype is extremely heterogeneous25. Between 40 and 70% of children with ASD experience GI abnormalities such as constipation and diarrhea26,27. While stool consistency has been shown to affect the observed microbial profile significantly28,29, this variable is not properly controlled in many studies. A recent reanalysis30 of a published dataset19 concluded that the original results were confounded by constipation in the ASD subjects. Additionally, human gut microbial populations fluctuate dramatically with age, particularly during early childhood31. ASD is typically diagnosed within the first 4 years of life32, yet the median age of past cohorts varies widely, ranging from 3 to 11 years. This study explores variation in the age of test subjects and mismatches in comorbidities between case and control groups, as these likely contribute to the inconsistencies observed across investigations.
Leveraging previously published microbiome sequencing data from case–control studies, we identified relationships between specific variables and the gut microbiomes of all ASD children, as well as those observed only in subsets of the children studied. Utilizing linear mixed-effects models for meta-analysis, we identified bacterial taxa whose differential abundances were consistent across studies. We show that age, sex, and bowel function are important confounders that must be controlled and evaluated in a consistent manner when appraising inter-study ASD datasets. We then discuss the sample sizes and statistical methods necessary to account for these variables and thus more accurately estimate ASD-associated effect sizes. The approach described here will aid in understanding the heterogeneity of the patient groups, leading to the identification of appropriate patient subsets to deliver targeted therapeutics in the future.
Dataset selection and inclusion
PubMed and the Sequence Read Archive (SRA; National Center for Biotechnology Information) were searched in March 2019 to identify publicly available raw datasets collected from case–control studies investigating the gut microbiota in ASD. The following queries yielded 72 potential publications/datasets: “autism[Title/Abstract] AND (microbiome[Title/Abstract] OR microbiota[Title/Abstract]) AND (16S OR sequencing OR metagenomic)” and “(autism gut) AND bioproject_sra[filter]” for PubMed and SRA, respectively. Inclusion criteria were adopted as follows: case–control ASD studies with human subjects under 18 years of age, datasets generated by 16S rRNA gene amplicon or metagenomic sequencing of human fecal samples, and raw data deposited in a public repository. Nine publications and 1 additional BioProject met all inclusion criteria, and an additional 6 publications met all criteria except for the availability of raw data (Supplementary file 1: Table S1). Of these 6, only 1 author provided raw data upon request. In total, 11 raw datasets associated with the 11 included studies were obtained. Three additional datasets acquired from an internal case–control ASD cohort33 were also included. One public dataset was subsequently excluded due to corrupted FASTQ files, bringing the final total to 13 datasets representing 10 cohorts. For datasets where repeated samples were collected, a single time point per individual was selected at random to include in the analysis.
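The random selection of a single time point per individual can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the `(subject_id, sample_id)` input layout, and the fixed seed are all assumptions introduced here.

```python
import random
from collections import defaultdict


def pick_one_sample_per_subject(samples, seed=0):
    """Pick one sample at random for every subject.

    samples: iterable of (subject_id, sample_id) pairs (hypothetical layout).
    Returns {subject_id: sample_id}.
    """
    by_subject = defaultdict(list)
    for subject_id, sample_id in samples:
        by_subject[subject_id].append(sample_id)
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the draw reproducible
    # Sorting makes the result independent of the input order.
    return {s: rng.choice(sorted(ids)) for s, ids in sorted(by_subject.items())}


# Hypothetical longitudinal data: subj1 was sampled twice, subj2 once.
samples = [("subj1", "t0"), ("subj1", "t1"), ("subj2", "t0")]
chosen = pick_one_sample_per_subject(samples)
```

Drawing once with a recorded seed keeps the retained biospecimens stable across reruns of the downstream analysis.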
Metadata variable selection
Metadata were curated according to the Second Genome controlled vocabulary. Metadata variables which were present in multiple studies with values for n ≥ 6 individuals per group per study were included in the analysis. Those present in a single study were excluded. All possible pairing combinations of the included metadata variables were also used in the analysis to test each variable in subsets of subjects (Supplementary file 2: Fig. S1). In the combinations, each value of each categorical variable was used to subset subjects and test microbiota associations with the remaining variables. Original contrasts were labelled with the prefix “Subset: None” (e.g. Subset: None; Variable: Autism Spectrum Disorder—FALSE over TRUE) while contrasts performed in combination with another variable were labelled with the subset group value (e.g. Subset: Biological sex = Male; Variable: Autism Spectrum Disorder—FALSE over TRUE).
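The inclusion rule and the subset labelling described above can be sketched as follows. This is an illustration under stated assumptions: "multiple studies" is read as at least two, a variable counts for a study only if every one of its groups has at least 6 individuals, and the data layout and function names are hypothetical.

```python
from collections import Counter

MIN_PER_GROUP = 6  # from the text: n >= 6 individuals per group per study
MIN_STUDIES = 2    # assumption: "multiple studies" read as at least two


def usable_in_study(values):
    """True if the variable has >= 2 groups, each with >= MIN_PER_GROUP subjects."""
    counts = Counter(v for v in values if v is not None)
    return len(counts) >= 2 and min(counts.values()) >= MIN_PER_GROUP


def select_variables(metadata):
    """metadata: {study: {variable: [value per subject]}} -> sorted variable names."""
    n_studies = Counter()
    for variables in metadata.values():
        for name, values in variables.items():
            if usable_in_study(values):
                n_studies[name] += 1
    return sorted(v for v, n in n_studies.items() if n >= MIN_STUDIES)


def subset_contrasts(levels):
    """levels: {variable: [categorical values]} -> list of (subset label, variable)."""
    contrasts = [("Subset: None", v) for v in levels]
    for subset_var, values in levels.items():
        for value in values:
            for v in levels:
                if v != subset_var:
                    contrasts.append((f"Subset: {subset_var} = {value}", v))
    return contrasts


# Hypothetical metadata: "diet" fails the per-group minimum and appears in one study only.
metadata = {
    "studyA": {"sex": ["M"] * 6 + ["F"] * 6, "diet": ["x"] * 6 + ["y"] * 2},
    "studyB": {"sex": ["M"] * 7 + ["F"] * 6},
}
selected = select_variables(metadata)
contrasts = subset_contrasts({
    "Autism Spectrum Disorder": ["FALSE", "TRUE"],
    "Biological sex": ["Male", "Female"],
})
```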
Raw data were downloaded from the respective repositories, obtained directly from authors (Supplementary file 1: Table S1), or generated in-house (see Supplementary file 3). Data were processed using standardized pipelines for each sequencing technology. For 16S rRNA gene amplicon sequencing data generated with Illumina technology, the DADA2 workflow was used with default settings for filtering, learning errors, dereplication, amplicon sequence variant (ASV) inference, and chimera removal34. Truncation quality was set to 2, and ten nucleotides were trimmed from each terminus of each read for both forward and reverse. Data generated using pyrosequencing technology, and any sequencing data with trimmed reads, were merged (for paired-end sequencing) then aligned to an in-house strain database (StrainSelect, strainselect.secondgenome.com, version 2019 i.e. SS19) as described in the next section. Remaining sequences without unique strain matches were quality filtered, dereplicated, and clustered (97%) with UPARSE to generate de-novo operational taxonomic units (OTUs). For metagenomic shotgun sequencing data, adapter sequences and low-quality ends were trimmed using Trimmomatic35 (< Q20), then contaminant sequences were removed using Bowtie236. Host sequences were removed with Kraken37 and rRNA sequences were removed with SortMeRNA38. Sourmash39 was used to generate compressed representations of DNA sequences. Data generated using PhyloChip hybridization technology40 (see Supplementary file 3) were processed with the Sinfonietta software (Second Genome, Inc, Brisbane, CA) as previously described41, generating empirical OTUs (eOTUs).
Across all data types, consistently formatted identifiers for known strains were assigned using the SS19 database. SS19 contains known microbial strains publicized as of July 22, 2019. The abundance of each strain within shotgun metagenomic reads was calculated using Sourmash (kmer = 51, scaled = 5000). Using USEARCH42, ASVs matching a unique strain (≥ 99% global alignment identity) in the database were annotated with the strain identifier only if no genes from different strains had equivalent or higher identity matches than the unique strain hit. Abundances were summed for all ASVs matching the same strain. If a unique strain match was not achieved for an ASV, then species level and higher taxonomic placement was estimated with sintax (-cutoff 0.80)43. Genome Taxonomy Database (GTDB44) taxonomic nomenclature for species and higher taxa was used where available. Overall, 6%, 0.9%, and 6.8% of 16S ASVs, OTUs and eOTUs, respectively, were unique matches to strains.
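The unique-strain rule can be illustrated with a short sketch. It assumes the alignment hits for one ASV are already available as `(strain_id, percent_identity)` pairs; the function name and the strain identifiers are hypothetical, and the actual alignment step (USEARCH) is not reproduced.

```python
def unique_strain_match(hits, min_identity=99.0):
    """Return the strain ID if exactly one strain holds the best hit >= min_identity.

    hits: list of (strain_id, percent_identity) pairs for a single ASV
    (hypothetical layout). Returns None when no hit passes the threshold
    or when two different strains tie for the best identity.
    """
    passing = [(strain, pid) for strain, pid in hits if pid >= min_identity]
    if not passing:
        return None
    best = max(pid for _, pid in passing)
    best_strains = {strain for strain, pid in passing if pid == best}
    return best_strains.pop() if len(best_strains) == 1 else None


# Hypothetical hits: a clear winner, a tie, and a hit below the 99% threshold.
clear = unique_strain_match([("t__A", 99.6), ("t__B", 99.1)])
tied = unique_strain_match([("t__A", 99.6), ("t__B", 99.6)])
weak = unique_strain_match([("t__A", 98.9)])
```

ASVs for which the function returns `None` would fall through to the species-level and higher placement described above.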
Within each dataset, taxonomic units (e.g. strains, sequence variants) present in less than 5% of biospecimens were removed. In addition, biospecimens with a sequencing depth less than 1% of the mean sequencing depth in the respective dataset were removed. To identify variables most associated with changes in bacterial community composition in each dataset, permutational multivariate analysis of variance, also known as Adonis (R package “vegan”) was performed for each variable/combination at each taxonomic rank.
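The two filters can be sketched as follows. The text does not state the order in which they were applied, so this illustration evaluates both criteria on the unfiltered table, which is an assumption; the nested-dictionary layout and the function name are likewise hypothetical.

```python
MIN_PREVALENCE = 0.05      # taxa must occur in at least 5% of biospecimens
MIN_DEPTH_FRACTION = 0.01  # biospecimens need at least 1% of the mean depth


def filter_dataset(counts):
    """counts: {sample_id: {taxon: read_count}} -> filtered copy."""
    n = len(counts)
    depth = {s: sum(taxa.values()) for s, taxa in counts.items()}
    mean_depth = sum(depth.values()) / n
    # Count in how many biospecimens each taxon is present (count > 0).
    prevalence = {}
    for taxa in counts.values():
        for taxon, c in taxa.items():
            if c > 0:
                prevalence[taxon] = prevalence.get(taxon, 0) + 1
    keep_taxa = {t for t, k in prevalence.items() if k / n >= MIN_PREVALENCE}
    return {
        s: {t: c for t, c in taxa.items() if t in keep_taxa}
        for s, taxa in counts.items()
        if depth[s] >= MIN_DEPTH_FRACTION * mean_depth
    }


# Hypothetical dataset: "rare" occurs in 1 of 40 samples; sample s1 has no reads.
counts = {f"s{i}": {"common": 100, "rare": 0} for i in range(40)}
counts["s0"]["rare"] = 5
counts["s1"] = {"common": 0, "rare": 0}
filtered = filter_dataset(counts)
```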
For individual taxa, effect sizes (fold change in log2 scale) and standard errors were calculated within each dataset for each variable/combination at every taxonomic rank (phylum to species levels). Calculations for non-aggregated (strain level) 16S rRNA gene amplicon sequencing data were obtained using DESeq245. Given that sourmash produces a table of relative abundances, effect sizes and standard errors were calculated using the escalc function (measure = “ROM”) in the “metafor” R package46 to obtain log transformed ratios of means which were then transformed to log2 scale. Metagenomic data were not aggregated to higher taxonomic ranks. Aggregated 16S data were first transformed to relative abundance then effect sizes and standard errors were calculated as above. PhyloChip fluorescence intensity data were log2 transformed and here, escalc (measure = “MD”) was utilized to calculate the raw mean difference between groups which is equivalent to log2(fold change).
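The two effect-size measures named above can be written out directly. The sketch below uses the standard large-sample variance formulas for the log ratio of means and the raw mean difference; it is a plain-Python illustration rather than a call into the R package, and the function names and example abundances are assumptions. Both group means must be positive for the ratio of means to be defined.

```python
import math
from statistics import mean, stdev


def log2_ratio_of_means(group1, group2):
    """Log ratio of means (group1 over group2) and its standard error, in log2 units."""
    m1, m2 = mean(group1), mean(group2)
    n1, n2 = len(group1), len(group2)
    yi = math.log(m1 / m2)  # natural-log ratio of means
    vi = stdev(group1) ** 2 / (n1 * m1 ** 2) + stdev(group2) ** 2 / (n2 * m2 ** 2)
    # Dividing by ln(2) rescales both the estimate and its SE to log2 units.
    return yi / math.log(2), math.sqrt(vi) / math.log(2)


def mean_difference(group1, group2):
    """Raw mean difference and standard error for already log2-transformed values."""
    yi = mean(group1) - mean(group2)
    vi = stdev(group1) ** 2 / len(group1) + stdev(group2) ** 2 / len(group2)
    return yi, math.sqrt(vi)


# Hypothetical relative abundances of one taxon (NT over ASD, i.e. FALSE over TRUE).
nt = [0.02, 0.04, 0.03, 0.03]
asd = [0.01, 0.02, 0.015, 0.015]
lfc, lfc_se = log2_ratio_of_means(nt, asd)
md, md_se = mean_difference([5.0, 6.0, 7.0], [4.0, 5.0, 6.0])
```

In this toy example the NT mean is twice the ASD mean, so the log2 fold change is 1 up to floating-point error.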
To identify taxa with concordant effect sizes for any given metadata variable/combination across multiple datasets, linear mixed-effects models46 were calculated using the rma.mv function in the “metafor” package. Each model was computed by regressing the effect sizes for a single taxon against a fixed effect, the metadata variable/combination, while controlling for the random variability introduced by each distinct dataset. The random-effects model can be written as $y_i = \mu + u_i + \varepsilon_i$, where $u_i \sim N(0, \tau^2)$ and $\varepsilon_i \sim N(0, v_i)$. For a set of $i = 1, \ldots, k$ independent studies, $y_i$ denotes the observed effect size in the $i$th study, $\mu$ denotes the average true effect, and $\tau^2$ is the variance in the true effects. The sampling variances $v_i$ are the squared standard errors of the estimates. In cases where multiple datasets originated from the same cohort study, inner (dataset) and outer (cohort) levels of the random effect were specified; datasets within a given cohort share correlated random effects. P values were adjusted according to the Benjamini–Hochberg method for all models within a given taxonomic rank. A floor of 1e-10 was applied to adjusted P values. Plots were generated using R packages “ggplot2” and “ggpubr”. See Supplementary file 2: Fig. S2 for a schematic of the methods.
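To make the pooling step concrete, the sketch below fits a single-level random-effects model and applies the Benjamini–Hochberg adjustment with a floor. It is a simplified stand-in, not the method used in the study: it uses the DerSimonian–Laird moment estimator for the between-study variance rather than the multilevel rma.mv fit, and it ignores the nested cohort/dataset structure. Function names and input values are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling of k >= 2 effect sizes.

    Returns (mu, se, tau2, p_value) with a two-sided normal-theory p value.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effect sizes with matching variances")
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sw = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    p_value = math.erfc(abs(mu / se) / math.sqrt(2.0))
    return mu, se, tau2, p_value


def benjamini_hochberg(pvalues, floor=1e-10):
    """Benjamini-Hochberg adjusted p values, with a lower bound applied afterwards."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank_from_top, i in enumerate(order):
        rank = m - rank_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = max(running_min, floor)
    return adjusted


# Hypothetical log2 fold changes and squared standard errors from three datasets.
mu, se, tau2, p = pool_random_effects([-0.6, -0.4, -0.8], [0.04, 0.05, 0.06])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In the analysis described above, the adjustment would be applied once per taxonomic rank, across the p values of all models fitted at that rank.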
Concordance of metadata variables across studies
Raw sequencing datasets (n = 10) were collected and re-analyzed from publicly available case–control studies. These ten datasets, representing nine distinct cohorts, were evaluated alongside three microbiome datasets from an internal cohort (Table 1). Datasets consisted of 16S rRNA gene amplicon and metagenomic DNA sequences. In addition, one internal dataset was generated using PhyloChip technology. Amplicon sequencing datasets covered multiple 16S rRNA variable regions. Cohorts included subjects between 2 and 18 years of age from the United States, Canada, China, Italy, and India. Participants were reported to be age-matched in 7 of the 10 studies although we found a significant difference in the ages of ASD v. NT children in one study17 (Supplementary file 2: Fig. S3). Strati et al.19 did not report the minimum and maximum ages of their participant pool.
To identify microbial features associated with specific metadata variables across datasets, we first examined all the variables that were concordant across studies. A total of 52 distinct variables were reported (Supplementary file 1: Table S2). Age (10 of 13 datasets), sex (10 of 13 datasets), and bowel function (6 of 13 datasets) were among the most prevalent metadata categories. Of the 52 variables selected for meta-analysis, the vast majority were present in fewer than five datasets, and only five variables were present in more than four datasets (Fig. 1). To investigate the impact of potential confounders, all possible pairwise combinations of variables were also included in the meta-analysis, resulting in 580 tests (Fig. 1b).
Multi-cohort bacterial abundance signatures germane to ASD exist at all taxonomic ranks
To assess the association between bacterial community structure (beta-diversity) and each of the metadata variables/variable combinations, Adonis tests were performed on non-aggregated data (i.e. ASV or OTU level) as well as data aggregated at each taxonomic rank. Non-aggregated data yielded the greatest number of significant (P < 0.05) tests (Fig. 2a). The main broad variable under investigation, denoted as “Subset: None; Autism Spectrum Disorder—FALSE over TRUE”, was significant in several datasets at every taxonomic rank. Of the 580 possible combinatorial metadata tests, 30 were significantly associated with bacterial community structure in at least two datasets and at least one taxonomic rank (Supplementary file 1: Table S3). Thirteen of the 30 tests focused on differences in subject age. Age was a significant contributor to bacterial community variation within the NT population (“Subset: Autism Spectrum Disorder = FALSE”) in four out of nine datasets, but was only significant in two of 10 datasets in the ASD population (“Subset: Autism Spectrum Disorder = TRUE”). Significant changes in beta-diversity for each variable within each subset are shown in Fig. 2b. Variables bearing the greatest number of dataset associations were found in neurotypical male children (Subsets: “Biological sex = Male,” “Period of life = Childhood,” and “Autism Spectrum Disorder = FALSE”).
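For readers unfamiliar with the Adonis test, the sketch below shows the underlying idea for a single categorical variable: a pseudo-F statistic computed from pairwise distances, with a p value obtained by permuting group labels. It is a plain-Python illustration, not the vegan implementation; the choice of Bray–Curtis distance, the number of permutations, and the toy abundance profiles are assumptions.

```python
import random


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    den = sum(x + y for x, y in zip(a, b))
    return sum(abs(x - y) for x, y in zip(a, b)) / den if den else 0.0


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F statistic from a distance matrix and group labels."""
    n = len(labels)
    groups = {}
    for i, g in enumerate(labels):
        groups.setdefault(g, []).append(i)
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for idx in groups.values():
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    a = len(groups)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(profiles, labels, n_perm=999, seed=0):
    """Return (pseudo-F, permutation p value) for one grouping variable."""
    n = len(profiles)
    dist = [[bray_curtis(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    observed = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between labels and profiles
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical counts for three taxa in two clearly separated groups of five samples.
profiles = [[10, 0, 1], [9, 1, 0], [8, 0, 2], [10, 1, 1], [9, 0, 1],
            [0, 10, 1], [1, 9, 0], [2, 8, 0], [0, 9, 2], [1, 10, 0]]
labels = ["ASD"] * 5 + ["NT"] * 5
f_stat, p_perm = permanova(profiles, labels)
```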
In addition to community-wide associations, relationships between individual taxa and each variable/variable combination were assessed using random-effects models. When comparing ASD and NT children (“Subset: None; Autism Spectrum Disorder—FALSE over TRUE”; Fig. 3) with aggregated data, Burkholderiales (order level) was detected in nine of the 10 datasets (metagenomic data were not aggregated) and was significantly more abundant in children with ASD, although the effect size was small (log2(fold change) = − 0.59). An unannotated species, s__PROV_t__172009 (genus Lawsonibacter), was also enriched in the ASD cohorts (present in five of the 10 datasets). While remaining meta-analysis results from aggregated data were weak, with significant taxa detected in only three datasets, non-aggregated data revealed more consistent associations. Two sequence variants, annotated as unclassified strains of the species Ruminiclostridium_E siraeum, were less abundant in ASD cohorts and detected in 12 of 13 datasets (Supplementary file 1: Table S4). Barnesiella intestinihominis and Faecalibacterium prausnitzii_K were also depleted in the ASD cohorts from 11 and 9 datasets, respectively. Taxa most differentially abundant in the ASD-diagnosed cohorts and exhibiting the largest effect sizes included: Fusicatenibacter saccharivorans (10 datasets), Bacteroides uniformis (8 datasets), and Bacteroides thetaiotaomicron (5 datasets).
Heterogeneity of the ASD phenotype is evident in taxonomic profiles affected by GI dysfunction
We next explored the differential taxonomic findings in subsets of patients. Due to the influence of GI function on gut bacterial profiles47,48 and the high prevalence of GI dysfunction in the ASD population26,49, we elected to appraise ASD-associated differential taxa abundances in children without bowel dysfunction (“Subset: Tends to have normal bowel function; Autism Spectrum Disorder—FALSE over TRUE”). Removing children with bowel dysfunction invalidated many significant ASD v. NT findings suggesting the initial findings were more related to bowel function than study group, although the number of datasets and participants that could be investigated in the subset was reduced (Fig. 4, Supplementary file 2: Fig. S3). Several bacterial taxa exhibited differential abundance in ASD v. NT children only after subjects with bowel dysfunction were removed, indicating a close relationship with ASD diagnosis rather than constipation/diarrhea. As expected, the gut bacterial community profiles of ASD individuals with constipation differed greatly from those of ASD individuals with diarrhea (Supplementary file 2: Fig. S6), which confounds the case–control contrast. Many taxa discriminated between ASD subjects afflicted with constipation and those afflicted with diarrhea, including members of the Lachnospiraceae family and species of Bacteroides, Bifidobacterium, and Streptococcus (Supplementary file 1: Table S5). The number of differentially abundant taxa was far greater than that observed when comparing ASD and NT children (Supplementary file 1: Table S4), suggesting that bowel dysfunction exerts a greater influence on microbial community dynamics than the ASD phenotype.
Although the effect sizes of most ASD-associated taxa in children with normal bowel function were small, 19 significant taxa (q < 0.05) were identified. We discuss the top four (based on effect size): Massilioclostridium coli (species, depleted in ASD), Granulicatella elegans (species, depleted in ASD), t__190463 (Bacteroides stercoris strain, depleted in ASD), and t__180407 (Clostridium M bolteae strain, enriched in ASD). The two species, M. coli and G. elegans, were completely absent from ASD patients, but had low prevalence in the NT children tested (7% and 3%, respectively; 2 datasets). The two strains, t__190463 and t__180407, were prevalent (25% and 38%, respectively) and detected in 4 and 3 datasets, respectively. Effect sizes from individual studies in addition to pooled effect sizes are reported in Supplementary file 2: Fig. S7. Changes in sample size due to the removal of children with bowel dysfunction are reported in Supplementary file 2: Fig. S8.
Age and sex confound comparative analyses of bacterial differential abundance in ASD v. NT children
Evidence suggests that the gut microbiome continues to evolve and mature well into childhood31. Given the wide age range of children with ASD studied, it is likely that some inconsistencies in the literature stem from age-related development of the gut microbiota. In this study, two distinct periods of life were considered: childhood, i.e., 2–9 years old, and adolescence, i.e., 10–17 years old. The abundances of several bacterial taxa differed between ASD and NT subjects only when considering children, or only when considering adolescents (Fig. 5). Though only detected in two datasets, one unannotated species (s__PROV_t__96605, genus Prevotella) was significantly depleted in the ASD groups, irrespective of age.
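The two periods of life can be expressed as a simple binning rule. The age boundaries come from the text; the function name, the handling of fractional ages, and the decision to return `None` outside the 2–17 range are assumptions made for illustration.

```python
def period_of_life(age_years):
    """Map an age in years to the period-of-life category used for subsetting."""
    if 2 <= age_years < 10:
        return "Childhood"    # 2-9 years old
    if 10 <= age_years < 18:
        return "Adolescence"  # 10-17 years old
    return None               # outside the studied range


# Hypothetical ages, including a fractional age and one outside the range.
periods = [period_of_life(a) for a in (2, 9.5, 10, 17, 18)]
```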
Sex bias plays a significant role in ASD studies due to the elevated prevalence (~ 4X) of diagnoses in males v. females50. While differences in gut bacterial composition have been documented between male and female subjects51, this confounder is often omitted from statistical analyses of ASD-associated microbial population flux. In this study we see that sex-dependent differences in the fecal microbiome are stronger in the ASD-diagnosed cohort. Taxa that discriminated between males and females were also specific to each study group (Fig. 6a). Only three taxa were significantly differentially abundant between males and females in both the ASD and NT groups (Supplementary file 1: Table S7). Furthermore, there was no overlap in differentially abundant taxa between only male or only female ASD v. NT comparisons (Fig. 6b). The abundances of members of the Lachnospiraceae family and several species of Bacteroides and Bifidobacterium were elevated in the gut microbiomes of ASD-diagnosed males compared to their female counterparts. The abundance of these same taxa varied significantly between ASD children with constipation v. diarrhea (Supplementary file 1: Table S5). This hints at a potential interaction or shared pathway between the two confounders (sex and bowel function) in the ASD population.
Adjusting for confounding factors eliminates noise in ASD-associated microbiome changes
Since we observed confounding for age, sex, and bowel function in microbiome changes, ASD-associated strains were evaluated in two datasets, i.e., DS1 and DS11, wherein all three metadata were reported across both ASD and NT study groups. In DS1, the abundances of four distinct strains were significantly different in ASD subjects when the confounders were omitted from consideration. With confounder adjustment, only one of the strains retained significance but seventeen additional strains were significantly differentially abundant (Supplementary file 2: Fig. S9). Initially, no differentially abundant strains were detected in DS11. After adjusting for confounders, however, a single strain was significantly more abundant in the ASD cohort (Supplementary file 2: Fig. S9).
Although there is convincing evidence that gut microbiome profiles differ between ASD and NT children, there is little consensus as to which bacterial taxa are impactful and/or at all relevant to ASD symptomatology. Independent studies yield results that are not generalizable to the ASD population, potentially due to the heterogeneity of the ASD phenotype and variability in age of subjects studied. Recent efforts to systematically review initiatives linking the microbiome to ASD and/or conduct meta-analyses on the reported results of such efforts show little to no empirically derived associations between varying microbiome diversity and the ASD phenotype21,22,23,24. We hypothesized that standardized pre-processing of raw data generated from ASD microbiome surveys coupled with downstream comprehensive meta-analyses would yield more insightful and more accurate findings. In a similar vein, we thoroughly investigated microbiome populations alongside all the metadata variables compiled from numerous studies to reconsider some differential findings in subsets of ASD v. NT children. We found that certain bacterial abundances were significantly different between ASD and NT, oftentimes the direct consequence(s) of the confounding effects of bowel dysfunction, age, and sex. Our statistical approach is simple, and results are directly interpretable, making it favorable for mining microbiome data.
Of the clinical comorbidities documented in children with ASD, GI issues are among the most common52. Diarrhea, constipation, and abdominal pain are the most frequently reported GI symptoms in the ASD population53,54, but many have alternating constipation/diarrhea so these categories are often inadequate to describe their GI symptoms over time. However, stool consistency (a proxy for constipation/diarrhea) has been reported as the top fecal microbiome covariate28 and has been shown to confound case–control studies29. Unsurprisingly, the works discussed here suggest that many of the differential taxa abundances previously thought to associate directly with the ASD phenotype were likely an artifact linked more closely to constipation and/or diarrhea in the subjects sampled. Our results align with a recent reanalysis30 of a dataset included here19. Only two previously published investigations9,17 and our internal cohort reported information relating to bowel function in both case and control groups. Given the significant impact of stool consistency on fecal microbial populations and the prevalence of diarrhea and constipation in the ASD population, current and future studies must consider, at minimum, scoring collected samples based on the Bristol Stool Scale55 and reporting this covariate index in statistical analyses.
Upon considering only individuals with normal bowel function, we were able to identify taxa more closely associated with ASD status, a feat not possible in isolated cohorts due to inadequate statistical power. Of note, three taxa were depleted in individuals with ASD compared to NT controls across multiple datasets: Bacteroides stercoris t__190463 (strain level), Granulicatella elegans (species level), and Massilioclostridium coli (species level). Although the two species of interest had low prevalence in the children studied, they were not detected in the ASD patients. Bacteroides stercoris was previously reported to have lower relative abundance in ASD compared to NT children56, but higher relative abundance in a different study15. Here, meta-analysis estimated reduced relative abundance in the ASD population after pooling signals from four different datasets, thus demonstrating its utility in uncovering democratized associations. Although evidence to suggest a potential mechanism for the depletion of B. stercoris in ASD pathology has yet to be demonstrated, we believe the strain identified warrants further investigation as a therapeutic. Meta-analysis also revealed that Clostridium_M bolteae t__180407 (strain level) was enriched in the ASD populations investigated (3 datasets). Clostridium bolteae has been investigated in the context of ASD and is consistently more abundant in these children compared to their NT counterparts10,57,58. Abundances are even higher in individuals with Pitt Hopkins syndrome59, a severe ASD with a high incidence of GI dysfunction. C. bolteae produces a conserved specific capsular polysaccharide which is immunogenic in rabbits and has been the focus of ASD vaccine efforts60,61. Although these findings are promising, more evidence is needed as relatively few datasets supported the result in this study and the strain level findings may be confounded by between-study heterogeneity (Supplementary file 2: Fig. S7).
Age is another covariate of immense importance and relevance to microbiome health and status, particularly in young children. The gut microbiome begins to resemble that of an adult sometime around age three62, but evidence suggests further maturation over the course of later childhood31. Studies on ASD have inadvertently targeted child subjects spanning a broad age range, and unfortunately age-dependent gut microbiome differences within each study group likely affect results and confound inter-study inferences. Seven of the 10 studies revisited herein ensured that case and control groups were age-matched, though none mentioned controlling for age in statistical analyses even though children of preschool age (2–4 years old) and teenagers (13–17 years old) were evaluated in the same study. A recent analysis of more than 2500 individuals revealed that disease-microbiome associations depend on the age group studied, and that adjusting for age improves the detection of microbes truly relevant to the disease phenotype63. Although demonstrated in adults, it is conceivable that this paradigm applies to children and adolescents who are growing and maturing rapidly. Our differential findings in young children v. adolescents support the notion that the microbiome differs according to age in non-adult populations as well.
Our findings demonstrate that ASD-associated bacterial taxa abundances differ innately as a function of sex. Due to the roughly four-times greater prevalence of ASD diagnoses in males v. females, study populations are often biased and imbalanced with respect to sex. Male subjects made up 70–89.5% of the cohorts examined here, indicating that sex-dependent microbiome associations were challenging to assess in the isolated studies prior. By exploiting meta-analytical approaches, we were able to show that sex exerts an even greater influence on the microbiome in children with ASD than those with typical development. In addition, associations between fecal microbiomes and ASD were stronger in females compared to males. Consistent with our findings (Adonis test), previous studies have reported no associations between gut microbial community structure and sex in healthy children64,65,66. However, we detected several taxa that were differentially abundant between male and female NT children, and these disparities were more drastic in male v. female ASD children, but only three strains were significant in both contrasts. There was no overlap in differentially abundant taxa between ASD and NT children in male and female subsets. These findings suggest that recent surveys of ASD-microbiome variation, all of which are based predominantly on male subjects, may not be generalizable to the female population.
Finally, the limited sample size of most studies including but not limited to those considered in this comprehensive analysis (n < 100 across nine cohorts; n < 50 across 5 cohorts) drastically restricts statistical power and thus experimental resolution. Considering only the extent of variability observed within the ASD population and the number of confounders that need to be addressed, it is readily apparent that larger studies are warranted to improve statistical power and strengthen downstream inferences and conclusions. The findings presented here strongly suggest that surveys investigating simple case–control contrasts are not suitable when investigating relationships between gut microbiome perturbations and the ASD phenotype. This work underscores the dire need to systematically collect, curate, and report highly detailed metadata. It is also apparent that statistical methods used to estimate effect sizes between cases and controls should integrate confounder adjustments to more accurately account for age, sex, and stool sample consistency, at a minimum. Other confounders not addressed by our analysis due to inadequate reporting include diet and medications. Dietary preference and medication usage are strong gut microbiome covariates28,29 and are particularly relevant to studies of ASD where case patients often have extreme food selectivity67 and medical comorbidities requiring pharmacological treatment68.
Our study demonstrates that the gut microbiomes of the ASD population exhibit appreciable heterogeneity, as has already been established for the clinical manifestations of the disorder. High within-group variability can produce artifacts and mask true ASD-microbiome relationships. Because population-scale studies of ASD may be difficult to establish, we used meta-analytical approaches with confounder adjustment to unveil gut bacterial disturbances directly related to ASD symptomatology. This is a substantial advance in understanding the patient population and its associated comorbidities, and it should help lead to personalized microbiome-based therapeutics.
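As a rough illustration of how per-cohort estimates can be combined in a meta-analysis, the sketch below pools effect sizes with inverse-variance weights under a fixed-effect model. This is a generic textbook technique shown under stated assumptions, not a reproduction of the models fitted in this study; a random-effects model would additionally estimate between-cohort variance. The function name `pool_fixed_effect` and the example effect sizes and standard errors are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pool_fixed_effect(estimates, std_errors, confidence=0.95):
    """Inverse-variance (fixed-effect) pooling of per-cohort effect sizes.

    Returns the pooled estimate, its standard error, and a normal-theory
    confidence interval as a (lower, upper) tuple.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)

    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return pooled, pooled_se, (pooled - z * pooled_se, pooled + z * pooled_se)


# Hypothetical log2 fold changes (ASD versus NT) for one taxon in three
# cohorts, with hypothetical standard errors; not values from this study.
cohort_estimates = [0.8, 0.2, 0.5]
cohort_std_errors = [0.4, 0.2, 0.4]

pooled, pooled_se, (ci_low, ci_high) = pool_fixed_effect(
    cohort_estimates, cohort_std_errors
)
print(f"pooled = {pooled:.3f}, SE = {pooled_se:.3f}, "
      f"95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

The cohort with the smallest standard error dominates the pooled estimate, which is why small, noisy cohorts contribute little on their own but can still sharpen the combined interval.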
Accession numbers for publicly available raw data are detailed in Supplementary file 1: Table S1. Raw sequencing data generated from the internal cohort have been deposited at http://files.cgrb.oregonstate.edu/David_Lab/M3_longitudinal_16s/. Raw PhyloChip data generated from the internal cohort are available in MIAME format at https://greengenes.secondgenome.com/?prefix=downloads/phylochip_datasets/ (SG_SIwai_2021_M3_ASD_CIMA.tgz). All code used to generate the figures presented can be found in Supplementary file 4.
ASD: Autism Spectrum Disorder
ASV: Amplicon sequence variant
eOTU: Empirical operational taxonomic unit
OTU: Operational taxonomic unit
rRNA: Ribosomal ribonucleic acid
SRA: Sequence Read Archive
Maenner, M. J. et al. Prevalence and characteristics of autism spectrum disorder among children aged 8 years—autism and developmental disabilities monitoring network, 11 sites, United States, 2018. MMWR Surveill. Summ. 70, 1–16 (2021).
Muskens, J. B., Velders, F. P. & Staal, W. G. Medical comorbidities in children and adolescents with autism spectrum disorders and attention deficit hyperactivity disorders: A systematic review. Eur. Child Adolesc. Psychiatry 26, 1093–1103 (2017).
Schroeder, B. O. & Bäckhed, F. Signals from the gut microbiota to distant organs in physiology and disease. Nat. Med. 22, 1079–1089 (2016).
Ivanov, I. I. et al. Specific microbiota direct the differentiation of IL-17-producing T-helper cells in the mucosa of the small intestine. Cell Host Microbe 4, 337–349 (2008).
Atarashi, K. et al. Th17 cell induction by adhesion of microbes to intestinal epithelial cells. Cell 163, 367–380 (2015).
Bravo, J. A. et al. Ingestion of Lactobacillus strain regulates emotional behavior and central GABA receptor expression in a mouse via the vagus nerve. Proc. Natl. Acad. Sci. USA 108, 16050–16055 (2011).
Finegold, S. M. et al. Pyrosequencing study of fecal microflora of autistic and control children. Anaerobe 16, 444–453 (2010).
Kang, D.-W. et al. Differences in fecal microbial metabolites and microbiota of children with autism spectrum disorders. Anaerobe 49, 121–131 (2018).
Kang, D.-W. et al. Reduced incidence of Prevotella and other fermenters in intestinal microflora of autistic children. PLoS One 8, e68322 (2013).
De Angelis, M. et al. Fecal microbiota and metabolome of children with autism and pervasive developmental disorder not otherwise specified. PLoS One 8, e76993 (2013).
Ma, B. et al. Altered gut microbiota in Chinese children with autism spectrum disorders. Front. Cell. Infect. Microbiol. 9, 40 (2019).
Liu, S. et al. Altered gut microbiota and short chain fatty acids in Chinese children with autism spectrum disorder. Sci. Rep. 9, 287 (2019).
Plaza-Díaz, J. et al. Autism spectrum disorder (ASD) with and without mental regression is associated with changes in the fecal microbiota. Nutrients 11, 25 (2019).
Zhang, M., Ma, W., Zhang, J., He, Y. & Wang, J. Analysis of gut microbiota profiles and microbe-disease associations in children with autism spectrum disorders in China. Sci. Rep. 8, 13981 (2018).
Averina, O. V. et al. The bacterial neurometabolic signature of the gut microbiota of young children with autism spectrum disorders. J. Med. Microbiol. 69, 558–571 (2020).
Pulikkan, J. et al. Gut microbial dysbiosis in Indian children with autism spectrum disorders. Microb. Ecol. 76, 1102–1114 (2018).
Wang, M. et al. Alterations in gut glutamate metabolism associated with changes in gut microbiota composition in children with autism spectrum disorder. mSystems 4, 25 (2019).
Coretti, L. et al. Gut microbiota features in young children with autism spectrum disorders. Front. Microbiol. 9, 3146 (2018).
Strati, F. et al. New evidences on the altered gut microbiota in autism spectrum disorders. Microbiome 5, 24 (2017).
Li, N. et al. Correlation of gut microbiome between ASD children and mothers and potential biomarkers for risk assessment. Genom. Proteom. Bioinform. 17, 26–38 (2019).
Bezawada, N., Phang, T. H., Hold, G. L. & Hansen, R. Autism spectrum disorder and the gut microbiota in children: A systematic review. Ann. Nutr. Metab. 20, 1–14. https://doi.org/10.1159/000505363 (2020).
Ho, L. K. H. et al. Gut microbiota changes in children with autism spectrum disorder: A systematic review. Gut Pathog. 12, 6 (2020).
Lacorte, E. et al. A systematic review of the microbiome in children with neurodevelopmental disorders. Front. Neurol. 10, 727 (2019).
Iglesias-Vázquez, L., Van Ginkel Riba, G., Arija, V. & Canals, J. Composition of gut microbiota in children with autism spectrum disorder: A systematic review and meta-analysis. Nutrients 12, 25 (2020).
Masi, A., DeMayo, M. M., Glozier, N. & Guastella, A. J. An overview of autism spectrum disorder, heterogeneity and treatment options. Neurosci. Bull. 33, 183–193 (2017).
Wang, L. W., Tancredi, D. J. & Thomas, D. W. The prevalence of gastrointestinal problems in children across the United States with autism spectrum disorders from families with multiple affected members. J. Dev. Behav. Pediatr. 32, 351–360 (2011).
Mazefsky, C. A., Schreiber, D. R., Olino, T. M. & Minshew, N. J. The association between emotional and behavioral problems and gastrointestinal symptoms among children with high-functioning autism. Autism 18, 493–501 (2014).
Falony, G. et al. Population-level analysis of gut microbiome variation. Science 352, 560–564 (2016).
Vujkovic-Cvijin, I. et al. Host variables confound gut microbiota studies of human disease. Nature 587, 448–454 (2020).
Fu, S.-C., Lee, C.-H. & Wang, H. Exploring the association of autism spectrum disorders and constipation through analysis of the gut microbiome. Int. J. Environ. Res. Public Health 18, 25 (2021).
Derrien, M., Alvarez, A.-S. & de Vos, W. M. The gut microbiota in the first decade of life. Trends Microbiol. 27, 997–1010 (2019).
Brett, D., Warnell, F., McConachie, H. & Parr, J. R. Factors affecting age at ASD diagnosis in UK: No evidence that diagnosis age has decreased between 2004 and 2014. J. Autism Dev. Disord. 46, 1974–1984 (2016).
Chrisman, B. S. et al. Improved detection of disease-associated gut microbes using 16S sequence-based biomarkers. BMC Bioinform. 22, 509 (2021).
Callahan, B. J. et al. DADA2: High-resolution sample inference from Illumina amplicon data. Nat. Methods 13, 581–583 (2016).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Wood, D. E. & Salzberg, S. L. Kraken: Ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 15, R46 (2014).
Kopylova, E., Noé, L. & Touzet, H. SortMeRNA: Fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics 28, 3211–3217 (2012).
Titus Brown, C. & Irber, L. sourmash: A library for MinHash sketching of DNA. J. Open Source Softw. 1, 27 (2016).
Schatz, M. C. et al. Integrated microbial survey analysis of prokaryotic communities for the PhyloChip microarray. Appl. Environ. Microbiol. 76, 5636–5638 (2010).
Ravilla, R. et al. Cervical microbiome and response to a human papillomavirus therapeutic vaccine for treating high-grade cervical squamous intraepithelial lesion. Integr. Cancer Ther. 18, 1534735419893063 (2019).
Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
Edgar, R. C. Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences. PeerJ 6, e4652 (2018).
Parks, D. H. et al. A standardized bacterial taxonomy based on genome phylogeny substantially revises the tree of life. Nat. Biotechnol. 36, 996–1004 (2018).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Viechtbauer, W. Conducting Meta-Analyses in R with the metafor Package. J. Stat. Softw. Articles 36, 1–48 (2010).
Roager, H. M. et al. Colonic transit time is related to bacterial metabolism and mucosal turnover in the gut. Nat. Microbiol. 1, 16093 (2016).
Tottey, W. et al. Colonic transit time is a driven force of the gut microbiota composition and metabolism: In vitro evidence. J. Neurogastroenterol. Motil. 23, 124–134 (2017).
Adams, J. B., Johansen, L. J., Powell, L. D., Quig, D. & Rubin, R. A. Gastrointestinal flora and gastrointestinal status in children with autism–comparisons to typical children and correlation with autism severity. BMC Gastroenterol. 11, 22 (2011).
Maenner, M. J. et al. Prevalence of autism spectrum disorder among children aged 8 years—autism and developmental disabilities monitoring network, 11 sites, United States, 2016. MMWR Surveill. Summ. 69, 1–12 (2020).
Kim, Y. S., Unno, T., Kim, B. Y. & Park, M. S. Sex differences in gut microbiota. World J. Mens Health 38, 48–60 (2020).
Kohane, I. S. et al. The co-morbidity burden of children and young adults with autism spectrum disorders. PLoS One 7, e33224 (2012).
Buie, T. et al. Evaluation, diagnosis, and treatment of gastrointestinal disorders in individuals with ASDs: A consensus report. Pediatrics 125(Suppl 1), S1-18 (2010).
Coury, D. L. et al. Gastrointestinal conditions in children with autism spectrum disorder: Developing a research agenda. Pediatrics 130, S160–S168 (2012).
Lewis, S. J. & Heaton, K. W. Stool form scale as a useful guide to intestinal transit time. Scand. J. Gastroenterol. 32, 920–924 (1997).
Dan, Z. et al. Altered gut microbial profile is associated with abnormal metabolism activity of Autism Spectrum Disorder. Gut Microbes 11, 1246–1267 (2020).
Song, Y., Liu, C. & Finegold, S. M. Real-time PCR quantitation of clostridia in feces of autistic children. Appl. Environ. Microbiol. 70, 6459–6465 (2004).
Kandeel, W. A. et al. Impact of Clostridium bacteria in children with autism spectrum disorder and their anthropometric measurements. J. Mol. Neurosci. 70, 897–907 (2020).
Dilmore, A. H. et al. The fecal microbiome and metabolome of Pitt Hopkins syndrome, a severe autism spectrum disorder. mSystems 6, e0100621 (2021).
Pequegnat, B. et al. A vaccine and diagnostic target for Clostridium bolteae, an autism-associated bacterium. Vaccine 31, 2787–2790 (2013).
Pequegnat, B. & Monteiro, M. A. Carbohydrate scaffolds for the study of the autism-associated bacterium, Clostridium bolteae. Curr. Med. Chem. 26, 6341–6348 (2019).
Yatsunenko, T. et al. Human gut microbiome viewed across age and geography. Nature 486, 222–227 (2012).
Ghosh, T. S., Das, M., Jeffery, I. B. & O’Toole, P. W. Adjusting for age improves identification of gut microbiome alterations in multiple diseases. Elife 9, 25 (2020).
Zhong, H. et al. Impact of early events and lifestyle on the gut microbiota and metabolic phenotypes in young school-age children. Microbiome 7, 2 (2019).
Hollister, E. B. et al. Structure and function of the healthy pre-adolescent pediatric gut microbiome. Microbiome 3, 36 (2015).
Ringel-Kulka, T. et al. Intestinal microbiota in healthy US young children and adults—a high throughput microarray analysis. PLoS One 8, e64315 (2013).
Yap, C. X. et al. Autism-related dietary preferences mediate autism-gut microbiome associations. Cell 184, 5916-5931.e17 (2021).
Feroe, A. G. et al. Medication use in the management of comorbidities among individuals with autism spectrum disorder from a large nationwide insurance database. JAMA Pediatr. 175, 957–965 (2021).
We would like to thank all researchers who made their data available for this project, and those who responded to requests for additional data/metadata. We thank Myron LaDuc for scientific editing.
This work was supported by Second Genome Inc. and by NIH Small Business Innovation Research funding (Grant number: R44DA043954).
KAW, XY, EMR, BW, JC, RLH, WH, ML, ER, KS, YW, DPW, KD, TZD, and SI were employed by Second Genome, Inc. for the duration of this work. MMD has a financial interest in Second Genome Inc. and is co-owner of Microbiome Engineering Inc. and NeuroBiome, two companies specialized in developing biosensors.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
West, K.A., Yin, X., Rutherford, E.M. et al. Multi-angle meta-analysis of the gut microbiome in Autism Spectrum Disorder: a step toward understanding patient subgroups. Sci Rep 12, 17034 (2022). https://doi.org/10.1038/s41598-022-21327-9