Introduction

Neurodevelopmental disorders are detrimental to lifelong health and quality of life,1,2 and carry both social3,4 and financial impacts.5,6,7 Despite recent advancements in our understanding of the etiology of neurodevelopmental outcomes such as genetics and environmental factors,8,9,10 questions remain unanswered, especially regarding the mechanisms by which known risk factors act on the vulnerable fetal, infant, and early childhood brain.11,12 The most rapid period of brain development begins in the prenatal period and continues through approximately 3 years of age,13 mirroring the development of the gut microbiome (the bacteria, viruses, and fungi that reside in the gastrointestinal tract).14,15 We know that enteric bacteria influence brain function through various bidirectional systems of communication collectively known as the microbiome gut−brain axis.16 This connection between the gut and the brain plays a vital role throughout the life course, but particularly in early life when neural circuitry is still forming.17

The gut microbiome has been associated with several neurological outcomes including autism spectrum disorder (ASD),18,19,20,21,22,23,24 attention-deficit/hyperactivity disorder (ADHD),25,26 anxiety,27 and depression28,29 in adult populations. However, most studies to date have described these relationships in a cross-sectional setting where reverse causation is likely given the behavioral nature of these outcomes (i.e., the behaviors may result in changes to the microbiome, rather than the microbiome affecting behaviors). Additionally, only a few studies have captured the critical window of the infant microbiome that parallels the most rapid period of brain development in the lifespan, including our prior work on ASD-related social behaviors.30,31 The infant microbiome is less stable than that of the adult,14 and early-life exposures influence neurodevelopment and in turn later behavioral outcomes. Furthermore, the prevalence of many of these behavioral outcomes differs by child sex, and the previous epidemiologic studies on the microbiome present aggregate estimates, potentially missing unique associations that exist specifically in either boys or girls.

To address these gaps, we examined the relationships between the infant and early-childhood microbiome and continuous measures of internalizing, externalizing, and social behaviors at 3 years of age, which inform clinical diagnoses of anxiety, depression, ADHD, and autism in preschool-aged children. Specifically, the Behavior Assessment System for Children, second edition (BASC-2) is a dimensional instrument that measures clinical and adaptive behaviors in children and young adults, including scales standardized for use in preschool-aged (2–5 years) populations.32 When children’s scores fall outside the normal range, the BASC-2 can guide the use of further diagnostic instruments to implement treatment plans, including in-school services and medications. In addition to calculating aggregate associations across the population, we explored differences in associations between boys and girls using interaction models.

Methods

Population description

The New Hampshire Birth Cohort Study (NHBCS) is a pregnancy cohort with ongoing recruitment and follow-up. Women were recruited at early pregnancy visits and were eligible if they were 18–45 years of age, used a private water system (e.g., well) at their residence during pregnancy, and did not plan on moving before delivery. In a subset, parents collected a stool sample from their child at 6 weeks, 1 year, and 2 years postpartum as previously described.30 Parents regularly completed interviews and questionnaires pertaining to sociodemographic data, their child’s health, and other relevant covariate information. When their child was approximately 3 years old, parents completed neurobehavioral assessments including the BASC-2 preschool form.32 A participant selection flow chart and a table of exact sample sizes for each analysis are included in Table 1 and the supplement (Supplementary Fig. S1 and Supplementary Table S1). Parents of participants provided written informed consent. All study protocols were approved by the Center for the Protection of Human Subjects at Dartmouth, and all methods were carried out in accordance with relevant guidelines and regulations.

Table 1 Sample sizes for analysis.

Microbiome sequencing

Details of the protocols for microbial DNA extraction from stool samples using the Zymo Fecal DNA extraction kit (Irvine, CA) have been described elsewhere.33,34 Sequencing of the V4-V5 hypervariable region of the 16S rRNA gene (forward primer 518F: CCAGCAGCYGCGGTAAN, pooled reverse primers: 926R1: CCGTCAATTCNTTTRAGT, 926R3: CCGTCAATTTCTTTGAGT, 926R4: CCGTCTATTCCTTTGANT) was performed at Woods Hole Marine Biological Laboratory, MA (MBL) on the Illumina MiSeq platform (San Diego, CA) using established Illumina protocols.35 DADA2 (v1.6) was used to clean and process reads to determine amplicon sequence variants (ASVs).34,36 Bacteria were classified using the SILVA database (v138).37 Additionally, a smaller set of DNA samples from 6-week and 1-year stools underwent shotgun metagenomic sequencing at MBL using the NextSeq Illumina platform. As described elsewhere,30 sequences were annotated to the species level using MetaPhlAn238 and metabolic functional capacity was inferred through the HUMAnN239 algorithm using the MetaCyc database (v19.1).40 Alpha diversity was calculated from 16S rRNA data with the Shannon Index to capture both richness and evenness41 using the vegan package.42 Secondary analyses considered the Simpson Index43 and the count of observed ASVs. Beta diversity was computed using generalized UniFrac (GUniFrac) distances with 16S rRNA relative abundance data.44 Taxa inferred from both 16S rRNA and shotgun metagenomic sequencing data were considered in relation to the outcomes.
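The three alpha diversity measures named above can be illustrated with a minimal Python sketch. This is not the study's code (the analysis used the vegan package in R); the function name and the toy counts are hypothetical, and the sketch assumes the natural-log Shannon Index and the Gini-Simpson form (one minus the sum of squared proportions), which match vegan's defaults.

```python
import math

def alpha_diversity(counts):
    """Return (shannon, simpson, observed) for one sample's ASV read counts.

    Assumptions: natural-log Shannon Index and the Gini-Simpson form
    1 - sum(p^2). ASVs with zero reads are ignored.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    props = [c / total for c in counts if c > 0]
    shannon = -sum(p * math.log(p) for p in props)  # richness and evenness
    simpson = 1.0 - sum(p * p for p in props)       # dominance-weighted
    observed = len(props)                           # richness only
    return shannon, simpson, observed

# Hypothetical toy samples: four ASVs, evenly vs. unevenly distributed.
even = alpha_diversity([25, 25, 25, 25])
skewed = alpha_diversity([97, 1, 1, 1])
```

Both toy samples contain four observed ASVs, but the even sample has the higher Shannon Index, which is why the Shannon Index is described as capturing evenness as well as richness.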

Behavior Assessment System for Children, second edition

Parents assessed their 3-year-old child’s behavioral development using the BASC-2,32 an instrument with a validated preschool version (ages 2–5 years) that captures adaptive and maladaptive behaviors. Internal validity metrics were used to exclude one subject for whom the consistency metric indicated the data should be used with extreme caution (>21).32 This subject would have only contributed to the analysis of the 2-year-old microbiome. A priori, we selected five primary clinical or adaptive scales and one content scale that measure phenotypes that have previously been associated with the microbiome in either animal models or in an epidemiological setting. These included internalizing behaviors (Anxiety and Depression),45 behaviors related to ADHD (Hyperactivity and Attention Problems),46 and behaviors that are related to autism (Social Skills and Developmental Social Disorders).47 To capture broader phenotypes, we included the four BASC-2 composite scales (Internalizing Problems, Externalizing Problems, Behavioral Symptoms Index, and Adaptive Skills). Primary clinical and adaptive scales contribute to the composites and have overlapping items with content scales, meaning that the ten outcomes are not independent (Supplementary Fig. S2). Scores are standardized to a mean of 50 with a standard deviation of 10. For adaptive scales (Social Skills and Adaptive Skills Composite) a higher score indicates better behavior, whereas for the maladaptive scales an increased score indicates worse behavior.

Covariates

Potential covariates were selected a priori based on their potential to confound the association between the microbiome and BASC-2 scales (Supplementary Fig. S3). After examining the univariate relationships of each variable with alpha diversity and BASC-2 scales, we selected gestational age, maternal education (any graduate education or none), child sex, parity (nulliparous or parous), maternal and paternal age, child age at follow-up, delivery mode, early life exclusive breastfeeding (EBF) versus mixed or exclusive formula feeding, maternal smoking during pregnancy, and breastfeeding duration as variables in our final model. As in our previous work,30 we examined models excluding variables for which the microbiome may act as a mediator (delivery mode,48 EBF and breastfeeding duration,49,50,51 and maternal smoking during pregnancy)52 as sensitivity analyses.

Multiple imputation

To maintain sample size and reduce selection bias, we imputed missing covariate data using multivariate imputation by chained equations (mice package in R).53,54 Variables with missing data are indicated in Table 2. We imputed 40 datasets with 20 iterations each, using all other covariates, outcomes, and bacterial alpha diversity as predictors of the missing covariates, as recommended.55 Predictive mean matching was used for continuous variables (paternal age and breastfeeding duration) and logistic regression with bootstrapping was used for binary variables (EBF, delivery mode, parity, maternal education status, and peripartum antibiotic exposure, which was imputed for effect modification models). Where possible in downstream models (i.e., in linear regression models), the mice package was used to run 40 separate models and pool results. For more complex models, a single dataset was created by using the median/mode of the imputed data for each subject. We conducted a sensitivity analysis among subjects with no missing covariate data.
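The two ways the imputed datasets feed into downstream models can be sketched as follows. This is an illustration in Python rather than the study's R code: the function names and numbers are hypothetical, the pooling function assumes Rubin's rules and returns only the pooled estimate and its standard error (not degrees of freedom), and the collapse function assumes the median for continuous variables and the mode for binary ones.

```python
import math
import statistics

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed-data fits (Rubin's rules)."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)                        # pooled estimate
    within = statistics.fmean([se ** 2 for se in std_errors])  # mean sampling variance
    between = statistics.variance(estimates)                   # variance across imputations
    total = within + (1 + 1 / m) * between
    return q_bar, math.sqrt(total)

def collapse_imputations(imputed_values, binary=False):
    """Collapse one subject's m imputed values into a single value."""
    if binary:
        return statistics.mode(imputed_values)   # most frequent category
    return statistics.median(imputed_values)     # middle imputed value

# Hypothetical values, for illustration only.
pooled_est, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
paternal_age = collapse_imputations([30.1, 31.4, 29.8])
delivery_mode = collapse_imputations([1, 1, 0], binary=True)
```

Pooling carries the between-imputation variance into the standard error, whereas a single collapsed dataset treats the imputed values as if they had been observed, so the two approaches do not quantify uncertainty in the same way.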

Table 2 Characteristics of participants included in any analysis [mean ± SD (% missing) or n (%)].

Models

Linear regression was used to determine the association between alpha diversity indices and BASC-2 scores. Parallel regressions were run on each of the imputed datasets, and the estimates and their errors were then combined using the pool() function of the mice R package.54 For all other models, a single dataset of the median/mode of imputed variables was used. To examine bacterial community structure related to each of the BASC-2 scales (beta diversity), we modeled GUniFrac distances as the outcome in adonis2 models with 10,000 permutations (vegan package in R).42 Marginal p values are reported. For both alpha and beta diversity analyses, a nominal p < 0.05 was considered statistically significant. MaAsLin2 (v1.6) was used to determine the relationships between bacterial ASVs (16S), species (metagenomic sequences), metabolic pathways, and BASC-2 scores.56,57 The minimum abundance and prevalence parameters were set so that a feature had to have at least 0.0001% abundance in at least 10% of samples to be assessed. Otherwise, default parameters were used (i.e., minimum variance: 0, normalization method: total sum-scaling, transformation: log, analysis method: linear, standardized metadata). The Benjamini−Hochberg procedure to adjust for multiple comparisons was applied to all ASVs/species/pathways in a given set of models, and a false discovery rate (FDR) of 10% (q < 0.1) was considered statistically significant for ASVs and species.58 Because the functional pathway relative abundance analysis is more exploratory, a nominal p value of 0.01 was applied as a cut-off. Features that were associated with Social Responsiveness Scale, second edition (SRS-2) scores in our previous analysis were examined for similar associations with autism-related BASC-2 scales.30 MaAsLin2 models feature relative abundance as the outcome. Because this is the reverse of the hypothesized direction, in which the microbiome influences BASC-2 scores, the resulting estimates require careful interpretation.
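The Benjamini−Hochberg adjustment described above can be sketched in a few lines. This is a generic Python illustration of the procedure, not the MaAsLin2 implementation; the function name and the p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q values in the original order of p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so q values stay monotone in rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical p values for four features tested in one set of models.
p = [0.001, 0.02, 0.03, 0.5]
q = benjamini_hochberg(p)
significant = [qi < 0.1 for qi in q]  # 10% FDR threshold, as used for ASVs and species
```

Because each q value depends on the number of features tested together, the same nominal p value can pass the 10% threshold in a small set of models and fail it in a larger one.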

Our main analysis examined the associations of the microbiome with BASC-2 scores in the overall population, but we hypothesized that the microbiome may have differential effects on BASC-2 scores in certain subgroups, specifically depending on child sex, which is relevant in the context of neurodevelopment and can alter susceptibility to exposures.59,60,61 To explore this possibility, we fit all models described above with an interaction between sex and the exposure of interest (microbiome measurement or BASC-2 in the cases where the microbiome was modeled as the outcome) to estimate sex-specific effects and interaction p values (pinteraction). Similarly, in secondary analyses, we explored the possibility that certain sensitive or healthy populations respond differently to perturbations in the microbiome. These analyses included models with interactions for peripartum antibiotic exposure and EBF status, as well as models conducted among only infants who were breastfed for at least 6 months and those who were vaginally delivered. Because these strata are the majority of our population, it was not feasible to conduct parallel analyses in the other strata (i.e., Caesarean delivered infants or those breastfed less than 6 months). All analyses were conducted in R 4.0.2 62 and RStudio 1.3.959.63
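One way to see how sex-specific estimates follow from an interaction model is the sketch below. It assumes a main-effect-plus-product-term coding (score = b0 + b1·exposure + b2·sex + b3·exposure·sex), which is an assumption here and may differ from the exact parameterization used in the study; the function name and all numbers are hypothetical and are not study results.

```python
import math

def sex_specific_slopes(b_exposure, b_interaction,
                        var_exposure, var_interaction, cov):
    """Slopes and standard errors for the reference sex and the other sex.

    Assumes the fitted model score = b0 + b1*x + b2*sex + b3*x*sex, where
    sex = 0 for the reference group. cov is the covariance between the
    estimates of b1 and b3.
    """
    slope_ref = b_exposure                      # slope when sex = 0
    se_ref = math.sqrt(var_exposure)
    slope_other = b_exposure + b_interaction    # slope when sex = 1
    se_other = math.sqrt(var_exposure + var_interaction + 2 * cov)
    return (slope_ref, se_ref), (slope_other, se_other)

# Hypothetical coefficients and (co)variances, for illustration only.
ref, other = sex_specific_slopes(2.0, -3.0, 1.0, 2.0, -1.0)
```

Under this coding, the interaction p value tests whether b3 differs from zero, that is, whether the two slopes differ from each other, which is a separate question from whether either slope differs from zero.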

Results

Population characteristics

Of the 260 infants in our analysis, the majority were born by a vaginal delivery (68.5%), roughly half were exclusively breastfed at 6 weeks postpartum, and the average duration of any breastfeeding was close to 1 year (mean ± SD, 11 ± 8.5 months; Table 2). Approximately 95% of mothers were non-smokers during their pregnancy and were an average of 32.2 years old at delivery. When their children were approximately 3 years of age (mean ± SD, 3.2 ± 0.3 years), parents of children in the New Hampshire Birth Cohort Study (NHBCS) completed the BASC-2. From the full range of scales, we selected ten primary clinical, adaptive, and composite scales for investigation that we hypothesized may be related to the microbiome (Supplementary Fig. S4). These captured both maladaptive (higher score is worse, Anxiety, Depression, Internalizing Problems, Attention Problems, Hyperactivity, Behavioral Symptoms Index, Externalizing Problems, Developmental Social Disorders) and adaptive (higher score is better, Social Skills and Adaptive Skills) behaviors. NHBCS participants were rated slightly better than the normative population (50 ± 10), with slightly lower scores on maladaptive scales and higher scores on adaptive behavioral scales. Notably, this was not the case with the Anxiety scale, where children’s scores were similar to those of the normative population. On average, boys were rated worse than girls on most scales with the exception of the three scales related to internalizing behaviors (Anxiety, Depression, and the composite Internalizing Problems; Table 2). As expected, BASC-2 scales were correlated, particularly composite scales and those that contribute to that composite (Supplementary Figs. S2, S4). 
BASC-2 scores were also correlated with scores on the Social Responsiveness Scale, second edition (SRS-2), a separate instrument that measures autism-related social behaviors and was previously examined in relation to the microbiome in this cohort.30 Boys and girls were similar with respect to covariate information, but fathers of girls were marginally older than fathers of boys. Although boys and girls were equally likely to be exposed to antibiotics in the peripartum period, boys were more likely to receive antibiotics within the first year of life (Supplementary Table S2). As expected, boys were heavier than girls at birth, but were similar according to World Health Organization sex- and gestational age-specific expectations (z-scores) for birthweight.

Higher alpha diversity related to internalizing behaviors among boys

Within-subject bacterial diversity increased with age in both boys and girls, with no notable differences between the sexes (Supplementary Table S3). Increased Shannon diversity at 6 weeks was associated with lower Depression scores, particularly among boys [βBoys = −1.81 points/std increase Shannon diversity, 95% CI: (−3.22, −0.41), pBoys = 0.01, βGirls = −0.72, 95% CI: (−2.34, 0.89), pGirls = 0.38, pinteraction = 0.346; Fig. 1 and Supplementary Table S4]. The relationship between increased early-life diversity and fewer internalizing behaviors among boys was also captured with the Anxiety scale and Internalizing Problems composite scale (Fig. 1 and Supplementary Tables S5, S6). Six-week alpha diversity was not significantly associated with any other BASC-2 scales, nor were there significant differences between boys and girls (Supplementary Tables S7–S13). Secondary interaction models found the relationship between early-life alpha diversity and depression was consistent across most populations (Supplementary Table S4). Alpha diversity at 1 year was not associated with 3-year-old BASC-2 scores (Fig. 1 and Supplementary Tables S4–S13).

Fig. 1: Effect estimates and 95% confidence intervals for a standard deviation increase in Shannon Index on BASC-2 scores (SDSixWeeks = 0.54, SDOneYear = 0.55, SDTwoYears = 0.46).

Sex-specific estimates derive from models with sex-specific variables for Shannon z-scores. Each outcome was considered in a separate model adjusting for the main effect child sex, gestational age, maternal education, parity, delivery mode, maternal and paternal age at delivery, maternal smoking during pregnancy, early life exclusive breastfeeding, duration of any breastfeeding, and child age at follow-up. *indicates an estimate is statistically significant (p < 0.05), brackets highlight where the sex-specific estimates are different from one another (p < 0.1), with the exact p value of the interaction term in text near the bracket.

Divergent associations at 2 years by subgroups according to sex and perinatal antibiotic exposure

At 2 years, increased alpha diversity was associated with better Developmental Social Disorders, Social Skills, and Adaptive Skills Composite scores among boys, but worse scores on these same scales among girls [e.g., βBoys,SocialSkills = 1.66 points/std increase Shannon diversity, 95% CI: (0.11, 3.2), βGirls,SocialSkills = −2.43, 95% CI: (−4.54, −0.32), pinteraction = 0.003; Fig. 1 and Supplementary Table S13]. In subgroup analyses, children exposed to peripartum antibiotics performed better on these scales when their Shannon Index was higher at 2 years (Supplementary Tables S11–S13). In contrast, children who were not exposed to antibiotics in the peripartum period had worse Behavioral Symptoms Index scores with increased alpha diversity [β = 1.62 points/std increase Shannon diversity, 95% CI: (0.04, 3.19), p = 0.05; Supplementary Table S8]; a similar pattern was observed on the Depression scale [β = 1.91 points/std increase Shannon diversity, 95% CI: (0.11, 3.72), p = 0.04; Supplementary Table S4].

No overall association between bacterial community structure and BASC-2 scores

Overall, beta diversity was unrelated to any BASC-2 scale in fully adjusted models. In models excluding factors for which the microbiome may mediate an association, 6-week beta diversity was associated with Attention Problems scores (R² = 0.013, p = 0.01, Supplementary Table S14). Among boys, significant differences in beta diversity occurred with Anxiety scores (R² = 0.012, p = 0.019). At 1 year, community structure was associated with Attention Problems and Adaptive Skills Composite scores, but only among those exposed to peripartum antibiotics. In models excluding factors for which the microbiome may mediate an association, we also found associations with Developmental Social Disorders (R² = 0.011, p = 0.023), Hyperactivity (R² = 0.009, p = 0.048), and Adaptive Skills Composite (R² = 0.01, p = 0.028) scores among boys. We found few associations between beta diversity at 2 years and BASC-2 scores in our main models, although we observed differences by Attention Problems scores among those who were exclusively breastfed in early life (R² = 0.012, p = 0.047). Results were similar among complete cases (Supplementary Table S15). Infant sex was not associated with beta diversity at any age (Supplementary Fig. S5). However, early life exclusive breastfeeding was associated with beta diversity at all ages, and paternal and maternal age were associated with 2-year-old beta diversity.

Key taxa (16S) relate to outcomes in a sex-specific manner

The relative abundances of several taxa at 6 weeks were associated with Adaptive Skills Composite scores in a sex-specific manner (FDR q < 0.1, Fig. 2 and Supplementary Table S16). Notably, this included Bifidobacterium, one of the most abundant taxa in the infant gut (Supplementary Table S17), for which a single point increase in Adaptive Skills Composite score was associated with a 1.91% increase in relative abundance among boys [95% CI: (0.7, 3.13), q = 0.08], but had no association among girls [−1.54 %/point, 95% CI: (−3.14, 0.06), q = 0.89]. Because bacterial abundance was modeled as the outcome, the units for the estimates imply that behaviors influence the microbiome. However, the temporality of data collection suggests the reverse (i.e., microbiome influences behavior). Similarly, Bacteroides vulgatus and a Streptococcus taxon at 6 weeks related to better Adaptive Skills Composite scores among boys, but not among girls. In contrast, three ASVs in the Klebsiella, Clostridium, and Haemophilus genera at 6 weeks related to worse Adaptive Skills Composite scores among boys. Tyzzerella nexilis was associated with better Depression scores among boys [−0.89 %/point, 95% CI: (−1.4, −0.38), q = 0.07]. At 1 year, the relative abundance of Faecalitalea was associated with worse Hyperactivity scores, with no discernible sex-specific effects [0.16 %/point, 95% CI: (0.08, 0.23), q = 0.05]. At 2 years, the relative abundance of four Blautia ASVs was associated with worse measures of hyperactivity. These associations were stronger among girls, but q values did not meet the FDR threshold after imputing missing covariates. None of these ASVs was prevalent enough at earlier time points to be examined.
ASVs that were previously associated with SRS-2 scores (1-year Blautia producta and an unknown Lachnospiraceae; 2-year Coprococcus, Ruminococcus gnavus, Bifidobacterium, and Sutterella)30 mostly had the same direction of association with autism-related BASC-2 scores, and the association between the Coprococcus ASV relative abundance and worse Social Skills scores did reach nominal significance (−0.18%/point, p = 0.007, Supplementary Table S18).

Fig. 2: Effect estimates and 95% confidence intervals for the difference in percent relative abundance per point increase on the given BASC-2 scale.

Sex-specific estimates derive from models with sex-specific variables for BASC-2 scores adjusting for the main effect child sex, gestational age, maternal education, parity, delivery mode, maternal and paternal age at delivery, maternal smoking during pregnancy, early life exclusive breastfeeding, duration of any breastfeeding, and child age at follow-up. * indicates FDR q < 0.05; indicates FDR < 0.1.

Metagenomic species related to BASC-2 scores

Metagenomic sequencing allowed for more precise annotation of bacterial species associated with BASC-2 scores. While no bacterial taxa were strongly associated with any BASC-2 scales, we identified a weak association between Eggerthella lenta at 6 weeks and Depression (Fig. 3 and Supplementary Table S19). Additionally, we found evidence of sex-specific associations between several other bacteria at 6 weeks and BASC-2 scores. Specifically, Klebsiella oxytoca was adversely associated with Adaptive Skills Composite and Developmental Social Disorders scores among boys (−2.98 %/point Adaptive Skills and 2.57 %/point Developmental Social Disorders), but showed no evident relationship among girls (pinteraction = 0.052 and 0.023 for Adaptive Skills and Developmental Social Disorders, respectively). In contrast, Granulicatella was associated with worse Anxiety scores among girls (0.58 %/point), but not boys (pinteraction = 0.257). Similar to the findings at 6 weeks, no strong associations were observed at 1 year. Weak adverse associations were observed between Anaerofustis stercorihominis and Adaptive Skills Composite scores (−0.31 %/point) and between Lachnospiraceae bacterium 1_1_57FAA and Externalizing Problems scores (0.21 %/point). Streptococcus peroris relative abundance at 1 year was associated with lower Depression and Internalizing Problems scores, but only among girls [−1.38 %/point Depression, pinteraction = 0.098; −1.16 %/point Internalizing Problems, pinteraction = 0.169; Fig. 3 and Supplementary Table S19].

Fig. 3: Associations between select bacterial taxa (shotgun metagenomics) and BASC-2 scores.

Estimates, standard errors, p values, and FDR q values derive from MaAsLin2 models where the species relative abundance is modeled as the outcome and the given BASC-2 is treated as the exposure. Sex-specific estimates derive from models with sex-specific variables for BASC-2 scores. All models adjust for the main effect child sex, gestational age, maternal education, parity, delivery mode, maternal and paternal age at delivery, maternal smoking during pregnancy, early life exclusive breastfeeding, duration of any breastfeeding, and child age at follow-up.

Bacteria that were previously identified as being related to worse SRS-2 scores (6-week Flavonifractor plautii and 1-year Adlercreutzia equolifaciens, Ruminococcus torques, Eubacterium dolichum, and Lachnospiraceae bacterium 6_1_63FAA)30 were found to have similar relationships (i.e., in the same direction) with the relevant BASC-2 scales (Social Skills and Developmental Social Disorders). Although the effect estimates were further from the null, they were not statistically significant in the context of multiple testing (Supplementary Fig. S6). The relationship between Adlercreutzia equolifaciens and Social Skills scores was nominally significant [−0.17 %/point, 95% CI: (−0.33, −0.01), p = 0.04, q = 0.588], and this relationship was found primarily among boys, although the interaction term was not statistically significant [−0.98 %/point, 95% CI: (−1.72, −0.24), p = 0.01, pinteraction = 0.187]. Similarly, the association between Lachnospiraceae bacterium 6_1_63FAA and Developmental Social Disorders scores was only significant among boys [0.9 %/point, 95% CI: (0.14, 1.67), p = 0.02] and not girls [−0.03 %/point, 95% CI: (−0.91, 0.85), p = 0.94, pinteraction = 0.64].

Bacterial functional pathways related to BASC-2 scores

The relative abundance of several metabolic pathways at 6 weeks and 1 year was associated with BASC-2 scales (Fig. 4). At 6 weeks, increased relative abundance of PWY-4981 [L-proline biosynthesis II (from arginine)] and PWY-7399 (methylphosphonate degradation II) genes was associated with better scores on the Hyperactivity scale and on the two composite scales to which it contributes: Behavioral Symptoms Index and Externalizing Problems (Supplementary Table S20). PWY-5910 [superpathway of geranylgeranyldiphosphate biosynthesis I (via mevalonate)] and PWY-7560 (methylerythritol phosphate pathway II) were also associated with better Externalizing Problems scores. The NAD salvage pathway (PNC VI cycle) was associated with worse Internalizing Problems scores, whereas catechol degradation I (meta-cleavage pathway) was associated with lower Attention Problems scores.

Fig. 4: Volcano plots of associations between metabolic functional pathways and BASC-2 scores.

Estimates, standard errors, p values, and FDR q values derive from MaAsLin2 models where the pathway relative abundance is modeled as the outcome and the given BASC-2 is treated as the exposure. All models adjust for the main effect child sex, gestational age, maternal education, parity, delivery mode, maternal and paternal age at delivery, maternal smoking during pregnancy, early life exclusive breastfeeding, duration of any breastfeeding, and child age at follow-up. ASPASN.PWY: superpathway of L-aspartate and L-asparagine biosynthesis, PYRIDNUCSAL.PWY: NAD salvage pathway I (PNC VI cycle), PWY.5415: catechol degradation I (meta-cleavage pathway), DENOVOPURINE2.PWY: superpathway of purine nucleotides de novo biosynthesis II, PWY.4981: L-proline biosynthesis II (from arginine), PWY.7399: methylphosphonate degradation II, PWY4FS.7: phosphatidylglycerol biosynthesis I (plastidic), PWY4FS.8: phosphatidylglycerol biosynthesis I (non-plastidic), PWY.7111: pyruvate fermentation to isobutanol (engineered), PWY.5910: superpathway of geranylgeranyldiphosphate biosynthesis I (via mevalonate), PWY.7580: phycoerythrobilin biosynthesis II, PWY.7187: pyrimidine deoxyribonucleotides de novo biosynthesis II, 7ALPHADEHYDROX.PWY: bile acid 7α-dehydroxylation, PWY.922: mevalonate pathway I (eukaryotes and bacteria).

At 1 year, the superpathway of L-aspartate and L-asparagine biosynthesis was associated with better Depression scores and better scores on the Behavioral Symptoms Index (to which the Depression scale contributes), as well as better scores on the Externalizing Problems and Adaptive Skills Composite scales. The superpathway of purine nucleotides de novo biosynthesis II was associated with better Attention Problems, Developmental Social Disorders, and Social Skills scores, and de novo pyrimidine biosynthesis was also associated with better Developmental Social Disorders scores. PWY-7111, an engineered pathway of pyruvate fermentation, was associated with worse Externalizing Problems and Behavioral Symptoms Index scores. Both plastidic and non-plastidic phosphatidylglycerol biosynthesis were associated with better Behavioral Symptoms Index scores. Bile acid 7α-dehydroxylation was associated with worse Developmental Social Disorders and Adaptive Skills Composite scores. Finally, a mevalonate pathway was associated with better Social Skills scores.

Different pathways were associated with BASC-2 scores among boys and girls (Supplementary Figs. S7, S8). Generally, pathways that were significantly associated in one sex had null associations in the other, as opposed to being significantly associated in the opposite direction. More associations were observed among girls with 1-year functional capacity compared to 6 weeks, and with the Social Skills scale compared to other BASC-2 scales. In contrast, there were more associations among boys at 6 weeks compared to 1 year, and with the Internalizing and Externalizing Problems scales. However, the Depression scores were also associated with the relative abundance of several microbial functional pathways at 1 year among boys, including PWY0-845 (superpathway of pyridoxal 5ʹ-phosphate biosynthesis and salvage), the superpathway of L-aspartate and L-asparagine biosynthesis, and L-ornithine biosynthesis II. Notably, several pathways whose relative abundances were associated with better Depression scores among boys related to vitamin B6 biosynthesis or salvage (PWY0-845 at 1 year, PYRIDOXSYN-PWY at 1 year, and PWY-7204 at 6 weeks). A high correlation between PWY0-845 and PYRIDOXSYN-PWY may be responsible for their concordant results, but PWY-7204 is not correlated with either of the other pathways (Supplementary Fig. S9).

Discussion

This prospective study of the concurrent development of the microbiome and brain function, as measured by later behaviors, is one of the first to be conducted at this young age. We found that at the youngest ages, higher diversity was related to better internalizing behaviors at 3 years of age, especially among boys. Sex-specific differences in behavioral sensitivities to Shannon diversity were larger with microbiome samples at older ages but were also apparent with 6-week microbiome samples. Our analysis of beta diversity found that overall community structure was largely unrelated to BASC-2 scores, with the notable exception of Attention Problems scores in certain models. However, there was some indication that boys may be more sensitive at the beta-diversity level. In taxon-level analyses, using both 16S rRNA data and shotgun metagenomics data, we found time- and sex-specific effects. Similarly, in exploring the functional capacity of the gut microbiome, we found time- and sex-specific effects with biologically relevant gene pathways.

Our findings regarding the sensitivity of the development of specific internalizing behavioral outcomes, such as anxiety and depression, to early differences in gut bacterial diversity are supported by some prior studies of these outcomes in other populations64,65,66,67 and animal models.68 However, it was not expected that this would be detectable as early as 6 weeks, and our findings suggest there may be an early opportunity to intervene upon asymptomatic at-risk infants to lower the burden of these disorders in early childhood, adolescence, and young adulthood. In addition to the relationship between diversity and internalizing behaviors, specific bacteria were associated with BASC-2 scores. To date, sex-specific differences have not been examined or noted in early life, whereas we found the relative abundance of a Granulicatella species at 6 weeks was associated with worse Anxiety scores among girls. Conversely, at 1 year, Streptococcus peroris was associated with better Internalizing Problems scores and specifically scores on the Depression scale, but only among girls. These findings may indicate a potential window of vulnerability for female neurodevelopment related to internalizing behaviors or may be due to chance. Additional research is required to confirm this finding.

While some epidemiological studies have compared the microbiomes of children with ADHD to those of controls, few large-scale longitudinal epidemiological studies have considered ADHD symptoms such as inattention and hyperactivity.69 Although we did not observe large shifts in beta diversity related to most BASC-2 outcomes, there was an association between Hyperactivity scores and GUniFrac distances at 6 weeks in minimally adjusted models. This is likely due to a strong association between Hyperactivity scores and maternal smoking during pregnancy, which was not included in the minimal model, but more research is needed into whether the microbiome, which is also affected by smoke exposure,52 could mediate this association. Our analysis of bacteria sequenced by 16S also uncovered novel associations between several Blautia ASVs and worse Hyperactivity scores. To date, no other studies have described a similar relationship between Blautia and ADHD symptoms. Some Blautia species have been linked to conditions like depression,70 whereas in other studies some anti-inflammatory Blautia are reduced in subjects with ASD.71,72,73 The lack of prior evidence related to hyperactivity may reflect the small number of prospective studies of ADHD symptoms in young pediatric populations.

Shotgun metagenomic sequencing allows for annotation of genes across the genome, which permits an inference of the metabolic functional capacity of the bacteria in the gut. In this study, we found several associations between functional pathways and BASC-2 measures of externalizing behaviors that are biologically relevant. For example, proline, biosynthesis of which was associated with better Hyperactivity, Externalizing Problems, and Behavioral Symptoms Index scores, has been found to be lower among unmedicated subjects with ADHD, albeit not statistically significantly.74 Another example is that catechol degradation was associated with better Attention Problems scores. Catechol is a toxic compound that naturally occurs in some fruits and vegetables, but is also synthesized for use as a pesticide.75 Interestingly, studies have found that SNPs in the human catechol-O-methyltransferase gene are linked to the diagnosis and severity of ADHD symptoms, anxiety, and depression.76,77,78,79 It is of note that the scales that had the most associations with bacterial functional pathway relative abundance were related to the Externalizing Problems scale. More research is needed to determine whether the neural mechanisms that underlie these behaviors are especially vulnerable to microbial influences at this early-life stage.

We found that boys were more sensitive to early-life (i.e., 6 weeks of age) differences in their microbiomes as evaluated with 16S sequencing, with several bacteria, including Bifidobacterium, a common beneficial genus, associated with better Adaptive Skills Composite scores.80 Notably, this included a positive association between a Klebsiella ASV and Adaptive Skills Composite scores that was supported by a similar association of Klebsiella oxytoca and Adaptive Skills Composite scores among boys in our metagenomics analysis. This commensal species has been reported to be depleted in patients with ASD,81 although not consistently across studies.18,19,20,21,22,23,24 The Adaptive Skills Composite score includes the Social Skills scale, which is relevant to autistic behaviors. In our prior work with an autism-specific outcome (SRS-2), we did not find strong associations between bacterial diversity and autism-related behaviors,30 whereas here we found relationships between the 2-year-old microbiome and autism-related BASC-2 scales (i.e., Developmental Social Disorders, Social Skills, and Adaptive Skills Composite) when examining sex-dependent effects not explored previously. Notably, these were the scales on which we observed the greatest difference between boys and girls, with boys performing significantly worse. It is possible that early manifestations of socially impaired behaviors related to ASD affect the 2-year-old microbiome (i.e., there is reverse causation). Nevertheless, more clinical studies are required to identify when therapies targeting the microbiome may be most effective for ASD, and who may respond best to such interventions.

Our novel finding of sex-specific associations in the relationship between the microbiome and BASC-2 scores highlights the importance of considering populations that may be more sensitive to alterations in the microbiome and host characteristics that may modulate the relationships between gut bacteria and human health. There is strong evidence that microglial abundance, morphology, and gene expression differ between boys and girls in the neonatal period,61 suggesting that the microbiome, which interacts with microglia,82 could have sex-specific effects. We also observed several differences in the relationships between bacterial diversity, relative abundance, and functional capacity and BASC-2 scores between boys and girls. These could result from underlying differences in BASC-2 score distributions between boys and girls, from sex differences in the timing and course of neurodevelopmental maturation, or from chance due to small sample sizes contributing to sex-specific estimates. However, at least some of the sex-specific findings are supported by biological evidence. For example, three pathways related to vitamin B6 biosynthesis and salvage at either 6 weeks or 1 year were associated with better Depression scores among boys. Vitamin B6 is essential in the synthesis of several neurotransmitters,83 and low dietary intake of B vitamins has been linked to adolescent depression.84 Additional research is required to understand the interplay between host sex, early-life diet, microbiome, and later depressive symptoms.

Several studies have examined the relationship between the gut microbiome and psychobehavioral outcomes in a case-control setting in children,18,19,20,21,22,23,24,25,26,85,86 in adults,27,87,88 or in specific contexts (e.g., comorbid inflammatory bowel disease and internalizing behaviors).89 However, many such outcomes develop at an early age and are best managed through early interventions.90,91,92,93,94 The neonatal and early pediatric period may be particularly important for hypothalamic−pituitary−adrenal system stress response development due to rapid, synchronous development of the microbiome and brain circuitry.95 Notably, the landmark study describing the importance of the early-life microbiome in anxiety-related behaviors compared germ-free and specific pathogen-free mice, a contrast that more closely mimics a change in alpha diversity than differences in specific taxa, in agreement with our findings for internalizing behaviors.96 The variation in our results depending on microbiome sampling age may reflect differences in the bacteria present in the infant gut as it matures over the first 2 years of life or could reflect different windows of susceptibility in neurodevelopment related to the relevant BASC-2 scales.

Differences between the 16S rRNA and metagenomic sequencing results may be explained by several methodological factors. First, 16S rRNA sequencing relies on PCR amplification of a hypervariable region of a highly conserved gene. The primers used likely create some bias in the relative abundances of observed taxa depending on their alignment to the primers.97 Additionally, the reference databases used to infer taxa identities are different for the two methods. For whole-genome sequencing, we used MetaPhlAn2, which infers taxonomy based on unique clade-specific marker genes and has been considered conservative compared to many other available tools,98,99 likely contributing to differences. Finally, participants in the 16S rRNA and metagenomic analyses overlap but are not identical. Given the unique composition of individuals’ microbiomes, results may be subject to population differences. Thus, replication studies are essential to confirm our findings.

Our study is limited by its observational nature and limited sample size, particularly for our population-specific estimates. However, ours is one of the largest studies to prospectively examine the connection between the gut microbiome and neurodevelopment and suggests several novel associations that can be explored in mechanistic models. Available microbiome methods preclude efficient modeling of bacterial species and functions as the exposure, and thus our results should be interpreted carefully and not causally. While we adjusted for many covariates, we did not include dietary factors except for breastfeeding. Although diet has been shown to be important to the adult microbiome, little is known about what nutritional components can alter the infant/toddler microbiome.100 Thus, more research is necessary in this arena to investigate the role of diet as a confounder in the association between the early-life microbiome and neurodevelopment. Further, because antibiotic usage was not associated with our outcomes, it was not included in our analysis. Our secondary analysis calculating population-specific estimates for those exposed and unexposed to peripartum antibiotics accounted for the possibility that exposure to peripartum antibiotics may prime certain individuals to be more sensitive to differences in their microbiomes.101 NHBCS participants have robust neurodevelopment and are rated better than the normative population on most BASC-2 scales, potentially decreasing the sensitivity of our analysis for identifying correlates of adverse behavioral development. Lastly, our assessment of neurobehavioral outcomes at an early age may not fully capture later adverse behaviors.

This is one of the first prospective studies to examine the relationship between the microbiome and a broad range of behavioral outcomes. Given the potential for behaviors to affect the microbiome (i.e., reverse causation), our ability to establish the temporality of the relationship is particularly important. Additionally, our novel use of multiple imputation to increase sample size and reduce selection bias improves the precision and accuracy of our estimates. We conducted our analysis in healthy individuals who reflect the general population in the geographic region from which they derive, rather than a clinical population, increasing the generalizability of our findings. Notably, while this population performed better on many of the BASC-2 scales than the normative population, they were similar with respect to internalizing behaviors (e.g., anxiety and depression), which are of growing concern in pediatric populations.102,103 Finally, our extensive analysis of modifiers of the relationship between the microbiome and behavior uncovered important differences depending on host sex.

Our findings highlight the importance of prospective data and consideration of host characteristics in studies relating the microbiome to health outcomes. Specifically, host sex may modify the relationship between bacteria and neurodevelopment, even during windows when differences in gonadal steroid production do not occur (1 and 2 years). Future studies should consider sex differences in results. The results of this research suggest the early-life microbiome is relevant to behavioral outcomes. Interventional studies could clarify the causality of this relationship.