Introduction

Breastmilk is considered an ideal source of nutrition. Benefits of breastfeeding include protection against allergies1, better immune system development2, lower risk of childhood obesity3,4, and optimal brain development5. For these reasons, the World Health Organization and the United Nations International Children’s Emergency Fund recommend exclusive breastfeeding for the first six months of life6. Despite this recommendation, exclusive breastfeeding may not always be possible or suitable for all mothers and infants. In the United States, only 62.6% of infants are exclusively breastfed immediately following birth. This rate drops to 24.9% by 6-months of age7. While infant formula is designed to provide all the necessary nutrients for infant growth and development, it has been linked with infant hospitalizations and infections8, childhood obesity9, and lowered levels of docosahexaenoic acid (DHA), an important fatty acid related to brain development10,11.

Breastmilk is hypothesized to improve infant health in part through beneficial impacts on the developing gut microbiome and fecal metabolome12,13,14,15,16. Around 6-months of age, infants begin to consume small amounts of complementary solid foods, which can further impact the gut microbiome and fecal metabolome17,18,19. The fecal metabolome can serve as a functional readout of gut bacteria20, which can be dynamically impacted by early life feeding patterns, and it can influence infant health through the diffusion of metabolites into circulation21,22,23. Thus, characterizing the infant fecal metabolome and the individual metabolites within may provide important mechanistic insights regarding the association of early-life nutrition with developmental outcomes later in life5,24,25,26,27,28,29,30. For example, growing evidence suggests that the infant gut microbiome31,32,33, fecal metabolome34,35, and prolonged breastfeeding36,37 are each associated with improved neurodevelopmental outcomes. This is important because the brain grows to 80–90% of its adult volume in the first two years of life, establishing the structural foundations of cognitive and motor development38,39,40. Prior studies have nevertheless not examined how dietary patterns are associated with characteristics of the infant fecal metabolome and brain development in early life.

Previous studies have shown that the infant fecal metabolome changes based on infant age41, delivery mode42, and antibiotic usage43,44. The fecal metabolome also clearly depends on whether infants are exclusively breastfed or exclusively formula fed45,46,47,48. As complementary solid foods are introduced, fecal metabolomic profiles begin to converge and the metabolome of breastfed infants is indistinguishable from that of formula fed infants by 1 year of age46,47. Yet no studies have examined how levels of mixed feeding, which is a far more common feeding choice, impact the infant fecal metabolome in the first 6 months of life. Therefore, we sought to determine if mixed feeding was associated with alterations in the infant fecal metabolome in 112 Latino infants from the Southern California Mother’s Milk Study. Our primary aim was to determine if infant feeding patterns, including varying proportions of breastmilk or formula, were associated with alterations in the infant fecal metabolome at 1- and 6-months of age. At 6-months of age, we then sought to determine whether the infant fecal metabolome differed among infants who received solid foods in addition to either breastmilk or formula. As a secondary aim, we explored whether feeding-associated fecal metabolites were associated with neurodevelopmental outcomes at 2 years of age.

Results

Study population characteristics

This study examined 112 Latino mother-infant pairs; general population characteristics are shown in Table 1. At the 1-month postpartum visit, average maternal age was 29.0 ± 6.3 years old (18–35), and average maternal pre-pregnancy body mass index was 28.5 ± 5.8 kg/m2. Most families were of a lower socioeconomic status (SES), with an average Hollingshead Index of 26.5 ± 12.0. Average infant age in days at each fecal metabolome assessment was 32.5 ± 3.4 days at 1-month and 185.9 ± 8.1 days at 6-months. Roughly half of the infants were female (53.6%), most were born vaginally (72.3%), and 10.7% received antibiotics, based on self-reported questionnaires. The average age of introduction to solid foods was 5.9 ± 1.7 months for all infants (Supplementary Table 1) and 5.4 ± 0.7 months for only infants included in 6-month analyses.

Table 1 Characteristics of mother-infant dyads from the Southern California Mother’s Milk Study, 2016–2019.

The infant fecal metabolome is associated with infant feeding patterns

Overall, we were able to confirm the chemical identities of 143 unique metabolites from HILIC chromatography and 104 metabolites from C18 chromatography with Level 1 evidence (i.e., features whose m/z and retention time could be matched to authentic standards with MS/MS under identical conditions). Among these confirmed metabolites, many were conserved across feeding groups at 1-month (exclusively breastfed for 100% of feedings, breastfed >50% of feedings, or formula fed ≥50% of feedings) and 6-months (majority breastfed with complementary solid foods, or majority formula fed with complementary solid foods). For example, 106/143 metabolites in the HILIC column were observed in at least 50% of samples within each feeding group, as were 78/104 metabolites in the C18 column at 1-month (Fig. 1). Similarly, 124/143 metabolites in the HILIC column were observed in at least 50% of samples within each feeding group, as were 85/104 metabolites in the C18 column at 6-months.
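The "observed in at least 50% of samples within each feeding group" criterion can be illustrated with a minimal sketch. The function name, the data layout, and the convention that a value of None or 0 means "not detected" are assumptions made for illustration and are not taken from the study's pipeline.

```python
def conserved_metabolites(intensities, groups, min_fraction=0.5):
    """Names of metabolites detected in >= min_fraction of samples in EVERY group.

    intensities: dict mapping metabolite name -> list of per-sample values,
                 where None or 0 is treated as "not detected" (an assumption).
    groups:      list of feeding-group labels, one per sample.
    """
    labels = sorted(set(groups))
    kept = []
    for name, values in intensities.items():
        ok = True
        for label in labels:
            in_group = [v for v, g in zip(values, groups) if g == label]
            n_detected = sum(1 for v in in_group if v is not None and v > 0)
            if n_detected < min_fraction * len(in_group):
                ok = False
                break
        if ok:
            kept.append(name)
    return kept


# Hypothetical toy data: four samples, two feeding groups.
toy_groups = ["breastfed", "breastfed", "formula", "formula"]
toy_intensities = {
    "kynurenine": [5.0, 3.0, 0.0, 2.0],  # detected in 2/2 and 1/2 samples
    "laurate": [0.0, None, 4.0, 6.0],    # detected in 0/2 and 2/2 samples
}
toy_conserved = conserved_metabolites(toy_intensities, toy_groups)
```

In this toy example only the first metabolite passes, because the second is absent from every sample in one group.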

Fig. 1: Confirmed metabolites observed across feeding groups in at least 50% of samples at 1- and 6-months in the HILIC and C18 chromatography columns.

Confirmed metabolites observed in at least 50% of samples across feeding groups at 1-month of age in the HILIC (A) and C18 (B) chromatography columns and at 6-months of age in the HILIC (C) and C18 (D) chromatography columns.

As the presence and intensities of confirmed metabolites appeared to change across feeding groups (Figs. 1, 2), we performed non-parametric univariate permutational multivariate analysis of variance tests (PERMANOVA) to determine if this variability could be attributed to feeding groups. We found that metabolomic variation was indeed driven by feeding group at both 1- and 6-months. At 1-month, 6.5% (P = 0.001) of the variability in confirmed metabolites in the HILIC column and 7.8% (P = 0.001) of the variability in confirmed metabolites in the C18 column could be attributed to infant feeding grouping. At 6-months, 4.1% (P = 0.002) of the variability in confirmed metabolites in the HILIC column and 6.1% (P = 0.001) of the variability in confirmed metabolites in the C18 column could be attributed to infant feeding grouping.
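For readers unfamiliar with PERMANOVA, the sketch below shows how the proportion of variability attributed to a grouping (R2) and a permutation P value can be computed for a one-way design. It is an illustration only: the use of Euclidean distance, the number of permutations, and all names are assumptions, since the distance metric and software used in the study are not specified in this section.

```python
import random
from itertools import combinations


def permanova(samples, groups, n_perm=999, seed=0):
    """One-way PERMANOVA on squared Euclidean distances (assumed metric).

    Returns (R2, pseudo-F, permutation P value).
    """
    n = len(samples)
    # Pairwise squared distances between samples.
    d2 = [
        [sum((a - b) ** 2 for a, b in zip(samples[i], samples[j])) for j in range(n)]
        for i in range(n)
    ]
    ss_total = sum(d2[i][j] for i, j in combinations(range(n), 2)) / n

    def pseudo_f(labels):
        members = {}
        for idx, lab in enumerate(labels):
            members.setdefault(lab, []).append(idx)
        # Within-group sum of squares from within-group distances.
        ss_within = sum(
            sum(d2[i][j] for i, j in combinations(idx, 2)) / len(idx)
            for idx in members.values()
        )
        ss_among = ss_total - ss_within
        a = len(members)
        f = (ss_among / (a - 1)) / (ss_within / (n - a))
        return f, ss_among / ss_total

    f_obs, r2 = pseudo_f(groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels across samples
        if pseudo_f(shuffled)[0] >= f_obs:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return r2, f_obs, p_value


# Hypothetical toy data: two well-separated groups in two dimensions.
toy_samples = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
toy_groups = ["A"] * 4 + ["B"] * 4
toy_r2, toy_f, toy_p = permanova(toy_samples, toy_groups)
```

The R2 returned here is the among-group sum of squares divided by the total sum of squares, which is the quantity reported as a percentage of variability in the text.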

Fig. 2: Ordination plot of principal components 1 and 2 for confirmed metabolites detected in HILIC and C18 chromatography columns by feeding group at 1- and 6-months of age.

Principal component (PC) analysis of fecal metabolomic data from infants at 1- (top) and 6-months (bottom) of age. The plots show the first two PCs, which explain a total of 21.2% of the variance in the data in HILIC 1-month samples (A), 21.9% in C18 1-month samples (B), 22.0% in HILIC 6-month samples (C), and 21.4% in C18 6-month samples (D). The left plots show results from the HILIC chromatography column, and the right plots show the C18 chromatography column. Points are colored by infant feeding group, with corresponding legends to each timepoint denoted below. R2 and P values calculated using Permutational Multivariate Analysis of Variance (PERMANOVA), with infant feeding group as the explanatory variable. Plots show that infant fecal metabolome compositions at 1- and 6-months are significantly influenced by infant feeding patterns.

Infant fecal metabolites are associated with increased breastmilk or formula-feeding

We examined the associations of metabolite intensities with feeding groups using linear models that adjusted for infant age in days and mode of delivery. At 1-month, there were 35 metabolites significantly associated with infant feeding in the HILIC chromatography column and 22 metabolites in the C18 chromatography column (PBH < 0.05). Of these, the intensities of 17 metabolites (e.g., kynurenine, cholesterol, heptadecanoate, eicosadienoic acid) were positively associated with an increased proportion of breastmilk feedings (the top ten most significantly associated metabolites are shown in Tables 2, 3; all significant metabolite results are presented in Supplementary File 1). Additionally, 40 metabolites (e.g., glyceric acid, thymidine, proline, laurate, myristate, and hypoxanthine) were positively associated with an increased proportion of formula feedings. At 1-month of age, variations in metabolite intensities by feeding group were monotonic: intensities either consistently increased or consistently decreased as the proportion of breastmilk or formula feedings increased (Supplementary Figs. 1–4).
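A covariate-adjusted linear model of this kind can be sketched as follows. This is a minimal illustration, not the study's code: it encodes feeding as a hypothetical breastmilk fraction rather than as categorical groups, the data are simulated without noise, and the solver returns coefficients only (no standard errors or P values).

```python
def ols_coefficients(design, outcome):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(row[i] * row[j] for row in design) for j in range(k)]
        + [sum(row[i] * y for row, y in zip(design, outcome))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Hypothetical toy data: one entry per infant.
breastmilk_fraction = [1.0, 1.0, 0.75, 0.6, 0.5, 0.25, 0.0, 0.8]
age_days = [30, 33, 35, 29, 31, 36, 32, 34]
cesarean = [0, 1, 0, 0, 1, 1, 0, 1]

# Simulated, noise-free intensity so the true coefficients are known.
intensity = [
    2.0 + 1.5 * f + 0.1 * a - 0.5 * c
    for f, a, c in zip(breastmilk_fraction, age_days, cesarean)
]

# Design matrix columns: intercept, feeding, age in days, delivery mode.
design = [[1.0, f, a, c] for f, a, c in zip(breastmilk_fraction, age_days, cesarean)]
coef = ols_coefficients(design, intensity)
feeding_effect = coef[1]  # association with feeding, adjusted for age and delivery mode
```

Because age and delivery mode are columns of the same design matrix, the feeding coefficient is the association that remains after accounting for both covariates.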

Table 2 Top ten confirmed metabolites detected by HILIC and C18 chromatography columns that were associated with infant feeding at 1-month of age.
Table 3 Top ten confirmed metabolites detected by HILIC and C18 chromatography columns that were associated with infant feeding at 6-months of age.

At 6-months, there were nine metabolites significantly associated with infant feeding in the HILIC chromatography column and 16 metabolites in the C18 chromatography column (PBH < 0.05). Tables 2 and 3 summarize the top 10 most significant results for each chromatography column. The intensities of 13 metabolites (e.g., kynurenine, cholesterol, methyl vanillate/homovanillate, 6-deoxy-galactose (fucose)/rhamnose) were positively associated with an increased proportion of breastmilk feedings. At the same time, an increased proportion of formula feedings was associated with higher intensities of 12 other metabolites (e.g., those associated with amino acid metabolism like glyceric acid, glycerate, and lysine, and nucleotide metabolism like thymidine and hypoxanthine).

Infant fecal metabolites associated with feeding are also associated with neuro-developmental outcomes at two years

As a secondary aim, we sought to determine if feeding-associated fecal metabolites were also associated with neurodevelopmental outcomes at 2 years of age. To accomplish this, we examined associations between metabolites found to be significantly associated with feeding patterns and neurodevelopmental outcomes (Bayley cognitive, motor, and language scaled scores) at 2 years of age. Results from this analysis broadly indicated that metabolites found to be positively associated with breastfeeding were also associated with higher Bayley scores, while metabolites found to be positively associated with formula feeding were also associated with lower Bayley scores at 2 years of age (Fig. 3). While some metabolites were very close to our multiple testing threshold of PBH ≤ 0.20, others exceeded this threshold and did not hold up to multiple testing correction. However, we felt it was important to report all metabolites that were significant at the uncorrected P value level, as there were consistent neurodevelopmental patterns based on feeding group. The only metabolite to break these trends was caffeine, which was found to be positively associated with breastfeeding at 1-month but was negatively associated with language scaled score (β = −0.11; P = 0.03; PBH = 0.21). We observed positive associations of lysoPC(16:0) (β = 0.11; P = 0.02; PBH = 0.21) and cholesterol (β = 0.21; P = 0.02; PBH = 0.21) with language scaled scores at 2 years. Importantly, our analysis also revealed that these metabolites were positively associated with more breastfeeding. Cholesterol and eicosadienoic acid were also positively associated with breastfeeding, and we found that these two metabolites were associated with higher cognitive scaled scores (β = 0.13; P = 0.04; PBH = 0.86 and β = 0.28; P = 0.03; PBH = 0.65, respectively).
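The gap between an unadjusted P value and its BH-adjusted counterpart (PBH) can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment. The function name and the toy P values are hypothetical; the sketch does not reproduce the study's values, which depend on the number of tests within each neurodevelopmental score.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical unadjusted P values from four tests.
toy_p = [0.01, 0.04, 0.03, 0.20]
toy_p_bh = benjamini_hochberg(toy_p)
```

Each P value is scaled by the number of tests divided by its rank, so a result with P < 0.05 can still exceed a PBH threshold of 0.20 when many metabolites are tested at once.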

Fig. 3: Metabolites that were significantly associated with feeding patterns at 1- and 6-months detected by HILIC and C18 chromatography columns that were also significantly associated with Bayley Scores at 2 years.

Results are based on multivariate linear models that adjusted for infant birth weight and mode of delivery. Results from these models were adjusted using the Benjamini-Hochberg (BH) procedure within each neurodevelopmental score. An asterisk (*) next to a metabolite name indicates that it was a C18 metabolite; no asterisk indicates HILIC. Outline border indicates the super pathway associated with each metabolite. All findings were statistically significant at unadjusted P < 0.05 except for those indicated (γ = PBH ≤ 0.20).

We also observed negative associations between neurodevelopmental outcomes and several fecal metabolites previously found to be positively associated with formula feeding. These metabolites included cadaverine (β = −0.08; P = 0.03; PBH = 0.21), proline (β = −0.10; P = 0.04; PBH = 0.24), and methyl ecgonine (β = −0.15; P = 0.03; PBH = 0.21), each of which was negatively associated with language scaled scores at 2 years. The metabolites glycerate and glutarate/ethylmalonic acid were also positively associated with formula feedings and negatively associated with motor scaled scores (β = −0.11; P = 0.01; PBH = 0.31 and β = −0.07; P = 0.04; PBH = 0.47, respectively). Likewise, metabolites significantly associated with majority formula feeding at 6-months showed significant negative associations with Bayley scores at 2 years (Fig. 3). Myristate (β = −0.14; P = 0.04; PBH = 0.19), rac-glycerol 1-myristate (β = −0.19; P = 0.02; PBH = 0.19), and palmitate (β = −0.23; P = 0.03; PBH = 0.19) were all negatively associated with motor scaled score, while petroselinic acid was negatively associated with language scaled score (β = −0.12; P = 0.047; PBH = 0.69). All statistically significant results are also summarized via volcano plots in Fig. 4. Lastly, we conducted the same analysis using all 143 HILIC and 104 C18 metabolites, rather than only the subset significantly associated with feeding group. These analyses largely replicated the results of the main analyses and are summarized in Supplementary Figs. 5–8 as well as Supplementary File 2.

Fig. 4: Summary of the associations between HILIC and C18 1- and 6-month metabolites that were significantly associated with infant feeding group and Bayley Scores at 2 years.

Estimates were generated using linear models that adjusted for infant birth weight and mode of delivery. P values were adjusted for multiple testing using the Benjamini-Hochberg (BH) procedure. The dashed gray line corresponds to P = 0.05. Points are colored by previous feeding association (orange or blue), and triangular points indicate PBH < 0.2, while circular points indicate PBH > 0.2.

We also tested for associations between feeding groups and neurodevelopmental scores. Across both 1- and 6-month feeding groups, differences in cognitive, motor, and language scaled scores between feeding groups were not statistically significant, with the exception of motor scaled scores at 6-months. At 1-month, the exclusively breastmilk-fed group had the highest cognitive (Only breastmilk: β = 0.35; P = 0.62; Over 50% breastmilk: β = −0.56; P = 0.41), motor (Only breastmilk: β = 0.60; P = 0.53; Over 50% breastmilk: β = 0.41; P = 0.65), and language scaled scores (Only breastmilk: β = 0.08; P = 0.94; Over 50% breastmilk: β = −0.59; P = 0.54) compared to infants that received a majority of formula. At 6-months, compared to infants receiving mostly formula, the majority breastmilk-fed group had higher cognitive (β = 0.94; P = 0.11), motor (β = 1.63; P = 0.046), and language (β = 0.59; P = 0.48) scaled scores, but only the difference in motor scaled scores was statistically significant.

Discussion

Using novel high-resolution metabolomics and well-characterized dietary and neurodevelopmental data collected across the first 2 years of life, this study explores the relationship between early life feeding patterns and the fecal metabolome in infants at 1- and 6-months old, and for the first time, ties these findings to neurodevelopmental outcomes at 2 years of age. Overall, we found that feeding groups were associated with 82 fecal metabolites and that 14 of these feeding-associated metabolites were also linked with neurodevelopmental outcomes at 2 years of age. To our knowledge, this is the first study to examine groups of infants by different amounts of mixed breastmilk and formula feedings and their associations with the infant fecal metabolome while including complementary solid foods at 6-months. Similarly, this is the first investigation to find evidence that feeding-specific metabolites are associated with neurodevelopment outcomes in early life. These findings suggest that early life feeding patterns have the potential to impact the infant fecal metabolome, which has implications for optimizing early life brain development.

Among the most pronounced and consistent findings was that infant feeding group in the first 6 months of life was associated both with the overall composition and with individual metabolites of the fecal metabolome. For example, we found that feeding group explained roughly 4–8% of the variability across samples. In comparison, our previous work found that 29–30% of the variation in the infant metabolome was explained by intra-individual variability in the first two years of life, while age accounted for 6–7% of variability during this time period41. This suggests that while intra-individual variability is the largest factor influencing the infant fecal metabolome, age and feeding patterns also substantially influence the fecal metabolome. Overall, our findings show that fecal metabolites at 1- and 6-months were associated with higher breastmilk and formula feedings per day, which is largely consistent with results from previous studies that examined either exclusive breastmilk or formula feeding. For example, we found that fecal cholesterol was positively associated with breastmilk feedings at 1- and 6-months. Breastmilk has higher cholesterol content than formula and may also induce synthesis of cholesterol through nutritional programming49. Conversely, formula has higher levels of plant-based oils49,50, which may explain why we observed higher levels of fecal palmitate and petroselinic/elaidic acid in infants fed formula for 50% or more of feedings. We also found that kynurenine intensity was higher in infants fed more breastmilk; similar findings from other infant fecal metabolome studies show that kynurenine abundance is higher in exclusively breastfed versus exclusively formula fed infants46. Likewise, kynurenine declines with infant age, which aligns with breastfeeding trajectories in early life41. 
We found that laurate levels were lowest in infants fed more breastmilk, which is in agreement with previous work that has noted that laurate is lower in breastmilk fed compared to formula fed infants51, with laurate even serving as a biomarker for formula feeding in another previous study52. Lastly, we found higher levels of the metabolite 4-pyridoxate with increasing formula feeding, which was also observed in exclusively formula fed compared to breastfed infants42. Collectively, these findings indicate that while exclusive breast feeding may not always be possible, increasing the proportions of breastmilk relative to formula may have beneficial impacts on the infant fecal metabolome.

While our primary aim was to determine if infant feeding patterns were associated with the infant fecal metabolome at 1- and 6-months of age, we additionally sought to determine if feeding-associated fecal metabolites were also associated with neurodevelopmental outcomes. Overall, we identified 14 feeding-associated metabolites that were linked with neurodevelopmental outcomes at 2 years of age. Specifically, except for caffeine, all breastmilk-associated metabolites were positively associated with language, motor, and cognitive scaled scores. Prenatal caffeine exposure has been previously reported to be associated with lower neurodevelopment scores at 6–7 years of age53. While typical consumption of caffeine (for example, up to 3 cups of coffee/day) is generally still considered safe for lactating mothers54, consumption above this level could cause caffeine to accumulate in an infant’s system, causing symptoms of caffeine stimulation55. Formula-associated metabolites were negatively associated with neurodevelopmental scores at 2 years of age. Results from the current study suggest that the beneficial effects of breastfeeding may be partly attributable to specific metabolites, including lysoPC(16:0), cholesterol, and eicosadienoic acid. While fecal metabolites may be obtained directly from feeding and diet56, they may also be obtained through metabolic transformation by gut bacteria57,58. Supporting this hypothesis, the gut microbiome and bacterial-derived metabolites have been associated in murine models with brain function and development59,60,61.

Several studies have found that breastfeeding compared with formula feeding during infancy is associated with enhanced maturation of the central nervous system29 and earlier acquisition of key developmental milestones, including language and motor skills, and with overall higher cognitive development scores5,27,28. We found that lysoPC(16:0) was positively associated with breastmilk feedings at 1-month and with language scaled scores at 2 years. LysoPC(16:0) is the preferred carrier of DHA to the brain62, and DHA is a long-chain polyunsaturated fatty acid that may have a positive impact on infant neurodevelopment63,64. We also found that cholesterol was positively associated with 1-month breastmilk consumption and with higher language and cognitive Bayley scaled scores at age 2 years. In early life, cholesterol is synthesized in the central nervous system during the first weeks following birth, and is implicated in many important neurodevelopmental processes, including synaptogenesis and myelination65. Previous work has shown that children with higher dietary cholesterol intake perform better in cognitive developmental tests compared to those with the lowest consumption of dietary cholesterol66.

In contrast, formula feeding compared with breastfeeding during infancy has been previously implicated in poorer neurodevelopmental outcomes, including lower cognitive developmental scores67, and later completion of motor development milestones in mixed fed versus exclusively breastfed infants68. In this study, cadaverine was significantly associated with increased formula consumption at 1-month and lower language Bayley scaled scores at 2 years. Cadaverine is a common metabolite and contaminant present in infant formula69. It is classified as a biogenic amine, a class of compounds known to elicit toxicological effects at high concentrations69,70. While no work has assessed the health effects of cadaverine in infants, increased levels of cadaverine have been observed in adults with ulcerative colitis71 and in children suffering from nutrient malabsorption conditions (e.g., cystic fibrosis, short bowel syndrome)72. While formula typically contains what are considered safe levels of biogenic amines69, future study in infants is warranted since they are more vulnerable to the health effects of contaminated foods. Other metabolites associated with formula included those derived from plant-based oils used to mimic the fatty acid content present naturally in human milk. For instance, petroselinic acid/elaidic acid, which was associated with increased formula at 6-months and lower language scaled scores, is an industrially-derived trans fatty acid (TFA) produced by hydrogenation of vegetable oils50. Previous animal studies have demonstrated that brain DHA concentrations are reduced after high intakes of elaidic acid73, and that DHA deficiency during infancy delays brain development74. Additionally, a human study found increased TFAs were associated with unfavorable neurologic outcomes at 18-months75.
We also found another plant-based oil associated metabolite, palmitate (a saturated fatty acid derived from palm oil), was associated with increased formula feedings at 6-months and lower motor scaled scores. In contrast with our findings, another study found formula supplemented with palmitate was associated with improved motor skills and higher levels of the beneficial bacteria Bifidobacteria in the infant gut due to a prebiotic effect76. These mixed findings suggest insufficient evidence regarding the role of palm oil/palmitate and whether it should be avoided in infant formula.

Findings from the current study suggest that formula-associated metabolites may have a greater impact on neurodevelopmental outcomes at 2 years of age compared with breastmilk-associated metabolites. At both 1- and 6-months, metabolites that were associated with increased formula feeding were also associated with lower neurodevelopment scores at 2 years. Conversely, metabolites associated with increased breastfeeding were associated with neurodevelopment outcomes only at 1-month of age. These differential effects of formula and breastfeeding may be attributable to the introduction of solid foods, which can counter the protective effects of breastmilk77. For instance, prior studies report that early solid food introduction is associated with increased allergenic responses in infants compared to exclusively breastmilk fed infants77. Additionally, the composition of the infant gut microbiome can change in both breastmilk and formula fed infants as solid foods are introduced78. For example, infants who were exclusively breastfed before solid food introduction have been shown to have a higher proportion of protective gut bacteria like Bifidobacterium, and a lower abundance of Bacteroidetes and Clostridiales – changes that are associated with altered immune functioning, increasing inflammation, and weight gain79,80,81. Therefore, increased breastmilk compared with formula feeding may provide the gut microbiome with a greater plasticity that eases the transition into solid foods82, which may partly explain why only formula-associated metabolites were negatively associated with neurodevelopmental outcomes following the introduction of solid foods. Overall, the timing of solid food introduction as well as the composition of those foods may have important implications for the protective effects of breastmilk feeding later in life.
Moreover, all breastmilk is not created equal in the first 6 months of life; temporal variation in breastmilk composition may account for more associations at 1-month than 6-months of age. For example, the first form of breastmilk, colostrum, is higher in essential components like antibodies, protein, fatty acids, and other nutrients and growth factors than mature milk83,84. Although we do not have specific information regarding colostrum feeding, the strong associations seen at 1-month may reflect the positive neurodevelopmental effects of colostrum feeding shown in previous work85, which may persist but are no longer significant after 6-months. It is also pertinent to consider that formula feeding might disrupt the natural breastfeeding process by influencing milk supply. When infants are introduced to formula, it can lead to reduced demand for breastmilk, potentially impacting the stimulation necessary for sustained milk production. This alteration in breastfeeding patterns could further contribute to the variations observed in the composition of colostrum and mature milk, emphasizing the intricate interplay between feeding practices and the evolving nutritional content of breastmilk over the early stages of infant development. Understanding these dynamics is crucial for comprehensively interpreting the differences in infant gut metabolomics associated with various feeding patterns.

While this study has many strengths, including repeated sampling and comprehensive metabolomics profiling of the infant fecal metabolome from a well-established cohort of infants, it also has limitations worth noting. First, it focused on metabolites that were identified with Level 1 evidence, and therefore we characterized patterns and associations in a relatively limited number of metabolites, which may not capture the systemic alterations in the infant fecal metabolome across the first 6-months of life. Additionally, this study utilized an exclusively Latino cohort with exclusions for factors like preterm birth, low birth weight, cigarette smoking or recreational drug use, which may limit the generalizability of our findings. Stool samples were collected via OMNIGene GUT kits, which can limit the diversity of metabolites identified compared to immediate freezing86. However, other work shows that biological effects, like individual variation, outweigh technical effects, such as collection method87. Further, many of the metabolites reported in this study have been previously observed in other infant metabolomic studies from varying populations41,46,47,51,52. Future studies should seek to incorporate the gut microbiome into fecal metabolomics to further explore microbial and metabolomic associations in the gut with dietary patterns, as gut bacteria likely play a key role in the composition of the fecal metabolome58. Finally, many factors may contribute to early life neurodevelopmental outcomes. While dietary patterns may play one role in shaping these outcomes, we recognize the importance of other lifestyle factors (e.g., access to healthcare, social support, parenting practices, etc.) that also influence neurodevelopmental trajectories and outcomes in infants during early life.

Overall, this study showed that varying proportions of breastmilk or formula feedings are significantly associated with the composition of the infant fecal metabolome as well as individual metabolite intensities at both 1- and 6-months of age. Further, apart from caffeine, metabolites associated with more breastmilk feedings were associated with better neurodevelopmental performance at 2 years of age, while metabolites associated with more formula feedings were associated with worse neurodevelopmental scores. These findings suggest that increased breastfeeding, even in the context of supplementary formula feeding, may have beneficial impacts on infant health and development.

Methods

Study population

The Southern California Mother’s Milk Study is an ongoing, longitudinal cohort of 219 Latino mother-infant pairs who, beginning in 2016, were recruited from maternity clinics associated with the University of Southern California and Children’s Hospital Los Angeles, as described in detail in previous studies41,88. Mother-infant pairs were eligible to participate in the Mother’s Milk Study if mothers (1) were ≥18 years old at time of delivery; (2) had a healthy, singleton birth; (3) enrolled in the study by 1-month postpartum; and (4) could read at a 5th-grade level in either Spanish or English. Potential participants were excluded if they (1) had any diagnoses known to impact mental/physical health, nutritional status, or metabolism; (2) were currently using tobacco or recreational drugs; (3) had infants who were self-reported by mothers to be preterm or low birth weight; or (4) had infants with clinically diagnosed fetal abnormalities. The Institutional Review Boards of the University of Southern California, Children’s Hospital Los Angeles, and the University of Colorado Boulder approved the study procedures; all research was performed in accordance with the relevant guidelines and regulations. Written informed consent for the infants in this study was obtained from parents/legal guardians at time of enrollment.

Study design

All participants were drawn from the Mother’s Milk Study, a longitudinal cohort recruited between 2016 and 2017 in which mothers and infants attended clinical visits at 1-, 6-, 12-, 18-, and 24-months postpartum. Initially, 219 mother-infant dyads enrolled in the Mother’s Milk cohort. A subset of 127 participants was selected to undergo fecal metabolomics analysis based on completion of fecal sample collection at all timepoints. Individuals excluded from this analysis did not differ significantly from those who were included (Supplementary Table 1). SES was estimated using a modified version of the Hollingshead index89, as previously described88,90, with possible values ranging from 3 to 66. Questionnaires were used to assess self-reported birth mode (vaginal or cesarean section), infant antibiotic exposure, and infant feeding practices.

Infant feedings assessments

At 1- and 6-months, the numbers of infant breastfeedings and formula feedings per day were based on questionnaire data with answer options of 0–1, 1, 2, 3, 4, 5, 6, 7, and ≥8 breast feedings per day. We assigned 0–1 as 0 feedings per day, 1–7 as their reported values, and ≥8 as eight feedings per day. Age of solid food introduction was assessed in months, e.g., a value of 5 indicates the infant began consuming solid foods when they were 5 months old. There were 113 infants with complete fecal metabolomic data at both the 1- and 6-month timepoints, and of those, 112 infants had complete data on the number of feedings they received per day (one infant had no information recorded and was thus excluded from the analysis); see the participant flow chart for more information (Supplementary Fig. 9). At 1-month, we defined feeding groups as (1) exclusively breastfed, meaning 100% of feedings per day were breastmilk and there was a 0 value for formula feedings per day (n = 40); (2) breastfed >50% of feedings (n = 46); or (3) formula fed ≥50% of feedings (n = 26). Importantly, as participants were recruited into the Mother’s Milk Study based on intention to breastfeed, there were very few participants who exclusively formula fed (n = 3). As such, we grouped infants based on levels of breastfeeding from most to least. These groupings were chosen based on total and within-group sample size, as well as the distribution of breastmilk and formula feedings. We excluded infants at 6 months who either (a) were not eating solid foods yet (n = 16) or (b) reported solid food consumption but had no data on supplementary milk or formula feedings (n = 10). This decreased the sample size at 6-months to 87 infants. At 6-months, feeding groups were defined as majority breastmilk (>50% of feedings, n = 41) or majority formula fed (≥50% of feedings, n = 46), both complemented by solid foods. The two sample groups at 1- and 6-months did not significantly differ in any characteristics except for age in days (Table 1).
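As an illustration only, the coding and grouping rules described above can be sketched in Python. The answer-option labels, function names, and the handling of infants with zero recorded feedings are assumptions of this sketch and are not taken from the study's analysis code.

```python
from typing import Optional

# Questionnaire answer options as listed in the text. The string labels are
# hypothetical; "0-1" is coded as 0 and ">=8" as 8, as described above.
ANSWER_TO_FEEDINGS = {
    "0-1": 0, "1": 1, "2": 2, "3": 3, "4": 4,
    "5": 5, "6": 6, "7": 7, ">=8": 8,
}


def feedings_per_day(answer: str) -> int:
    """Convert a questionnaire answer into a numeric feedings-per-day value."""
    return ANSWER_TO_FEEDINGS[answer]


def feeding_group_1_month(breast: int, formula: int) -> Optional[str]:
    """Assign the 1-month feeding group from daily breast and formula feedings."""
    total = breast + formula
    if total == 0:
        return None  # assumption: no usable feeding data, so no group assigned
    if formula == 0:
        return "exclusively breastfed"
    if breast / total > 0.5:
        return "breastfed >50%"
    return "formula fed >=50%"


def feeding_group_6_months(breast: int, formula: int) -> Optional[str]:
    """Assign the 6-month feeding group (both groups also receive solid foods)."""
    total = breast + formula
    if total == 0:
        return None  # assumption: no usable feeding data, so no group assigned
    return "majority breastmilk" if breast / total > 0.5 else "majority formula"
```

Under these rules, an infant with equal numbers of breastmilk and formula feedings falls into the formula group, because the breastmilk threshold is strictly greater than 50%.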

Neurodevelopmental assessments

Neurodevelopmental outcomes were assessed at 2 years of age using the Bayley Scales of Infant and Toddler Development-Third Edition (BSID-III)91,92. Trained research personnel administered the BSID-III under the supervision of an expert in child developmental assessment. Cognitive, motor, and language domains were assessed using the BSID-III in an interactive examination lasting approximately 2 h. As there was a range in infant age at the 24-month visit (minimum age: 709 days; mean age: 735 days; maximum age: 799 days), scaled scores were used as the primary outcome, although composite scores (used to describe overall development in the respective domain) were also assessed. As a sensitivity analysis, we excluded infants who had difficulty completing the Bayley assessment (e.g., because of tiredness or crying; n = 8). However, because results were largely unchanged (Supplementary File 3), these infants were retained in the final analysis.

Sample collection and extraction for high-resolution metabolomics

OMNIGene GUT kits were used to collect infant stool samples at 1- and 6-months of age. Untargeted high-resolution metabolomics analysis was carried out using established protocols of the Emory Clinical Biomarkers Laboratory, as previously described in detail93,94. Briefly, stool samples were first added to ice-cold acetonitrile to precipitate proteins, kept on ice for 30 min, centrifuged for 10 min at 14,000 × g, and kept at 4 °C until analysis. Extracts were analyzed in triplicate using liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS) (Dionex Ultimate 3000, Thermo Scientific Orbitrap Fusion).

Instrumentation and analytical conditions

Instrumentation methods for this analysis have been previously described in detail by Holzhausen et al.41. In this study, we used hydrophilic interaction liquid chromatography (HILIC) (Waters XBridge BEH Amide XP HILIC column; 2.1 × 50 mm, 2.6 μm particle size) with positive electrospray ionization (ESI) and reverse phase (C18) chromatography (Higgins Targa C18; 2.1 × 50 mm, 3 μm particle size) with negative ESI. We conducted HILIC analyte separation using water, acetonitrile, and 2% formic acid mobile phases under the following gradient elution. The initial 1.5-min period consisted of 22.5% water, 75% acetonitrile, and 2.5% formic acid, with a subsequent linear increase to 75% water, 22.5% acetonitrile, and 2.5% formic acid at 4 min, followed by a final hold for 1 min. We conducted analyte separation for the C18 chromatography column using water, acetonitrile, and 10 mM ammonium acetate mobile phases under the following gradient elution. The initial 1-min period consisted of 60% water, 35% acetonitrile, and 5% ammonium acetate, with a subsequent linear increase to 0% water, 95% acetonitrile, and 5% ammonium acetate at 3 min and a final hold for the last 2 min. For both the HILIC and C18 chromatography columns, the mobile phase flow rate was 0.35 mL/min for the first minute and was increased to 0.4 mL/min for the last 4 min. LC-HRMS was run in full scan mode with 120k resolution; the mass-to-charge ratio (m/z) range was 85 to 1275. Tuning parameters for sheath gas were 45 (arbitrary units) for positive ESI and 30 for negative ESI. For positive ESI, auxiliary gas was set to 25 (arbitrary units) and spray voltage was set to 3.5 kV. For negative ESI, auxiliary gas was set to 5 and spray voltage was set to −3.0 kV. Internal standards included pooled stool and standard reference materials for human metabolites in stool. We added these internal standards at the beginning and end of each 20-sample batch for quality control and standardization.
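For readers who want to check the gradient programs numerically, the following Python sketch computes the mobile phase composition at a given time by linear interpolation between the initial and final compositions stated above. The function name, dictionary keys, and the assumed 5-min total run time for both columns are choices made for this sketch; it is not instrument control code.

```python
def gradient_composition(t_min, start, end, ramp_start, ramp_end, run_end=5.0):
    """Return the mobile phase composition (%) at time t_min.

    The composition is held at `start` until `ramp_start`, changes linearly
    until `ramp_end`, and is then held at `end` until `run_end`.
    """
    if not 0.0 <= t_min <= run_end:
        raise ValueError("time is outside the run")
    if t_min <= ramp_start:
        frac = 0.0
    elif t_min >= ramp_end:
        frac = 1.0
    else:
        frac = (t_min - ramp_start) / (ramp_end - ramp_start)
    return {k: start[k] + frac * (end[k] - start[k]) for k in start}


# HILIC: 1.5-min initial period, linear change until 4 min, 1-min hold.
HILIC = dict(
    start={"water": 22.5, "acetonitrile": 75.0, "formic_acid": 2.5},
    end={"water": 75.0, "acetonitrile": 22.5, "formic_acid": 2.5},
    ramp_start=1.5, ramp_end=4.0,
)

# C18: 1-min initial period, linear change until 3 min, 2-min hold.
C18 = dict(
    start={"water": 60.0, "acetonitrile": 35.0, "ammonium_acetate": 5.0},
    end={"water": 0.0, "acetonitrile": 95.0, "ammonium_acetate": 5.0},
    ramp_start=1.0, ramp_end=3.0,
)

# Composition halfway through each linear ramp.
hilic_mid = gradient_composition(2.75, **HILIC)
c18_mid = gradient_composition(2.0, **C18)
```

At every time point the three components sum to 100%, which is a quick consistency check on the percentages reported for each program.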

Metabolite confidence and identification

We analyzed data from HILIC positive ESI and C18 negative ESI separately; raw files were converted to the .mzXML format. Two internal standards, pooled stool and a standard reference material for human stool metabolites (NIST SRM 1950), were added at the beginning and the end of each batch of 20 samples for normalization, control of background noise, batch evaluation, and post hoc quantification. Metabolomic signals (i.e., metabolic features) were then extracted and aligned using apLCMS95 with modification of xMSanalyzer96 for quality control and reduction of batch effects following instrument analysis96,97. Coefficients of variation (CV) of metabolic features were assessed as part of our quality control. Metabolic features whose intensity had CV > 30% were removed, and the intensities of the remaining metabolic features were then averaged across triplicates. Metabolic features that were detected in <10% of samples were excluded. Outliers were assessed visually using principal component analysis (PCA) of the log2 transformed feature intensities. In a sensitivity analysis, samples whose PCA score was >3 standard deviations for PC 1 or PC 2 were removed (not shown). There were no important differences in results, and as such these observations were not removed. Metabolic features were then annotated and confirmed using the Metabolomics Standards Initiative criteria98. Level 1 confidence was assigned to metabolic features whose m/z, retention time, and extracted ion chromatogram matched the authentic standards analyzed with tandem mass spectrometry under identical conditions (within 10 ppm and 50 s; HILIC maximum retention time difference: 41.7, HILIC minimum difference: −48.1; C18 maximum retention time difference: 38.1, C18 minimum difference: −49.1). In all analyses described below, we focus on these confirmed metabolites with Level 1 evidence.
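The filtering and matching thresholds above can be illustrated with a short Python sketch. This is not the apLCMS or xMSanalyzer implementation. In particular, summarizing a feature's CV as the median of the per-sample triplicate CVs, treating an intensity of 0 as "not detected", and all function names are assumptions made for this sketch.

```python
import statistics


def triplicate_cv(replicates):
    """Coefficient of variation (%) of technical replicates (sample SD / mean)."""
    mean = statistics.fmean(replicates)
    if mean == 0:
        return 0.0
    return 100.0 * statistics.stdev(replicates) / mean


def qc_feature(intensities_by_sample, max_cv=30.0, min_detect_frac=0.10):
    """Apply the CV and detection-frequency filters to one metabolic feature.

    `intensities_by_sample` holds one list of triplicate intensities per
    sample. Returns the triplicate-averaged intensities, or None if the
    feature is removed.
    """
    # Assumption: feature-level CV = median CV over samples where it was seen.
    cvs = [triplicate_cv(r) for r in intensities_by_sample if any(v > 0 for v in r)]
    if not cvs or statistics.median(cvs) > max_cv:
        return None
    means = [statistics.fmean(r) for r in intensities_by_sample]
    detected_frac = sum(1 for m in means if m > 0) / len(means)
    if detected_frac < min_detect_frac:
        return None
    return means


def ppm_error(observed_mz, reference_mz):
    """Mass error in parts per million relative to the reference m/z."""
    return 1e6 * (observed_mz - reference_mz) / reference_mz


def matches_standard(obs_mz, obs_rt_s, ref_mz, ref_rt_s, ppm_tol=10.0, rt_tol_s=50.0):
    """Check the m/z (ppm) and retention time (s) tolerances described above."""
    return abs(ppm_error(obs_mz, ref_mz)) <= ppm_tol and abs(obs_rt_s - ref_rt_s) <= rt_tol_s
```

A Level 1 assignment additionally requires agreement of the extracted ion chromatogram with the authentic standard, which this sketch does not attempt to model.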

Statistical analysis

Descriptive statistics for key variables were calculated on the full analytic data set of 112 participants at 1-month or 87 participants at 6-months of age. We used the {ggVenn} package in R to visualize how many metabolites were present in 50% of samples in each feeding grouping at each timepoint99. We performed PCA on log2 transformed feature intensities to visualize overall metabolomics profiles between feeding groupings within each timepoint (1- or 6-months); any metabolic features that were below the minimum level of detection, and were therefore missing, were set to 0 before performing the PCA. We used permutational multivariate ANOVA (PERMANOVA) tests to explore how overall fecal metabolite intensities differed in relation to feeding groupings, using the “adonis2” function implemented in the {vegan} package in R with Euclidean distance (permutations = 1000) and removing any missing values before performing the distance calculation100.
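To make the PERMANOVA test concrete, the sketch below computes a one-factor pseudo-F statistic from squared Euclidean distances and a permutation p-value in plain Python. It is a simplified illustration of the statistic, not the {vegan} "adonis2" implementation; it assumes complete data with no missing values, and the function names and random seed are hypothetical.

```python
import random
from itertools import combinations


def sq_euclidean(a, b):
    """Squared Euclidean distance between two intensity vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pseudo_f(sq_dist, labels):
    """One-factor PERMANOVA pseudo-F and R^2 from a squared distance matrix."""
    n = len(labels)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    ss_total = sum(sq_dist[i][j] for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(sq_dist[i][j] for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_between = ss_total - ss_within
    a = len(groups)
    f_stat = (ss_between / (a - 1)) / (ss_within / (n - a))
    return f_stat, ss_between / ss_total


def permanova(samples, labels, permutations=1000, seed=0):
    """Return (pseudo-F, R^2, permutation p-value) for one grouping factor."""
    n = len(samples)
    sq_dist = [[sq_euclidean(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    f_obs, r2 = pseudo_f(sq_dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # permute group labels, keeping group sizes fixed
        if pseudo_f(sq_dist, shuffled)[0] >= f_obs:
            hits += 1
    # Common convention: count the observed labeling as one permutation.
    return f_obs, r2, (hits + 1) / (permutations + 1)


# Toy example with two well-separated groups of two samples each.
toy_samples = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
toy_labels = ["A", "A", "B", "B"]
f_value, r_squared, p_value = permanova(toy_samples, toy_labels)
```

In the toy example, nearly all of the variation lies between the groups, so the pseudo-F is large; with only four samples, however, few distinct label permutations exist, and the p-value cannot become small.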

Linear models were used to estimate the associations of the log2 transformed intensity of each confirmed Level 1 metabolite with infant feeding groupings at the 1- and 6-month timepoints. Models included adjustments for infant age in days and mode of delivery based on a Directed Acyclic Graph (DAG) (Fig. 5A), with results adjusted for multiple testing using the Benjamini-Hochberg procedure101 at PBH < 0.05. Boxplots were used to visualize the intensity of selected confirmed metabolites associated with infant age in days in the HILIC and C18 chromatography columns by feeding groupings.
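The Benjamini-Hochberg adjustment itself is simple to state in code. The following Python sketch computes adjusted p-values (written PBH in the text) from a list of raw p-values; the function name and the example p-values are hypothetical and are not study results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four metabolites.
example_p = [0.01, 0.04, 0.03, 0.20]
example_q = benjamini_hochberg(example_p)
significant_at_005 = [q < 0.05 for q in example_q]
```

In this example, only the first metabolite has an adjusted value below 0.05, although three of the four raw p-values are below 0.05.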

Fig. 5: Directed acyclic graphs between infant feeding (exposure) and fecal metabolome (outcome), and fecal metabolome (exposure) and infant neurodevelopment (outcome).

Directed acyclic graphs (DAGs) were developed based on causal relationships determined from review of relevant literature using (A) infant feeding practices at 1- and 6-months as the exposure, and the infant fecal metabolome at 1- and 6-months as the outcome and (B) the infant fecal metabolome at 1- and 6-months as the exposure and infant neurodevelopment at 24-months as the outcome. Figures were created with BioRender.com.

We ran metabolome-wide association studies using linear models to estimate the relationship between neurodevelopmental outcomes (Bayley scores) at 24-months and the log2 transformed intensity of each confirmed metabolite that was also significantly associated with feeding. As an untargeted approach, we also ran the same models for all confirmed metabolites. Models included adjustments for infant birthweight and mode of delivery based on a DAG (Fig. 5B), with results adjusted for multiple testing using the Benjamini-Hochberg procedure. Considering the exploratory nature of this study, we chose a significance level of PBH < 0.20; by selecting a 20% false discovery rate, we aimed to capture biologically meaningful associations between feeding-associated gut metabolites and neurodevelopmental outcomes. Volcano plots were used to visualize which metabolites were significantly associated with each Bayley score measure and were also significantly associated with feeding groupings in the previous linear models. We also ran linear models to estimate the relationship between neurodevelopmental outcomes and infant feeding groupings at 1- and 6-months postpartum. These models included adjustment for maternal SES based on a DAG.