Impact of antibiotics on off-target infant gut microbiota and resistance genes in cohort studies

Background Young children are frequently exposed to antibiotics, with the potential for collateral consequences to the gut microbiome. The impact of antibiotic exposures to off-target microbes (i.e., bacteria not targeted by treatment) and antibiotic resistance genes (ARGs) is poorly understood. Methods We used metagenomic sequencing data from paired stool samples collected prior to antibiotic exposure and at 1 year from over 200 infants and a difference-in-differences approach to assess the relationship between subsequent exposures and the abundance or compositional diversity of microbes and ARGs while adjusting for covariates. Results By 1 year, the abundance of multiple species and ARGs differed by antibiotic exposure. Compared to infants never exposed to antibiotics, Bacteroides vulgatus relative abundance increased by 1.72% (95% CI: 0.19, 3.24) while Bacteroides fragilis decreased by 1.56% (95% CI: −4.32, 1.21). Bifidobacterium species also exhibited opposing trends. ARGs associated with exposure included class A beta-lactamase gene CfxA6. Among infants attending day care, Escherichia coli and ARG abundance were both positively associated with antibiotic use. Conclusion Novel findings, including the importance of day care attendance, were identified through considering microbiome data at baseline and post-intervention. Thus, our study design and approach have important implications for future studies evaluating the unintended impacts of antibiotics. Impact The impact of antibiotic exposure to off-target microbes and antibiotic resistance genes in the gut is poorly defined. We quantified these impacts in two cohort studies using a difference-in-differences approach. Novel to microbiome studies, we used pre/post-antibiotic data to emulate a randomized controlled trial. Compared to infants unexposed to antibiotics between baseline and 1 year, the relative abundance of multiple off-target species and antibiotic resistance genes was altered. Infants who attended day care and were exposed to antibiotics within the first year had a higher abundance of Escherichia coli and antibiotic resistance genes; a novel finding warranting further investigation.

238 infants with viable 6-week/1-year paired stool metagenomic data 99 infants included with pediatric medical record data up to the 1-year stool sample 183 infants with antibiotic exposure information • Antibiotic exposure based on medical record • Exposures assessed between 6-week and 1-year stool sample: • Exposed/unexposed to antibiotics • Number of antibiotic prescriptions • Time since antibiotic prescription • Antibiotic exposure based on medical record and interview data • Exposures assessed between 6-week and 1-year stool sample: • Exposed/unexposed to antibiotics  and day care attendance in the first year of life. Correlation is measured using pairwise spearman correlation. The species and ARGs are hierarchically clustered using HAllA. Only species and ARGs with at least one statistically significant pairwise association are shown in each figure. Each part represents the following: A) infants unexposed to antibiotics and unexposed to day care (n = 68), B) infants unexposed to antibiotics but exposed to day care (n = 40), C) infants exposed to antibiotics but unexposed to day care (n = 23), and D) infants exposed to antibiotics and day care (n = 36).

Methods to assess the alpha (within) and beta (between) sample diversity:
To estimate the richness and evenness of samples, Shannon alpha diversity at the species level was computed for each sample. Similar to relative abundance assessments, we were interested in estimating the difference in the change in Shannon index that would have occurred over the follow-up period if an infant had not been exposed to antibiotics. We used a linear mixed-effects model implemented with the lmer() function in the R package lme4 1 :

Shannon index ~ (intercept) + sample age (in days) + antibiotic exposure + antibiotic exposure × sampling interval (baseline vs 1 year) + covariates + (1 | infant)
Where Shannon index is the alpha diversity of each sample and the remaining variables are the same as the off-target microbe and ARG abundance calculation.
We analyzed the between sample or beta diversity of bacterial species in infant stool samples in the study. We used a centered-log ratio transformation and calculated the distance between samples via the Euclidean distance. This is akin to the Aitchison distance and can outperform common metrics such as the Bray-Curtis dissimilarity metric for compositional data and is a true linear difference 2,3 . Differences between samples were visualized using the first 2 components of the principal components analysis. Similar to 4 , we were interested in how microbiome composition varied by the interaction between an exposure and age. Specifically, we were interested in understanding how compositional variation with respect to antibiotic exposure is affected by compositional variation by age. Dispersion of samples by covariates were analyzed for statistical significance using PERMANOVA models with the adonis2() function in the vegan 5 package. 1000 permutations were run for each model. Samples were removed prior to running the PERMANOVA if they had a missing value for breastfeeding duration.

Results: within (alpha) and between (beta) sample diversity of off target microbes:
We evaluated the change to the Shannon index of off-target microbes among infants who had ever received an antibiotic, an antibiotic to treat otitis media, or 2 or more courses of antibiotic between baseline and 1-year measurements. Interestingly, at baseline, infants ultimately exposed to antibiotics had a slightly greater alpha diversity than unexposed infants, but by 1 year had an overall decreased alpha diversity. After accounting for increasing compositional diversity with time, at the one year mark, antibiotic-exposed infants in the Antibiotic Exposure Cohort (n = 216) had on average a 0.24 (95% CI: -0.44, -0.05) decrease in alpha diversity more than they would have had they not been exposed across the two time points (see Figure below). This estimate was consistent across adjusted linear mixed-effects models that considered any antibiotic exposure or an antibiotic exposure specifically for otitis media (see Tables below) Likewise, the trend was consistent in analyses restricted to NHBCS infants and subsequently stratified by day care attendance at 1 year (see Tables below). We also detected a significant dose-dependent association; infants exposed to 2 or more antibiotics between baseline and one-year measurements had on average a decrease in Shannon alpha diversity of 0.34 (95% CI: -0.63, -0.04) more than they would have had they not been exposed.
Results from between sample (beta) diversity analyses of microbiome composition did not visually reveal any clear separation based on antibiotic exposure (see Figure below), but as predicted, showed the strong impact of sample age on between sample dispersion (see Figure   below). While visual results did not depict a strong impact of antibiotic exposures, variation based on the interaction between antibiotic exposure and age were consistently significant across models assessing any antibiotic exposure (see Tables below, p-value < 0.01). Among 216 infants, the interaction between sample age and antibiotic exposure by 1 year described 2.6% of the variation. However, the interaction for exposure to 2 or more antibiotics and age among 132 infants was neither strong nor statistically significant (R 2 = 0.003, p-value = 0.22).
Difference-in-difference models assessing antibiotic exposure and time on Shannon alpha diversity

Figure:
Within and between sample species diversity by antibiotic exposure and time period in 216 infants. A) A conceptual model depicting the average within sample (alpha diversity) difference by antibiotic exposure (yes/no) and time period (pre/post antibiotic exposure). Estimates are derived from a crude linear regression model with the Shannon index per sample (n= 432) as the outcome and antibiotic exposure ever, baseline/1-year treatment, and the difference-in-difference estimate for the interaction as independent variables. All differences are statistically significant at α = 0.05. The solid lines represent the trajectory of the Shannon index by group observed in this study. The lighter blue dashed line depicts the alpha diversity in a counterfactual scenario in which antibioticexposed infants were never exposed to antibiotics. B and C) Principal component analysis of between sample diversity. The relative abundance of species was centered log-transformed and plotted using the Euclidean distance. Samples are colored by B antibiotic exposure by 1 year (excluding antibiotic exposure immediately following birth) or C age of the infant at sample collection.

Discussion: Antibiotics' impacts on microbial diversity
Previous studies have found mixed results associating diversity and antibiotic exposure depending on geography, age of the child, type of antibiotic prescribed, and time since antibiotic exposure, but, generally, antibiotic exposures are associated with stable or decreased withinsample diversity 6 . Consistently, we found NHBCS infants had a statistically significantly (pvalue < 0.05) higher alpha diversity than samples in the DIABIMMUNE Study. This may be due to true differences in alpha diversity between the cohorts or lower sequencing depths for samples in the DIABIMMUNE Study. In the DIABIMMUNE cohort study 7 , on average, infants without antibiotic exposures had a microbiome with higher richness, but this was only evident after the first year of life. In our study, we consistently found that there was an inverse association between antibiotic exposure and Shannon alpha diversity prior to 1 year. Interestingly, however, at the baseline timepoint (at approximately 6 weeks of life) infants who were ultimately exposed to antibiotics had higher alpha diversity than infants not exposed to antibiotics. This emphasizes the importance of considering timing of exposure, in particular pre-and post-antibiotic exposure measurement, to discern trends over time. Sample sizes including a flow diagram ( Figure S1) and results from sensitivity analyses are available below.

NHBCS Antibiotic Exposure Cohort:
The conceptual exposure for this cohort was antibiotics given to infants between the baseline (6week) and 1-year stool sample collection that could impact the gut (i.e., ingested or administered systemically).
Operationally, the antibiotic exposures cohort required more protocols to ensure that antibiotic exposures classified via pediatric medical records and interview data were captured correctly.
Depending upon the data available, these were the protocols used to assign infants' antibiotic exposures: • If the infant had both pediatric medical record and interview data available, only include infants that had concordant pediatric medical record and interview data indicating an antibiotic exposure.
• If the infant had only interview or pediatric medical record data but that indicated an antibiotic exposure during the time period of interest, mark them as antibiotic exposed.
• If the infant had only interview or pediatric medical record data but that indicated no antibiotic exposure during the time period of interest, mark them as antibiotic unexposed.
Determining if the antibiotic exposure occurred before or after the stool sample was also considered. With pediatric medical record data, the time (days since birth) of the antibiotic prescription was given providing antibiotic exposure timing. However, antibiotic exposures as measured on the interview at 12 months could occur slightly before or after the stool sample was collected. Based on these challenges, we created a schema to determine how to include infants based only on interview data or pediatric medical record data. Since prescription dates are given for pediatric medical record data, the time of antibiotic prescriptions are pulled from there. With interview data, one of the following conditions needed to be met in order to include an infant: 1) 12-month interview data collected within a set time interval before or after the stool sample was collected.
2) 12-month interview occurred significantly earlier than the 1-year stool sample but there was definitive evidence of an antibiotic exposure prior to the stool sample collection.
3) The 12-month interview occurred significantly after the stool sample was collected but no antibiotic exposures were noted.
The next step was determining the time interval to determine correct classification of samples.
To determine the best time interval, we utilized infants that had 8-and 12-month interview data as well as pediatric medical record data beyond their first year of life as a sensitivity cohort. We 2) The pediatric medical record is considered the "gold standard" because it includes more detailed information on the exact date of the antibiotic prescription which cannot be assumed by the interview. Additionally, parents are less likely to be as knowledgeable about specific antibiotic prescriptions given compared to clinicians.
3) Infants prescribed an antibiotic on the pediatric medical record were actually given an antibiotic.

4)
If interview data mentions an antibiotic was given but it's not matched on the pediatric medical record, it was considered an antibiotic exposure, but ultimately that infant would be censored as there is disagreement between the medical record and interview (i.e., the assumption is that no antibiotic prescriptions on the medical records were missed among infants with full medical records).
5) Infants noted as receiving an antibiotic on the interview data, were given the antibiotic.
6) On the interview, we assume recall bias is minimal as parents would likely know if an infant is given an antibiotic for a condition or not (i.e., there's a high likelihood that parents would remember either way). There was concern about recall bias for specificity of conditions. In other words, on a medical record the clinician would often say an antibiotic was for "otitis media" while on the interview it may list a more general list of symptoms for why the antibiotic was given (e.g., ear infection, runny nose, and fever).
While this doesn't change the fact that the infant was exposed to an antibiotic, it does limit the ability to compare between the two classification systems.

7)
Infants only needed a completed interview at 8 and 12 months as the vast majority of antibiotic exposures occurred after approximately 6 months. 8) Infants that had an antibiotic exposure noted on the 4-month interview were exposed to an antibiotic after the baseline sample was collected. This assumption was based on the sensitivity cohort in which no infant with pediatric medical record data had an antibiotic exposure before 6 weeks unless they were initially given one at birth which is noted in the infant medical record.

NHBCS Antibiotic Frequency Cohort:
The conceptual exposure in this cohort was based on the number of antibiotic prescriptions. This was operationalized by assessing all antibiotic prescription from all infants with pediatric medical records up to the 1-year stool sample collection.
Ultimately, the decision was made to include infants that have a recorded antibiotic prescription even if it was not noted on the interview data. We made this assumption because we do not know in infants with missing interview data if 1) the antibiotic prescription would have been captured, 2) antibiotic prescriptions are missing from interview data due to the infant not redeeming their antibiotic prescription, or 3) the caregiver did not recall an antibiotic exposure for his/her/their child.

NHBCS Antibiotic Exposure Cohort:
To assess the time interval that could be used to count antibiotic exposures among infants with interview data in NHBCS Antibiotic Exposure Cohort, we assessed concordance between interview and medical record data at 14, 21, and 28 day intervals among infants with both interview and pediatric medical record data available. 81 infants had pediatric medical record data at least 28 days beyond when their 1-year stool sample was collected as well as having an 8month and 12-month interview available.

Unexposed to antibiotic
Interview occurred within 28 days of stool sample or infant could be classified based on other condition* Exposed to antibiotic 22 1 *Other conditions included: 12-month interview occurred significantly earlier than the 1-year stool sample but there was definitive evidence of an antibiotic exposure prior to the stool sample collection. The 12-month interview occurred significantly after the stool sample was collected but no antibiotic exposures were noted.

Unexposed to antibiotic
Using the sensitivity cohort for the 28-day interval (i.e., the cohort with full pediatric medical record and interview data above), we would have included 65 infants (out of the 68, the 3 incorrect classifications would have been censored). Of those 65 infants, 22 were exposed to an antibiotic. This provides an incidence of antibiotic exposure in the cohort of 22/65 or 33.8%. For any antibiotic exposure for otitis media [not shown], the concordance is still high. In the sensitivity set (n = 68), 4 infants were censored based on unclear antibiotic for otitis media exposure. This left 64 infants that were correctly able to be classified and 19 were exposed to antibiotics for otitis media among them (incidence = 19/64 or 29.7%). Lastly, in the 28-day sensitivity set (n = 68) and censoring for unclear antibiotic exposure for any reason or for unclear otitis media diagnosis, 5 infants would have been censored for antibiotic exposure in general or for antibiotic exposure for otitis media. 20 infants were exposed to antibiotics in general and 19 for otitis media. This makes the incidence 20/63 and 19/63 or 31.7% and 30.2% respectively.
The final NHBCS Antibiotic Exposure Cohort only included infants that were not censored for unclear antibiotic exposure or unclear antibiotic exposure for otitis media ultimately consisting of 183 infants. Of which, 63 were exposed to any antibiotic (incidence = 34.4%), and 58 to an antibiotic for otitis media (incidence = 31.7%).
Thus, our sensitivity analysis showed strong concordance and agreement between the incidence of antibiotic exposures in the sensitivity and ultimate NHBCS Antibiotic Exposure Cohort.

NHBCS Antibiotic Frequency Cohort:
Similarly to the sensitivity analysis for the NHBCS Antibiotic Exposure Cohort, we utilized a sensitivity cohort to assess agreement between interview and antibiotic prescriptions data per infant. We included infants in this cohort based on slightly different metrics than the NHBCS Antibiotic Exposure Cohort. Infants were included if they had interview data at 8 and 12 months with the 12-month interview occurring within 28 days of the stool sample along with full pediatric medical record data (n = 64). Additionally, 2 infants were included because they had interview and medical record data beyond the point of their sample collection with no antibiotic exposure. In this sensitivity analysis cohort of 66 infants we found the following results: This table demonstrates that there is substantial agreement between methods; however, as expected, some of the antibiotic prescriptions are not captured on the interview. This could be due to a lack of reporting of the antibiotic exposure or that the antibiotic was not given to the infant even when it was prescribed. Given this uncertainty, we decided to include all antibiotic prescriptions. Based on this sensitivity set, if bias was present, we believe this would bias our results towards the null.

Covariates
Extensive data on lifestyle, medical history, and environmental exposures were collected from participants in the NHBCS; select covariate data were available for the DIABIMMUNE Study.
Across both cohorts, covariates were selected as a result of an a priori literature review and a theorized association with antibiotic exposure and the infant gut microbiome. We included gestational age (in weeks) 8 , duration of breastfeeding (in days) 9-11 , delivery mode (vaginal or cesarean) 11-13 , sex (male or female) 14 , infant age at stool sample collection (in days) 12 , and antibiotic usage immediately following birth (yes/no) 15 . Infants were excluded if they had a known antibiotic exposure after the period immediately following birth but before the baseline microbiome sample was collected. Additionally, for NHBCS data, we considered day care attendance by 1 year (yes/no) as a covariate as it had previously been associated with otitis media 16,17 and we hypothesized that it could have an association with the gut microbiome 18,19 .
Adjusted models including infants from both the NHBCS and DIABIMMUNE Study also included a dichotomous covariate to adjust for study origin.

NHBCS samples:
Stool collection and metagenomic DNA shotgun sequencing have been previously completed and described for our cohort 20,21 . In brief, infant stool samples were collected in diapers at approximately 6 weeks and 1 year of age. DNA was extracted from the stool samples with concentrations of DNA samples used for sequencing ranging from 1 ng/ul to 25 ng/ul.
Sequencing was conducted at the Marine Biological Laboratory in Woods Hole, MA using established protocols. A Covaris S220 focused ultrasonicator sheared DNA samples to a mean size of 400 base pairs and libraries were constructed using Nugen's Ovation Ultralow V2 protocol.
All samples were processed using KneadData as part of the bioBakery3 pipeline 22 . This pipeline included samples that were sequenced and merged as paired end reads or samples that were only sequenced as forward reads. Previously, we identified that using both single and paired end reads Taxonomic and antibiotic resistance gene profiling: For samples from both the NHBCS and DIABIMMUNE Studies, taxonomy down to the species level was assigned using MetaPhlAn3 25 . Only bacterial taxa were considered. Taxa that were identified in less than 1% of samples (assessed in the Antibiotic Exposure and Frequency Cohorts) were excluded in further analyses. After prevalence filtering, the relative abundance at each level of interest (phyla, genera, and species) were normalized to sum to 1.
ARGs were profiled using ShortBRED 26

Assessing individual off-target microbes and antibiotic resistance genes:
Our assessment of off-target microbes and ARGs integrated the strengths of the cohort studies; specifically, longitudinal metadata, medical record data, and microbiome data. We were interested in estimating the difference in relative abundance that would have occurred over the follow-up period if an infant had not been exposed to antibiotics.
Estimation of the difference-in-difference estimate was performed using Microbiome Multivariable Associations with Linear Models (MaAsLin2) 30 . Default options were used with the exception of the following, which aided in the interpretation of our results: no transformations were made to assess microbial abundance calculations and no transformations or normalizations were used for ARG assessment. A 95% confidence interval was computed from the output for the difference-in-difference effect estimate's standard error in MaAsLin2.

Modeling antibiotic exposure on overall antibiotic resistance gene abundance:
To estimate difference-in-difference estimates for the overall ARG abundance per sample, we ran fixed-effects linear regression models with ARG abundance in RPKM as the outcome.
The general formula for these analyses was the following: Overall ARG abundance ~ (intercept) + sample age (in days) + antibiotic exposure + antibiotic exposure × sampling interval (baseline vs 1 year) + covariates The same covariates as the individual microbe and ARG model were used. Difference-indifference estimates with 95% confidence intervals are reported.

Species and antibiotic resistance gene correlation analyses:
HAllA version 0.8.18 was used to identify blocks and pairs of species and ARGs that were associated with each other 31 . HAllA uses a three-step system that calculates pairwise associations between 'omics datasets, hierarchically clusters each dataset, and then iteratively identifies associated blocks of features. We assessed the correlation based on the difference between the baseline and 1-year relative abundance of each species and ARG. Species and ARGs with greater than a 0.5% mean abundance in the Antibiotic Exposure Cohort were included.
Species and ARG abundances were centered log-ratio transformed using the microbiome 32 package to account for relative abundance data. Samples were stratified by antibiotic exposure and day care attendance at 1 year before running them through HAllA. Spearman correlation with a Benjamini-Hochberg correction of 0.05 was used to associate features and evaluate statistical significance.

Model sensitivity analyses:
In addition to crude and adjusted models for abundance and diversity metrics, we estimated difference-in-difference or interaction estimates for two other antibiotic exposure scenarios. One set of sensitivity models considered antibiotic exposure specifically for otitis media with an added variable to account for antibiotic exposures not for otitis media (tested with the Antibiotic Exposure Cohort). In the other model sets, infants were only considered exposed if they received two or more antibiotic prescriptions (tested with the Antibiotic Frequency Cohort).
Based on assumptions of the difference-in-difference modeling approach, we stratified infants in the NHBCS cohort based on day care attendance. Difference-in-difference regression was used to evaluate if antibiotic exposure impacted abundance or diversity metrics differentially by the two groups.

Difference-in-differences assumptions and sensitivity analyses
Difference-in-differences assumptions: In order for the difference-in-differences approach to represent an unbiased amount of change in the population due to an antibiotic, a few assumptions must be met 33,34 .
The first assumption (often called the strict exogeneity assumption) is that the intervention cannot be associated with the outcome at baseline. While this may not be true if we were considering microbes targeted for treatment by the antibiotic exposures, in this case we were only interested in off-target microbes and antibiotic resistance genes.
A second assumption (often called the positivity assumption) is that all people in the study could have been given the intervention. We assume in this study that, since all infants are involved in the medical system to participate in cohort studies, that they all have access to antibiotics.
The last assumption is the parallel trends assumption. This assumption says that there are no unmeasured variables that are both time and group varying. Graphically this can be depicted using two lines to represent the relative abundance of a microbe or ARG among those exposed and unexposed prior to the intervention. If the lines are not parallel, the assumption is not met.
These lines are often depicted as linear, but this is not a prerequisite. While this assumption is challenging to address with data at only 2 time points, care has been taken to evaluate variables that could be considered time and group variant. Unless otherwise mentioned, we assume that covariates are time invariant and that the effect of the covariates on our outcome at baseline is also time invariant. This is why we did not assess the interaction between time and other covariates.

Sensitivity analyses:
In particular, we explored sample age, duration of breastfeeding, exclusive breastfeeding at the sample collection, and exposure to day care by 1 year due to time variance (Supplemental tables included below). Sample age, duration of breastfeeding, and exclusive breastfeeding at sample collection did not vary significantly by group in the Antibiotic Exposure Cohort (n = 216; Kruskal-Wallis or chi-squared p-value in baseline and one year > 0.05). However, at baseline, infants eventually exposed to an antibiotic were breastfed slightly longer than infants not exposed to an antibiotic (48.6 days vs. 44.8 days, Kruskal-Wallis p-value = 0.054). Due to this small difference in days breastfeeding and non-significant differences in exclusive breastfeeding status, we did not consider this to be a major factor affecting results and did not stratify analyses based on duration or exclusivity of breastfeeding. Since we only had exposure to daycare available in the NHBCS cohorts, we only explored this in NHBCS infants (Supplemental table   included below). Among the 183 infants in the Antibiotic Exposure Cohort, infants that were given an antibiotic were also more likely to attend daycare. Thus, in order to tease apart the effects of daycare and antibiotic exposure, we conducted sensitivity analyses among infants that attended and didn't attend daycare by 1 year.