Streptococcus pneumoniae (also known as pneumococcus) is a leading cause of severe infections among children and adults, with the highest incidence of pneumococcal disease occurring during infancy [1]. S. pneumoniae causes a broad range of infections ranging from mild respiratory illnesses, such as acute otitis media and acute sinusitis, to invasive pneumococcal disease (IPD), which includes serious illnesses such as bloodstream infection and meningitis [2]. Globally, S. pneumoniae is responsible for more than 300,000 child deaths each year, the overwhelming majority of which result from pneumonia [3] and occur in low- and middle-income countries [4]. Colonization of the nasopharynx precedes infections caused by S. pneumoniae and occurs in 25–65% of children and <15% of adults, with substantial variation in prevalence by geography and socioeconomic factors [5,6,7,8,9]. Pneumococcal conjugate vaccines effectively prevent IPD caused by vaccine serotypes [10, 11] but are less effective in preventing non-invasive infections, such as pneumonia [12, 13] and otitis media [14]. Moreover, the introduction of pneumococcal conjugate vaccines has been followed by the emergence of non-vaccine serotypes, some of which are highly virulent and multidrug resistant [15,16,17], and nonencapsulated pneumococci [18], which together threaten to compromise the long-term benefits of these vaccines [19, 20]. There is thus an urgent need to develop alternative approaches to preventing infections caused by S. pneumoniae.

A complex microbial community resides in the upper respiratory tract and has co-evolved with humans [21]. Over the last decade, accumulating evidence has emerged for this microbiome’s role in the pathogenesis of respiratory infections [22], with microbial communities at distinct anatomical sites within the upper respiratory tract contributing to resistance to colonization or infection by specific pathogens. Within the nasopharynx, resident microbes may resist colonization by S. pneumoniae through competition for nutrients and adhesion sites, secretion of antimicrobial factors, and modulation of host immune responses [23]. Most prior studies focused on associations between S. pneumoniae and colonization by other major respiratory pathobionts, demonstrating primarily positive associations with Haemophilus influenzae and Moraxella catarrhalis [24,25,26] and negative associations with Staphylococcus aureus [25, 27]. More recently, negative associations were reported between S. pneumoniae and other streptococcal species [28, 29], Lactobacillus species [30, 31], non-diphtheriae Corynebacterium species [6, 32, 33], and Dolosigranulum pigrum [32, 33]. The interspecies interactions that underlie these associations are often bidirectional, occur by diverse mechanisms, and may vary based on the local microenvironment. For instance, H. influenzae can inhibit S. pneumoniae by downregulating expression of pneumococcal adherence factors [34] or by stimulating complement-dependent phagocytosis of S. pneumoniae [35]. Still, these species often persist together in a multi-species biofilm that can confer protection to S. pneumoniae from antibiotics [36, 37]. Similarly, inhibition of pneumococcal growth by D. pigrum appears to require the presence of specific Corynebacterium species [38]. At the same time, Corynebacterium accolens inhibits S. pneumoniae by hydrolyzing host skin surface triacylglycerols to release antipneumococcal free fatty acids [33]. Although these laboratory studies have furthered our understanding of respiratory microbial community ecology, the extent to which these interspecies interactions contribute to population-level trends in S. pneumoniae colonization is unknown.

As described above, colonization of the nasopharynx is a necessary precursor to infections caused by S. pneumoniae, and is particularly prevalent among young children. Notably, the nasopharyngeal microbiome undergoes rapid shifts in composition during early childhood [39,40,41], primarily driven by environmental factors, such as delivery mode [42, 43], infant feeding practices [41, 44], contact with other children [39], season [40, 45], and antibiotic exposures [39]. Previous studies that evaluated associations between the nasopharyngeal microbiome and S. pneumoniae colonization in children were cross-sectional [6, 32, 33], and few data are available from low- and middle-income countries, where >80% of child deaths from S. pneumoniae occur [3] and household [46, 47] and environmental [48] exposures may differ from those in high-income countries. Longitudinal studies in these settings are necessary to determine the impact of environmental exposures on the developing infant nasopharyngeal microbiome and to identify microbiome features that facilitate or inhibit S. pneumoniae colonization.

In this study, we sought to identify interspecies interactions that modify the risk of pneumococcal colonization during infancy and to describe nasopharyngeal microbiome development during the first year of life in a sub-Saharan African setting. We studied the nasopharyngeal microbiomes of 179 mother–infant dyads in Botswana using 16S ribosomal RNA (rRNA) gene sequencing and identified S. pneumoniae colonization with a species-specific PCR assay. We describe changes in microbiome diversity and composition during infancy, evaluate associations between the nasopharyngeal microbiome and pneumococcal colonization risk, and identify environmental factors that influence nasopharyngeal microbiome composition during infancy.


The nasopharyngeal microbiome is a low-diversity microbial community throughout infancy

We collected nasopharyngeal swab samples monthly (0–6 months) or bimonthly (6–12 months) from 179 mother–infant dyads recruited at urban and rural study sites in southern Botswana. Infants were born vaginally, had a median [interquartile range (IQR)] birth weight of 3120 g (2855 g, 3408 g), and were predominantly breastfed (Table 1). Infants were followed in this study to a median (IQR) age of 12.0 (8.0, 12.1) months. Of 51 mothers with HIV, 44 (86%) received antiretroviral therapy during pregnancy for a median (IQR) duration of 9 (5, 9) months; median (IQR) CD4 count was 461 (312, 641) cells/µL. We performed 16S rRNA sequencing of infant nasopharyngeal samples from all study visits, and maternal nasopharyngeal samples from the delivery visit, resulting in 1368 infant and 172 maternal samples passing quality control procedures. The median (IQR) Shannon index and number of unique ASVs in these samples were 1.38 (1.03, 1.72) and 71 (50, 100), respectively. Infant nasopharyngeal microbiome diversity remained relatively stable during infancy and similar to maternal nasopharyngeal microbiome diversity (Fig. 1), except at birth (I0), when diversity was higher than later in infancy (Wilcoxon signed-rank tests, p < 0.0001) and compared to maternal samples (Wilcoxon signed-rank test, p < 0.0001). Nasopharyngeal microbiome richness increased with age during infancy (negative binomial regression, p < 0.0001) and did not differ from the richness of the maternal nasopharyngeal microbiome after five months of age (Wilcoxon signed-rank tests, p > 0.05).

Table 1 Characteristics of the 179 mother–infant dyads included in the study population.
Fig. 1: Alpha diversity of the nasopharyngeal microbiome among mother–infant dyads in Botswana.

Box plots depict nasopharyngeal microbiome diversity, as measured by the Shannon index (a), and richness, as measured by the number of unique ASVs (b). Maternal nasopharyngeal samples are from the birth visit only (M0), and are shown in red, while nasopharyngeal samples from infants were sequenced at ten time points during the first year of life (I0-I12), and are shown in blue. The Shannon index of the infant nasopharyngeal microbiome is higher at birth than at all later time points (Wilcoxon signed-rank tests, p < 0.0001) and compared to maternal samples (Wilcoxon signed-rank tests, p < 0.0001). The number of unique ASVs in infant nasopharyngeal samples is lower than in maternal samples from birth through five months of age (Wilcoxon signed-rank tests, p < 0.05). ASV amplicon sequence variants.
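The two alpha diversity measures shown in Fig. 1 can be illustrated with a minimal sketch. The source does not state the logarithm base used for the Shannon index, so the natural logarithm is assumed here, and the read counts are hypothetical values chosen only for illustration.

```python
import math

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over ASVs with non-zero counts.

    The natural logarithm is an assumption; the text does not state the base.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def richness(counts):
    """Number of unique ASVs observed (non-zero read count) in a sample."""
    return sum(1 for c in counts if c > 0)

# Hypothetical ASV read counts for two samples (illustrative values only).
even_sample = [250, 250, 250, 250]      # four ASVs, evenly distributed
dominated_sample = [970, 10, 10, 10]    # four ASVs, one strongly dominant

h_even = shannon_index(even_sample)
h_dominated = shannon_index(dominated_sample)
```

Both hypothetical samples have the same richness (four ASVs), but the evenly distributed sample has the higher Shannon index, which is why the two measures are reported separately.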

Rapid shifts in nasopharyngeal microbiome composition occur during early infancy

Six bacterial genera—Corynebacterium, Dolosigranulum, Haemophilus, Moraxella, Staphylococcus, and Streptococcus—accounted for more than 90% of the sequencing reads identified in infant nasopharyngeal samples collected at or after one month of age (Supplementary Table 1). Only at the birth visit were other bacterial genera abundant, including Acinetobacter, Gardnerella, Lactobacillus, and Sneathia, likely reflecting colonization of the infant upper respiratory tract by pioneering microbes from the maternal gut [49, 50] and vaginal microbiomes [50, 51]. Nasopharyngeal microbiome composition shifted dramatically during early infancy (Fig. 2); microbiome composition at each study visit differed from the preceding visit from birth through five months of age (PERMANOVA on Bray-Curtis dissimilarity, p < 0.05), after which no significant differences in overall microbiome composition were observed between consecutive study visits. While maternal nasopharyngeal microbiome composition at the birth visit differed from infant microbiome composition at all study visits (PERMANOVA on Bray-Curtis dissimilarity, p < 0.001), the dissimilarity of these microbiomes increased with age (linear mixed effects model, p < 0.0001), indicating progressive divergence of the infant nasopharyngeal microbiome from an adult microbiome profile during the first 12 months of life.

Fig. 2: Composition of the nasopharyngeal microbiome among mother–infant dyads in Botswana.

a Principal coordinates analysis (PCoA) plot based on Bray-Curtis distances showing nasopharyngeal microbiome composition among mothers at delivery (M0) and infants throughout the first year of life (I0-I12). During infancy, nasopharyngeal microbiome composition progressively diverges from the composition of the adult nasopharyngeal microbiome. Ellipses define the regions containing 80% of all samples that can be drawn from the underlying multivariate t distribution. Ellipses are shown for samples from mothers at delivery (M0), infants at birth (I0), and infants at 12 months of age (I12). b Relative abundances of highly abundant genera in nasopharyngeal samples from mothers (M0; n = 172 samples) and infants (I0-I12, n = 1368 samples) by month of visit.
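For readers unfamiliar with the Bray-Curtis measure used in the PERMANOVA tests and the ordination, the following is a minimal sketch of the standard definition applied to two abundance profiles. The genus-level profiles are hypothetical, and this is not the analysis code used in the study.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i).

    Returns 0 for identical profiles and 1 for profiles sharing no taxa.
    """
    if len(u) != len(v):
        raise ValueError("profiles must cover the same taxa in the same order")
    denominator = sum(u) + sum(v)
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / denominator

# Hypothetical relative abundances of four genera (illustrative values only).
maternal_profile = [0.5, 0.3, 0.2, 0.0]
infant_profile = [0.1, 0.1, 0.2, 0.6]

dissimilarity = bray_curtis(maternal_profile, infant_profile)  # approximately 0.6
```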

To further describe shifts in microbiome composition during infancy, we classified each infant nasopharyngeal sample based upon whether a single genus comprised 50% or more of the sequencing reads in that sample. Samples for which no genus met this relative abundance threshold were categorized as “biodiverse.” A single genus was dominant in 844 of 1368 (62%) infant samples. The most common microbiome “biotypes” were biodiverse (n = 524; 38%), Moraxella-dominant (n = 343; 25%), Corynebacterium-dominant (n = 153; 11%), and Staphylococcus-dominant (n = 142; 10%) (Supplementary Table 2). We considered the microbiome profile of a sample to be “stable” if the next sample from that infant was classified as the same biotype. Based on this definition, within-infant stability of the nasopharyngeal microbiome varied by biotype (Fig. 3 and Supplementary Table 3). Compared to a biodiverse microbiome profile, a Moraxella-dominant biotype tended to be associated with higher stability (logistic regression model, p = 0.07), while lower stability was observed with Dolosigranulum-dominant (p = 0.005), Haemophilus-dominant (p = 0.001), and Streptococcus-dominant (p = 0.0004) biotypes.
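The biotype and stability definitions above can be expressed as a short sketch. The function names and the read counts are hypothetical; only the 50% threshold and the "same biotype at the next sample" rule come from the text. How an exact 50/50 tie between two genera would be resolved is not described in the source, so the sketch simply takes the first genus encountered.

```python
from typing import Dict, List

DOMINANCE_THRESHOLD = 0.5  # from the text: one genus with >= 50% of reads

def classify_biotype(genus_counts: Dict[str, int]) -> str:
    """Label a sample by its dominant genus, or as 'biodiverse' if none dominates."""
    total = sum(genus_counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    # Assumption: on an exact tie, the first genus encountered is used.
    genus, count = max(genus_counts.items(), key=lambda item: item[1])
    if count / total >= DOMINANCE_THRESHOLD:
        return f"{genus}-dominant"
    return "biodiverse"

def stability_flags(biotypes: List[str]) -> List[bool]:
    """For each sample except the last, True if the next sample has the same biotype."""
    return [current == following for current, following in zip(biotypes, biotypes[1:])]

# Hypothetical genus-level read counts for four consecutive visits of one infant.
visits = [
    {"Staphylococcus": 700, "Corynebacterium": 300},
    {"Corynebacterium": 400, "Moraxella": 350, "Dolosigranulum": 250},
    {"Moraxella": 600, "Dolosigranulum": 400},
    {"Moraxella": 900, "Streptococcus": 100},
]

biotypes = [classify_biotype(sample) for sample in visits]
stable = stability_flags(biotypes)
```

In this hypothetical series the infant moves from a Staphylococcus-dominant to a biodiverse to a Moraxella-dominant profile, and only the third sample counts as stable because the fourth sample has the same biotype.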

Fig. 3: State transitions of the nasopharyngeal microbiome during infancy.

Each nasopharyngeal sample was classified based on the dominant bacterial genus in that sample or, if no genus occupied ≥50% of the sequencing reads, the sample was classified as biodiverse. Each diagram depicts changes in infant nasopharyngeal microbiome biotype between two consecutive study visits. Arrows point to the directionality of microbiome transitions while ribbon widths represent the frequency of these transitions. The colors correspond to specific microbiome biotypes.

Environmental exposures influence nasopharyngeal microbiome composition during infancy

To identify early life factors that influence the nasopharyngeal microbiome, we used MaAsLin2 [52] to fit generalized linear mixed models evaluating associations between sociodemographic factors and environmental exposures and the abundances of specific bacterial genera within the nasopharyngeal microbiome (Supplementary Table 4). The most substantial microbiome composition changes were associated with recent antibiotic exposures, 13-valent pneumococcal conjugate vaccine (PCV-13) doses, breastfeeding, and the winter season (Fig. 4). Antibiotic exposures were associated with decreases in the relative abundances of several bacterial genera generally associated with respiratory health, including Corynebacterium and Lactobacillus [31, 53,54,55], and increases in the relative abundances of genera containing common respiratory pathobionts (Haemophilus, Moraxella, Streptococcus). Similarly, during winter months, the relative abundance of Corynebacterium declined and was accompanied by an increase in the relative abundance of Haemophilus. In contrast, breastfeeding was associated with an increase in the relative abundance of Corynebacterium and decreases in the relative abundances of Haemophilus, Moraxella, and Streptococcus. No significant differences in nasopharyngeal microbiome composition were observed by sex, low birth weight status, and urban residence.

Fig. 4: Associations between environmental exposures and the composition of the nasopharyngeal microbiome during infancy.

MaAsLin2 was used to fit log-transformed generalized linear mixed models evaluating associations between sociodemographic factors and environmental exposures and the relative abundances of bacterial genera within the infant nasopharyngeal microbiome. The coefficients from these models, which correspond to the relative effect sizes of associations, are shown for significant associations (q < 0.20) identified for (a) antibiotic exposures, (b) number of PCV-13 doses, (c) breastfeeding, and (d) the winter season. The number of PCV-13 doses was modeled as an ordinal variable such that the coefficients represent the relative effect sizes associated with each successive vaccine dose. Bacterial genera that increase in relative abundance with the exposure are shown as yellow bars. Bacterial genera that decrease in relative abundance with the exposure are shown as blue bars.

Corynebacterium species are associated with a lower risk of S. pneumoniae colonization

S. pneumoniae colonization was identified in 144 of 179 (80%) infants at a median (IQR) age of 71 (39, 126) days (Supplementary Fig. 1). To identify nasopharyngeal microbiome features that precede S. pneumoniae colonization, we first fit a Cox proportional hazards model to evaluate associations between nasopharyngeal microbiome biotypes and acquisition of S. pneumoniae at the next study visit (Table 2). Compared to a biodiverse microbiome profile, a Corynebacterium-dominant biotype was associated with a lower hazard of S. pneumoniae colonization [hazard ratio (HR): 0.43, 95% confidence interval (CI): 0.23–0.80]. In addition, the risk of S. pneumoniae colonization increased during winter months (HR: 1.48, 95% CI: 1.07–2.04) and with each additional child household member (HR: 1.21, 95% CI: 1.09–1.34), and declined with increasing number of PCV-13 doses (HR: 0.10, 95% CI: 0.06–0.16). To further explore the negative association between Corynebacterium species and S. pneumoniae, we classified samples into quartiles based on Corynebacterium relative abundance and evaluated the association between these sample quartiles and S. pneumoniae acquisition. The hazard ratio for pneumococcal colonization declined with each successive quartile increase in Corynebacterium relative abundance (Supplementary Table 5); compared to the lowest quartile, the highest quartile of Corynebacterium relative abundance was associated with a 69% lower hazard of S. pneumoniae colonization (HR: 0.31, 95% CI: 0.18–0.53).
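Two pieces of arithmetic in this paragraph can be sketched briefly: assigning samples to quartiles of Corynebacterium relative abundance, and reading a hazard ratio as a percent reduction in hazard. The sketch does not fit a Cox model. The quartile cut points use the default method of Python's statistics.quantiles, which is an assumption, since the source does not state how quartile boundaries were computed, and the abundance values are hypothetical.

```python
import bisect
import statistics

def quartile_labels(values):
    """Assign each value to quartile 1-4 using cut points from the values themselves.

    Assumption: cut points come from statistics.quantiles with its default
    ('exclusive') method; values equal to a cut point fall in the lower quartile.
    """
    cuts = statistics.quantiles(values, n=4)  # three cut points
    return [bisect.bisect_left(cuts, value) + 1 for value in values]

def percent_lower_hazard(hazard_ratio):
    """Express a hazard ratio below 1 as a percent reduction in hazard."""
    return (1 - hazard_ratio) * 100

# Hypothetical Corynebacterium relative abundances for eight samples.
abundances = [0.00, 0.01, 0.02, 0.05, 0.10, 0.20, 0.40, 0.80]
labels = quartile_labels(abundances)

# Hazard ratios reported in the text.
highest_quartile_reduction = percent_lower_hazard(0.31)  # about 69%
dominant_biotype_reduction = percent_lower_hazard(0.43)  # about 57%
```

By the same arithmetic, the hazard ratio of 0.43 reported for the Corynebacterium-dominant biotype corresponds to an approximately 57% lower hazard relative to a biodiverse profile.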

Table 2 Cox proportional hazards model analyses evaluating associations between microbiome “biotypes” identified in infant nasopharyngeal samples and the risk of acquisition of S. pneumoniae prior to the next study visit.

Identification of Corynebacterium species negatively associated with S. pneumoniae colonization

Given the diversity of Corynebacterium species isolated from the human upper respiratory tract [56,57,58], we sought to determine if specific species accounted for the negative association with S. pneumoniae colonization. We used BLAST searches to identify the species or group of closely related species (supraspecies) corresponding to each of the 279 Corynebacterium amplicon sequence variants (ASVs) identified in infant nasopharyngeal samples [59]. We operationally classified those ASVs into 45 unique Corynebacterium species or supraspecies (Supplementary Table 6), the most abundant of which were C. pseudodiphtheriticum/propinquum (90.9% sample prevalence, 10.4% mean relative abundance), C. accolens/macginleyi (72.1% sample prevalence, 5.8% mean relative abundance), and C. tuberculostearicum (54.5% sample prevalence, 1.2% mean relative abundance). Subsequent analyses demonstrated that the relative abundances of each of these three Corynebacterium species or supraspecies were inversely associated with the risk of S. pneumoniae colonization (Supplementary Table 7). These findings illustrate the enormous diversity of Corynebacterium species that colonize the human upper respiratory tract and indicate that multiple Corynebacterium species likely contribute to colonization resistance to S. pneumoniae.

Strain-specific secretion of antipneumococcal factors by Corynebacterium species

To further investigate the inhibition of S. pneumoniae by Corynebacterium species, we cultured infant nasopharyngeal samples on selective media and identified culture isolates to the species level using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS). We isolated a total of 35 morphologically distinct Corynebacterium strains from 21 infants, including strains of C. accolens (n = 21), C. tuberculostearicum (n = 5), C. pseudodiphtheriticum (n = 4), C. coyleae (n = 2), C. propinquum (n = 2), and C. striatum (n = 1). We then screened supernatants from cultures of each of these Corynebacterium strains for inhibition of growth of a reference strain (ATCC 6303; serotype 3) and a contemporary infant nasopharyngeal strain (05-160; serotype 11A) of S. pneumoniae. We observed inhibition of pneumococcal growth by cell-free supernatants from cultures of seven (20%) strains (Fig. 5), including strains of C. accolens (1 of 21; 5%), C. tuberculostearicum (3 of 5; 60%), C. coyleae (2 of 2; 100%), and C. striatum (1 of 1; 100%). While the characteristics and mechanisms underlying the inhibition require further investigation, these assays support a causal basis for the inverse relationship between Corynebacterium spp. and S. pneumoniae observed in microbiome data analyses. Furthermore, these data indicate that specific species and strains have the capacity to inhibit S. pneumoniae.

Fig. 5: Strain-specific inhibition of pneumococcal growth by Corynebacterium.

Two strains of S. pneumoniae, one reference strain (ATCC 6303; serotype 3) and one strain isolated from an infant nasopharyngeal sample (05-160; serotype 11A), were separately added to sterile, cell-free media from overnight cultures of Corynebacterium strains. Growth of these strains of S. pneumoniae was determined by OD600 readings that were normalized to blank media controls (TSB and BHIT-TSB). a Growth curves of S. pneumoniae strains in different media, including cell-free media from inhibitory strains of C. accolens (05-122), C. tuberculostearicum (05-144), and C. coyleae (05-104). b Growth curves of S. pneumoniae strains in different media, including cell-free media from strains of C. accolens (05-161), C. tuberculostearicum (05-150), and C. propinquum (05-124) that do not demonstrate pneumococcal growth inhibition. The decline in OD600 observed in these experiments after the peak density is reached reflects a transition of S. pneumoniae from the exponential growth phase to the autolysis phase. TSB tryptic soy broth, BHIT-TSB brain heart infusion medium with 0.2% Tween 80 diluted 50% with tryptic soy broth.


In this study, we described the nasopharyngeal microbiome dynamics during the first year of life and identified environmental factors that influence infant nasopharyngeal microbiome composition. We also found that higher abundances of Corynebacterium species in the nasopharyngeal microbiome are associated with a lower risk of S. pneumoniae colonization during infancy. Finally, we demonstrated inhibition of pneumococcal growth by secreted factors from strains of several Corynebacterium species.

We found that the composition of the nasopharyngeal microbiome of infants in Botswana bears similarities to that reported in previous studies of infants and young children in high-income countries [39, 41]. Specifically, independent of the resource setting, this low-diversity microbial community is typically comprised mostly of bacteria from six genera, with Corynebacterium and Staphylococcus predominating during the first several months of life and Dolosigranulum and Moraxella becoming more abundant later in infancy [39, 40]. These trends in nasopharyngeal microbiome composition during infancy appear to be highly conserved despite substantial differences in host characteristics, household exposures, and climate. However, some caution must be taken in these statements given that most studies of the infant nasopharyngeal microbiome have used 16S rRNA gene sequencing. Substantial functional or genomic differences may exist between the bacterial strains that colonize the upper respiratory tracts of children in geographically distinct human populations that would only be revealed with alternative methodologies. We also found that the nasopharyngeal microbiome of infants in Botswana is highly dynamic, although microbiome stability varies markedly based on the dominant genus. In particular, higher microbiome stability was observed with nasopharyngeal microbiome profiles that were dominated by Moraxella, while lower stability was seen with profiles that were dominated by Dolosigranulum, Haemophilus, or Streptococcus. Findings from studies conducted in high-income countries were broadly similar, although notably nasopharyngeal microbiome profiles with high abundance of Dolosigranulum were reported to be highly stable in these settings [39, 41]. Further research is needed to investigate associations between the presence and abundance of Dolosigranulum in the nasopharyngeal microbiome and child respiratory health in low- and middle-income countries.

Our findings also demonstrate the impact of early life environmental exposures on the nasopharyngeal microbiome. In particular, the composition of the nasopharyngeal microbiome of infants varied with season, as reported in several prior studies [40, 45], with winter months associated with a declining relative abundance of Corynebacterium. The association between season and the abundance of Corynebacterium in the nasopharyngeal microbiome could contribute to the higher incidence of S. pneumoniae colonization during the winter season observed in our cohort and reported in several prior studies [60,61,62,63]. In contrast, we found few changes in the infant nasopharyngeal microbiome associated with maternal HIV infection, suggesting that other factors may account for the increased risks of pneumococcal colonization and disease observed in HIV-exposed, uninfected infants [64,65,66]. Alternatively, the mild immunosuppression of mothers with HIV in our study may have limited the effect of HIV infection on the maternal upper respiratory microbiome; indeed, mothers with and without HIV in our cohort had similar nasopharyngeal microbiome composition (PERMANOVA on Bray-Curtis dissimilarity, p = 0.99). Feeding practices also influenced the composition of the nasopharyngeal microbiome of infants in this study, with breastfeeding promoting the enrichment of the nasopharyngeal microbiome with Corynebacterium species. Interestingly, previous studies have not reported an association between breastfeeding and pneumococcal colonization during infancy [6, 67], which could reflect the complexity of the effect of breastfeeding on the infant upper respiratory microbiome. Taken together, our findings indicate that the nasopharyngeal microbiome may be a previously unrecognized and potentially modifiable mechanism by which environmental factors influence the risk of pneumococcal infections during childhood.

We found that higher abundances of Corynebacterium species within the nasopharyngeal microbiome are associated with a lower risk of S. pneumoniae colonization during infancy. Moreover, we identified strains of multiple Corynebacterium species that secrete factors that inhibit pneumococcal growth in laboratory experiments. Corynebacterium is a diverse bacterial genus that includes common residents of the upper respiratory tracts, skin, and gastrointestinal tracts of humans and animals. Although >150 species of Corynebacterium have been identified to date [68], the most common species isolated from the human respiratory tract are C. accolens, C. pseudodiphtheriticum, C. propinquum, C. striatum, and C. tuberculostearicum [56,57,58]. With the exception of Corynebacterium diphtheriae, the etiological agent of diphtheria [69], Corynebacterium species only rarely cause human disease, often in the setting of compromised host immunity [70, 71], indwelling prosthetic material [71,72,73], or chronic respiratory diseases [74, 75]. This low pathogenicity of non-diphtheriae Corynebacterium species is a key characteristic that supports their further evaluation for use as biotherapeutics. Although these species are common members of the human microbiome, surprisingly little is known about how Corynebacterium species adhere to human mucosal surfaces and interact with co-occurring microbial species and the host immune system, or why they rarely cause invasive infection.

Prior studies demonstrated antagonistic relationships between Corynebacterium species and several important bacterial pathobionts. C. propinquum was recently recognized to produce siderophores that inhibit the growth of coagulase-negative staphylococci through iron restriction [76], a strategy that this species may use to colonize the upper respiratory tract. Intranasal administration of a C. pseudodiphtheriticum strain effectively eradicated S. aureus nasal carriage in adults [77], demonstrating the potential use of Corynebacterium species as respiratory probiotics. Strains of C. pseudodiphtheriticum also inhibited growth of M. catarrhalis in co-cultivation experiments [78]. Lower relative abundances of Corynebacterium species were previously observed in the upper respiratory tracts of children with S. pneumoniae colonization [6, 32, 33]. Moreover, Bomar et al. identified a C. accolens strain that inhibited S. pneumoniae through the production of a lipase that releases antipneumococcal free fatty acids from human skin triacylglycerols [33]. Our findings demonstrate the substantial strain-level heterogeneity that characterizes interactions between Corynebacterium spp. and S. pneumoniae and point towards the need to study large collections of Corynebacterium strains to understand the breadth and distribution of phenotypes exhibiting anti-pneumococcal activity.

Our study has several limitations. First, this study was conducted at urban and rural sites in southern Botswana, and may not be generalizable to infants living in other settings. We excluded infants who were born by caesarian delivery, and were thus unable to evaluate the effects of delivery mode on the nasopharyngeal microbiome or the risk of S. pneumoniae colonization. Interestingly, in a study of >2000 Fijian infants 5–8 weeks of age, the prevalence and density of pneumococcal colonization were higher among infants born by vaginal delivery than among infants delivered by caesarian birth [79]. Comparative analyses of microbiome composition were based on sample relative abundances, and we are thus unable to determine if the observed associations represent differences in the absolute amounts of bacterial genera or species within the nasopharynx across samples. Our study employed 16S rRNA gene sequencing to characterize the nasopharyngeal microbiome of infants and mothers; future studies incorporating shotgun metagenomic sequencing may provide a more detailed understanding of interspecies interactions within the upper respiratory tract. Although the analyses adjusted for a large number of potential confounders in evaluating associations between the nasopharyngeal microbiome and S. pneumoniae colonization, residual confounding remains possible. Finally, we have not yet identified the secreted factors by which specific Corynebacterium strains inhibited S. pneumoniae in growth inhibition assays, nor did we assess Corynebacterium strains for pneumococcal inhibition through non-secretome-based mechanisms.


In summary, we identify environmental exposures that shape the developing infant nasopharyngeal microbiome and may influence colonization resistance to S. pneumoniae. Moreover, we demonstrate an inverse relationship between the abundances of Corynebacterium species within the nasopharyngeal microbiome and the risk of S. pneumoniae colonization during infancy. Future studies to define the molecular mechanisms of these bacterial interactions could lead to development of the first rationally-designed biotherapeutics for the prevention of infections caused by S. pneumoniae.

Materials and methods


Botswana is a landlocked country in southern Africa with a semi-arid climate and a rainy summer season that typically occurs from November to March. The country’s under-five child mortality rate was estimated to be 41.6 per 1000 live births in 2019 [80]. Haemophilus influenzae type B (Hib) and 13-valent pneumococcal conjugate (PCV-13) vaccinations were included in the national immunization program in November 2010 and July 2012, respectively. Complete vaccine series coverage rates in 2019 were estimated to be 95% for Hib and 92% for PCV-13 [81]. Botswana’s capital and largest city, Gaborone, is located in the country’s South-East district and has a population of 231,626 based on a census conducted in 2011 [82]. The HIV prevalence among adults 15 to 49 years of age in Botswana was 20.7% in 2019 [83]. More than 95% of pregnant women with HIV in Botswana receive antiretroviral therapy, and the mother-to-child HIV transmission rate is estimated to be <2% [83].

Data and biospecimen collection

Mother–infant dyads (n = 179) were enrolled within 72 h of delivery between February 2016 and December 2018 at three sites: a referral hospital in Gaborone, a public clinic in a low-income urban neighborhood in Gaborone, and a public clinic in a rural village located ~15 km outside of Gaborone. Exclusion criteria included maternal age <18 years, infant birth weight <2000 g, multiple gestation pregnancy, and caesarian delivery. Participants were seen for monthly study visits until the infant was six months of age and every other month thereafter until the infant was 12 months of age. At all study visits, a caregiver questionnaire was administered and nasopharyngeal swabs were collected from mothers and infants by trained study personnel. Nasopharyngeal samples were placed directly into MSwab medium (Copan Italia), transported to the National Health Laboratory in Gaborone, and frozen at −80 °C within 4 h of collection. Testing for S. pneumoniae was performed using a quantitative PCR assay targeting the autolysin gene (lytA), as previously described [6, 84]. A dried blood spot was collected from HIV-exposed infants by heel prick at two months of age and, for infants who were breastfed for any duration, again at 12 months of age. These samples were tested for HIV-1 DNA using the Cobas AmpliPrep/Cobas TaqMan HIV-1 Qualitative Assay, version 2.0 (Roche) [85]. All study participants or their legal guardians provided written informed consent to participate in this study. The study protocol was approved by the Botswana Ministry of Health, the Princess Marina Hospital ethics committee, and institutional review boards at the University of Pennsylvania, Children’s Hospital of Philadelphia, McMaster University, and Duke University.

Processing of nasopharyngeal samples for 16S ribosomal RNA gene sequencing

The Duke Microbiome Core Facility extracted DNA from nasopharyngeal samples using Powersoil Pro extraction kits (Qiagen) following the manufacturer’s instructions. DNA concentrations were determined using Qubit dsDNA high-sensitivity assay kits (Thermo Fisher Scientific). Negative extraction and PCR controls were amplified with all four batches of samples included in analyses to assess background contamination. For the first two sample batches, these negative controls were verified to have no visible bands on gel electrophoresis. For the final two sample batches, the negative control samples were sequenced. Bacterial community composition was characterized by PCR amplification of the V4 variable region of the 16S rRNA gene using the forward primer 515 and the reverse primer 806 following the Earth Microbiome Project protocol [86]. These primers carry unique barcodes that allow for multiplexed sequencing. The 16S rRNA PCR products from all samples were quantified and pooled in equimolar amounts prior to sequencing. Sequencing was performed by the Duke Sequencing and Genomic Technologies Core Facility on a MiSeq instrument (Illumina, Inc.) configured for 250 base-pair paired-end sequencing. Raw sequences were trimmed using Trimmomatic version 0.36 [87], demultiplexed using QIIME2 tools [88], and processed through a pipeline that used DADA2 version 1.16 [89] to infer amplicon sequence variants (ASVs). ASVs were given taxonomic assignments based on alignment to the expanded Human Oral Microbiome Database version 15.1 [90]. The most abundant genera in negative control samples were Delftia, Brevundimonas, Bacteroides, and Leptothrix. Presumed reagent contaminant ASVs (n = 248) were identified and removed based on presence in negative control samples or negative correlation with DNA concentration using the frequency method (threshold = 0.10) implemented in the decontam R package version 1.12 [91]. Samples with <1000 sequencing reads after quality filtering were excluded.
Sequencing reads were classified into 7167 ASVs representing 200 genera from 8 phyla. For each of the 279 ASVs assigned to the genus Corynebacterium, we performed a standard nucleotide REFSEQ BLAST search using the National Center for Biotechnology Information’s Bacteria and Archaea 16S ribosomal RNA project database [59]. We assigned species information to ASVs using a best-hit approach based on the E value with a minimum percent identity of 97%.
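The best-hit rule described above can be illustrated with a minimal sketch. It assumes that hits are first filtered to those with at least 97% identity and that the remaining hit with the lowest E value is then selected; the order of these two steps, the tie-breaking rule, the record layout, and the example values are all assumptions made for illustration and are not taken from the study pipeline.

```python
from typing import Dict, List, NamedTuple, Optional


class BlastHit(NamedTuple):
    """One BLAST alignment for an ASV (hypothetical record layout)."""
    species: str
    percent_identity: float
    e_value: float


def best_hit(hits: List[BlastHit], min_identity: float = 97.0) -> Optional[BlastHit]:
    """Return the eligible hit with the lowest E value, or None if none qualifies."""
    eligible = [h for h in hits if h.percent_identity >= min_identity]
    if not eligible:
        return None
    # Assumption: ties on E value are broken by higher percent identity.
    return min(eligible, key=lambda h: (h.e_value, -h.percent_identity))


def assign_species(hits_by_asv: Dict[str, List[BlastHit]],
                   min_identity: float = 97.0) -> Dict[str, Optional[str]]:
    """Map each ASV identifier to a species name, or None if left unassigned."""
    assigned = {}
    for asv_id, hits in hits_by_asv.items():
        hit = best_hit(hits, min_identity)
        assigned[asv_id] = hit.species if hit is not None else None
    return assigned


# Made-up example values, for illustration only.
example_hits = {
    "ASV_0001": [
        BlastHit("Corynebacterium accolens", 99.2, 1e-120),
        BlastHit("Corynebacterium species X", 98.0, 1e-115),
    ],
    "ASV_0002": [BlastHit("Corynebacterium species Y", 95.0, 1e-90)],
}
example_assignments = assign_species(example_hits)
```

In this sketch, an ASV with no hit at or above the identity cutoff is left without a species label rather than being assigned to its nearest match.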

Statistical methods for analyzing infant nasopharyngeal microbiome diversity and stability

We calculated alpha (Shannon index and observed ASVs) and beta diversity (Bray-Curtis dissimilarity) using the phyloseq R package version 1.36 [92]. We used Wilcoxon signed-rank tests to compare microbiome alpha diversity across infant ages and between paired infant and maternal samples. To evaluate associations between infant age and alpha diversity measures, we used negative binomial mixed effect models with subject as a random effect to account for repeated sampling of individuals. We compared beta diversity with PERMANOVA using the adonis function within the vegan R package version 2.5.7 [93]; for comparisons of infant samples collected at different ages and between maternal and infant samples, we included a unique identifier for each mother–infant pair as a blocking variable [94]. For initial analyses of specific microbiome features, we aggregated ASVs at the genus level and classified a sample as dominated by a genus if 50% or more of the sequencing reads generated from this sample were assigned to that genus. Samples dominated by a genus other than the six most highly abundant genera were classified in a single “other” category. Samples for which no single genus accounted for at least 50% of the sequencing reads were classified as “biodiverse.” We considered a sample’s microbiome profile to be “stable” if the next visit’s sample from that infant was classified as the same biotype. We then used logistic regression to evaluate associations between specific nasopharyngeal biotypes and microbiome stability, adjusting for infant age in days.
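The biotype and stability definitions above can be expressed as a short sketch. The function names are hypothetical, and the set of six genera is an illustrative placeholder because this section does not list them; only the 50% dominance rule, the “other” and “biodiverse” categories, and the next-visit stability definition come from the text.

```python
from typing import Dict, List, Set

# Placeholder set: the six most abundant genera are not named in this section,
# so these entries are assumptions used only to make the example runnable.
TOP_GENERA: Set[str] = {
    "Corynebacterium", "Dolosigranulum", "Haemophilus",
    "Moraxella", "Staphylococcus", "Streptococcus",
}


def classify_biotype(genus_reads: Dict[str, int],
                     top_genera: Set[str] = TOP_GENERA,
                     threshold: float = 0.5) -> str:
    """Classify one sample from its genus-level read counts."""
    total = sum(genus_reads.values())
    if total <= 0:
        raise ValueError("sample has no reads")
    # Find the genus with the largest read count in this sample.
    genus, reads = max(genus_reads.items(), key=lambda kv: kv[1])
    if reads / total < threshold:
        return "biodiverse"  # no genus reaches 50% of reads
    return genus if genus in top_genera else "other"


def stability_flags(biotypes_by_visit: List[str]) -> List[bool]:
    """For each visit except the last, report whether the next visit's
    sample from the same infant has the same biotype."""
    return [current == following
            for current, following in zip(biotypes_by_visit, biotypes_by_visit[1:])]
```

For example, `classify_biotype({"Corynebacterium": 600, "Moraxella": 400})` returns `"Corynebacterium"`, and an infant whose consecutive samples are classified as Corynebacterium, Corynebacterium, and Moraxella yields stability flags of `[True, False]`. The last sample in a series has no following visit and therefore contributes no stability outcome.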

Statistical methods for analyzing infant nasopharyngeal microbiome composition

We used MaAsLin2 version 1.6 [52] to fit linear mixed models on log-transformed relative abundances, evaluating associations between sociodemographic factors and environmental exposures and the relative abundances of bacterial genera within the infant nasopharyngeal microbiome. These analyses considered the following variables identified based on a literature review: sex, low birth weight (<2500 g), HIV exposure status, location of residence (urban vs. rural), household use of solid fuels, number of other child household members (<5 years of age), season (summer vs. winter), breastfeeding, number of PCV-13 doses, and systemic antibiotic exposures [39,40,41, 44, 45, 95,96,97]. MaAsLin2 analyses were limited to bacterial genera present in at least 10% of nasopharyngeal samples. The models included subject as a random effect. Comparisons were corrected for the false discovery rate using the Benjamini–Hochberg procedure, with a q value threshold for significance of 0.20. To identify compositional features of the nasopharyngeal microbiome that influence the risk of pneumococcal colonization, we first fit a Cox proportional hazards model evaluating the association between nasopharyngeal microbiome biotype and S. pneumoniae colonization detected at the subsequent study visit. Given the observed negative association between a Corynebacterium-dominant biotype and pneumococcal colonization in this model, additional Cox proportional hazards models were fit to evaluate the association between the relative abundance of Corynebacterium, modeled at the genus level and separately for specific highly abundant species, and S. pneumoniae colonization. Two (1%) infants who were colonized with S. pneumoniae at the birth visit were excluded from these analyses. In addition, infants who were not colonized with S. pneumoniae at any study visit were censored at the last visit for which S. pneumoniae colonization data were available. Time was modeled as a continuous variable corresponding to infant age in days.
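The exclusion and censoring rules for the Cox models can be sketched as a function that derives one time-to-event outcome per infant. This covers only the outcome definition, not the time-dependent covariates. The record layout, the function name, and the 3-day window used to identify the birth visit (based on enrollment within 72 h of delivery) are assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Visit(NamedTuple):
    age_days: int                  # infant age at the visit, in days
    spn_positive: Optional[bool]   # lytA PCR result; None if unavailable


class Outcome(NamedTuple):
    time_days: int   # age at first detection, or at censoring
    event: bool      # True if S. pneumoniae colonization was detected


def time_to_first_colonization(visits: List[Visit],
                               birth_window_days: int = 3) -> Optional[Outcome]:
    """Return the infant's outcome, or None if the infant is excluded."""
    # Keep only visits with colonization data, in age order.
    tested = sorted((v for v in visits if v.spn_positive is not None),
                    key=lambda v: v.age_days)
    if not tested:
        return None  # no colonization data available
    # Assumption: a visit at or before birth_window_days is the birth visit.
    if any(v.spn_positive and v.age_days <= birth_window_days for v in tested):
        return None  # colonized at the birth visit: excluded
    for v in tested:
        if v.spn_positive:
            return Outcome(v.age_days, True)   # first detection is the event
    # Never colonized: censor at the last visit with colonization data.
    return Outcome(tested[-1].age_days, False)
```

An infant tested negative at days 1 and 30 and positive at day 61 would contribute an event at 61 days, whereas an infant with negative results at days 1 and 30 and a missing result at day 60 would be censored at 30 days.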
All Cox proportional hazards models were implemented using the survival R package version 3.2.11 [98] and were adjusted for all previously specified sociodemographic factors and environmental exposures, with season, breastfeeding, number of PCV-13 doses, and systemic antibiotic exposures modeled as time-dependent covariates.
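For the MaAsLin2 comparisons described earlier in this section, the Benjamini–Hochberg adjustment can be illustrated with a generic sketch. This is a plain implementation of the standard procedure, not MaAsLin2’s internal code, and treating a q value equal to the 0.20 threshold as significant is an assumption.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


def significant(p_values: List[float], q_threshold: float = 0.20) -> List[bool]:
    """Flag comparisons whose q value is at or below the threshold."""
    return [q <= q_threshold for q in benjamini_hochberg(p_values)]
```

With made-up p values of 0.01, 0.04, 0.03, and 0.5, the adjusted q values are approximately 0.04, 0.053, 0.053, and 0.5, so the first three comparisons would pass a 0.20 threshold.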

Laboratory experiments evaluating pneumococcal growth inhibition by Corynebacterium strains

To isolate Corynebacterium strains from infant nasopharyngeal samples, we streaked 10 µL of sample on plates containing brain heart infusion (BHI) medium (Sigma–Aldrich) supplemented with 50 µg/mL of fosfomycin disodium (Fisher Scientific) and 1% Tween 80 (VWR). We subcultured single bacterial colonies to 5% sheep blood agar plates (Fisher Scientific) and identified bacterial species using a VITEK MS automated mass spectrometry microbial identification system (bioMérieux). We then used cell-free growth inhibition assays to screen Corynebacterium strains for the secretion of anti-pneumococcal factors. Corynebacterium strains were grown in 20 mL of BHI medium supplemented with 0.2% Tween 80 for 12–18 h at 37 °C and 5% CO2. These cultures were centrifuged at 3000 rpm for 10 min to pellet the cells, and the supernatants were sterile-filtered with a 0.22-µm filter. The resulting cell-free media was diluted 50% in tryptic soy broth (TSB; Fisher Scientific). Glycerol stocks of two strains of S. pneumoniae, a reference strain (ATCC 6303; serotype 3) and a strain isolated from an infant nasopharyngeal sample (05-160; serotype 11A), were separately diluted 1:50 into the diluted cell-free media. Growth was assessed by OD600 readings relative to blank media controls for each growth medium over 24 h.