Association between clinical and environmental factors and the gut microbiota profiles in young South African children

Differences in the microbiota between populations across age groups and geographical locations complicate cross-study comparisons, and it is therefore essential to describe the baseline or control microbiota in each population. This includes determining the influence of demographic, clinical and environmental factors on the microbiota in a given setting, which elucidates possible bias introduced by these factors prior to further investigations. Little is known about the microbiota of children in South Africa after infancy. We provide a detailed description of the gut microbiota profiles of children from urban Cape Town and describe the influences of various clinical and environmental factors in different age groups during the first 5 years of life. Prevotella was the most common genus identified in the participants, and after infancy, the gut bacteria were dominated by Firmicutes and Bacteroidetes. In this setting, children exposed to antibiotics and indoor cooking fires were at the greatest risk for dysbiosis, showing significant losses in gut bacterial diversity.

Demographic, clinical and environmental data collection. Extensive systematic data collection was undertaken at the baseline visit through clinical interview, household questionnaire and physical examination. Participant demographics and clinical and environmental factors such as HIV status, medical history (including mode of birth, antibiotic use, deworming and hospitalisation), exposure to cigarette smoke, indoor cooking fires, breastfeeding and exposure to pets and day-care were recorded. Mid-upper arm circumference (MUAC) scores were used as an indicator of malnutrition in participants > 6 months. A MUAC of > 13.5 cm indicates a low risk for malnutrition, while a MUAC between 12.5 and 13.5 cm indicates mild risk for malnutrition. The following were used as indicators of socio-economic status: housing structure type, ablution type and drinking water supply. Where relevant, children were grouped into three age bands: A (0-1 years), B (> 1 to 2 years) and C (> 2 to < 5 years), to account for the highly evolving nature of the gut microbiota during early childhood.
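The grouping rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, the handling of boundary values (for example a MUAC of exactly 13.5 cm) is an assumption, and MUAC values below 12.5 cm are left unclassified because the text does not define a category for them.

```python
from typing import Optional


def age_band(age_years: float) -> Optional[str]:
    """Map an age in years to the bands described in the text.

    A: 0-1 years, B: > 1 to 2 years, C: > 2 to < 5 years.
    Returns None for ages outside these bands.
    """
    if 0 <= age_years <= 1:
        return "A"
    if 1 < age_years <= 2:
        return "B"
    if 2 < age_years < 5:
        return "C"
    return None


def muac_risk(muac_cm: float, age_months: float) -> Optional[str]:
    """Classify malnutrition risk from MUAC for participants older than 6 months.

    Assumption: 12.5 and 13.5 cm are both treated as 'mild risk'.
    Values below 12.5 cm are not categorised in the text, so they are
    reported as unclassified rather than guessed.
    """
    if age_months <= 6:
        return None  # MUAC was only used for participants > 6 months
    if muac_cm > 13.5:
        return "low risk"
    if muac_cm >= 12.5:
        return "mild risk"
    return "unclassified (< 12.5 cm)"
```

For example, `age_band(1.5)` gives band B and `muac_risk(13.0, 24)` gives "mild risk" under these assumptions.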
Sample collection. Stool samples (1 per child) were collected in sterile 25 ml faecal containers with spoons (Lasec, South Africa) without preservative and transported to the laboratory in cooler boxes with ice packs within 72 h, where they were homogenised and immediately stored at − 80 °C. Samples were stored on ice packs (n = 49) or in a fridge at 2-8 °C (n = 59) before transport. In a small number of cases (n = 8) no such facilities were available, and samples were stored at room temperature. Sample weight and consistency prior to homogenisation were also recorded.
DNA extraction and sequencing. DNA was extracted using the QIAamp PowerFecal DNA Isolation Kit (Qiagen, Germany) according to the manufacturer's instructions. Samples that did not meet the purity requirements of the sequencing provider were subjected to ethanol-based purification. The extracted DNA was frozen at − 20 °C until delivery on ice to the Centre for Proteomic and Genomic Research (CPGR) where 16S rRNA gene amplicon sequencing was performed on the Illumina MiSeq platform. Previously published primers 16 were used to target the V4 hypervariable region. The MiSeq Reagent v3 Kit (600 cycles) was used to generate sequencing libraries, which were spiked with 10% of a 5 pM PhiX sequencing control. The ZymoBIOMICS Microbial Community DNA standard (Zymo Research, USA) and batched negative extraction controls were included in the sequencing run, which was set to produce 2 × 200 bp paired-end reads. The DNA purity requirements and sequencing controls are described in detail in the "Supplementary material".
Sequence analysis and statistical testing. Sequence reads were imported into QIIME2 24 , followed by error correction, quality filtering and chimera removal using the dada2 plug-in 25 .
Taxonomic profiling. A Naïve Bayes classifier trained on the SILVA 138 99% OTU V4 region database (https://www.arb-silva.de/silva-license-information/) was used to assign taxonomy 26,27 . Taxonomic features appearing in fewer than five samples, as well as unassigned and non-bacterial features, were filtered prior to visualizing the taxonomy bar plots.
Alpha and beta diversity testing. Rarefaction plots were generated with feature frequency and Shannon diversity, and five samples were excluded from all alpha and beta diversity analyses after rarefying to a sequencing depth of 52,653. Univariate analysis was performed for individual factors; samples with no data were automatically excluded, and the p value for significance was set at 0.05 for all calculations. The evenness and richness (alpha diversity) within groups (based on the factors defined under the demographic, clinical and environmental data section) were calculated using the Shannon (H) and Faith's Phylogenetic Diversity (PD) diversity metrics 28,29 and statistical significance between groups was determined using Kruskal-Wallis pairwise tests and Benjamini-Hochberg False Discovery Rate (BH-FDR) multiple test correction where appropriate 30,31 . Spearman correlation analysis was performed for numerical data 32 . The differences in microbial communities between groups were investigated by principal coordinate analysis (PCoA) using the Bray-Curtis, and unweighted and weighted UniFrac dissimilarity metrics 33,34 . Significance of differences was determined by PERMANOVA 35 with BH-FDR adjustment where appropriate.
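To make three of the quantities named above concrete, the following minimal sketch computes the Shannon index, the Bray-Curtis dissimilarity and BH-FDR adjusted p values in plain Python. It is an illustration under stated assumptions rather than the pipeline used in the study: the function names are hypothetical, the Shannon index is computed in bits (log base 2) by assumption, and Faith's PD, UniFrac, Kruskal-Wallis and PERMANOVA are not reproduced here because they need a phylogenetic tree or permutation testing.

```python
import math
from typing import List, Sequence


def shannon(counts: Sequence[float]) -> float:
    """Shannon diversity H = -sum(p_i * log2(p_i)) over features with non-zero counts.

    Assumption: log base 2 is used; other bases only rescale H.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log2(p)
    return h


def bray_curtis(u: Sequence[float], v: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    if len(u) != len(v):
        raise ValueError("samples must have the same features")
    denom = sum(a + b for a, b in zip(u, v))
    if denom == 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(u, v)) / denom


def benjamini_hochberg(pvals: Sequence[float]) -> List[float]:
    """Return BH-FDR adjusted p values in the original order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

A feature table with four equally abundant features gives a Shannon index of 2 bits, and two samples sharing no features give a Bray-Curtis dissimilarity of 1. An adjusted p value below 0.05 would then be read as significant, in line with the threshold stated above.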
Differential abundance analysis. Analysis of composition of microbiomes (ANCOM) 36 was used to identify differentially abundant features at genus level for factors such as age, medication use, method of birth, breastfeeding and exposure to pets and smoke. The selection of factors to investigate for possible bacterial biomarkers was guided by clustering in the PCoA space following beta diversity analysis. For all ANCOM analyses, recommended filtering was performed to include only features present ≥ 20 times overall and in at least 25% of the samples, to remove noise caused by less abundant features.
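The pre-ANCOM filtering rule can be illustrated with a short sketch. It assumes that "present ≥ 20 times overall" means a total count of at least 20 across all samples and that prevalence is the fraction of samples with a non-zero count; the function name and the dictionary-based table layout are hypothetical and are not taken from the study.

```python
from typing import Dict, List


def filter_features(
    table: Dict[str, List[int]],
    min_total: int = 20,
    min_prevalence: float = 0.25,
) -> Dict[str, List[int]]:
    """Keep features with total count >= min_total that are present in at
    least min_prevalence of the samples.

    `table` maps a feature (e.g. a genus name) to its per-sample counts;
    every list is assumed to cover the same samples in the same order.
    """
    kept = {}
    for feature, counts in table.items():
        n_samples = len(counts)
        if n_samples == 0:
            continue
        total = sum(counts)
        present = sum(1 for c in counts if c > 0)
        if total >= min_total and present / n_samples >= min_prevalence:
            kept[feature] = counts
    return kept
```

With eight samples, a genus counted 10 times in each of two samples passes both thresholds, whereas a genus with 50 reads in a single sample fails on prevalence and a genus with 15 reads in total fails on abundance.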

Results
Participant demographics. Stool
Sequencing and taxonomy profiles. A total of 9,481,364 paired reads were imported into QIIME2 and no trimming was performed due to high quality across all bases in both forward and reverse reads (median ≥ Q30). After quality filtering and chimera removal, a total of 8,019,511 paired reads remained (7,766,477 excluding controls, median: 67,517 reads/sample, IQR: 61,452-72,667). Taxonomic profiles were generated for each sample at phylum level and ordered by age group (Fig. 1). At phylum level, children younger than 1 year had a higher relative abundance of Proteobacteria (16.6%) and Actinobacteriota (9.8%) compared to those older than 2 years (Proteobacteria: 4.6%, Actinobacteriota: 2.9%). In children between the age of 2−5 years, the profiles were dominated by Firmicutes (45.1%) and Bacteroidota (44.9%). At genus level, there was a notable abundance of Bifidobacterium, Escherichia-Shigella and Veillonella in infants, whereas the most prevalent genus overall was Prevotella (Fig. 2).
Alpha diversity. Table S2). The upper three 1-year age bands did not differ significantly from each other, indicating a stabilization of within-sample diversity. There were no significant differences in alpha diversity due to sex (M/F) or due to the mother's HIV status. The method of sample storage, sample consistency and sample weight did not impact alpha diversity.
For factors more directly linked to age, such as method of birth (caesarean (C/S) vs normal vaginal delivery (NVD)), feeding pattern and day-care exposure, the data were stratified into three age groups: group A (0-1; n = 24), group B (> 1 to 2; n = 25) and group C (> 2 to 5; n = 66). None of these age-related factors were associated with changes in alpha diversity in any of the age groups, including method of birth (C/S vs NVD), premature birth, breastfeeding in the first 6 months (including exclusive breastfeeding), duration of breastfeeding (in months) and age of solid food introduction. In addition, none of the factors relating to day-care exposure, including the age at which day-care was started, size of day-care groups or hours spent in day-care, were associated with differences in alpha diversity in the oldest age group (Supplementary Tables S2 and S3).
Alpha diversity was significantly lower in those who had received antibiotics in the 2 weeks prior to sample collection (H: Fig. 3B, p = 0.005), but no difference was seen between children who had or had not received antibiotics in the last 6 months. Children who had received treatment from a traditional healer in the last 3 years had lower diversity (Fig. 3B; p = 0.01), while those who had been dewormed in the 6 months prior to sample collection had significantly higher diversity (Fig. 3B; p = 0.029). Vitamin A supplementation in those older than 6 months, hospitalisation in the 6 months prior to sample collection and hospitalisation in the first 6 months of life did not affect alpha diversity.
While there were no significant differences between children exposed to cigarette smoke during pregnancy or in the household, those exposed to an indoor cooking fire using wood or paraffin (kerosene) had significantly lower diversity (Fig. 3C; p = 0.004). Children who lived in a home with pets (cats or dogs) had significantly higher diversity (Fig. 3C; p = 0.03). The type of housing structure, ablutions and drinking water supply did not affect alpha diversity.
Beta diversity. Like richness and evenness, differences in age were also associated with dissimilarity in the gut microbial community. The PCoA plots generated by the Bray-Curtis (BC), unweighted UniFrac (UWU) and weighted UniFrac (WU) dissimilarity metrics showed distinct clustering of samples in age groups A (0-1 years) and C (> 2 to 5 years), and a transitional cluster representing age group B (> 1 to 2 years) (Fig. 4). PERMANOVA testing showed that groups A and B were dissimilar to group C and each of its subgroups (> 2 to 3 years, > 3 to 4 years, and > 4 to 5 years), and that the subgroups of group C did not differ from each other.
Differential abundance with ANCOM. Differential abundance analysis was performed to identify possible bacterial biomarkers associated with the differences between groups following alpha- and beta-diversity analysis. These analyses support the observed age-related differences in alpha and beta diversity, and the transitional nature of the microbiome during infancy. Sixteen differentially abundant genera were identified between groups A and C, 7 between groups A and B, and 5 between groups B and C (Fig. 5). When comparing the age groups A and C, only three features (an uncultured genus-level group (UCG) from the Oscillospiraceae (UCG-002), Agathobacter and a genus group in the Lachnospiraceae family) were more abundant in the older age group. A number of features from the Actinobacteriota and Proteobacteria were significantly more abundant in the youngest age group. There was a significant increase in genera from the class Clostridia in group B, as compared to group A. The differentially abundant genera between groups B and C belong to the same phylum and therefore do not represent major global changes in the microbiota. Prevotella was enriched in those not exposed to antibiotics in the 2 weeks prior to sample collection and Haemophilus was more abundant in those who had been dewormed. The differences in microbiota composition due to antibiotic treatment and deworming are shown in Supplementary Fig. S1. Other clinical and environmental factors that were associated with changes in diversity did not have any significant bacterial biomarkers. Factors that have been known to influence the microbiome in the first year of life, such as method of birth, prematurity and breastfeeding, were not associated with any differentially abundant features in group A or the older age groups.

Discussion
This study characterized the bacterial gut microbiota profiles of children from urban communities in Cape Town, South Africa. Diversity and differential abundance analyses were performed in the context of various demographic, clinical and environmental factors. It was found that the richness and evenness of the bacterial microbiota in this population stabilize after infancy and that age is the main driver of differences between participants.
The gut microbiota of children in this study shows similar maturation patterns to those described previously in both urban and rural settings, where a shift occurs from Proteobacteria and Actinobacteria such as Escherichia and Bifidobacterium in early life, to Firmicutes/Bacteroidota dominated profiles represented by Prevotella and Bacteroides after infancy 11,37,38 . Prevotella was the most common genus in older children (> 2 years) in our study; this is comparable to studies done in other developing countries 11,39,40 . A recent South African study that evaluated the association between diet and atopic dermatitis in children found that the higher abundance of Prevotella in healthy controls may indicate a protective effect against atopic dermatitis 18 . There was no depletion of Bacteroides, as has been noted in some child and adult studies in rural African communities 41,42 . At phylum level, the microbiota profiles in this study were more similar to those of children in studies from the United States of America (USA) than those from Asia, Europe and Africa, with an equal distribution of Firmicutes and Bacteroidota 11,39,40 . This intermediate state between the microbial profiles of traditionally non-western microbiomes and western microbiomes has also been described in South African adults and has been attributed in part to urbanization and changes in diet 15 . A limitation of our study was that the dietary intake of the children was not recorded, and the impact of different diets on the gut microbiota could therefore not be investigated in this population. We found no significant differences in diversity or bacterial features when considering age-related factors known to influence the gut microbiota in early life, such as method of birth, prematurity, breastfeeding and day-care exposure. This may be related to the smaller number of participants when stratifying by age group, which limited the statistical analysis.
A significant increase in potential short-chain fatty acid (SCFA) producing taxa was noted after the first year of life, including members of the Lachnospiraceae, Oscillospiraceae and Ruminococcaceae families. Agathobacter, Butyricicoccus and Lachnospiraceae NK4A136 group were among the taxa which increased in abundance with age; these taxa have previously been identified as potential probiotic targets [43][44][45] . Faecalibacterium, another taxon associated with good health in adults 46 , represented a substantial proportion of the gut bacteria in this study.
Reduced microbial diversity was associated with antibiotic treatment within 2 weeks of sample collection, visiting traditional healers within 3 years and living in a home with an indoor wood or paraffin cooking fire, which may have placed children at risk for dysbiosis. It should be noted that all but one child in the small group who had visited traditional healers were below the age of two, and lower diversity in this univariate evaluation is thus likely to be driven by age. Of note, all of these children had visited the healers due to minor issues such as colic. In this study, only Prevotella was found to be significantly reduced in those who had recently taken antibiotics. No significant differences in microbial profiles or diversity were observed in children exposed to antibiotics within 6 months of sample collection, suggesting resilience and recovery of the microbiome after short-term antibiotic use. However, these children are exposed to MDR-TB in the household, may be biased toward living in poorer socio-economic conditions and they are likely to have a higher rate of antibiotic consumption and hospitalisation in the longer term, which may impact the microbiome. The possible selection of resistant organisms has not been considered here, but there is contradictory evidence regarding the long-term effect of antibiotic use on the levels of resistance in the gut [47][48][49] , highlighting the need for more long-term resistome surveillance studies. For populations such as the one included in this study, this is particularly important.
Consistent with a pilot study in our setting 50 , the different storage methods used in this study did not significantly influence the microbiota. We also did not find any associations between diversity and sample consistency and weight. Further studies with larger sample sizes need to be conducted to explore the potential benefits of unrefrigerated sample transport for studies set in remote or resource-limited settings.
Exposure to indoor cooking fires rather than cigarette smoke was associated with a reduction in alpha diversity and significant bacterial community dissimilarity between samples, although no differentially abundant features could be identified. Wood and kerosene fires produce high levels of household pollutants, including a mixture of gases and particulate matter (PM). The WHO attributes 3.8 million deaths a year to household air pollution 51 , yet the relationship between PM and the microbiome is not well studied. Liu and colleagues recently reported that exposure to certain PM was associated with a reduction in alpha diversity 52 ; however, results in animal studies have been conflicting 53,54 . Further studies are needed to improve our understanding of the impact of the use of solid fuel and kerosene on gut microbiome-related health in low- and middle-income countries.
In this study, deworming within 6 months of sample collection was associated with increased overall diversity and an increase in the abundance of Haemophilus. A Kenyan study determined that albendazole-based deworming resulted in an increase in Clostridiales and reduction of Enterobacterales; however the overall diversity decreased post-treatment 55 . In comparison, a study that investigated various anthelminthic agents did not observe any significant changes in alpha diversity but did find an association between combination tribendimidine and ivermectin therapy, and an increase in the phylum Bacteroidetes 56 . Similarly, alpha diversity was not affected by deworming therapy in hospitalised South African infants, but the bacterial gut microbiota communities of dewormed infants were found to be dissimilar to those who had not been dewormed 19 . Interestingly, a decrease in the relative abundance of Bacteroidetes has been associated with individuals who remained infected with helminths following deworming therapy 57 . As noted in this study, Yang et al. found that deworming resulted in a short-term increase in bacterial diversity, and additionally noted an increase in the probiotic Bifidobacterium, and decrease in Fusobacteria; however, this was dependent on the pre-treatment microbial profile 58 . They also suggested a possible link to increased secretory IgA (SIgA) following treatment 58 . In a recent study, people with SIgA deficiency were shown to have significantly less diverse microbiomes when compared to healthy controls 59 . SIgA is also known to aid in pathogen control and limit inflammation in the gut, thereby contributing to gut homeostasis 60 . However, the relationship between SIgA and deworming has not been established and needs further evaluation. There are no reports in the literature linking Haemophilus abundance to deworming.
A study performed on South African infants determined that Haemophilus was a candidate gut pathogen associated with respiratory disease in infants 19 , but the clinical significance of Haemophilus outside of specific disease contexts is unknown. Further, we were only able to include 59 of 115 children in this analysis due to discrepant answers on the questionnaires, and did not evaluate the helminth infection status, which has been associated with specific microbiome profiles 61 . The numerous benefits of deworming in child health, such as control of anaemia and a lower risk of stunting and malnutrition, must also be considered and may mediate perceived negative microbiome effects.
Like deworming, living with pets was also linked to an increase in gut microbial diversity. This relationship has previously been shown in infants, and changes in the abundance of a number of taxa have been associated with exposure to pets 62,63 ; however, this was not seen in this study.
This cohort presented the rare opportunity to describe the microbiota of relatively healthy children from South African communities. The differences in the microbiota during the first 5 years of life in this setting are similar to what has been described elsewhere, and the abundance of taxa previously associated with good health indicates a low level of dysbiosis in this population. Those exposed to antibiotics and indoor wood/kerosene cooking fires were at the greatest risk for dysbiosis with significant losses in community diversity. This highlights the importance of continued and increased microbial surveillance in populations exposed to antibiotics and indoor cooking fires.
Because comparisons between 16S rRNA gene sequencing studies are difficult, owing to the variance introduced by DNA extraction, sequencing and taxonomic profiling strategies, further functional studies are necessary to elucidate the full impact of environmental factors on the metabolic and immunological pathways connected to the microbiome. The extensive baseline metadata analysis performed here provides the ideal platform for future studies in this setting that will aim to identify the long-term impact of fluoroquinolones on the child gut microbiome and resistome, once the parent trial has been completed and the researchers can be unblinded.