Comprehensive characterization of maternal, fetal, and neonatal microbiomes supports prenatal colonization of the gastrointestinal tract

In this study, we aimed to comprehensively characterize the microbiomes of various samples from pregnant women and their neonates, and to explore the similarities and associations between mother-neonate pairs, sample collection sites, and obstetrical factors. We collected vaginal discharge and amniotic fluid from pregnant women, and umbilical cord blood, gastric liquid, and meconium from neonates. We identified 19,597,239 bacterial sequences from 641 samples of 141 pregnant women and 178 neonates. By applying rigorous filtering criteria to remove contaminants, we found evidence of microbial colonization in intrauterine environments traditionally considered sterile and in the fetal gastrointestinal tract. The microbiome distribution was strongly grouped by sample collection site rather than by mother-neonate pair. The distinct bacterial composition in meconium, the first stool passed by newborns, supports the view that microbial colonization occurs during normal pregnancy. The microbiome in neonatal gastric liquid was similar, but not identical, to that in maternal amniotic fluid, as expected since fetuses swallow amniotic fluid in utero and their urine returns to the fluid under normal physiological conditions. Establishing a microbiome library from various samples formed only during pregnancy is crucial for understanding human development and identifying microbiome modifications in obstetrical complications.


Maternal and neonatal microbiome landscape during delivery
We identified 19,597,239 bacterial sequences and 22,412 unique amplicon sequence variants (ASVs) from 641 samples, including cervicovaginal discharge (VD; n = 154), amniotic fluid (AF; n = 40), gastric liquid (GL; n = 100), umbilical cord blood (CB; n = 125), meconium (M; n = 160), and negative controls (NC; n = 62). The ASVs were taxonomically annotated, but we found evidence of batch effects in our sequencing data for all sample types except VD (Fig. 1 and Supplementary Fig. S1). The batch effects were likely introduced during library construction for next-generation sequencing (NGS) and not during sequencing itself (Supplementary Fig. S2). This was expected because our samples were collected from low-biomass body sites, which makes them prone to contamination 20. We therefore expected many false positives and applied a series of filters, as outlined in Supplementary Fig. S3. Notably, we found and removed 203 ASVs that were statistically determined to be contaminants because they were highly prevalent in negative controls (Supplementary Fig. S4) or showed higher frequencies in low-concentration samples (Supplementary Figs. S5 and S6).
We measured the alpha diversity of the samples by calculating Shannon indices (Fig. 2A). The alpha diversity decreased in the following order: GL, AF, M, CB, and VD. The negative control group showed slightly higher diversity than the VD group, suggesting that negative controls for 16S amplicon sequencing can have a microbiome diversity as rich as that of real biological specimens. Next, we estimated the beta diversity of our samples by computing the weighted UniFrac distances (Fig. 2B). The samples were moderately well separated by sample collection site when projected using principal coordinates analysis (PCoA).
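As a minimal sketch of the Shannon index used above, the following computes H = -Σ p_i log(p_i) from the ASV counts of a single rarefied sample. The function name, the logarithm base of 2, and the example counts are illustrative assumptions and are not taken from the study's pipeline or data.

```python
import math

def shannon_index(counts, base=2.0):
    """Shannon entropy H = -sum(p_i * log(p_i)) over ASVs with nonzero counts.

    `base` is an assumption here; the choice of base only rescales H.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h

# Hypothetical rarefied ASV counts (not data from this study).
even_sample = [25, 25, 25, 25]      # four equally abundant ASVs
dominated_sample = [97, 1, 1, 1]    # one ASV dominates the sample

print(shannon_index(even_sample))
print(shannon_index(dominated_sample))
```

A sample dominated by a single taxon yields a lower index than an evenly distributed one, which is the pattern one would expect for a community dominated by one genus.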

Clinical relevance of microbiome in pregnancy
Table 1. Clinical characteristics of the study population. BMI, body mass index; IVF-ET, in-vitro fertilization and embryo transfer; IUI, intrauterine insemination. Values are expressed as the median (interquartile range) for continuous variables and percentage for categorical variables. a The denominator is the number of newborns. b Asthma for three cases, allergic rhinitis, angioedema, and cholinergic urticaria. c Major depressive disorder for two cases, anxiety disorder, and panic disorder.
To better understand the sources of variation seen in the beta diversity of our samples, we carried out permutational multivariate analysis of variance (PERMANOVA) using different factors, including clinical information. As shown in Table 2, when all sample types were included in the analysis, the variable "Site" explained 17.2% of the variation (p-value = 0.001) and the variable "LibraryMonth" explained 7.4% (p-value = 0.002). This result indicates that the samples could still be separated well based on the microbiome pattern unique to their body site, despite the significant batch effects present within our dataset. When the analysis was restricted to each sample type, the variable "LibraryMonth" was significant for all sample types except the VD group. The explanatory power increased to a range between 24.5 and 48.9%. These results align with the hypothesis that our samples are predominantly low-biomass specimens and prone to contamination. Additionally, the variable "DeliveryMethod" was significant for the VD group, the variables "PretermBirth37" and "AntibioticsUse" for the M group, and the variable "Weight" for the CB group (Fig. 3). We explored the significant variables in each group using PCoA with weighted UniFrac distance. Several ASVs of Lactobacillus and one ASV of Gardnerella were found in the VD group. In the M group, Staphylococcus showed a strong association with preterm birth. Lastly, lists of bacterial taxa were associated with the weights of neonates in the CB group. Table 3 shows the results of the analysis of composition of microbiomes (ANCOM), which tested for statistically significant associations between clinical variables and bacteria in multiple sample types.
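To illustrate how PERMANOVA attributes a share of beta-diversity variation to a factor such as "Site", here is a minimal single-factor sketch in plain Python. It partitions the sum of squared distances into within-group and between-group parts, reports R² (the "explained variation") and a pseudo-F, and derives a p-value by permuting labels. It is not the "qiime diversity adonis" implementation, it handles only one categorical factor, and the distance matrix and labels are made-up toy values.

```python
import random
from itertools import combinations

def ss_within(dist, labels):
    """Sum over groups of (squared pairwise distances within the group) / group size."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    total = 0.0
    for members in groups.values():
        total += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    return total

def permanova(dist, labels, n_perm=999, seed=0):
    """Return (R2, pseudo-F, permutation p-value) for one categorical factor."""
    n = len(labels)
    a = len(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n

    def stats(labs):
        ss_w = ss_within(dist, labs)
        ss_b = ss_total - ss_w
        # Assumes ss_w > 0, which holds when distinct samples have nonzero distances.
        return (ss_b / (a - 1)) / (ss_w / (n - a)), ss_b / ss_total

    f_obs, r2 = stats(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break the link between samples and labels
        if stats(shuffled)[0] >= f_obs:
            hits += 1
    return r2, f_obs, (hits + 1) / (n_perm + 1)

def toy_distance_matrix(labels, within=0.1, between=0.9):
    """Hypothetical distances: small within a site, large between sites."""
    n = len(labels)
    return [[0.0 if i == j else (within if labels[i] == labels[j] else between)
             for j in range(n)] for i in range(n)]

site = ["VD"] * 5 + ["M"] * 5          # toy labels, not study data
dist = toy_distance_matrix(site)
r2, f_stat, p_value = permanova(dist, site)
print(f"R2={r2:.3f}  pseudo-F={f_stat:.1f}  p={p_value:.3f}")
```

In this toy setting almost all variation lies between the two sites, so R² is close to 1; real data with batch effects would split the explained variation across several factors.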

The resemblance of twin microbiome in delivery
To test the hypothesis that samples from twins, both monochorionic and dichorionic, have higher similarity in microbiome composition than randomly chosen samples, we compared the mean weighted UniFrac distance between twin samples with that between randomly selected samples. More specifically, for each of the AF, CB, GL, and M groups, we performed bootstrap hypothesis testing by randomly sampling pairwise distances with replacement from all samples 1000 times to build a 95% confidence interval from the means of the sampled distances. We rejected the null hypothesis that there was no difference between the twin samples and randomly selected samples for all four sample types because the mean pairwise distance for twin samples was below the confidence interval (Fig. 4 and Supplementary Fig. S7). Next, we divided the twins into monochorionic and dichorionic twins and repeated the hypothesis testing. We could still reject the null hypothesis for all four sample types for dichorionic twins. For monochorionic twins, however, only the CB and M groups passed the test.
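The bootstrap comparison described above can be sketched as follows. This is an illustration under stated assumptions: it uses a percentile interval of the bootstrap means rather than the t-based interval named in the Statistical analysis section, it draws as many distances per resample as there are twin pairs, and all distances are invented values rather than study data.

```python
import random
import statistics

def bootstrap_mean_ci(distances, draw_size, n_boot=1000, seed=0):
    """95% percentile interval for the mean of `draw_size` distances
    resampled with replacement from `distances`."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(distances, k=draw_size))
        for _ in range(n_boot)
    )
    lower = means[int(0.025 * n_boot)]
    upper = means[int(0.975 * n_boot) - 1]
    return lower, upper

# Hypothetical weighted UniFrac distances (not study data).
all_pairwise = [0.5 + 0.01 * i for i in range(41)]   # distances among all samples
twin_pairs = [0.20, 0.25, 0.30, 0.22]                # distances within twin pairs

lower, upper = bootstrap_mean_ci(all_pairwise, draw_size=len(twin_pairs))
twin_mean = statistics.fmean(twin_pairs)
reject = twin_mean < lower   # twins more similar than random pairs
print(f"twin mean={twin_mean:.3f}, 95% CI=({lower:.3f}, {upper:.3f}), reject={reject}")
```

The null hypothesis is rejected when the twin mean falls below the lower bound of the interval built from randomly drawn distances.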

Characterization of the vaginal health-related microbiome
Several pathogenic and commensal vaginal microbiota have been shown to have important consequences for a woman's reproductive and general health. To establish reference ranges of vaginal microbiota with known clinical associations in generally healthy pregnant women, we searched for bacterial targets commonly tested for assessing vaginal health within VD samples. More specifically, we focused on 31 bacterial targets (15 genera and 16 species) that are tested by the "SmartJane" assay from uBiome Inc., including Lactobacillus, Sneathia, and Gardnerella 21. Of the 31 bacterial taxa of clinical importance, 12 were identified in our samples (Fig. 5). We observed a higher relative abundance of Lactobacillus at the genus level but lower abundances of Aerococcus, Fusobacterium, Gardnerella, Peptoniphilus, Porphyromonas, and Prevotella. Most of our patients did not have any severe pregnancy-related complications. In addition, the majority of preterm births occurred in the late preterm period, from 34 + 0 weeks to 36 + 6 weeks. Consequently, the "SmartJane" target list captured almost no pathogenic bacteria. The species-level results are listed in Fig. 5. We found Lactobacillus iners and Lactobacillus jensenii from the assay list, but Lactobacillus crispatus was not commonly found in the vaginal samples. This could be simply because the SILVA reference database we used omitted Lactobacillus crispatus. We confirmed that some of the ASVs from the Lactobacillus genus were indeed Lactobacillus crispatus using the National Center for Biotechnology Information (NCBI) database (data not shown).

Controversies surrounding in utero colonization
Since contamination is a critical issue in microbiome research, we used several up-to-date methods to confirm the presence of bacteria and found evidence of in utero colonization. The distinct bacterial composition in meconium, the first stool passed by newborns, supports the view that microbial colonization occurs in the intrauterine environment during normal pregnancy 22,23. The microbiome in neonatal gastric liquid was similar to that in maternal amniotic fluid, as expected since fetuses swallow amniotic fluid in utero and their urine returns to the fluid under normal physiological conditions. However, the microbiome in gastric liquid was not exactly the same as that in amniotic fluid, indicating the existence of unknown mechanisms for flora formation in the fetal oral cavity or proximal gastrointestinal tract, such as the esophagus, from the intrauterine environment.
Table 3 footnotes. a Significant hits were found by ANCOM, but these results were discarded as they have a very low W score (zero in many cases) and are likely artifacts; note that this is a known bug in ANCOM, typically caused by a small sample size for a given test. b Amplicon sequence variants were labelled 'Unassigned' if it was not possible to classify them at the highest taxonomic level at the required confidence level. c These amplicon sequence variants could not be classified beyond the domain level at the required confidence level.

Do different samples from mothers and newborns share the same microbiome?
The study aimed to determine whether samples from various body sites of pregnant women and their infants would have similar microbiomes, or whether the maternal microbiome would be passed on to the fetus. Our results suggest that the microbiome primarily differed based on the body compartment from which it was obtained, not the mother-fetus pair. That is, out of all factors, including various obstetric conditions, the sampling site was the most significant factor in determining microbiome similarity.

Establishing a representative microbiome library of various samples to understand the microbiomes of typical pregnant women, fetuses, and neonates
This study's key strength is its study population, which consisted entirely of Asian participants and reflected general, low-risk pregnancies. The maternal age range was between 20 and 45 years, which is considered typical for reproductive age. There were roughly equal numbers of nulliparous women, cesarean sections, and male and female neonates. Other than a small number of instances of fetal distress, such as low Apgar scores and meconium staining, newborns with extremely pathological conditions that could alter the microbiome, such as severe preterm birth and treatment in a neonatal intensive care unit (NICU), were excluded. As a result, the microbiome analyzed in this study population is likely to represent typical pregnancy. It is crucial to establish a microbiome library for low-risk pregnant women and their normal neonates as a basis for comparison with pathological conditions, to better understand the microbiome composition during pregnancy.

Association between microbiome and various pregnancy-related phenotypes
To identify the microbiomes associated with pregnancy-related conditions, such as delivery method, we conducted statistical analysis of differential abundance. Despite the challenges posed by the low microbial biomass and difficulties in controlling study subjects, which can result in false positive results, the bacteria listed in Table 3 seem to align with previous findings. For example, Finegoldia and Bifidobacterium have been previously linked to a healthier pregnancy, and our data are consistent with this association 24,25. Other taxa listed in the table also have links to inflammation and pregnancy complications, such as gestational diabetes mellitus, preeclampsia, and preterm birth. The presence of Campylobacter and Lachnospiraceae in vaginal discharge, for example, is in line with previous research showing that these bacterial infections can lead to inflammation and preterm birth 26,27. By cross-referencing with clinical databases, our analysis revealed several significant associations. First, the abundant presence of Lactobacillus and Gardnerella in vaginal discharge is a well-known indicator of the pregnancy microbiome. Lactobacillus plays a protective role in the maternal microbiome during pregnancy, while Gardnerella is considered a pathogen and is strongly associated with preterm birth or pregnancy complications 11,13,26. The presence of Faecalibacterium in cord blood is noteworthy, as it has been shown to be depleted in gestational diabetes mellitus 28, even though the number of cases in our study population was relatively small. Additionally, Staphylococcus in meconium was found to be strongly associated with preterm birth. This result coincides with previous findings that suggest Staphylococcus infections can lead to preterm birth 29,30.
Regarding the effect of antibiotics, we analyzed the relationship between antibiotic use and meconium samples, but the results showed limited association due to the small sample size. As Tormo-Badia et al. reported, antibiotics given to pregnant mice can alter the gut microbiome of their offspring 31. Given that the existence of a "healthy microbiome" during pregnancy is considered crucial for maintaining a normal pregnancy, it is easy to imagine the potential negative consequences of antibiotic administration during pregnancy. Since antibiotics are only given to pregnant women who have signs of infection or inflammation, specific diseases, or preterm premature rupture of membranes with the risk of ascending infection to the fetus, it is practically challenging to determine the effect of antibiotics on the modification of the birth-related microbiome.
The meconium samples showed the presence of microbiome taxa such as Lactobacillus, Staphylococcus, and Ureaplasma, which are collectively known as the vaginal flora 11,32. We attempted to evaluate the relationship between delivery mode and the microbiome in meconium, but we did not find any statistically significant differences in composition or diversity. According to a study by Dominguez-Bello et al., there are differences in the bacterial communities in the guts of infants depending on the mode of delivery 33. Neonates born vaginally have a microbiome resembling their mother's vaginal microbiota, dominated by Lactobacillus. Conversely, infants born via cesarean section have a microbiome dominated by Staphylococcus, Corynebacterium, and Propionibacterium, which are commonly found on their mothers' skin surfaces.

Twin pregnancy and microbiome
Approximately a quarter of the pregnancies in our study were twin pregnancies (37/141). To the best of our knowledge, this is the first study to assess the microbiomes of twin newborns. Generally, our twin samples (AF, CB, GL, and M) showed a more similar composition than randomly selected samples, even for dichorionic twins, who have separate intrauterine compartments. The only exception was CB and M samples from monochorionic twins, where randomly selected samples showed greater similarity, which is likely due to the small sample size of monochorionic twins.

Conclusion
Exploring the microbiologic features related to pregnancy has been a challenging and controversial task for many years. Microbial invasion of the gestational cavity, such as the amniotic fluid or placenta, can lead to serious obstetric complications such as preterm birth and severe neonatal morbidities that may persist throughout life. Despite the importance of research on the microbiome in pregnancy, progress has been limited due to ethical and accessibility issues. We have collected various samples from pregnant women and their neonates using a standardized protocol and established a microbiome database, which can serve as a reference library for studying samples with other pregnancy-related or pathologic conditions.

Study design and sample collection
A prospective study was performed on live births delivered between March 2020 and January 2021. Samples were collected from women who delivered at Seoul National University Bundang Hospital and from their newborns. Women with unstable vital signs or those requiring urgent management such as transfusion, and neonates who were admitted to the NICU or had unstable vital signs after birth, were excluded from the study. Samples for microbiome analysis included maternal VD, AF, CB, neonatal GL, and M. When a pregnant woman was hospitalized in expectation of delivery, the VD sample was obtained using a polyester swab inserted into the posterior fornix of the vagina, assisted by sterile speculum examination. For those who underwent cesarean section for delivery or amniocentesis for specific indications (i.e., for detection of intraamniotic inflammation/infection), approximately 10 cc of AF was obtained with a syringe for the study. During delivery, whether by cesarean section or vaginal delivery, approximately 20 cc of CB was drawn with a syringe from the vein of the umbilical cord immediately after clamping. The syringe needle was directly inserted into the umbilical cord at the delivery site surrounded by sterile drapes to minimize surgical field contamination. Since removing amniotic fluid or other liquid from the newborn's mouth and stomach after birth is part of initial management to clear the airway and stimulate spontaneous breathing, most neonates received suctioning procedures, and the liquid collected in the suction bottle (approximately 15 ml) was transferred into a conical tube for analysis of GL. The M sample, the newborn's very early stool, was carefully obtained within 24 h after birth using a polyester swab inserted into the anus once the neonate had stabilized after initial management. We tried to collect all five sample types from each woman and her neonate(s); nonetheless, a small proportion of samples from mother-neonate pairs were not obtained or were missed because of clinical circumstances.
The primary outcome was the distribution and composition of the microbiome of the above samples from pregnant women and their neonates. To determine the association between the microbiome from different compartments and obstetric factors, medical records were collected and thoroughly reviewed. Data included maternal age, gestational age at delivery, delivery mode (vaginal delivery or cesarean section), the use of assisted reproductive technology (ART), other obstetric complications, and neonatal outcomes such as sex and birth weight.

Ethics approval and consent to participate
This study was performed with the informed consent of appropriate participants in compliance with the Declaration of Helsinki. The study protocol was approved by the Institutional Review Board of the Seoul National University Bundang Hospital (B-1606/350-003).

Microbial DNA isolation
Microbial deoxyribonucleic acid (DNA) was extracted from the VD, GL, AF, and CB samples with the ZymoBIOMICS DNA Miniprep Kit (Zymo Research, Irvine, CA) and from the M samples with the DNeasy PowerSoil Pro Kit (Qiagen, Germantown, MD) according to the manufacturers' instructions. Briefly, samples were enzymatically and mechanically lysed by bead beating, followed by washing and filtering in the provided column. Extracted DNA concentrations were measured using a Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, Waltham, MA, USA). The total amount of extracted DNA varied by sample type: 1-10 μg for VD, 3 μg for CB, 30-200 ng for M, and 50 ng for GL and AF. For each box of the DNA extraction kit used, a blank containing no sample material served as a negative control. The blanks were processed through the entire protocol and analyzed.

16S rRNA gene amplification
The 16S ribosomal ribonucleic acid (rRNA) gene was amplified using the two-step polymerase chain reaction (PCR) protocol in the 16S Metagenomic Sequencing Library Preparation guide (Illumina, San Diego, CA). In the first PCR step, the V3-V4 hypervariable region of the 16S rRNA gene was amplified using 10 ng of each sample, 10 µM of 341F/785R primers, and Herculase II fusion DNA polymerase (Agilent, Santa Clara, CA). In the primer sequences below, 'N' is any base, 'W' is A or T, 'H' is A, C, or T, and 'V' is A, C, or G.
341F: 5′-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCCTACGGGNGGCWGCAG-3′
785R: 5′-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGACTACHVGGGTATCTAATCC-3′
PCR cycling was performed with an initial cycle at 95 °C for 3 min, followed by 25 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, and a final extension cycle at 72 °C for 5 min. The amplicons were cleaned with AMPure XP beads (Beckman Coulter, Brea, CA, USA). In the second PCR, index primers from the Nextera DNA CD Index Kit (Illumina, San Diego, CA) were added to the ends of the amplicons generated in the first PCR. PCR cycling was performed with an initial cycle at 95 °C for 3 min, followed by ten cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, and a final extension cycle at 72 °C for 5 min. Each sample was cleaned with AMPure XP beads (Beckman Coulter, Brea, CA, USA) and eluted in UltraPure DNase/RNase-Free Water (Thermo Fisher Scientific, Waltham, MA). The amplified DNA was checked on a 2100 Bioanalyzer system using an Agilent DNA 1000 Kit (Agilent, Santa Clara, CA, USA). For each library preparation, a no-template reaction was used as a negative control.
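The degenerate bases in the 341F/785R primers above can be expanded programmatically. The sketch below converts a degenerate primer into a regular expression and counts how many concrete sequences it represents. It assumes that the locus-specific part of each primer is the portion following the shared "...AGAGACAG" overhang; the helper names are hypothetical, and only the four ambiguity codes defined in the text are handled.

```python
import re

# Ambiguity codes as defined in the text (N, W, H, V) plus the plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_to_regex(primer):
    """Compile a degenerate primer into a regex matching every concrete variant."""
    parts = []
    for base in primer:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))

def count_variants(primer):
    """Number of distinct non-degenerate sequences the primer stands for."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n

# Assumed locus-specific portions of the primers listed above.
PRIMER_341F = "CCTACGGGNGGCWGCAG"
PRIMER_785R = "GACTACHVGGGTATCTAATCC"

pattern_341f = primer_to_regex(PRIMER_341F)
print(pattern_341f.pattern)
print(count_variants(PRIMER_341F), count_variants(PRIMER_785R))
```

Such a pattern could be used, for example, to check whether a read begins with a valid primer variant before trimming.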

Sequencing data generation
We divided the samples into nine batches (Runs 1-9) and sequenced the V3-V4 region of the 16S rRNA gene using Illumina MiSeq machines with a target depth of 100,000 reads per sample (Supplementary Fig. S8). Sequencing was performed with 250 bp paired-end reads for all of the sequencing runs except for the last one (Run 9), where sequencing was performed with 300 bp paired-end reads for practical reasons. The read quality scores for each sequencing run are shown in Supplementary Fig. S9. The bcl2fastq program of Illumina was used to demultiplex raw sequencing data (BCL files) and output forward and reverse FASTQ files for each sample. Of note, some samples were sequenced more than once to assess the impact of batch effects. These included "sequencing duplicates", in which the identical NGS library of one sample was sequenced in separate runs, and "library duplicates", in which multiple NGS libraries were prepared from the identical sample on different dates and then sequenced separately.

Data analysis and visualization
Unless stated otherwise, all analyses were carried out using the QIIME 2 platform, a powerful community-developed platform for microbiome bioinformatics 34. For each sequencing run, FASTQ files were imported into QIIME 2, and the DADA2 plugin 35 was used to identify ASVs by trimming low-quality parts of sequence reads, denoising the trimmed reads, and then merging the forward and reverse reads (Supplementary Fig. S8). The observed ASVs from individual sequencing runs were then merged into one ASV table. To detect and remove potential contaminants, we ran the decontam program on our samples, which looked, per sequencing batch, for ASVs that appeared at higher frequencies in low-concentration samples or were repeatedly found in the negative controls 36. Taxonomy classification was performed using a naive Bayes classifier trained on the SILVA database 37. To visualize the outputs from QIIME 2, we developed the Dokdo program (https://github.com/sbslee/dokdo), an open-source and MIT-licensed Python package for microbiome sequencing analysis using QIIME 2. Dokdo internally uses the application programming interface of QIIME 2 and therefore does not require any other dependencies. Dokdo can be used to perform a variety of secondary analyses or create publication-quality figures from QIIME 2 files/objects (e.g., a taxonomic bar plot or an alpha rarefaction plot).

Diversity analysis
We used the QIIME 2 command "qiime diversity core-metrics-phylogenetic" to compute the alpha and beta diversity metrics of our samples. When running the command, to normalize for differences in read depth across samples, we used the "--p-sampling-depth" option to rarefy our samples to 5,000 sequence reads so that all samples had an equal depth of coverage. We also ensured that all samples were sequenced to a sufficient depth of coverage for diversity analysis by creating rarefaction curves (Supplementary Fig. S10). Additionally, we used the "--i-phylogeny" option to provide a rooted phylogenetic tree of the observed ASVs, which is required for performing PCoA based on the weighted UniFrac distance 38.
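To make the rarefaction step concrete, the sketch below subsamples a sample's reads without replacement to a fixed depth and drops samples that fall short of it. This illustrates the idea only and is not QIIME 2's internal code; the function name and the ASV counts are hypothetical, and the depth of 5,000 is the value stated above.

```python
import random
from collections import Counter

def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement.

    `counts` maps ASV id -> read count. Returns a Counter with the rarefied
    counts, or None when the sample has fewer than `depth` reads (such
    samples are excluded from downstream diversity analysis).
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand counts into one entry per read, then draw without replacement.
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))

# Hypothetical ASV tables (not study data).
deep_sample = {"ASV_1": 9000, "ASV_2": 2500, "ASV_3": 500}
shallow_sample = {"ASV_1": 1200, "ASV_2": 300}

rarefied = rarefy(deep_sample, depth=5000)
print(dict(rarefied))
print(rarefy(shallow_sample, depth=5000))
```

After rarefaction every retained sample has the same total read count, so diversity metrics are not driven by sequencing depth.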

Statistical analysis
To assess the differential abundance of the microbiome in the context of clinical information such as preterm birth, we used the QIIME 2 command "qiime composition ancom" to perform ANCOM, which compares the centered log-ratio (CLR) of relative abundance between two or more groups of samples 39. To determine whether groups of samples are significantly different from one another in beta diversity, we carried out PERMANOVA using the QIIME 2 command "qiime diversity adonis", which fits a linear model to a distance matrix (e.g., weighted UniFrac) with the chosen variables. To compare similarities in microbiome composition between twins and randomly chosen samples, we performed bootstrap hypothesis testing by building a 95% confidence interval with the "scipy.stats.t.interval" method in the scipy package 40.
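As a small illustration of the centered log-ratio transformation mentioned above, the following sketch transforms one sample's ASV counts. The pseudocount of 1 added to avoid log(0) is an assumption made here, and the counts are invented. ANCOM itself performs further testing steps that are not shown.

```python
import math

def clr(counts, pseudocount=1.0):
    """Centered log-ratio: log of each count minus the mean log count.

    The pseudocount is an assumption of this sketch; it keeps zero counts finite.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]

# Hypothetical ASV counts for one sample (not study data).
sample_counts = [120, 30, 0, 850]
transformed = clr(sample_counts)
print([round(v, 3) for v in transformed])
```

CLR values sum to zero within a sample, which removes the dependence on total read count and makes abundances comparable across samples.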

Figure 1. Batch effect detection in 16S rRNA amplicon sequencing data. Center log-ratio transformation was used to normalize the filtered ASV table before generating a hierarchically clustered heatmap based on correlation coefficients. AF, amniotic fluid; CB, umbilical cord blood; GL, gastric liquid; M, meconium; VD, cervicovaginal discharge; NC, negative control.

Figure 2. Alpha and beta diversity of the Korean maternal and neonatal microbiome. (A) Alpha diversity: The filtered ASV table was rarefied before the Shannon index was computed for each sample. The VD group exhibited the lowest alpha diversity. (B) Beta diversity: The filtered ASV table was rarefied before the samples were projected into 2D space with principal coordinates analysis using the weighted UniFrac distance. AF, amniotic fluid; CB, umbilical cord blood; GL, gastric liquid; M, meconium; VD, cervicovaginal discharge; NC, negative control.

Figure 3. Beta diversity results of the PERMANOVA analysis. Principal coordinates analysis using weighted UniFrac distance is shown for (A) the cervicovaginal discharge samples, (B) and (C) the meconium samples, and (D) the umbilical cord blood samples.

Figure 4. Higher similarity of microbiome composition in twin samples than in randomly chosen samples. For each sample type, the means of weighted UniFrac distances are shown for the twin samples. A 95% confidence interval was constructed by randomly sampling pairwise distances with replacement from the samples 1000 times.

Table 3. Summary of the results from analysis of composition of microbiomes (ANCOM) at the genus level.

rRNA gene sequencing and analysis
Based on the DNA size and concentration, the amplicons were pooled in equimolar amounts and spiked with 30% PhiX (Illumina, San Diego, CA). These were then sequenced on the Illumina MiSeq platform using a paired-end 250-cycle MiSeq Reagent Kit V2 (Illumina, San Diego, CA) and a 300-cycle MiSeq Reagent Kit V3 (Illumina, San Diego, CA). Negative controls from the DNA extraction and library preparation steps were also sequenced.
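Equimolar pooling from DNA size and concentration can be worked through with the standard conversion for double-stranded DNA, molarity (nM) = concentration (ng/µL) × 10^6 / (660 g/mol per bp × length in bp). The sketch below applies it; the library names, concentrations, fragment length, and target amount are hypothetical values chosen for illustration, and 660 g/mol per base pair is the usual average-mass approximation.

```python
def molarity_nM(conc_ng_per_ul, length_bp):
    """Convert a dsDNA concentration to nM using ~660 g/mol per base pair."""
    return conc_ng_per_ul * 1e6 / (660.0 * length_bp)

def pooling_volumes(libraries, target_fmol=10.0):
    """Volume (µL) of each library that contains `target_fmol` of DNA.

    `libraries` maps name -> (concentration in ng/µL, mean fragment length in bp).
    Since 1 nM equals 1 fmol/µL, volume = target_fmol / molarity.
    """
    return {
        name: target_fmol / molarity_nM(conc, length)
        for name, (conc, length) in libraries.items()
    }

# Hypothetical libraries (not study data).
libraries = {"lib_A": (3.3, 500), "lib_B": (6.6, 500)}
volumes = pooling_volumes(libraries)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL")
```

A library at twice the molar concentration contributes half the volume, so each library adds the same number of molecules to the pool.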