The vaginal microbiome of pregnant women is less rich and diverse, with lower prevalence of Mollicutes, compared to non-pregnant women

The vaginal microbiome plays an important role in maternal and neonatal health. Imbalances in this microbiota (dysbiosis) during pregnancy are associated with negative reproductive outcomes, such as pregnancy loss and preterm birth, but the underlying mechanisms remain poorly understood. Consequently a comprehensive understanding of the baseline microbiome in healthy pregnancy is needed. We characterized the vaginal microbiomes of healthy pregnant women at 11–16 weeks of gestational age (n = 182) and compared them to those of non-pregnant women (n = 310). Profiles were created by pyrosequencing of the cpn60 universal target region. Microbiome profiles of pregnant women clustered into six Community State Types: I, II, III, IVC, IVD and V. Overall microbiome profiles could not be distinguished based on pregnancy status. However, the vaginal microbiomes of women with healthy ongoing pregnancies had lower richness and diversity, lower prevalence of Mycoplasma and Ureaplasma and higher bacterial load when compared to non-pregnant women. Lactobacillus abundance was also greater in the microbiomes of pregnant women with Lactobacillus-dominated CSTs in comparison with non-pregnant women. This study provides further information regarding characteristics of the vaginal microbiome of low-risk pregnant women, providing a baseline for forthcoming studies investigating the diagnostic potential of the microbiome for prediction of adverse pregnancy outcomes.

Sequencing results and OTU analysis. We characterized the vaginal microbiomes of pregnant women at low risk of preterm birth using pyrosequencing of the universal target region of the cpn60 gene. Sequence reads from the vaginal microbiomes of pregnant women were mapped on to a manually curated reference set of 1,561 OTU sequences as described in the methods. Raw sequence data files for the 182 samples described in this study were deposited to the NCBI Sequence Read Archive (BioProject PRJNA317763). A total of 1,415,117 cpn60 reads were generated. The median and mean read counts per sample were 5,024 and 7,775, respectively (range 494-43,245). Average read length was 448 bp. The average MAPQ value was 21.1.
Results of Bowtie 2 mapping showed that these reads corresponded to 645 OTUs from the reference assembly (Supplementary Table S1). A total of 82 OTUs (corresponding to 53 nearest neighbour "species") were at least 1% abundant in at least one sample, and only 22 "species" were detected in at least 50% of samples (Table 3). Although the ranges of percent identity to reference sequences are large in some cases, reflecting the diversity in the community, the most abundant OTUs were at the high end of the range. In fact, of the 25 most abundant OTUs in the study (accounting for 95% of the sequence reads generated), 22 were >95% identical to their nearest neighbour (Supplementary Table S1).
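The two thresholds used above (at least 1% abundant in at least one sample; detected in at least 50% of samples) can be illustrated with a minimal sketch. The table layout, function names and toy counts below are hypothetical and are not taken from the study data or from the analysis pipeline.

```python
def abundant_in_any_sample(table, min_fraction=0.01):
    """Return OTUs that reach min_fraction of the reads in at least one sample."""
    keep = set()
    for sample_counts in table.values():
        total = sum(sample_counts.values())
        if total == 0:
            continue  # skip empty samples to avoid dividing by zero
        for otu, count in sample_counts.items():
            if count / total >= min_fraction:
                keep.add(otu)
    return keep


def prevalent(table, min_prevalence=0.5):
    """Return OTUs detected (count > 0) in at least min_prevalence of samples."""
    n_samples = len(table)
    detected = {}
    for sample_counts in table.values():
        for otu, count in sample_counts.items():
            if count > 0:
                detected[otu] = detected.get(otu, 0) + 1
    return {otu for otu, k in detected.items() if k / n_samples >= min_prevalence}


# Hypothetical read counts: sample -> {OTU: reads}
toy_table = {
    "sample_1": {"otu_a": 990, "otu_b": 5, "otu_c": 5},
    "sample_2": {"otu_a": 500, "otu_b": 0, "otu_c": 500},
    "sample_3": {"otu_a": 995, "otu_b": 5, "otu_c": 0},
}

print(sorted(abundant_in_any_sample(toy_table)))
print(sorted(prevalent(toy_table)))
```

In this toy table, otu_b is present in most samples but never reaches 1% of reads, which is the same pattern as a prevalent but low-abundance OTU.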
Most reads (68.8%) were identified as Lactobacillus spp., and only three OTUs, all of which matched Lactobacillus spp., accounted for 55.7% of all reads generated: OTU 1403: L. crispatus (30 […] Weissella viridescens were detected in most samples, they had very low abundance, representing only 0.33% and 0.22% of all reads, respectively (Supplementary Table S1).
CST distribution of the pregnant participants was compared to that of the previously described non-pregnant cohort. Twelve pregnant women had microbiome profiles identified as the L. gasseri-dominated CST II, which was not observed among the profiles of non-pregnant women 34. Additionally, CST IVA was not detected among pregnant women (Fig. 1). Although there were differences regarding presence/absence of specific CSTs in the pregnant and non-pregnant cohorts, overall microbial profiles could not be distinguished based on pregnancy status alone (Fig. 2A,B).
Ecological analysis. Assessment of alpha diversity revealed that microbiomes of pregnant women were less diverse (Shannon diversity index, 1.3 ± 0.9) and less rich (Chao1, 38.9 ± 15.3) when compared to those of non-pregnant women (1.6 ± 1.1; 47.6 ± 26) (t-test, p < 0.001) (Fig. 4B,C). When comparisons were conducted within CST, profiles of pregnant women in CST I, III and V were less diverse and less rich than profiles of non-pregnant women in the same category (t-test, p < 0.05) but no statistically significant differences were observed in CST IVC and IVD.
Pregnant women were less likely to be positive for Mollicutes detection by PCR when compared to non-pregnant women, regardless of CST (Chi-square, p < 0.0001, Mollicutes positive: 217/310 non-pregnant women and 74/182 pregnant women) (Fig. 4D). Higher bacterial loads were detected in samples from pregnant women (log 7.7 ± 0.9) when compared to non-pregnant women (log 6.8 ± 1.5) (t-test, p < 0.0001) (Fig. 4F). Comparisons within CST confirmed that samples from pregnant women in CST I (log 7.6 ± 0.7), CST III (log 8.0 ± 0.7) and CST IVC (log 8.8 ± 1) had higher bacterial load than non-pregnant women in the same categories (CST I = log 6.7 ± 1.4, CST III = log […]
Table 3. Prevalence and proportion of total reads for "species" detected in at least 50% of samples. a Closest match in the cpnDB reference database based on sequence identity.
Figure legend (heatmap; opening text lost): […] 182). Each column represents a woman's vaginal microbiome profile, and each row represents an OTU. Only OTUs that are at least 1% abundant in at least one sample are shown. The proportion of the total microbiome comprised is indicated by white to red colour according to the legend. The coloured bars above the heatmap show the community state type (CST) and the Nugent score category (Nugent) for each woman. Legend: white = missing data.
Relationships between microbiological and socio-demographic characteristics across the pregnant cohort. The characteristics of the microbial community of pregnant women were analyzed in terms of their relationship with the socio-demographic and clinical data. First, we determined whether there was any relationship between the CST (I, II, III, IVC, IVD, V) and demographic characteristics such as BMI, ethnicity, unprotected sex, folic acid intake, vitamins, natural conception, antibiotic use, gestational age at delivery, mode of delivery, neonatal admission to a high level care nursery, parity, pre-existing conditions, surgeries (past 10 years), smoking and alcohol drinking status as well as Nugent score (Fig. 5). Besides Nugent score, the only significant interaction was between CST and parity (0 or ≥ 1) (Chi-square, p = 0.033), with 45% of women at parity 0 in CST I (27/60) and 23% of women at parity ≥ 1 in CST I (29/122). Microbiological and demographic characteristics were also compared to presence of Mollicutes (yes/no) and Ureaplasma (yes/no), microbiome richness (continuous variable) and diversity (continuous variable). These four observations were compared to 29 other variables (Supplementary Methods). PCR detection of Mollicutes (p = 0.017) and Ureaplasma (p = 0.017) was significantly associated with bacterial load (Chi-square). Although smoking has been previously shown to have a possible effect on the vaginal microbiota 37, there was no significant association between CST and smoking status in this study (Chi-square, p > 0.05). The microbiome analysis was also repeated after excluding samples from participants who were smokers (results not shown). The results led to the same conclusions regarding the overall microbiome (PCoA), Lactobacillus abundance, Shannon diversity, Chao1 richness, bacterial load and Mollicutes/Ureaplasma prevalence.

Discussion
In this study we characterized the vaginal microbiome of pregnant women with healthy ongoing pregnancies, at low risk of experiencing pregnancy complications such as preterm birth, and compared these results to our previously characterized cohort of non-pregnant women of similar ethnicity. We focussed our studies on the vaginal microbiome at 11-16 weeks. Pregnancy pathologies such as spontaneous preterm birth and early onset pre-eclampsia have their origins in the first or early second trimester and therapeutic interventions at this stage have been shown to be efficacious 38 . This is also the gestational age at which pregnant women in Canada often have their first prenatal visit with a health care provider and a vaginal swab is taken to assess the presence or absence of vaginal/cervical infections.
In order to make valid comparisons between the microbiomes of pregnant and non-pregnant women, we compared the two cohorts based on socio-demographic characteristics (Table 1). The two cohorts were comparable, with no significant differences detected in any category except for maternal age and smoking. Differences regarding smoking status were not surprising since behavioural changes such as reduced drinking and smoking have been documented in pregnancy 28. Although statistically significant, the identified difference in maternal age (pregnant: 33 ± 4 and non-pregnant: 30 ± 7) was not considered biologically relevant since the difference between the mean values was only 3 years.
Overall microbial profiles could not be distinguished from each other based on pregnancy status alone. However, a more detailed analysis revealed several differences between the microbiomes of healthy pregnant women and those of healthy non-pregnant women. The BV-associated CST IVA (dominated by G. vaginalis subgroup B and Atopobium) was not detected in the pregnant cohort, whereas the L. gasseri-dominated CST II, which was not detected among the 310 non-pregnant women in our previous study 34, was detected in 12 women in the pregnant cohort. Pregnant women in Lactobacillus-dominated CSTs had higher relative abundance of Lactobacillus spp. when compared to non-pregnant women. Vaginal microbiomes of pregnant women had lower richness and diversity and a correspondingly lower prevalence of Mollicutes and Ureaplasma when compared to non-pregnant women.
Microbial profiles from the pregnant women clustered in six different groups, mostly Lactobacillus-dominated CSTs (CST I, CST II, CST III and CST V), originally defined by Ravel & Gajer 58. Non-Lactobacillus-dominated (CST IV) profiles are described in the literature as either very heterogeneous or dominated by BV-associated bacteria 34,59. None of the pregnant women, including 61 with intermediate or high Nugent scores, were identified as belonging to CST IVA, which is characterized by dominance of G. vaginalis subgroup B and Atopobium 34. This distribution is notably different from the non-pregnant cohort, where 24.6% of women with intermediate or high Nugent scores were assigned to CST IVA 34. Results of other studies have suggested that CSTs dominated by BV-associated microorganisms are less frequently detected in pregnancy 30,32. While our study design does not allow us to address the issue of overall prevalence of BV-associated CSTs in pregnancy, our results suggest that the distribution of these CSTs among pregnant women may be different than in non-pregnant women. This raises the possibility that CST IVA plays a role in early pregnancy loss.
OTUs that were weakly similar to Streptococcus and Weissella species were detected in most samples, but they represented only 0.33% and 0.22% of all reads, respectively. These OTUs were previously described as highly prevalent in the vaginal microbiome of healthy non-pregnant women 34 . They have low sequence identity to any reference sequences in the cpnDB_nr database (OTU 0026: S. devriesei 83% and OTU 0021: W. viridescens 58.8%). However, this subset of the cpnDB database contains only selected representative sequences of named species. A broader search shows that these OTU are more similar to metagenomic sequences derived from the fecal microbiome (OTU 0021 is 97% identical to Genbank accession GQ178631) or oral microbiome (OTU 0026 is 85% identical to KJ406686) that represent uncharacterized Firmicutes; reminders of the common but still uncharacterized constituents of the human microbiome.
Our findings of greater Lactobacillus abundance and lower richness and diversity in the vaginal microbiomes of pregnant women relative to non-pregnant women are consistent with previous studies in the literature. Aagaard et al. 29 analyzed the microbiomes of 24 healthy, pregnant American women sampled at three different locations within the vagina. Vaginal site did not drive the structure of the microbial community, but the authors found that overall microbiomes of pregnant women were less diverse and less rich when compared to non-pregnant women. Similarly, Walther-António et al. 31 described the microbiomes of 12 White pregnant American women based on longitudinal sampling and observed reduced microbiome diversity and higher Lactobacillus spp. relative abundance during pregnancy. Romero et al. 30 also reported that Lactobacillus spp. abundance was significantly higher in pregnant women in comparison to non-pregnant and increased as a function of gestational age. They also described higher stability of the microbiome during pregnancy when compared to reproductive age non-pregnant women. In another longitudinal study, MacIntyre et al. 32 analyzed the microbiomes of 42 British women during pregnancy and the post-partum period. L. jensenii-dominated profiles were more common among these women than Northern American women. The authors also observed that post-partum microbiomes become less Lactobacillus spp. dominant and more rich and diverse (i.e. more similar to the microbiomes of non-pregnant women) regardless of ethnicity, providing strong support for the idea that pregnancy has a transient effect on the vaginal microbial community. 
Importantly, the conclusions of these studies and our current study are consistent despite differences in the cohort studied (Canadian, American or European cohorts of varying mixtures of ethnicity), universal target amplified (cpn60 universal target or 16S rRNA gene) or sequencing platforms used (454/Roche pyrosequencing or Illumina MiSeq).
The explanations for these pregnancy-associated changes are not well established, but a relationship between sex steroid hormone levels and the composition of the vaginal microbiome has been previously reported 58,60,61. Increased levels of estrogen during pregnancy lead to increased thickness of the vaginal mucosa and increased deposition of glycogen 62,63. Glycogen is the main carbohydrate utilized by Lactobacillus spp. for the production of lactic acid, which contributes to the protective effect of a low vaginal pH 64-67. This may contribute to the greater dominance of Lactobacillus in pregnancy and, consequently, the lower richness and diversity in this cohort.
A novel finding in our study was the lower prevalence of Mollicutes and Ureaplasma in pregnancy detected by family and genus-specific PCR. Mollicutes have been associated with preterm birth, preterm premature rupture of membranes and low birth weight 68-70. The lower prevalence of Mollicutes and Ureaplasma is consistent with the overall lower species richness and diversity in the vaginal microbiomes of pregnant relative to non-pregnant women. We also found that pregnant women had higher bacterial load than non-pregnant women as estimated by quantitative PCR targeting the 16S rRNA gene. Hormone-induced production of glycogen may offer a nutritionally richer environment for bacterial growth in the vagina during pregnancy. Additionally, it is known that pregnancy alters the amount and consistency of the mucus, which becomes more abundant and thicker 71. Thus, it is possible that swabs sampled from the pregnant women carried more material when compared to non-pregnant women. We are unable to resolve this question since the swabs were not weighed before DNA extraction steps. In addition to pregnancy-associated physiological differences and mucus consistency, other technical factors such as storage conditions or inherent differences in the study populations cannot be ruled out.
One limitation of this study was that relationships between pregnancy outcomes and microbial profiles could not be assessed, since there were very few poor outcomes in this cohort, as would be expected for a low-risk group. This study, however, does provide crucial baseline information for future studies in pregnant women. In addition, we detected a significant interaction between parity and CST. Considering the large number of variables in the metadata, these associations should be interpreted with caution. We can speculate that the prevalence of the L. crispatus-dominated CST I among nulliparous women might be associated with more cautious or health-conscious behaviour among women in their first gestation. Pregnancy-induced physiologic alterations that can persist after delivery have been previously reported 72. In addition to physiologic changes, a disturbed vaginal microbiome that persisted for up to a year post-partum has also recently been described 33. Those persistent changes might explain the association between CST and parity we observed, with post-partum microbiome changes affecting the current status of primiparous and multiparous women. Further studies are needed to investigate these relationships in more detail.
In conclusion, we have identified several differences in the composition of the vaginal microbial communities of pregnant women living in Canada relative to non-pregnant women: larger total bacterial community, lower richness and diversity, higher Lactobacillus abundance and lower Mollicutes/Ureaplasma prevalence. These findings give us a better understanding of the vaginal microbiome in pregnancy, which is a critical step toward being able to exploit the diagnostic potential of the microbiome for the prediction of adverse pregnancy outcomes as well as to explore alternative therapeutic procedures through microbiological intervention. Establishing an understanding of the normal microbiome in low risk pregnant women is a vital baseline for comparison to the microbiome of women who have adverse perinatal outcomes such as preterm birth.

Study population and sampling. This study received ethical approval from the Mount Sinai Hospital Research Ethics Board (Approval Number 08-0005-A). All participants provided written informed consent and all methods were performed in accordance with the relevant guidelines and regulations. Women attending antenatal clinics at Mount Sinai Hospital (Toronto, ON, Canada) between May 2012 and October 2013 were invited to be part of a clinical trial to determine the effect of oral probiotic lactobacilli in altering the vaginal microbiome in asymptomatic pregnant women with an abnormal Nugent score 73,74. Nugent score was determined on Gram-stained swabs taken at the same time as the swab for microbial analysis. Women with normal Nugent scores were excluded from the probiotic trial and are included in this analysis. Samples from women with an abnormal Nugent score who were subsequently randomized and included in the analysis for this study were taken prior to any intervention in these women, and there was no difference in the pregnancy outcomes between the lactobacilli and placebo groups 74. A preliminary report on the results of that probiotic trial has been presented previously 74.
Women were eligible to participate if the following inclusion criteria were met: currently pregnant, adequate comprehension of English language to sign written informed consent, age ≥ 18 years old, no evidence of fetal complications such as intrauterine growth restriction, and no evidence of medical complications of pregnancy. Exclusion criteria included inability to provide informed written consent, multi-fetal pregnancies, currently taking antibiotics or other antimicrobial therapy for BV treatment. Study data were collected and managed using REDCap electronic data capture tools 75 .
Vaginal swabs were collected under direct visualization using a speculum by either a physician or a nurse and placed in dry tubes prior to storage at −80 °C. A total of 182 pregnant women at 11-16 weeks gestation were enrolled in the vaginal microbiome study, including 111 women with normal Nugent scores (inconsistent with BV), 61 women with Nugent scores that were intermediate or consistent with BV, and 10 women with indeterminate Nugent scores due to poor quality smears. Total nucleic acid was extracted from swabs using the MagMAX™ Total Nucleic Acid Isolation Kit (Life Technologies, Burlington, ON, Canada) as per manufacturer's instructions. Kit reagents are aliquoted to eliminate repeated accessing of open reagents, and samples are processed in small batches using filter-tips to prevent cross-contamination. Pipettes and other lab surfaces are regularly treated with DNA surface decontaminant (DNA Away, ThermoFisher Scientific, Waltham, MA). Regular monitoring of reagent-only DNA extraction controls in our lab by universal PCR confirms that these procedures are sufficient to eliminate detectable template contamination of study samples.
The microbial profiles of low-risk pregnant women were compared to profiles previously generated from healthy, reproductive aged, non-pregnant Canadian women from the greater Vancouver area, British Columbia, Canada (n = 310) 34 . Samples were collected as being non-menstrual but were not sampled at any specific non-menstrual cycle time as other studies have demonstrated there is little variation in microbiome profiles through the cycle 48 . Samples from this previous study were processed in the same way as in the current work in terms of swab type, storage temperature, DNA extraction, library preparation and sequencing. Although the year of sampling was different between the two cohorts, there was no difference in time from sample collection to sequencing between the two groups.
Conventional PCR: Some Mollicutes (Mycoplasma and Ureaplasma) species lack a cpn60 gene 78. Thus, we performed a family-specific semi-nested PCR targeting the 16S rRNA gene to detect Mollicutes 79, and a PCR targeting the multiple banded antigen gene to detect Ureaplasma spp. In this assay, PCR products from Ureaplasma parvum and U. urealyticum can be differentiated by size 80.
Scientific Reports | 7: 9212 | DOI:10.1038/s41598-017-07790-9
cpn60 Universal Target (UT) PCR and Pyrosequencing. Universal primer PCR targeting the 552-558 bp cpn60 UT region was performed using a mixture of cpn60 primers consisting of a 1:3 molar ratio of primers H279/H280:H1612/H1613, as described previously 47,48,81. To avoid cross-contamination, samples were handled in small batches, and a no template control was included with each set of PCR reactions. To allow multiplexing of samples in a single sequencing run, primers were modified at the 5′ end with one of 24 unique decamer multiplexing identification (MID) sequences, as per the manufacturer's recommendations (Roche, Branford, CT, USA). Amplicons were pooled in equimolar amounts for sequencing on the Roche GS Junior sequencing platform. The sequencing libraries were prepared using the GS DNA library preparation kit and emulsion PCR (emPCR) was performed with a GS emPCR kit (Roche Diagnostics, Laval, Canada).

Analysis of Operational Taxonomic Units (OTU).
Raw sequence data were processed using the default on-rig procedures from 454/Roche. Filter-passing reads were used in the subsequent analyses for each of the pyrosequencing libraries. MID-partitioned sequences were mapped using Bowtie 2 (http://bowtie-bio.sourceforge.net/bowtie2/) on to a manually curated reference set of 1,561 OTU sequences representing human vaginal microbiota. Bowtie 2 was run using the default end-to-end alignment mode, in which the minimum "cutoff" for any individual read to be validly aligned to a reference sequence is an alignment score of −0.6 + (−0.6 × L), where L is the length of the read. The best valid alignment for each read is reported. Mapping quality was also evaluated by MAPQ value, which is based on the probability that the alignment does not correspond to the read's true point of origin.
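As a worked illustration of the two quantities described above, the following minimal sketch evaluates the end-to-end score threshold for a read of the average length reported in this study (448 bp) and converts a MAPQ value into an error probability using the standard Phred-scaled definition, MAPQ = −10 log10(p). The function names are hypothetical. Converting the average MAPQ (21.1) is only indicative, since the average of per-read probabilities is not the same as the probability at the average MAPQ.

```python
import math


def min_alignment_score(read_length, intercept=-0.6, slope=-0.6):
    """Minimum valid end-to-end alignment score: intercept + slope * L."""
    return intercept + slope * read_length


def mapq_to_error_probability(mapq):
    """Phred-scaled MAPQ -> probability that the reported alignment is wrong."""
    return 10 ** (-mapq / 10)


def error_probability_to_mapq(p_error):
    """Inverse conversion: error probability -> Phred-scaled MAPQ."""
    return -10 * math.log10(p_error)


threshold = min_alignment_score(448)        # average read length in this study
p_wrong = mapq_to_error_probability(21.1)   # average MAPQ in this study

print(f"score threshold for a 448 bp read: {threshold:.1f}")
print(f"error probability at MAPQ 21.1: {p_wrong:.4f}")
```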
The OTU reference set was generated originally by de novo assembly of cpn60 sequence reads from each of 546 vaginal microbiomes, which included 182 samples from pregnant women (this study) and 364 samples from non-pregnant women from previous studies by our research group. The reference assembly was created by the microbial Profiling Using Metagenomic Assembly pipeline (mPUMA, http://mpuma.sourceforge.net) 82 with Trinity as the assembly tool 83 . Assembled OTU were labeled according to their nearest reference sequence determined by watered-Blast comparison 84 to the cpn60 reference database, cpnDB_nr (downloaded from http://www.cpndb.ca 78 ). cpnDB_nr is a subset of the cpnDB database that includes a non-redundant collection of sequences representing all species in cpnDB, with a preference for inclusion of the type strain for each species when available. This reference assembly approach allows us to compare the microbial profiles from various cohorts under investigation, including the 182 pregnant women described in this study. To improve comprehension of some figures, we have pooled reads from OTU into "nearest neighbour species" based on their taxonomic label. Thus, the term "species" refers to OTUs that have the same nearest neighbour match in cpnDB.
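The pooling of OTU reads into "nearest neighbour species" described above amounts to summing counts over OTUs that share a taxonomic label. A minimal sketch follows; the OTU identifiers, labels and counts are hypothetical placeholders, not values from the reference assembly.

```python
from collections import defaultdict


def pool_by_nearest_neighbour(otu_counts, otu_labels):
    """Sum read counts over OTUs sharing the same nearest-neighbour label."""
    pooled = defaultdict(int)
    for otu, count in otu_counts.items():
        pooled[otu_labels[otu]] += count
    return dict(pooled)


# Hypothetical OTU -> nearest neighbour label (as assigned against cpnDB_nr)
labels = {
    "otu_x1": "Lactobacillus crispatus",
    "otu_x2": "Lactobacillus crispatus",
    "otu_x3": "Lactobacillus iners",
}
# Hypothetical read counts for one sample
counts = {"otu_x1": 1200, "otu_x2": 300, "otu_x3": 500}

species_counts = pool_by_nearest_neighbour(counts, labels)
print(species_counts)
```

Pooling in this way keeps the total read count unchanged while reducing the number of rows shown in a figure.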

Statistical Analysis.
Comparisons across pregnancy status cohorts were based on analysis of variance (ANOVA), t-test and Chi-square, performed in IBM SPSS (Statistical Package for the Social Sciences, version 21) at 5% level of significance. For analysis of associations between socio-demographic characteristics and microbiome profiles, a false discovery rate (FDR) correction for multiple comparisons was applied 85 (for the complete list of variables tested, see Supplementary Methods).
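To make the two procedures named above concrete, the sketch below computes a Pearson Chi-square test for a 2 × 2 table and a false discovery rate adjustment. It is an illustration only: the analyses in this study were run in SPSS, the Chi-square here has no continuity correction, and the Benjamini-Hochberg step-up procedure is assumed as the FDR method because the text does not name one. The counts are the Mollicutes PCR results reported in the Results (217/310 non-pregnant and 74/182 pregnant women positive); the p-values passed to the FDR function are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square (1 d.f., no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Rows: non-pregnant, pregnant; columns: Mollicutes positive, negative
chi2, p = chi_square_2x2(217, 310 - 217, 74, 182 - 74)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")

# Hypothetical raw p-values from four separate comparisons
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```

With these counts the test gives p well below 0.0001, in line with the result reported for Mollicutes prevalence.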
Alpha (Shannon diversity and Chao1 estimated species richness) and beta diversity (jackknifed Bray-Curtis dissimilarity matrices) were calculated as the mean of 100 subsamplings of 1000 reads (or all reads available when fewer than 1000) in QIIME (Quantitative Insights Into Microbial Ecology) 86. Plots of alpha diversity measures against bootstrap sample number were generated in R and visually inspected to ensure that an adequate sampling depth for each sample was achieved. Microbiome profiles were also compared based on Bray-Curtis dissimilarity matrices using principal coordinates analysis (PCoA) in QIIME.
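The subsampling scheme described above can be sketched as follows. This is not the QIIME implementation: the log base for the Shannon index (base 2) and the bias-corrected form of Chao1 are assumptions, the function names are hypothetical, and the read counts are invented for illustration.

```python
import math
import random


def shannon(counts, base=2):
    """Shannon diversity index of a vector of OTU read counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(u, v)) / (sum(u) + sum(v))


def subsample(counts, depth, rng):
    """Draw `depth` reads without replacement; keep all reads if fewer exist."""
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    if len(reads) <= depth:
        return list(counts)
    sub = [0] * len(counts)
    for otu in rng.sample(reads, depth):
        sub[otu] += 1
    return sub


def mean_rarefied(counts, metric, depth=1000, iterations=100, seed=0):
    """Mean of `metric` over repeated subsamplings to a fixed depth."""
    rng = random.Random(seed)
    values = [metric(subsample(counts, depth, rng)) for _ in range(iterations)]
    return sum(values) / iterations


# Hypothetical sample with 5,024 reads spread over six OTUs
sample = [4000, 800, 200, 20, 3, 1]
print(mean_rarefied(sample, shannon))
print(mean_rarefied(sample, chao1))
```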
For community state type analysis, a Jensen-Shannon distance matrix was calculated using the 'vegdist' function in the vegan package 87 with a custom distance function that calculates the square root of the Jensen-Shannon divergence 88 . This distance matrix was used for hierarchical clustering using the 'hclust' function in R, with Ward linkage.
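The distance used for clustering above is the square root of the Jensen-Shannon divergence between two relative-abundance profiles. A minimal sketch of that calculation is given below; it is not the custom R function used in this study, natural logarithms are an assumption, and the profiles are hypothetical. The resulting matrix is what would then be passed to a hierarchical clustering routine.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence of two proportion vectors."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Jensen-Shannon distances."""
    n = len(profiles)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = jensen_shannon_distance(profiles[i], profiles[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical relative-abundance profiles (each row sums to 1)
profiles = [
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.05, 0.05, 0.90],
]
for row in distance_matrix(profiles):
    print([round(d, 3) for d in row])
```

Taking the square root matters because the Jensen-Shannon divergence itself is not a metric, whereas its square root is.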
Data Availability. Raw sequence data files for the 182 samples described in this study were deposited to the NCBI Sequence Read Archive (Accession SRP073152, BioProject PRJNA317763). Due to ethical and legal restrictions related to protecting participant privacy imposed by the Mt. Sinai Hospital Research Ethics Board, all other relevant data are available upon request pending ethical approval. Please submit all requests to initiate the data access process to the corresponding author.