A longitudinal analysis of the vaginal microbiota and vaginal immune mediators in women from sub-Saharan Africa

In cross-sectional studies, increased vaginal bacterial diversity has been associated with vaginal inflammation, which can be detrimental for health. We describe longitudinal changes at 5 visits over 8 weeks in vaginal microbiota and immune mediators in African women. Women (N = 40) with a normal Nugent score at all visits had a stable lactobacilli-dominated microbiota with prevailing Lactobacillus iners. Presence of prostate-specific antigen (proxy for recent sex) and being amenorrhoeic (due to progestin-injectable use), but not recent vaginal cleansing, were significantly associated with microbiota diversity and inflammation (controlled for menstrual cycle and other confounders). Women (N = 40) with incident bacterial vaginosis (Nugent 7–10) had significantly lower concentrations of lactobacilli and higher concentrations of Gardnerella vaginalis, Atopobium vaginae, and Prevotella bivia at the incident visit, when concentrations of proinflammatory cytokines (IL-1β, IL-12(p70)) were increased and concentrations of IP-10 and elafin were decreased. A higher ‘composite-qPCR vaginal-health-score’ was directly associated with decreased concentrations of proinflammatory cytokines (IL-1α, IL-8, IL-12(p70)) and increased IP-10. This longitudinal study confirms the inflammatory nature of vaginal dysbiosis and its association with recent vaginal sex and progestin-injectable use. A potential role for proinflammatory mediators and IP-10 in combination with the vaginal-health-score as predictive biomarkers for vaginal dysbiosis merits further investigation.

The vaginal mucosal surface is colonised by a variety of bacterial species and the composition, which has implications for reproductive health, is influenced by both endogenous and exogenous factors (reviewed in 1 ). Using culture-dependent and molecular amplification techniques (such as quantitative polymerase chain reaction (qPCR) and next generation sequencing), a 'normal' vaginal microbiota (VMB) has been defined as one dominated by lactic acid-producing Lactobacillus species. The clinical condition bacterial vaginosis (BV) is associated with increased diversity and quantity of bacteria and a concomitant decrease in lactobacilli 2 . Molecular studies have shown some lactobacilli (notably L. crispatus) are more associated with health than others (L. iners) because they are associated with a lower risk of developing vaginal dysbiosis 3 or acquiring sexually transmitted infections (STIs) 4 .

Results
The cross-sectional characteristics of all 430 women enrolled in the Vaginal Biomarkers Study, including the composition of the VMB, have been previously described 15,16,21,22 . In this sub-study, the median age of the 40 women with a consistently normal VMB (reference group) and the 40 women who developed incident BV (incident BV group) was 23 and 24 years respectively (ranges 16-34 years and 16-33 years). The median age at first vaginal intercourse was 17 years for both groups. About half of the women with a consistently normal VMB (53%) and 43% of the women who developed BV had had two or three lifetime sex partners, with most (78% and 88%, respectively) having had one sex partner in the last three months. Over fifty percent of the women in both groups had delivered a child at least once. Most women in both groups (80%) currently used contraception: 36% used progestin injections, 13% used combined hormonal pills, 5% were sterilised, and 26% used condoms only. All women in the reference group tested negative for pregnancy, HIV, syphilis, Neisseria gonorrhoeae, Chlamydia trachomatis, and Trichomonas vaginalis, by design. The baseline herpes simplex virus type 2 (HSV-2) prevalence was 30% in the women with a consistently normal VMB and 33% in the women who developed BV. Table 1 describes participant characteristics by group over the five study visits for those parameters that were subsequently included in mixed effects regression models as potential confounders of the main associations of interest between VMB bacteria and vaginal immune mediators (see methods). The reference group included HIV-negative adult women (N = 16 Kenya, N = 16 South Africa), adolescents (N = 6 Kenya), and HIV-negative sex workers (N = 2 Rwanda). The incident BV group included HIV-negative adult women (N = 16 Kenya, N = 11 South Africa), adolescents (N = 5 Kenya, N = 2 South Africa), pregnant women (N = 1 Kenya, N = 3 South Africa), and HIV-negative sex workers (N = 2 Rwanda).
The detection of prostate-specific antigen (PSA) in the vagina as a marker of vaginal sex in the last 24-48 hours 23,24 and self-reported vaginal cleansing in the evening or morning just prior to the study visit were both common (25-57% and 28-53% at different visits, respectively). Clinician-observed abnormal vaginal discharge and cervical mucus were also common, but cervical epithelial findings visible to the naked eye (abrasion, laceration, ecchymosis, petechiae, erythema, or ulcer) were uncommon throughout the study, occurring in 1-8 women at each visit (Table 1). At all visits, a substantial proportion of women (up to 33% of the women with a consistently normal VMB and up to 68% of the women who developed BV) had a vaginal pH above 4.5, which is considered outside the normal range and is one of the Amsel criteria for the diagnosis of BV 25 .

VMB bacteria and Candida over time (reference group). The presence of individual VMB bacteria and Candida albicans was determined by qPCR and expressed as log10 genome equivalents (geq) per millilitre (ml). Over the five visits, presence was classified as: never present (0% of visits); sporadically present (1-25% of visits); regularly present (26-74% of visits); and consistently present (75-100% of visits). The presence of individual Lactobacillus species was relatively stable over the five visits in the reference group; i.e. either consistently or never present (Figs 1 and 2). This was particularly true of L. crispatus, which was consistently present in 47% of women or never present in 53% of women (Fig. 2). In 79% of the women with consistent L. crispatus, this was accompanied by a consistent or regular presence of L. vaginalis (Fig. 1). L. iners was consistently present in 75% of women and regularly present in another 10% of women. L. iners and L. crispatus did occur together at least twice in 35% of women, but women with high concentrations of L. crispatus had lower concentrations of L. iners and vice versa (Fig. 1). C. albicans, L. jensenii and L. gasseri were never present in 60%, 63% and 75% of the women, respectively. Escherichia coli was present (but always in a lower concentration than the lactobacilli) at least once in 90% of women, Prevotella bivia in 91% of women, Gardnerella vaginalis in 58% of women, and Atopobium vaginae in only 17% of women.
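The four presence categories defined above can be expressed as a small function. This is a minimal sketch, not the authors' code: the function name and the list-of-booleans input are assumptions made for illustration, and the percentages that fall between the published bands (for example 25.5%) are assigned to the lower band by assumption.

```python
def classify_presence(detected_flags):
    """Classify how often a bacterium was detected across one woman's visits.

    detected_flags: one boolean per visit (True = detected by qPCR).
    Returns one of "never", "sporadic", "regular", "consistent", following
    the bands 0%, 1-25%, 26-74% and 75-100% of visits.
    """
    if not detected_flags:
        raise ValueError("at least one visit is required")
    pct = 100.0 * sum(1 for flag in detected_flags if flag) / len(detected_flags)
    if pct == 0:
        return "never"
    if pct <= 25:   # assumption: values between 25 and 26 go to the lower band
        return "sporadic"
    if pct < 75:
        return "regular"
    return "consistent"


# With five visits the possible values are 0, 20, 40, 60, 80 and 100%.
print(classify_presence([True, False, False, False, False]))  # sporadic
print(classify_presence([True, True, True, True, False]))     # consistent
```

With five visits per woman, detection at one visit is sporadic, at two or three visits regular, and at four or five visits consistent.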
Correlates of longitudinal variations in the concentrations of VMB bacteria were assessed in mixed effects linear regression models for those VMB bacteria that were consistently present in at least 25% of the women in the reference group. Each model had the concentration of one such VMB bacterium as the outcome, individual women as random effects, and presence or absence of a menstrual cycle, menstrual cycle phase (follicular or luteal phase; see methods), presence of vaginal PSA, and recent vaginal cleansing as fixed effects. It is important to note that all amenorrhoeic women in this sub-study were progestin injection users. The models showed that changes in the concentrations of VMB bacteria over time were larger within women than they were between women, with the exception of L. jensenii (Table 2). The mean Lactobacillus genus concentration in amenorrhoeic women was lower (−0.55 log10 geq/ml; p = 0.023) than the mean concentration in women with a menstrual cycle, with L. crispatus accounting for the greatest difference (Table 2). The mean Lactobacillus genus (−0.39 log10 geq/ml; p = 0.010), L. iners (−0.75 log10 geq/ml; p = 0.008) and P. bivia (−0.38 log10 geq/ml; p = 0.045) concentrations were significantly lower at visits with vaginal PSA detected (Table 2). The mean E. coli concentration was significantly lower at luteal phase visits compared to follicular phase visits in women with a menstrual cycle (−0.75 log10 geq/ml; p = 0.020). There were no significant associations between recent vaginal cleansing and concentration of any VMB bacteria (Table 2).
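The comparison of within-women and between-women variation can be illustrated with a simplified variance decomposition. The sketch below is not the mixed effects model used in the study: it ignores all fixed effects, assumes every woman has the same number of visits, and uses the classical one-way random-effects (method-of-moments) estimators. The function name and data layout are hypothetical.

```python
from statistics import mean


def variance_components(data):
    """Estimate between-women and within-women standard deviations.

    data: dict mapping a woman's ID to her list of log10 concentrations,
    with the same number of visits for every woman (balanced design).
    Returns (between_sd, within_sd).
    """
    groups = list(data.values())
    k = len(groups)
    if k < 2:
        raise ValueError("need at least two women")
    n = len(groups[0])
    if n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need a balanced design with at least two visits")

    grand_mean = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]

    # Mean squares between and within women (one-way ANOVA).
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    msw = sum((v - m) ** 2
              for g, m in zip(groups, group_means)
              for v in g) / (k * (n - 1))

    # Method-of-moments estimate of the between-women variance, floored at 0.
    between_var = max((msb - msw) / n, 0.0)
    return between_var ** 0.5, msw ** 0.5
```

A within-women SD that exceeds the between-women SD corresponds to the pattern reported above, in which concentrations varied more over time within a woman than they differed between women.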
Vaginal immune mediators over time (reference group). Concentrations of various cytokines, chemokines, and growth factors were measured in cervicovaginal lavages (CVLs) and expressed in log10 pg/ml (see methods). Mixed effects linear regression models with each immune mediator concentration as the outcome, individual women as random effects, and menstrual cycle presence and phase as fixed effects showed that changes in concentrations of immune mediators over time were larger within women than they were between women, with the exception of interleukin (IL)-1α (Table 3). The mean IL-1α concentration was significantly higher in luteal phase relative to follicular phase visits (0.16 log10 pg/ml; p = 0.004) but mean IL-6 (−0.26 log10 pg/ml; p < 0.001), CC chemokine macrophage inflammatory protein (MIP)-1β (−0.26 log10 pg/ml; p < 0.001) and granulocyte colony-stimulating factor (G-CSF) concentrations (−0.26 log10 pg/ml; p = 0.007) were significantly lower. Mean concentrations of IL-8 (0.28 log10 pg/ml; p = 0.016), IL-12(p70) (0.15 log10 pg/ml; p = 0.038) and MIP-1β (0.34 log10 pg/ml; p = 0.013) were higher in amenorrhoeic women compared to women with a menstrual cycle. Further mixed effects linear regression models with each immune mediator as the outcome, individual women as random effects, and presence and phase of the menstrual cycle as fixed effects, were fitted with the following additional fixed effects added (in separate models): vaginal pH category (<4.0, 4.0-4.5, >4.5); presence of abnormal vaginal discharge, cervical mucus, a cervical epithelial finding, or vaginal PSA; and recent vaginal cleansing.
Visits with PSA detected had significantly higher mean concentrations of IL-6, IL-12(p70), and the CXC chemokine interferon (IFN)-γ-inducible protein 10 (IP-10); visits with a higher vaginal pH had a higher mean concentration of IL-1RA and a lower mean concentration of secretory leucocyte peptidase inhibitor (SLPI); visits with abnormal vaginal discharge had lower mean concentrations of IL-1α, IL-1RA, granulocyte macrophage colony-stimulating factor (GM-CSF) and elafin; visits with cervical mucus had a lower mean concentration of elafin; and visits with cervical epithelial findings had a lower mean concentration of GM-CSF (Table 3). Recent vaginal cleansing was not significantly associated with concentrations of any of the immune mediators (data not shown).

VMB bacteria and immune mediator associations over time (both groups). In mixed effects linear regression models including all 80 women and controlled for presence and phase of menstrual cycle and PSA presence, a higher 'composite qPCR vaginal health score' was associated with a higher IP-10 concentration (2.53 log10 pg/ml; p < 0.001) and lower IL-1α (−1.14 log10 pg/ml; p = 0.005), IL-8 (−1.55 log10 pg/ml; p = 0.002), and IL-12(p70) (−1.80 log10 pg/ml; p < 0.001) concentrations (Table 5). This vaginal health score was calculated as [log10 geq/ml (Lactobacillus genus) − log10 geq/ml (G. vaginalis + A. vaginae)] and a higher score therefore suggests better vaginal health 26 . The Lactobacillus genus concentration (which is one component of the vaginal health score) showed a similar pattern except that it was not significantly associated with the IL-1α concentration and the reduction in IL-12(p70) concentration did not reach statistical significance. The L. crispatus and L. vaginalis concentrations were not significantly associated with any immune mediator concentrations over time. The L. iners concentration was significantly positively associated with IP-10 and IL-8 concentrations and negatively associated with IL-1α concentration. The P. bivia concentration was positively associated with the IL-1α and IL-8 concentrations and negatively associated with the IP-10 concentration, and the E. coli concentration was positively associated with the IL-8 concentration.
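The composite score defined above is a difference of two log10 concentrations and can be sketched in a few lines. Two points are assumptions rather than details given in the text: the G. vaginalis and A. vaginae concentrations are summed before taking the logarithm, and undetected organisms are floored at 1 geq/ml so that the logarithm is defined. The function and parameter names are hypothetical.

```python
import math


def vaginal_health_score(lactobacillus_geq_ml, g_vaginalis_geq_ml,
                         a_vaginae_geq_ml, floor_geq_ml=1.0):
    """Composite qPCR vaginal health score.

    score = log10(Lactobacillus genus) - log10(G. vaginalis + A. vaginae),
    with all inputs in genome equivalents per ml (not log-transformed).
    floor_geq_ml is an assumed lower bound applied to undetected organisms.
    """
    lacto = max(lactobacillus_geq_ml, floor_geq_ml)
    bv_bacteria = max(g_vaginalis_geq_ml + a_vaginae_geq_ml, floor_geq_ml)
    return math.log10(lacto) - math.log10(bv_bacteria)


# Lactobacilli at 1e8 geq/ml against 1e4 geq/ml of G. vaginalis gives a
# score of about +4; the reverse dominance gives a negative score.
print(vaginal_health_score(1e8, 1e4, 0.0))
print(vaginal_health_score(1e5, 5e7, 5e7))
```

A positive score indicates that lactobacilli outnumber the two BV-associated bacteria, and each unit corresponds to a tenfold difference in concentration.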

Discussion
In this longitudinal study of young sub-Saharan African women, we confirmed that a Nugent score of 0-3 over an eight-week period was associated with consistently high concentrations of Lactobacillus species (regularly accompanied by much lower concentrations of the BV-associated bacteria G. vaginalis, A. vaginae, and P. bivia and the pathobiont E. coli), whereas incident BV was associated with significantly reduced concentrations of lactobacilli and increased concentrations of G. vaginalis, A. vaginae and P. bivia, but not E. coli. In women with a normal VMB throughout the study, VMB variations were larger within women over time than between women, as has been seen in other studies 19 [...] not been reported in those studies: L. crispatus was often accompanied by L. vaginalis; L. jensenii and L. gasseri were never present in most women; and E. coli was regularly present in almost all women. Another important pathobiont in the vaginal niche is Streptococcus agalactiae, and unfortunately, we only have qPCR data for that organism at baseline 29 . In a cross-sectional baseline analysis of all 430 women in the Vaginal Biomarkers Study using qPCRs, 16% had S. agalactiae 29 and 28% E. coli in their VMB 16 . The limited number of other molecular VMB studies that reported on S. agalactiae and E. coli carriage showed varying results, with generally lower detection in studies that employed 16S sequencing compared to qPCR 18,30,31 . Vaginal carriage of these pathobionts should be further investigated, preferably by qPCR in longitudinal studies, given their associations with vaginitis, reproductive health, and neonatal meningitis and sepsis 32 .

Table 2. Mean differences in VMB bacteria concentrations in women with a Nugent score of 0-3 during five visits over eight weeks by presence and phase of the menstrual cycle, presence of PSA and recent vaginal cleansing. Abbreviations: conc = concentration; diff = difference; PSA = prostate-specific antigen; SD = standard deviation; VMB = vaginal microbiota; vs = versus. 1 Expressed in log10 genome equivalents per ml (geq/ml). The expected value for women with a menstrual cycle in the follicular phase of the cycle. 2 The between-women and within-women standard deviations. 3 The mean difference in concentration (log10 geq/ml) between the luteal and follicular phases of the cycle for women with a menstrual cycle. 4 From the mixed effects linear regression models with each item in the first column as the outcome, individual women as random effects, and fixed effects as described in the first row of the table. For women with the bacteria present during at least 75% of visits and excluding the visits during which the bacteria was absent. We only included VMB bacteria that were consistently present (in at least 75% of the visits) in at least 25% of women. 5 The mean difference in concentration (log10 geq/ml) between women with amenorrhoea (all visits) and women with a cycle (all visits). 6 The mean difference in concentration (log10 geq/ml) between visits with PSA present versus not present for visits with the same presence and phase of menstrual cycle and the same vaginal cleansing status. 7 The mean difference in concentration (log10 geq/ml) between visits at which recent vaginal cleansing was reported versus not reported among visits with the same presence and phase of menstrual cycle and the same PSA status.

1 Expressed in log10 pg/ml. The expected value for women with a menstrual cycle in the follicular phase of the cycle. 2 The between-women and within-women standard deviations for women in the model with presence and phase of the cycle as fixed effects. 3 From mixed effects linear regression models with each item in the first column as the outcome, individual women as random effects, and including presence and phase of the menstrual cycle as fixed effects. 4 From mixed effects linear regression models (separate model for each clinical characteristic and PSA) with each item in the first column as the outcome, individual women as random effects, and including the item in the first row as fixed effects, and controlled for presence and phase of the menstrual cycle. A model with recent vaginal cleansing (the evening or morning before the visit) as fixed effect controlled for presence and phase of the menstrual cycle was also fitted but the data are not shown because none of the findings were statistically significant. The clinical characteristics are clinician-observed during speculum examination. 5 The mean difference in concentration (log10 pg/ml) between luteal and follicular phase visits in women with a menstrual cycle. 6 The mean difference in concentration (log10 pg/ml) between women with amenorrhoea (all visits) and women with a cycle (all visits). 7 The mean difference in concentration (log10 pg/ml) between visits with PSA present versus absent, for visits with the same presence and phase of menstrual cycle. 8 The mean difference in concentration (log10 pg/ml) between visits at which the vaginal pH was 4.0-4.5, or >4.5, each compared to <4, for visits with the same presence and phase of menstrual cycle. 9 The mean difference in concentration (log10 pg/ml) between visits with vaginal discharge present versus absent, for visits with the same presence and phase of menstrual cycle. 10 The mean difference in concentration (log10 pg/ml) between visits with cervical mucus present versus absent, for visits with the same presence and phase of menstrual cycle. 11 The mean difference in concentration (log10 pg/ml) between visits with cervical epithelial findings present versus absent, for visits with the same presence and phase of menstrual cycle. Cervical epithelial findings included abrasions, oedema, ecchymosis, petechiae, erythema, and ulcers.
In women with a normal VMB throughout the study, variations in concentrations of soluble immune mediators were greater within women over time than between women, which is in agreement with other studies 12,33,34 . In the women developing BV, incident BV was associated with increased concentrations of proinflammatory cytokines and decreased concentrations of the antiprotease elafin and IP-10. Other studies have reported similar proinflammatory profiles associated with BV, as well as increased proteolytic activity 35-38 . IP-10 findings across studies are more difficult to interpret (see below). It should be noted that incident or recurrent urogenital infections other than BV could have been responsible for some of the variation in immune mediators seen. However, none of the 80 women in this study had symptomatic vaginal candidiasis throughout the study and C. albicans, detected only occasionally and never more than twice in the same woman, was present in low concentrations in the majority of women. Furthermore, we screened all women for STIs at baseline and selected women without STIs (with the exception of chronic HSV-2 infection) for this sub-study. Only women with clinician-observed signs of urogenital infections during the eight-week follow-up period were retested for STIs, but such clinician-observed signs were rare (Table 1). We therefore believe that incident or recurrent STIs during the eight-week follow-up period were uncommon.
We assessed several other potential correlates of VMB bacteria as well as immune mediator concentration variations in women with a normal VMB throughout the study: the presence and phase of the menstrual cycle, the presence of PSA as a marker of recent sex, and recent vaginal cleansing. Amenorrhoeic women had a reduced concentration of lactobacilli (notably L. crispatus) compared to women with a menstrual cycle, even after controlling for PSA presence and recent vaginal cleansing, and this may be due to the induction of a hypo-oestrogenic state during injectable progestin use. Current evidence suggests that the VMB-destabilising effect of hypo-oestrogenism in these women is larger than any potential protective effect associated with the absence of regular menstrual bleeding 30 . Amenorrhoeic women also had increased concentrations of several proinflammatory immune mediators. This is in agreement with results from two African studies (Tanzania and South Africa/Kenya) 39,40 but in contrast to the results of a recent study in Kenyan women that showed sustained decreases in IL-6, IL-8, and IL-1RA after initiation of depot medroxyprogesterone acetate (DMPA) injectable contraception 41 . The comparison groups in these studies differed, with our study comparing amenorrhoeic injectable progestin users with all other women, the Tanzanian and South Africa/Kenya studies comparing current DMPA users with women not using hormonal contraception, and the Kenyan study comparing women before and after initiation of DMPA use. It is possible that DMPA use is immunosuppressive initially, as it binds to the corticosteroid receptor with an affinity similar to that of cortisol 42 , but becomes proinflammatory with prolonged use due to increasing hypo-oestrogenism, which in turn can lead to VMB dysbiosis and vaginal wall atrophy.
We did find some differences in VMB bacteria and the concentration of immune mediators in samples collected during luteal phase visits compared to follicular phase visits, but these patterns were not consistent. While levels of both oestrogen and progesterone are higher in the luteal phase than the follicular phase of the menstrual cycle, we sampled women around days 9 and 23 of their cycles, and these time points do not correspond with peak hormone levels. Oestrogen in particular is known to associate with higher concentrations of lactobacilli 17,43,44 . We did see increases in concentrations of Lactobacillus genus, L. crispatus, and L. jensenii at luteal phase visits, but these did not reach statistical significance. The differences in mean immune mediator concentrations between the luteal and follicular phases that we observed were not seen in the earlier mentioned Tanzanian study, and that study assessed menstrual cycle stage more carefully by urine pregnanediol 3-glucuronide testing 39 .
Vaginal sex in the last 24-48 hours, as measured by the presence of PSA in vaginal swab eluates, was associated with lower concentrations of various lactobacilli, with L. iners showing the greatest reduction. PSA presence was also associated with higher concentrations of IL-6, IL-12(p70), and IP-10. Similar effects of recent vaginal sex on the VMB have been previously reported by us 33 and by others 17,45-47 . A direct effect was demonstrated in vitro when seminal plasma was co-cultured with cervical epithelial cells 48 . The VMB-destabilising and proinflammatory effects of sexual activity are likely due to the direct effect of seminal fluid, as condom use seems to prevent them 45,49 . Recent vaginal cleansing was not significantly associated with any changes in VMB bacteria or immune mediator concentrations in any of our analyses.
Using data from all 80 women, we investigated the direct associations between the concentrations of VMB bacteria and vaginal immune mediator concentrations over time while controlling for presence and phase of the menstrual cycle and PSA presence. Perhaps the most significant finding was that a higher 'composite qPCR vaginal health score' (suggesting better vaginal health) was associated with decreased concentrations of all three modelled proinflammatory cytokines (IL-1α, IL-8, and IL-12(p70)) and an increased concentration of IP-10. Unfortunately, the sample size of this sub-study was small; some associations that were statistically significant in the cross-sectional analyses of the Vaginal Biomarkers Study baseline data showed the same trends in this sub-study but did not reach statistical significance 22 . When interpreting the cross-sectional 16,22 and longitudinal data together, we conclude that Lactobacillus species are associated with an increase in IP-10 and reductions (L. crispatus, L. vaginalis) or no change in multiple proinflammatory cytokines; BV-associated bacteria are associated with a decrease in IP-10 and increases in multiple proinflammatory cytokines; and E. coli and S. agalactiae are associated with increases in IP-10 and multiple proinflammatory cytokines (the S. agalactiae data are unpublished). A cross-sectional Canadian study employing 16S sequencing to characterise the VMB reported very similar results: a decrease in IP-10 and increases in multiple proinflammatory cytokines in women with BV (community state type (CST)-4), and no inflammation but an increase in IP-10 in women with a L. iners dominated VMB (CST-3) 50 . A longitudinal South African study also found significant increases in multiple proinflammatory cytokines at visits during which vaginal dysbiosis was detected, but no association with IP-10 12 . IP-10 (also known as CXCL10) is induced by type I and II interferons and TNF-α and is a ligand for the CXCR3 receptor 51-53 . IP-10 levels are generally elevated in uncontrolled viral infection, but a reduction of IP-10 levels by pathogenic bacteria, and particularly combinations of bacteria, has been described before 54-57 . The significance of this remains unclear. A recent study among women in South Africa by Masson et al. found that increased IL-1β and reduced IP-10 concentrations in female genital secretions of HIV-negative women predicted the presence of BV and/or other treatable discharge-causing STIs 37 . The combination of these two biomarkers identified a significantly higher proportion (77%) of women with BV and treatable STIs than clinical criteria (19%). Consequently, the authors suggested exploring the use of those biomarkers in the detection of BV and discharge-causing STIs 38 .

Table 4. Differences in VMB bacteria and immune mediator concentrations in 40 women with incident BV between the visit before the incident BV visit and the incident BV visit. Abbreviations: BV = bacterial vaginosis; G-CSF = granulocyte colony stimulating factor; GM-CSF = granulocyte macrophage colony stimulating factor; IL = interleukin; IP-10 = interferon-inducible protein 10; MIP-1β = macrophage inflammatory protein 1β; SLPI = secretory leukocyte protease inhibitor; VMB = vaginal microbiota. 1 The first incident BV visit was visit 2 for 16 women, visit 3 for 7 women, visit 4 for 11 women and visit 5 for 6 women. 2 Expressed in log10 genome equivalents per ml (geq/ml) for VMB bacteria and log10 pg/ml for immune mediators. 3 Wilcoxon signed rank tests. 4 log10 geq/ml (Lactobacillus genus) − log10 geq/ml (G. vaginalis + A. vaginae).

Our study had some limitations. Unfortunately, we could not afford to quantify all relevant bacteria and immune mediators in all longitudinal samples from all participants of the Vaginal Biomarkers Study. The current sub-study design was considered a next best but feasible alternative. This design required us to select women based on longitudinal Nugent scores, which are a cruder way of classifying VMBs than the molecular methods we employed in the sub-study. However, multiple studies have shown a good correlation between the two methods in classifying women as having a lactobacilli-dominated or dysbiotic VMB 1 , with the molecular testing adding nuance. Our sub-study design also reduced our statistical power, especially related to the potential effects of VMB minority species. Due to the stringent selection criteria, our results may not be generalisable to all women.
In conclusion, our well-controlled longitudinal data confirm the inflammatory nature of anaerobic vaginal dysbiosis, E. coli colonisation, recent vaginal sex, and progestin-injectable use. While anaerobic vaginal dysbiosis or BV is by far the most common vaginal dysbiosis, high abundance of E. coli, S. agalactiae, and other pathobionts as a distinct inflammatory vaginal dysbiosis deserves further study. The roles of a selection of the vaginal mediators (IL-1α, IL-1β, IL-8, IL-12, IP-10) with or without the composite qPCR vaginal health score as predictive biomarkers for the above conditions warrant further investigation.

Table 5. Longitudinal associations between VMB bacteria and immune mediator concentrations among all visits of all 80 women (with Nugent 0-3 throughout and with incident BV) over the eight-week study period. Abbreviations: BV = bacterial vaginosis; Est = model estimate; IL = interleukin; IP-10 = interferon-inducible protein 10; qPCR = quantitative polymerase chain reaction; PSA = prostate-specific antigen; VMB = vaginal microbiota; vs = versus. 1 From mixed effects multiple regression models with each item in the first column as the outcome, individual women as random effects, and fixed effects for all variables in the first row. 2 log10 geq/ml (Lactobacillus genus) − log10 geq/ml (G. vaginalis + A. vaginae). 3 For women with the bacteria present during at least 75% of visits and excluding the visits during which the bacteria was absent. We only included VMB bacteria that were consistently present (in at least 75% of the visits) in at least 25% of women.

Methods
Women with positive HSV-2 serology at baseline were included due to the high prevalence of 34%. A total of 54 and 48 women qualified for the reference and incident BV groups, respectively, and 40 women were selected for each group from among the qualifying women at random. No matching was done. Women were followed for five consecutive visits over eight weeks. The visits were tightly scheduled around the menstrual cycle with the enrolment visit (visit 1) scheduled shortly after the last day of the menstrual period on day 9 (±2 days) of the cycle; the absence of menses was verified during vaginal examination at this visit. The next four visits were scheduled at two-week intervals over two menstrual cycles (visits 2-5). Thus, visits 3 and 5 coincided with day 9 (±2 days) of the menstrual cycle (the follicular phase) and visits 2 and 4 with day 23 (±2 days) of the cycle (the luteal phase). The same visit schedule was followed for women using hormonal contraception, including those who were amenorrhoeic due to progestin-injectable use. At baseline, eligible women interested in participating provided written informed consent, were interviewed about sociodemographic and behavioural characteristics, underwent a physical and vaginal examination, and were tested for HIV and the above-mentioned reproductive tract infections. Interviews and vaginal examinations were also done at all subsequent visits.
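The visit schedule described above maps each visit number to a target cycle day and hence to a cycle category for analysis. The following is a minimal sketch of that mapping; the names are hypothetical, and treating amenorrhoeic women as a separate category regardless of visit number is an assumption based on how the models distinguish presence and phase of the menstrual cycle.

```python
# Target cycle day per visit: day 9 (follicular) or day 23 (luteal).
TARGET_CYCLE_DAY = {1: 9, 2: 23, 3: 9, 4: 23, 5: 9}
WINDOW_DAYS = 2  # visits were allowed within +/- 2 days of the target day


def visit_phase(visit_number, has_menstrual_cycle):
    """Return the cycle category used for a visit."""
    if visit_number not in TARGET_CYCLE_DAY:
        raise ValueError("visit_number must be between 1 and 5")
    if not has_menstrual_cycle:
        return "amenorrhoeic"
    return "follicular" if TARGET_CYCLE_DAY[visit_number] == 9 else "luteal"


def within_window(visit_number, actual_cycle_day):
    """Check whether a visit fell inside its scheduled window."""
    target = TARGET_CYCLE_DAY[visit_number]
    return abs(actual_cycle_day - target) <= WINDOW_DAYS


print(visit_phase(2, True))    # luteal
print(within_window(3, 12))    # False: day 12 is outside day 9 +/- 2
```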
Sample collection. At each of the five visits included in this sub-study, the following samples were collected before any other procedures in the following order: two sterile flocked swabs (Copan Diagnostics, Inc., Murrieta, CA) that were rotated against the mid-portion of the vaginal wall under visual inspection, dipped in the posterior fornix and carefully removed to prevent contamination; and a CVL that was obtained by gently flushing 10 ml normal saline through the speculum and aspirating the fluid from the posterior fornix. At each study site, one trained clinician performed all the examinations using one standard operating procedure to minimise inter-clinician variability.
Sample processing. CVLs were collected in 15 ml falcon tubes, kept on ice for transport (2-8 °C), and processed within a maximum of one hour after collection. CVLs were centrifuged at 1,000 x g for 10 minutes at 4 °C and supernatants (~9 ml) were aliquoted into three fractions; two of approximately 4 ml each and one of 1 ml. The aliquots were stored at −80 °C locally. CVLs and vaginal swabs, frozen at −80 °C, were shipped in batches using a temperature-monitored dry shipper to the central laboratory at the ITM in Antwerp, Belgium, where they were stored at −80 °C before analysis of soluble immune mediators and VMB bacteria.
Characterisation of vaginal microbiota. Vaginal Gram-stained slides were examined and scored at the ITM using the Nugent method 14 . qPCR was performed on extracted DNA from vaginal swab eluates for the following ten species and one genus: Lactobacillus genus, L. crispatus, L. gasseri, L. iners, L. jensenii, L. vaginalis, A. vaginae, G. vaginalis, E. coli, P. bivia, and C. albicans. The qPCR assays were run in duplicate at the ITM and at the University of Ghent, Belgium, as previously described 16,18 . The number of organisms was expressed as genome equivalents per ml (geq/ml); the genomic concentration was calculated using the described genomic sizes of the type strains.
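The conversion from a DNA quantity to genome equivalents can be illustrated with a minimal sketch. It is not the authors' pipeline (the qPCR procedure is only cited as previously described). It assumes the common approximation of 650 g/mol per double-stranded base pair, and the function name and the 2.0 Mb genome size in the usage example are hypothetical.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # ASSUMPTION: average mass of one double-stranded base pair


def genome_equivalents(dna_ng: float, genome_size_bp: float) -> float:
    """Return the number of genome copies represented by `dna_ng` nanograms of DNA.

    `genome_size_bp` would be the published genome size of the type strain.
    """
    if dna_ng < 0 or genome_size_bp <= 0:
        raise ValueError("dna_ng must be >= 0 and genome_size_bp must be > 0")
    grams_of_dna = dna_ng * 1e-9
    grams_per_genome = genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO
    return grams_of_dna / grams_per_genome


# Hypothetical usage: 1 ng of DNA from an organism with a 2.0 Mb genome.
copies = genome_equivalents(dna_ng=1.0, genome_size_bp=2_000_000)
```

Dividing the result by the sample volume that the measured DNA represents (in ml) would then give geq/ml; that volume depends on extraction and elution details not given here.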

Quantification of soluble immune mediators in CVLs. Concentrations of IL-1α, IL-1β, IL-6, IL-12(p70), MIP-1β, IP-10, and IL-8, and of the growth factors GM-CSF and G-CSF, in CVLs were measured at the ITM using the Bio-Plex™ human cytokine assay kit (Bio-Rad Laboratories NV-SA, Nazareth, Belgium) as previously described 33 . Elafin, SLPI, IL-1RA and the total protein concentration in CVLs were measured in the Laboratory of Genital Tract Biology, Brigham and Women's Hospital, Boston, MA, USA. Elafin and SLPI were quantified using ELISA kits from R&D Systems (Minneapolis, MN) following the manufacturer's instructions. IL-1RA was measured using the Meso Scale Discovery (MSD) multiplex platform and Sector Imager 2400 (MSD, Gaithersburg, MD). The MSD Discovery Workbench Software was used to convert relative luminescent units into protein concentrations (pg/ml) using interpolation from several log calibrator curves. Total protein in CVLs was determined by a bicinchoninic acid (BCA) assay (Thermo Scientific, Rockford, IL) using the Victor2 counter. Optical densities were read at 450 nm with a second reference filter of 570 nm using a Victor2 multi-label reader and WorkOut Software (PerkinElmer, Waltham, MA).
Prostate-specific antigen detection. PSA was measured in vaginal swab eluates using the Seratec® PSA semiquant assay (Seratec Diagnostica, Göttingen, Germany). A volume of 150 µl of the eluate, in diluted phosphate-buffered saline (1,200 µl; 1 part phosphate-buffered saline and 9 parts saline, pH 7.4), was centrifuged for 10 min at 13,000 × g. After centrifugation, 120 µl of supernatant was used for testing according to the manufacturer's instructions.
Data analysis. Statistical analyses were performed using Stata 13 (StataCorp, College Station, TX), SAS 9.4 (SAS Institute Inc, Cary, NC) and R 3.0.1 (The R Foundation, Vienna, Austria). Over the five study visits, the detection of individual VMB bacteria by qPCR was classified as follows: never present; sporadically present (present at 25% or fewer visits); regularly present (present at 26-74% of visits); and consistently present (present at 75% or more visits). The concentrations of VMB bacteria (in geq/ml) and immune mediators (in pg/ml) were log10-transformed in all analyses.
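The four detection categories can be expressed as a small function. This is an illustrative sketch only: the function name and the boolean-list input format are assumptions, and the thresholds are taken directly from the definitions above.

```python
from typing import Sequence


def classify_presence(detected: Sequence[bool]) -> str:
    """Classify one bacterium in one woman from per-visit qPCR detection flags.

    `detected` holds one boolean per attended visit (True = detected by qPCR).
    """
    if len(detected) == 0:
        raise ValueError("at least one visit is required")
    fraction = sum(detected) / len(detected)
    if fraction == 0:
        return "never present"
    if fraction <= 0.25:
        return "sporadically present"
    if fraction < 0.75:
        return "regularly present"
    return "consistently present"


# With five visits, 1 positive visit is sporadic, 2-3 are regular, 4-5 are consistent.
label = classify_presence([True, True, False, True, True])
```

With exactly five visits the possible fractions are 0, 20, 40, 60, 80 and 100%, so no woman falls between the 25% and 26% or the 74% and 75% boundaries.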
For the women with a normal VMB throughout the study, longitudinal variations in the concentrations of VMB bacteria were assessed in mixed effects linear regression models for those VMB bacteria consistently present in at least 25% of these women. All models included one VMB bacterium as the outcome and individual women as random effects. We added the following fixed effects: sampling in the luteal (visits 2 and 4) versus follicular phase (visits 1, 3 and 5) of the menstrual cycle, the absence (amenorrhoea due to progestin-injectable use) or presence of a menstrual cycle (either a natural cycle or regular withdrawal bleeds during combined contraceptive use), presence of PSA as a marker of sex within the last 24-48 hours 23,24 , and recent vaginal cleansing (the evening or morning just prior to the study visit).
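One way to write the model described in this paragraph is shown below. It is a sketch under a stated assumption: "individual women as random effects" is read here as a random intercept per woman, which is the usual interpretation but is not spelled out in the text. The symbols are introduced for illustration only.

```latex
% y_{ij}: log10 concentration of one VMB bacterium for woman i at visit j
% All covariates are 0/1 indicators; u_i is the woman-level random intercept (assumed).
\begin{equation*}
y_{ij} = \beta_0
       + \beta_1\,\mathrm{luteal}_{ij}
       + \beta_2\,\mathrm{amenorrhoea}_{ij}
       + \beta_3\,\mathrm{PSA}_{ij}
       + \beta_4\,\mathrm{cleansing}_{ij}
       + u_i + \varepsilon_{ij},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{equation*}
```

Under this reading, $\beta_1$ is the mean difference in log10 concentration between luteal-phase and follicular-phase visits, holding the other covariates fixed, and $u_i$ accounts for the correlation between repeated measurements in the same woman.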
For the women with a consistently normal VMB, mixed effects linear regression models were also fitted with each immune marker concentration as the outcome, individual women as random effects, and including presence and phase of the menstrual cycle as fixed effects. Further mixed effects linear regression models for each marker as outcome, controlled for presence and phase of the menstrual cycle, were fitted separately for the following covariates: vaginal pH category (<4.0, 4.0-4.5, >4.5), presence of clinician-observed abnormal vaginal discharge, cervical mucus, a cervical epithelial finding (abrasion, laceration, ecchymosis, petechiae, erythema, or ulcer), vaginal cleansing, and PSA. These covariates were selected based on previous cross-sectional analyses in the same study population 16,21,22 and based on the published literature 7,39 .
In the women with incident BV, we assessed the mean change in concentrations of VMB bacteria and immune mediators between the visit preceding the first incident BV visit and the first incident BV visit using Wilcoxon signed rank tests. Furthermore, the direct associations between VMB bacteria concentrations (for VMB bacteria consistently present in at least 25% of the women) and immune marker concentrations (IL-1α, IL-8, IL-12(p70), IP-10, and elafin) were determined in mixed effects linear regression models in all 80 women. All models included the concentration of an individual VMB bacterium as the outcome, individual women as random effects, and the following fixed effects: each immune mediator concentration, PSA presence, and presence and phase of menstrual cycle. We also considered a 'triple taxa qPCR vaginal health score' based on the concentrations of three key VMB bacteria [log10(Lactobacillus genus) − log10(G. vaginalis + A. vaginae)] as the outcome because this score was shown to be the best indicator of vaginal health in the Vaginal Biomarkers Study 26 .
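The score formula can be computed directly from the three qPCR concentrations. The sketch below is illustrative: the function and argument names are hypothetical, and the handling of undetected taxa is an assumption, since the text does not say how zero concentrations were treated before taking logarithms.

```python
import math


def vaginal_health_score(lactobacillus_geq: float,
                         g_vaginalis_geq: float,
                         a_vaginae_geq: float) -> float:
    """log10(Lactobacillus genus) - log10(G. vaginalis + A. vaginae), inputs in geq/ml.

    ASSUMPTION: inputs must give positive log arguments; how undetected taxa
    (concentration 0) were handled in the study is not stated, so this raises.
    """
    dysbiosis_sum = g_vaginalis_geq + a_vaginae_geq
    if lactobacillus_geq <= 0 or dysbiosis_sum <= 0:
        raise ValueError("concentrations must be positive to take log10")
    return math.log10(lactobacillus_geq) - math.log10(dysbiosis_sum)


# Hypothetical sample: 1e8 geq/ml lactobacilli versus 1e6 geq/ml G. vaginalis + A. vaginae.
score = vaginal_health_score(1e8, 1e5, 9e5)
```

A positive score means lactobacilli outnumber the two BV-associated taxa combined, and each unit corresponds to a tenfold difference in concentration.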
Data availability. According to the Institute of Tropical Medicine's policy, all data are available from the Institute of Tropical Medicine Institutional Data Access for researchers who meet the criteria for access to confidential data. Requests for data access can be made by emailing Mr. Jef Verellen, Quality Specialist at ITMresearchdataaccess@itg.be.