Intermittent Lactobacilli-containing Vaginal Probiotic or Metronidazole Use to Prevent Bacterial Vaginosis Recurrence: A Pilot Study Incorporating Microscopy and Sequencing

Bacterial vaginosis (BV) is associated with HIV acquisition and adverse pregnancy outcomes. Recurrence after metronidazole treatment is high. HIV-negative, non-pregnant Rwandan BV patients were randomized to four groups (n = 17/group) after seven-day oral metronidazole treatment: behavioral counseling only (control), or counseling plus intermittent use of oral metronidazole, Ecologic Femi+ vaginal capsule (containing multiple Lactobacillus and one Bifidobacterium species), or Gynophilus LP vaginal tablet (L. rhamnosus 35) for two months. Vaginal microbiota assessments at all visits included Gram stain Nugent scoring and 16S rRNA gene qPCR and HiSeq sequencing. All interventions were safe. BV (Nugent 7–10) incidence was 10.18 per person-year at risk in the control group, and lower in the metronidazole (1.41/person-year; p = 0.004), Ecologic Femi+ (3.58/person-year; p = 0.043), and Gynophilus LP groups (5.36/person-year; p = 0.220). In mixed effects models adjusted for hormonal contraception/pregnancy, sexual risk-taking, and age, metronidazole and Ecologic Femi+ users, each compared to controls, had higher Lactobacillus and lower BV-anaerobes estimated concentrations and/or relative abundances, and were less likely to have a dysbiotic vaginal microbiota type by sequencing. Inter-individual variability was high and effects disappeared soon after intervention cessation. Lactobacilli-based vaginal probiotics warrant further evaluation because, in contrast to antibiotics, they are not expected to negatively affect gut microbiota or cause antimicrobial resistance.

(by culture and/or wet mount) and treated with 500 mg oral metronidazole twice per day for seven days. An enrollment visit was scheduled within three days after treatment completion, and only women whose treatment had been successful (no BV by modified Amsel criteria and no TV on wet mount), and who were still not pregnant and free of urinary tract infection and syphilis, were randomized to the four groups described in the manuscript. Women could be rescreened a maximum of three times.
The Chief Investigator (JvdW) based in Liverpool used a random number generator to assign participant identification numbers to groups in blocks of four. At the Rinda Ubuzima clinic, each eligible woman was assigned the next available participant identification number, and the corresponding sealed envelope was opened to reveal her randomization group. The laboratory technicians were blinded but it was not possible to blind the clinicians and participants. Behavioral counseling was offered to all participants in all randomization groups. The counseling focused on reducing known risk factors for BV such as unprotected sex, vaginal hygiene practices, male partner penile hygiene practices, and alcohol use (which can lead to unprotected sex and can cause severe side effects when used in combination with metronidazole). At follow-up visits, participants underwent a face-to-face interview, counselling, speculum examination, laboratory assessments (BV by modified Amsel and Nugent criteria, TV by wet mount and culture, vulvovaginal candidiasis by wet mount, and additional testing if clinically indicated), and sampling for future molecular testing. HIV, sexually transmitted infection, urinary tract infection, and pregnancy tests were repeated at M6 only. All participants were offered male condoms free of charge at each study visit. Forty six participants made unscheduled visits to collect treatment for an infection that was diagnosed by laboratory testing after the participant had left the clinic (n=35 women), because of new symptoms (n=14), and/or to withdraw informed consent (n=1).

Dosing and adherence
The dosing regimens were based on earlier studies (metronidazole) and on marketing approvals (vaginal probiotics), except that GynLP was used every four days for two months instead of the three weeks recommended by the company so that all women in the intervention groups dosed for two months. We had initially intended to use the Sobel et al regimen of a metronidazole gel, 3 but we were not able to identify any metronidazole gels that had proven stability at 25°C and 30°C for up to two years. Given that the trial was conducted in Rwanda, the University of Liverpool Sponsorship Committee required proven stability at these temperatures, and we ended up using metronidazole tablets that did have proven stability. The study products were stored in an air-conditioned room with controlled access until handed to participants. Adherence to the interventions was assessed at the D7, M1, and M2 visits by structured interviewer-administered questionnaire, review of a diary card that the participant completed in between study visits, review of returned used packaging and unused product, and by asking the participant to complete a self-rating adherence scale. These different sources of adherence data were triangulated to arrive at an overall level of adherence (between 0-100%) for each participant between visits. Women were allowed to cease product use during menstrual bleeding (done by 12 women), and the adherence data were not adjusted for this.
In the probiotic strains detection discussion in the manuscript, we state that we estimate the average total bacterial load of the vagina to be in the order of 2x10 10 bacteria. This is based on the following assumptions. The average vaginal surface area was estimated to be 87.5 cm 2 . 4 One Dacron swab head in this study absorbed about 10 6 /µL bacteria. If we assume that one swab head absorbs on average 200 µL, 5 and that this covers about one cm 2 of a total of about 100 cm 2 vaginal surface, the total bacterial load in the vagina would be in the order of 2x10 10 bacteria.

Diagnostic testing
All diagnostic testing was conducted at Rinda Ubuzima or the National Reference Laboratory in Kigali using validated procedures. Vaginal swab, blood, and urine specimens were processed on the collection day and either tested immediately or stored at -80°C until testing. Dacron vaginal swabs were used for wet mounts, Gram stains, and TV InPouch culture (Biomed Diagnostics, White City, OR, USA) at all study visits, as described in the manuscript. All other diagnostic tests were only done at screening, M6, and when judged clinically necessary by the physician, with the exception of pregnancy and urinalysis tests, which were repeated at enrollment prior to randomization. Whole blood was tested for HIV 1/2 using the Kehua HIV Rapid Test (Kehua Bio-engineering, Shanghai, China), followed by the Alere Determine HIV-1/2 Rapid Test (Abbott Laboratories, Tokyo, Japan) for confirmation of positive results and the Unigold HIV Rapid Test (Trinity Biotech, Bray, Ireland) as tie-breaker (if applicable). Urine was tested for pregnancy using an hCG dipstick test and urinary tract infection using a urinalysis dipstick test (both by Nova, Atlast Link Technology, Beijing, China). Plasma was tested for herpes simplex type 2 serology (Kalon, Guildford, UK; using an optical density cut-off of >1.1 for a positive result and <0.9 for a negative result) and syphilis by Rapid Plasma Reagin test followed by Treponema pallidum Hemagglutination Assay (both by Spinreact, Girona, Spain). Endocervical swabs were tested for C. trachomatis and N. gonorrhoeae by real-time PCR (Presto, Beek, The Netherlands). 6 A sub-sample of 49 M6 samples were tested by GeneXpert CT/NG assay (Cepheid, CA, USA) after study completion. The sensitivities and specificities of the two real-time PCR assays are high and comparable. 7

DNA extraction
For molecular testing, the physician collected two Dacron vaginal swabs per woman during speculum examinations at screening, enrollment, and each scheduled follow-up visit (N=1,016 physician-sampled swabs). The small discrepancy between expected ([176+74+65+66+66+64] x2=1,022; Figure 1) and collected physician-sampled swabs can be explained as follows: six of the 176 women who attended a screening visit did not provide swabs, four of the 68 randomized women missed at least one study visit, and a few extra swabs were taken from women who attended an enrollment visit but turned out to be ineligible for randomization and from two randomized women. The swab heads were stored dry at Rinda Ubuzima at -80°C on the collection day. The 12 women in the self-sampling group self-sampled two flocked swabs (Copan Diagnostics, Murrieta, CA, USA) every Monday, Wednesday, and Friday of the first month after randomization (N=258 self-collected swabs). The discrepancy between expected (12x12x2=288) and collected self-collected swabs can be explained as follows: one of the 12 women completed the study but never selfsampled, and the remaining 11 women combined missed one self-sampling time point and sampled one instead of two swabs at four time points. The swab heads were initially stored at room temperature in the participant's home in cryovials containing 1 ml RNALater (Thermo Scientific, Paisley, UK), and then at Rinda Ubuzima at -80°C within seven days after collection. Frozen samples were shipped to Liverpool on dry ice. DNA extraction and sequencing were done at the University of Liverpool Centre for Genomic research. 8 DNA was extracted from one sample per participant per time point (N=639 swabs). Those frozen dry were thawed. Those frozen in RNALater were thawed and then centrifuged at 16,000 g for 10 minutes at 4 °C: supernatants were discarded, and pellets and swab heads were used for DNA extraction. DNA was extracted from each sample by adding 180 µl of enzymatic lysis buffer containing lysozyme (Sigma-Aldrich, Dorset, UK); incubation for 30 minutes at 37 °C; adding 25 µl of proteinase K and 200 µl of buffer AL using the Qiagen DNeasy Blood and Tissue kit (Qiagen, Manchester, UK); incubation for 30 minutes at 56 °C; and bead-beating after adding 200 mg of 0.1 mm zirconia/silica beads (Thistle Scientific, Glasgow, UK) on a Qiagen TissueLyser II (Qiagen, Manchester, UK) for 5 minutes at 25 Hz. Next, 200 µl of 100% ethanol was added, the sample was centrifuged, the swab head was discarded, and the pellet was purified in four subsequent centrifugation steps after adding one-by-one 200 µl 100% ethanol, 500 µl buffer AW1, 500 µl buffer AW2 and 75 µl buffer AE as per manufacturer's instructions (Qiagen, Manchester, UK). We included one negative control (an empty tube) with each DNA extraction round of 24 study samples to be able to detect contaminants in extraction reagents downstream. The DNA concentration of randomly selected samples was measured by Qubit (Invitrogen, Thermo Scientific, Paisley, UK) and the DNA quality of all samples by Nanodrop (Thermo Scientific, Paisley, UK).

PCR amplification and 16S rRNA sequencing
Each of the 667 DNA samples (639 study samples and 28 negative controls) underwent two PCR rounds for 16S rRNA gene amplification and barcoding. In the first PCR round, the V3-V4 region of the 16S rRNA gene was amplified as described previously. 9 DNA was amplified in a 25 µl reaction volume using 1.25 µl of a 10 µM concentration of 319F 5'-ACTCCTACGGGAGGCAGCAG-3' forward primer and 1.25 µl of a 10 µM concentration of 806R 5'-GGACTACHVGGGTWTCTAAT-3' reverse primer, 12.5 µl NEB Next HF 2x PCR Master Mix (New England Biolabs, Hitchin, UK), 9 µl of nuclease-free water and 1 µl of DNA extraction product. The first denaturation cycle took 30 seconds at 98 °C and was followed by 10 cycles consisting of a denaturation cycle of 10 seconds (at 98 °C), an annealing cycle of 30 seconds (at 58 °C), an extension cycle of 30 seconds (at 72 °C), and a final extension cycle of 5 minutes at 72 °C. PCR products were then purified and sizeselected using Agencourt AMPure XP beads (Beckman Coulter, High Wycombe, UK) in a 0.8:1.0 bead-to-sample ratio. The second PCR round was to barcode V3-V4 sequences by a dual-index approach using standard Illumina Nextera XT index kit v2 (Illumina, San Diego, CA, USA), permitting multiplexing of up to 384 samples. The barcoding was performed in a 25 µl reaction volume using 2.5 µl of Index 1 primer, 2.5 µl of Index 2 primer, 12.5 µl NEB Next HF 2x PCR Master Mix and 7.5 µl sample. The first denaturation cycle took 3 minutes at 98 °C and was followed by 15 cycles consisting of a denaturation cycle of 30 seconds (at 98 °C), an annealing cycle of 30 seconds (at 55 °C), an extension cycle of 30 seconds (at 72 °C), and a final extension cycle of 5 minutes at 72 °C. PCR products were then purified using AMPure beads as explained above, again in a 0.8:1.0 bead-to-sample ratio. We added a negative control to each PCR run (10 µl of nuclease-free water instead of 9 µl of nucleasefree water and 1 µl of DNA) to identify contaminants, as well as a commercially available positive control (10 µl of 0.2 ng/µl ZymoBiomics Microbial Community DNA standard; Zymo Research Corp, Irvine, CA, USA). The DNA extraction negative controls were also included in the PCR runs. DNA from samples collected at different visits but belonging to the same participant were included in the same PCR run to control for inherent differences between PCR runs. PCR product DNA concentrations of each sample (N=683: 639 study samples, eight positive PCR controls, eight negative PCR controls, and 28 negative DNA extraction controls) were measured using the Qubit Fluorometer with the dsDNA HS Assay kit (Invitrogen, Thermo Scientific, Paisley, UK). All samples, including the positive and negative controls, were successfully amplified and used for subsequent steps.
Amplicons from samples were evenly pooled into sequencing libraries at a mass of 0.8 ng DNA per amplicon. To achieve this, Qubit DNA concentrations were used to calculate the volumes of each study sample to be added. Samples with a DNA concentration of <0.30 ng/µl (e.g., the negative controls) were added in a fixed volume of 1 µl. The two libraries were sequenced on an Illumina HiSeq instrument (Illumina, San Diego, CA, USA), run in rapid mode, 2x300bp using a 250PE and 50PE kit. DNA from samples collected at different visits but belonging to the same participant were included in the same library (and hence, sequencing run) to control for inherent differences between sequencing runs.
Panbacterial 16S rRNA gene qPCR The panbacterial 16S rRNA gene copy concentrations of all samples collected at study visits and containing at least 1,111 reads (rarefaction depth) by Illumina HiSeq sequencing (N=393; see 'Further data processing' below for rationale) were determined at the Institute for Genome Sciences of the University of Maryland (Baltimore, MD, USA) using the BactQuant qPCR assay. This assay was developed based on an analyses of 4,938 16S rRNA gene sequences in the Greengenes database. 10,11 The DNA samples were tested as described previously. 11,12 Briefly, 1.5 µl of template (1:10 diluted DNA) was added to 3.5 µl of reaction mix, with the final reaction containing 1.8 µM each of the forward (341F) and reverse (806R) primer targeting the 16S V3-V4 region, 225 nM of the TaqManW probe, 1X Platinum Quantitative PCR SuperMix-UDG with ROX (Invitrogen, Thermo Scientific, Waltham, MA, USA) and molecular-grade water. Each experiment included an in-run standard curve (ranging from 10 to 10 8 , with 10 2 -10 8 in 10-fold serial linear dilutions) and no-template controls performed in triplicate. Amplification and real-time fluorescence detections were performed on the Bio-Rad CFX 384 instrument (Bio-Rad Inc., Hercules, CA, USA) using the following PCR conditions: three minutes at 50 °C for UDG treatment, 10 minutes at 95 °C for Taq activation, 15 seconds at 95 °C for denaturation and one minute at 60 °C for annealing and extension, times 40 cycles. Cycle threshold (Ct) value for each 16S qPCR reaction were obtained using a manual Ct threshold of 0.05 and automatic baseline. The 16S rRNA gene concentration was reported in copies/µL for each sample.

Molecular data processing
We obtained a mean raw unpaired read count of 374,543 reads per study sample (95% confidence interval (CI): 305,845 -443,242) in run 1 and 302,431 reads per study sample (95% CI: 266,423 -338,440) in run 2. Reads were first demultiplexed, and primer sequences were removed from forward and reverse reads using Cutadapt 1.2.1. 13 All subsequent steps were performed in the DADA2 v1.4.0 package for large paired end datasets in R v3.4.3 (R core team, 2015). 14 We chose DADA2 because of its ability to resolve reads differing by only one nucleotide, maximizing our chances of differentiating probiotic reads from naturally occurring Lactobacillus species reads. Error correction was performed using the fastqFilter command with parameter settings aiming to maximize read retention. For forward and reverse reads, respectively, the minimum read lengths (truncLen) were set to 260 and 210 based on the quality plots, maxEE to a maximum of 5 and 8 expected errors, maxN to zero ambiguous bases allowed, and truncQ to zero. Around 10% of reads were discarded after error correction. Next, error rates of forward and reverse reads were determined using the learnErrors command. Forward and reverse reads were dereplicated (assigned to unique amplicon sequence variants (ASVs)) using the derepFastq command, and denoised (ASVs with higher than average error rates discarded) using the dada command. 14,15 Forward and reverse reads were then merged into overlapping reads using the mergePairs command. Bimeras (chimeric compositions of two separate parent ASVs) were removed using the removeBimeraDenovo command with the Silva v128 database as the reference database; 16 10.1% of ASVs were identified as bimeric and removed. Overall, a median of 16% of the raw reads per study sample was removed during these DADA2 clean-up steps.
Taxonomic assignment was also done in DADA2 in two steps: assignTaxonomy to map ASVs to taxa at genus level or above using the RDP classifier with a minimum bootstrap value of 50% and the Silva v128 database as the reference database, 16,17 followed by addSpecies to map ASVs to species level, allowing only ASVs with exact (100%) matches with species in the Silva database to be assigned to that species, and allowing assignment of one ASV to multiple species.
Further data processing Further data processing was performed in Microsoft Excel 2013 (starting with a spreadsheet containing the sequences, taxonomic assignments, and read counts for each ASV per sample) and STATA v13 (StataCorp, College Station, TX, USA). We removed all rare ASVs (defined as a read count in all samples combined of less than 100), four non-bacterial ASVs, and two likely contaminant ASVs (a Rhodanobacter glycinis/terrae and a Sneathia genus) that were present in more than one negative control at relative abundances higher than in any study sample. The vaginal taxa BV-associated bacterium 1 (BVAB1), BVAB2, Mageeibacillus indolicus (BVAB3), BVAB TM7 and Fenollaria massiliensis are not included in the Silva v128 database but their sequences have been published elsewhere. Similarly, the Lactobacillus and Bifidobacterium species included in the vaginal probiotic Ecologic Femi+ (EF+; Winclove Probiotics, Amsterdam, Netherlands) are not included in the Silva database, but their sequences were provided to us by Winclove. The vaginal probiotic Gynophilus LP (GynLP; Biose, Arpajon-sur-Cère, France) contains Lactobacillus rhamnosus (Lcr strain 35), and the NCBI database contains a reference sequence for this probiotic strain. 18 We identified all of the above species in our ASV spreadsheet using the Needleman-Wunsch Global Align Nucleotide Sequences function on the NCBI website, 19 requiring 100% matches between reference sequences and uploaded DADA2-derived ASVs of interest. The Silva-based taxonomic assignment of 146 ASVs with a relative abundance of at least 0.05% of the read count of all samples combined (out of a total of 1,797 ASVs) were double-checked using the Microbial Nucleotide BLAST (BLASTn) function on the NCBI website 20,21 using the nonredundant V3-V4 version of the Vaginal 16S rDNA Reference Database as a tiebreaker, 22 three discordances were resolved, 24 Lactobacillus genus ASVs were reassigned to various Lactobacillus species, six Gardnerella genus ASVs were reassigned to G. vaginalis, and one Atopobium genus ASV was reassigned to A. vaginae. Next, read counts for ASVs assigned to the exact same taxonomy were summed for each sample, except for ASVs assigned to probiotic strains. Finally, we rarefied at a depth of 1,111 reads (the lowest total read count for a specific sample above 1,000 reads) using the GUniFrac 1.0 package in R. 23 The rarefied ASV table contained 629 samples and 401 unique ASVs (10/639 samples became invalid due to rarefaction), with 255 (63.6%) mapping to species level, 116 (28.9%) to genus level, and 30 (7.5%) to higher taxonomic levels. Rarefied read counts were transformed into relative abundances using the prop.table function in R. Of the 401 ASVs, 177 ASVs were present at a relative abundance of at least 1% in at least one sample; the other 224 ASVs were minority species.
Of the 393 samples that were tested by BactQuant assay (Figure 1), five did not amplify in two of three, or all three, of the triplicate reactions and were considered invalid. The 16S rRNA gene concentration results of the other samples were normally distributed with the exception of nine samples with a concentration of <1,000 copies/µl, which were considered outliers. A total of 14/393 samples (= 3.6%) were therefore excluded from all analyses. We estimated the ASV-specific concentrations per sample using the sample-specific 16S rRNA gene concentration data. First, we identified the 16S rDNA gene copy number per ASV in the NCBI version of the rrnDB database, 24 and in the case of missing data, in the RDP version of the rrnDB database (which only contains information at genus level and above) and the Greengenes database. 10 If an ASV was mapped to multiple species at genus level, the mean of the mean 16S gene copy number for each individual species was used. If the mean 16S gene copy number of a species was not present in the database, we used the mean copy number of the corresponding genus. BVAB1 and BVAB2 belong to the Clostridiales order and lower level taxonomic information is not available. We therefore used the Clostridiales order copy number (=4.62). Concentrations in cells/µl per ASV per sample were estimated by multiplying the ASV-specific copy-normalized rarefied relative abundance by the samplespecific 16S rRNA gene copies concentration. This method has been shown by others to correlate well with species-specific quantitative PCR results for non-minority species. 25,26 This yielded concentrations for 401 ASVs in 379 samples in cells/µl, which were log 10transformed. Concentrations between zero and one cell/µl were set to one prior to log 10transformation to prevent skewed negative values.

Trial endpoints and hypotheses
The primary aims of the trial were to determine the safety and preliminary efficacy of the interventions, each compared to the no intervention group. We hypothesized that all interventions would be safe. The primary preliminary efficacy endpoints were the incidence of BV by Nugent score and modified Amsel criteria (which were hypothesized to decrease), and symptomatic vulvovaginal candidiasis (VVC) by wet mount (which was hypothesized not to increase). The secondary preliminary efficacy endpoints included membership of specific VMB clusters/types and bacterial group concentrations over time as determined by Illumina HiSeq sequencing.

Safety endpoints
The main safety endpoints were self-reported solicited and unsolicited (serious) adverse events (AEs) and social harms, and clinician-observed speculum exam findings. Adverse events were coded using the Medical Dictionary for Regulatory Activities (medDRA v19.1, McLean, VA, USA). Laboratory test results were not considered AEs, but primary and secondary endpoints.

Primary preliminary efficacy endpoints
For the BV by Nugent 7-10, modified Amsel criteria and VVC endpoints, see the 'diagnostic testing' section above. Analyses were also conducted using a Nugent score of 4-10 and the full Amsel criteria as the definition of BV but the results are not shown because they are similar to the Nugent 7-10 and modified Amsel criteria results, and less informative than the molecular data.
Molecular endpoints: richness, diversity, VMB types, and bacterial groups Data reduction was required for molecular efficacy analyses, and was done in three different ways. First, we calculated richness and alpha diversity for each sample using the rarefied relative abundance data. Richness was defined as the total number of ASVs per sample. Alpha diversity was determined by Simpson diversity (1-D) using the phyloseq package v1.14.0 in R. 27 Second, we used existing knowledge to assign all 401 ASVs to four bacterial groups, as described in the manuscript and in Supplement 2. Third, we clustered the rarefied relative abundance data using the phyloseq package in R using Euclidean distance with complete linkage. 27 We used this information, and information from previously conducted studies, to classify samples into eight mutually exclusive VMB types (each sample was assigned to only one VMB type) as described in the manuscript. To improve the statistical power of the mixed effects models, and to facilitate visualizations of VMB transitions in alluvial diagrams, the eight VMB types were condensed further into four 'pooled VMB types' as described in the manuscript.

Statistical analyses
Statistical analyses were performed using STATA v13 (StataCorp, College Station, TX, USA). They are described in the manuscript, and additional technical details are available here. We used the phyloseq and gplots packages in R to make heatmaps of the 20 ASVs with the highest mean relative abundance ( Figure S1A) or concentration ( Figure S1B) across all study samples. 28 Bar graphs and line graphs were made using the catplot and scatter functions in STATA and alluvial diagrams were made using the ggalluvial package in R. 29 For cross-sectional analyses, we used Fisher's exact test to compare binary and categorical variables, Kruskal-Wallis test to compare continuous variables, and Mann Whitney U test for pairwise comparisons of continuous variables if the Kruskal-Wallis test was significant at p<0.05. For longitudinal analyses, we used incidence rates (IRs) and incidence rate ratios (IRRs) with 95% confidence intervals (CI; primary endpoints only), and mixed effects models. IRs were defined as the number of new infections during follow-up divided by the person-years at risk for that infection using the ststet function in STATA. Women who had BV recurrence within 10 days that persisted until M6 were considered persistent infections to prevent inflation of the IRs. IRRs were calculated by dividing the IR of each intervention group by the IR of the control group. Mixed effects models were done in STATA using the xtmelogit function for categorical endpoints and the mixed function for continuous endpoints. All models included one VMB endpoint at a time as the outcome, the participant identification number as the random effect, and randomization group as the main fixed effect. Three variables were added as additional fixed effects in adjusted mixed effects models because they were associated at p<0.05 in mixed effects models with at least one of four a priori selected VMB endpoints (Nugent score, or concentrations of lactobacilli, BVanaerobes, or pathobionts; Table S6), and data were available at all study visits. They were 'current use of hormonal contraception or being pregnant' (versus not using contraception and not being pregnant; six samples from copper intrauterine device users were excluded), a 'sexual risk' composite variable (with lower risk defined as having reported condom use at each vaginal sex act since the last study visit and fewer than the median of five sex partners in the past month), and 'age' (30 years or older versus younger than 30 years; the median age of the screened population was 30 years). One additional variable was associated with one of the a priori selected VMB outcomes at p<0.05: managing menses with sanitary pads versus other methods (Table S6). However, this variable was not added as an additional fixed effect in the adjusted mixed effects models due to only having been recorded at enrollment visits and not at follow-up visits.
All analyses were conducted on the intent-to-treat (ITT) population (n=68), and IR and IRR analyses were also conducted on a modified ITT population (n=51). Women who were BVnegative by modified Amsel criteria but BV-positive by Nugent score at the time of randomization (n=17) were excluded from the modified ITT population. Another 17 women had ongoing chlamydia and/or gonorrhea infection at the time of randomization, but the molecular analyses did not identify substantial differences in the VMB compositions of women with and without ongoing infection and we therefore did not exclude them ( Figure  S2). In accordance with local treatment guidelines, BV and VVC were treated when laboratory-confirmed and symptoms and/or clinician-observed signs were present. TV and other sexually transmitted infections were always treated when identified by laboratory testing. Some women received antimicrobial drugs for other ailments from external clinics. Antimicrobial use during the study is shown in Table S2. We took antimicrobial use into account to determine whether cases were incident or persistent, but we did not remove users from the modified ITT population because they were evenly distributed between the randomization groups and these short course treatments were expected to have much less of an effect on the VMB than the longer-term interventions.
IR and IRR analyses were conducted for the product use period (between enrollment and M2, including samples from D7, M1, and M2) as well as the period between M2 and M6 (M6 samples). Mixed effects models were done for the product use period only (D7, M1, M2, and self-collected samples) to determine if any of the observed effects were statistically significant; we did not run mixed effects models for the period after product cessation because the IR/IRR analyses and graphs had already shown that the observed effects during the interventions had disappeared after cessation of the interventions.      . ‡Prescribed for amoebiasis/ dysentery (externally). § Includes amoxicillin prescribed for abortion prophylaxis, tonsillitis, dental caries, tooth extraction, cough, and trauma (all externally), chloramphenicol prescribed for an upper respiratory tract infection (externally), ciprofloxacin prescribed for urinary tract infection (at study clinic) and typhoid fever (externally), cloxacillin prescribed for a traumatic wound (externally), and doxycycline prescribed for chlamydia (at study clinic) and an unspecified sexually transmitted infection (externally). ¶Prescribed for BV, TV and amoebiasis (all at the study clinic). ||Includes amoxicillin prescribed for tonsillitis and contusion (both externally), ciprofloxacin prescribed for gonorrhea or urinary tract infection (of which 8/12 at the study clinic), and doxycycline for chlamydia (all but one at the study clinic).        Abbreviations: BV, bacterial vaginosis; CI, confidence interval; est conc, estimated concentration; OR, odds ratio. *Covariates were tested in mixed effects models with participant identification number as a random effect, the potential predictor as fixed effects, and the four vaginal microbiota outcomes as separate outcomes. All possible time points were included in the models. The potential predictor variables were not correlated with each other. †Women using a copper intra-uterine device were not included in this model. ‡Versus remainder of the cycle. §Only antibiotics given for reasons not associated with the primary/secondary outcomes of the study were included. ¶Or in the past two months (when asked at the M2 visit), or in the past four months (when asked at the M6 visit). ||Composite variable of condom use consistency and number of sexual partners: women were considered low risk when they reported fewer than five sexual partners in the past month plus consistent condom use. The variable 'exchanging sex for money and/or goods' (next row) was not added to this composite variable because the great majority of women reported this behavior (Table 1). **Versus 29 years old or younger. † †Asked at the enrollment visit only.  7/65 0.27 (0.14 -0.54) Abbreviations: CI, confidence interval; CT, Chlamydia trachomatis; Enr, enrollment visit; IR, incidence rate; M2/6, month 2/6 visit; NG, Neisseria gonorrhoeae; STI, sexually transmitted infection; TV, Trichomonas vaginalis; UTI, urinary tract infection. Incidence rates of STIs and UTIs during study follow-up. Incidence rate ratios were not calculated due to low number of incident cases. *Number of women (n) who developed at least one incident infection during the specified time period as a proportion of the women who completed all follow-up visits in that time period (N). †One participant in the GynLP group was tested for UTI at an unscheduled visit between M2 and M6 and subsequently withdrew her informed consent.