Introduction

The number of studies characterizing the vaginal microbiome (VMB) using molecular methods has increased rapidly in the past 10 years.1 These studies have shown that a healthy VMB is dominated by one or more Lactobacillus species, and is associated with a low vaginal pH. Several dysbiotic VMB compositions have been described in which Gardnerella vaginalis, Atopobium vaginae, Prevotella species, and/or several other (facultative) anaerobic bacteria are more abundant than lactobacilli. Although dysbiosis is asymptomatic in 40% of affected women,2 some women develop symptomatic bacterial vaginosis (BV), a syndrome characterized by thin vaginal discharge and amine odor. BV has traditionally been diagnosed using microscopy of vaginal fluid and the presence of symptoms,3, 4 and both symptomatic and asymptomatic BV have been associated with adverse reproductive health outcomes, including acquisition of HIV and other sexually transmitted infections (STIs), pelvic inflammatory disease, and preterm birth.5, 6

The vaginal mucosal barrier is dynamic and unique in the human body. It is under (cyclic) hormonal influence, has a lower pH than other mucosal barriers, and is frequently exposed to inflammatory stimuli. The mucosal immune system develops tolerance to several non-self antigens (semen, fetal antigens, and commensal bacteria), whereas at the same time protecting the woman from STIs and other pathogens. The mucosal barrier consists of a mechanical barrier (mucus and epithelium), factors associated with the innate immune system (such as antimicrobial peptides (AMPs) and enzymes), and the adaptive immune response. A lactobacilli-dominated VMB strengthens the mucosal barrier: Lactobacillus species thrive in the glycogen-rich acidic environment and help protect against pathogens by producing lactic acid and other antimicrobial compounds.7, 8

It is not yet clear how dysbiosis increases the risk of HIV/STI acquisition and other adverse reproductive health outcomes but cervicovaginal inflammation and other changes to the mucosal barrier are thought to have important roles. BV was originally considered a non-inflammatory condition because it is not associated with an influx of neutrophils. However, recent clinical studies have shown increased cervicovaginal fluid concentrations of the pro-inflammatory cytokines interleukin (IL)-1β and IL-8, and decreased concentrations of the anti-inflammatory cytokine IL-10 and the antiproteases secretory leukocyte protease inhibitor and elafin (PI3).9, 10, 11, 12, 13, 14 Furthermore, in vitro studies have shown that lactobacilli do not generally induce human AMPs, whereas BV-associated bacteria (such as Gardnerella vaginalis, A. vaginae and P. bivia) do.10, 12

The cervicovaginal proteome has been characterized by others before,15, 16, 17, 18, 19, 20 but never in the context of different VMB compositions. We hypothesized that VMB dysbiosis is associated with cervicovaginal inflammation, and other changes to the mucosal barrier that might cause increased HIV acquisition and other adverse reproductive health outcomes. To test this hypothesis, we compared the human cervicovaginal proteome (by mass spectrometry) of 50 Rwandan female sex workers who had previously been clustered into four VMB groups (using a 16S phylogenetic microarray) in order of increasing bacterial diversity: Lactobacillus crispatus-dominated VMB (group 1), Lactobacillus iners-dominated VMB (group 2), moderate dysbiosis (group 3), and severe dysbiosis (group 4).21 We compared relative protein abundances among these VMB groups using targeted (abundance of pre-defined mucosal barrier proteins) and untargeted (differentially abundant proteins among all human proteins identified) approaches.

Results

Sociodemographic, behavioral, and clinical characteristics

The median age of the women was 28 with no significant differences among VMB groups (Table 1). Women in group 1 were more likely to ever have been married than women in the other groups (Ptrend=0.03) but other sociodemographic and sexual risk-taking characteristics, use of hormonal contraception, and stage of the menstrual cycle did not differ significantly among VMB groups. None of the women used antibiotics in the past 14 days. The molecular VMB groups correlated well with BV diagnoses by Nugent score categories and with vaginal pH, but not with the other Amsel criteria. All women in groups 1 and 2 were BV-negative by Nugent score (Nugent score of 0–3), all women in group 4 were BV-positive (Nugent score 7–10), and the women in group 3 had a mixture of BV diagnoses. Women in group 1 had the lowest prevalence of viral STIs (HIV, herpes simplex virus type 2, and human papillomavirus; Ptrend<0.01), as has been reported elsewhere in more detail.21 Among HIV-positive women, the median CD4 count was 557 cells mm−3 (interquartile range 462–914 cells mm−3), and none had a CD4 count of <200 cells mm−3. Self-reported genitourinary symptoms, abnormal pelvic exam findings, abnormal wet mount findings, laboratory-confirmed bacterial STIs, and abnormal cervical cytology were rare in this sample and equally distributed among the VMB groups. Blood and leukocytes were commonly found in the cervicovaginal lavages (CVLs) but also equally distributed among the VMB groups.

Table 1 Sociodemographic, behavioral, and clinical characteristics of women from the selected microbiome groups

Targeted hypothesis-driven analysis: the mucosal barrier

The median total protein concentration in CVLs was 194 μg ml−1 (interquartile range 125–344 μg ml−1) and was not associated with VMB composition (data not shown). Mass spectrometry identified 549 human proteins. In the targeted hypothesis-driven analysis, we compared the relative abundances of pre-defined proteins of the mucus layer, epithelial layer, and immune system (AMPs, cytokines, and immunoglobulins) among the VMB groups, using pairwise comparisons and tests for trend without correction for multiple comparisons.

Mucus layer

Four mucins were detected: mucin 5AC (MUC5AC), mucin 5B (MUC5B), mucin 6 (MUC6), and mucin 16 (MUC16). The relative abundance of MUC5B increased significantly from groups 1 to 4 (Figure 1; Ptrend<0.01) and the relative abundance of MUC5AC was higher in group 4 than in groups 1 and 3 (P=0.03 and P=0.05, respectively). MUC6 and MUC16 were not associated with VMB groups.

Figure 1

Relative abundance of mucins (MUC) among the four VMB groups. Of the four mucins identified, MUC5AC and MUC5B were increased in the dysbiotic groups. Box plots represent median (black line), first and third quartiles (box) and range within 1.5 times the interquartile range from the box (whiskers). Outliers are plotted as points. 2Ptrend<0.01; *P-value<0.05.
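The box plot convention used in the figure captions (median, quartiles, whiskers reaching the most extreme values within 1.5 times the interquartile range of the box, and outliers beyond that) can be sketched as follows. This is a minimal illustration only: the function name `box_plot_summary` is hypothetical, and the quartile rule (`method="inclusive"`) is an assumption, since the plotting software used for the figures may compute quartiles slightly differently.

```python
import statistics


def box_plot_summary(values):
    """Summarize one group of relative abundances in box plot terms.

    Assumption: quartiles use the 'inclusive' interpolation rule; other
    software may use a different quartile definition.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up intensities, not study data.
    print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```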

Epithelial layer

Of the 13 identified keratins (KRT), KRT1, 4, 5, 6A, 10, and 13 were the most abundant, and KRT4, 5, 6A, 13, 19, and 76 decreased significantly from groups 1 to 4 (Figure 2; Ptrend<0.01). Cell death was assessed using the biomarkers lactate dehydrogenase (LDH) subunits A and B. The relative abundance of LDHA was higher than LDHB in all groups, and both subunits showed a significant increasing trend from groups 1 to 4 (Figure 3; Ptrend<0.01).

Figure 2

Relative abundances of keratins (KRT) among the four VMB groups. Six keratins decreased significantly from groups 1 to 4. Box plots represent median (black line), first and third quartiles (box) and range within 1.5 times the interquartile range from the box (whiskers). Outliers are plotted as points. 2Ptrend<0.01; 3Ptrend<0.001. *P-value<0.05; **P-value<0.01; ***P-value<0.001.

Figure 3

Relative abundances of lactate dehydrogenase (LDH) subunits A and B as biomarkers for cell death among the four VMB groups. Both subunits increased significantly from groups 1 to 4. Box plots represent median (black line), first and third quartiles (box) and range within 1.5 times the interquartile range from the box (whiskers). Outliers are plotted as points. 2Ptrend<0.01; 3Ptrend<0.001. *P-value<0.05; ** P-value<0.01.

AMPs and cytokines

Of the >30 AMPs in the cervicovaginal environment that have been described by others,10, 12 19 were identified in our CVLs (Figure 4). These were bactericidal permeability-increasing protein, cathelicidin antimicrobial peptide (or LL-37), cystatin A (CSTA), neutrophil defensin 1 (DEFA1 or HNP1), lactoferrin, lysozyme C (LYZ), PI3, ubiquitin (RPS27A), psoriasin (S100A7), calprotectin (S100A8 and S100A9), secretory leukocyte protease inhibitor, and six histones (HIST2H3A/C/D, HIST4H4, H2AFZ, HIST2H2AA3/4, HIST3H2A and HIST2H2BF). CSTA, LYZ, and RPS27A decreased significantly from groups 1 to 4 (all Ptrend<0.001), and HIST3H2A, HIST2H3A/C/D, HIST4H4, S100A7, and S100A9 increased from groups 1 to 4 (Ptrend<0.001, Ptrend<0.01, Ptrend=0.02, Ptrend<0.001, and Ptrend=0.01, respectively). Furthermore, eight cytokines were identified: IL-36α (IL36A), IL-36γ, complement C5 (C5), glucose-6-phosphate isomerase, macrophage migration inhibitory factor, metalloproteinase inhibitor 1, IL-36 receptor antagonist, and IL-1 receptor antagonist. The pro-inflammatory cytokines IL-36α, C5, glucose-6-phosphate isomerase, and migration inhibitory factor, and the anti-inflammatory cytokine IL-36 receptor antagonist, showed a significantly increasing trend from groups 1 to 4 (all P<0.001; Figure 5). IL-36γ, IL-1 receptor antagonist, and metalloproteinase inhibitor 1 were not associated with VMB groups.

Figure 4

Relative abundances of antimicrobial peptides (a) and histones (b) among VMB groups. CSTA, LYZ and RPS27A decreased significantly from groups 1 to 4, and HIST3H2A, HIST2H3A/C/D, HIST4H4, S100A7, and S100A9 increased from groups 1 to 4. Box plots represent median (black line), first and third quartiles (box) and range within 1.5 times the interquartile range from the box (whiskers). Outliers are plotted as points. 1Ptrend<0.05; 2Ptrend<0.01; 3Ptrend<0.001. *P-value<0.05; ** P-value<0.01; ***P-value<0.001.

Figure 5

Relative abundances of cytokines among the four VMB groups. Four of six pro-inflammatory cytokines (a) and one of two anti-inflammatory cytokines (b) that were identified using mass spectrometry showed an increasing trend from groups 1 to 4. Box plots represent median (black line), first and third quartiles (box) and range within 1.5 times the interquartile range from the box (whiskers). Outliers are plotted as points. 3Ptrend<0.001. *P-value<0.05; ** P-value<0.01; ***P-value<0.001.

Humoral immune response

Immunoglobulin (Ig) heavy chain constant regions IgA, IgG, IgM, and IgD were identified, as well as the J-chain that is required for IgA and IgM to be secreted into the mucosa. IGHG1 showed the highest relative abundance, followed by IGHA1 and IGHG2 (Figure 6). The relative abundances of IGHG1 and IGHG2 decreased from groups 1 to 4 (Ptrend=0.03 and Ptrend<0.01, respectively). Other heavy chain constant regions were not associated with VMB groups.

Figure 6

Relative abundances of immunoglobulins (Ig) among the four VMB groups. IGHG1 showed the highest relative abundance, followed by IGHA1 and IGHG2. IGHG1 and IGHG2 decreased significantly from groups 1 to 4. Box plots represent median (black line), first and third quartiles (box) and range within 1.5 times the interquartile range from the box (whiskers). Outliers are plotted as points. 1Ptrend<0.05; 2Ptrend<0.01. *P-value<0.05; ** P-value<0.01.

Untargeted analysis: differentially abundant human proteins

In addition to the hypothesis-driven analysis, we conducted an untargeted analysis to identify all human proteins differentially abundant among the VMB groups. Eighty-two of the 549 proteins were differentially abundant as determined by analysis of variance with Bonferroni correction (resulting in a P-value cutoff of 9.1 × 10−5). Most differentially abundant proteins either increased or decreased from groups 1 to 4 (Figure 7). Proteins increasing from groups 1 to 4 included proteasome core complex proteins and proteases (calpain-2 catalytic subunit, protein DJ-1, endoplasmic reticulum aminopeptidase 1, calpain small subunit 1, transmembrane protease serine 11A, and bleomycin hydrolase) and other proteins involved in catabolism (including ubiquitously expressed proteins such as Rab guanosine diphosphate dissociation inhibitor beta and ubiquitin-conjugating enzyme E2 L3). Proteins decreasing from groups 1 to 4 included the protease inhibitors CSTA, inter-alpha-trypsin inhibitor, serine protease inhibitor Kazal-type 5, and serpin B3. Several cytoskeletal proteins were differentially abundant, with actin-organizing proteins increasing from groups 1 to 4 (including a few actin capping proteins), and epithelial proteins including cornified envelope proteins (small proline-rich proteins, repetin, and CSTA) and KRT (KRT4, KRT5, KRT6A) decreasing from groups 1 to 4. Similarly, some proteins involved in the inflammatory and immune response increased from groups 1 to 4 (such as complement factor 3 (C3) and migration inhibitory factor) and some decreased (the antiproteases inter-alpha-trypsin inhibitor and serine protease inhibitor Kazal-type 5, ubiquitin (RPS27A) and thioredoxin).

Figure 7

Hierarchical clustering, heatmap, and classification of differentially abundant proteins among VMB groups. (a) Using hierarchical clustering on the range-scaled relative abundances of the differentially abundant proteins, two groups of proteins could be distinguished: proteins increasing from VMB groups 1 to 4 (in red), and proteins decreasing from VMB groups 1 to 4 (in green). (b) Each row of the heatmap represents a protein and the color indicates the range-scaled (0–1) ion intensity per sample as indicated in the color bar below the heatmap. Each column represents a sample, with samples being ordered on VMB composition as indicated on the top. (c) Shows molecular functions and biological processes represented in the group of differentially abundant proteins. A black box indicates that the protein is assigned to that particular classification.

Sensitivity analysis: hormonal contraception

We hypothesized a priori that hormonal contraception could be a confounder or effect modifier of the relationship between the VMB and the human cervicovaginal proteome. We therefore repeated all analyses stratified by hormonal contraception (women not using hormonal contraception (n=28) and women using hormonal pills, injectables, or implants (n=21); Supplementary File online). The results were similar to the original analysis. However, women using hormonal contraception who were in VMB group 3 (moderate dysbiosis) had a more similar protein profile to women in VMB groups 1 and 2 (lactobacilli-dominated) than to women in VMB group 4 (severe dysbiosis; Figure 8).

Figure 8

Association between the range-scaled ion intensities of the differentially abundant proteins and VMB groups, stratified by hormonal contraception (HC) use. Range-scaled ion intensities ranged from 0 to 1, representing the lowest and highest observed ion intensity per protein, respectively. (a) Range-scaled ion intensities of proteins that decreased from groups 1 to 4 (n=27) as shown in green color in Figure 7. (b) Range-scaled ion intensities of proteins that increased from groups 1 to 4 (n=55) as shown in red color in Figure 7. Box plots represent median (black line), first and third quartiles (box) and range within 1.5 times the interquartile range from the box (whiskers). Outliers are plotted as points.
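The range scaling described in this caption maps each protein's ion intensities onto the interval 0 to 1, with 0 for the lowest and 1 for the highest observed intensity of that protein across samples. A minimal sketch is given below; the function name `range_scale` and the handling of a protein with constant intensity are assumptions made for illustration and are not taken from the study's analysis scripts.

```python
def range_scale(intensities):
    """Scale one protein's ion intensities across samples to the range 0-1.

    0 corresponds to the lowest and 1 to the highest observed intensity.
    Assumption: a protein with identical intensity in every sample is
    mapped to 0.0 to avoid division by zero.
    """
    lowest = min(intensities)
    highest = max(intensities)
    if highest == lowest:
        return [0.0 for _ in intensities]
    return [(x - lowest) / (highest - lowest) for x in intensities]


if __name__ == "__main__":
    # Made-up intensities for one protein in three samples.
    print(range_scale([2.0, 4.0, 6.0]))
```

Scaling per protein rather than per sample keeps low-abundance and high-abundance proteins comparable on the same color bar or axis.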

Sensitivity analysis: HIV status

Similarly, we hypothesized that HIV status could be a confounder or effect modifier. We therefore repeated all analyses stratified by HIV status (HIV-negative women (n=23) and HIV-positive women (n=27); Supplementary File). Again, the results were similar to the original analysis. Furthermore, when directly comparing the abundances of all 549 proteins between HIV-negative and HIV-positive women using the Student's t-test with Bonferroni correction, none of them were statistically significantly different (results not shown).

Discussion

Using a systems biology approach, this study shows that a strong relationship exists between the VMB and cervicovaginal human proteome in a cohort of Rwandan women at high risk of HIV and other STIs. With increasing bacterial diversity, we found: mucus alterations, cytoskeleton alterations, increasing cell death, increasing proteolytic activity, altered AMP balance, increasing pro-inflammatory cytokines, and decreasing IgG1/2.

Cervical mucus consists predominantly of glycoproteins called mucins, the most important ones being the gel-forming mucins MUC5B, MUC5AC, and MUC6.22 These likely exert a protective effect against pathogens by physically capturing microbes and by facilitating binding of antibodies.23 Our study showed an increase of MUC5B, and to a lesser extent MUC5AC, but not of MUC6 and MUC16, with increasing bacterial diversity. This is consistent with another cervicovaginal proteome study that found a 3.9-fold increase of MUC5B in women with BV by Amsel criteria, and a decrease of MUC5B and MUC5AC after the BV had been treated.16 The increase of MUC5B/AC proteins in CVLs could be caused by increased degradation of the polymer gels formed by these proteins or increased mucin production by epithelial cells. The former mechanism is well-known: BV-associated bacteria produce proteases that degrade cervical mucus and produce a watery vaginal discharge.12, 24 Evidence for the latter comes from an in vitro study that showed increased mucin secretion by endocervical epithelial cells when microbial products were added to the cell culture,25 but was not confirmed by another in vitro study showing no effect when BV-associated bacteria were added to vaginal epithelium.26 Another possibility is that increased levels of mucins act in favor of dysbiosis. MUC1, the most ubiquitously expressed cell-associated mucin across mucosal tissues, has been shown to cause mechanical blockage and anti-inflammatory modification of the immune response in in vitro studies.27, 28 Whether an increase in MUC5B and/or MUC5AC increases tolerance of the cervicovaginal mucosal immune system for dysbiosis remains to be studied.

Our data show evidence of cytoskeleton alterations (increasing actin-organizing proteins; decreasing KRT and cornified envelope proteins) in women with moderate or severe dysbiosis, and an increase of cell death using LDHA/B as biomarkers of cell death. This most likely reflects epithelial damage, desquamation, and/or remodeling due to bacterial cytotoxic enzymes and/or bacteria-induced inflammation.29, 30 This is in concordance with a vaginal metabolomics study by Yeoman et al.,31 in which women with BV had altered concentrations of cell wall-associated proteins, suggesting loss of epithelial integrity, and a recent metatranscriptomics study that found an increase of vaginolysin expression by G. vaginalis and of cholesterol-dependent cytolysin by L. iners in women with BV by Nugent scoring.32 However, in our study, these cytoskeleton alterations could also be due to viral replication and virus-induced inflammation (the viral STIs were more common in VMB groups 3 and 4 than in groups 1 and 2; Table 1). Furthermore, we considered the possibility that skin contamination from specimen handlers might have influenced our results. However, this seems unlikely given the linear relationship between bacterial diversity and epithelial proteins, and the fact that KRT4 is only expressed by mucosal stratified squamous epithelium.33

Our data also show an increase in proteolytic activity (increasing proteasome core complex proteins and proteases; decreasing antiproteases) with increasing bacterial diversity. In fact, proteasome core complex proteins (including subunit beta 9, which is part of the immunoproteasome) and proteases made up a substantial proportion of the differentially abundant proteins that increased with increasing bacterial diversity. An increase of extracellular proteasomes has been noted in other inflammatory and infectious conditions before, and it is unclear whether they are released by cells that are damaged or dead as a result of these conditions or are actively excreted as a defense mechanism to help clear the conditions.34 There is some evidence for both hypotheses. For example, inflammation attracts antigen-presenting cells that use (immuno)proteasomes for the breakdown of proteins for presentation in the major histocompatibility complex class I,35 whereas in vitro studies have shown that T cells excrete proteasomes using exosomes and that this excretion is enhanced when T cells are activated.36 In contrast, antiproteases were least abundant in the dysbiotic VMB groups. A delicate balance between proteases and antiproteases most likely allows for combating infections, whereas at the same time minimizing tissue damage and inflammation. Studies have indeed shown negative associations between the antiprotease CSTA and known BV complications in pregnant women,17, 18, 19 and between the antiproteases CSTA and serine protease inhibitor Kazal-type 5 and HIV acquisition in African sex workers.15 The authors hypothesized that the latter is likely due to reduced cervicovaginal inflammation, and therefore a reduced influx of CD4+ CCR5+ target cells for HIV to mucosal surfaces that are exposed to HIV. This is in agreement with our own finding that women in the lactobacilli-dominated VMB groups have a higher abundance of antiproteases in their CVLs and a lower prevalence of HIV infection.21

Although an association between dysbiosis and AMPs has been shown repeatedly, the mechanisms are complex and the findings between studies are not consistent.10, 12 Most AMPs are secreted constitutively to exert immune modulatory functions and protection against pathogen invasion, but are significantly up or downregulated in case of infection or inflammation.10, 12 In our study, the AMPs CSTA (which is also an antiprotease), LYZ, and RPS27A decreased from groups 1 to 4, whereas S100A7, several histones, and to a lesser extent S100A8 and S100A9, increased from groups 1 to 4. Other clinical studies have found a decrease of S100A9 and LYZ (the latter non-significant) in women with BV by Amsel criteria, and a non-significant increase of LYZ and significant decrease of S100A8 and S100A9 when the BV was treated.16, 37 The roles of S100A7, histones, and RPS27A have not been studied before in the context of vaginal dysbiosis. However, in vitro studies have shown that S100A7 has strong anti-Escherichia coli activity38 and is expressed by epithelial cells when they are exposed to microbial products.12 Furthermore, histones and RPS27A are known to have pivotal functions in the innate immune system.39, 40 We did not find associations between dysbiosis and some AMPs (LL-37, secretory leukocyte protease inhibitor, PI3, and lactoferrin) that have been associated with BV by others.10, 12 This could be due to the fact that most AMPs have additional functions over and above their antibacterial properties, and processes other than the innate immune response might also alter their expression levels. Finally, we did not detect β-defensins, which have been associated with BV in in vitro studies, but are present in the vagina in low concentrations.20

Our cytokine findings are consistent with previous studies that reported an increase of cervicovaginal pro-inflammatory cytokines in women with BV.11, 13, 14 However, some cytokines that have frequently been studied in this context (such as IL-1β, IL-6, and IL-8) were not identified in our study, probably due to relatively low concentrations that fall outside the dynamic range of the mass spectrometer.41 The cytokines that we did detect (IL-36α, IL-36γ, IL-36 receptor antagonist, migration inhibitory factor, glucose-6-phosphate isomerase, C5, metalloproteinase inhibitor 1) have not been studied in the context of vaginal dysbiosis before. They increased with increasing bacterial diversity, except for IL-36γ and metalloproteinase inhibitor 1. In contrast to the cytokines, immunoglobulins (particularly IgA and IgG) were present in high abundance in our CVLs. IGHG1 and IGHG2 levels decreased with increasing bacterial diversity, which is in agreement with a previous study that found decreased IgG levels in CVLs of HIV-positive women with BV by Amsel criteria.42 However, two other clinical studies found no difference in IgG levels between women with and without BV by Nugent scoring.43, 44 The decline in IgG levels is most likely due to increased degradation by human and/or bacterial proteases and/or to dilution owing to the increased volume and thinning of mucus.44, 45

Several limitations of our study should be mentioned. The study only included Rwandan female sex workers at high risk of HIV and other STIs and the results may not be generalizable to other women. The data were cross-sectional, and temporal relationships can therefore not be established. Owing to the limited number of samples available in some of the VMB groups, we did include women regardless of their HIV status and their use of hormonal contraception, even though these two factors may be confounders or effect modifiers of the VMB-human proteome relationship. We conducted sensitivity analyses to assess this. The VMB-human proteome relationships were not significantly different in HIV-positive and HIV-negative women, but women with moderate dysbiosis who were using hormonal contraception had a protein profile that was more similar to women with a lactobacilli-dominated VMB than to women with severe dysbiosis (see Supplementary File). These findings should, however, be interpreted with caution because the sample sizes in the sensitivity analyses were small. The advantages and disadvantages of using a microarray for microbiome characterization were discussed previously.21 As is the case in all proteomics research, some of our results may be spurious owing to the many statistical comparisons made, or incomplete because of our dependence on evolving publicly available protein annotation and expression databases. Furthermore, we normalized the amount of total protein in each sample prior to mass spectrometry, which means that we can only report relative protein abundances instead of absolute concentrations. However, we did measure CVL volume and total protein concentration for each sample, and neither one of these was associated with VMB composition.

In conclusion, we have shown a strong relationship between the VMB and cervicovaginal human proteome in a cohort of Rwandan women at high risk of HIV and other STIs. Our findings support the hypothesis that dysbiosis causes cervicovaginal inflammation and other detrimental changes to the mucosal barrier that are thought to lead to BV complications. We believe that systems biology approaches should be incorporated into larger epidemiological studies to address knowledge gaps in etiology and pathogenesis of VMB dysbiosis, associations of different dysbiotic states with clinical outcomes, and to evaluate interventions aimed at restoring and maintaining lactobacilli-dominated VMB.

Methods

Study design. The Kigali HIV Incidence Study estimated the incidence and prevalence of HIV and other STIs in a cohort study of Rwandan female sex workers between 2006 and 2009.46 The study was approved by the National Ethics Committee, Rwanda, and the Columbia University Medical Center Review Board, USA. All participants provided written informed consent. The study design and procedures were described elsewhere.46 In brief, 800 women were tested for HIV at baseline, and 24% tested HIV-positive. Two groups of women returned for follow-up: a subset of the HIV-negative women (n=397) and HIV-positive women (n=141). They regularly underwent interviewing about sociodemographic characteristics, HIV/STI risk behavior, and clinical symptoms, pelvic examinations, and testing for HIV, other STIs, BV, candidiasis, pregnancy, and cervical cytology.46 The presence of BV was assessed by Gram stain Nugent scoring (score of 7–10 representing BV)4 and by the presence of three out of four Amsel criteria (elevated pH, clue cells on wet mount, and positive whiff test).3 The cervical swabs and CVLs that were used for the microbiome and proteome analyses described in this paper were collected at the final study visit. Women received treatment for curable STIs and symptomatic BV and candidiasis at the study clinic, and were referred to other local clinics for care related to HIV, pregnancy, and abnormal cervical cytology. Women also received HIV counseling and condoms free of charge.
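The Nugent score categories used for BV diagnosis in this paper can be expressed as a small lookup. The sketch below takes the 0–3 (BV-negative) and 7–10 (BV) bands from the text; labeling scores of 4–6 as "intermediate" follows the usual Nugent convention and is an assumption here, as is the function name `nugent_category`.

```python
def nugent_category(score):
    """Map a Gram stain Nugent score (0-10) to a BV category.

    0-3 and 7-10 follow the bands given in the text; the label for 4-6
    ('intermediate') is the conventional one and is assumed here.
    """
    if not 0 <= score <= 10:
        raise ValueError("Nugent score must be between 0 and 10")
    if score <= 3:
        return "BV-negative"
    if score <= 6:
        return "intermediate"
    return "BV-positive"


if __name__ == "__main__":
    for s in (0, 3, 4, 6, 7, 10):
        print(s, nugent_category(s))
```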

Microbiome analysis. Cervical spatulas and cytobrushes were rinsed in Preservcyt medium (ThinPrep Pap Test; Cytyc Corporation, Boxborough, MA). The Preservcyt specimens were stored at −80 °C until batched testing at the end of the study for phylogenetic microarray analysis (TNO, Zeist, the Netherlands) as described previously.21, 47 DNA was extracted using the AGOWA magMini DNA isolation kit (AGOWA, Berlin, Germany) and bead beating in a BeadBeater (BioSpec Products, Bartlesville, OK). The microarray, designed for VMB analysis, contained 461 DNA hybridization probes targeting microorganisms and 164 control probes. Clustering analysis was performed using the signal/background (S/B) ratio of the 251 probes, which generated consistent results. Using an unsupervised clustering algorithm, six microbiome clusters were identified,21 but for the purposes of this analysis we used four VMB groups as described below.

CVL collection, processing, and selection for proteomics analysis. The left and right fornix and cervical os were irrigated twice with 5 ml normal saline, which was aspirated after 30 s; a median volume of 5.5 ml (range 3.8–7.5 ml) was recovered. The CVL fluid was immediately placed on ice or at 4 °C. The color and volume were recorded, and the presence of blood and leukocytes was assessed using a urine dipstick (Urine-10, Cypress Diagnostics, Langdorp, Belgium). The CVLs were centrifuged at 1,000 rpm for 10 min within 4 h of collection. Supernatants were filtered using a sterile 0.2 μm cellulose acetate membrane (VWS International, Lutterworth, UK). In total, 15% of supernatants were not filtered owing to logistic problems. Cell pellets and aliquots of supernatant were stored at −80 °C until testing.

We selected CVL supernatants for proteomic analysis from four pre-defined VMB groups: L. crispatus-dominated VMB (group 1), L. iners-dominated VMB (group 2), moderate dysbiosis (group 3), and severe dysbiosis (group 4). Women from groups 3 and 4 had mixed microbiota with high abundance of G. vaginalis, Prevotella spp., and Atopobium vaginae. However, richness and diversity were lower in group 3 than group 4 (median richness of 13 and 17, respectively, and median Shannon diversity of 1.9 and 2.2, respectively), and women in group 3 had lower prevalence of STIs.21 Only one sample per woman was included and all samples had to meet the following additional conditions: the woman was younger than 45 years, not pregnant, and the supernatant was filtered and not macroscopically bloody. Furthermore, in group 2 we only selected samples with high S/B ratios for L. iners but low S/B ratios for G. vaginalis. This resulted in 7 samples in group 1, 11 samples in group 2, and 14 samples in group 3. Eighteen samples were selected at random from group 4 to end up at a total of 50 CVL samples from 50 women.
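Richness and Shannon diversity, which separate group 3 from group 4 above, can be computed from a vector of per-taxon abundances as sketched below. This is an illustration under stated assumptions: the function names are hypothetical, the natural logarithm is assumed, and the exact way abundances were derived from microarray signals is described in the cited microbiome paper,21 not here.

```python
import math


def richness(abundances):
    """Number of taxa with a non-zero abundance."""
    return sum(1 for a in abundances if a > 0)


def shannon_diversity(abundances):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over taxa that are present.

    Assumption: natural logarithm; abundances are non-negative numbers.
    """
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one taxon must have a positive abundance")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


if __name__ == "__main__":
    # Made-up profiles: one dominated by a single taxon, one evenly mixed.
    dominated = [97, 1, 1, 1]
    mixed = [25, 25, 25, 25]
    print(richness(dominated), round(shannon_diversity(dominated), 3))
    print(richness(mixed), round(shannon_diversity(mixed), 3))
```

A profile dominated by a single taxon gives a value near 0, whereas an evenly mixed profile of the same richness gives the maximum, ln(richness).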

Protein extraction and mass spectrometry. CVL supernatants were heat inactivated for 30 min at 56 °C before further processing. Total protein concentrations were determined using the Pierce Coomassie Plus (Bradford) Protein Assay (Thermo Scientific, Rockford, IL). Sample protein content and volume were normalized with 25 mM ammonium bicarbonate. Soluble proteins were precipitated using an equal volume of ice-cold 30% (w/v) trichloroacetic acid in acetone and incubated at −20 °C for 2 h. Samples were centrifuged at 12,000 g for 10 min (4 °C) to pellet proteins. Pellets were washed three times with ice-cold acetone and allowed to air dry. Further sample processing was performed as previously described with minor modifications.48 In brief, protein pellets were resuspended in 25 mM ammonium bicarbonate, 0.05% (w/v) rapigest (Waters, Elstree, UK), reduced, and alkylated. Digestion was performed with proteomic-grade trypsin (Sigma-Aldrich, St Louis, MO) at a protein:trypsin ratio of 50:1. Rapigest was precipitated by addition of trifluoroacetic acid to a final concentration of 0.5% (v/v). Peptide mixtures were analyzed by on-line nanoflow liquid chromatography using the nanoACQUITY-nLC system (Waters) coupled to an LTQ-Orbitrap Velos (ThermoFisher Scientific, Bremen, Germany) mass spectrometer equipped with the manufacturer’s nanospray ion source. The gradient of the analytic column (nanoACQUITY UPLC™ BEH130 C18 15 cm × 75 μm, 1.7 μm capillary column) consisted of 3–40% acetonitrile in 0.1% formic acid for 90 min, then a ramp of 40–85% acetonitrile in 0.1% formic acid for 5 min.

Runs were time aligned using the default settings of Progenesis LC–MS (version 4.1, Nonlinear Dynamics, Newcastle, UK), with an auto-selected run as reference. Peaks were picked by the software and filtered to include only peaks with a charge state between +2 and +7. Peptide intensities were normalized against the reference run. Throughout this paper, these normalized ion intensities are referred to as “relative abundance”. Peptide identification was performed using the Mascot search engine (version 2.3.02, Matrix Science, London, UK). Tandem MS data were searched against a human database (Uniprot 2013_06 release containing 20,255 entries) and RefSeq databases of 28 vaginal bacterial species, Trichomonas vaginalis, Candida albicans, and Candida glabrata to rule out bacterial origin of proteins. Search parameters were as follows: precursor mass tolerance of 10 ppm and fragment mass tolerance of 0.6 Da. The false discovery rate was <1%. Individual ion scores >13 indicated identity or extensive homology (P<0.05). Only human proteins with two or more unique peptides were used in further analyses. The mass spectrometry proteomics data have been deposited in the PRIDE partner repository of the ProteomeXchange Consortium with the data set identifier PXD001435 and DOI 10.6019/PXD001435.49
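The filtering criteria above can be illustrated with a minimal Python sketch. It is not the Progenesis or Mascot implementation; the function names and the peptide-to-protein mapping are hypothetical, and only the thresholds (charge state +2 to +7, 10 ppm precursor tolerance, at least two unique peptides) are taken from the text.

```python
# Thresholds from the text; every function name below is hypothetical.
MIN_CHARGE, MAX_CHARGE = 2, 7
PRECURSOR_TOL_PPM = 10.0
MIN_UNIQUE_PEPTIDES = 2


def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def charge_ok(charge):
    """True if the peak charge state lies between +2 and +7 inclusive."""
    return MIN_CHARGE <= charge <= MAX_CHARGE


def within_tolerance(observed_mass, theoretical_mass):
    """True if the absolute precursor mass error is at most 10 ppm."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= PRECURSOR_TOL_PPM


def proteins_with_enough_peptides(peptide_to_protein):
    """Keep proteins supported by at least two distinct peptides.

    Assumes each peptide sequence maps to exactly one protein,
    i.e. every peptide in the mapping is already unique to its protein.
    """
    peptides_by_protein = {}
    for peptide, protein in peptide_to_protein.items():
        peptides_by_protein.setdefault(protein, set()).add(peptide)
    return {
        protein
        for protein, peptides in peptides_by_protein.items()
        if len(peptides) >= MIN_UNIQUE_PEPTIDES
    }
```

For example, a precursor observed at 1000.005 Da against a theoretical mass of 1000.000 Da has an error of about 5 ppm and would pass the 10 ppm tolerance, whereas one observed at 1000.020 Da (about 20 ppm) would not.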

Statistical analyses. Statistical analyses were performed using STATA release 12 (StataCorp, College Station, TX) and R release 3.1.0 (R Foundation for Statistical Computing, Vienna, Austria). In the targeted hypothesis-driven analysis, we compared the relative abundance of pre-defined cervicovaginal mucosal barrier proteins of interest as identified by mass spectrometry among VMB groups without correcting for multiple comparisons. The pre-defined proteins of interest included proteins from the mucus layer, epithelial layer, and immune system (AMPs, cytokines, and immunoglobulins) that have previously been described in the literature. We assessed all mucins and KRT in the data set; LDHA and LDHB as markers of cell death;50 AMPs that were described in recent review articles;10, 12 cytokines using the GO-term “GO:0005125 cytokine activity” (see below) and IL-1 receptor antagonist; and all heavy chain constant regions of immunoglobulins. In the untargeted analysis, each of the 549 identified proteins was tested for differential abundance among VMB groups using one-way analysis of variance on log-transformed ion intensities. We used Bonferroni correction with an alpha of 0.05, resulting in a P-value cutoff of 9.1 × 10⁻⁵.
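As a worked illustration of the untargeted analysis, the sketch below reproduces the Bonferroni cutoff (0.05 divided by 549 tests) and computes a one-way analysis of variance F statistic on log-transformed values. It is a minimal Python sketch, not the STATA or R code used in the study; the function names are hypothetical, and the natural logarithm is an assumption (the F statistic does not depend on the base of the logarithm). Converting F to a P-value requires the F distribution, which is outside the Python standard library and is therefore omitted.

```python
import math

N_PROTEINS = 549  # number of identified proteins tested (from the text)
ALPHA = 0.05      # family-wise alpha (from the text)


def bonferroni_cutoff(alpha, n_tests):
    """Per-test P-value threshold under Bonferroni correction."""
    return alpha / n_tests


def log_transform(intensities):
    """Natural-log transform of strictly positive ion intensities (assumed base)."""
    return [math.log(v) for v in intensities]


def one_way_anova_f(groups):
    """F statistic for a one-way analysis of variance.

    `groups` is a list of lists, one list of (already transformed)
    values per group. Returns (F, df_between, df_within).
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


cutoff = bonferroni_cutoff(ALPHA, N_PROTEINS)
print(f"{cutoff:.1e}")  # 9.1e-05
```

A protein would be called differentially abundant only if the P-value derived from its F statistic fell below this cutoff.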

Information on molecular function and associated biological processes was retrieved from the Gene Ontology website using AmiGO51 and UniProt.52 The GO-terms that were used are “GO:0005839 proteasome core complex”; “GO:0008233 peptidase activity”; “GO:0030414 peptidase inhibitor activity”; “GO:0005856 cytoskeleton”; “GO:0030036 actin cytoskeleton organization”; “GO:0001533 cornified envelope”; “GO:0045095 keratin filament”; “GO:0009056 catabolic process”; “GO:0006955 immune response”; “GO:0006954 inflammatory response”. Minor modifications were made to GO-term GO:0030414 (excluding complement factor 3 and serpin B5), and GO:0005856 (excluding proteasome subunit beta type-3), because the functions of the excluded proteins are better described by other terms according to the published literature.53, 54, 55 An annotated heatmap was drawn using the “Heatplus” package in R and hierarchical clustering with Euclidean distance and complete linkage was performed on the range-scaled ion intensities of the differentially abundant proteins.
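The clustering step can be sketched in plain Python as follows. This is an illustration only and not the “Heatplus” implementation: the function names are hypothetical, and range scaling is assumed here to mean (x − min)/(max − min) applied to each protein's intensities across samples, which the text does not specify.

```python
import math


def range_scale(values):
    """Scale values to [0, 1] via (x - min) / (max - min) (assumed definition)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(rows):
    """Agglomerative clustering with complete linkage.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where clusters are tuples of row indices.
    """
    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest
                # pairwise distance between their members.
                d = max(
                    euclidean(rows[a], rows[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

At each step the two clusters with the smallest complete-linkage distance are merged, so the sequence of merges corresponds to the dendrogram drawn alongside a heatmap.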

Differences in correlates between VMB groups were assessed by the two-sided Kruskal–Wallis test for continuous data and Fisher’s exact test for categorical data. Pairwise comparisons of protein relative abundance between microbiome groups were tested using the two-sided Mann–Whitney test. The “nptrend” function in STATA, an extension of the Wilcoxon rank-sum test,56 was used to test for trends in correlates and protein relative abundance among microbiome groups. Sensitivity analyses were performed by stratifying women according to their HIV status and their use of hormonal contraception. Furthermore, a separate pairwise comparison of proteome compositions was performed between HIV-negative and HIV-positive women using the Student's t-test on log-transformed ion intensities with Bonferroni correction.
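For readers unfamiliar with the rank-based pairwise comparison, the following minimal Python sketch computes the Mann–Whitney U statistics for two groups, using average ranks for tied values. It is not the STATA routine used in the study, the function names are hypothetical, and the conversion of U to a two-sided P-value is omitted.

```python
def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend `end` over the run of values tied with the one at `pos`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples."""
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y
```

A U value of 0 for one group means that every observation in that group is smaller than every observation in the other group; the two U values always sum to the product of the group sizes.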