Microbial Diversity of Genital Ulcers of HSV-2 Seropositive Women

We measured the microbial community structure of genital ulcers in women. Swabs from clinically detected ulcers were tested for HSV-2 and Treponema pallidum by polymerase chain reaction (PCR). Serologic evidence of HSV-2 and T. pallidum infection was assessed by serum antibody testing. Microbial community structure was characterized by high-throughput 16S rRNA gene amplicon sequencing. Multiple group testing and Elastic net and Lasso regressions identified taxa associated with differences in factors of interest. Among 49 ulcer specimens from 49 HSV-2 seropositive women, HSV-2 was detected by PCR in 28 (57%) specimens and T. pallidum in none; one woman showed serologic evidence of syphilis. Overall, 63% of women were HIV-positive and 49% had an uncircumcised male sex partner. By both multiple group testing and regression, Porphyromonas (FDR p-value = 0.02), Prevotella (FDR p-value = 0.03), Anaerococcus (FDR p-value = 0.07), and Dialister (FDR p-value = 0.09) were detected at higher relative abundance in HSV-2 PCR-positive than in PCR-negative ulcers. The presence of HSV-2 in a lesion was associated with presumed bacterial agents of bacterial vaginosis (BV). Differences in bacterial communities may contribute to HSV-2 ulcer pathogenesis, severity, or prolonged healing. If these results are confirmed, future studies may consider the influence of BV treatment on women’s genital ulcer disease (GUD) and HSV-2 incidence and recurrence.

Detection of HSV-2 and Treponema pallidum. Serum specimens were tested at the study clinic for HSV-2 antibody (Kalon HSV-2 IgG ELISA, Kalon Biological Limited, Aldershot, United Kingdom) according to the manufacturer's instructions using recommended cutoffs. Syphilis infection was detected using the rapid plasma reagin (RPR) test (Macro-Vue™, Becton Dickinson, New Jersey, United States), confirmed by the T. pallidum hemagglutination (TPHA) assay (Randox Laboratories Ltd., Ardmore, United Kingdom).
Genital ulcer specimens were frozen at −20 °C, and shipped to Johns Hopkins University STD Research Laboratory for detection of HSV-1, HSV-2 and T. pallidum by multiplex polymerase chain reaction (PCR). Swabs were treated with 350 µl of 1X TRIS-EDTA (TE) buffer and vortexed for 15 seconds. The swab was removed from the TE buffer with a pair of tweezers, and the swab was expressed on the side of the tube ~3 times by rotating and pressing it into the tube. The swab was added back to the tube, and 200 µl of the 1X TE buffer was pipetted into a sample processing tray for extraction with the Roche MagNA Pure LC robot, leaving 150 µl for amplicon sequencing. All samples were extracted utilizing the Roche MagNA Pure LC robot with DNA Isolation Kit I, following the DNA I Blood/Cells High Performance extraction protocol, and PCR was performed for HSV-1, HSV-2, and T. pallidum 12 .
Analysis of microbial community structure. Genomic DNA (gDNA) from ulcers was PCR amplified and prepared for next-generation sequencing (NGS) using a modified two-step targeted amplicon sequencing approach, similar to that described previously 13,14 , with primers 515F and 806R 15 , targeting the V4 variable region of the bacterial and archaeal small subunit (SSU) ribosomal RNA (rRNA) gene. The primers contained 5′ common sequence tags (common sequence 1 and 2 [CS1 and CS2]) as described previously 16 . The forward primer, CS1-515F (ACACTGACGACATGGTTCTACAGTGCCAGCMGCCGCGGTAA) and reverse primer, CS2-806R (TACGGTAGCAGAGACTTGGTCTGGACTACHVGGGTWTCTAAT) were synthesized by Integrated DNA Technologies (IDT; Coralville, Iowa) as standard oligonucleotides. The PCR amplifications were performed in 10 µl reactions in 96-well plates. A mastermix for the entire plate was made using the 2X AccuPrime SuperMix II (Life Technologies, Gaithersburg, MD). Final concentration of the CS1-515F and CS2-806R primers was 500 nM.
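The tagged primer sequences given above contain IUPAC ambiguity codes (M, H, V, W), so each primer is in practice a pool of concrete oligonucleotides. The following is a minimal sketch, not part of the study's workflow, that enumerates those variants; the function name `expand_degenerate` is hypothetical, and the lookup table holds the standard IUPAC nucleotide codes.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

# Primer sequences exactly as reported in the text (CS tag + V4 primer).
CS1_515F = "ACACTGACGACATGGTTCTACAGTGCCAGCMGCCGCGGTAA"
CS2_806R = "TACGGTAGCAGAGACTTGGTCTGGACTACHVGGGTWTCTAAT"


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer]
    return ["".join(combo) for combo in product(*choices)]


forward_variants = expand_degenerate(CS1_515F)
reverse_variants = expand_degenerate(CS2_806R)
print(len(forward_variants), len(reverse_variants))  # prints "2 18"
```

The forward primer carries a single two-fold degenerate position (M), whereas the reverse primer carries three (H, V, W), which accounts for the larger reverse pool.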
Cycling conditions were as follows: 95 °C for 5 minutes, followed by 28 cycles of 95 °C for 30", 55 °C for 45" and 68 °C for 30". A final, 7 minute elongation step was performed at 68 °C. A single microliter of the first amplification reaction was transferred to a second amplification reaction with the same mastermix conditions, but with different primers. No purification was performed. Each well received a unique primer pair obtained from the Access Array Barcode Library for Illumina sequencers (Fluidigm, South San Francisco, CA; Item# 100-4876), which contain Illumina sequencing adapters, a sample-specific barcode (reverse primer), and CS1 or CS2 sequences. Cycling conditions were as follows: 95 °C for 5 minutes, followed by eight cycles of 95 °C for 30", 60 °C for 30" and 68 °C for 30". A final, 7 minute elongation step was performed at 68 °C. Samples were pooled in equal volume and purified using solid phase reversible immobilization (SPRI) cleanup, implemented with AMPure XP beads (Beckman Coulter, Brea, CA, USA) at a ratio of 0.6X (v:v) SPRI solution to sample volume.
The library pool was spiked with 15% non-indexed PhiX control library provided by Illumina and loaded onto a MiSeq v2 flow cell at a concentration of 8 pM for cluster formation and sequencing. Sequencing was performed at the W.M. Keck Center for Comparative and Functional Genomics at the University of Illinois at Urbana-Champaign (UIUC), and data were analyzed using the CASAVA 1.8 pipeline.
Raw sequence data were imported into the software package CLC Genomics Workbench (v7.0; CLC Bio, Qiagen, Boston, MA). Sequences were merged, quality trimmed (Q20), filtered to remove residual PhiX contamination, and exported as FASTA files. Subsequently, sequence data were processed using the software package QIIME (v1.8.0; 17 ). Briefly, sequences were screened for chimeras using the usearch61 algorithm 18 with de novo and reference-based detection methods, and putative chimeric sequences were removed from the dataset. All reads were then clustered into operational taxonomic units (OTUs), using a threshold similarity of 97%. Representative sequences from each OTU were extracted, and these sequences were classified using the "assign_taxonomy" algorithm implementing the RDP classifier, with the Greengenes reference OTU build (v13_8; 19 ). A taxon-by-sample abundance matrix (biological observation matrix, BIOM; 20 ) was generated at taxonomic levels from phylum to genus using the "make_OTU_table" algorithm. The BIOM was rarefied to 1,250 sequences per sample within QIIME to avoid analytical issues associated with variable sequence number between samples 21 .
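Rarefaction to a fixed depth can be illustrated as random subsampling of reads without replacement, with samples below the target depth dropped. The sketch below is an illustration under that assumption and is not QIIME's implementation; the function `rarefy`, the seed, and the example counts are all hypothetical.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample one sample's taxon counts to `depth` reads without replacement.

    Returns None when the sample holds fewer than `depth` reads, mirroring
    the exclusion of shallow samples from a rarefied table.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand the counts into one label per read, then draw `depth` reads.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    drawn = rng.sample(reads, depth)
    rarefied = {taxon: 0 for taxon in counts}
    for taxon in drawn:
        rarefied[taxon] += 1
    return rarefied


# Hypothetical read counts, for illustration only.
deep_sample = {"Prevotella": 900, "Porphyromonas": 600, "Lactobacillus": 300}
shallow_sample = {"Prevotella": 500, "Lactobacillus": 300}

rarefied = rarefy(deep_sample, 1250)
print(sum(rarefied.values()))        # 1250 reads retained
print(rarefy(shallow_sample, 1250))  # None: sample falls below the depth
```

Every retained sample ends with the same total, so between-sample comparisons are not driven by sequencing depth, at the cost of discarding reads and shallow samples.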
Statistical Analysis. We used three analytic approaches to compare the bacterial community of genital ulcers by detection of HSV-2, male sex partner circumcision status, and HIV status: (1) test for global differences in bacterial communities by factors of interest; (2) compare diversity of bacterial communities by factors of interest; (3) identify specific bacterial taxa differing by factors of interest. Analyses were conducted at the taxonomic level of genus. Additionally, we compared clinical characteristics by recovery of HSV-2 from ulcers. Differences between categorical explanatory variables and factors were assessed by the chi-square test, or Fisher's exact test when cell size was < 5. Inferential analyses were conducted using STATA/SE 13.0 for Windows (Stata Corp., College Station, TX).
First, we tested for global differences in bacterial communities by factors using analysis of similarity (ANOSIM). Rarefied sequence data were transformed [log(X + 1)] prior to generation of the Bray-Curtis resemblance matrix. Second, diversity at the genus level was measured by the Shannon diversity index (log base e) and was compared by factors with the Wilcoxon rank sum test. ANOSIM and Shannon diversity index analyses were conducted in the software package Primer6 (Primer-E, version 6.1.13, United Kingdom). Third, to identify specific bacterial taxa that differed by HSV-2 PCR result, male partner circumcision status, and HIV status, we compared the number of sequence reads of each genus using the Kruskal-Wallis test. We report test statistic p-values and false discovery rate (FDR)-corrected p-values. Group significance testing was performed within QIIME, employing the Kruskal-Wallis non-parametric test on the untransformed number of sequence reads, with calculation of FDR-corrected p-values 17 . To verify the robustness of these results, and considering the dependence between bacterial taxa, we also applied regression to identify specific bacterial taxa that differed by factors. Elastic net regression was chosen as our primary regression approach because its method of variable selection is not bounded by the number of samples: it uses an additional ridge regression penalty to remove this constraint 22 . For the validation process, k-fold cross validation was used, as the sample size was not sufficient to be divided into sizable training and validation sets. We applied numerous values of k to assess how the model parameters changed. We tested numerous values of the ridge regression parameter to identify the optimum model for appropriate variable selection.
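The quantities named in this paragraph can be sketched in a few lines. The block below is a minimal illustration, not the Primer6 or QIIME code: it computes the Shannon index with the natural logarithm, the Bray-Curtis dissimilarity after a log(X + 1) transform, and FDR-adjusted p-values. The text does not name the FDR procedure, so the Benjamini-Hochberg method is assumed here; all function names and input values are hypothetical.

```python
import math


def shannon(counts):
    """Shannon diversity index H' using the natural logarithm."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def bray_curtis_log(a, b):
    """Bray-Curtis dissimilarity after a log(x + 1) transform of counts."""
    la = [math.log(x + 1) for x in a]
    lb = [math.log(x + 1) for x in b]
    num = sum(abs(x - y) for x, y in zip(la, lb))
    den = sum(x + y for x, y in zip(la, lb))
    return num / den if den else 0.0


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method), input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical genus-level counts for two samples and hypothetical p-values.
sample_a = [600, 400, 200, 50]
sample_b = [100, 100, 900, 150]
print(shannon(sample_a), shannon(sample_b))
print(bray_curtis_log(sample_a, sample_b))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

The log(X + 1) transform reduces the influence of the most abundant genera on the dissimilarity, while the FDR adjustment accounts for testing many genera at once.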
To assess the robustness of the Elastic net results, LASSO regression was implemented 23 , and we identified 7 bacteria that were selected by both Elastic net and LASSO regression as having an association with any of the outcomes (HSV-2 status, HIV status, male sex partner circumcision status) at the p < 0.10 level. We applied linear regression to these 7 bacteria for each of the three outcomes, and report coefficients, p-values and adjusted R-squared values; models were multivariable adjusted when a bacterium was found to be significantly associated (p < 0.10) with more than one dependent variable. Before the analysis by linear regression, the data were standardized to µ = 0 and σ = 1. Regression analyses were executed in SAS software, version 9.4 (SAS Institute Inc., Cary, NC, USA).
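The standardization and single-factor linear regression described here can be illustrated as follows. This is a sketch under stated assumptions rather than the SAS analysis: it assumes that the read counts of a genus are the standardized variable, that the sample standard deviation is used, and that a single binary factor is the predictor. The helper names and data values are hypothetical.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption of this sketch
    return [(v - mean) / sd for v in values]


def simple_ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical read counts of one genus and HSV-2 PCR status (1 = positive).
reads = [120, 80, 200, 10, 30, 20]
hsv2_pcr = [1, 1, 1, 0, 0, 0]

z_reads = standardize(reads)
slope, intercept = simple_ols(hsv2_pcr, z_reads)
print(round(slope, 3), round(intercept, 3))
```

With a binary predictor, the slope equals the difference in mean standardized abundance between the two groups, so the coefficient is expressed in standard deviation units.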
Principal coordinates analysis was used to visualize bacterial community differences by the three dependent variables (HSV-2 PCR status, HIV status, and male sex partner circumcision status). We calculated both within- and between-group weighted UniFrac distances to determine the phylogenetic distances between the different groups of samples according to the dependent variables. The weighted UniFrac distances and principal coordinates analysis plots were generated using the phyloseq, ggplot2, plyr, ape and wesanderson packages in the R 3.3.2 environment (R Development Core Team).
Data Access. The 16S rRNA gene amplicon sequence data from this study have been submitted to the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi) under accession number PRJNA281808.

Results
For this pilot study, the target sample size was 60 subjects, and 58 subjects were enrolled in the allotted timeline. Data were rarefied to 1,250 sequences, which retained 56 subjects. Overall, 49 (88%) women were HSV-2 seropositive and 7 (12%) were HSV-2 seronegative. Among the 49 HSV-2 seropositive women, by PCR HSV-2 was detected in 28 (57%) ulcers and T. pallidum in none. Serologically, syphilis was detected in 1 (2.0%) woman (titer 1:4; TPHA-positive); there were no sequence reads from the genus Treponema detected in this specimen (0/34,176 sequences before rarefaction). One serologically negative subject had an OTU annotated as Treponema (30/47,102 sequences before rarefaction) with HSV-2 recovered from the ulcer. Among the 7 HSV-2 seronegative women, there was no serologic evidence of syphilis. HSV-2 was PCR-detected in three (43%) of seven ulcers from HSV-2 seronegative women while T. pallidum was not detected; amplicon sequencing did not identify any Treponema spp. sequence reads in HSV-2 seronegative women. All GUD specimens were negative for HSV-1. Therefore, no STI etiology was identified in four (7%) of 56 women with clinically detected genital ulcers. Given the small number of HSV-2 seronegative subjects, inferential analyses were restricted to the 49 HSV-2 seropositive women. From these 49 specimens we obtained 1,664,532 sequences with an average read length of 250 base pairs. An average of 33,970 sequences were obtained from each sample (min = 1,261, max = 64,939).

Socio-Demographic and Behavioral Characteristics. HSV-2 seropositive women had a median age of 26 years (range 18-59), and 76% were married or cohabiting (Table 1). Overall, 60% of women reported having one sex partner in the past six months, with 17% reporting two or more sex partners, and 49% reported having any uncircumcised sex partner in the past six months. Male sex partner's circumcision status at the last sex, in the past 30 days, and in the past six months were highly correlated (Pearson correlation coefficient = 0.68-0.79); subsequent analyses proceeded with circumcision status in the past six months, as most women reported one sex partner in the past six months. Having an uncircumcised male sex partner did not differ between women with HSV-2 PCR positive (50%) or negative (47%) ulcers (p = 0.86). Overall, 63% of women were HIV positive, and this did not differ significantly between women with HSV-2 PCR positive or negative ulcers (71% vs. 50%, p = 0.13). There were no statistically significant differences in sociodemographic characteristics, behavioral characteristics, or the joint distribution of HIV status and male sex partner circumcision status between women with HSV-2 PCR positive or negative ulcers (Table 1).
Clinical Characteristics by HSV-2 PCR Results. Compared to HSV-2 positive ulcers, the treating clinician was more likely to describe HSV-2 PCR negative ulcers as papular (32% vs. 4%, p = 0.02), and less likely to describe HSV-2 PCR negative ulcers as "patch" (0% vs. 21%, p = 0.03) and hyperpigmented (18% vs. 47%, p = 0.08; Table 2). The clinical impression was associated with PCR detection of HSV-2: 54% of ulcers in which HSV-2 was recovered were diagnosed as "GUD of herpetic appearance" compared to 24% of those that were HSV-2 negative (p = 0.05).

Global Comparison of Microbial Community Structure by mPCR Results. Analysis of similarity (Table 3) showed that the difference in bacterial community composition between HSV-2 PCR-positive and PCR-negative women was statistically significant (p = 0.002). There were no statistically significant differences in community composition by HIV status or male sex partner circumcision status. The principal coordinates plots are a complementary visualization of these analytic results: the community centroid for samples from women with HSV-2 PCR positive ulcers is distinct from the centroid of samples from women whose ulcers were HSV-2 PCR negative (Fig. 1), while centroids representing bacterial communities by HIV status (Fig. 2) and male sex partner circumcision status (Fig. 3) show much greater overlap and shorter distances between centroids.
As reflected by the median Shannon index (Table 1), there was greater diversity in microbial communities from ulcers in which HSV-2 was detected compared to ulcers in which HSV-2 was not detected (2.1 vs. 1.9, p = 0.05). The median number of taxa recovered was also higher for ulcers in which HSV-2 was detected (26.5 vs. 17, p = 0.02).

Discussion
Among HSV-2 seropositive women, HSV-2 was detected by PCR in 57% of clinically identified ulcers, and the bacterial community of these ulcers differed by whether HSV-2 was detected with PCR. Because all women were HSV-2 seropositive, the ulcers were of putatively similar etiology; the variation in their bacterial community composition matters to the extent that these community structures are associated with co-infections or affect severity of disease or treatment. This study was conducted as a response to our previous finding that 55% of men with clinically detected genital ulcers were serologically negative for HSV-2 and syphilis, and 39% also had no HSV-2 or T. pallidum recovered by PCR 8 . This high proportion of non-etiologically defined GUD is in keeping with studies throughout Sub-Saharan Africa 24-29 and India 30,31 . Using amplicon sequencing in our study of men, we found that anaerobic bacteria associated with BV were more commonly recovered from genital ulcers in men without HSV-2 or T. pallidum. Our current findings present two striking differences from these previous results. First, only a minority of women (7%) with GUD were without STI-associated etiology. Subjects in the current study self-presented with symptoms, whereas in the previous study, samples were collected from men taking part in a randomized clinical trial of medical male circumcision (MMC) to reduce HIV incidence 3 . Men in the trial underwent physical examination every 6 months, and were encouraged to come to the clinic for symptoms at interim visits. HIV was rare (4%) in our prior analysis as men had to be HIV negative for trial entry, compared to 63% HIV-positivity among women in the current study. HIV can alter the natural history of genital herpes, producing more severe and frequent outbreaks and delayed healing 32 .
Therefore, in the trial we increased our chance of detecting non-STI related ulcers and ulcers that may not have been severe enough to lead to health care seeking otherwise. Secondly, our present study found that ulcers in which HSV-2 was detected by PCR had more BV-associated bacteria (specifically, Porphyromonas, Prevotella, Peptostreptococcus, Sneathia, and possibly Anaerococcus and Dialister) than ulcers from which HSV-2 was not detected. Ulcers in which HSV-2 was not detected by PCR may have been at a different point in healing, such that HSV-2 was no longer detectable. However, the median duration of ulcer symptoms did not differ by whether or not HSV-2 was recovered from the ulcer. In a cohort of Kenyan women, Kaul et al. found that baseline HSV-2 infection was associated with increased incidence of BV 33 . At the same time, prospective epidemiologic and clinical investigations support an increased risk of HSV-2 acquisition among women with BV 34 . Whether HSV-2 increases risk of BV, BV increases risk of HSV-2, or both, such an association would increase the likelihood of co-detection of HSV-2 and BV-associated bacteria in a cross-sectional study design. Though published data regarding the effect of BV on the natural history of HSV-2 are limited, Ursell et al. reported that BV-associated bacteria (G. vaginalis, A. vaginae) have been associated with increased duration and number of HSV-2 related outbreaks 35 .
A primary public health concern is that among HSV-2 seropositive individuals there is increased risk of HIV acquisition 36 and transmission 37 , highlighted by the 63% HIV prevalence among our sample of HSV-2 infected women. We detected significant global bacterial community dissimilarity between HIV-positive and HIV-negative women only among ulcers that were HSV-2 PCR negative. We observed that Finegoldia was significantly reduced among women with HIV. Prospective studies of the effect of HIV on the vaginal microbiome are limited, but our recent analysis of a cohort of U.S. women showed that HIV status did not alter the change in the vaginal microbiome over an 8- to 10-year period 38 .
The microbial community of ulcers from women with an uncircumcised partner reflected more BV-associated bacteria. This is consistent with what we expected based on our previous findings and those of others showing that female partners of uncircumcised men are more likely to have BV, and that this stems from the greater abundance of BV-associated anaerobic bacteria among uncircumcised men compared to circumcised men 39 . However, in our study this difference was only observed in ulcers from which HSV-2 was not recovered. We are unable to determine why the differences by HIV status and male partner circumcision status were detected only among HSV-2 PCR negative ulcers. Although these women were all HSV-2 seropositive, the GUD detected could have had different or multiple etiologic factors. This may be supported by the differences in ulcer characteristics between HSV-2 PCR positive and negative ulcers; overall, the clinical impression was less often "GUD, herpetic" for PCR negative (25%) than positive ulcers (54%).
Limitations. We recruited women seeking care at an STI clinic, which likely biased our sample towards those with more severe and HSV-2 related GUD. Data on male partner circumcision status were self-reported by women, and there may have been misclassification. In a study of women's knowledge of partners' male circumcision status in Uganda, 1.2% of women misreported the status of confirmed circumcised men and 8.2% of women misreported the status of uncircumcised men 40 , though studies of this in Zambia and Swaziland found misclassification rates of 11-15% for either status 41 . We did not conduct species-level analysis, which will be necessary in subsequent studies with more specific hypotheses. This pilot study had low power to detect modest associations for individual bacteria, especially in the context of multiple testing correction, which was necessary for hypothesis exploration.
However, we used multiple analytic approaches to increase confidence in the robustness of results: detection of differences is supported by results of ANOSIM, Bray-Curtis dissimilarity measure, Kruskal-Wallis group testing, and regression analyses, with convergence on specific bacteria. We did not collect data on frequency of previous GUD, occurrence of prodrome, or clinician characterized stage of healing. We did not diagnose BV by Amsel's criteria or Nugent score, as the clinic at which the study took place provides only syndromic diagnosis. This information would have aided interpretation of the observed differences in bacterial community structure. These data emphasize the need for longitudinal study to understand the relationships between the vaginal microbiome and women's risk of HSV-2, HIV, BV, and the role of male sex partner's circumcision status in these outcomes.
Table 4. Mean sequence reads of bacterial taxa by HSV-2 PCR status, HIV status, and male sex partner circumcision status, with Kruskal-Wallis group tests of significance. FDR P-value = false discovery rate corrected P-value; "Uncircum" = Uncircumcised; "Circum" = Circumcised. We present results from the 14 bacteria representing the top 10 bacteria of significance (ranked by P-value) for each factor. Because some bacteria were in the top 10 for some outcomes but not others (e.g., Gardnerella for HIV status and male sex partner circumcision status, but not for HSV-2 status), this leads to presentation of more than 10 bacteria total.
Table 5. Results of linear regression: bacteria associated with HSV-2, HIV, and circumcision status. Interpretation: These results represent 7 different models for the 7 bacteria identified in Elastic net and Lasso regression as associated with HSV-2 ulcer status, HIV status, and/or male sex partner's circumcision status at the p < 0.10 level. As Sneathia was associated with both HSV-2 status and male sex partner circumcision status, the model controlled for both factors. Data are standardized (µ = 0 and σ = 1) and we report the coefficient (standard error), model p-value and adjusted R-squared.