The characteristics and risk factors of human papillomavirus infection: an outpatient population-based study in Changsha, Hunan

This cross-sectional study investigated the characteristics of cervical HPV infection in the Changsha area and explored the influence of Candida vaginitis on this infection. From 11 August 2017 to 11 September 2018, 12,628 outpatient participants aged 19 to 84 years were enrolled and analyzed. HPV DNA was amplified and tested using the HPV GenoArray Test Kit. Vaginal ecology was assessed by microscopic and biochemical examinations. The diagnosis of Candida vaginitis was based on expert microscopic examination (spores and/or hyphae) and biochemical testing (galactosidase) of vaginal discharge. Statistical analyses were performed using SAS 9.4. Continuous and categorical variables were analyzed by t-tests and Chi-square tests, respectively. HPV infection risk factors were analyzed using multivariate logistic regression. Of all participants, 1753 (13.88%) were infected with HPV. Females aged ≥ 40 to < 50 years constituted the largest group of HPV-infected females (31.26%). The top 5 HPV subtypes among the 1753 infected females were HPV-52 (28.01%), HPV-58 (14.83%), CP8304 (11.47%), HPV-53 (10.84%), and HPV-39 (9.64%). Age (OR 1.01; 95% CI 1–1.01; P < 0.05) and alcohol consumption (OR 1.30; 95% CI 1.09–1.56; P < 0.01) were found to be risk factors for HPV infection. However, the presence of Candida in the vaginal flora was found to be a protective factor against HPV infection (OR 0.62; 95% CI 0.48–0.8; P < 0.001). Comparing these results with our previous study of 2016, we conclude that the subtype distribution of HPV infection is relatively constant in Changsha. Our data suggest a negative correlation between vaginal Candida and HPV; however, more radical HPV management is required in this area for perimenopausal women and those who regularly consume alcohol.

The top 5 hrHPV subtypes with the highest infection rates in China are HPV-16, HPV-52, HPV-58, HPV-53, and HPV-18 3. HPV infection is most prevalent among women of childbearing age and menopausal women; the peak in prevalence among menopausal women is known as the "second peak" 4. Factors reported as closely correlated with HPV infection include alcohol consumption, smoking, age at first marriage, marital status, vulvovaginal ulcers, and vulvovaginal inflammation 5,6. Despite decades of investigation, the biological mechanisms by which these risk factors act on HPV infection remain poorly understood, and the studies have yielded few results regarding the pathogenesis of this infection. In recent years, scientists have become interested in the mechanisms of vaginal flora and have made promising progress in investigating them. Some studies have reported that vaginal microorganisms might promote or impede HPV infection by affecting the host's cervical micro-ecological environment, and may further play essential roles in the pathogenicity of HPV and cervical lesions 7. Lactobacillus protects against HPV by producing lactic acid, which maintains a low vaginal pH 8, and Candida stimulates T cell proliferation to counter HPV infection 9. In contrast, Gardnerella, Fusobacteria, Mycobiota, and Chlamydia trachomatis (CT) have been associated with HPV infection 8,10,11; for example, CT acts as an entryway through the cervical epithelium and facilitates HPV infection 11.
In 2016, Xiao et al. reported on the characteristics of HPV infection in Changsha based on information obtained from the Gynecological Outpatient Center of our hospital (January 2009-December 2013, The Third Xiangya Hospital of Central South University) 12. However, updated evidence is needed to analyze the longitudinal development of cervical cancer prevention and treatment in Changsha. Knowledge of many cervical cancer risk factors, including lifestyle factors such as smoking, HPV type and distribution, and the influence of vaginal flora on HPV, also needs to be updated. We examined patients who attended the Physical Examination Outpatients Center at our hospital and hypothesized that the distribution of HPV might have changed in the few years since the previous study. We also hypothesized that vaginal flora or Candida might play a vital role in influencing this infection.
Age distribution of HPV-positive participants. We analyzed the age distribution of the study participants. Among HPV-positive participants, the largest share was in the 40-49.9-year age group (31.26%; n = 548; 4.34% of all participants), followed by the 30-39.9-year group (27.44%; n = 481) and the 50-59.9-year group (24.59%; n = 431). Table 2 shows the HPV distribution across age groups.

Risk and protective factors of HPV infection identified using multiple logistic regression analysis.
The trends of the unadjusted OR (ORu) and the adjusted OR (ORa) were consistent. The risk factors for HPV infection were age (ORa 1.01; 95% CI 1-1.01; P = 0.011) and alcohol consumption (ORa 1.31; 95% CI 1.09-1.56; P < 0.01). In contrast, vaginal Candida infection (ORa 0.62; 95% CI 0.48-0.8; P < 0.001), age at first marriage (≥ 20 years, ORa 0.79; 95% CI 0.65-0.95; P = 0.012), and age at first childbirth (> 30 years, ORa 0.67; 95% CI 0.49-0.93; P = 0.016) were protective factors against HPV infection in the study. An age at first childbirth of ≥ 20 to < 30 years (ORa 0.88; 95% CI 0.67-1.17) also pointed toward protection against HPV infection, but the association was not significant (P = 0.38) (Table 4).
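To make the reported odds ratios easier to interpret, the following minimal Python sketch shows how an OR and its 95% CI relate to a logistic regression coefficient and its standard error. It assumes Wald-type intervals and that age was entered in years; neither is stated in the text, and the helper names (`odds_ratio_ci`, `se_from_ci`, `rescale_or`) are hypothetical and not taken from the authors' SAS code.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def odds_ratio_ci(beta, se, z=Z_95):
    """Convert a logistic coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=Z_95):
    """Back out the standard error of log(OR) from a reported Wald-type CI (assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def rescale_or(or_per_unit, units):
    """OR for a change of `units` in a continuous predictor."""
    return or_per_unit ** units

# Reported alcohol estimate: ORa 1.31, 95% CI 1.09-1.56
se_alcohol = se_from_ci(1.09, 1.56)
or_alcohol, lo_alcohol, hi_alcohol = odds_ratio_ci(math.log(1.31), se_alcohol)

# Reported age estimate: ORa 1.01, assumed here to be per year of age
or_age_decade = rescale_or(1.01, 10)

print(f"alcohol: OR {or_alcohol:.2f} ({lo_alcohol:.2f}-{hi_alcohol:.2f})")
print(f"age, per 10 years: OR {or_age_decade:.2f}")
```

Under these assumptions, an OR of 1.01 per year compounds to roughly 1.10 per decade, which helps explain why a small per-year estimate can still matter when comparing age groups.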
Assessment of prediction accuracy using tenfold cross-validation. The areas under the curve (AUCs) were significant for vaginal Candida infection, age at first sexual intercourse, age at first childbirth, and alcohol consumption (P < 0.05). The AUC values for all these variables were approximately 0.8. Additionally, the ANOVA test and the initial models that included all variables produced results that were not significant, which indicates that the statistics using AUC were reliable. For example, vaginal Candida infection had the largest AUC as a predictive factor for HPV infection: the AUC was 0.881, the optimal cutoff value was 0.164, the sensitivity was 1.000, and the specificity was 0.827 (Fig. 2).
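The text does not state how the optimal cutoff was chosen. One common approach, assumed here purely for illustration, is to maximize Youden's J (sensitivity + specificity - 1) over candidate thresholds on the ROC curve. The sketch below applies it to a small set of hypothetical predicted probabilities; the function names and data are invented and are not study data.

```python
def roc_points(scores, labels):
    """Return (cutoff, sensitivity, specificity) for each distinct score used as a cutoff.

    A case is called positive when its score is >= the cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points

def youden_cutoff(scores, labels):
    """Pick the cutoff that maximizes J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)

# Hypothetical predicted probabilities and true HPV status (1 = positive)
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40]
labels = [0, 0, 0, 1, 0, 1, 1]

cutoff, sensitivity, specificity = youden_cutoff(scores, labels)
print(cutoff, sensitivity, specificity)
```

In this toy example the selected cutoff catches every positive case while misclassifying one of four negatives, the same kind of trade-off as the reported sensitivity of 1.000 with specificity of 0.827.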

Discussion
The overall HPV-positive rate in this study (13.88%) was lower than that among mainland Chinese women overall (19.0%) 3. In our study, the top five HPV subtypes were HPV-52, HPV-58, HPV-CP8304, HPV-53, and HPV-39. By comparison, the top 5 subtypes in the study of Xiao et al., performed at the same hospital as our study, were HPV-52, HPV-16, HPV-58, HPV-CP8304, and HPV-53 12, and the top 5 HPV subtypes in China overall were HPV-16, HPV-52, HPV-58, HPV-53, and HPV-18 3. The detection results of our hospital over the most recent five years showed that the positive rates of hrHPV and HPV-16 were lower than in the past and lower than the national average, suggesting that Changsha has achieved good results in reducing the high risk of HPV-16 and related cervical precancerous lesions. HPV-CP8304 is a common subtype in our region, with high prevalence but low risk. Our data were predominantly consistent with the previous investigation 12 and mostly in line with the analysis of national data 3. The remaining differences may result from variation in geography, ethnicity, education, and horizontal health care. When considering vertical health care, however, the decrease in HPV infection evident in our study may be explained by improvements in women's health care and in awareness of women's health in recent years 13,14.
As early as 2000, Bosch et al. regarded persistent hrHPV infection as the causative agent in over 90% of CC cases 15,16. In 2015, Wang et al. confirmed an hrHPV rate of 91.8% among patients with invasive cervical cancer (ICC) in Hunan province 17. HPV-16 and HPV-18 are also considered the top hrHPV subtypes worldwide 18-20. However, neither HPV-16 (ranked 6th) nor HPV-18 was identified as a prevalent subtype in our study. Several factors may account for this, such as regional differences, improvements in women's health care, and greater awareness of women's health. Another reason is the difference in participant populations between studies. We investigated women from the Physical Examination Outpatient Center, whereas other studies investigated women from the Gynecological Outpatient Center. Symptomatic or sick patients would generally choose the Gynecological Outpatient Center for CC screening, while the Physical Examination Outpatient Center encounters more asymptomatic or healthy cases. Taking this into account, although our data are reasonably consistent with previous studies, some differences were unavoidable. These differences in screening samples also suggest that CC screening at physical examination outpatient centers, rather than gynecological outpatient centers, might yield results closer to the rate of HPV infection across the whole country.
[Caption fragment on excluded participants: 2 outliers in vaginal cleanliness as measured by vaginal microecology (shown as "+" and "−", respectively); 4 due to missing information as to age of menstruation onset; 637 due to missing height information; 639 due to missing weight information; and 7 due to missing fasting blood glucose information.]
The age distribution was then investigated. The age group with the highest HPV infection rate was 40-50 years, followed by 30-40 years and then 50-60 years.
This finding is not in full agreement with the reported bimodal pattern in the distribution of HPV infection by age, with peaks at 26-30 years and 46-50 years 21. Several factors might contribute to this pattern, including that immunity decreases with age, meaning that older women are at an increased risk of developing HPV infections 22. In addition, postmenopausal women have elevated vaginal pH values because decreased estrogen levels are correlated with low glycogen levels and low Lactobacillus abundance, making them susceptible to HPV infection 23,24. Additionally, older women tend to seek out routine gynecological care and cancer screenings 25, which potentially elevates the HPV-positive rate for this age group. This may explain our finding that increased age is a risk factor for HPV infection (OR 1.01, 95% CI 1-1.01, P < 0.05) (Table 4). In young women, however, the risk of developing HPV infection might be more associated with an active sex life, number of sexual partners 4, education level, living conditions, and use of oral contraceptives 27. Nevertheless, our findings suggest that more attention should be paid to older women and that more gynecological examinations should be provided as part of routine health care for senior women.
We also investigated potential risk factors of HPV infection. Our data showed that the vaginal pH of the HPV-positive group was significantly higher than that of the HPV-negative group (4.37 ± 0.25 vs. 4.35 ± 0.25, P < 0.001); this might be due to vaginal dysbiosis, particularly the displacement of Lactobacillus 28. In addition, sialidase positivity was associated with HPV infection (HPV-negative group: 4.5%, 487/10,875; HPV-positive group: 7.1%, 124/1753; P < 0.001).
This might be because sialidase, a virulence biomarker of Gardnerella vaginalis, hinders epithelial biofilm formation and thereby facilitates infection or co-infection with HPV and other microorganisms, such as bacterial communities, Chlamydia, and viruses including human immunodeficiency virus (HIV), herpes simplex virus (HSV), and human cytomegalovirus (HCMV) 29-35. In line with the study of Lili et al., the leukocyte esterase positive rate was higher in the HPV-positive group than in the HPV-negative group (21.7%, 380/1753 vs. 18.2%, 1977/10,857; P = 0.001) 36. We also found that alcohol consumption was correlated with HPV infection. An underlying reason could be that alcohol may increase sexual disinhibition, resulting in more unsafe sexual behaviors 37,38. Smoking is another potential risk factor. Multiple studies have posited that both passive and active cigarette smoking increase the risk of developing HPV infection, as smoking suppresses innate immunity and can cause structural and functional changes within the respiratory system 39-41. However, our data showed no significant correlation between smoking and HPV infection, possibly because of the limited number of HPV-infected participants and the relatively restricted region within which participants lived.
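As a rough check on group comparisons of this kind, the sketch below runs a Pearson Chi-square test on a 2 × 2 table built from the reported sialidase counts (124/1753 in the HPV-positive group and 487/10,875 in the HPV-negative group). It omits the continuity correction and may not match the exact SAS procedure the authors used; the function name is hypothetical.

```python
import math

def chi_square_2x2(pos_a, total_a, pos_b, total_b):
    """Pearson Chi-square test (1 degree of freedom, no continuity correction).

    Compares the proportion pos_a/total_a with pos_b/total_b.
    Returns (chi2 statistic, two-sided P value).
    """
    a, b = pos_a, total_a - pos_a
    c, d = pos_b, total_b - pos_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Sialidase-positive counts as reported: HPV-positive vs. HPV-negative group
chi2, p_value = chi_square_2x2(124, 1753, 487, 10875)
print(f"chi2 = {chi2:.2f}, P = {p_value:.2g}")
```

With these counts the statistic lies well above 10.83, the critical value for P = 0.001 at one degree of freedom, which agrees with the reported P < 0.001.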
Meanwhile, the probable protective factors were summarized. Our study found that a marriage age of 20 years or older (OR 0.79) and an age at first childbirth of 30 years or older (OR 0.67) were protective against HPV infection (P < 0.05). Consistent with these findings, Niyazi et al. reported that early marriage might be a risk factor for hrHPV infection 6. This phenomenon might be explained by the notion that older women are more likely to engage in safe sexual behaviors 6. Additionally, older women are believed to have greater awareness of genital hygiene and healthcare, which reduces the rate of HPV infection 42. Vaginal Candida infection was significantly negatively correlated with HPV infection (OR 0.62; 95% CI 0.48-0.8; P < 0.001). This might be because the presenting symptoms of Candida vaginitis, such as leukorrhea, vulvar pruritus, dyspareunia, and dysuria 43, encourage patients to seek timely gynecological screening. Additionally, Candida parapsilosis can form a biofilm on the surface of the genital tract, which may act as a shield against invasion by other microorganisms 44.
Some deficiencies of this study cannot be ignored. These include the limited number of HPV cases and the lack of information regarding participants' marital status, number of sexual partners, use of oral contraceptives, and history of sexually transmitted infections (STIs) such as gonorrhea, syphilis, and HIV, which might affect the vaginal-cervical microbiota 48,49. Further constraints might limit the wider applicability of our study. We neither investigated the details of alcohol consumption nor followed up on infection persistence, progression, or participant outcomes. In addition, our documented vaginal microecology focused narrowly on Candida of different fungal types; these preliminary clinical findings need further study together with other vaginal confounders, such as Gardnerella, Fusobacterium, bacterial vaginosis, and aerobic vaginitis 8,10,45,50.
Furthermore, we excluded patients with previous positive HPV results, with or without local drug administration, which might bias the observed distribution. Additionally, we did not further investigate participants infected with multiple HPV subtypes.
Despite these limitations, our study is in strong agreement with previous research. Additionally, this study is the first to explore the potential relationship between vaginal Candida and HPV infection in Changsha, Hunan. Our findings may provide valuable information to assist in the improvement of clinical HPV screening and CC prevention in local regions. To improve future research, larger sample sizes, optimization of the participant questionnaire, improved follow-up of participants, and in vitro experimentation might be considered.
In conclusion, we found that the prevalence of HPV infection and the distribution of its subtypes are relatively constant in Changsha, Hunan. Our data suggest that vaginal Candida infection is a protective factor against HPV infection and that more radical HPV management is required in the local Changsha area for perimenopausal women and those who regularly consume alcohol.

Materials and methods
Participants. Every female patient that attended the Physical Examination Outpatients Center at the Third Xiangya Hospital of Central South University between 11 August 2017 and 11 September 2018 was asked to complete a questionnaire and voluntarily sign written informed consent. This study was approved by the ethics committee of the Third Xiangya Hospital of Central South University (IRB No. 20017). All methods were carried out in accordance with relevant guidelines and regulations.
To be included in this study, participants had to satisfy all of the following inclusion criteria: (1) age ≥ 18 years old and no previous positive HPV results; (2) 51,52 were carefully recorded and double-checked by two research assistants. The vaginal and cervical samples were collected by a gynecologist, while the blood was drawn by a nurse. Vaginal samples were collected for vaginal microecology testing, cervical samples for HPV screening, and blood samples for laboratory testing. The Pentaplex Vaginitis Detection kit (Rhfay, Guangzhou) was used for vaginal ecology testing, including galactosidase, sialidase, leukocyte esterase, H2O2, and pH value. The diagnosis of Candida vaginitis was based on expert microscopic examination (spores and/or hyphae) and biochemical testing (galactosidase) of vaginal discharge. In our study, the criteria for diagnosing bacterial infection by the Amsel method were as follows: (1) uniform vaginal secretion; (2) pH > 4.5; (3) amine smell in secretion with 10% potassium hydroxide; (4) positive laboratory test results (Gram staining for bacteria in secretion or wet film for clue cells). The diagnosis was made if any three of these criteria were met, provided that the last one was among them.
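The Amsel decision rule as stated above (at least three of four criteria, with the laboratory criterion mandatory) can be written as a short Python function. This is only an illustration of the rule as described in the text; the function and parameter names are hypothetical, and it is not a clinical tool.

```python
def amsel_positive(uniform_secretion, ph, amine_smell, lab_positive):
    """Apply the rule described in the text.

    Criteria: (1) uniform vaginal secretion, (2) pH > 4.5,
    (3) amine smell with 10% potassium hydroxide, (4) positive laboratory result.
    Diagnosis requires at least three criteria, and criterion (4) must be met.
    """
    criteria = [bool(uniform_secretion), ph > 4.5, bool(amine_smell), bool(lab_positive)]
    return bool(lab_positive) and sum(criteria) >= 3

# Three criteria met including the laboratory result: diagnosis made
print(amsel_positive(True, 4.8, False, True))
# Three criteria met but the laboratory result is negative: no diagnosis
print(amsel_positive(True, 4.8, True, False))
```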
A total of 12,628 participants with complete medical records were enrolled in this study and retrospectively analyzed; 10,875 were non-HPV infected females and 1753 were HPV-infected females (Supplementary Information).
Statistical analysis. Statistical analyses were performed using Statistical Analysis System 9.4 (SAS Institute, USA). Continuous variables were analyzed by t-tests, and categorical variables were analyzed by Chi-square tests. The age distribution and subtype distribution were summarized based on HPV-positive cases.
The multivariate logistic regression risk model was used to investigate the risk and protective factors of HPV infection. A risk factor was defined as an odds ratio (OR) > 1 and a protective factor as an OR < 1. The multivariable regression model was established in three steps. First, univariate analyses were performed to identify which patient variables correlated with the presence of HPV infection at a significance of P < 0.05, and the analysis of variance (ANOVA) test and all variables were significantly different in the model (P < 0.05). Next, non-significant variables (P > 0.05) were removed, and stepwise regression was performed using the forward and backward method. Finally, the variables shown to be significant (P < 0.05), namely age, a vaginal microecology test sample positive for fungal infection, age at first sexual intercourse, age at first childbirth, and alcohol consumption, were included in the model. Analyses were performed using tenfold cross-validation, in which the data set is divided into 10 parts, with 9 parts used as training data and 1 part as test data. The final output was the area under the curve (AUC) of the receiver operating characteristic (ROC) curves, which was used to determine the model's classification ability; AUCs were compared to assess prediction accuracy. A P value of < 0.05 was considered statistically significant.
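The cross-validation procedure described above can be sketched in plain Python as follows. This is a minimal illustration on synthetic data with a single predictor, using a logistic regression fitted by gradient descent; it is not the authors' SAS workflow, and all function names, the learning rate, and the simulated data are assumptions made for the example.

```python
import math
import random

def kfold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and split the indices into k near-equal test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_logistic(xs, ys, lr=0.1, epochs=300):
    """One-predictor logistic regression fitted by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_auc(xs, ys, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, and average the fold AUCs."""
    fold_aucs = []
    for test_idx in kfold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        w, b = fit_logistic(train_x, train_y)
        scores = [w * xs[i] + b for i in test_idx]
        fold_aucs.append(auc(scores, [ys[i] for i in test_idx]))
    return sum(fold_aucs) / k

# Synthetic data (not study data): positives have a higher mean predictor value
rng = random.Random(42)
ys = [i % 2 for i in range(400)]
xs = [rng.gauss(1.0 if y else 0.0, 1.0) for y in ys]

mean_auc = cross_validated_auc(xs, ys)
print(f"mean tenfold AUC: {mean_auc:.3f}")
```

Because each fold is scored by a model that never saw it during fitting, the averaged AUC is a less optimistic estimate of classification ability than an AUC computed on the training data.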
Ethical statement. The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.