Genetic variants of the dUTPase-encoding gene DUT increase HR-HPV infection rate and cervical squamous cell carcinoma risk

Deoxyuridine 5′-triphosphate nucleotidohydrolase (dUTPase) is involved in the repair and prevention of uracil misincorporations into DNA. Maintenance of DNA integrity is critical for cancer prevention. Many studies have identified susceptibility loci and genetic variants in cervical cancer. The aim of this study was to explore the distribution frequency of six single nucleotide polymorphisms (SNPs) in the dUTPase-encoding gene DUT in a case-control study to identify the relationship between DUT genetic variants and cervical cancer susceptibility. Six DUT intronic SNPs (rs28381106, rs3784619, rs10851465, rs28381126, rs3784621 and rs11637235) were genotyped by mismatch amplification-PCR in 400 cervical squamous cell carcinomas (CSCCs), 400 precursor cervical intraepithelial neoplasia (CIN) III lesions and 1,200 normal controls. No correlations were found between four DUT SNPs (rs3784621, rs10851465, rs28381106 and rs28381126) and CIN III and CSCC risk. However, the homozygous GG genotype of rs3784619 and the homozygous TT genotype of rs11637235 correlated significantly with increased risk of CIN III and CSCC (OR = 2.29, 2.05; OR = 3.15, 3.15, respectively). Individuals carrying the G allele or a G-carrier genotype (AG + GG) at rs3784619, or the T allele or a T-carrier genotype (CT + TT) at rs11637235, were at higher risk for CIN III and CSCC (OR = 1.26, 1.30; OR = 1.41, 1.65, respectively). Similarly, in the human papillomavirus (HPV)-positive groups, we found that the homozygous GG genotype of rs3784619 and the homozygous TT genotype of rs11637235 markedly increased the risk of CIN III and CSCC (OR = 2.44, 2.71; OR = 3.32, 4.04, respectively). In a stratified analysis of sexual and reproductive histories, the GG genotype of rs3784619 was particularly enriched among patients with more than one sexual partner in CIN III (P = 0.043) and CSCC (P = 0.007). Meanwhile, the TT genotype of rs11637235 was enriched in the high-risk HPV (HR-HPV)-positive cases of CIN III (P = 0.033) and CSCC (P = 0.022). Analysis of the haplotypes of rs3784619 (A/G) and rs11637235 (C/T) revealed that the combined genotypes AA-TT (OR = 2.59), AG-TT (OR = 2.29), GG-CC (OR = 2.72) and GG-CT (OR = 3.01 (1.83–4.96)) were significantly associated with increased risk of CIN III. More notably, this risk was much greater for CSCC (AA-TT (OR = 3.62), AG-TT (OR = 5.08), GG-CC (OR = 5.28) and GG-CT (OR = 4.23)). Additionally, most GG genotypes of rs3784619 were linked as GG-CT, while most TT genotypes of rs11637235 were linked as AA-TT. In conclusion, these findings suggest that the homozygous GG genotype of rs3784619 and the homozygous TT genotype of rs11637235 in the DUT gene significantly increase the risk of CIN III and CSCC. Most GG genotypes of rs3784619 and TT genotypes of rs11637235 were linked as GG-CT and AA-TT, respectively. The TT genotype of rs11637235 was enriched in the HR-HPV-positive cases. These two SNPs of the DUT gene could be used as early predictive biomarkers of CIN III and CSCC, and may be involved in HR-HPV infection.


Correlation of DUT SNP genotypes with CIN III and CSCC risk. Genotypic and allelic frequencies of DUT SNPs rs3784619, rs3784621, rs11637235, rs10851465, rs28381106 and rs28381126 are depicted in Table 1. Genotype distributions were in Hardy-Weinberg equilibrium.
Our results indicate that four DUT SNPs, rs3784621, rs10851465, rs28381106 and rs28381126, were not correlated with CIN III and CSCC risk (Table 1).
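As an illustration of the Hardy-Weinberg check mentioned above, the following minimal Python sketch runs a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The function name and the example counts are hypothetical and are not taken from Table 1; the source does not state which test or software the authors used.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi2, p_value) for a Hardy-Weinberg goodness-of-fit test (1 df).

    Assumes both alleles are present, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1 - p                        # frequency of the second allele
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts (AA, AG, GG) for a control group
chi2, p_value = hwe_chi_square(500, 400, 100)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P value above the chosen threshold (commonly 0.05) means that no departure from Hardy-Weinberg proportions was detected.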

Correlation of DUT SNP genotypes with HR-HPV-positive CIN III and CSCC risk.
In the HR-HPV-positive group, DUT SNPs rs3784621, rs10851465, rs28381106 and rs28381126 were not correlated with CIN III or CSCC risk. However, the homozygous GG genotype of the rs3784619 SNP was associated with an increased risk of CIN III (OR = 2.44 (1.32-4.49), P = 0.004) and CSCC (OR = 3.32 (1.73-6.38), P < 0.001).

Association between DUT rs3784619 and rs11637235 polymorphisms and sexual behavior and reproductive history in CIN III or CSCC.
Cases were divided into two groups according to sexual behavior and reproductive history. A stratified analysis was then performed with the DUT rs3784619 (A/G) and rs11637235 (C/T) genotypes (Table 3). The rs3784619 GG genotype was more strongly enriched in CIN III (χ² = 4.089, P = 0.043) and CSCC (χ² = 7.228, P = 0.007) when the patient had more than one sexual partner.
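To make the form of the odds ratios reported in the results explicit, the sketch below computes a crude odds ratio with a Woolf (log-scale) 95% confidence interval from a 2 × 2 table comparing a risk genotype with the reference genotype. The counts are hypothetical, and the calculation makes no covariate adjustment, so it should not be expected to reproduce the published estimates.

```python
import math


def odds_ratio_ci(case_risk, case_ref, ctrl_risk, ctrl_ref, z=1.96):
    """Crude odds ratio and Woolf confidence interval for a 2x2 table.

    case_risk / case_ref: cases with the risk / reference genotype
    ctrl_risk / ctrl_ref: controls with the risk / reference genotype
    All four counts must be non-zero.
    """
    odds_ratio = (case_risk * ctrl_ref) / (case_ref * ctrl_risk)
    # Standard error of log(OR)
    se = math.sqrt(1 / case_risk + 1 / case_ref + 1 / ctrl_risk + 1 / ctrl_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: GG vs AA at one SNP, cases vs controls
odds_ratio, lower, upper = odds_ratio_ci(30, 200, 40, 600)
print(f"OR = {odds_ratio:.2f} ({lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 corresponds to a statistically significant association at the 5% level under this crude model.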

Discussion
Mechanisms for the prevention of uracil misincorporation into DNA and the removal of misincorporated uracil from DNA are essential for maintaining genomic integrity. Failure to remove misincorporated uracil can result in mutations or double-stranded DNA breaks following DNA replication and, ultimately, in chromosomal abnormalities; both are hallmarks of cancer 2,19-21. Mammalian cells have enzymes, such as dUTPases, that prevent the incorporation of uracil into DNA by dephosphorylating dUTP into dUMP. In humans, the only known dUTPase is encoded by the DUT gene 2,22. This gene encodes an essential enzyme for nucleotide metabolism. The encoded protein forms a ubiquitous, homotetrameric enzyme that hydrolyzes dUTP to dUMP and pyrophosphate. This process serves two cellular goals: providing a precursor (dUMP) for the synthesis of thymine nucleotides needed for DNA replication, and limiting intracellular dUTP levels. Elevated dUTP levels lead to increased incorporation of uracil into genomic DNA, which induces extensive excision repair mediated by uracil glycosylase and ultimately causes DNA fragmentation and cell death 18. Furthermore, the excess dUMP is used in a salvage pathway for synthesizing deoxythymidine triphosphate 23.
Chanson et al. examined the relationship between 23 genetic variants in five uracil-processing genes and uracil concentrations in whole blood DNA in 431 participants of the Boston Puerto Rican Health Study 6 . They found that four SNPs in DUT, UNG, and SMUG1 had a significant association with DNA uracil concentrations. The SNPs in SMUG1 (rs2029166 and rs7296239) and UNG (rs34259) were associated with increased uracil concentrations, whereas the DUT SNP (rs4775748) was associated with decreased uracil concentrations. These results suggested that the four SNPs in DUT, UNG, and SMUG1 may have a direct effect on uracil concentrations in normal cells, thereby affecting post-DNA replication uracil misincorporation rates, resulting mutations, and ultimately, cancer risk.
Anogenital HPV infections are the most common sexually transmitted infections, with a prevalence of 70 million cases and an incidence of 14 million cases per year in the United States 24-26. Fifteen HR-HPV subgenotypes can lead to cancer of the cervix, penis, vagina, vulva, and oropharynx 27. HPV-16 and HPV-18 are the most prevalent HR-HPV subgenotypes in HPV-associated cancers, accounting for approximately 70% of cervical cancers, with the other HR-HPV subtypes (31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 68, 73, and 82) accounting for the remaining cervical cancer cases 28. Nearly all cervical cancers are HPV-associated, including CSCC, cervical adenocarcinomas, and other histological cervical tumors 27. Most HR-HPV-related cervical infections are asymptomatic, and more than 90% of detected infections clear within about two years 29.
Although several contributing factors in cervical cancer development have been identified, mainly intrinsic genetic factors and extrinsic factors such as HR-HPVs, genetic factors show great potential for use as susceptibility or prognostic indicators 30,31. A large body of epidemiological evidence indicates that genetic variants are associated with cervical cancer risk 15.
Broderick et al. detected novel germline sequence variations in TDG, UNG and SMUG1 in colorectal cancer cases with familial aggregation, suggesting that these variants may play a role in disease susceptibility 32 . Other studies have described an association of a SNP in MBD4 with increased risk of lung 33,34 and esophageal cancer 35 .
Many studies report a significant correlation between SNPs in DNA repair genes and susceptibility to different cancers 7,36. We previously found an association between SNPs in the XPD, XPG and PARP-1 repair genes and susceptibility to CSCC and HR-HPV infection 37,38.
To date, no studies have been performed to investigate the correlation between SNPs of the DUT gene and cancer susceptibility. The development of cervical carcinoma usually requires multiple stages, progressing from precursor CIN lesions to malignant cervical carcinoma. In the present study, we found no associations between four SNPs of the DUT gene (rs3784621, rs10851465, rs28381106 and rs28381126) and increased risk of CIN III or CSCC. On the other hand, the homozygous GG genotype of rs3784619 and the homozygous TT genotype of rs11637235 were associated with a higher risk of CIN III and CSCC. Individuals carrying the G allele or a G-carrier genotype (AG + GG) at rs3784619, or the T allele or a T-carrier genotype (CT + TT) at rs11637235, were at higher risk for CIN III and CSCC. In the HPV-positive groups, the homozygous GG genotype of rs3784619 and the homozygous TT genotype of rs11637235 were likewise associated with a higher risk of CIN III and CSCC. These results show that the homozygous GG genotype of rs3784619 and the homozygous TT genotype of rs11637235 in the DUT gene may play a role in the initiation and progression of precancerous lesions (CIN) and cervical carcinoma. The present study is the first to report an association between the DUT rs3784619 and rs11637235 genetic variants and any solid tumor type.
SNP loci that affect the spatial structure and function of genes are typically located in the promoter or 5′ UTR, the coding region, or the 3′ UTR. Although the two variants we investigated are both intronic, and therefore cannot alter the amino acid sequence, they may still be in linkage disequilibrium with other functional genetic variants and can therefore serve as genetic biomarkers of susceptibility 39. The DUT rs3784619 and rs11637235 genetic variants could also influence primary mRNA splicing and regulation, and could therefore affect dUTPase protein expression or give rise to an alternative splice variant. Very little is known about the functional effects of uracil-processing gene SNPs. It is thought that these SNPs can lead to altered enzyme activity, contributing to increased uracil concentrations and therefore uracil misincorporation, and ultimately resulting in human diseases including malignant cancers 2. In stratified analyses of the sexual and reproductive histories of patients, we found that the GG genotype of rs3784619 was particularly enriched in the group with more than one sexual partner for CIN III and CSCC. Meanwhile, the TT genotype of rs11637235 was enriched in HR-HPV-positive cases of CIN III and CSCC. These data suggest that there may be a correlation between the GG genotype of rs3784619 and female sexual behavior. The TT genotype of rs11637235 may affect HR-HPV infection at early onset of disease.
The analysis of the haplotypes of rs3784619 (A/G) and rs11637235 (C/T) revealed that the combined genotypes AA-TT, AG-TT, GG-CC, and GG-CT were significantly associated with increased risk of CIN III and CSCC, with the risk being much greater for CSCC. Additionally, most GG genotypes of rs3784619 were linked as GG-CT, while most TT genotypes of rs11637235 were linked as AA-TT. Our data showed that the rs3784619 (A/G) and rs11637235 (C/T) polymorphisms of the DUT gene correlated with increased risk of CIN III and CSCC. Haplotype analysis showed a significantly greater risk of cancer occurrence, particularly for the linked genotype combinations GG-CT and AA-TT. These results indicate that the distribution of the GG (rs3784619) and TT (rs11637235) genotypes was mostly driven by linkage disequilibrium of the corresponding alleles. Therefore, the significantly higher odds ratios suggest a synergistic effect of polymorphisms in the DUT gene, which could be an important factor affecting susceptibility to CIN III or CSCC.
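The combined-genotype tabulation behind statements such as "most GG genotypes of rs3784619 were linked as GG-CT" can be sketched as follows. This is a minimal illustration on hypothetical genotype pairs, not the authors' haplotype procedure, which the text does not describe in detail; the function names are invented for this example.

```python
from collections import Counter


def combined_genotype_counts(samples):
    """Count rs3784619-rs11637235 genotype combinations, e.g. 'GG-CT'."""
    return Counter(f"{first}-{second}" for first, second in samples)


def shares_within_first_locus(counts, first_locus_genotype):
    """Fraction of each combination among carriers of one rs3784619 genotype."""
    subset = {
        combo: n
        for combo, n in counts.items()
        if combo.split("-")[0] == first_locus_genotype
    }
    total = sum(subset.values())
    return {combo: n / total for combo, n in subset.items()}


# Hypothetical (rs3784619, rs11637235) genotype pairs for a group of cases
cases = (
    [("GG", "CT")] * 6
    + [("GG", "CC")] * 2
    + [("AA", "TT")] * 5
    + [("AG", "TT")] * 1
    + [("AA", "CC")] * 10
)
counts = combined_genotype_counts(cases)
gg_shares = shares_within_first_locus(counts, "GG")
print(counts)
print(gg_shares)
```

Applying the same tabulation separately to cases and controls gives the per-combination counts from which odds ratios against a reference combination could then be estimated.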
In conclusion, these findings shed light on two polymorphisms (rs3784619 and rs11637235) in the DUT gene that are associated with a higher risk of CIN III and CSCC. Most GG genotypes of rs3784619 and TT genotypes of rs11637235 were linked as GG-CT and AA-TT, respectively. The two investigated SNPs of the DUT gene could be used as early predictive biomarkers of CIN III and CSCC and may, in addition, play a role in HR-HPV infection.

Ethics statement. This research project was authorized by the Medical Ethical Committee of Women's Hospital, School of Medicine, Zhejiang University (approval number 2004002). All patients provided written informed consent to participate in the study. The research protocol was carried out in accordance with approved guidelines and regulations.
The numbers of individuals younger than/older than 40 years were 602/598, 258/142 and 160/240 for the normal control, CIN III and CSCC groups, respectively. The CSCC group had more individuals older than 40 years (χ² = 12.431, P < 0.001) than the normal control group, whereas the CIN III group had more individuals younger than 40 years (χ² = 24.793, P < 0.001). Statistically significant differences were only identified for the number of parities in the CIN III and CSCC groups (χ² = 4.627, P = 0.031; χ² = 20.49, P < 0.001). Statistically significant differences were not observed for sexual and reproductive histories, including age at first intercourse (≤20 years old, >20 years old), number of sexual partners (≤1 partner, >1 partner) and age at first birth (≤22 years old, >22 years old), among the carcinoma, CIN III and control groups. In the normal control, CIN III and CSCC groups, the HR-HPV infection rate was 31.4%, 86.8% and 88.6%, respectively. The HR-HPV infection rates in the CIN III and CSCC groups were higher than in the control group (χ² = 277.107, P < 0.001; χ² = 199.315, P < 0.001).
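The age-distribution comparisons can be checked directly from the counts given in the text. The sketch below applies a Pearson chi-square test without continuity correction to each 2 × 2 table; that this is the test the authors used is an assumption, although the resulting statistics agree with the reported values to three decimal places.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts of individuals younger than / older than 40 years, as given in the text
controls = (602, 598)
groups = {"CIN III": (258, 142), "CSCC": (160, 240)}

for label, counts in groups.items():
    chi2, p_value = chi_square_2x2(*controls, *counts)
    print(f"{label} vs controls: chi2 = {chi2:.3f}, P = {p_value:.2e}")
```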

SNPs Selection.
We searched the SNP status of the DUT gene with the SNP library option on the website of the National Center for Biotechnology Information, U.S. National Library of Medicine (https://www.ncbi.nlm.nih.gov/snp/). We used the filter option (filters activated: snp, minor allele frequency from 0.05 to 0.5, by-1000 Genomes, by-cluster, by-frequency, by-2hit-2allele) to obtain six effective SNPs in the DUT gene (rs3784619 (A/G), rs3784621 (C/T), rs11637235 (C/T), rs10851465 (C/T), rs28381106 (G/T) and rs28381126 (G/T)). All six SNPs are located in intronic regions of the DUT gene.
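The minor allele frequency window used in the filter can be expressed as a simple local check. The sketch below is only an illustration: the identifiers and frequencies are hypothetical placeholders, the inclusive bounds are an assumption, and the code does not query the NCBI database or reproduce its other filters.

```python
def filter_by_maf(snp_maf, low=0.05, high=0.5):
    """Return SNP identifiers whose minor allele frequency lies in [low, high]."""
    return [snp_id for snp_id, maf in snp_maf.items() if low <= maf <= high]


# Hypothetical identifiers and frequencies, not real database records
candidates = {
    "snp_common": 0.21,
    "snp_rare": 0.01,
    "snp_boundary": 0.05,
}
selected = filter_by_maf(candidates)
print(selected)
```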
DNA Extraction and Genotyping. A genomic DNA extraction kit was used to extract whole genomic DNA from peripheral white blood cells according to the manufacturer's instructions (Sangon, Shanghai, China). All genomic DNA was dissolved in distilled water and frozen until further use.
The six intronic SNPs of the DUT gene were detected by a modified mismatch amplification-polymerase chain reaction (MA-PCR) as described previously 37. PCR forward and reverse primer sequences and product lengths are shown in Table S1. In brief, the PCR was performed in a 30 µL reaction mixture containing 50 ng of genomic DNA, 5.0 pmol of each primer, 0.2 mM of each dNTP and 1.5 units of Taq DNA polymerase (TAKARA, Dalian, China). The PCR was performed under the following conditions: initial denaturation at 94 °C for 5 minutes, followed by 35 cycles of 94 °C for 30 seconds, 55-58 °C for 30 seconds for annealing, and 72 °C for 45 seconds for elongation, and a final step of 72 °C for 10 minutes. PCR products were then separated by electrophoresis.

Table 5. DUT haplotypes (rs3784619-rs11637235) in CIN III and CSCC cases. Underlined values show significant differences. a Haplotypes were composed of two SNPs of the DUT gene: rs3784619 (A/G) and rs11637235 (C/T). * The P-values are standardized by age, age at first intercourse, number of sexual partners, age at first full-term pregnancy and number of parities.
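For orientation, the reaction set-up and cycling programme described under DNA Extraction and Genotyping can be turned into a few derived quantities. The sketch below uses only the amounts and times stated in the text; the function name is hypothetical, and the programme length ignores thermocycler ramp times.

```python
def programme_seconds(initial, cycles, cycle_steps, final):
    """Total hold time of a PCR programme in seconds (ramp times not included)."""
    return initial + cycles * sum(cycle_steps) + final


REACTION_VOLUME_UL = 30.0

# 5.0 pmol of each primer in 30 µL; pmol per µL is numerically equal to µM
primer_um = 5.0 / REACTION_VOLUME_UL
# 50 ng of genomic DNA in 30 µL
dna_ng_per_ul = 50.0 / REACTION_VOLUME_UL

total_s = programme_seconds(
    initial=5 * 60,            # 94 °C, 5 minutes
    cycles=35,
    cycle_steps=(30, 30, 45),  # denaturation, annealing, elongation (seconds)
    final=10 * 60,             # 72 °C, 10 minutes
)
print(f"primer: {primer_um:.3f} µM each, DNA: {dna_ng_per_ul:.2f} ng/µL")
print(f"programme hold time: {total_s} s ({total_s / 60:.2f} min)")
```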