Introduction

Uracil misincorporation occurs spontaneously at low levels as a result of cytosine deamination or misincorporation of deoxyuridine monophosphate (dUMP) during genomic DNA replication1,2. Under normal conditions, such lesions are rapidly repaired by the base excision repair mechanism initiated by uracil-DNA glycosylase enzymes3,4. The maintenance of DNA stability and integrity through DNA repair is critical in the prevention of cancer. Several factors are associated with an increased risk of cancer due to germline mutations in genes encoding DNA repair enzymes5. Five genes, UNG, SMUG1, MBD4, TDG, and DUT, are involved in the repair and prevention of uracil misincorporation into DNA, an anomaly that can lead to cancer-causing mutations. Little is known about the determinants of uracil misincorporation, including the effects of single nucleotide polymorphisms (SNPs) in the above-mentioned genes6. SNPs in DNA repair genes have been related to cancer risk as well7,8.

Cervical cancer is the third most common malignancy in women worldwide, and high-risk human papillomavirus (HR-HPV) is the primary etiological agent9. In China, cervical cancer has become the most common female cancer apart from breast cancer (98.9 per 100,000). The cervical cancer mortality rate is 30.5 per 100,000, and the incidence rate is increasing10. However, although ~80% of women will acquire an HPV infection during their lifetime, only a small proportion will progress to invasive cancer11. Pedigree studies show that cervical cancer has a significant heritable component, supporting a critical role of genetic susceptibility in cervical cancer etiology12.

Many studies, including two genome-wide association studies (GWAS), have identified susceptibility loci and genetic variants in cervical cancer13,14,15. However, these variants explain only part of the genetic susceptibility to cervical cancer. Additional genetic susceptibility loci therefore remain to be identified.

The enzyme deoxyuridine 5′-triphosphate nucleotidohydrolase (dUTPase) is essential for the viability of cells in all organisms16,17. dUTPase is encoded by the DUT gene in humans. It catalyzes the hydrolysis of deoxyuridine triphosphate (dUTP) to dUMP and pyrophosphate, and thus removes dUTP from the DNA synthesis pathway. High levels of dUTP can result in uracil misincorporation18. Chanson et al. reported that the DUT SNP rs4775748 is associated with decreased uracil concentrations. This suggests that DUT SNPs may influence cancer risk because elevated uracil misincorporation may induce mutagenic lesions6.

Thus far there have been no reports investigating the association between SNPs in the DUT gene and tumor susceptibility. In this study, we selected six SNPs in the DUT gene with minor allele frequency values greater than 0.05, and explored their distribution frequency in a case-control study (400 cervical squamous cell carcinomas (CSCCs), 400 precursor cervical intraepithelial neoplasia (CIN) III lesions and 1,200 normal controls). Our goal was to identify the relationship between DUT genetic variants and cervical cancer susceptibility.

Results

Correlation of DUT SNP genotypes with CIN III and CSCC risk

Genotypic and allelic frequencies of DUT SNPs rs3784619, rs3784621, rs11637235, rs10851465, rs28381106 and rs28381126 are shown in Table 1. Genotype distributions were in Hardy-Weinberg equilibrium.
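A Hardy-Weinberg check of this kind can be sketched as follows. The sketch assumes a one-degree-of-freedom Pearson chi-square test, which the text does not specify. It uses control genotype counts for rs3784619 (645 AA, 461 AG, 94 GG) that are inferred from the reported percentages and G-allele count rather than taken from Table 1. The function name `hwe_chi_square` is illustrative.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Pearson chi-square statistic (1 df) for departure from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p                      # frequency of the second allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# ASSUMPTION: control counts for rs3784619 (AA, AG, GG) inferred from the
# reported 53.8% / 38.4% / 7.8% of 1,200 controls and 649 G alleles.
chi2 = hwe_chi_square(645, 461, 94)

# 3.841 is the chi-square critical value for 1 df at alpha = 0.05.
in_hwe = chi2 < 3.841
print(f"chi2 = {chi2:.3f}, consistent with HWE: {in_hwe}")
```

Under these assumed counts the statistic falls well below the 1-df critical value, in line with the statement that the genotype distribution is in equilibrium.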

Table 1 Correlation of DUT SNPs with the risk of CIN III and CSCC.

Our results indicate that four DUT SNPs, rs3784621, rs10851465, rs28381106 and rs28381126, were not correlated with CIN III and CSCC risk (Table 1).

The AA, AG, and GG genotype frequencies of the DUT SNP rs3784619 were 53.8%, 38.4% and 7.8% in the controls, 48.0%, 36.0% and 16.0% in the CIN III group, and 45.3%, 34.0% and 20.8% in the CSCC group. The homozygous GG genotype of rs3784619 was associated with a significantly increased risk of the precursor lesion CIN III (OR = 2.29 (1.60–3.27), P < 0.001) and of CSCC (OR = 3.15 (2.24–4.41), P < 0.001). The frequency of the G allele at rs3784619 was significantly higher in CIN III (272/800, 34.0%) and CSCC (302/800, 37.8%) than in normal controls (649/2400, 27.0%). The G allele was associated with an increased risk of CIN III (OR = 1.39 (1.17–1.65), P < 0.001) and CSCC (OR = 1.64 (1.38–1.94), P < 0.001). Carriers of the G allele (AG + GG) at rs3784619 were at higher risk for CIN III (OR = 1.26 (1.00–1.58), P = 0.046) and CSCC (OR = 1.41 (1.12–1.77), P = 0.003).
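The allelic odds ratios for rs3784619 can be reproduced from the allele counts given above. The following is a minimal sketch, assuming a Wald 95% confidence interval on the log odds ratio. The function name `odds_ratio_wald` and the z value of 1.96 are assumptions for illustration, not details stated in the text.

```python
import math


def odds_ratio_wald(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Odds ratio from a 2x2 table with a Wald confidence interval on the log scale."""
    odds_ratio = (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se = math.sqrt(
        1 / case_exposed + 1 / case_unexposed + 1 / ctrl_exposed + 1 / ctrl_unexposed
    )
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# G versus A allele counts at rs3784619 as reported in the text.
controls_g, controls_a = 649, 2400 - 649
or_cin, lo_cin, hi_cin = odds_ratio_wald(272, 800 - 272, controls_g, controls_a)
or_cscc, lo_cscc, hi_cscc = odds_ratio_wald(302, 800 - 302, controls_g, controls_a)

print(f"CIN III: OR = {or_cin:.2f} ({lo_cin:.2f}-{hi_cin:.2f})")
print(f"CSCC:    OR = {or_cscc:.2f} ({lo_cscc:.2f}-{hi_cscc:.2f})")
```

After rounding to two decimals, these values agree with the reported 1.39 (1.17–1.65) for CIN III and 1.64 (1.38–1.94) for CSCC.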

The CC, CT, and TT genotype frequencies of the rs11637235 SNP were 46.8%, 41.1% and 12.2% in the controls, 40.3%, 38.3% and 21.5% in the CIN III group, and 34.8%, 36.8% and 28.5% in the CSCC group. The homozygous TT genotype of rs11637235 was associated with an increased risk of CIN III (OR = 2.05 (1.49–2.82), P < 0.001) and CSCC (OR = 3.15 (2.32–4.29), P < 0.001). The frequency of the T allele at rs11637235 was significantly higher in the CIN III (325/800, 40.6%) and CSCC (375/800, 46.9%) groups than in controls (785/2400, 32.7%). The T allele was associated with an increased risk of CIN III (OR = 1.41 (1.19–1.66)) and CSCC (OR = 1.82 (1.54–2.14)). Carriers of the T allele (CT + TT) at rs11637235 were at higher risk for CIN III (OR = 1.30 (1.04–1.64), P = 0.024) and CSCC (OR = 1.65 (1.30–2.09), P < 0.001).

Correlation of DUT SNP genotypes with HR-HPV-positive CIN III and CSCC risk

In the HR-HPV-positive group, DUT SNPs rs3784621, rs10851465, rs28381106 and rs28381126 were not correlated with CIN III or CSCC risk. However, the homozygous GG genotype of the rs3784619 SNP was associated with an increased risk of CIN III (OR = 2.44 (1.32–4.49), P = 0.004) and CSCC (OR = 3.32 (1.73–6.38), P < 0.001). The G allele was associated with an increased risk of CIN III (OR = 1.44 (1.09–1.90)) and CSCC (OR = 1.73 (1.27–2.35)) (Table 2).

Table 2 Correlation of DUT SNPs with the risk of CIN III and CSCC in HPV-positive cases.

The homozygous TT genotype of the rs11637235 SNP was also associated with an increased risk of CIN III (OR = 2.71 (1.59–4.59), P < 0.001) and CSCC (OR = 4.04 (2.26–7.21), P < 0.001). The T allele was associated with an increased risk of CIN III (OR = 1.68 (1.29–2.19)) and CSCC (OR = 2.18 (1.62–2.94)).

Association of DUT rs3784619 and rs11637235 polymorphisms with sexual behavior and reproductive history in CIN III and CSCC

Cases were divided into two groups according to sexual behavior and reproductive history. A stratified analysis was then performed with the DUT rs3784619 (A/G) and rs11637235 (C/T) genotypes (Table 3). The rs3784619 GG genotype was enriched among CIN III (χ2 = 4.089, P = 0.043) and CSCC (χ2 = 7.228, P = 0.007) patients who had more than one sexual partner.

Table 3 Association between DUT rs3784619 polymorphisms and the risk of CIN III and CSCC, stratified by sexual behavior and reproductive history.

For rs11637235 (C/T), enrichment was found only in HR-HPV-positive cases in the CIN III (χ2 = 4.542, P = 0.033) and CSCC (χ2 = 5.226, P = 0.022) groups (Table 4).

Table 4 Association between DUT rs11637235 polymorphisms and the risk of CIN III and CSCC, stratified by sexual behavior and reproductive history.

Linkage disequilibrium analysis between DUT rs3784619 and rs11637235 genotypes

We conducted a linkage disequilibrium analysis between the rs3784619 (A/G) and rs11637235 (C/T) genotypes because both were associated with an increased risk of CIN III or cervical carcinoma. The frequencies of the nine genotype combinations are shown in Table 5. The GG (rs3784619)-TT (rs11637235) combination was not detected in CIN III or CSCC cases and was detected in only one normal control. Compared to the AA (rs3784619)-CC (rs11637235) combination, AA-TT (OR = 2.59 (1.70–3.94), P < 0.001), AG-TT (OR = 2.29 (1.36–3.85), P = 0.002), GG-CC (OR = 2.72 (1.59–4.68), P < 0.001), and GG-CT (OR = 3.01 (1.83–4.96), P < 0.001) were significantly associated with an increased risk of CIN III. Notably, the increase in risk was greater for CSCC (AA-TT (OR = 3.62 (2.36–5.55), P < 0.001), AG-TT (OR = 5.08 (3.15–8.19), P < 0.001), GG-CC (OR = 5.28 (3.18–8.78), P < 0.001), and GG-CT (OR = 4.23 (2.56–6.99), P < 0.001)). Thus, the high-risk combinations were those carrying either the GG homozygote of rs3784619 (A/G) or the TT homozygote of rs11637235 (C/T).

Table 5 DUT haplotypes (rs3784619-rs11637235) in CIN III and CSCC cases.

Additionally, most GG (rs3784619) genotypes occurred in the GG-CT combination (36/64 in CIN III, 43/83 in CSCC), while most TT (rs11637235) genotypes occurred in the AA-TT combination (57/86 in CIN III, 63/109 in CSCC). These results revealed that the distribution of GG (rs3784619) and TT (rs11637235) genotypes was mostly caused by linkage disequilibrium of the corresponding alleles. Thus, the specific linkage disequilibrium between rs3784619 (A/G) and rs11637235 (C/T) could be used as a predictor of CIN III and CSCC.

Discussion

Mechanisms for the prevention of uracil misincorporation into DNA and the removal of misincorporated uracil from DNA are essential for maintaining genomic integrity. Failure to remove misincorporated uracil can result in mutations or double-stranded DNA breaks following DNA replication, and ultimately in chromosomal abnormalities, all of which are hallmarks of cancer2,19,20,21. Mammalian cells have enzymes, such as dUTPases, that prevent the incorporation of uracil into DNA by hydrolyzing dUTP to dUMP. In humans, the only known dUTPase is encoded by the DUT gene2,22. This gene encodes an essential enzyme for nucleotide metabolism. The encoded protein forms a ubiquitous, homotetrameric enzyme that hydrolyzes dUTP to dUMP and pyrophosphate. This process serves two cellular goals: providing a precursor (dUMP) for the synthesis of thymine nucleotides needed for DNA replication, and limiting intracellular dUTP levels. Elevated dUTP levels lead to increased incorporation of uracil into genomic DNA, which induces extensive excision repair mediated by uracil glycosylase and ultimately causes DNA fragmentation and cell death18. Furthermore, the redundant dUMP is used in a salvage pathway for synthesizing deoxythymidine triphosphate23.

Chanson et al. examined the relationship between 23 genetic variants in five uracil-processing genes and uracil concentrations in whole blood DNA in 431 participants of the Boston Puerto Rican Health Study6. They found that four SNPs in DUT, UNG, and SMUG1 had a significant association with DNA uracil concentrations. The SNPs in SMUG1 (rs2029166 and rs7296239) and UNG (rs34259) were associated with increased uracil concentrations, whereas the DUT SNP (rs4775748) was associated with decreased uracil concentrations. These results suggested that the four SNPs in DUT, UNG, and SMUG1 may have a direct effect on uracil concentrations in normal cells, thereby affecting post-DNA replication uracil misincorporation rates, resulting mutations, and ultimately, cancer risk.

Anogenital HPV infections are the most common sexually transmitted infections, with a prevalence of 70 million cases and an incidence of 14 million cases per year in the United States24,25,26. Fifteen HR-HPV subgenotypes can lead to cancer of the cervix, penis, vagina, vulva, and oropharynx27. HPV-16 and HPV-18 are the most prevalent HR-HPV subgenotypes in HPV-associated cancers, accounting for approximately 70% of cervical cancers, with the other HR-HPV subtypes (31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 68, 73, and 82) accounting for the remaining cervical cancer cases28. Nearly all cervical cancers are HPV-associated, including CSCC, cervical adenocarcinomas, and other histological cervical tumors27. Most HR-HPV-related cervical infections are asymptomatic, and more than 90% of detected infections clear within about two years29.

Although several contributing factors in cervical cancer development have been identified, mainly intrinsic genetic factors and extrinsic factors such as HR-HPVs, genetic factors show great potential for use as susceptibility or prognostic indicators30,31. A large body of epidemiological evidence indicates that genetic variants are associated with cervical cancer risk15.

Broderick et al. detected novel germline sequence variations in TDG, UNG and SMUG1 in colorectal cancer cases with familial aggregation, suggesting that these variants may play a role in disease susceptibility32. Other studies have described an association of a SNP in MBD4 with increased risk of lung33,34 and esophageal cancer35.

Many studies have reported a significant correlation between SNPs in DNA repair genes and susceptibility to different cancers7,36. We previously found associations between SNPs in the XPD, XPG and PARP-1 repair genes and susceptibility to CSCC and HR-HPV infection37,38.

To date, no studies have investigated the correlation between SNPs of the DUT gene and cancer susceptibility. The development of cervical carcinoma usually proceeds through multiple stages, from precursor CIN lesions to malignant cervical carcinoma. In the present study, we found no associations between four SNPs of the DUT gene (rs3784621, rs10851465, rs28381106 and rs28381126) and the risk of CIN III or CSCC. In contrast, the homozygous GG genotype of rs3784619 and the homozygous TT genotype of rs11637235 were associated with a higher risk of CIN III and CSCC. Carriers of the G allele (AG + GG) at rs3784619 and carriers of the T allele (CT + TT) at rs11637235 were also at higher risk for CIN III and CSCC. The same homozygous genotypes were associated with a higher risk of CIN III and CSCC in the HR-HPV-positive group. These results suggest that the homozygous GG genotype of rs3784619 and the homozygous TT genotype of rs11637235 in the DUT gene may play a role in the initiation and progression of precancerous lesions (CIN) and cervical carcinoma. The present study is the first to report an association between DUT rs3784619 and rs11637235 genetic variants and any solid tumor type.

SNP loci that affect the structure and function of genes are typically located in the 5′ UTR/promoter region, the coding region, or the 3′ UTR. Although the two variants we investigated are both intronic and therefore do not alter the amino acid sequence, they may be in linkage disequilibrium with other functional genetic variants and could therefore serve as genetic biomarkers of susceptibility39. The DUT rs3784619 and rs11637235 variants could also influence pre-mRNA splicing and regulation, and thereby affect dUTPase protein expression or produce an alternatively spliced transcript. Very little is known about the functional effects of uracil-processing gene SNPs. It is thought that these SNPs can lead to altered enzyme activity, contributing to increased uracil concentrations and therefore uracil misincorporation, and ultimately resulting in human diseases including malignant cancers2.

In stratified analyses of the sexual and reproductive histories of patients, we found that the GG genotype of rs3784619 was particularly enriched among CIN III and CSCC patients with more than one sexual partner. Meanwhile, the TT genotype of rs11637235 was enriched in HR-HPV-positive cases of CIN III and CSCC. These data suggest that there may be a correlation between the GG genotype of rs3784619 and female sexual behavior, and that the TT genotype of rs11637235 may affect HR-HPV infection at the early onset of disease.

The analysis of the haplotypes of rs3784619 (A/G) and rs11637235 (C/T) revealed that the AA-TT, AG-TT, GG-CC, and GG-CT combinations were significantly associated with an increased risk of CIN III and CSCC, with the risk being much greater for CSCC. Additionally, most GG genotypes of rs3784619 occurred in the GG-CT combination, while most TT genotypes of rs11637235 occurred in the AA-TT combination. Our data showed that the rs3784619 (A/G) and rs11637235 (C/T) polymorphisms of the DUT gene correlated with an increased risk of CIN III and CSCC. Haplotype analysis showed a significantly greater risk of cancer occurrence, particularly for the linked GG-CT and AA-TT combinations. These results revealed that the distribution of GG (rs3784619) and TT (rs11637235) genotypes was mostly caused by linkage disequilibrium of the corresponding alleles. Therefore, the significantly higher odds ratios suggest a synergistic effect of polymorphisms in the DUT gene, which could be an important factor affecting susceptibility to CIN III or CSCC.

In conclusion, these findings shed light on two polymorphisms (rs3784619 and rs11637235) in the DUT gene that are associated with a higher risk of CIN III and CSCC. Most GG genotypes of rs3784619 and TT genotypes of rs11637235 occurred in the GG-CT and AA-TT combinations, respectively. The two investigated SNPs of the DUT gene could be used as early predictive biomarkers of CIN III and CSCC and, in addition, may play a role in HR-HPV infection.

Methods

Ethics statement

This research project was approved by the Medical Ethical Committee of Women’s Hospital, School of Medicine, Zhejiang University (approval number 2004002). All patients provided written informed consent to participate in the study. The study protocol was carried out in accordance with the approved guidelines and regulations.

Study subject selection, sexual and reproductive histories, and HR-HPV infection status

Four hundred CSCC patients, 400 CIN III patients, and 1,200 normal control volunteers were selected from Zhejiang Province, China. Pathological diagnoses were made by two pathologists under double-blind conditions. Normal controls were healthy female volunteers attending the hospital for routine physical examinations between June 2004 and December 2008. Normal control volunteers without gynecological neoplasm, cytological findings, endometriosis, other solid cancers or immune disorders were included. Of the included participants, 201 CSCC patients, 357 CIN III patients and 609 normal control volunteers agreed to provide cervical brush samples for HR-HPV detection.

The numbers of individuals younger than/older than 40 years were 602/598, 258/142 and 160/240 for the normal control, CIN III and CSCC groups, respectively. The CSCC group had more individuals older than 40 years (χ2 = 12.431, P < 0.001) than the normal control group. The CIN III group had more individuals younger than 40 years (χ2 = 24.793, P < 0.001) than the normal control group.
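The age comparisons above are consistent with a Pearson chi-square test on 2×2 tables without continuity correction. This is an inference from the reported statistics rather than a stated method, and the function name `chi_square_2x2` is illustrative. A minimal sketch using the reported counts:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: group; columns: younger than 40 years, older than 40 years.
controls = (602, 598)
cin3 = (258, 142)
cscc = (160, 240)

chi2_cscc = chi_square_2x2(*controls, *cscc)
chi2_cin3 = chi_square_2x2(*controls, *cin3)

print(f"CSCC vs controls:    chi2 = {chi2_cscc:.3f}")
print(f"CIN III vs controls: chi2 = {chi2_cin3:.3f}")
```

With these counts the sketch gives approximately 12.431 for CSCC and 24.793 for CIN III, matching the reported values.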

Statistically significant differences were identified only for the number of parities in the CIN III and CSCC groups (χ2 = 4.627, P = 0.031; χ2 = 20.49, P < 0.001). No statistically significant differences were observed for other aspects of sexual and reproductive history, including age at first intercourse (≤20 years old, >20 years old), number of sexual partners (≤1 partner, >1 partner) and age at first birth (≤22 years old, >22 years old), among the carcinoma, CIN III and control groups. In the normal control, CIN III and CSCC groups, the infection rate of HR-HPV was 31.4%, 86.8% and 88.6%, respectively. The infection rate of HR-HPV in the CIN III and CSCC groups was higher than in the control group (χ2 = 277.107, P < 0.001; χ2 = 199.315, P < 0.001).

SNP selection

We searched for SNPs in the DUT gene in the dbSNP database of the National Center for Biotechnology Information, U.S. National Library of Medicine (https://www.ncbi.nlm.nih.gov/snp/). We used the filter option (filters activated: snp, minor allele frequency from 0.05 to 0.5, by-1000 Genomes, by-cluster, by-frequency, by-2hit-2allele) to obtain six eligible SNPs in the DUT gene (rs3784619 (A/G), rs3784621 (C/T), rs11637235 (C/T), rs10851465 (C/T), rs28381106 (G/T) and rs28381126 (G/T)). All six SNPs are located in introns of the DUT gene.

DNA Extraction and Genotyping

A genomic DNA extraction kit was used to extract whole genomic DNA from peripheral white blood cells according to the manufacturer’s instructions (Sangon, Shanghai, China). All genomic DNA was dissolved in distilled water and frozen until further use.

The six intronic SNPs of the DUT gene were genotyped by a modified mismatch amplification polymerase chain reaction (MA-PCR) as described previously37. PCR forward and reverse primer sequences and product lengths are shown in Table S1. In brief, PCR was performed in a 30 µL reaction mixture containing 50 ng of genomic DNA, 5.0 pmol of each primer, 0.2 mM of each dNTP and 1.5 units of Taq DNA polymerase (TAKARA, Dalian, China). Cycling conditions were as follows: initial denaturation at 94 °C for 5 minutes; 35 cycles of 94 °C for 30 seconds, annealing at 55–58 °C for 30 seconds, and elongation at 72 °C for 45 seconds; and a final extension at 72 °C for 10 minutes. PCR products were electrophoresed on a 1.5% agarose gel and visualized. All samples were tested twice by two different technicians in a double-blind fashion, and the results were reproducible.

HR-HPV infection determination

HR-HPV infection was determined with the Hybrid Capture II assay kit (Digene Inc., USA) using probe B, which contains a set of RNA probes for HR-HPV types 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68. Cervical samples for HR-HPV testing were obtained from cervical brushings with the Digene Cervical Sampler.

Statistical analysis

Binary logistic regression analysis was used to analyze the correlation of DUT SNP genotypes with the risk of CIN III and CSCC. Odds ratios (ORs), 95% confidence intervals (CIs) and P-values are reported. The normal control group was used as the reference. Stratified analyses of sexual behavior, reproductive history and the genotype distribution of DUT SNPs were conducted with a Kruskal-Wallis H test. All statistical tests were two-sided. Statistical significance was defined as a P value less than or equal to 0.05. All analyses were performed using SPSS statistical software version 18.0 for Windows.