MicroRNA Binding-Mediated Functional Sequence Variant in the 3′-UTR of the DNA Repair Gene XPC in Age-Related Cataract

DNA oxidative damage repair is strongly involved in the pathogenesis of age-related cataract (ARC). Sequence variants in the coding regions of DNA repair genes have been shown to be associated with ARC. Single nucleotide polymorphisms (SNPs) in the 3′-untranslated region (3′-UTR) are known to alter gene expression by affecting the binding of microRNAs (miRNAs). We hypothesized that SNPs in the miRNA binding sites of certain DNA oxidative damage repair genes might be associated with ARC risk. We examined 10 miRNA-binding SNPs in the 3′-UTRs of 7 oxidative damage repair genes and found that the XPC-rs2229090 C allele was associated with the risk of the nuclear type of ARC (ARNC) in a Chinese population. Individuals carrying the variant G allele (CG and GG) of XPC-rs2229090 had higher XPC mRNA expression than individuals carrying the CC genotype. An in vitro assay showed that hsa-miR-589-5p down-regulated luciferase reporter gene expression in cells transfected with the rs2229090 C allele compared with the G allele. These results suggest that the C allele of XPC-rs2229090 increases the risk of ARNC. The underlying mechanism might be the stronger interaction of the C allele with hsa-miR-589-5p, resulting in lower XPC expression and DNA repair capability in the lens than in individuals carrying the G allele.


Selection of SNPs and genotyping.
Haplotype-tagging SNPs located in the 3′-UTRs of selected DNA repair genes were identified by searching the Han Chinese data in NCBI dbSNP (https://www.ncbi.nlm.nih.gov/snp). SNPs with a minor allele frequency (MAF) ≥ 10% were included, and SNPs in strong linkage disequilibrium (LD) with an adjacent variant (r² > 0.80) were excluded (Table 2).
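The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_snps`, the SNP identifiers and the MAF and r² values are hypothetical, and the greedy comparison with the previously kept neighbour is one simple reading of "adjacent variants", not necessarily the procedure the authors followed.

```python
def select_snps(snps, r2, maf_min=0.10, r2_max=0.80):
    """Keep SNPs with MAF >= maf_min that are not in strong LD with the last kept SNP.

    snps: list of (snp_id, maf) tuples ordered by chromosomal position.
    r2:   dict mapping frozenset({id_a, id_b}) to the pairwise r^2 value;
          pairs missing from the dict are treated as r^2 = 0 (an assumption).
    """
    kept = []
    for snp_id, maf in snps:
        if maf < maf_min:
            continue  # too rare: fails the MAF filter
        if kept and r2.get(frozenset((kept[-1], snp_id)), 0.0) > r2_max:
            continue  # redundant: strong LD with the adjacent kept SNP
        kept.append(snp_id)
    return kept


# Hypothetical input, not data from the study
example_snps = [("snpA", 0.25), ("snpB", 0.30), ("snpC", 0.05), ("snpD", 0.15)]
example_r2 = {frozenset(("snpA", "snpB")): 0.95}
selected = select_snps(example_snps, example_r2)
print(selected)
```

In this toy input, snpB is dropped because of its LD with snpA and snpC is dropped because of its low MAF.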
SNP genotyping was performed with the TaqMan genotyping assay (Thermo Fisher, Foster City, CA, USA) according to the manufacturer's instructions, as described in our previous publications 29,30.
In silico analysis. The PolymiRTS database 3.0 (http://compbio.uthsc.edu/miRSNP) and miRNA Target Detection (http://www.microrna.org/microrna/getGeneForm.do) were used to predict candidate miRNAs that bind the selected 3′-UTR sequences. LD was analyzed based on the 1000 Genomes data for the CEU population using the SNP Annotation and Proxy (SNAP) tool (http://www.broadinstitute.org/mpg/snap) 31. The online RNAhybrid program (http://bibiserv.techfak.uni-bielefeld.de/rnahybrid) was used to calculate the minimum free energy (MFE) of hybridization between miRNAs and their potential target sequences; pairs with MFE < −20 kcal/mol were selected, following previous studies 29,31.
Comet assay. The Comet assay (single-cell gel electrophoresis) is a sensitive technique for detecting DNA breaks. We measured DNA damage in LECs from the capsule samples and in lymphocytes from peripheral venous blood using a Comet assay kit (Trevigen, Gaithersburg, Maryland, USA) according to the manufacturer's protocol. LECs and lymphocytes were isolated and then suspended at 1 × 10⁴ cells/ml in PBS. Data were analyzed by measuring the percentage of DNA in the comet tail (%Tail DNA) and the Olive tail moment (OTM), as described in our previous studies 32,33.
Quantification of XPC mRNA expression. TaqMan gene expression assay probes (Thermo Fisher) were used for XPC mRNA quantification (assay ID: Hs01104205_g1). Human GAPDH (Hs02786624_g1) was used as the housekeeping gene control. Real-time PCR was performed on an ABI StepOnePlus real-time PCR system (Applied Biosystems, Foster City, CA, USA). The fold change in mRNA level was calculated using the 2^(−ΔΔCt) method.
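As a worked illustration of the 2^(−ΔΔCt) calculation named above, the following minimal Python sketch computes a fold change from four Ct values. The function name and the Ct values are hypothetical, and the sketch assumes that GAPDH is the reference gene and that a control sample serves as the calibrator.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target gene) - Ct(reference gene), computed per sample
    ddCt = dCt(sample of interest) - dCt(calibrator sample)
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: XPC and GAPDH in a test sample and in a calibrator sample
fc = fold_change(ct_target_sample=25.0, ct_ref_sample=18.0,
                 ct_target_calibrator=24.0, ct_ref_calibrator=18.0)
print(fc)  # ddCt = (25 - 18) - (24 - 18) = 1, so the fold change is 2^-1 = 0.5
```

A fold change below 1 means lower target expression in the sample than in the calibrator. The method assumes approximately equal amplification efficiency for the target and reference assays.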
The synthetic sequences of reporter construct (XPC-rs2229090) were as follows:
DNA sequencing was used to confirm the recombinant constructs.
Cell culture and transfection. The SRA01/04 cell line, derived from human lens epithelium, was purchased from the Chinese Academy of Sciences (Shanghai, China). The cells were cultured in Dulbecco's modified Eagle medium (DMEM) (Invitrogen) supplemented with 10% fetal bovine serum (FBS) (Lonza, Basel, Switzerland) and 1% penicillin-streptomycin solution (100 U/ml of penicillin and 0.1 mg/ml of streptomycin), as in our previous study 18, in a humidified atmosphere with 5% CO₂ at 37 °C. The cells were plated into 96-well plates at a density of 2 × 10⁵ cells/well. Transfection was conducted when the cells reached 60-70% confluence.
Luciferase reporter assay. Cells were harvested 48 h after transfection, and 100 μl of supernatant was taken from each well for the luminescence assay. A luciferase assay kit (Promega, Madison, Wisconsin) was used to measure luciferase activity. Experiments were repeated at least three times.
Statistical analysis. Statistical analyses were performed with Stata software (Stata Corp, College Station, TX). The χ² test was used to test the association between allele frequencies in the ARC groups and controls, and odds ratios (OR) with 95% confidence intervals (CI) were calculated. Hardy-Weinberg equilibrium (HWE) of the genotype distributions was also tested by the χ² test. Bonferroni correction was applied when a positive association was found in the initial allele analysis. Various genetic model analyses were performed to characterize the association. We present only the most significant model in the results. The values of the Comet assay were expressed as mean ± SD, and ANOVA was used to compare the Comet assay parameters between genotypes. The qRT-PCR and luciferase assays were repeated independently at least 3 times, and the data are presented as mean ± SD. The t test was used to compare the mean values of two groups. P < 0.05 was considered statistically significant.
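The allele-level tests described above can be illustrated with a short, dependency-free Python sketch. It is not the authors' Stata analysis: the function names and allele counts are hypothetical, the χ² statistic is the Pearson statistic without continuity correction, and the 95% CI uses the log (Woolf) method, which is an assumption because the text does not state how the interval was computed.

```python
import math


def chi2_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def allele_association(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square test, odds ratio and 95% CI for a 2x2 allele-count table."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf standard error (assumed method)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
          math.exp(math.log(odds_ratio) + 1.96 * se_log_or))
    return {"chi2": chi2, "p": chi2_p_value_1df(chi2), "or": odds_ratio, "ci": ci}


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


def hwe_chi2(n_major_hom, n_het, n_minor_hom):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Assumes both alleles are present, so that no expected count is zero.
    """
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1.0 - p
    observed = (n_major_hom, n_het, n_minor_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_p_value_1df(chi2)


# Hypothetical allele counts, not data from the study
result = allele_association(case_risk=30, case_other=70, control_risk=15, control_other=85)
adjusted_p = bonferroni(result["p"], n_tests=10)  # ten SNPs were tested
print(result, adjusted_p)
```

With ten SNPs tested, the Bonferroni adjustment multiplies each raw P value by 10, which is why only strong allele-level associations survive correction.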

Results
Characteristics of the participants for the association study. The participants of the study were recruited from the epidemiologic. Table 1 shows the general demographic characteristics of the study participants. There was no statistically significant difference in age or gender between ARC cases and controls (P > 0.05).
Bioinformatics selection of candidate SNPs. Ten SNPs in the 3′-UTRs of seven genes were selected for genotyping. Their basic information and predicted miRNAs are listed in Table 2.
Association between SNPs and risk of ARC. Among the ten SNPs, the allele frequency of XPC-rs2229090 in ARC cases differed significantly from that in controls, both before and after Bonferroni correction for multiple comparisons (P < 0.0001, Pa < 0.001) (Table 3). We then performed a stratification analysis to explore the involvement of the SNP in the subtypes of ARC. The frequency of the minor allele of XPC-rs2229090 was significantly lower in the C, N and M types of ARC than in the controls (P = 0.0372, P = 0.0001 and P = 0.008, respectively) (Table 4). However, after Bonferroni correction the difference remained significant only between ARNC and controls.
The genetic model analysis found that rs2229090 was associated with the relevant types of ARC in the dominant model and the recessive model. The associations remained after Bonferroni correction (P < 0.05) (Table 5).
The effects of rs2229090 on the mRNA levels of XPC in biopsy samples. As shown in Fig. 1A, XPC mRNA expression was lower in the LECs of the ARNC group than in those of the controls. Moreover, across all subjects, individuals carrying the minor G allele had higher XPC mRNA expression than those with the CC genotype (CG versus CC, P < 0.05; GG versus CC, P < 0.01) (Fig. 1B).
The correlation of rs2229090 with DNA breaks and ARC risk evaluated by Comet assay. Prominent comets, indicating many DNA breaks, were seen in the LECs and peripheral lymphocytes of ARNCs, whereas few were seen in the LECs and peripheral lymphocytes of the controls (Fig. 2A,B). The %Tail DNA and OTM measured by Comet assay in the lymphocytes and LECs of ARNCs and controls are shown in Table 6. The assay showed much more DNA damage in ARNCs than in controls (P < 0.001) (Fig. 2C). The %Tail DNA and OTM in LECs were positively correlated with those in lymphocytes (Fig. 2D). However, no correlation was found between DNA breaks in peripheral lymphocytes or LECs and the genotypes of rs2229090 (Fig. 2E).
Functional analysis of rs2229090. The SNP rs2229090 is located in a putative 3′-UTR binding site of hsa-miR-589-5p, and the C allele of rs2229090 was predicted to bind the miRNA more efficiently than the G allele (Fig. 3A).
To test whether rs2229090 has an allele-specific effect on XPC expression in the presence of hsa-miR-589-5p, we used a surrogate reporter gene: miRNA mimics or inhibitors were co-transfected with the constructed reporter plasmids (rs2229090-C and rs2229090-G) into SRA01/04 cells. With co-transfection of hsa-miR-589-5p mimics, the relative luciferase activity was lower for the reporter construct containing the rs2229090 C allele than for the construct containing the G allele. With co-transfection of hsa-miR-589-5p inhibitors, the relative luciferase activity was higher than in the controls without hsa-miR-589-5p for the C-allele reporter construct, but not for the G-allele reporter construct (P > 0.05) (Fig. 3B), indicating that the interaction between hsa-miR-589-5p and the C allele of the mRNA is more robust. We further examined whether hsa-miR-589-5p could inhibit XPC expression in the cell line by measuring XPC mRNA after transfecting SRA01/04 cells (CC genotype) with hsa-miR-589-5p mimics and inhibitors. XPC mRNA decreased when the mimics were added (P < 0.05) (Fig. 3C).
Table 5. Association between rs2229090 and the N type of ARC.

Discussion
DNA oxidative damage may lead to ARC, and its timely repair helps maintain the healthy status of LECs 11. DNA oxidative damage caused by various factors can be repaired by DNA damage repair enzymes. If these repair genes malfunction, the consequences for cells and organisms can be serious 10.
Many factors can make DNA repair inefficient, one of which is variation in DNA repair genes 34. SNPs are the most abundant form of DNA variation 13,14. We have previously shown the roles of some vital genes of the DSBR and NER pathways in ARC 4,26. In this study, we selected other genes of the NER and DSBR pathways 10,19 to explore whether they are also related to ARC formation. We found that only XPC-rs2229090 was associated with the risk of ARNC, with the C allele as the risk allele and the G allele as the protective allele. By in silico prediction, this SNP is located in the 3′-UTR of the gene, within the binding sequence of hsa-miR-589-5p. Luciferase reporter assays showed that hsa-miR-589-5p reduces luciferase activity in vitro in an allele-specific manner (C allele). We also found that XPC expression was lower in samples carrying the C allele in both the ARC and control groups. The contribution of this genetic component to ARC pathogenesis could be described as follows: assuming that individuals have similar levels of hsa-miR-589-5p in lens tissue, those with the C allele would have a stronger interaction with hsa-miR-589-5p, resulting in lower XPC expression and DNA repair capability than in individuals carrying the G allele.
NER is one of the best-established DNA repair mechanisms for maintaining genomic stability and integrity 35. XPC is a key component of the NER pathway and acts in the early steps of global genome NER 36,37, which is important for the removal of oxidative damage 35 and for regulation of the cell cycle in the DNA damage response 38. SNPs in the coding region of XPC have been associated with many diseases in some studies 39,40. MiRNAs participate in the regulation of gene expression by binding the 3′-UTR of target mRNAs, leading to mRNA degradation or translational repression 41. Our data add new evidence on the biology of SNP and miRNA interaction in the context of gene expression. Indeed, the GG or CG genotype of rs2229090 was associated with significantly increased XPC mRNA levels compared with the CC genotype, suggesting that the rs2229090 SNP has functional consequences for miRNA targeting. Our results indicate that the G allele of rs2229090 altered XPC mRNA expression, most likely through changes in the binding free energy with hsa-miR-589-5p. In our previous study, ARC patients had more DNA damage in peripheral lymphocytes and in LECs than controls, and the damage at the two sites was positively correlated 5. In this study, the degree of DNA damage in peripheral lymphocytes and LECs assessed by Comet assay was significantly higher in ARNCs regardless of genotype, but DNA damage did not differ between genotypes. Therefore, we believe that DNA damage in peripheral lymphocytes and LECs is a common phenomenon in the ARC group and is not associated with the different alleles.
Currently, we cannot explain why rs2229090 is exclusively associated with the N type of ARC. Previous reports showed that XPC binds to a wide variety of DNA damage, such as UV-induced photoproducts 42. Our previous study showed that UV-induced DNA damage leads to the formation of the N type of ARC 4. In the current study, we found that XPC expression with the C allele was lower in the N type of ARC than in controls. We speculate that lower XPC expression leads to a deficiency in repairing UV-induced damage, thus increasing the risk of ARNC.
In conclusion, our study focused on SNPs located in the 3′-UTRs of DNA repair genes to increase understanding of ARC pathogenesis. The results suggest that the miRSNP rs2229090 of XPC may influence an individual's susceptibility to ARNC in the Han Chinese population. The rs2229090 C allele of the XPC gene may be positively correlated with the risk of ARNC. The mechanism may be that hsa-miR-589-5p binds the C allele of rs2229090 more favorably, in terms of free energy, than the G allele, which further suppresses XPC expression at the post-transcriptional level and thus contributes to ARNC. This finding may provide a novel approach and a potential therapeutic target for ARNC management.

Data Availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.