Polymorphism in ERCC1 confers susceptibility of coronary artery disease and severity of coronary artery atherosclerosis in a Chinese Han population

The excision repair cross-complementing 1 (ERCC1) gene encodes the ERCC1 protein, a rate-limiting element of nucleotide excision repair (NER) that is mainly responsible for repairing DNA damage in various diseases, including coronary artery atherosclerosis. Using a three-stage case-control study with 3037 coronary artery disease (CAD) patients and 3002 controls, we investigated the associations of three single nucleotide polymorphisms (SNPs) with CAD risk and the severity of coronary artery atherosclerosis in a Chinese Han population. In the discovery set, the variant allele T of rs11615 was significantly associated with higher CAD risk (adjusted OR = 1.27, P = 0.006) and greater severity of coronary artery atherosclerosis (adjusted OR = 1.54, P = 0.003). These associations reached stronger statistical significance in the merged set (adjusted OR = 1.23, P = 8 × 10^−6 for CAD risk; adjusted OR = 1.36, P = 4.3 × 10^−5 for severity of coronary artery atherosclerosis). The expression level of ERCC1 was significantly higher in CAD cases than in controls. Multiplicative interactions among SNP rs11615, alcohol drinking, history of type 2 diabetes mellitus (T2DM), and history of hyperlipidemia were associated with a 5.06-fold increase in CAD risk (P = 1.59 × 10^−9). No significant association of rs2298881 or rs3212986 with CAD risk was identified. Taken together, SNP rs11615 in the ERCC1 gene might confer susceptibility to CAD and to more severe coronary atherosclerosis in a Chinese Han population.

of ERCC1 was related highly to the increase and decrease of infarct volume in the ischemic brain, respectively 12 . Mice with ERCC1 deficiency showed vascular dysfunction, including elevated blood pressure and increased vascular stiffness 13 , as well as elevated serum cholesterol levels and altered expression of genes involved in cholesterol metabolism, which are considered traditional risk factors for CAD 14,15 . Based on the function of ERCC1 and the effects of ERCC1 deficiency on brain ischemia, vascular dysfunction, and cholesterol metabolism, we hypothesized that SNPs related to ERCC1 expression might also contribute to CAD risk, since CAD and stroke are both atherosclerotic cardiovascular diseases linked to high cholesterol levels. We therefore chose three SNPs in ERCC1 for further study. The first is the synonymous C19007T (rs11615) polymorphism located in exon IV, which has been reported to be associated with differential ERCC1 mRNA and protein levels 16 . The second is C8092A (rs3212986) in the 3′-untranslated region (UTR), which may regulate ERCC1 expression by affecting ERCC1 mRNA stability 17 . The third is rs2298881, located in intron I, which has been reported to alter ERCC1 expression by affecting ERCC1 promoter activity 18 .
Therefore, in the present study, we performed a three-stage case-control study to investigate the association of these three SNPs with CAD risk and severity of coronary artery atherosclerosis in a Chinese Han population.

Results
Clinical characteristics of study population. Baseline and clinical characteristics of the study populations are shown in Table 1. In all three independent populations, cases and controls differed significantly in body mass index (BMI) and in the frequencies of smoking, alcohol drinking, hypertension, T2DM and hyperlipidemia. The genotype distributions of all three SNPs in controls were consistent with Hardy-Weinberg equilibrium (HWE) (Supplementary Table S1).
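For readers unfamiliar with the HWE check, the following is a minimal sketch of a Pearson chi-square test with one degree of freedom. The function name and the genotype counts are hypothetical and are not taken from Supplementary Table S1; the study itself ran this test in SPSS.

```python
import math

def hwe_chi_square(n_cc, n_ct, n_tt):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Returns the chi-square statistic and its P value.
    """
    n = n_cc + n_ct + n_tt
    p = (2 * n_cc + n_ct) / (2 * n)      # frequency of the C allele
    q = 1 - p                            # frequency of the T allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_cc, n_ct, n_tt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts (illustration only)
chi2, p_value = hwe_chi_square(n_cc=600, n_ct=350, n_tt=50)
```

A P value above 0.05 would be read as no evidence of departure from HWE in controls.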
Association results of ERCC1 SNPs with CAD risk. In the discovery set, the minor allele T of SNP rs11615 was associated with significantly increased CAD risk (adjusted OR = 1.27, P = 0.006) after adjusting for age, sex, BMI, smoking, alcohol drinking, hypertension, T2DM and hyperlipidemia (Supplementary Table S2). To verify this association, we genotyped SNP rs11615 in the two subsequent sets and obtained similar significant results (adjusted OR = 1.19, P = 0.021 in the validation set and adjusted OR = 1.23, P = 0.01 in the replication set) (Supplementary Table S2). Moreover, the association reached much stronger significance in the combined set (3037 cases and 3002 controls), with an adjusted OR of 1.23 and an adjusted P value of 8 × 10^−6 (Table 2). Assuming a minor allele frequency (MAF) of 0.258 in CAD cases and 0.222 in controls and an OR of 1.23, the combined set provided a power of 92.4% to detect the association at a type I error rate of 0.05.
We then performed genotypic analyses under the dominant model to explore the potential effect of rs11615 on CAD risk. In the discovery set, SNP rs11615 was significantly associated with CAD risk in the dominant model (adjusted OR = 1.27, P = 0.022) (Supplementary Table S2). This association was verified in the validation and replication sets and persisted in the combined set (adjusted OR = 1.24, P = 1 × 10^−4 in the dominant model) (Table 2). All significant associations in the combined set remained significant after Bonferroni correction (Table 2).
However, we found no evidence supporting an association of SNP rs2298881 or rs3212986 with CAD risk in this Chinese Han population (Table 2 and Supplementary Table S2).
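As a rough consistency check on the allele frequencies quoted above, the crude (unadjusted) allelic odds ratio can be computed directly from the two MAFs. This is only a sketch with a hypothetical helper name; it does not reproduce the covariate-adjusted logistic regression behind the reported OR of 1.23.

```python
def allelic_odds_ratio(maf_cases, maf_controls):
    """Crude odds ratio for carrying the minor allele, cases vs controls."""
    odds_cases = maf_cases / (1 - maf_cases)
    odds_controls = maf_controls / (1 - maf_controls)
    return odds_cases / odds_controls

# MAFs of rs11615 as quoted in the text (0.258 in cases, 0.222 in controls)
crude_or = allelic_odds_ratio(0.258, 0.222)
print(round(crude_or, 2))  # 1.22
```

The crude value is close to the adjusted estimate, which suggests that covariate adjustment changed the allelic effect only slightly.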
Subgroup analyses on the association of SNP rs11615 with CAD risk. Using a dominant model with and without adjustment for covariates, subgroup analyses found significant associations of variant genotypes (CT + TT) of SNP rs11615 with increased CAD risk in almost all subgroups, except for participants without an alcohol drinking habit (Table 3). After Bonferroni correction, significant associations remained in the older subgroup (>60) and in participants with alcohol drinking, T2DM and hyperlipidemia (Table 3). In addition, the multiplicative likelihood ratio test indicated that variant genotypes (CT + TT) of SNP rs11615 interacted with alcohol drinking (P inter = 0.004), T2DM (P inter = 0.013) and hyperlipidemia (P inter = 0.018) to increase CAD risk (Table 3).
Classification and regression tree (CART) analyses. Based on the results of the multiplicative interaction analyses, alcohol drinking, T2DM, hyperlipidemia and SNP rs11615 genotype were included in the CART analyses. Hyperlipidemia had the greatest influence on CAD risk among the factors included, consistent with its being the initial split node in the CART analyses. Inspection of the CART tree revealed a 5.06-fold (P = 1.59 × 10^−9) increased risk of CAD in participants with variant genotypes (CT + TT) of SNP rs11615, alcohol drinking and histories of T2DM and hyperlipidemia compared with the reference group, which may be evidence of genetic and environmental interactions (Fig. 1).
Associations of SNP rs11615 with severity of coronary artery atherosclerosis. For SNP rs11615, Gensini scores were significantly higher in CT + TT carriers than in CC carriers in the discovery set (P = 0.002), validation set (P = 0.004), replication set (P = 0.015) and merged set (P = 1.6 × 10^−5) (Fig. 2). We then classified CAD cases into two groups based on the median Gensini score (32.5). Multivariate logistic regression analyses indicated that the variant genotypes (CT + TT) of SNP rs11615 were associated with higher Gensini scores in the merged set (adjusted OR = 1.36, P = 4.3 × 10^−5) (Fig. 2). These associations were more distinct in the female and hypertension subgroups and remained statistically significant after Bonferroni correction (Table 4).
Correlations of SNP rs11615 with ERCC1 mRNA expression and plasma ERCC1 level. We measured ERCC1 mRNA expression in 363 subjects (110 CAD patients vs 253 controls) and plasma ERCC1 level in 78 subjects (39 CAD patients vs 39 controls), all of whom were randomly selected from the validation and replication sets. After adjusting for covariates, ANCOVA models showed that ERCC1 mRNA expression was higher in CAD patients than in controls (1.74 ± 0.36 vs 1.53 ± 0.59, P = 0.001). A significant association between variant genotypes (CT + TT) and decreased ERCC1 mRNA expression was found in CAD patients (P = 0.033) but not in controls (P = 0.488) (Fig. 3). CAD patients also had a numerically higher plasma ERCC1 level than controls (612.35 ± 82.33 vs 580.62 ± 65.85, P = 0.239), but the difference was not statistically significant. No association was found between SNP rs11615 genotypes and plasma ERCC1 level.

Discussion
This three-stage case-control study with 3037 CAD cases and 3002 controls revealed, for the first time, that the minor allele T of SNP rs11615 was associated with significantly increased CAD risk in a Chinese Han population, especially in the older subgroup (>60) and in participants with alcohol drinking, T2DM and hyperlipidemia. CART analyses showed that variant genotypes (CT + TT) of SNP rs11615, combined with these three CAD risk factors, were associated with up to a 5.06-fold increase in CAD risk. Furthermore, variant genotypes (CT + TT) of SNP rs11615 contributed to the severity of coronary artery atherosclerosis. In addition, CAD patients had significantly higher ERCC1 mRNA expression levels than controls; however, the CT + TT genotype was significantly associated with a lower ERCC1 expression level only in CAD patients. Previous studies have reported that mice with ERCC1 deficiency showed a higher mutation frequency, increased genomic instability, a dramatic accumulation of unrepaired lesions and, subsequently, the development of cardiovascular disease such as CAD 13,19 . In this study, we found a higher frequency of variant genotypes (CT + TT) in CAD patients, and patients with variant genotypes (CT + TT) had a decreased level of ERCC1 expression. Consistent with this, a previous cell-line study reported that ERCC1 levels in a human ovarian cancer cell line (MCAS) carrying the T allele of SNP rs11615 were reduced to almost 60% of those in a human ovarian cancer cell line (A2781/CP70) with the wild-type ERCC1 sequence 20 . We therefore inferred that the T allele of SNP rs11615 was associated with increased risk and severity of CAD in the Chinese Han population, probably by reducing the expression level of ERCC1 compared with the C allele. However, the underlying mechanism needs further study.
Regarding the mechanism underlying the association between rs11615 genotype and ERCC1 expression, it is widely accepted that the synonymous polymorphism (rs11615) at codon 118 of the ERCC1 gene, which converts a high-usage codon (AAC) to a relatively infrequent one (AAT), could affect ERCC1 translation and the level of ERCC1 protein, and thereby impair repair activity, because codon usage is decreased by almost half 21,22 . Subsequent bioinformatic analysis using ENCODE at UCSC (http://genome.ucsc.edu/ENCODE/) further revealed that the variant allele of SNP rs11615 changed the binding site of the ZNF263 transcription factor, which might influence the expression of the target gene 23 . Moreover, the Epigenome Browser (http://epigenomegateway.wustl.edu/) showed that the wild-type allele of SNP rs11615 is located in a region with potential enhancer activity and that the variant allele at this locus could down-regulate ERCC1 expression (Supplementary Fig. S1). Therefore, we speculated that SNP rs11615 might increase CAD risk by decreasing ERCC1 expression at the transcriptional level rather than at the translational level.
Table 3. Subgroup analyses for the association of SNP rs11615 with CAD risk. N, number; OR (95% CI), odds ratio (95% confidence interval); SNP, single nucleotide polymorphism; BMI, body mass index; T2DM, type 2 diabetes mellitus. *Adjusted OR (95% CI) and P adj values were obtained from logistic regression analyses after adjusting for age, sex, BMI, smoking status, alcohol drinking and histories of T2DM, hyperlipidemia and hypertension. † P inter values were obtained from the multiplicative likelihood ratio test to assess the interactions between SNP rs11615 and selected variables in CAD risk. Bold values indicate statistical significance after the Bonferroni correction (P < 0.05/40 = 0.00125).
In the present study, we found an association of variant genotypes of SNP rs11615 with ERCC1 mRNA expression levels only in CAD patients, not in controls. This result could arise from the limited sample size, from the predominance of DNA damage in patients, or because peripheral blood cells do not fully mirror plaque tissue. DNA damage occurs mainly in atherosclerotic plaque, ranging from deletions of parts of chromosomes to DNA strand breaks and DNA adducts, and is more serious there than in peripheral white blood cells 7,24 . It is well established that differing extents of DNA damage are accompanied by varying degrees of expression of DNA repair-related genes. Consistent with this, immunoreactivity for 7,8-dihydro-8-oxo-2′-deoxyguanosine (8-oxo-dG), an oxidative DNA damage marker, was detected in vascular smooth muscle cells (VSMCs), macrophages and endothelial cells of atherosclerotic plaque, but not in VSMCs of adjacent normal media 25 . Therefore, ERCC1 expression levels in peripheral blood reflect the repair status of the coronary artery in CAD patients only to a limited extent, and future studies are needed to measure ERCC1 expression in atherosclerotic plaque. To explore potential genetic and environmental interaction effects on CAD risk, we performed CART analyses and found that participants with variant genotypes (CT + TT) of SNP rs11615, alcohol drinking and histories of T2DM and hyperlipidemia tended to have increased CAD risk. Heavy alcohol drinking generates large amounts of reactive oxygen species (ROS), which subsequently induce a variety of DNA damage as well as low density lipoprotein (LDL) oxidation 26 . Accumulation of unrepaired DNA damage, resulting directly from alcohol drinking or indirectly from deficiency of the DNA repair enzyme ERCC1, contributes to the progression of CAD by activating the release of inflammatory cytokines 13 . It has also been reported that the interaction of SNP rs11615 with alcohol drinking could increase the risk of laryngeal cancer 27 . Regarding history of T2DM, recent studies have shown that abnormal glucose metabolism could enhance ERCC1 expression and protein levels by activating the release of insulin, and that mice with ERCC1 deficiency showed a progeroid phenotype with disturbed glucose metabolism 28,29 . As for hyperlipidemia, lipid oxidation and peroxidation are common causes of DNA damage and thereby aggravate the development of atherosclerosis.
In addition, ERCC1 deficiency could affect lipid metabolism by up-regulating genes related to extracellular efflux of cholesterol 14,15 . All of the above evidence, combined with the reported effect of SNP rs11615 on ERCC1 expression, supports the view that the development of CAD could be further aggravated by interactions between SNP rs11615 and traditional CAD risk factors (alcohol drinking, T2DM and hyperlipidemia). However, further functional studies are needed to explore the potential mechanism of this interaction.
The present study evaluated the severity of coronary artery atherosclerosis by calculating the Gensini score and found that the variant genotypes (CT + TT) of SNP rs11615 were significantly associated with higher Gensini scores. This result is supported by a recent study in which knockdown of ERCC1 expression increased infarct volume in the ischemic rat brain 12 and by the report that ERCC1-deficient mice showed increased vascular stiffness and vascular dysfunction 13 . Therefore, in light of the association of SNP rs11615 with ERCC1 expression, we speculated that SNP rs11615 might accelerate the progression of coronary artery atherosclerosis.
Some limitations should be taken into consideration. First, we genotyped only the three most common SNPs in ERCC1 and did not explore the effect of other ERCC1 genetic variants on CAD risk. Second, although we adjusted for common CAD risk factors, other genetic and environmental factors could also be involved in the development of CAD. Finally, ERCC1 expression and protein levels in plaques were not measured in our study because of limited sample availability.
In conclusion, this three-stage case-control study of 3037 cases and 3002 controls suggested, for the first time, a significant association of SNP rs11615 with CAD risk as well as possible interactions among SNP rs11615, alcohol drinking and histories of T2DM and hyperlipidemia. Functional studies are required to validate our findings and to elucidate the potential mechanism.

Materials and Methods
Study population. This three-stage case-control study involved 3037 CAD patients and 3002 controls drawn from three sets: the discovery set (study 1) with 806 CAD cases and 816 controls from Zhongnan Hospital of Wuhan University between January 2011 and December 2012; the validation set (study 2) with 1124 CAD cases and 1118 controls from Wuhan Asia Heart Hospital between March 2013 and October 2014; and the replication set (study 3) with 1107 CAD cases and 1068 controls from the above two centers between March 2015 and May 2016. All study participants were of self-reported Han ethnicity.
The diagnostic criterion for CAD was stenosis of more than 50%, confirmed by coronary angiography, in at least one segment of a main coronary artery or its main branches. Patients with the following conditions were excluded: cardiac diseases including acute heart failure, congenital heart disease, myocardial bridge, cardiomyopathy and coronary artery spasm, as well as systemic conditions such as malignancy, autoimmune disease, severe liver or renal disease and immunosuppressive drug use. Controls were age- and gender-matched participants without detectable luminal stenosis on coronary angiography (1054 controls) and healthy individuals in whom physical examination revealed none of the above-mentioned cardiac or systemic diseases (1948 controls). The following data were extracted: traditional CAD risk factors 30,31 , including cigarette smoking, alcohol drinking and histories of hypertension, type 2 diabetes mellitus (T2DM) and hyperlipidemia (Supplementary materials and methods), and clinical data such as blood pressure, body mass index (BMI), fasting plasma glucose (FPG) and lipid levels. This study and the informed consent procedure were approved by the Medical Institutional Review Board of Zhongnan Hospital of Wuhan University and conformed to the guidelines of the Declaration of Helsinki.

Selection of SNPs and genotyping.
A large number of studies have investigated the association of single nucleotide polymorphisms (SNPs) in the ERCC1 gene with cancer. Based on the 1000 Genomes database (http://www.1000genomes.org/) 32 , the three most common SNPs were selected: rs11615 in exon IV, rs3212986 in the 3′-untranslated region (UTR), and rs2298881 in intron I (Supplementary Table S1).
We extracted genomic DNA from peripheral blood leukocytes using a phenol/chloroform method and then genotyped the three SNPs by high-throughput sequencing on the Illumina MiSeq system (Illumina, San Diego, CA) (Supplementary genotyping methods and Supplementary Fig. S2) 33,34 . Direct PCR sequencing was performed to confirm the accuracy of genotyping (Supplementary Fig. S3). Detailed information on genotyping and direct sequencing, such as primer sequences and PCR conditions, is provided in Supplementary Table S3.
Scoring of coronary angiogram. In the Gensini scoring system 35,36 , angiographic stenosis of each coronary artery segment was scored as 1 point for the range of 0-25%, 2 for 26-50%, 4 for 51-75%, 8 for 76-90%, 16 for 91-99% and 32 for 100%. Then, each coronary artery branch corresponds to a coefficient, ranging from 0.5 to 5, depending on the location of stenosis and importance of areas supplied by that segment. A patient's final Gensini score is the sum of the weighted scores for each stenosis segment.
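The scoring rule described above can be sketched as follows. The severity bands are those stated in the text; the segment weights in the example are hypothetical placeholders within the stated 0.5 to 5 range, not the actual per-segment coefficients, and the function names are illustrative.

```python
# (upper bound of stenosis range in %, severity points), as given in the text
SEVERITY_BANDS = [(25, 1), (50, 2), (75, 4), (90, 8), (99, 16), (100, 32)]

def severity_points(stenosis_percent):
    """Map the angiographic stenosis of one lesion to its severity points."""
    if not 0 <= stenosis_percent <= 100:
        raise ValueError("stenosis must be between 0 and 100 percent")
    for upper_bound, points in SEVERITY_BANDS:
        if stenosis_percent <= upper_bound:
            return points

def gensini_score(lesions):
    """Sum of severity points times segment weight over all stenotic segments.

    `lesions` is an iterable of (stenosis_percent, segment_weight) pairs.
    """
    return sum(severity_points(s) * weight for s, weight in lesions)

# Hypothetical patient with three lesions (weights are illustrative only)
example_lesions = [(80, 2.5), (40, 1.0), (100, 0.5)]
example_score = gensini_score(example_lesions)  # 8*2.5 + 2*1.0 + 32*0.5 = 38.0
```

With the study's median of 32.5 as the cut-off, this hypothetical patient would fall in the higher-severity group.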

Figure 2.
Genetic estimates of association of SNP rs11615 with Gensini score. Variant genotypes (CT + TT) of SNP rs11615 were associated with a higher Gensini score (median) in the discovery set (a), validation set (b), replication set (c) and merged set (d), and were also associated with increased severity of coronary artery atherosclerosis in the independent and merged sets (e). P, non-adjusted P; P adj , adjusted P.

CART analyses.
To evaluate potential high-order genetic and environmental interactions in CAD risk, we carried out classification and regression tree (CART) 37 analyses using Clementine 12.0 (SPSS Inc, Chicago, IL, USA). CART uses the Gini index as the splitting criterion to build a hierarchical classification tree and thereby find the combination of genetic and environmental factors that best predicts CAD risk 38 . Logistic regression analyses were used to assess the association of each terminal node (TN) with CAD risk.
Real-time quantitative PCR analysis of ERCC1 mRNA expression. Total RNA was extracted from human peripheral blood leukocytes using Trizol reagent (Invitrogen, Carlsbad, CA, USA), treated with RNase-free gDNA Eraser to remove DNA contamination, and then reverse-transcribed using a reverse transcriptase kit (Takara Bio Inc, Kusatsu, Shiga, Japan). The cDNA product was used to determine ERCC1 mRNA expression by real-time quantitative PCR (RT-qPCR) with the SYBR-Green kit on a CFX96 Touch system (Bio-Rad, Hercules, CA, USA). The 2^−ΔΔCq method was used to calculate the relative expression of ERCC1, normalized to the internal reference gene (GAPDH).
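To illustrate the Gini splitting criterion mentioned above, here is a minimal sketch for a binary outcome (case vs control). It shows only the impurity calculation for one candidate split; it is not the Clementine implementation, and the function names and counts are hypothetical.

```python
def gini_impurity(n_cases, n_controls):
    """Gini impurity of a node containing cases and controls."""
    total = n_cases + n_controls
    if total == 0:
        return 0.0
    p = n_cases / total
    return 2 * p * (1 - p)  # equals 1 - p**2 - (1 - p)**2

def gini_gain(left, right):
    """Impurity reduction when a parent node is split into two children.

    `left` and `right` are (n_cases, n_controls) tuples for the child nodes.
    """
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    parent = gini_impurity(left[0] + right[0], left[1] + right[1])
    children = (n_left / n) * gini_impurity(*left) \
             + (n_right / n) * gini_impurity(*right)
    return parent - children

# Hypothetical split on one risk factor: exposed vs unexposed participants
example_gain = gini_gain(left=(300, 150), right=(200, 350))
```

A tree builder would evaluate this gain for every candidate factor at a node and split on the one with the largest value, which is how a factor such as hyperlipidemia can end up as the initial split node.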
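The 2^−ΔΔCq calculation can be sketched as below. The sketch assumes roughly 100% amplification efficiency for both ERCC1 and GAPDH, and the Cq values and the choice of calibrator sample are hypothetical, since the text does not specify them.

```python
def relative_expression(cq_target_sample, cq_ref_sample,
                        cq_target_calibrator, cq_ref_calibrator):
    """Relative expression by the 2^(-ddCq) method.

    dCq  = Cq(target) - Cq(reference gene), computed per sample
    ddCq = dCq(sample) - dCq(calibrator)
    """
    delta_cq_sample = cq_target_sample - cq_ref_sample
    delta_cq_calibrator = cq_target_calibrator - cq_ref_calibrator
    delta_delta_cq = delta_cq_sample - delta_cq_calibrator
    return 2 ** (-delta_delta_cq)

# Hypothetical Cq values: ERCC1 as target, GAPDH as reference gene
fold_change = relative_expression(
    cq_target_sample=24.0, cq_ref_sample=18.0,
    cq_target_calibrator=25.0, cq_ref_calibrator=18.0,
)  # ddCq = 6 - 7 = -1, so fold change = 2.0
```

A value above 1 indicates higher ERCC1 expression in the sample than in the calibrator after normalization to GAPDH.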

Measurement of plasma ERCC1 levels.
Plasma samples were isolated from whole blood by centrifugation and stored immediately at −80 °C until use. We measured the plasma concentration of ERCC1 with an enzyme-linked immunosorbent assay (ELISA) (ERCC1 ELISA kit, Xinfan Biosystems, Shanghai, China) according to the manufacturer's instructions and quantified it using a standard curve with a detection range of 30-1200 pg/ml.
Table 4. Association of SNP rs11615 with the severity of coronary atherosclerosis. N, number; OR (95% CI), odds ratio (95% confidence interval); BMI, body mass index; T2DM, type 2 diabetes mellitus. *Gensini scores are expressed as median (interquartile range) because of the skewed distributions. † P values were obtained from the Mann-Whitney U test. ‡ CAD patients were classified into two groups based on the median (32.5) of Gensini scores, and then adjusted OR (95% CI) and P adj values were obtained from logistic regression analyses after adjusting for age, sex, BMI, smoking status, alcohol drinking and histories of T2DM, hyperlipidemia and hypertension. § P inter values were obtained from the multiplicative likelihood ratio test to assess the interactions between SNP rs11615 and selected variables in CAD risk. Bold values indicate statistical significance after the Bonferroni correction (P < 0.05/48 ≈ 1.04 × 10^−3).
Statistical analyses. Differences in clinical characteristics between cases and controls were analyzed by the Pearson chi-square test (categorical variables) and Student's t-test (continuous variables). The Pearson chi-square test was also used to assess Hardy-Weinberg equilibrium (HWE) of each SNP in the three independent study populations. Odds ratios (ORs) and their 95% confidence intervals (CIs) were calculated by logistic regression analyses to estimate the association of ERCC1 SNPs with CAD risk before and after adjusting for age, sex, BMI, smoking, alcohol drinking, hypertension, T2DM and hyperlipidemia. Genetic and environmental interactions of ERCC1 SNPs and selected stratification variables on CAD risk were evaluated by the multiplicative likelihood ratio test. When assessing the severity of coronary artery atherosclerosis, the Mann-Whitney U test and logistic regression analyses were used to analyze the association of ERCC1 SNPs with Gensini score. All of the above analyses were carried out using SPSS 22.0 (SPSS Inc., Chicago, IL, USA), and P values below the cut-off of 0.05 were considered statistically significant. Bonferroni correction was applied for multiple comparisons throughout. Power analyses were performed using the Power and Sample Size program (PS) 3.0 (Vanderbilt University, Nashville, TN, USA).
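The Bonferroni thresholds quoted in the footnotes of Tables 3 and 4 follow from dividing the nominal significance level by the number of comparisons. A minimal sketch, with hypothetical function names:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def is_significant(p_value, n_tests, alpha=0.05):
    """True if a P value survives Bonferroni correction for n_tests."""
    return p_value < bonferroni_threshold(alpha, n_tests)

table3_threshold = bonferroni_threshold(0.05, 40)  # 40 comparisons (Table 3)
table4_threshold = bonferroni_threshold(0.05, 48)  # 48 comparisons (Table 4)
```

These evaluate to 0.00125 and approximately 1.04 × 10^−3, matching the cut-offs stated in the two table footnotes.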
Data Availability. All data generated or analyzed during this study are included in this published article (and its Supplementary Information files).