Single nucleotide polymorphisms of nucleotide excision repair pathway are significantly associated with outcomes of platinum-based chemotherapy in lung cancer

The nucleotide excision repair (NER) pathway plays a critical role in repairing DNA damage caused by platinum. To comprehensively assess the association between NER variants and clinical outcomes of platinum-based chemotherapy, 173 SNPs in 27 genes were selected and evaluated for association with toxicity and efficacy in 1004 patients with advanced non-small cell lung cancer (NSCLC). Consecutive significant signals were observed in XPA, RPA1, POLD1 and POLD3. Subgroup analysis showed that GTF2H4 presented consecutive significant signals for clinical benefit among patients with adenocarcinoma. In squamous cell carcinoma, rs4150558, rs2290280 and rs8067195 were significantly associated with anemia, rs3786136 was significantly associated with thrombocytopenia, and ERCC5 presented consecutive significant signals for response rate. In patients receiving the TP regimen, significant associations were found with neutropenia, thrombocytopenia and gastrointestinal toxicity. Associations with anemia and neutropenia were found in the GP regimen, and rs4150558 was significantly associated with anemia in the NP regimen. In patients older than 58 years, ERCC5 showed consecutive significant signals for gastrointestinal toxicity. Survival analysis showed that SNPs in POLD2, XPA, ERCC6 and POLE were significantly associated with progression-free survival, and SNPs in GTF2H4, ERCC6, GTF2H1, MAT1 and POLD1 were significantly associated with overall survival. This study suggests that SNPs in the NER pathway could be potential predictors of clinical outcomes of platinum-based chemotherapy in NSCLC.


Characteristics of patients and clinical outcomes.
To investigate the association between polymorphisms of the NER pathway and clinical outcomes of platinum-based chemotherapy, 1004 patients with advanced NSCLC who received only first-line platinum-based chemotherapy were enrolled in this study. Patient characteristics and clinical outcomes are listed in Table 1. The median age of the cohort was 58 years (range 26 to 82). Patients older than 58 years accounted for 48.4% of the cohort, and those aged 58 years or younger accounted for 51.6%. Most patients were male (70.3%). The percentage of patients with ECOG PS 0-1 was 91.3%, and 42.5% of the patients were non-smokers. All recruited patients had advanced NSCLC, and most had stage IV disease (62.6%). Adenocarcinoma was the most common histological type (57.5%). The four most commonly used chemotherapy regimens were platinum-navelbine (NP) (31.5%), platinum-gemcitabine (GP) (23.8%), platinum-paclitaxel (TP) (31.1%) and platinum-docetaxel (DP) (8.7%). Responses to platinum-based chemotherapy were classified into four categories, complete response (CR), partial response (PR), stable disease (SD) and progressive disease (PD), according to the Response Evaluation Criteria in Solid Tumors (version 1.0) [21]. Clinical benefit was defined as CR, PR or SD, and response as CR or PR. The response rate was 18.2%, and the clinical benefit rate was 80.7%. The median PFS was 9.1 months and the median OS was 19.3 months. For the toxicity analysis, data on gastrointestinal toxicity and hematological toxicities (anemia, thrombocytopenia and neutropenia) were collected. Severe gastrointestinal toxicity occurred in 8.3% of patients, severe anemia in 3.1%, severe neutropenia in 12.3% and severe thrombocytopenia in 3.6%.
Association between the polymorphisms of NER pathway and efficacy of platinum-based chemotherapy. To investigate the association between polymorphisms of the NER pathway and the efficacy of platinum-based chemotherapy, clinical benefit and response rate were used as efficacy endpoints. Many polymorphisms showed a nominally significant association with clinical benefit and/or response rate (P < 0.05); however, none remained significant after Bonferroni correction (threshold P < 2.89 × 10⁻⁴, i.e. 0.05/173) (Fig. 1A). rs3176721, located in XPA, showed the most significant signal in the clinical benefit analysis (χ² test P = 0.003; OR = 1.74, 95% CI: 1.25-2.44, P = 0.001).
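The Bonferroni threshold quoted above is simple arithmetic, and a minimal Python sketch makes it explicit. The function name `bonferroni_threshold` is an illustrative assumption and not part of the study's software; the numbers (0.05, 173 SNPs, P = 0.001 for rs3176721) are taken from the text.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Values from the text: family-wise alpha of 0.05 and 173 SNPs tested.
threshold = bonferroni_threshold(0.05, 173)
print(f"Bonferroni threshold: {threshold:.2e}")

# The strongest clinical-benefit signal (rs3176721, P = 0.001 in the
# regression analysis) is nominally significant at 0.05 but does not
# pass the corrected threshold.
p_rs3176721 = 0.001
passes_nominal = p_rs3176721 < 0.05
passes_bonferroni = p_rs3176721 < threshold
print(passes_nominal, passes_bonferroni)
```

This is why the signals reported in this section are described as significant at P < 0.05 while none survive correction for 173 tests.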
Scientific Reports | 7: 11785 | DOI:10.1038/s41598-017-08257-7
Association between polymorphisms of NER pathway and the toxicities of platinum-based chemotherapy. Data on gastrointestinal toxicity and hematological toxicities (anemia, thrombocytopenia and neutropenia) were collected to investigate the association between SNPs of the NER pathway and the toxicities of platinum-based chemotherapy. GTF2H1/P62 and DDB2 presented consecutive significant signals for anemia, RPA1 and POLD1 presented consecutive significant signals for thrombocytopenia, and POLD3 presented consecutive significant signals for neutropenia (Fig. 1A). However, no SNP met the Bonferroni-corrected significance level (P < 2.89 × 10⁻⁴).
Association between polymorphisms of NER and survival of platinum-based chemotherapy. Survival analysis was performed to assess the association between the polymorphisms of NER and PFS or OS. The results showed that 5 SNPs were associated with PFS, and all these SNPs decreased the risk of disease progression (Table 3, Fig

Discussion
The NER pathway is important in DNA damage repair, especially in repairing distortions of the DNA helical structure [22]. Many genes are involved in lesion recognition, DNA unwinding, incision of the DNA around the lesion, and finally DNA resynthesis and ligation [13]. Platinum-based chemotherapy is one of the most effective treatments for lung cancer. Platinum acts by forming intra- and inter-strand crosslinks, which distort the DNA helix, inhibit DNA replication and induce apoptosis of cancer cells [5]. The NER pathway is the main repair system for platinum-induced DNA distortion [4]. Many studies have focused on the relationship between the expression level of NER-related genes and the efficacy of platinum-based treatment for cancer. ERCC1 protein expression status was reported as a predictive marker for outcomes of platinum-based chemotherapy in lung cancer [17]. Some studies also reported that SNPs in members of the NER pathway were significantly associated with clinical outcomes of platinum-based chemotherapy. Polymorphisms of XPD were significantly associated with both the efficacy and the severe toxicity of platinum-based chemotherapy in lung cancer [23,24]. Other members of the NER pathway, such as XPA, ERCC5 and ERCC2, were related to the response to platinum-based chemotherapy in lung cancer [15,25,26]. To comprehensively assess the association between polymorphisms of the NER pathway and clinical outcomes of platinum-based chemotherapy, a total of 173 SNPs located in 27 genes were investigated in this study for association with gastrointestinal toxicity, neutropenia, anemia, thrombocytopenia, clinical benefit, response rate, overall survival (OS) and progression-free survival (PFS). Our results showed that variants in the NER pathway were significantly associated with clinical outcomes of platinum-based chemotherapy. Polymorphisms in XPA, DDB2 and GTF2H4 were significantly associated with clinical benefit. Polymorphisms in ERCC2 and ERCC5 were significantly associated with response rate. Polymorphisms in GTF2H1, ERCC2 and RPA1 were significantly associated with anemia. Polymorphisms in RPA1 were significantly associated with thrombocytopenia. Polymorphisms in ERCC2, ERCC6, DDB2, RPA1, POLD1 and POLD3 were significantly associated with neutropenia. Polymorphisms in POLD2, XPA, ERCC6 and POLE were significantly associated with PFS. Polymorphisms in GTF2H4, ERCC6, GTF2H1, MAT1 and POLD1 were significantly associated with OS.
XPA encodes a zinc-finger DNA-binding protein and plays an important role in damage recognition in the NER pathway [27]. Genetic variants in XPA were significantly associated with lung cancer risk [28]. Knockdown of XPA expression sensitized NSCLC-derived cell lines to cisplatin [29]. Our results showed that rs3176721 in XPA was significantly associated with clinical benefit in all patients, as well as in the AC subgroup. rs3176658 in XPA was significantly associated with PFS, and the A allele significantly decreased the risk of disease progression.
DDB2 is a component of DDB, the damage-specific DNA-binding heterodimeric complex [30]. SNPs in DDB2 were significantly associated with the risk of lung cancer [31]. A recent GWAS analysis showed that rs747650 in DDB2 was a new susceptibility locus for severe acne [32]. Overexpression of DDB2 sensitized cancer cells to cisplatin treatment, which indicates that DDB2 may play an important role in platinum-based chemotherapy [33]. In our study, rs2306353 was significantly associated with clinical benefit in patients receiving the NP regimen, and rs326222 in DDB2 was a significant risk factor for neutropenia in the subgroup of patients younger than 58 years old.
GTF2H4 (also known as P52) encodes a subunit of transcription factor II H (TFIIH) and is known to be involved in nucleotide excision repair [34]. A recent large-scale analysis of six published GWAS datasets reported that rs114596632 in GTF2H4 was significantly associated with lung cancer risk [35], and rs2074508 in GTF2H4 was significantly associated with smoking-related lung cancer [36]. In the current study, GTF2H4 presented consecutive significant signals for clinical benefit among AC patients. rs3130780 in GTF2H4 was significantly associated with OS, and the AA genotype significantly increased the risk of death.
ERCC5 plays an important role in DNA incision in the NER pathway and is a well-known gene with a substantial impact on cancer. Our study showed that ERCC5 presented consecutive significant signals not only for response rate in SCC, but also for gastrointestinal toxicity among patients older than 58 years. rs2296147 was the SNP most significantly associated with response rate. rs2296147 was reported to be associated not only with cancer risk but also with cancer prognosis [37]. Several studies also showed that rs2296147 was associated with the prognosis of advanced non-small cell lung cancer treated with platinum-based chemotherapy, and could predict the clinical outcomes of platinum-based chemotherapy [38][39][40][41]. rs2296147 is located in the promoter of ERCC5. The transcriptional repressor SNAI1 is predicted to bind to the sequence around rs2296147, suggesting that rs2296147 may take part in negatively regulating the expression of ERCC5.
RPA1 is an important subunit of RPA, a major eukaryotic single-strand DNA-binding protein complex that is essential for DNA repair, DNA replication, DNA recombination, telomere maintenance, activation of DNA damage checkpoints and the maintenance of genomic integrity [42]. RPA1 is also reported to be part of the replication fork protection complex [43]. Previous studies showed that RPA1 plays an important role in Pt-DNA repair [44], and that the expression level of RPA1 could be used to predict cancer prognosis [45]. However, no studies have focused on the relationship between RPA1 and the hematological toxicities of platinum-based chemotherapy. In this study, we found that polymorphisms in RPA1 were significantly associated with all three hematological toxicities: rs12727 and rs3786136 with thrombocytopenia, rs8067195 and rs6416887 with anemia, and rs12150513 with neutropenia. rs12727 is located in the 3′UTR of RPA1, and the sequence around it is a potential target of miR-345-3p, miR-6732-3p and miR-6771-3p. RPA1 is also a target of PTEN function in fork protection to maintain genome stability [46].
ERCC6 can recognize DNA damage and recruit NER repair factors to the DNA damage site. Polymorphisms in ERCC6 were significantly associated with the risk and prognosis of lung cancer [47]. A previous study found no statistically significant association between platinum-related toxicities and SNPs of ERCC6 or CCNH [48]. In our study, rs4253002 in ERCC6 was significantly associated with gastrointestinal toxicity in patients receiving the TP regimen, and rs4253212 in ERCC6 was significantly associated with neutropenia in patients receiving the GP regimen. We also found that rs2290280 in CCNH was significantly associated with anemia in the SCC subgroup. In the survival analysis, rs12571445 in ERCC6 was significantly associated with PFS, and rs2281793 in ERCC6 was significantly associated with OS. Our results suggest that both ERCC6 and CCNH might be involved in regulating clinical outcomes of platinum-based chemotherapy.
DNA polymerase δ is conserved from humans to yeast, and performs important functions in DNA replication and repair. The Polδ complex is composed of four subunits (p125, p66, p50 and p12), which are encoded by POLD1, POLD3, POLD2 and POLD4, respectively [49]. Polymorphisms and mutations in POLD1 and POLD3 were reported to be associated with cancer risk [50,51]. Overexpression of POLD1 was associated with platinum resistance in a long-term survivor of mesothelioma [52]. In this study, POLD1 and POLD3 were significantly associated with neutropenia: rs1726801, rs1673041 and rs3219341 in POLD1 in patients receiving the TP regimen, and rs10857 and rs6592576 in POLD3 in all patients. rs3757843 in POLD2 was significantly associated with PFS, and rs2546551 in POLD1 was significantly associated with OS.
We also found that rs11609456 and rs5744751 in POLE were significantly associated with PFS, and that rs4151374 in MAT1 and rs4150667 in GTF2H1 were significantly associated with OS. rs4150558 in GTF2H1 was significantly associated with anemia in all patients, and the same effect was observed in the SCC subgroup and in the subgroup of patients receiving the NP regimen. Some of the significant signals from the χ² test were absent in the multiple logistic regression analysis, especially in the subgroup analyses. For example, rs12727 in RPA1 showed a significantly different genotype distribution for thrombocytopenia in the AC subgroup, and rs4151405 in MNAT1 and rs17584703 in RFC1 showed significantly different distributions for thrombocytopenia in patients receiving the TP regimen; however, multiple logistic regression analysis showed no significant association. This may be because the number of patients in some subgroups was small, which leads to unbalanced genotype distributions and can produce significant χ² signals. In contrast, the multiple logistic regression analysis used the P value for trend together with the OR and 95% CI, and is adjusted for covariates, so it is likely to reflect the association between clinical outcomes and polymorphisms more reliably.
In the current study, subgroup analysis by chemotherapy regimen was carried out to investigate whether the drugs combined with platinum affect the results of the association analysis. We found that different genes were associated with different outcomes in different subgroups, which suggests that the partner drugs might influence the clinical outcomes of platinum-based treatment and that subgroup analysis is important in platinum-related pharmacogenetic studies. In the survival analysis, some significant signals were present only in heterozygotes and disappeared in mutant homozygotes. This phenomenon is termed "heterozygote advantage", and other studies have shown similar results. For example, there was a clear association between heterozygosity at the TIRAP S180L locus and protection against multiple infectious diseases [53]. In breast cancer, the heterozygous genotype of the 5′ UTR -26 G > A polymorphism in BRCA2 was found to have a protective effect on cancer risk. Our results also showed that the heterozygous genotype was significantly associated with good prognosis [54]. In some subgroups of the survival analysis, especially under the recessive model, such as rs12571445 (ERCC6) in the PFS analysis and rs3130780 (GTF2H4) and rs2546551 (POLD1) in the OS analysis, the number of mutant homozygotes was too small to give reliable results, and larger samples are needed to confirm these findings.

Methods
Study population. The 1004 patients recruited in the current study had histopathologically diagnosed stage IIIA-IV NSCLC and were enrolled in Shanghai, China. Each patient gave informed consent before enrollment. The recruitment criteria were as follows: (1) the patient was over 18 years old; (2) the patient was newly diagnosed and received only platinum-based chemotherapy, and any patient with surgery, radiotherapy, concurrent chemoradiotherapy or previous chemotherapy was excluded; (3) the performance status was between 0 and 2; (4) there was no other malignancy in the past 5 years; (5) there were no cardiac arrhythmias, no active congestive heart failure and no uncontrolled clinical infections; (6) absolute neutrophil count ≥ 1.5 × 10⁹ cells/L, platelets ≥ 100 × 10⁹ cells/L, creatinine clearance ≥ 60 mL/min, serum creatinine ≤ 1.5 × the upper limit of normal, and alanine and aspartate aminotransferase ≤ 1.5 × the upper limit of normal. All methods in the protocol were carried out in accordance with the institutional guidelines and approved by the Ethical Review Committee of Fudan University, and informed consent was obtained from all patients before sample collection.
Clinical outcomes including toxicities, responses and survival were evaluated in the current study. Responses to platinum-based chemotherapy were assessed after two cycles of treatment and classified into four categories, complete response (CR), partial response (PR), stable disease (SD) and progressive disease (PD), according to the Response Evaluation Criteria in Solid Tumors (version 1.0) [21]. Clinical benefit was defined as CR, PR or SD, and response as CR or PR. Gastrointestinal toxicity and hematologic toxicities, including neutropenia, anemia and thrombocytopenia, were recorded and evaluated twice a week according to the Common Terminology Criteria for Adverse Events V3.0 (CTCAE 3.0). Grade 3 or 4 toxicities were defined as severe adverse effects. Grade 5 toxicity (death) was not observed in this study. Progression-free survival (PFS) and overall survival (OS) were assessed in the survival analysis. PFS was calculated from the date of the first cycle of platinum-based chemotherapy to the date of PD, death or the last follow-up. OS was calculated from the date of the first cycle of platinum-based chemotherapy to the date of death or the last follow-up. Survival data were collected from follow-up calls, the Social Security Death Index, and inpatient and outpatient clinical medical records.
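The endpoint definitions above can be written down compactly. The following Python sketch is only an illustration of those definitions, not the study's actual data pipeline: all function names are hypothetical, and the conversion of days to months (30.4375 days per month) is an assumption, since the text does not state how months were computed.

```python
from datetime import date
from typing import Optional, Tuple

RESPONSE = {"CR", "PR"}                 # counted toward response rate
CLINICAL_BENEFIT = {"CR", "PR", "SD"}   # counted toward clinical benefit
DAYS_PER_MONTH = 30.4375                # assumption: average month length


def is_responder(recist: str) -> bool:
    return recist in RESPONSE


def has_clinical_benefit(recist: str) -> bool:
    return recist in CLINICAL_BENEFIT


def is_severe_toxicity(ctcae_grade: int) -> bool:
    """Grade 3 or 4 counts as severe; grade 5 was not observed in the study."""
    return ctcae_grade in (3, 4)


def pfs_months(first_cycle: date,
               progression: Optional[date],
               death: Optional[date],
               last_follow_up: date) -> Tuple[float, bool]:
    """Return (time in months, event observed) for progression-free survival.

    The event is the earlier of progression or death; otherwise the
    patient is censored at the last follow-up.
    """
    events = [d for d in (progression, death) if d is not None]
    end = min(events) if events else last_follow_up
    return (end - first_cycle).days / DAYS_PER_MONTH, bool(events)


# Hypothetical patient: progression six months after the first cycle.
months, event = pfs_months(date(2010, 1, 1), date(2010, 7, 1), None, date(2011, 1, 1))
```

OS follows the same pattern with death as the only event. Returning the event flag alongside the time keeps censored patients distinguishable in later log-rank or Cox analyses.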
SNPs selection and genotyping. Based on the genotype data of Han Chinese in Beijing (CHB) from the phase II HapMap SNP database, 173 SNPs of 27 genes involved in the NER pathway were selected using tag-SNP and functional-SNP strategies in Haploview 4.1 (http://www.broadinstitute.org/haploview), with the criteria of minor allele frequency ≥ 0.05 and correlation coefficient ≥ 0.8. Detailed information is listed in Supplementary Table 1. Human genomic DNA was extracted from blood samples using the Qiagen Blood Kit (Qiagen, CA). All SNPs were genotyped using the iSelect HD BeadChip (Illumina, San Diego, Calif). The results of random duplicate assays were consistent. Following the criteria of SNP genotyping call rate > 0.95, MAF > 0.01 and GenCall score > 0.2, all 173 SNPs located in 27 genes (detailed in Supplementary Table 1) were included in the final analysis.
Statistical analysis. Demographic and clinical factors were tested against clinical outcomes by chi-square tests or the log-rank test. Factors with a P-value < 0.05 were regarded as covariates (Supplementary Table 2, Supplementary Table 3). The chi-square test was used to assess whether the genotype distribution of each SNP differed significantly between clinical outcome groups. To control for multiple comparisons, Bonferroni correction was performed by multiplying each P-value by the number of SNPs tested in the study (equivalently, by comparing each P-value against 0.05/173). Significant SNPs from the chi-square test were included in multiple logistic regression adjusted for covariates to estimate their association with clinical outcomes as odds ratios (ORs) with confidence intervals (CIs). The log-rank test was used to compare survival curves between patient groups. Cox proportional hazards regression adjusted for covariates was performed to evaluate the association between survival and the SNPs that were significant in the log-rank test, expressed as hazard ratios (HRs) with 95% CIs under additive, dominant or recessive models.
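The additive, dominant and recessive models mentioned above differ only in how a genotype is converted into a numeric predictor before regression. The sketch below shows the conventional coding in Python; the function name `encode_genotype` and the example genotype counts are illustrative assumptions, not values or code from the study.

```python
def encode_genotype(minor_allele_count: int, model: str) -> int:
    """Code a biallelic SNP genotype (0, 1 or 2 minor alleles) for regression.

    additive : 0, 1, 2  (each minor allele adds the same effect)
    dominant : 0, 1, 1  (one minor allele is enough)
    recessive: 0, 0, 1  (two minor alleles are required)
    """
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":
        return minor_allele_count
    if model == "dominant":
        return int(minor_allele_count >= 1)
    if model == "recessive":
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown model: {model}")


# Hypothetical genotypes for five patients (minor-allele counts).
genotypes = [0, 1, 2, 1, 0]
coded = {m: [encode_genotype(g, m) for g in genotypes]
         for m in ("additive", "dominant", "recessive")}
```

Under the recessive coding only minor-allele homozygotes form the exposed group, so that group can become very small in subgroup analyses.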
All P-values presented were two-sided, and a level of P < 0.05 was considered statistically significant. SPSS software (SPSS, Chicago, IL) and PLINK v1.07 were used for statistical analyses in this study.
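As a rough illustration of the genotype-level chi-square test and the Bonferroni adjustment described in the statistical analysis, the following standard-library Python sketch tests a 2 × 3 table (outcome by genotype). It is not the SPSS or PLINK procedure used in the study. The function names and the counts in the example are hypothetical. The P-value uses the closed form exp(−χ²/2), which is exact only for 2 degrees of freedom, as in a 2 × 3 table.

```python
import math
from typing import Sequence, Tuple


def chi_square_2x3(table: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Pearson chi-square test for a 2 x 3 table; returns (statistic, P)."""
    if len(table) != 2 or any(len(row) != 3 for row in table):
        raise ValueError("expected a 2 x 3 table")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column needs at least one observation")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with df = 2.
    return stat, math.exp(-stat / 2.0)


def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Multiply the P-value by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts: rows = with / without severe toxicity,
# columns = major homozygote, heterozygote, minor homozygote.
stat, p = chi_square_2x3([[30, 20, 10], [10, 20, 30]])
p_adjusted = bonferroni_adjust(p, 173)
```

Comparing `p_adjusted` with 0.05 is equivalent to comparing the raw P-value with 0.05/173.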