Gene-gene and gene-environment interactions influence platinum-based chemotherapy response and toxicity in non-small cell lung cancer patients

Platinum-based chemotherapy is a major therapeutic regimen for lung cancer. Various single nucleotide polymorphisms (SNPs) have been reported to be associated with platinum-based chemotherapy response and drug toxicity. However, none of these studies explored the association from the perspective of SNP-SNP interactions or took SNP-environment effects into consideration at the same time. We genotyped 504 polymorphisms and explored the association of gene-gene and gene-environment interactions with platinum-based chemotherapy response and toxicity in 490 non-small cell lung cancer (NSCLC) patients. Sixteen SNPs were found to be significantly associated with platinum-based chemotherapy response or toxicity and were selected for study in a validation cohort of 788 patients. We found that the HSPD1 rs17730989-SUMF1 rs2633851 interaction was associated with platinum-based chemotherapy-induced hematologic toxicity (adjusted OR = 0.233, P = 0.018). In addition, the combined effect of ABCG2 rs2231142-CES5A rs3859104 was significantly associated with overall toxicity (adjusted OR = 8.044, P = 4.350 × 10^-5). Furthermore, the model of ARHGAP26 rs3776332-ERCC6 rs2228528-SLC2A1 rs4658-histology was associated with platinum-based chemotherapeutic response. These gene-gene and gene-environment interactions contribute to chemotherapy sensitivity and toxicity and can potentially predict the drug response and toxicity of platinum-based chemotherapy in NSCLC patients.

factors simultaneously in this manuscript, and results suggested that gene-gene and gene-environment interactions might influence these phenotypes [13]. In addition to our study, others have also shown that interactions among SNPs and their synergistic effects can better explain inter-individual differences from a pharmacogenetic perspective. Hence, exploring the association of genetic and clinical factors with complex phenotypes is becoming a focus of research [14,15].
However, current pharmacogenomics studies of platinum-based chemotherapy in NSCLC have considered only the effect of single SNPs and have paid no attention to the effects of gene-gene and gene-environment interactions. In our previous study, we analyzed the association of 416 SNPs in 185 genes with platinum response and toxicity [13]. In the present investigation, based on these SNPs, we further explored the effects of gene-gene and gene-environment interactions on platinum-based chemotherapy response and toxicity in Chinese NSCLC patients.

Results
Clinical characteristics of subjects. A total of 490 subjects with sufficient clinical data for sensitivity and toxicity evaluation were included in the discovery stage. Clinical data were collected and classified by chemotherapy sensitivity, hematologic toxicity, gastrointestinal toxicity and overall toxicity. Details are presented in Supplementary Table S2A. All patients received at least two cycles of platinum-based chemotherapy. Response was evaluated in 328 patients, of whom 111 (33.84%) were responders and 217 (66.16%) were non-responders. Toxicity was evaluated in all patients: 85 (17.35%) suffered severe gastrointestinal toxicity and 116 (23.67%) suffered hematologic toxicity.
To test the credibility of the discovery-stage findings, we further investigated the SNPs in the validation cohort, which enrolled 788 subjects. The clinical information is presented in Supplementary Table S2B. Of these subjects, 781 had sufficient data for drug response evaluation, 788 for gastrointestinal toxicity evaluation and 782 for hematologic toxicity evaluation. As shown in Supplementary
Influence of gene-gene and gene-environment interaction on platinum-based chemotherapeutic response. We first explored the paired gene-gene interactions associated with chemotherapy response. As mentioned above, 504 SNPs were included in the first stage, and a total of 16 interacting SNPs were found to be significantly associated with chemotherapy response or toxicity in this cohort (Fig. 1 and Supplementary Table S3). The paired interaction between SLC2A1 rs4658 and HSPD1 rs17730989 was significantly associated with platinum-based chemotherapy response in NSCLC patients (adjusted OR = 5.430, P = 2.610 × 10^-5), indicating that the two-locus model of SLC2A1 rs4658-HSPD1 rs17730989 might be related to drug sensitivity. However, when we tested this model in the validation stage, no significant association was found (P = 0.340). We further explored multi-dimensional SNP-SNP interactions. Because of the limited number of subjects, the number of dimensions was set from 3 to 6. The best model was a six-dimensional model containing ARHGAP26 rs3776332, BRCA1 rs799917, ERCC6 rs2228528, NPAT rs228589, REV3L rs462779 and SLC2A1 rs4658, with a testing accuracy of 0.547, CV consistency of 4/10 and significance test P = 0.011 (Table 2). However, this six-locus model could not be replicated in the validation cohort (Table 2). Thus, we did not find a gene-gene interaction that was reproducibly associated with drug response.
We next considered whether environmental factors such as age, gender, smoking status, tumor histology and PS score could play a role in mediating the effects of gene-gene interactions on platinum-based chemotherapy response. These environmental factors were therefore entered as markers alongside the SNPs to investigate the association. As shown in Table 3, the two best models identified as associated with platinum-based chemotherapy sensitivity were a four-dimensional model of ARHGAP26 rs3776332-ERCC6 rs2228528-SLC2A1 rs4658-histology, with a testing accuracy of 0.636, CV consistency of 7/10 and significance test P = 0.001, and a five-dimensional model of ARHGAP26 rs3776332-ERCC6 rs2228528-NPAT rs228589-WISP1 rs2977549-histology, with a testing accuracy of 0.674, CV consistency of 7/10 and significance test P = 0.001. In the validation stage, as shown in Table 4, only the four-dimensional model was verified (testing accuracy of 0.591, CV consistency of 10/10 and significance test P = 0.011). In addition, a consistent result was obtained from a second validation in the combined cohort, with a testing accuracy of 0.616, CV consistency of 10/10 and significance test P = 0.001 (Table 4). Thus, we concluded that the four-dimensional model of ARHGAP26 rs3776332-ERCC6 rs2228528-SLC2A1 rs4658-histology was significantly associated with platinum-based chemotherapeutic response.
Influence of gene-gene and gene-environment interaction on platinum-based chemotherapeutic adverse drug reactions (ADR). Next, we studied the association of these interactions with platinum-based chemotherapeutic toxicity.
First, we explored the paired gene-gene interactions. Three SNP-SNP pairs were found to be significantly associated with overall toxicity in the discovery stage: ABCG2 rs2231142-CES5A rs3859104 (adjusted OR = 8.040, P = 4.350 × 10^-5), HSPD1 rs17730989-SUMF1 rs2633851 (adjusted OR = 0.084, P = 4.670 × 10^-5) and ARHGAP26 rs3776332-HSPB1 rs2070804 (adjusted OR = 6.380, P = 5.310 × 10^-5). We then tried to verify them in the 788 patients of the validation cohort. Notably, the ABCG2 rs2231142-CES5A rs3859104 model was successfully verified (adjusted OR = 1.870, P = 0.011). As for the multi-dimensional SNP-SNP interactions, even the best model showed no association with overall toxicity (Table 2). We then explored gene-environment interactions. A six-dimensional model including ARHGAP26 rs3776332, BRCA1 rs799917, ERCC6 rs2228528, WISP1 rs2977549, ATM rs189037 and histology was identified as significantly associated with overall toxicity (testing accuracy of 0.561, CV consistency of 8/10, significance test P = 0.011) (Table 3). However, this model was not validated (Table 4). Thus, only the paired interaction of ABCG2 rs2231142-CES5A rs3859104 was associated with overall chemotherapy toxicity.
Because overall toxicity comprises hematologic and gastrointestinal toxicity, we also investigated the interactions in these two sub-phenotypes. The HSPD1 rs17730989-SUMF1 rs2633851 and HSPD1 rs28688207-BRCA1 rs799917 pairs were found to be significantly related to hematologic toxicity. The HSPD1 rs17730989-SUMF1 rs2633851 pair was successfully verified in the validation cohort (Fig. 1 and Supplementary Table S3). However, the multi-dimensional gene-gene and gene-environment models showed no relationship with hematologic toxicity (Tables 2 and 3).
In summary, the paired interaction of ABCG2 rs2231142-CES5A rs3859104 was associated with overall toxicity, and the HSPD1 rs17730989-SUMF1 rs2633851 pair was significantly related to hematologic toxicity.
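For readers unfamiliar with how an "adjusted OR" for a SNP-SNP pair can be read, one common formalization is a logistic regression with a product term. The source does not state the exact model it fitted, so the form below is an assumption for illustration only. Here Y is the binary phenotype (for example, high versus low toxicity), G_1 and G_2 are the coded genotypes of the two SNPs, and Z is a hypothetical vector of adjustment covariates.

```latex
\operatorname{logit} P(Y = 1)
  = \beta_0 + \beta_1 G_1 + \beta_2 G_2
  + \beta_3 \,(G_1 \times G_2)
  + \boldsymbol{\gamma}^{\top} \mathbf{Z},
\qquad
\mathrm{OR}_{\mathrm{int}} = e^{\beta_3}
```

Under this reading, an interaction OR above 1 means that the odds of the phenotype in carriers of both variants are higher than the two marginal effects alone would predict, and an OR below 1 means they are lower.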

Discussion
In this study, we investigated the association of gene-gene and gene-environment interactions with chemotherapy response and toxicity. The four-dimensional model of ARHGAP26 rs3776332-ERCC6 rs2228528-SLC2A1 rs4658-histology and the paired interactions ABCG2 rs2231142-CES5A rs3859104 and HSPD1 rs17730989-SUMF1 rs2633851 were found to be significantly associated with platinum-based chemotherapeutic response, overall toxicity and hematologic toxicity, respectively. These three interactions, identified in the discovery stage, were all verified in the validation stage.
In the current study, we found that some SNPs previously identified as "negative" were in fact significantly related to phenotype when considered as SNP-SNP pairs. Many SNPs analyzed in this study had been reported to have no association with platinum-based chemotherapy sensitivity and toxicity [13,16-18]. Those studies focused on single SNPs and overlooked possible interactions between SNPs. Gene-gene interaction takes genetic context into account, and the "missing heritability" may be partly attributable to the low power to detect gene-gene interactions. Gene-gene interaction analysis is therefore an important approach for studying how SNPs influence phenotype. In addition, environmental factors were not taken into account in the univariate studies. These factors alone were not notably associated with platinum-based chemotherapy sensitivity, but when gene-environment interactions were considered, we found that environment also played an important role in drug response. Taken together, the analysis of gene-gene and gene-environment interactions could uncover additional SNPs associated with platinum-based chemotherapeutic response and toxicity, and this strategy may hold potential value for predicting chemotherapeutic response and toxicity in NSCLC patients.
The effects of SNP-SNP interactions on chemotherapeutic response and toxicity could be partly explained by the specific functions of these genes. rs2231142 is a non-synonymous variant in ABCG2 (C421A, encoding Q141K, Gln141Lys), which is a member of the ATP-binding cassette (ABC) transporter superfamily [19,20]. The ABCG2 C421A variant has been reported to be associated with severe thrombocytopenia [21]. CES5A belongs to one of the five subfamilies of carboxylesterases (CES) and is generally expressed in the liver. CES enzymes mediate drug hydrolysis, and the metabolic products are, to some extent, responsible for hepatotoxicity, nephrotoxicity or hematologic toxicity [22].
In short, ABCG2 and CES5A are both associated with platinum-based chemotherapy toxicity, so taking them into consideration together may improve the prediction of toxicity. With regard to chemotherapy-induced hematologic toxicity, heat shock proteins (HSPs) are the major chaperones mediating the (re)folding of proteins, and sulfatase modifying factor 1 (SUMF1) can activate the catalytic site of SGSH [23].
With regard to platinum-based chemotherapy response, Rho GTPase activating protein 26 (ARHGAP26) is a negative regulator of the Rho family that converts the small G proteins RhoA and Cdc42 to their inactive GDP-bound forms, and it is associated with gastric cancer [24,25]. Functional Rho can regulate the activity of the c-fos promoter, which can enhance cell survival but leads to apoptosis at high concentrations [26,27]. Solute carrier family 2 facilitated glucose transporter member 1 (SLC2A1) supplies cells with glucose by facilitating the diffusion of glucose molecules across the plasma membrane when the cellular glucose concentration is low [28]. The DNA repair gene ERCC6 is an important caretaker of overall genome stability, and different genotypes of rs2228528 were associated with susceptibility [29].
In conclusion, the SNPs identified in this study are related to platinum-based chemotherapeutic response and toxicity, possibly as a result of their impact on the function of the relevant genes. A single SNP in a gene can change the phenotype during platinum-based chemotherapy, and, as our results indicate, the accumulated effect of two or more SNPs may play a more important role. Traditionally, histology has been regarded as an important factor in treatment decisions for NSCLC patients. However, histology alone could not predict platinum-based chemotherapeutic response and toxicity precisely, owing to its heterogeneity. When we took genes into consideration, the predictive accuracy improved. In other words, environmental factors combined with genes can achieve better predictive outcomes.
Our study has some limitations. First, we applied Bonferroni correction for multiple testing, and most of the results remained statistically significant except HSPD1 rs17730989-SUMF1 rs2633851 (OR = 0.233, P = 0.018). However, this result was verified in the independent validation cohort (OR = 0.570, P = 0.018), so we regarded it as statistically significant and retained it in this article. Second, the incidence of GI toxicity differed between the discovery cohort (17.35%) and the validation cohort (8.38%), which may be one of the reasons why we failed to validate the GI models.
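The Bonferroni correction mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are invented here, and the number of tests (`N_TESTS = 3`) is a hypothetical placeholder, since the source does not state how many tests were corrected for.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Hypothetical number of tests; the source does not report this value.
N_TESTS = 3

print(passes_bonferroni(0.011, 0.05, N_TESTS))  # True  (0.011 < 0.05 / 3)
print(passes_bonferroni(0.018, 0.05, N_TESTS))  # False (0.018 > 0.05 / 3)
```

With this placeholder, a P-value of 0.018 is nominally significant at 0.05 but does not survive correction, which is the kind of borderline result the limitation describes.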
In short, this study aimed to propose a feasible approach for exploring the association between SNPs and platinum-based chemotherapy sensitivity and toxicity. The combined effect of genes and environment, captured through gene-gene and gene-environment interactions, may help to predict the phenotype of NSCLC patients, and such interaction models may become indispensable for phenotype prediction in the future.

Subjects and Methods
Subjects. This two-stage clinical investigation enrolled 490 and 788 NSCLC patients in the discovery and validation cohorts, respectively. The clinical characteristics of all subjects are summarized in Table 1. All patients provided written informed consent in compliance with the code of ethics of the World Medical Association (Declaration of Helsinki) before the study was launched. The study protocol was approved by the Ethics Committee of Xiangya School of Medicine, Central South University (Registration Number: CTXY-110008-1). The study was also registered in the Chinese Clinical Trial Registry (Registration Number: ChiCTR-ROC-14005699).
Patients eligible for the study had to meet the following criteria: (1) histologically diagnosed with NSCLC, confirmed as a primary tumor; (2) no surgery or radiation therapy before or during chemotherapy; (3) treated for at least two cycles with a platinum-based chemotherapy regimen, namely platinum + gemcitabine (GP), platinum + docetaxel (DP), platinum + etoposide (EP), platinum + paclitaxel (TP), platinum + pemetrexed (PP) or another platinum-based regimen (platinum + irinotecan, platinum + vinorelbine); (4) treatment efficacy assessed before the third chemotherapy cycle. Patients who were pregnant or lactating, or who had active infection, symptomatic brain or leptomeningeal metastases, or other previous or concomitant malignancies, were excluded.
Data collection. We collected clinical data for all patients, including age, gender, smoking status, tumor histology and clinical stage. After two cycles of treatment, the response to chemotherapy was evaluated following the Response Evaluation Criteria in Solid Tumors (RECIST) guidelines. The curative effect was classified as complete response (CR), partial response (PR), stable disease (SD) or progressive disease (PD) [30]. We defined CR and PR as platinum-sensitive phenotypes, and SD and PD as platinum-resistant phenotypes. Toxicity was evaluated during the first two cycles of chemotherapy according to the National Cancer Institute Common Toxicity Criteria 3.0 (NCI-CTC 3.0), and was classified into hematologic toxicity (anemia, leukopenia, neutropenia and thrombocytopenia) and gastrointestinal toxicity. Drug toxicity was graded from 0 to 4; grades 0-2 were considered low toxicity and grades 3-4 high toxicity. Following the CTCAE guidance, a patient whose gastrointestinal or hematologic toxicity reached grade 3 or 4 was counted as having overall toxicity.
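The phenotype definitions above can be expressed as a few small functions. This is an illustrative sketch only: the function and constant names are invented here, and taking the worst grade across the four hematologic events is an assumption, since the source does not say how the individual grades were combined.

```python
# Sketch of the phenotype definitions described in the text.
# All names are illustrative and do not come from the source.

SENSITIVE = {"CR", "PR"}   # complete response, partial response
RESISTANT = {"SD", "PD"}   # stable disease, progressive disease


def response_phenotype(recist_category: str) -> str:
    """Map a RECIST category to the platinum-sensitive/resistant phenotype."""
    code = recist_category.strip().upper()
    if code in SENSITIVE:
        return "sensitive"
    if code in RESISTANT:
        return "resistant"
    raise ValueError(f"unknown RECIST category: {recist_category!r}")


def is_high_toxicity(grade: int) -> bool:
    """Grades 0-2 count as low toxicity, grades 3-4 as high toxicity."""
    if grade not in range(5):
        raise ValueError(f"toxicity grade must be 0-4, got {grade!r}")
    return grade >= 3


def hematologic_grade(anemia: int, leukopenia: int,
                      neutropenia: int, thrombocytopenia: int) -> int:
    """Assumption: the hematologic grade is the worst of the four events."""
    return max(anemia, leukopenia, neutropenia, thrombocytopenia)


def overall_toxicity(hematologic: int, gastrointestinal: int) -> bool:
    """Overall toxicity: either component reaches grade 3 or 4."""
    return is_high_toxicity(hematologic) or is_high_toxicity(gastrointestinal)
```

For example, a patient with a partial response, grade 1 gastrointestinal toxicity and grade 3 leukopenia would be classified as platinum-sensitive with overall toxicity under these definitions.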
SNP selection, sample preparation and genotyping. Most of the SNPs were selected as previously described [13]. In brief, genes associated with platinum response or toxicity were selected, and a total of 504 SNPs with a minor allele frequency (MAF) ≥ 0.05 were genotyped in the discovery stage (Supplementary Table S1). The distribution of the genes in which these SNPs are located is shown in Supplementary Fig. S1. After a screening process, we selected 16 SNPs that were associated with chemotherapy sensitivity and toxicity for further validation.
Genomic DNA of all subjects was extracted from approximately 5 ml of peripheral blood using the Genomic DNA Purification Kit (Promega, Madison, Wisconsin, USA) according to the standard protocol, and was stored at −20 °C until use. Standard laboratory operating procedures were followed, and the quality of all DNA samples was verified by agarose gel electrophoresis. All genotyping was conducted with Sequenom's MassARRAY system (Sequenom, San Diego, California, USA).
Statistical analysis. Continuous variables are presented as means ± SD and were analyzed by the two-sample t-test. Categorical variables in different groups were compared using the χ² test. To investigate the influence of gene-gene and gene-environment interactions on platinum-based chemotherapy response and toxicity, PLINK and generalized multifactor dimensionality reduction (GMDR) software were employed [31,32]. All statistical analyses were planned and conducted in accordance with relevant guidelines. The false positive rate (FPR) and true positive rate (TPR) were used to evaluate the prediction accuracy of each model, cross-validation consistency (CVC) was used to assess model quality, and the sign test was used to evaluate the statistical significance of the model. The model with the maximum testing accuracy and CVC was regarded as the best interaction model. All P-values were two-tailed, and P < 0.05 was considered statistically significant. The flow diagram of the analysis is shown in Supplementary Fig. S2.
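The three model-evaluation quantities named above can be sketched in a few lines. This is not the GMDR implementation. It assumes that testing accuracy is the balanced accuracy derived from TPR and FPR, that cross-validation consistency counts the folds in which the same model was selected, and that the sign test is a one-sided binomial test on the number of folds whose testing accuracy exceeds 0.5. All function names are illustrative.

```python
from collections import Counter
from math import comb


def balanced_accuracy(tp: int, fn: int, fp: int, tn: int) -> float:
    """Mean of sensitivity (TPR) and specificity (1 - FPR)."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    return (tpr + (1.0 - fpr)) / 2.0


def cv_consistency(selected_models: list[str]) -> tuple[str, int, int]:
    """Return the most frequently selected model, its count and the fold count."""
    model, count = Counter(selected_models).most_common(1)[0]
    return model, count, len(selected_models)


def sign_test_p(successes: int, folds: int) -> float:
    """One-sided P(X >= successes) for X ~ Binomial(folds, 0.5)."""
    tail = sum(comb(folds, k) for k in range(successes, folds + 1))
    return tail / 2 ** folds


# Hypothetical confusion-matrix counts for one test fold.
print(round(balanced_accuracy(tp=30, fn=10, fp=20, tn=40), 4))  # 0.7083
print(sign_test_p(10, 10))  # 0.0009765625
print(sign_test_p(9, 10))   # 0.0107421875
```

Under these assumptions, 10 and 9 successful folds out of 10 give P-values that round to 0.001 and 0.011, the two values reported for the models in the Results, although the source does not state the fold counts behind them.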