Glutathione S-transferasesP1 AA (105Ile) allele increases oral cancer risk, interacts strongly with c-Jun Kinase and weakly detoxifies areca-nut metabolites

The Glutathione S-transferases (GSTs) protects cellular DNA against oxidative damage. The role of GSTP1 polymorphism (A313G; Ile105Val) as a susceptibility factor in oral cancer was evaluated in a hospital-based case-control study in North-East India, because the habit of chewing raw areca-nut (RAN) with/without tobacco is common in this region. Genetic polymorphism was investigated by genotyping 445 cases and 444 controls. Individuals with the GSTP1 AA-genotype showed association with the oral cancer (OR = 3.1, 95% CI = 2.4–4.2, p = 0.0002). Even after adjusting for age, sex and habit the AA-genotype is found to be significantly associated with oral cancer (OR = 2.4, 95% CI = 1.7–3.2, p = 0.0001). A protein-protein docking analysis demonstrated that in the GG-genotype the binding geometry between c-Jun Kinase and GSTP1 was disrupted. It was validated by immunohistochemistry in human samples, showing lower c-Jun-phosphorylation and down-regulation of pro-apoptotic genes in normal oral epithelial cells with the AA-genotype. In silico docking revealed that AA-genotype weakly detoxifies the RAN/tobacco metabolites. In addition, experiments revealed a higher level of 8-Oxo-2′-deoxyguanosine induction in tumor samples with the AA-genotype. Thus, habit of using RAN/tobacco and GSTP1 AA-genotype together play a significant role in predisposition to oral cancer risk by showing higher DNA-lesions and lower c-Jun phosphorylation that may inhibit apoptosis.

GSTP1 AA-genotype is associated with the oral cancer risk. The rs1695 is found to be polymorphic in the oral cancer patients and in controls i.e., Minor Allele Frequency ≥ 0.05 (Table 1). The present SNP does not show any deviation from Hardy-Weinberg Equilibrium in the control group ( Table 1). The present data demonstrate that GSTP1 AA-genotype is significantly associated with the oral cancer in cases compared to controls (Dominant model; OR = 3.1, 95% CI = 2.4-4.2, p-value = 0.0002) ( Table 1). Even after adjusting for age, sex and habit the AA-genotype is found to be significantly associated with the risk of oral cancer in cases compared to controls (OR = 2.4, 95% CI = 1.7-3.2, p = 0.0001) ( Table 1). To investigate the contribution of risk genotypes of rs1695 in oral cancer, we performed habit-matched regression analysis separately for two groups including individuals with two different habits "RAN Only" and "RAN + tobacco". Even after adjusting for the probable confounders, the significant association of AA-genotype with oral cancer risk in cases is still evident (OR = 2.3, 95% CI = 1.4-3.7, p = 0.0001 for RAN only group; OR = 2.4, 95% CI = 1.6-3.7, p = 0.0001 for RAN + tobacco group) (  Table 1. Comparison of genotype frequencies of rs1695 in GSTP1 gene between oral cancer patients and healthy controls. a The odds ratio was calculated for the common genotype (AA) with reference to the variant genotype (AG + GG); Dominant model. b Chi-square p-value estimated by comparing genotype frequencies between cases and controls. c Odds ratio and p-value for multivariate logistic regression adjusted for age, sex and habit between cases and controls. d p-value < 0.05 is considered to be significant.
www.nature.com/scientificreports www.nature.com/scientificreports/ Mutant protein modelling, structure validation and its comparison with wild protein. The modelled mutant protein with amino acid substitution is subjected to energy minimizations and it is observed that the total energy of the mutant proteins is higher than the wild protein. Stereochemical and main chain parameter of validated modelled protein is shown in Supplementary Table S4(a,b). 99.5% (96.1% in most favoured regions) of the residues for mutant GSTP1 lie in allowed regions as revealed by Ramachandran plot indicating the reliability of the structures predicted. Comparative structure analyses of wild and mutant proteins (Fig. 1D,E) reveals the occurrence of secondary structure and protein folding alteration due to single amino acid polymorphism. Gln24,   Table 3. Both wild and mutant complex were found to have formed cation pi interaction/s. The interaction between Tyr 50 of wild GSTP1 and JNK has two pi cation interactions whereas only one pi cation interaction showed between mutant GSTP1 and JNK. Analysis of various trajectories of molecular dynamic simulation revealed that the wild complex reached stability compared to mutant at very early phase of simulation which could lead to a weaker binding affinity of mutant GSTP1 to JNK. For detail results, please see the Supplemental Information (Fig. S1).
Lower c-Jun phosphorylation in normal oral tissues of GSTP1 AA-genotype. We studied c-Jun phosphorylation in a panel of normal oral tissue samples with GSTP1 AA-genotype (Ile/Ile) (n = 20), AV-genotype (Ile/Val) (n = 12) and VV-genotype (Val/Val) (n = 6). Both Ile/Val and Val/Val showed significantly higher c-Jun phosphorylation than Ile/Ile samples ( Fig. 2A). H-score (Fig. 2B) of c-Jun phosphorylation varied from 42 to 140 for normal oral tissues with AA-genotype, 108 to 164 for AV-genotype and 154 to 210 for VV-genotype.
Higher expression of Bim and PUMA in tumor cells with GSTP1 GG-genotype. The above data indicate that wild GSTP1 has stronger affinity to JNK as compared to mutant GSTP1 and therefore the wild GSTP1 showed lower c-Jun phosphorylation. This might lead to reduce apoptosis by downregulating the proapoptotic genes. We explored this by analysing the expression pattern of Bim and PUMA, the downstream effectors of the Bcl-2 family. The present data demonstrated that the expression of both Bim and PUMA were significantly higher in the tumor samples with GSTP1 GG-genotype than the samples with AA-genotype (p value 0.006 for Bim and 0.01 for PUMA) (Fig. 2C,D).

Better detoxification of toxins by mutant protein: docking studies. Through docking studies we
were able to display the binding ability of wild-type and mutant proteins with GSH and various carcinogens/toxins derived from areca nut/tobacco (four such compounds showed in Fig. 3A and six more showed in supplementary section Fig. S2). Several amino acids of GSTP1 are found to be actively involved in binding of caricinogens/ toxins (Table 4 and Supplementary Tables S5 and S6). On the basis of the gold score, it is noted that most of the carcinogens/toxins bind almost equally with both wild and mutant GSTP1, except for gallotanic acid, a derivative of RAN, showed better binding with GSTP1 GG-genotype (Table 4), whereas N'-nitrosonornicotine (NNN), a derivative of tobacco, showed better binding with GSTP1 AA-genotype (Table S6). For effective detoxification process, GSTP1 has to bring GSH and substrate into close proximity inside the binding pocket for conjugation reaction. Hence, presence of both GSH and substrate in binding pocket is crucial for their conjugation. Therefore, further analysis of docking poses of both wild and mutant protein was performed. It was observed that in case of wild-type protein, GSH was at a fair distance from the docking pocket where carcinogenic ligands were bound. In contrast, homodimeric and heterodimeric mutant protein-ligand complexes indicate that both GSH and ligands reside together in the active site pocket of the protein (Fig. 3A). Similar feature was observed with several other toxins derived from either RAN or tobacco ( Fig. S2; Table S5 and S6).
Higher level of oxidative damages to DNA in tumor cells with GSTP1 AA-genotype. An enzyme-linked immunosorbent assay (ELISA) method was used to measure 8-OHdG, a marker of oxidative damage to DNA, in DNA extracted either from blood lymphocytes or tumor tissues of oral cancer patients who had the habit of RAN-chewing with and without tobacco (Fig. 3B). It was clear that higher level of 8-OHdG was present in DNA in tumor than blood lymphocytes. The level of 8-OHdG in the blood-lymphocytes was similar Docked complex

No of H-bond
Interacting residue of GSTP1 involve in H-bond   www.nature.com/scientificreports www.nature.com/scientificreports/ irrespective to GSTP1 genotypes. In contrast, the level of 8-OHdG in tumor DNA was significantly higher in GSTP1 AA-than GG-genotypes (p value 0.012 in RAN samples; p value 0.014 in RAN + Tobacco samples).

Discussion
The present study confirms that the AA homozygous genotype of GSTP1 gene is significantly associated with the risk of oral cancer even after adjusting for age, gender and habit of RAN/tobacco usage in the population. The similar results are obtained when tested with the habit-matched case-control data for the two most significant habits "RAN only" and "RAN + Tobacco" separately. GSTP1 AA reference allele is associated with an increased risk for esophageal cancer in those having smoking habit 21 and laryngeal cancer in those having smoking and drinking habit 31 . Earlier study with Indian samples has documented association of AA-genotype with the risk of oral leukoplakia which is essentially consistent with smokeless tobacco users 32 .
Comparative analysis of wild and mutant GSTP1 protein structure shows the changes in secondary conformation. An earlier investigation on this polymorphism has revealed differences in their thermal stability as well as its specific activity and affinity for electrophilic substrates 18 . Here we have compared the functional efficiency of both GSTP1 wild-type and the mutant in human samples. GSTP1 can regulate activities of several cellular proteins by forming protein:protein interactions with critical kinases involved in controlling stress response, apoptosis and proliferation 33 . The present protein-protein interaction study reveals that due to this polymorphism, binding geometry between GSTP1 GG-genotype and JNK is disrupted which weakens the affinity between these two proteins. The molecular dynamic simulation study indicates that the GSTP1 AA-genotype reaches stability at very early phase of simulation. Additionally, lower energy value of "wild complex" compared with the "mutant complex" indicates greater stability of the former. Therefore, present result indicates that GSTP1 AA-genotype has stronger affinity to JNK as compared with the mutant which might impair c-Jun-phosphorylation and reduce the extent of apoptosis. We have now validated it experimentally in normal human oral epithelial cells in this study. This assumption is further strengthened after observing an increased expression of pro-apoptotic proteins Bim and Puma, the downstream effector of the Bcl-2 family, in the tumor samples having GG-rather than www.nature.com/scientificreports www.nature.com/scientificreports/ AA-genotype. Both Bim and Puma, proapoptotic proteins are transcriptionally activated by JNK/cJUN axis 34 . It was earlier demonstrated in cell lines that the binding of GSTP1 to JNK1 is a crucial step in apoptosis repression 33,35 . Higher activity of JNK and consequent phosphorylation of c-Jun was observed in mice without GSTP1 36 .
The active site of GSTP1 consists of a hydrophilic G-site (glutathione [GSH]-binding site) and a hydrophobic H-site (xenobiotic-binding site) 37 . Since the data on association between different metabolites of RAN/tobacco and GSTP1 polymorphisms are scarce, we adopted an in silico approach to assess the association of GSTP1 Ile/ Ile with the susceptibility to RAN/tobacco metabolites than that of GSTP1 Ile/Val or Val/Val. Comparative electrostatic interaction of reduced GSH with RAN-derived toxic compounds such as N-methylnipecotyl-glycine (NMNG) and arecaidine and two other tobacco-derived toxic compounds was shown recently 12 . In this study, similar interaction between other toxic metabolites of RAN/tobacco and reduced GSH was evaluated in silico. It showed that Val105 substitution results in steric restriction of the H-site due to shifts in the side chains of  several amino acids leading to reduce the distance between G-site and H-site whereas in the Ile/Ile form such distance is increased suggesting less detoxification. Earlier it has been reported that the structure of GSTP1 Val, has more surrounding water molecules which are linked to a channel of additional water molecules in contrast to GSTP1 Ile, which is proposed to influence the catalytic process 38 . It has earlier been demonstrated that GSTP1 Val increases catalytic efficiency by several fold towards tobacco-related pollutants benzo(a)pyrene, and diol epoxide as compared to GSTP1 wild type enzyme 39 . Thus, weak detoxification of the RAN/tobacco metabolites by GSTP1 Ile/Ile leads to higher induction of oxidative damages to DNA leading to mutagenesis and genomic instability. The present quantification of 8-OHdG has been widely used earlier, not only as a biomarker indicating the level of endogenous oxidative damage to DNA but also as a risk factor for several diseases, including cancer 40 . Higher level of 8-OHdG has been noted in smokers than non-smokers 30 . It has been demonstrated that reactive oxygen species mediated DNA double strand breaks and 8-OHdG occurs via secretory cytokines in areca-nut exposed oral keratinocytes 41 . With this rational in view, validation of the present in silico observations was sought to be done by quantifying the level of 8-OHdG in DNA digests of cancer cells obtained from patients with different GSTP1 genotypes. Presence of significantly higher level of 8-OHdG in cancer DNA samples with AA-genotypes suggested that metabolic activation of RAN/tobacco in the oral cavity could produce a variety of toxic substances which induce various damages 42,43 including 8-OHdG 44 . This could be the reason for higher 8-OHdG observed in tumor DNA than DNA from peripheral blood of the same patient. Thus, metabolic activation and detoxification are considered to be an important factor in determining the ultimate effects of exposure to chemical carcinogens.

Conclusions
The GSTP1 AA reference allele (rs1695) is significantly associated with the risk of oral cancer to those having RAN consumption habit with and without tobacco. Such association can be attributed due to poor detoxification of RAN/tobacco toxins and lowering c-Jun phosphor-ylation due to its strong binding to JNK which consequently may inhibit apoptosis. Thus, it can be said that the development of cancer is not only due to the type of habit that patients have but depends on interaction between the metabolites and the genes that detoxify these metabolites/ carcinogens. Nowadays, SNPs have been considered as more tractable genotypic markers 45 and can be utilized in human genetic analysis which can provide critical proof-of-concept of a priori prediction of responses to certain food habit and environmental exposure. These data also provide a foundation for future genotype-phenotype association studies involving carcinogenesis risk.

Materials and Methods
Selection of study participants. The samples for the present study was collected from Nazareth hospital, Shillong, India. A total of 445 Oral Cancer patients and 444 healthy controls were recruited and peripheral blood sample was collected from each donor in heparinized vials, under aseptic condition. Of the total 445 oral cancer patients, 192 were only RAN chewers and 253 were from both RAN and tobacco chewing category. The age ranged from 28 to 84 years (mean ± SD; 53.8 ± 12.0) for oral cancer samples whereas the age varied from 21 to 90 years (mean ± SD; 45.4 ± 17.7) for healthy control group. For the details about demographic characteristics of the samples, please see the Supplemental Information Table S1. All the donors had no viral diseases or antibiotic therapy during the last 6 months. This study was approved by the Institutional Ethics Committee for Human Samples/Participants (IECHSP/2014/07) in North-Eastern Hill University, Shillong, India. The tumor tissues were obtained from patients after having their consent for participation and were individually interviewed before taking the biopsy. Informed consent was obtained from all the individuals studied before sample collection. Every biopsy was kept in RNAlater soon after its collection and all experiments were performed in accordance with relevant guidelines and regulations. A total of 38 normal oral squamous cell epithelium (3cms away from the cancer site) were also collected from oral cancer patients. Biopsy and resection samples were reviewed by the pathologists and Head and Neck Surgery Department of Nazareth Hospital to confirm the diagnoses and also select representative blocks for immunohistochemical analyses.
DNA isolation and SNP genotyping. Genomic DNA was extracted from 3 ml peripheral blood using proteinase K treatment and the standard phenol-chloroform extraction procedure 46 . The details of primer sequences (Table S2) and about the PCR reaction are given in the Supplementary section.
The genetic polymorphism of GSTP1 in exon 5 (rs1695, Ile/Ile, Ile/Val, Val/Val genotypes) was identified using the Alw26I restriction enzyme 47 . A 433 bp fragment of GSTP1 gene was amplified and the presence of Alw26I restriction site yielded 328 and 105 bp fragments, respectively (GSTP1 Ile/Ile). The presence of rs1695 (313 A/G) creates another restriction site within the 328 bp fragment which when digested by Alw26I, yielded two fragments of 222 and 106 bp (GSTP1 Val/Val) (Fig. 1A,B).
To validate the genotype data generated by PCR-RFLP for rs1695, a subset of samples (100 in each) were resequenced for the GSTP1 gene using the same primer pairs that were used for PCR during the RFLP assay. The sequencing reactions were performed by conventional Sanger sequencing method using an ABI PRISM 3100 Genetic Analyzer and the genotypes were determined from the electropherograms using Seqscape v.2.4 (Applied Biosystems) (Fig. 1C).
Statistical analysis. Estimation of allele and genotype frequencies for both the SNPs in the cases and controls and tests for deviation from Hardy-Weinberg equilibrium on the control group were performed using SPSS 20.0 and GraphPad Prism software, respectively. Case-control association study was performed for rs1695 to find out the risk genotype, if any, associated with the risk of developing oral cancer in Meghalaya, India. For comparing genotype frequencies of GSTP1 between cases and controls, the individuals were grouped into reference allele homozygous (AA) and mutant allele containing (AG + GG) genotypic groups, p-value estimation was done by www.nature.com/scientificreports www.nature.com/scientificreports/ Chi-square tests or Fisher exact tests used appropriately. For adjusting the influence of confounding covariates like age, gender and habit on association of risk genotypes with oral cancer, multivariate binary logistic regression was performed to compute Odds Ratio and 95% C.I. with case-control affection status as dependent variable and SNP genotypes (AA vs AG + GG), age, sex and habit as independent variables using SPSS 20.0. Habit-matched case-control analysis was performed for both the "Raw Areca-Nut only" and "Raw Areca-Nut+Tobacco" groups to understand the independent association of SNP genotypes with the risk of development of oral cancer compared to controls when the influence of habit is not a probable confounder.

Immunohistochemistry (IHC).
A total of 38 normal oral squamous cell were classified according to their genotypes of GSTP1 having Ile/Ile, Ile/Val and Val/Val at 105 positions. These normal tissue samples were dehydrated, paraffin embedded and sectioned with a microtome (Leica). Briefly, after blocking for endogenous peroxidase activity, the sections were incubated with anti-c-Jun phosphorylation (ab32385; Abcam, USA) primary antibody. IHC analysis was performed with a Strept-Avidin Biotin Kit (Dako). The scoring of immunohistochemical stains in each specimen was determined using a histological score (H) 48 . The H-score is computed on the basis of both extent and intensity of staining on the scale of 0, 1, 2 and 3, representing negative, weak, moderate and strong staining. Finally, the H-score had been obtained by multiplying the staining intensity by the percentage of positive cytoplasmic staining cells (Supplemental Information).
Quantitative real-time PCR. Total RNA was isolated with Trizol from tumor as well as normal tissue samples collected from each patient and then purified using the RNeasy Mini Kit (Qiagen) according to the manufacturer's protocol. Synthesis of cDNA was performed with 1 μg of total RNA from each sample using Quantiscript Reverse Transcriptase, Quantiscript RT-buffer and RT Primenr-mix of QuantiTect Reverse Transcription kit (Qiagen GmbH, Hilden, Germany) according to the manufacturer's protocol. Quantitative real-time PCR was performed using in 96-well optical reaction plates (Applied Biosystems, Darmstadt, Germany) using a StepOnePlus amplification and detection system (Applied Biosystems). The real-time RT-qPCR reactions were prepared using SYBR ® Select Master Mix (Life Technologies), and the following conditions were used: 95 °C for 5 min, 40 cycles of 95 °C for 30 s, 60 °C for 30 s and 72 °C for 30 s. The primers of target genes used for this analysis were Bim and PUMA, and GAPDH was used as the reference gene. The primer sequences are listed in Table S3. The gene copy numbers of Bim and PUMA were calculated by using a standard curve that was constructed using the OE33 cell line. The 2 −ΔΔCT method was used as a relative quantification strategy for qPCR data analysis. In total, 20 samples from Ile/Ile, 12 from Ile/Val and 6 from Val/Val genotype were used in this study.

8-OHdG measurement.
Measurement of 8-Hydroxydeoxyguanosine (8-OHdG), a known marker of oxidative stress-mediated DNA damage, was estimated with OxiSelect ™ Oxidative DNA damage ELISA-kit, Cell Biolabs Inc. (San Diego, CA) in DNA of blood lymphocytes and tumor tissues from the cancer patients having the habit of chewing RAN with and without tobacco. DNA were extracted from blood and tumor tissue of the same patient (12 from Ile/Ile, 12 from Ile/Val and 6 from Val/Val) and digested with nuclease P1 and calf intestinal phosphatase (Sigma, USA) and denatured. 8-OHdG was quantified by quantitative ELISA assay in 96-well plate format. The quantity of 8-OHdG in the specimens were determined by comparing its absorbance with known 8-OHdG standard curve.
Mutant protein modelling and structure validation. A graphical program for computational aided protein engineering, TRITON has been used for modelling GSTP1 protein mutant 49 and energy minimization for 3D structures was performed with GROMOS 4.5.4 package using OPLS (Optimised Potential for liquid simulation) force field 50 . The functional form of the OPLS force filled is very similar to that of the Amber and is represented by

ngles d ihedrals nonbonded
Protein-protein docking. In this study, the docking analysis of JNK with wild and mutant GSTP1 was initiated using Hex8.0.0 program 51 for automated comparative protein-protein docking. In this study JNK was treated as receptor, while GSTP1 wild and GSTP1 mutated proteins were treated as ligand. Further, Protein-protein docking was also carried with HADDOCK tool since it provides full flexibility to protein side chains. HADDOCK is an information-driven flexible docking approach for the modelling of biomolecular complexes.

Molecular dynamic simulation of wild and mutant complex.
To study the dynamic behaviour of the protein, molecular dynamic simulation of both wild and mutant protein complex was performed. The docked complexes of JNK with native and mutant GSTP1 generated by Hex were used as a starting point for molecular dynamic simulation which was carried out with GROMACS 4.5.4 package using OPLS force field. The details methodology is mentioned in the Supplemental Information.

Comparative catalytic activity of wild and mutant protein by docking studies
Protein and ligand preparation. X-ray crystallographic structure of wild protein (19GS) was obtained from protein data bank, and the mutant protein was modeled using TRITON. GSTs bind and detoxify substrate in their dimeric form and therefore only protein dimers have been selected in this study. The carcinogenic/toxic compounds from areca nut and tobacco have been used to study their binding affinity for homodimeric wild (GSTP1 Ile/Ile), homodimeric mutant (GSTP1 Val/Val) and heterodimeric mutant (GSTP1 Ile/Val) proteins. CASTp server 52 was used for the identification of active site of the proteins. The first step of catalytic mechanism is