Epidermal growth factor receptor intron-1 CA repeat polymorphism on protein expression and clinical outcome in Taiwanese oral squamous cell carcinoma

This study was designed to explore the relationship between the epidermal growth factor receptor (EGFR) CA repeat polymorphism and protein expression in oral cavity squamous cell carcinoma (OSCC). A total of 194 OSCCs were examined for EGFR protein overexpression, gene copy number and the length of their CA repeats. The length of the EGFR CA repeats was not associated with EGFR gene copy number or with protein overexpression. To exclude the effect of EGFR gene copy number on protein overexpression, only those OSCC tumors with disomy of the EGFR gene were included in further analysis. In this subgroup, EGFR protein overexpression was significantly associated with poor differentiation of the tumor cells and lymph node metastasis, especially extra-capsular spread. However, EGFR CA repeats were not related to any clinicopathological factor. Interestingly, patients who carried the EGFR CA repeat SS genotype and whose tumors showed EGFR protein overexpression had a worse prognosis in terms of disease-free survival (DFS) (HR = 2.68; 95% CI, 1.03–6.98) after multivariate adjustment. The present study demonstrates that EGFR protein overexpression in patients carrying the SS CA repeat genotype is a predictor of poor DFS.

to confirm that the number of CA repeats had a significant influence on EGFR expression in a later study 10 . This contradictory result may be mainly due to the complexity of head and neck cancer, which is composed of cancers from a number of different anatomical sites. Thus, whether there is a relationship between the intron 1 CA repeat genotype and protein expression in head and neck cancer is still unresolved.
OSCC is the major head and neck cancer in Taiwan, and the mechanisms regulating levels of EGFR protein expression in OSCC are not fully understood. We have previously shown that EGFR genetic mutations play a very minor role in OSCCs, whereas gene copy number was found to be significantly correlated with EGFR protein overexpression 4 . However, the role of the patient's EGFR intron 1 CA repeat genotype in OSCC has rarely been explored 11 . Since the EGFR intron 1 CA repeat genotype is known to be associated with the gene's transcriptional activity, the CA repeat genotype has been implicated in cancer risk and in patient clinical outcome 12 . In this study, we comprehensively investigated the effects of the EGFR CA repeat genotype on OSCC risk and protein overexpression, and evaluated its prognostic role.

Methods and Materials
Patients, tissue specimens and clinical diagnosis. This study was approved by the Institutional Review Board, Chang Gung Medical Foundation. The committee approved the experiments, and informed consent was obtained from all subjects. The methods were carried out in accordance with the relevant guidelines. A total of 194 male OSCC patients who received primary radical surgery at Chang Gung Memorial Hospital, Lin-Kuo between March 1997 and June 2004 were recruited to participate in the study. All patients gave written informed consent for participation before surgery, and all cases were confirmed by histology. For each case, 10 ml of venous blood was drawn and separated into plasma, buffy coat cells and red blood cells by centrifugation within 18 h of collection; the buffy coat cells were then stored at −80 °C. Genomic DNA for EGFR intron 1 CA repeat genotyping was purified from the buffy coat cells as described previously 13 . As referent controls, 1444 randomly selected Taiwanese males, whose blood was originally collected to study their blood lead concentrations, were also included in this study 14 .

Fluorescence in situ hybridization (FISH) assay to assess EGFR gene copy number. EGFR gene copies were investigated by FISH using the LSI EGFR SpectrumOrange/CEP 7 SpectrumGreen probe system (Vysis; Abbott Laboratories, Downers Grove, IL) as described previously 4 . At least 100 non-overlapping nuclei per case were scored independently by two observers. The FISH patterns were classified into three levels based on the copy number of EGFR genes per cell, as described in previous studies 4,15,16 : normal disomy, with ≤2 copies in more than 90% of the analyzed cells; low amplification/polysomy (LA/Poly), with ≥3 copies in more than 40% of the analyzed cells; and gene amplification, defined by the presence of tight EGFR gene clusters in ≥10% of the analyzed cells.
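The three-level FISH classification described above can be sketched in code. This is a minimal illustration, not the scoring procedure actually used in the study: the function name, the input format (one copy count and one cluster flag per scored nucleus), the order in which the criteria are checked, and the "unclassified" fallback for cases that meet none of the stated thresholds are all assumptions.

```python
def classify_fish(copies, clustered):
    """Classify one tumor from per-nucleus FISH scores (hypothetical helper).

    copies    -- EGFR signal count for each scored nucleus
    clustered -- True for each nucleus showing tight EGFR gene clusters
    """
    n = len(copies)
    if n < 100 or n != len(clustered):
        raise ValueError("need at least 100 nuclei, with one cluster flag each")

    frac_clustered = sum(clustered) / n
    frac_gain = sum(c >= 3 for c in copies) / n
    frac_normal = sum(c <= 2 for c in copies) / n

    # Assumption: amplification is checked first, then LA/Poly, then disomy.
    if frac_clustered >= 0.10:
        return "amplification"
    if frac_gain > 0.40:
        return "LA/Poly"
    if frac_normal > 0.90:
        return "disomy"
    # Assumption: the text does not say how in-between cases were handled.
    return "unclassified"
```

For example, a case with 95 nuclei carrying two signals and 5 carrying three would be labeled "disomy" under these rules.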
Immunohistochemical analysis of EGFR protein overexpression. Immunohistochemical staining for EGFR protein was performed using the anti-EGFR monoclonal antibody NCL-EGFR-384 (1:100) (Novocastra, Newcastle, UK) as described previously 17 . Normal skin, known to be EGFR positive, served as both the positive control (primary antibody added) and the negative control (no primary antibody). The specimens were examined for the extent and intensity of nuclear and non-nuclear staining by the pathologist (W.-Y.C.) in a blinded manner and scored according to the following criteria: 0, no discernible staining or background type staining; 1+, equivocal discontinuous membrane staining; 2+, unequivocal membrane staining with moderate intensity; and 3+, strong and complete plasma membrane staining. In the present study, EGFR overexpression was defined as more than 25% of the cells showing EGFR membrane staining with an intensity score of 2+ or 3+ 15, 17, 18 .

EGFR intron 1 CA repeats genotyping. The procedure for analysis of the EGFR intron 1 CA repeat length polymorphism was modified from previous reports 11,19,20 . Briefly, the fluorescein-labeled forward primer 5′-FAM-GTTTGAAGAATTTGAGCCAACC-3′ and the reverse primer 5′-GTCTGCACACTTGGCACACT-3′ were used for the PCR, which began with initial heating for 12 min at 95 °C, followed by 30 cycles of denaturation at 94 °C for 30 s, annealing at 60 °C for 60 s, and extension at 72 °C for 60 s. The fragment length of the amplified PCR products was determined against the 500 LIZ size standard using the ABI Prism 3100 DNA Analyzer with GeneScan software (Applied Biosystems, Foster City, CA). According to the NCBI Build 36.1 reference sequence, the PCR product is predicted to be 116 bp with 16 CA repeats. Homozygous samples were randomly selected for direct sequencing to verify the CA repeat number and were also used as internal controls for the GeneScan analysis. The primers used for direct sequencing of the CA repeat number were: forward primer 5′-AGAGCTCATCCTGGCCAAC-3′ and reverse primer 5′-GCTCAAGGTTGGAATTGTGC-3′.
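The relationship between fragment length and CA repeat number implied by the reference product (116 bp with 16 CA repeats) can be sketched as follows. The function name and the rounding step are assumptions made for illustration. The sketch also assumes that each additional CA unit adds exactly 2 bp and that the sized fragment carries no systematic offset, which in practice is what the sequenced homozygous controls are there to check.

```python
REFERENCE_LENGTH_BP = 116  # predicted PCR product size (NCBI Build 36.1)
REFERENCE_REPEATS = 16     # CA repeats contained in that reference product


def ca_repeats_from_fragment(length_bp):
    """Estimate the CA repeat number from a sized PCR fragment (hypothetical helper).

    Assumes one CA unit = 2 bp and rounds to the nearest whole repeat.
    """
    extra_units = round((length_bp - REFERENCE_LENGTH_BP) / 2)
    return REFERENCE_REPEATS + extra_units
```

Under these assumptions a 124 bp fragment corresponds to 20 CA repeats, and the observed allele range of 10 to 24 repeats corresponds to fragments of about 104 to 132 bp.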
Statistical analysis. Statistical analyses were performed using the SPSS statistical package (SPSS, Chicago, IL). The correlations between the EGFR intron 1 CA repeat genotype and age, cigarette smoking, alcohol drinking, areca quid (AQ) chewing, EGFR protein overexpression and clinicopathological parameters were examined by the χ² test or Fisher's exact test, as appropriate. Survival curves were constructed by the Kaplan-Meier method and compared using the log-rank test. The Cox regression model was applied to adjust simultaneously for all potential prognostic variables, including age and lymph node metastasis. A two-sided p value of < 0.05 was considered statistically significant.
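For readers unfamiliar with the Kaplan-Meier method named above, the product-limit estimate can be sketched in a few lines. This is an illustration only: the analyses in the study were run in SPSS, and the function name and input format here are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate (illustrative sketch).

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g. recurrence) occurred, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        n_at_risk -= n_leaving
    return curve
```

With four hypothetical patients followed for 1, 2, 3 and 4 time units, where the second is censored, the estimate drops to 0.75, then 0.375, then 0. Censored patients contribute to the risk set up to their last follow-up but do not produce a step in the curve.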

EGFR intron 1 dinucleotide CA repeats polymorphism and OSCC risk.
We studied a total of 194 OSCC patients and 1444 referent control individuals (Supplementary Table 1). Twelve different alleles, with CA repeat lengths ranging from 10 to 24, were observed. The most common allele in both referent controls and OSCC patients was 20 CA repeats, followed by 16 and 15 CA repeats. The distribution of CA repeat genotypes (Supplementary Figure 1B) was not significantly different between the OSCC patients and the referent controls (p = 0.09).
To assess the association between the EGFR intron 1 polymorphism and OSCC risk, the number of CA repeats in each allele (range: 10-24) was dichotomized at the sample median (20). Alleles with <20 CA repeats were named the short (S) form and alleles with ≥20 CA repeats the long (L) form. Overall, the SS genotype showed a borderline association with an increased OSCC risk (odds ratio (OR) = 1.40; 95% confidence interval (CI), 0.95-2.05; p = 0.08). When stratified by the major risk factors for OSCC, the SS genotype was significantly associated with an increased OSCC risk among AQ chewers (OR = 1.70; 95% CI, 1.04-2.76; p = 0.03) (Table 1). Since the mean age of the OSCC patients was 49.28 (standard deviation (SD) = 11.34) years and that of the referent controls was 46.04 (SD = 16.68) years, we used unconditional multivariate logistic regression to adjust for age as a potential confounder. Individuals with the SS genotype were still found to have a significantly higher OSCC risk than those with either the LL or LS genotype (OR = 1.65; 95% CI, 1.01-2.70; p = 0.05), especially among AQ chewers (Table 1).

EGFR protein overexpression, the genotype of the CA repeats and OSCC clinicopathological factors.
The genotype of the EGFR CA repeats of the OSCC tumors was not associated with gains in the copy number of the EGFR gene (both low amplification/polysomy and amplification) or with protein overexpression (Table 2). As reported previously 3,4 , there was a significant association between a gain of EGFR gene copy number and protein overexpression in Taiwanese OSCC tumors. To exclude the effect on protein overexpression of this increase in copy number of the EGFR gene, only those OSCC tumors with disomy of the EGFR gene were included in the further analysis. In this subgroup, EGFR protein overexpression was significantly associated with poor differentiation of the tumor cells (p = 0.003) and lymph node metastasis, especially extra-capsular spread (ECS) (p = 0.03) (Table 3). On the other hand, the tumor aggressiveness factors, including bone invasion, skin invasion and perineural invasion, were not related to EGFR protein overexpression (Table 3). Interestingly, OSCC patients without a history of alcohol drinking showed a higher frequency of EGFR protein overexpression than those who were alcohol drinkers. However, EGFR protein overexpression was not associated with either cigarette smoking or AQ chewing (Table 3).
The patient's EGFR CA repeat genotype was found not to be associated with tumor stage, tumor differentiation, lymph node metastasis or tumor aggressiveness factors, including skin, bone and perineural invasion (Table 3). Interestingly, AQ chewing, but not cigarette smoking or alcohol drinking, was significantly associated with the EGFR CA repeat genotype. The OSCC patients with the SS genotype were all AQ chewers (Table 3).
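The short/long allele categorization and the kind of odds ratio reported in the risk analysis above can be sketched as follows. The function names are assumptions. The interval shown is a crude (unadjusted) Woolf-type confidence interval, not the age-adjusted logistic regression used in the study, and the 2×2 counts in the usage line are hypothetical rather than taken from Table 1.

```python
import math


def genotype(allele1, allele2, cutoff=20):
    """Label a pair of CA repeat counts as 'SS', 'LS' or 'LL'.

    Alleles below the cutoff (the sample median, 20) are short (S);
    alleles at or above it are long (L).
    """
    forms = sorted("S" if a < cutoff else "L" for a in (allele1, allele2))
    return "".join(forms)


def odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf-type 95% CI for a 2x2 table.

    a = cases with the genotype,    b = cases without it
    c = controls with the genotype, d = controls without it
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts, for illustration only (not data from this study).
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
```

Because the interval is symmetric on the log scale, the point estimate is the geometric mean of its limits. The reported values behave the same way: for example, the square root of 1.01 × 2.70 is about 1.65.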

Discussion
It has been shown that the allelic distribution of the EGFR intron 1 CA repeats has interethnic variability 14 and that this interethnic variability might help to explain the distinct features of EGFR amplification and protein overexpression in human cancers among certain populations 21 . The most frequent allele in Asians is the 20 repeat allele, while the 16 repeat allele is the most common among Caucasians. The allele frequencies of the CA repeats observed in the Taiwanese referent controls in this study (52.34% for the 20 repeat allele and 19.46% for the 16 repeat allele) are in agreement with the previous findings for Asians 14,21 .
In vitro, EGFR transcription activity has been found to decline as the number of CA repeats increases, and this then correlates with protein expression level in vivo 8 . In addition, a higher number of CA repeats has been found to be correlated with a higher frequency of amplification of the EGFR gene in breast cancer cases 21 . In this study, we observed a gain of EGFR gene copy number in 30% of the OSCC tumors, and this frequency was only slightly increased in tumors from individuals with the CA repeat genotype compared to those with the SS genotype. However, our findings indicated that Taiwanese OSCCs have a significantly higher frequency of EGFR amplification compared to German oral cavity cancers (19.6% (38/194) vs. 11.5% (24/209)), when analyzed using the same probe and the same amplification criteria 22 . This result is consistent with an interethnic study of German and Japanese breast cancer cases 21 . Thus, there appears to be an interaction between the number of CA repeats and the frequency of EGFR amplification.

The homozygous SS genotype of the EGFR intron 1 CA repeats has been found to be associated with an increased risk for glioma, breast cancer and lung cancer 12,23,24 . In the present study, we found that individuals with the SS genotype had a significantly higher OSCC risk than those with either of the L form genotypes (OR = 1.65; 95% CI, 1.01-2.70; p = 0.05), especially among AQ chewers. In contrast, Kang et al. have demonstrated that carriers of >16 CA repeats have a 1.9-fold increased risk of oral cancer among a Puerto Rican population 13 . Conversely, they also found that the risk tended to increase as the number of alleles within the ≥16 CA repeats decreased. These inconsistent findings indicate that the cutoff point used to distinguish short and long EGFR CA repeat alleles might have a significant effect on the interpretation of any results obtained.
One major difficulty in investigating the effects of this polymorphism on protein expression in vivo is the wide distribution of CA repeat numbers, which leads to many possible heterozygous genotypes. Furthermore, there is as yet no clear model of how the two alleles interact to give rise to the final phenotype. In these circumstances, the relevance of this polymorphism to OSCC risk warrants further investigation. It has been implied that the EGFR CA repeats polymorphism might be a potential determinant of protein expression 8,9 . However, two recent in vitro studies have indicated that there is no relationship between EGFR overexpression and the length of the CA repeats present 25,26 . In addition, EGFR protein overexpression has been attributed to massive gene amplification 25 . Since EGFR protein overexpression, gene copy number and CA repeats have rarely been investigated simultaneously in human primary cancers, the relationship between the EGFR CA repeats polymorphism and protein expression in human cancers, including head and neck cancer, remains very controversial 10 . In the present analysis, we found no association between the CA repeats polymorphism and protein expression in OSCC tumors with disomy of the EGFR gene. However, it has been demonstrated that there is a significant association between a gain of EGFR gene copy number and protein overexpression in Taiwanese OSCC tumors 3,4 , and thus the influence of the EGFR CA repeats polymorphism on protein expression would seem to be minimal in Taiwanese OSCC tumors.
Etienne-Grimaldi et al. 9 reported that EGFR protein expression in head and neck cancer is an independent predictor of specific survival, while the CA repeats polymorphism is not an independent predictor of specific survival under the same circumstances. In the present analysis, we found that EGFR protein overexpression and CA repeats were slightly or significantly associated with DFS and overall survival (OS) in univariate analysis. In addition, patients with the EGFR CA repeat SS genotype and a tumor with EGFR protein overexpression had a worse prognosis in terms of DFS (p = 0.002; HR = 4.11; 95% CI, 1.66-10.14) than patients with the EGFR CA repeat LL/LS genotype and/or no EGFR protein overexpression, and this relationship remained significant (p = 0.04; HR = 2.68; 95% CI, 1.03-6.98) after multivariate adjustment for age, primary tumor status, lymph node metastasis, tumor depth, and tumor cell differentiation. Although there was no significant association between the EGFR CA repeats polymorphism and protein overexpression, these two factors did have a synergistic influence on patients' prognosis. From the present analysis, it appears that the EGFR CA repeat polymorphism may act synergistically with tumor EGFR expression level in predicting outcome among OSCC patients. It therefore has significant potential as a biomarker for risk stratification in OSCC. Future studies are needed to confirm these findings.
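As a reading aid, the internal consistency of the hazard ratios quoted above can be checked by recovering an approximate standard error and p value from each confidence interval. The sketch assumes a Wald-type interval that is symmetric on the log scale, which the text does not state explicitly, and the function name is hypothetical.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Recover the SE of log(HR), the z statistic and a two-sided p value.

    Assumes the 95% CI was built as exp(log(HR) +/- z_crit * SE).
    """
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se_log
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se_log, z, p


# Values quoted in the text: univariate, then multivariate-adjusted.
_, _, p_univariate = p_from_ci(4.11, 1.66, 10.14)
_, _, p_adjusted = p_from_ci(2.68, 1.03, 6.98)
```

Under this assumption the recovered p values are roughly 0.002 and 0.04, in line with the reported figures.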