Introduction

In Taiwan, oral cancer (including sub-sites in the oral cavity, oropharynx and hypopharynx) is the fourth most common cancer in men1. The primary treatment for oral cavity squamous cell carcinoma (OSCC) is radical surgery with or without post-operative chemoradiation2. However, for patients with inoperable or recurrent disease or distant metastasis, treatment options are limited and prognosis is usually poor. Recent findings have indicated that epidermal growth factor receptor (EGFR) and its signal transduction pathway play an important role in head and neck cancer in Taiwan, including areca quid (AQ) associated OSCC3. Overexpression of EGFR has been confirmed to occur in AQ associated OSCC and has been reported to be associated with poor prognosis3,4,5. Treatment with an anti-EGFR agent has been reported to improve outcome compared to radiotherapy alone in head and neck cancers6. However, the levels of EGFR protein expression were found not to be consistently correlated with treatment response.

EGFR protein overexpression has primarily been attributed to increased transcriptional activity as well as to increases in EGFR copy number7. Basal transcription of the EGFR gene is regulated by the Sp1 transcription factor; in this context the CA repeat genotype of intron 1 (rs11568315) has been shown to contribute to different levels of transcriptional activity8, 9. Etienne-Grimaldi et al.9 have reported that the number of CA repeats is inversely correlated with protein expression in human tumors, including head and neck cancer. However, they were unable to confirm that the number of CA repeats had a significant influence on EGFR expression in a later study10. This contradictory result may be mainly due to the complexity of head and neck cancer, which comprises cancers from a number of different anatomical sites. Thus, whether there is a relationship between the intron 1 CA repeat genotype and protein expression in head and neck cancer is still unresolved.

OSCC is the major head and neck cancer in Taiwan, and the mechanisms regulating levels of EGFR protein expression in OSCC are not fully understood. We have previously shown that EGFR genetic mutations play a very minor role in OSCCs, whereas gene copy number was found to be significantly correlated with EGFR protein overexpression4. However, the role of the patient’s EGFR intron 1 CA repeat genotype in OSCC has rarely been explored11. Because the EGFR intron 1 CA repeat genotype is known to be associated with the gene’s transcriptional activity, it has been implicated in cancer risk and in patient clinical outcome12. In this study, we comprehensively investigated the effects of EGFR CA repeat genotype on OSCC risk and protein overexpression, and evaluated its prognostic role.

Methods and Materials

Patients, tissue specimens and clinical diagnosis

This study was approved by the Institutional Review Board, Chang Gung Medical Foundation, and informed consent was obtained from all subjects. The methods were carried out in accordance with the relevant guidelines. A total of 194 male OSCC patients who received primary radical surgery at Chang Gung Memorial Hospital, Lin-Kuo, between March 1997 and June 2004 were recruited. All patients gave written informed consent before surgery, and all cases were confirmed by histology. For each case, 10 ml of venous blood was drawn and separated into plasma, buffy coat cells and red blood cells by centrifugation within 18 h of collection; the buffy coat cells were then stored at −80 °C. Genomic DNA for EGFR intron 1 CA repeat genotyping was purified from the buffy coat cells as described previously13. As referent controls, 1444 randomly selected Taiwanese males, whose blood was originally collected to study blood lead concentrations, were also included14.

Fluorescence in situ hybridization (FISH) assay to assess EGFR gene copy number

EGFR gene copies were investigated by FISH using the LSI EGFR SpectrumOrange/CEP 7 SpectrumGreen probe system (Vysis; Abbott Laboratories, Downers Grove, IL) as described previously4. At least 100 non-overlapping nuclei per case were scored independently by two observers. The FISH patterns were classified into three levels based on the copy number of EGFR genes per cell, as described in previous studies4, 15, 16: normal disomy, with ≤2 copies in more than 90% of the analyzed cells; low amplification/polysomy (LA/Poly), with ≥3 copies in more than 40% of the analyzed cells; and gene amplification, defined by the presence of tight EGFR gene clusters in ≥10% of the analyzed cells.
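The three-level FISH classification can be sketched as a small decision rule. This is a minimal illustration, not the scoring procedure actually used: the function name, the per-cell input format, the order in which the criteria are checked (amplification first) and the "unclassified" fallback for cases that meet none of the stated thresholds are all assumptions.

```python
from typing import Sequence


def classify_fish_pattern(copies_per_cell: Sequence[int],
                          clustered: Sequence[bool]) -> str:
    """Classify one case from per-nucleus FISH scores.

    copies_per_cell: EGFR signal count for each scored nucleus.
    clustered: True where a nucleus shows a tight EGFR gene cluster.
    The checking order and the 'unclassified' fallback are assumptions.
    """
    n = len(copies_per_cell)
    if n == 0 or n != len(clustered):
        raise ValueError("need one copy count and one cluster flag per cell")

    frac_clusters = sum(clustered) / n
    frac_gain = sum(c >= 3 for c in copies_per_cell) / n
    frac_disomy = sum(c <= 2 for c in copies_per_cell) / n

    if frac_clusters >= 0.10:   # tight clusters in >=10% of cells
        return "amplification"
    if frac_gain > 0.40:        # >=3 copies in more than 40% of cells
        return "LA/Poly"
    if frac_disomy > 0.90:      # <=2 copies in more than 90% of cells
        return "disomy"
    return "unclassified"       # thresholds as stated leave a gap


# Hypothetical case: 100 nuclei, half of them with four EGFR signals.
print(classify_fish_pattern([4] * 50 + [2] * 50, [False] * 100))
```

Note that the criteria as written do not cover every possible case (for example, 20% of cells with three copies and no clusters), which is why the sketch returns a separate label rather than forcing a category.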

Immunohistochemical Analysis of EGFR protein overexpression

Immunohistochemical staining for EGFR protein was performed using anti-EGFR monoclonal antibody NCL-EGFR-384 (1:100) (Novocastra, Newcastle, UK) as described previously17. Normal skin, known to be EGFR positive, served as both positive (primary antibody added) and negative (no primary antibody) controls. The specimens were examined for the extent and intensity of nuclear and non-nuclear staining by the pathologist (W.-Y.C.) in a blinded manner and scored according to the following criteria: 0, no discernible staining or background type staining; 1+, equivocal discontinuous membrane staining; 2+, unequivocal membrane staining with moderate intensity; and 3+, strong and complete plasma membrane staining. In the present study, a specimen was considered to show EGFR overexpression when more than 25% of the cells had EGFR membrane staining with an intensity score of 2+ or 3+15, 17, 18.
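The overexpression call reduces to a single threshold on the fraction of cells scored 2+ or 3+. The sketch below assumes a hypothetical input format (a mapping from intensity score to the number of cells with that score); the function name is likewise illustrative.

```python
from typing import Mapping


def is_egfr_overexpressed(cells_by_score: Mapping[int, int]) -> bool:
    """Return True when more than 25% of cells are scored 2+ or 3+.

    cells_by_score maps an intensity score (0, 1, 2 or 3) to the number
    of cells with that score; this input format is an assumption.
    """
    total = sum(cells_by_score.values())
    if total <= 0:
        raise ValueError("no cells were scored")
    strong = cells_by_score.get(2, 0) + cells_by_score.get(3, 0)
    return strong / total > 0.25


# Hypothetical specimen: 30 of 100 cells scored 2+ or 3+.
print(is_egfr_overexpressed({0: 50, 1: 20, 2: 20, 3: 10}))
```

The comparison is strict, so a specimen with exactly 25% of cells at 2+ or 3+ is not called overexpressing, in line with the "more than 25%" wording.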

EGFR intron 1 CA repeats genotyping

The procedure for analysis of the EGFR intron 1 CA repeat length polymorphism was modified from previous reports11, 19, 20. Briefly, fluorescein-labeled forward primer 5′-FAM-GTTTGAAGAATTTGAGCCAACC-3′ and reverse primer 5′-GTCTGCACACTTGGCACACT-3′ were used for the PCR reaction, which began with initial heating for 12 min at 95 °C, followed by 30 cycles of denaturation at 94 °C for 30 s, annealing at 60 °C for 60 s, and extension at 72 °C for 60 s. The fragment length of the amplified PCR products, based on the 500 LIZ size standards, was determined using the ABI Prism 3100 DNA Analyzer with GeneScan software (Applied Biosystems, Foster City, CA). According to the NCBI Build 36.1 reference sequence, the PCR product is predicted to be 116 bp with 16 CA repeats. Homozygous samples were randomly selected for direct sequencing to verify the CA repeat number and were also used as the internal control for the GeneScan analysis. The primers used for direct sequencing of the CA repeat number were: forward primer 5′-AGAGCTCATCCTGGCCAAC-3′ and reverse primer 5′-GCTCAAGGTTGGAATTGTGC-3′.
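Given the reference point of 116 bp for 16 CA repeats, a fragment length can be converted to a repeat number by counting 2 bp per repeat. The sketch below is illustrative only: it assumes that measured sizes map linearly onto the reference and can simply be rounded to the nearest base pair, which real capillary sizing does not guarantee (hence the sequenced homozygous controls described above). The function and constant names are hypothetical.

```python
REFERENCE_LENGTH_BP = 116   # predicted product size (NCBI Build 36.1)
REFERENCE_REPEATS = 16      # CA repeats in the reference product
BP_PER_REPEAT = 2           # one CA dinucleotide


def ca_repeats_from_fragment(length_bp: float) -> int:
    """Convert a sized PCR fragment to a CA repeat number.

    Assumes sizing is linear and that rounding to the nearest base pair
    is adequate; both are simplifications for illustration.
    """
    offset = round(length_bp) - REFERENCE_LENGTH_BP
    if offset % BP_PER_REPEAT != 0:
        raise ValueError(
            f"{length_bp} bp is not a whole number of CA repeats from the reference"
        )
    repeats = REFERENCE_REPEATS + offset // BP_PER_REPEAT
    if repeats < 0:
        raise ValueError("fragment is shorter than the flanking sequence")
    return repeats


# A 124 bp fragment is four repeats longer than the reference.
print(ca_repeats_from_fragment(124))
```

Under these assumptions, the 10 to 24 repeat range reported in the Results corresponds to fragments of 104 to 132 bp.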

Statistical analysis

Statistical analyses were performed using the SPSS statistical package (SPSS, Chicago, IL). The correlations between the EGFR intron 1 CA repeat genotype and age, cigarette smoking, alcohol drinking, AQ chewing, EGFR protein overexpression and clinicopathological parameters were examined by χ2 test or Fisher’s exact test as appropriate. Survival curves were constructed by the Kaplan-Meier method and compared using the log-rank test. The Cox regression model was applied to adjust simultaneously for all potential prognostic variables, including age and lymph node metastasis. A two-sided value of p < 0.05 was considered statistically significant.
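For readers unfamiliar with the Kaplan-Meier method named above, the product-limit estimate can be sketched in a few lines. The analyses in this study were run in SPSS; the function below is an independent illustration, and the follow-up times and event flags in the usage example are hypothetical, not study data.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times: follow-up time for each patient.
    events: True if the event (e.g. death) occurred, False if censored.
    Returns (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    at_risk = len(times)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # events plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up (months); the second patient is censored.
for t, s in kaplan_meier([1, 2, 3, 4], [True, False, True, True]):
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their last follow-up but do not lower the survival estimate themselves, which is what distinguishes this estimate from a simple proportion of survivors.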

Results

EGFR intron 1 dinucleotide CA repeats polymorphism and OSCC risk

We studied a total of 194 OSCC patients and 1444 referent control individuals (Supplementary Table 1). Twelve different alleles, with CA repeat lengths ranging from 10 to 24, were observed. The most common allele in both referent controls and OSCC patients was 20 CA repeats, followed by 16 and 15 CA repeats. As illustrated in Supplementary Figure 1A, the allelic distributions in referent controls and OSCC patients were similar. The most common genotypes in referent controls were 20/20 (26.45%, 382/1444), 16/20 (20.01%, 289/1444) and 15/20 (9.56%, 138/1444), while the most common genotypes in OSCC cases were 20/20 (26.80%, 52/194), 16/20 (21.13%, 41/194) and 19/20 (7.73%, 15/194) (Supplementary Figure 1B). The distribution of CA repeat genotypes was not significantly different between the OSCC patients and the referent controls (p = 0.09).

To assess the association between the EGFR intron 1 polymorphism and OSCC risk, the number of CA repeats (range: 10–24) in each allele was dichotomized at the sample median (20). Alleles with <20 CA repeats were named the short (S) form and alleles with ≥20 CA repeats the long (L) form. Overall, the SS genotype was marginally associated with an increased OSCC risk (odds ratio (OR) = 1.40; 95% confidence interval (CI), 0.95–2.05; p = 0.08). When stratified by the major risk factors for OSCC, the SS genotype was significantly associated with an increased OSCC risk among AQ chewers (OR = 1.70; 95% CI, 1.04–2.76; p = 0.03) (Table 1). Since the mean age of the OSCC patients was 49.28 years (standard deviation (SD) = 11.34) and that of the referent controls was 46.04 years (SD = 16.68), we used unconditional multivariate logistic regression to adjust for age as a potential confounding variable. Individuals with the SS genotype were still found to have a significantly higher OSCC risk than those with either the LL or LS genotype (OR = 1.65; 95% CI, 1.01–2.70; p = 0.05), especially among AQ chewers (Table 1).
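The genotype coding and an unadjusted odds ratio can be sketched as follows. The S/L cutoff of 20 repeats comes from the text; everything else is an assumption made for illustration: the function names, the use of a Wald (Woolf) interval on the log odds ratio, and the 2×2 counts in the usage example, which are hypothetical because the counts behind Table 1 are not reproduced here.

```python
import math
from typing import Tuple


def classify_genotype(allele_a: int, allele_b: int, cutoff: int = 20) -> str:
    """Code a pair of CA repeat counts as 'SS', 'LS' or 'LL'.

    Alleles with fewer than `cutoff` repeats are short (S), others long (L).
    """
    forms = sorted("S" if n < cutoff else "L" for n in (allele_a, allele_b))
    return "".join(forms)


def odds_ratio_wald(exposed_cases: int, unexposed_cases: int,
                    exposed_controls: int, unexposed_controls: int
                    ) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a 95% Wald confidence interval.

    The Wald interval is an assumed method, chosen for simplicity.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, lower, upper


print(classify_genotype(16, 20))   # one short and one long allele
print(classify_genotype(15, 16))   # two short alleles

# Hypothetical counts: SS vs. LL/LS among cases and controls.
print(odds_ratio_wald(20, 80, 10, 90))
```

An age-adjusted estimate such as the one reported above requires logistic regression with age as a covariate and cannot be recovered from a single 2×2 table.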

Table 1 Associations between EGFR CA repeat genotype and OSCC risk.

EGFR protein overexpression, the genotype of the CA repeats and OSCC clinicopathological factors

The genotype of the EGFR CA repeats of the OSCC tumors was found not to be associated with gains in the copy number (both low amplification/polysomy and amplification) of EGFR gene or with protein overexpression (Table 2). As reported previously3, 4, there was a significant association between a gain of EGFR gene copy number and protein overexpression in Taiwanese OSCC tumors (data not shown). To rule out the effect on protein overexpression of this increase in copy number of the EGFR gene, only those OSCC tumors with disomy of the EGFR gene were included in the further analysis.

Table 2 The relationship between EGFR CA repeat genotype, copy number and protein overexpression.

In this subgroup, EGFR protein overexpression was found to be significantly associated with poor differentiation of the tumor cells (p = 0.003) and lymph node metastasis, especially extra-capsular spread (ECS) (p = 0.03) (Table 3). On the other hand, the tumor aggressiveness factors, including bone invasion, skin invasion and perineural invasion, were not related to EGFR protein overexpression (Table 3). Interestingly, OSCC patients without a history of alcohol drinking showed a higher frequency of EGFR protein overexpression than those who were alcohol drinkers. However, EGFR protein overexpression was not associated with either cigarette smoking or AQ chewing (Table 3).

Table 3 The associations between EGFR protein overexpression, EGFR CA repeat genotype and clinicopathological parameters among EGFR disomy OSCC patients (n = 135).

The patient’s EGFR CA repeat genotype was found not to be associated with tumor stage, tumor differentiation, lymph node metastasis or tumor aggressiveness factors, including skin, bone and perineural invasion (Table 3). Interestingly, AQ chewing, but not cigarette smoking or alcohol drinking, was significantly associated with the EGFR CA repeat genotype. The OSCC patients with the SS genotype were all AQ chewers (Table 3).

The prognostic implications of the EGFR CA repeat genotype and protein overexpression among OSCC patients with disomy of the EGFR gene

Using univariate analysis, EGFR protein overexpression was marginally associated with disease-free survival (DFS) (p = 0.07; hazard ratio (HR) = 1.59; 95% CI, 0.97–2.62) and overall survival (OS) (p = 0.07; HR = 1.60; 95% CI, 0.97–2.65) (Table 4). Patients with the EGFR CA repeat SS genotype had a worse DFS (p = 0.09; HR = 1.70; 95% CI, 0.92–3.13) and a worse OS (p = 0.03; HR = 1.92; 95% CI, 1.07–3.43). Furthermore, patients with both the EGFR CA repeat SS genotype and a tumor with EGFR protein overexpression had the worst prognosis in terms of both DFS (p = 0.002; HR = 4.11; 95% CI, 1.66–10.14) and OS (p = 0.01; HR = 3.25; 95% CI, 1.33–7.95) compared to those with either L-allele-containing CA repeat genotype and/or no EGFR protein overexpression in their tumor (Table 4, Fig. 1). After multivariate adjustment for age, primary tumor status, lymph node metastasis, tumor depth, and tumor cell differentiation, this relationship remained significant for DFS (p = 0.04; HR = 2.68; 95% CI, 1.03–6.98) but not for OS (p = 0.07; HR = 2.41; 95% CI, 0.95–6.15) (Table 5).

Table 4 Univariate analysis of the prognostic covariates for EGFR disomy OSCC patients (n = 135).
Figure 1

Kaplan-Meier analysis of the combined effect of the EGFR CA repeat genotype and protein overexpression on disease-free survival (A) and overall survival (B) of 135 Taiwanese male OSCCs with disomy of the EGFR gene.

Table 5 Multivariate Cox regression analysis of a combination of EGFR CA repeat genotype and protein overexpression among EGFR disomy OSCC patients (n = 135).

Discussion

It has been shown that the allelic distribution of the EGFR intron 1 CA repeats has interethnic variability14 and that this variability might help to explain the distinct features of EGFR amplification and protein overexpression in human cancers among certain populations21. The most frequent allele in Asians is the 20 repeat allele, while the 16 repeat allele is the most common among Caucasians. The allele frequencies of the CA repeats observed in the Taiwanese referent controls in this study (52.34% for the 20 repeat allele and 19.46% for the 16 repeat allele) are in agreement with the previous findings for Asians14, 21.

In vitro, EGFR transcription activity has been found to decline as the number of CA repeats increases, and this correlates with protein expression level in vivo 8. In addition, a higher number of CA repeats has been found to be correlated with a higher frequency of amplification of the EGFR gene in breast cancer cases21. In this study, we observed a gain of EGFR gene copy number in 30% of the OSCC tumors, and this frequency was only slightly increased in tumors from individuals with the CA repeat genotype compared to those with the SS genotype. However, our findings indicated that Taiwanese OSCCs have a significantly higher frequency of EGFR amplification compared to German oral cavity cancers (19.6% (38/194) vs. 11.5% (24/209)), when analyzed using the same probe and the same amplification criteria22. This result is consistent with an interethnic study that consisted of German and Japanese breast cancer cases21. Thus, there is clearly an interaction between the number of CA repeats and the frequency of EGFR amplification.
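The comparison of amplification frequencies between the Taiwanese and German series (38/194 vs. 24/209) can be checked with a pooled two-proportion z-test. The text does not state which test underlies the comparison, so the choice of test and the function name below are assumptions; only the four counts are taken from the text.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Amplification counts quoted in the text: Taiwanese vs. German tumors.
z, p = two_proportion_z_test(38, 194, 24, 209)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under this test the difference falls below the two-sided 0.05 threshold used in the study, although the two series differ in more than ethnicity, so the test alone does not establish the cause of the difference.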

The homozygous SS genotype of the EGFR intron 1 CA repeats has been found to be associated with an increased risk for glioma, breast cancer and lung cancer12, 23, 24. In the present study, we found that individuals with the SS genotype had a significantly higher OSCC risk than those with either of the L form genotypes (OR = 1.65; 95% CI, 1.01–2.70; p = 0.05), especially among AQ chewers. In contrast, Kang et al. have demonstrated that carriers of >16 CA repeats have a 1.9-fold increased risk of oral cancer in a Puerto Rican population13. Conversely, they also found that the risk tended to increase as the number of alleles within the ≥16 CA repeats decreased. These inconsistent findings indicate that the cutoff point used to distinguish short and long EGFR CA repeat alleles might have a significant effect on the interpretation of any results obtained. One major difficulty in investigating the effects of this polymorphism on protein expression in vivo is the wide distribution of CA repeat numbers, which leads to many possible heterozygous genotypes. Furthermore, there is as yet no clear model of how the two alleles interact to give rise to the final phenotype. In these circumstances, the relevance of this polymorphism to OSCC risk warrants further investigation.

It has been suggested that the EGFR CA repeat polymorphism might be a potential determinant of protein expression8, 9. However, two recent in vitro studies have indicated that there is no relationship between EGFR overexpression and the length of the CA repeats present25, 26. In addition, EGFR protein overexpression has been attributed to massive gene amplification25. Since EGFR protein overexpression, gene copy number and CA repeats have rarely been investigated simultaneously in human primary cancers, the relationship between the EGFR CA repeat polymorphism and protein expression in human cancers, including head and neck cancer, remains very controversial10. In the present analysis, we did not find an association between the CA repeat polymorphism and protein expression in OSCC tumors with disomy of the EGFR gene. However, it has been demonstrated that there is a significant association between a gain of EGFR gene copy number and protein overexpression in Taiwanese OSCC tumors3, 4, and thus the influence of the EGFR CA repeat polymorphism on protein expression would seem to be minimal in Taiwanese OSCC tumors.

Etienne-Grimaldi et al.9 reported that EGFR protein expression in head and neck cancer is an independent predictor of specific survival, while the CA repeat polymorphism is not an independent predictor of specific survival under the same circumstances. In the present analysis, we found that EGFR protein overexpression and the CA repeat genotype were marginally or significantly associated with DFS and OS by univariate analysis. In addition, patients with both the EGFR CA repeat SS genotype and a tumor with EGFR protein overexpression had the worst prognosis in terms of DFS (p = 0.002; HR = 4.11; 95% CI, 1.66–10.14) compared to those patients with the EGFR CA repeat LL/LS genotype and/or no EGFR protein overexpression, and this significant relationship persisted (p = 0.04; HR = 2.68; 95% CI, 1.03–6.98) after multivariate adjustment for age, primary tumor status, lymph node metastasis, tumor depth, and tumor cell differentiation. Although there was no significant association between the EGFR CA repeat polymorphism and protein overexpression, these two factors did have a synergistic influence on patients’ prognosis. From the present analysis, it appears that the EGFR CA repeat polymorphism may act synergistically with tumor EGFR expression level in predicting outcome among OSCC patients. It therefore has significant potential as a biomarker for risk stratification in OSCC. Future studies are needed to confirm our findings.