Introduction

Interstitial lung disease (ILD) refers to a broad range of chronic lung disorders with diverse pathogenesis and complex histopathology, together accounting for 15% of respiratory care practice1. Most entities manifest as epithelial injury, followed by fibroblastic proliferation and the development of fibroblastic foci with exuberant matrix deposition, the typical hallmarks of pulmonary fibrosis2,3. Over two-thirds of ILD cases have no known cause and are thus termed idiopathic interstitial pneumonia (IIP). Although the incidence of ILD in the US is low (approximately 30 cases per 100,000 persons per year), the disease can be progressive and fatal: the mean survival time of ILD patients is only about 3 years4. The etiology and pathogenesis of most ILD entities remain unknown, which greatly hampers the development of therapeutics for the disease. To date, no proven drug therapy has been recognized for most entities5,6.

It is now widely accepted that the development of ILD has a strong genetic basis. Substantial evidence demonstrates that ILD is a heritable complex disease determined by genetic factors with the involvement of environmental stimuli, such as tobacco smoke1,2,7,8. Family-based studies have been conducted to identify genes predisposing to ILD, and causal mutations have been identified in several genes, e.g. the telomerase reverse transcriptase gene (TERT), the telomerase RNA component gene (TERC), and the surfactant protein A2 (SPA2) and C (SPC) genes7,8,9,10,11. More recently, a few genome-wide association studies (GWAS) have identified a number of single nucleotide polymorphisms (SNPs) located at or close to the TERT12, TERC13, MUC5B14,15,16, FAM13A13, DSP13, OBFC113, ATP11A13, DPP913, TOLLIP15 and SPPL2C15 genes that are significantly (P < 5 × 10^−8) associated with idiopathic pulmonary fibrosis (IPF) and/or IIP as an overall phenotype. However, these polymorphisms together were estimated to account for only about one third of the risk of IIP, suggesting that additional genetic components have yet to be identified13.

The epidermal growth factor receptor (EGFR) is a tyrosine kinase receptor for various growth factors including EGF (epidermal growth factor), TGF-α (transforming growth factor-α) and other EGF-like ligands. The EGFR pathway plays an important role in pulmonary physiology, especially in the function of epithelial cells, via signal transduction that regulates key cellular processes such as self-renewal, wound healing, proliferation, survival, adhesion, migration and differentiation. EGFR inhibitors have been widely used in the treatment of non-small cell lung cancer (NSCLC). However, ILD has been consistently reported as one of the uncommon but severe adverse reactions to EGFR inhibitors17,18,19,20,21,22. A strong association between the incidence of ILD and anti-EGFR treatment has been reported in a large case-cohort study that included over 4,000 subjects. The study showed a 3.23-fold increase in the risk of ILD in patients who received gefitinib compared with those who underwent conventional chemotherapy22. Furthermore, significant inter-ethnic differences in the incidence of ILD in patients treated with EGFR inhibitors have been consistently observed. According to the U.S. Food and Drug Administration (FDA), an overall ILD incidence of 1% was demonstrated in 50,005 patients receiving gefitinib, including 18,960 patients from Japan and 23,000 from the U.S. Interestingly, the incidence of ILD was higher in Japanese patients (1.7%) than in patients from the U.S. (0.3%). There was also a significant difference in the median time to onset (TTO) of ILD between Japanese and U.S. patients: the TTO was about 24 days in the former but around 42 days in the latter23. These findings have been confirmed in other independent studies23. Taken together, these observations suggest that certain genetic factors related to the EGFR pathway may confer susceptibility to ILD in general.
To test this hypothesis, we examined in this study the genetic association between functional polymorphisms in the EGFR, EGF and TGFA genes and ILD. These polymorphisms have previously been demonstrated to alter gene expression, function or other related phenotypes in our own and others' studies24,25,26,27,28.

Methods

Ethics statement

Research conducted in this study was performed on samples from anonymous adult individuals without any intervention in patients and is therefore not considered to involve ‘human subjects.’ Samples were collected with written informed consent obtained from participants and with the approval of the institutional review boards (IRBs) at the Lung Tissue Research Consortium (LTRC, http://www.ltrcpublic.com) and The University of Chicago. The study was carried out in accordance with the guidelines approved by the Purdue University IRB (approval number 1307013815) and was in compliance with the Declaration of Helsinki.

Study population

DNA extracted from the peripheral blood of ILD patients (n = 227) was obtained from the Lung Tissue Research Consortium (http://www.ltrcpublic.com). All patients were diagnosed with ILD in accordance with the American Thoracic Society/European Respiratory Society International Multidisciplinary Consensus Classification of the Idiopathic Interstitial Pneumonias29, with well-documented clinical data, computed tomography (CT) scans and pathological review of lung biopsies available for all patients. Cases with a known cause for the disease were excluded. The DNA samples came from patients who had been diagnosed with idiopathic pulmonary fibrosis (IPF) (n = 84), non-specific interstitial pneumonia (NSIP) (n = 27), desquamative interstitial pneumonia (DIP) (n = 9), respiratory bronchiolitis-interstitial lung disease (RB-ILD) (n = 22), cryptogenic organizing pneumonia (COP) (n = 10), hypersensitivity pneumonitis (HP) (n = 8), lymphocytic interstitial pneumonia (LIP) (n = 1), acute interstitial pneumonia (AIP) (n = 1) and uncharacterized fibrosis (UF) (n = 67). Control DNA samples (n = 693) were collected and provided by the Translational Research Initiative in the Department of Medicine (TRIDOM) program at the University of Chicago. These samples came from patients regularly visiting the clinic, excluding individuals with any respiratory symptoms according to the ICD-9 classification. All control patients were self-reported Caucasians. Demographic data for ILD patients and controls are summarized in Table 1.

Table 1 Demographic and covariate data for ILD patients and controls

Genotyping

The selected polymorphisms, three in EGFR [−216G/T (rs712830), −191A/C (rs712829), 497R > K(A/G) (rs2227983)], one in EGF [61A/G (rs4444903)] and one in TGFA (rs3821262C/T), were genotyped using the polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) method. Primers were designed as described previously30. The genotyping procedure was performed according to a previously reported protocol24,25,28. For quality control purposes, 10% of the cases and controls were selected for PCR-sequencing to confirm the genotypes of all polymorphisms. All samples genotyped using both PCR-RFLP and sequencing had concordant genotypes.

Statistical analysis

A chi-squared test was used to evaluate the Hardy-Weinberg Equilibrium (HWE) of each polymorphism, as well as the allelic association between each polymorphism and ILD. The HWE was tested with df = 1. A multiple logistic regression model was also fitted with all five SNPs, age and gender as predictors, giving a total of seven predictors in the model. To select potentially important variables, the “PROC GLMSELECT” procedure in SAS was employed as previously suggested31. Using the Bonferroni method to control the family-wise error rate, two significant variables, EGF 61A/G and age, were included in the final model. In addition, we performed a Cochran-Armitage trend test to investigate a potential additive trend in the association between EGF 61A/G genotypes and ILD. All hypothesis tests were two-sided, and a P value less than 0.01 (0.05/5) was considered statistically significant. Data analyses were performed using SAS 9.3 (SAS Institute, Cary NC) and PLINK32.
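The allelic test and the Bonferroni threshold described above can be illustrated with a minimal sketch. This is not the SAS or PLINK code used in the study; the function names are hypothetical, the test is the uncorrected Pearson chi-squared test on a 2 × 2 allele-count table, and the allele counts are invented for illustration only.

```python
import math

ALPHA = 0.05    # nominal family-wise error rate
N_SNPS = 5      # number of polymorphisms tested
BONFERRONI_THRESHOLD = ALPHA / N_SNPS  # 0.05 / 5 = 0.01


def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-squared variable with df = 1."""
    return math.erfc(math.sqrt(stat / 2.0))


def allelic_chi2(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-squared test (no continuity correction) on allele counts.

    case_a / case_b: counts of the two alleles in cases
    ctrl_a / ctrl_b: counts of the two alleles in controls
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    numerator = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2
    denominator = (
        (case_a + case_b) * (ctrl_a + ctrl_b)
        * (case_a + ctrl_a) * (case_b + ctrl_b)
    )
    stat = numerator / denominator
    return stat, chi2_sf_df1(stat)


# Hypothetical allele counts, for illustration only (not the study data).
stat, p_value = allelic_chi2(case_a=60, case_b=40, ctrl_a=50, ctrl_b=50)
significant = p_value < BONFERRONI_THRESHOLD
print(f"chi2 = {stat:.3f}, P = {p_value:.4f}, significant = {significant}")
```

With five tests, a SNP is declared significant only if its P value falls below 0.01 rather than 0.05.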

Results

We used a case-control study design to examine whether polymorphisms of EGFR and its ligand genes were associated with susceptibility to ILD. The observed genotype frequencies of all five SNPs were in agreement with the HWE in the control subjects (df = 1, P > 0.05 for all tests, data not shown). The genotype and allele distributions of the five SNPs in the cases and controls are summarized in Table 2.
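As an illustration of the HWE check with df = 1, the following minimal sketch compares observed genotype counts with those expected from the estimated allele frequency. The function name and the genotype counts are hypothetical and are not taken from the study; the sketch assumes a biallelic SNP with both alleles present.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test for Hardy-Weinberg Equilibrium (df = 1).

    n_aa, n_ab, n_bb: observed counts of the two homozygotes and the
    heterozygote. Assumes both alleles are observed (no zero expected counts).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # estimated frequency of allele a
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical genotype counts, for illustration only.
stat_eq, p_eq = hwe_chi2(25, 50, 25)      # exactly at HWE proportions
stat_dev, p_dev = hwe_chi2(40, 20, 40)    # strong heterozygote deficit
print(f"at HWE:    chi2 = {stat_eq:.2f}, P = {p_eq:.3f}")
print(f"deviating: chi2 = {stat_dev:.2f}, P = {p_dev:.2e}")
```

A P value above 0.05 in controls, as reported here for all five SNPs, gives no evidence of departure from HWE.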

Table 2 Association between EGFR pathway polymorphisms and ILD

A significant association was observed between the EGF 61A/G polymorphism and ILD (OR = 1.33, 95%CI = 1.07–1.66, P = 0.0099), with the A allele frequency being significantly higher in the cases (64%) than in the controls (57%). There also appeared to be an additive effect of the A allele on the phenotype: the odds ratio for the A/A group versus the G/G group was larger than that for the G/A group versus the G/G group (G/A vs G/G: OR = 1.44, 95%CI = 0.91–2.29; A/A vs G/G: OR = 1.85, 95%CI = 1.15–2.98, Ptrend = 0.0083) (data not shown). No significant association was found between the other polymorphisms and ILD. After adjusting for age and gender, the association between EGF 61A/G and ILD remained significant (P = 0.0087) (data not shown).
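The odds ratios with 95% confidence intervals and the trend test reported above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in the study: the interval is a Wald interval on the log odds ratio, the Cochran-Armitage test uses additive scores 0, 1, 2 for the genotypes G/G, G/A and A/A, the function names are hypothetical, and all counts are invented.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2 x 2 table.

    a, b: counts with / without the risk factor among cases
    c, d: counts with / without the risk factor among controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test (chi-squared form, df = 1).

    cases, controls: genotype counts ordered G/G, G/A, A/A
    scores: additive coding (number of A alleles per genotype)
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    # Score-weighted difference between observed and expected case counts.
    t = sum(w * (r - m * n_cases / n) for w, r, m in zip(scores, cases, totals))
    s1 = sum(w * m for w, m in zip(scores, totals))
    s2 = sum(w * w * m for w, m in zip(scores, totals))
    variance = n_cases * (n - n_cases) / n ** 3 * (n * s2 - s1 ** 2)
    stat = t * t / variance
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical counts, for illustration only (not the study data).
or_est, ci_low, ci_high = odds_ratio_wald_ci(60, 40, 50, 50)
trend_stat, trend_p = cochran_armitage_trend(cases=(10, 20, 30), controls=(30, 20, 10))
print(f"OR = {or_est:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
print(f"trend chi2 = {trend_stat:.2f}, P = {trend_p:.2e}")
```

A confidence interval that excludes 1, together with a small trend P value, is the pattern reported for the A/A versus G/G comparison.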

Discussion

We observed a statistically significant association between ILD and an EGF polymorphism that was previously demonstrated to alter EGF gene expression. While the association was weak, the results remained significant after Bonferroni correction and after adjusting for age and gender. The Cochran-Armitage test for trend also suggested a possible additive effect of the risk allele. Unfortunately, our result has not been validated in additional sample sets, and independent confirmation of our finding is therefore necessary. If this result can be replicated, it would indicate that genetic variation in the EGFR pathway may confer risk of ILD in general.

EGFR inhibitors (EGFRi) such as gefitinib and erlotinib have been widely demonstrated to induce ILD in lung cancer patients. Although induction of ILD has been observed with many drugs, its incidence was higher with EGFRi treatment than with chemotherapy33. Meanwhile, pre-existing ILD has been identified as one of the risk factors for EGFRi-associated ILD33. These lines of evidence suggest that EGFRi-induced ILD may share similar pathobiology with ILD in general. Unfortunately, EGFRi-associated ILD is rare, which makes it difficult to test its association with genetic polymorphisms in the EGFR-axis genes. Previous studies have demonstrated that EGFR and its ligands, in particular EGF and TGF-α, exert an essential function in lung epithelial and fibroblastic cells under both normal and fibrotic conditions. Bronchial epithelial cells, smooth muscle cells and fibroblasts are all involved in fibrogenesis and show high EGFR, EGF and TGF-α gene expression34,35. Alveolar epithelial injury is deemed to be the initial step in the development of pulmonary fibrosis (PF). It has been demonstrated that IL-1β-induced epithelial repair proceeds through EGF- and TGF-α-dependent pathways36. EGF treatment in mice was also found to dramatically increase pulmonary expression of surfactant protein C (SP-C), in which pathogenic mutations have been discovered in ILD patients10,36,37. EGF also enhances fibronectin synthesis in fibrotic human lung fibroblasts34. Recently, studies in animal models have also established that both EGFR and TGF-α are genetically involved in the development of ILD. Mouse models with either constitutive or conditional expression of TGF-α in lung epithelial cells were shown to develop PF phenotypes38,39. On the other hand, when PF was induced with bleomycin, TGF-α knockout mice had significantly reduced fibrotic phenotypes compared with wild-type mice40.
Furthermore, it was demonstrated that PF was prevented in another bitransgenic mouse model with constitutive expression of TGF-α but genetically disrupted EGFR41. These findings collectively suggest that activation of the EGFR pathway is critical for the pathogenesis of ILD. However, this apparently contradicts the observation that inhibition of EGFR induces ILD in humans. Similarly conflicting findings have been reported in animal models with regard to the relationship between EGFRi treatment and PF phenotypes. Gefitinib was found to prevent bleomycin-induced lung fibrosis42,43 and to diminish TGF-α overexpression-induced pulmonary fibrosis in mice44. In contrast, other studies found that gefitinib augmented pneumonitis45 and exacerbated bleomycin-induced lung fibrosis46,47. Why both excessive activation and inhibition of EGFR cause ILD remains an unanswered question, reflecting a poor understanding of the function of EGFR in lung fibrosis. It should be noted that most current studies of EGFR in ILD are based on animal models or have used very few human samples, and many were performed in vitro. Although multiple cell types are involved in ILD, detailed knowledge of EGFR signaling in those cells remains very limited. Nevertheless, the evidence accumulated so far consistently suggests an important role of EGFR genes in both the pathophysiology and the genetics of the disease. It is thus reasonable to hypothesize that genetic variations in the EGFR pathway may at least in part contribute to the natural history of ILD. Interestingly, a recent GWAS observed that two polymorphisms (rs79842896 and rs76795398) close to (~33 kb downstream of) the EGFR gene were among the top-ranked associations with IPF (P ≤ 10^−5), although these associations did not reach genome-wide significance15. Continued study of more EGFR polymorphisms is therefore necessary to further clarify the role of EGFR variation in ILD.

The polymorphisms tested in this study have been shown to be functional in determining the expression or activity of EGFR or its ligands in both our own and others' studies. We have demonstrated that the two EGFR polymorphisms -216G/T and -191C/A were associated with increased EGFR promoter activity and gene/protein expression24,25. In our previous study, they were also shown to be associated with drug-induced toxicities, e.g. skin rash, during EGFR inhibitor treatment48. A SNP at EGFR codon 497 results in an Arg (R) to Lys (K) substitution, which has been associated with decreased EGFR activity26,27. The TGFA intronic polymorphism rs3821262C/T was also found in our previous study to be associated with TGFA gene expression as well as sensitivity to EGFR inhibitors in cancer cell lines28. The EGF 61A/G polymorphism was initially demonstrated to affect EGF protein expression, with the G allele associated with a higher EGF level relative to the A allele49. A recent meta-analysis of 41 case-control studies on various cancers has shown that the G allele was significantly associated with increased cancer risk50. It is thus possible that the relatively lower EGF level associated with the A allele underlies its association with ILD in our study. This supports the notion that relatively lower EGFR-axis activity might be a risk factor for ILD, which is consistent with the observation that inhibition of EGFR signaling induces ILD.

It should be further noted that previous GWAS did not identify this locus. This may be because most previous GWAS focused on IPF rather than ILD as the phenotype, but it may also be due to population differences: many loci identified in GWAS are commonly observed to have different effect sizes in different populations. Nevertheless, without independent validation, a false positive result cannot be excluded for the association observed in our study, particularly because our study was limited by a relatively small sample size. Therefore, our findings should be interpreted and used with caution.

Conclusions

Our study provides a new investigation of the relationship between functional EGFR pathway gene polymorphisms and the risk of ILD. The findings suggest a possible association between the EGF 61A/G polymorphism and ILD. Further validation of this genetic association in independent sample sets is warranted.