Co-occurrence CDK4/6 amplification serves as biomarkers of de novo EGFR TKI resistance in sensitizing EGFR mutation non-small cell lung cancer

Despite the development of predictive biomarkers to shape treatment paradigms and outcomes, de novo EGFR TKI resistance in advanced non-small cell lung cancer (NSCLC) remains an issue of concern. We explored clinical factors in 332 patients with advanced NSCLC who received EGFR TKIs, and molecular characteristics through whole exome sequencing of 65 tumors across EGFR TKI response groups: de novo resistance (progression within 3 months), intermediate response (IRs) and long-term response (LTRs) (durability > 2 years). Uncommon EGFR mutation subtypes were significantly enriched in de novo resistance. The remaining sensitizing EGFR mutation subtypes (exon 19 del and L858R) accounted for 75% of de novo resistance. Genomic landscape analysis was conducted, focusing on 10 frequent oncogenic signaling pathways with functional contributions: cell cycle, Hippo, Myc, Notch, Nrf2, PI-3-Kinase/Akt, RTK-RAS, TGF-β, p53 and β-catenin/Wnt signaling. The cell cycle pathway was the only significantly altered pathway among groups, with an FDR p-value of 6 × 10⁻⁴. Only 7 gene alterations had significant q-values (< 0.05): CDK6, CCNE1, CDK4, CCND3, MET, FGFR4 and HRAS, which were enriched in de novo resistance [range 36–73%] compared to IRs/LTRs [range 4–22%]. Amplification of CDK4/6 was significantly more frequent in de novo resistance than in IRs and LTRs (91%, 27.9% and 0%, respectively). The presence of co-occurring CDK4/6 amplification correlated with poor disease outcome, with a hazard ratio (HR) for progression-free survival of 3.63 [95% CI 1.80–7.31, p-value < 0.001]. The presence of CDK4/6 amplification in the pretreatment specimen serves as a predictive biomarker for de novo resistance in sensitizing EGFR mutation NSCLC.


Results
Patient demographics. Of the 458 patients with NSCLC whose tumors harbored EGFR-activating mutations, 332 patients who received EGFR TKIs and had complete follow-up data were included in the final analysis (Fig. 1). Patient demographics are summarized in Table 1. The median age was 64 years (interquartile range [IQR] 54.3 to 72 years). Sixty-four percent of patients were women. Most of the patients were never smokers (80%) and had an ECOG performance status score of 0-1 (87%). Most patients had adenocarcinoma (95%), metastatic disease at presentation (76%) and 1-2 metastatic sites (75%). Baseline brain metastases were present in 22% of the overall population. Regarding EGFR mutation subtype, 169 patients (51%) harbored exon 19 deletion, 136 patients (41%) harbored L858R, and 27 patients (8%) had other mutations including G719X in exon 18 (N = 6), exon 20 insertion (N = 2), S768I in exon 20 (N = 1), and L861G or Q in exon 21 (N = 6). Twelve patients had any two coexisting EGFR mutations (complex mutation), including one patient with L858R and a coexisting de novo EGFR T790M mutation. A total of 218 patients (66%) received EGFR TKIs as first-line treatment, and the remainder received them as a subsequent line of treatment. First-generation EGFR TKIs accounted for 95% of treatments, composed of gefitinib (59%) and erlotinib (36%). The objective response rate

Table 1. Patient demographics in the overall population: 332 patients with advanced or recurrent NSCLC who received 1st or 2nd generation EGFR TKIs. a Uncommon EGFR mutations, including G719X in exon 18 (n = 6), exon 20 insertion (n = 2), S768I in exon 20 (n = 1) and L861G or Q in exon 21 (n = 6). b Afatinib and dacomitinib were used in 13 and 1 patients, respectively. Del19, exon 19 deletion; ECOG PS, Eastern Cooperative Oncology Group performance status; EGFR, epidermal growth factor receptor; IQR, interquartile range; ORR, overall response rate.
(3) long-term responder, durable disease control with EGFR TKIs for more than 2 years 18. Patient characteristics of the three groups are listed in Table 1.
The de novo resistance group was significantly associated with metastatic disease at presentation (96.4%; p-value 0.003) and uncommon EGFR mutation subtype (25%; p-value 0.001). Metastatic disease at diagnosis was present in 77% and 64% of IRs and LTRs, respectively. Tumors harboring uncommon EGFR mutations were found in only 3.3% of IRs and were absent in LTRs. There was no significant difference in age, gender, PS, smoking status, histology, baseline liver or brain metastasis, or the treatment line of TKI between the three groups. Moreover, there was no difference in EGFR TKI response between the exon 19 deletion and L858R mutation.
Logistic regression was performed to evaluate the correlation between clinical variables and response to EGFR TKIs (Table 2). These results were consistent with the results of multivariate Cox proportional hazards analysis, which revealed that those factors were correlated with PFS on EGFR TKIs (Table S2). Kaplan-Meier analyses of PFS and OS according to each clinical variable are shown in Fig. S1 and Fig. S2. Survival analysis and subsequent treatment are described in the supplementary information (Table S3-S6).
Comparative "cohort-normal" vs. "match-normal" WES analysis workflow in an exploratory cohort of 65 tumor-normal resectable lung cancers. To define the concordance of variant calling between "cohort-normal" and "match-normal", we conducted WES analysis in 65 patients with resectable stage adenocarcinoma of the lung who underwent surgery with curative intent. The "cohort-normal" pipeline was conducted using an in-house normal reference obtained from either leucocytes or normal lung. In general, mutation profiling for 21 driver genes and CNAs (arm-level and focal-level) was consistent with the lung adenocarcinoma East Asian cohort 19 (Fig. S5). The most frequent driver mutations were EGFR (60%), TP53 (28%) and RBM10 (11%), consistent with the East Asian cohort (47%, 36% and 8%, respectively), while the KRAS mutation frequency (4%) was lower than in the East Asian cohort (11%). Median TMB (including synonymous and non-synonymous mutations) was low at 1.84 Mb⁻¹ (range: 0.24-25.14 Mb⁻¹), which is a dominant characteristic of never-smoker adenocarcinoma lung cancer, the majority (73%) in our study. Many focal CNAs were found around driver genes, with amplification in EGFR, MYC, MDM2, KRAS and CCNE1 as well as deletion in ARID1A and APC (Fig. S6A and S6B). Somatic prediction in the "cohort-normal" workflow was conducted using PureCN, based on altered allelic fractions of germline and somatic variants, which previously showed a median accuracy for somatic variants of 97.2% in TCGA-LUAD 20. There was a high correlation (R = 0.99, p-value < 2.2 × 10⁻¹⁶) of all non-synonymous mutations between the "cohort-normal" and "match-normal" workflows (Fig. S7B). There were 3,445 non-synonymous variants in the "cohort-normal" workflow and 4,717 non-synonymous variants in the "match-normal" workflow. Eighty-four percent of all non-synonymous mutations in the "cohort-normal" workflow were concordant with the "match-normal" workflow, corresponding to 61.3% of the "match-normal" calls.
The concordance rates in "cohort-normal" were 89% and 92.3% in 307 significant genes from 7 LUAD studies (additional information: Table S12) and 206 genes from the 10 significant pathway analysis, respectively 21 (Fig. S7A). The cohort-normal workflow for non-synonymous variants, which had a high concordance rate with the match-normal workflow in the 206 genes, was adopted for the WES analysis of 65 EGFR mutation-positive recurrent or advanced NSCLC. The variants retained by the filtering algorithm are shown in Fig. S8. Demographic characteristics of the 65 patients with resectable stage adenocarcinoma of the lung are shown in additional information: Table S11.
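The concordance figures above can be illustrated with a minimal sketch. It assumes each workflow's calls can be represented as a set of variant keys; the function name `concordance` and the toy variant keys are hypothetical and are not taken from the study pipeline. The last lines only check that the two reported percentages imply roughly the same number of shared variants.

```python
def concordance(cohort_calls, matched_calls):
    """Return (shared count, shared fraction of cohort calls, shared fraction of matched calls)."""
    shared = cohort_calls & matched_calls
    return len(shared), len(shared) / len(cohort_calls), len(shared) / len(matched_calls)


# Toy variant keys (chromosome, position, ref, alt); hypothetical values for illustration only.
cohort = {
    ("chr7", 100, "T", "G"),
    ("chr17", 200, "C", "T"),
    ("chr12", 300, "G", "A"),
    ("chr1", 400, "A", "C"),
}
matched = {
    ("chr7", 100, "T", "G"),
    ("chr17", 200, "C", "T"),
    ("chr12", 300, "G", "A"),
    ("chr3", 500, "G", "T"),
    ("chr5", 600, "C", "A"),
}

n_shared, frac_cohort, frac_matched = concordance(cohort, matched)

# Consistency check on the totals reported in the text:
# 84% of 3,445 cohort-normal calls and 61.3% of 4,717 match-normal calls
# should describe approximately the same set of shared variants.
shared_from_cohort = 0.84 * 3445
shared_from_matched = 0.613 * 4717
```

Both estimates come out near 2,890 shared variants, which is why the same overlap reads as 84% from one side and 61.3% from the other.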
Elucidation of molecular analysis correlation with the response of EGFR TKI in "cohort-normal". We selected participants for the genomic study retrospectively, based on the aforementioned response classification. Demographic characteristics and treatment responses of the 65 patients with advanced stage NSCLC who received EGFR TKIs and had adequate tissue for WES are shown in Table 3 and Fig. S3. We analyzed exome sequencing with target sequences of approximately 90 Mb. The average depth of coverage within targets was 65× (range 60-94×), with 95% of targeted bases covered by at least 10 reads. Based on the "cohort-normal" algorithm, 14,508 non-synonymous variants were retained from the 65 WES of recurrent or advanced EGFR mutation-positive NSCLC (additional information: Table S7). The median non-synonymous mutation frequency was 2.3 Mb⁻¹ (range 1.5-6.0 Mb⁻¹). The median frequency of non-synonymous mutations in de novo resistance was 1.15 Mb⁻¹ (range 0.65-3.33 Mb⁻¹), lower than in IRs and LTRs, which were 2.82 Mb⁻¹ (range 1.07-6.02 Mb⁻¹, p-value < 0.001) and 1.77 Mb⁻¹ (range 1.18-2.98 Mb⁻¹, p-value 0.01), respectively. However, this might be an effect of lower average read depth in de novo resistance than  Table S8.
www.nature.com/scientificreports/

Cell cycle, RTK-RAS and PI-3-Kinase/Akt were the significantly altered pathways among treatment groups, with p-values of 6 × 10⁻⁵, 0.02 and 0.02, respectively. After adjustment of the pathway p-values by the Benjamini-Hochberg method, only the cell cycle pathway remained significant at less than 0.05 (q-value 6 × 10⁻⁴) (Fig. 2B). Individual genetic alterations per pathway are shown in Fig. 2A, Fig. S10A-S10J and additional information:  23 in 5 adequate specimens. The results are shown in additional information Table S13. CDK4 amplification was consistent in all specimens, whereas CDK6 amplification was not. The OrigiMed threshold for gene amplification was over 6 copies, and five specimens with CDK6 amplification in the range 2.9-6 using OrigiMed had been excluded by the algorithm. We analyzed the discriminative effect of varying the WES threshold of amplification by using     (Fig. 4B).
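The Benjamini-Hochberg adjustment mentioned above can be sketched as follows. This is a minimal illustration, not the study's code. Only the three p-values for cell cycle, RTK-RAS and PI-3-Kinase/Akt come from the text; the p-values for the other seven pathways are hypothetical placeholders, chosen solely so that the adjustment runs over all 10 pathways.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each q-value is the minimum of
    # p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


pathway_p = {
    # Reported in the text.
    "Cell cycle": 6e-5,
    "RTK-RAS": 0.02,
    "PI-3-Kinase/Akt": 0.02,
    # Hypothetical placeholders (not reported in the text).
    "Hippo": 0.30,
    "Myc": 0.40,
    "Notch": 0.50,
    "Nrf2": 0.60,
    "TGF-beta": 0.70,
    "p53": 0.80,
    "beta-catenin/Wnt": 0.90,
}

names = list(pathway_p)
q_values = dict(zip(names, benjamini_hochberg([pathway_p[n] for n in names])))
significant = [n for n in names if q_values[n] < 0.05]
```

Under these assumptions the cell cycle q-value is 6 × 10⁻⁵ × 10 / 1 = 6 × 10⁻⁴, matching the reported value, while the two pathways with p = 0.02 are adjusted to about 0.067 and no longer pass the 0.05 cutoff.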

Discussion
We analyzed the clinicopathological and molecular features of a subset of patients who are intrinsically resistant to EGFR TKI treatment, although this subset represented only 8% of our study population. We found that uncommon EGFR mutation was an independent factor associated with de novo resistance compared to both IRs and LTRs. It is well known that uncommon EGFR mutations are a heterogeneous group with variable responses to EGFR TKIs. In contrast, LTRs represented 19% of our study population, and the median duration of EGFR TKI treatment in this group was 32.4 months. LTRs showed a substantially lower number of metastatic sites (p-value 0.002) and almost exclusively oligo-progression. The favorable responses to EGFR TKI treatment in patients with recurrent disease may partly be explained by their small disease burden and low tumor heterogeneity 24,25.
Additionally, a meta-analysis showed that never smokers had better PFS benefits than ever smokers among patients who harbored the activating EGFR mutation and received EGFR TKIs 26. However, we found that this factor was not associated with the outcome of disease control. Besides specific clinical factors, we found that diverse genomic landscapes underlie distinct EGFR TKI responses. Varying mechanisms of de novo resistance in sensitizing EGFR mutation have been reported, such as de novo co-occurrence of PIK3CA 27, PI3K/AKT/mTOR 28, PTEN loss 29, MET alteration 30-32 and TP53 mutation 33. Here, we focused on 10 frequent oncogenic signaling pathways (cell cycle, Hippo, Myc, Notch, Nrf2, PI-3-Kinase/Akt, RTK-RAS, TGF-β, p53 and β-catenin/Wnt signaling) which were previously shown to be significant among various cancer types, involving tumorigenesis, cell proliferation, metastasis and angiogenesis 34,35. Targeting signaling pathways has been a challenge in defining a novel cancer treatment. Among them, RTK-RAS and cell cycle pathways were the most frequently altered in adenocarcinoma of the lung not selected by mutation status, with frequencies of 74% and 56%, respectively 21. These frequencies of pathway alterations were consistent with our EGFR mutation-positive study: 77% in RTK-RAS and 60% in the cell cycle pathway. We found that cell cycle pathway alteration was the only significant pathway alteration (q-value < 0.05), with a higher frequency in de novo resistance compared to IRs and LTRs (100% vs. 58% vs. 27%, respectively). CCNE1, CDK4/6 and CCND3 were major contributors to cell cycle pathway alteration, with q-value < 0.05. Altered cell cycle expression has also been correlated with acquired EGFR TKI resistance 36.
The broadened exploration of gene alterations in our study confirmed prior cfDNA targeted sequencing of 68 genes, which revealed significant cell cycle pathway alteration and the presence of CDK4/6 alterations that were significantly associated with non-response to osimertinib 37.
Furthermore, MET alteration was also enriched in de novo resistance with q-value < 0.05. Despite the varying techniques and definitions used to define MET amplification 38, it is well known to play an important role in de novo and acquired EGFR TKI resistance through bypass activation of the ERBB/PI3K-Akt signaling pathway 30,39-42. The presence of MET amplification was significantly associated with shortened OS in multivariate Cox regression analysis, with an HR of 2.13 [95% CI 1.04-4.5, p-value = 0.03] (Fig. 4B). Missense TP53 mutation, which has previously shown a predictive impact on EGFR TKI treatment in a meta-analysis 33, had a higher prevalence in de novo EGFR TKI resistance than in IRs and LTRs (81% vs. 60% vs. 36%, respectively). The prevalence of co-occurring TP53 alteration and CDK4/6 amplification was 24.6% (81% in de novo EGFR TKI resistance, 16% in IRs and none in LTRs). The presence of missense TP53 mutation showed potential prognostic significance for OS but not for PFS in multivariate Cox regression analysis, with an HR for OS of 2.06 [95% CI 0.93-4.6, p-value = 0.07] (Fig. 4). Our results were consistent with a previous publication 43. The prevalence of co-occurring RB1 alteration and CDK4/6 amplification in patients with EGFR mutation NSCLC was 7.6% (de novo 27%, IRs 4% and none in LTRs). RB1 alteration was correlated with poor prognostic outcome in advanced NSCLC not selected by genomic subgroup 44  substrates such as the transcription factor Forkhead Box M1 (FOXM1), certain glycolytic enzymes and nuclear factor of activated T cell (NFAT) family members, allow CDK4/6 activity even in the absence of RB1 function 45-48. The presence of either CDK4 or CDK6 amplification in the pretreatment specimen served as a predictive biomarker for EGFR TKI resistance in sensitizing EGFR mutation. When correlated with the calculated integer copy number, CDK4/6 amplification thresholds of 2, 4 or 6 each had discriminative predictive significance for EGFR TKI response.
Combining EGFR TKI treatment with CDK4/6 inhibitors may overcome de novo EGFR TKI clonal resistance. Dual CDK4/6 and EGFR blockade has shown in vitro activity to prevent or delay resistance in EGFR-mutant NSCLC 49.

Although we focused on significant pathway alterations in whole exome sequencing, our study has some limitations. First, fusion alterations were not covered in our analysis. Fusion alteration has been reported as an uncommon mechanism of acquired EGFR TKI resistance 50-53. Nevertheless, co-occurring fusion in pretreatment EGFR mutation-positive disease was reported at a low frequency (0.9%) 52. Second, we used the cohort-normal workflow, which requires in silico prediction using allele-specific copy number to calculate the posterior probability that a variant is somatic (see "Methods"). An exploratory cohort of 65 tumor-normal pairs of resectable adenocarcinoma of the lung revealed that 84% of all non-synonymous mutations in "cohort-normal" were concordant with the "match-normal" workflow, corresponding to 61.3% of "match-normal" calls. Subclonal mutations with low allelic fractions, or low purity, might be the reason for the lower precision. Nevertheless, we selected a limited set of 206 genes from the 10 significant pathway analyses with high concordance (92.3%), which covers all significant co-occurring alterations. Third, our average depth of coverage of WES was 65× [range 40-94×] and was significantly different among the EGFR TKI treatment groups, which might affect the number of low allelic fraction mutations detected. However, this sequencing coverage depth is still enough to define a significant pathway and genomic alteration that correlates with de novo resistance. Fourth, the copy number threshold to define amplification was adjusted (> 0.3), more precisely than the pipeline recommendation. The gene-level segment integer copy number was estimated in parallel using PureCN.
The algorithm has previously shown good concordance with absolute copy number from the Foundation Medicine targeted NGS platform 54. However, hybrid capture-based NGS currently uses diverse thresholds for amplification. The threshold used in FoundationOne® Heme for identifying a copy number amplification is 5 for ERBB2 and 6 for all other genes, while the threshold used in OrigiMed is 6 for all genes. Lastly, we did not assess the prognostic significance of non-significant genes such as AURKB (1.5%) and RBM10 (6%), even though their prognostic significance in association with EGFR TKIs has been reported. Four of 65 EGFR mutation results were discrepant between WES and Cobas® mutation testing. Integrative Genomics Viewer (IGV) analysis of the BAM files was performed; all of these specimens had fewer than 15 EGFR exon 19 deletion reads, which were removed by our algorithm.
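The effect of platform-specific amplification thresholds described above can be sketched as follows. The threshold values are those quoted in the text. The strict "greater than" comparison follows the "over 6 copies" wording and is an assumption, as are the function names and the example copy number of 4.5, which is hypothetical.

```python
# Thresholds quoted in the text; the comparison operator (strictly greater than) is an assumption.
PLATFORM_THRESHOLDS = {
    "FoundationOne Heme": {"ERBB2": 5, "default": 6},
    "OrigiMed": {"default": 6},
}


def amplification_threshold(platform, gene):
    """Look up the copy number threshold a platform applies to a gene."""
    rules = PLATFORM_THRESHOLDS[platform]
    return rules.get(gene, rules["default"])


def is_amplified(copy_number, threshold):
    """Call amplification when the integer copy number estimate exceeds the threshold."""
    return copy_number > threshold


def calls_across_thresholds(copy_number, thresholds=(2, 4, 6)):
    """Show how one sample's call changes across the WES thresholds explored in the study."""
    return {t: is_amplified(copy_number, t) for t in thresholds}


example_cn = 4.5  # hypothetical CDK6 copy number estimate
sweep = calls_across_thresholds(example_cn)
origimed_call = is_amplified(example_cn, amplification_threshold("OrigiMed", "CDK6"))
```

A sample at 4.5 copies is called amplified at thresholds of 2 and 4 but not at 6, which mirrors how specimens with CDK6 copy numbers between 2.9 and 6 fell below the OrigiMed cutoff.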

Materials and methods
Study population. All methods were carried out in accordance with the Declaration of Helsinki. The Institutional Review Board (IRB), Faculty of Medicine, Chulalongkorn University approved the study protocol (IRB 298/60). The requirement for written informed consent from individual study participants was waived according to the ethics committee/IRB, Faculty of Medicine, Chulalongkorn University policy for retrospective studies. Permission to conduct the study was provided by the director of the hospital. We retrospectively analyzed patients with pathologically confirmed recurrent or metastatic NSCLC diagnosed between 2011 and 2018 at King Chulalongkorn Memorial Hospital (KCMH). EGFR mutation status was determined by the cobas® EGFR Mutation Test v2 kit. Patients with NSCLC harboring activating EGFR mutations who received 1st or 2nd generation EGFR TKIs were included in our study; patients treated with osimertinib were excluded owing to the limited number of participants (n = 9) (Fig. 1). All patients were assessed for tumor response and followed up every two to three months as per the standard protocol of our institution. Objective response rate (ORR) was determined according to the Response Evaluation Criteria in Solid Tumors version 1.1 (RECIST v1.1), with responses classified as progressive disease (PD), complete response (CR), partial response (PR), or stable disease (SD). Patients were categorized into three groups based on responsiveness to EGFR TKI treatment: (1) those with de novo EGFR TKI resistance, defined as a best response of PD or SD for less than 3 months while receiving EGFR TKI 16 (de novo resistance); (2) those who developed acquired resistance to EGFR TKIs according to the criteria proposed by Jackman 17 (intermediate responder); and (3) those treated with EGFR TKIs for at least 2 years 18 (long-term responder). An independent radiologist blinded to molecular characteristics reviewed the imaging responses of the 65 patients who had available tissue for WES.

Exome sequencing analysis. Genomic DNA was extracted from paraffin-embedded tissue using Qiagen FFPE DNA extraction kits following the manufacturer's protocol. We used leftover genomic DNA extracted for the cobas® EGFR Mutation Test, which is part of standard testing in advanced stage disease. After quality control (QC), qualified samples proceeded to library construction. The genomic library was constructed with the SureSelectXT V6 + UTR library prep kit (Illumina, San Diego, CA, USA) and was sequenced using NovaSeq to generate 150 bp paired-end reads at Macrogen Inc. (Seoul, Korea). The analytical pipeline of "cohort-normal", which showed a high concordance rate (R² 0.99) with the "match-normal" workflow in 10 significant pathways (206 genes), was used in our study. The "cohort-normal" workflow was compared with our "match-normal" workflow using WES of 65 paired tumor-normal fresh tissues from patients with resectable lung cancer who had undergone surgery at The King Chulalongkorn Memorial Hospital. Written informed consent was obtained from all resectable lung cancer participants. For the second cohort, we retrospectively selected 65 specimens based on EGFR TKI response and an adequate amount of specimen for WES, enriched for de novo EGFR TKI resistance. Ninety-eight percent of the second cohort WES had a sensitizing mutation, composed of 55% EGFR exon 19 deletion and 43% exon 21 L858R, and one patient had an exon 21 L861Q mutation. The 65 WES of advanced stage NSCLC were categorized into 11 de novo resistance, 43 intermediate responders and 11 long-term responders. The pooled normal in the "cohort-normal" pipeline was obtained from either normal lung tissue or leucocytes of the 65 patients in the first exploratory cohort. Preprocessing steps and filtering for variant and copy number alteration (CNA) discovery are described in the supplementary information.
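The three response groups defined under Study population can be sketched as a simple rule. This is a simplified illustration under stated assumptions, not the study's implementation: de novo resistance is read here as a best response of PD, or SD lasting less than 3 months; long-term response as at least 24 months on EGFR TKI; and everything else is treated as intermediate response, without re-implementing the Jackman criteria. The names `ResponseGroup` and `classify_response` are hypothetical.

```python
from enum import Enum


class ResponseGroup(Enum):
    DE_NOVO = "de novo resistance"
    INTERMEDIATE = "intermediate responder"
    LONG_TERM = "long-term responder"


VALID_BEST_RESPONSES = {"CR", "PR", "SD", "PD"}  # RECIST v1.1 categories


def classify_response(best_response, months_on_tki):
    """Assign a response group from best RECIST response and months on EGFR TKI."""
    if best_response not in VALID_BEST_RESPONSES:
        raise ValueError(f"unknown best response: {best_response!r}")
    # Assumed reading of the definition: PD as best response, or SD for less than 3 months.
    if best_response == "PD" or (best_response == "SD" and months_on_tki < 3):
        return ResponseGroup.DE_NOVO
    # At least 2 years of EGFR TKI treatment.
    if months_on_tki >= 24:
        return ResponseGroup.LONG_TERM
    # Simplification: all remaining patients are treated as intermediate responders.
    return ResponseGroup.INTERMEDIATE


examples = [
    classify_response("PD", 1.5),
    classify_response("SD", 2.0),
    classify_response("SD", 10.0),
    classify_response("PR", 32.4),
]
```

Keeping the rule in one function makes the cutoffs (3 and 24 months) explicit and easy to vary in a sensitivity analysis.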
Statistical analysis. The correlation of all categorical variables was analyzed using the Kruskal-Wallis test. Significant correlation of two categorical variables was analyzed using the two-sided Fisher's exact test or Chi-squared test for p-value calculations, while correlation of two continuous variables was assessed using the Wilcoxon rank-sum test. FDR p-values were calculated by the Benjamini-Hochberg method from all correlation p-values in this cohort. Progression-free survival (PFS) was calculated from the first day of treatment with EGFR TKI to disease