Genomic characterization and outcome evaluation of kinome fusions in lung cancer revealed novel druggable fusions

Kinase fusions represent an important type of somatic alteration that promotes oncogenesis and serves as a diagnostic marker in lung cancer. We aimed to identify the landscape of clinically relevant kinase fusions in Chinese lung cancer and to explore rare kinase rearrangements, thereby providing valuable evidence for therapeutic decision making. We performed genomic profiling of 425 cancer-relevant genes from tumor/plasma biopsies from a total of 17,442 Chinese lung cancer patients using next-generation sequencing (NGS). Patients’ clinical characteristics and treatment histories were retrospectively studied. A total of 1162 patients (6.66%; 1162/17,442) were identified as having kinase fusions, including 906 adenocarcinomas (ADCs) and 35 squamous cell carcinomas (SCCs). In ADC, 170 unique gene fusion pairs were observed, including the rare kinase fusions SLC12A2-ROS1, NCOA4-RET, and ANK3-RET. In SCC, 15 unique gene fusions were identified, among which the most frequent were EML4-ALK and FGFR3-TACC3. Analyses of oncogenic mutations revealed a dual role for the gene fusions CCDC6-RET and FGFR3-TACC3 in driving oncogenesis or serving as acquired resistance mechanisms to kinase inhibitors. In addition, our real-world evidence showed that patients with recurrent kinase fusions with low frequency (two occurrences) could benefit from off-label treatment with kinase inhibitors. Notably, patients with stage IV ADC who had novel RORB-ALK or AFF2-RET fusions, but no other known oncogenic driver mutations, demonstrated favorable clinical outcomes on tyrosine kinase inhibitors. Our data provide a comprehensive overview of the landscape of oncogenic kinase fusions in lung cancer, which may assist in recognizing potentially druggable fusions that can be translated into therapeutic applications.


INTRODUCTION
Lung cancer is the leading cause of cancer-related death worldwide, with a 5-year survival rate of less than 21% 1 . There are two major histological groups of lung cancer, including small cell lung cancer (SCLC) and non-SCLC (NSCLC). NSCLC accounts for approximately 85% of all lung cancers 1 and can be further divided into different subtypes, among which, the most common is adenocarcinoma (ADC), followed by squamous cell carcinoma (SCC), adeno-squamous cell carcinoma (ASC), and large cell carcinoma (LCC). However, the molecular features of these subtypes are significantly distinct. For that reason, the identification of oncogenic driver genes and novel therapeutic targets is highly important for lung cancer treatment.
Kinases activated by gene fusions, which arise from translocations or other structural chromosome rearrangements, have been reported to be a major class of oncogenic drivers in lung cancer and function as potential targets of anticancer drugs 2,3 . Advances in next-generation sequencing (NGS) technologies have enabled the characterization of kinase fusions among different lung cancer subtypes, the identification of concurrent cancer-relevant alterations, and the identification of novel driver fusions.
In this study, we comprehensively analyzed DNA-seq data from 17,442 Chinese lung cancer patients, described the clinical and pathological characteristics of patients with different kinase fusions, and distinguished kinase fusion types between ADC and SCC. Additionally, we studied the fusion breakpoint preferences of kinases in depth and analyzed treatment-relevant mutations that co-occurred with primary recurrent kinase fusions. Furthermore, we identified multiple novel druggable fusions in patients who benefited from tyrosine kinase inhibitor (TKI) treatment. These studies facilitated a deep understanding of the unique clinical and molecular features, and the outcomes, of lung cancer patients with kinase fusions.
We classified the 1233 kinase fusion events into three categories: 1165 with a kinase at the 3′ end (3′ kinase), 57 with a kinase at the 5′ end (5′ kinase), and 11 fusions in which both partners were kinases (both-kinase) (Fig. 1b). In both-kinase patients, TRIM24-BRAF fusions were the most frequent (54.5%; 6/11; data not shown) and have been reported to be sensitive to a MEK inhibitor 4 .
ALK fusions were the most common among all histologic subtypes (Supplementary Table 1 and Supplementary Table 2). Compared with ADC, ROS1 and RET fusions were less frequent in SCC, accounting for 8.5% and 2.9%, respectively. The prevalence of FGFR3 fusions was much higher in SCC than in ADC (SCC vs. ADC: 28.6% vs. 1.5%, respectively). We also summarized the number of kinase fusions carried by each patient (Fig. 1c). As expected, 94.7% of patients carried only one fusion, while 5.3% of patients harbored at least two fusions. The ALK gene rearrangement was the most frequently observed in patients with either a single fusion event or multiple fusions (59.6% and 80%, respectively). We also identified one patient with four kinase fusions that were all KIF5B-RET, but with different fusion breakpoints.
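The three-way classification described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `classify_fusion` helper and the small `KINASES` set are hypothetical stand-ins for the 538-gene kinase list, and each fusion is assumed to be given as a (5′ partner, 3′ partner) pair.

```python
# Hypothetical subset of kinase genes, for illustration only.
# TRIM24 is included because the text treats TRIM24-BRAF as a both-kinase fusion.
KINASES = {"ALK", "ROS1", "RET", "FGFR3", "BRAF", "TRIM24"}


def classify_fusion(five_prime, three_prime, kinases=KINASES):
    """Classify a fusion by which partner(s) are kinases.

    Returns "both-kinase", "3' kinase", "5' kinase", or None when
    neither partner is in the kinase set.
    """
    five_is_kinase = five_prime in kinases
    three_is_kinase = three_prime in kinases
    if five_is_kinase and three_is_kinase:
        return "both-kinase"
    if three_is_kinase:
        return "3' kinase"
    if five_is_kinase:
        return "5' kinase"
    return None


# Category counts as reported in the text; they sum to the 1233 fusion events.
reported_counts = {"3' kinase": 1165, "5' kinase": 57, "both-kinase": 11}
total_events = sum(reported_counts.values())
```

Passing the partners as separate arguments avoids having to split a fusion name on a hyphen, which would fail for gene symbols that themselves contain hyphens.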
We further analyzed the gender distribution in patients with different fusions (a sample size of more than 15 cases was required for each fusion) (Supplementary Fig. 1a). Most patients with CCDC6-RET and FGFR3-TACC3 fusions were male, while the frequency of females was higher among patients with VCL-ALK and SDC4-ROS1 fusions. Age was also statistically associated with fusion type. The ages of patients with ROS1 fusions involving different partners varied significantly, as patients with EZR-ROS1 were older than patients with CD74-, SLC34A2-, and SDC4-ROS1 fusions (Supplementary Fig. 1b).
Characterizing recurrent kinase fusions in lung cancer patients
Recurrent kinase fusions have been of great interest and are regarded as potential therapeutic targets for cancer treatment. Among 219 kinase fusion types, we observed 39 recurrent fusions involving 8 kinases, including ALK, ROS1, RET, FGFR3, BRAF, EGFR, ERBB2, and NTRK1 (Fig. 2a, b). The most frequently involved kinase was ALK (60.7%; 639/1053); its fusions included EML4-ALK (91.1%; 582/639), HIP1-ALK (0.9%; 6/639), and STRN-ALK (0.8%; 5/639), which have been well documented in NSCLC 5-7 (Fig. 2b). We also found several recurrent ALK fusions that have not been reported in lung cancer, but have been identified in other cancer types, such as VCL-ALK, FN1-ALK, and NPM1-ALK. Those data indicated that such rarely reported driver fusions may also play crucial roles in the carcinogenesis of lung cancer. Among the recurrent kinase fusions with low frequency (two occurrences) in our data, CUX1-ALK 8 and FAM179A-ALK 9 have been shown to respond to ALK TKIs. Meanwhile, it has been reported that genetic rearrangements can mediate drug resistance to TKI treatment 10 . FGFR3-TACC3 11,12 is a resistance mechanism against EGFR TKIs in NSCLC patients with EGFR L858R or 19del mutation. EML4-ALK 13 and NCOA4-RET 14 are mechanisms of resistance to EGFR TKIs in NSCLC patients with EGFR T790M. In addition, KIF5B-RET 15 , CCDC6-RET 16 , EZR-ROS1 17 , PHF20-NTRK1 18 , and TRIM24-BRAF 19 have been reported to function as resistance mechanisms upon EGFR TKI treatment in NSCLC patients with EGFR 19del. Furthermore, HIP1-ALK 20 and STRN-ALK 21 have been reported to mediate resistance to ALK TKIs in NSCLC patients with other ALK rearrangements. Further studies should be conducted to better understand the function of kinase fusions in cancer therapy.
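As an illustration of how fusion types can be split into recurrent and single-case groups, the following is a minimal sketch. It assumes that "recurrent" means a fusion type seen in at least two patients, which is how the text appears to use the term; the `split_recurrent` helper and the toy records are hypothetical.

```python
from collections import Counter


def split_recurrent(patient_fusions, min_patients=2):
    """Split fusion types into recurrent and single-case groups.

    patient_fusions: iterable of (patient_id, fusion_name) pairs.
    A fusion is counted once per patient, however often it was detected.
    """
    unique_pairs = set(patient_fusions)
    patients_per_fusion = Counter(fusion for _, fusion in unique_pairs)
    recurrent = {f: n for f, n in patients_per_fusion.items() if n >= min_patients}
    single = {f for f, n in patients_per_fusion.items() if n < min_patients}
    return recurrent, single


# Toy records (hypothetical patients); P2 was tested twice.
records = [
    ("P1", "EML4-ALK"),
    ("P2", "EML4-ALK"),
    ("P2", "EML4-ALK"),
    ("P3", "AFF2-RET"),
]
recurrent, single = split_recurrent(records)
```

In this toy input, `EML4-ALK` is recurrent (two patients) and `AFF2-RET` is a single-case fusion.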

Breakpoint patterns of known kinase fusions
Although the kinases in our cohort all retained the integrity of the kinase domain, the breakpoints were variable when ligated with different partners. Such an observation may be caused by the structural and functional characteristics of each partner gene and may influence the tumorigenesis capacities of the fusions.
In the EML4-ALK group, EGFR L858R co-occurred more frequently than other EGFR mutations (Fig. 3b). Additionally, the frequency of ALK mutations was 44.3% (31/70), including ALK G1202R, G1269A, and L1196M, which are well-described resistance mechanisms to ALK TKIs. The HRAS mutation was only identified in the EML4-ALK group. MET exon 14 skipping (77.8%; 7/9) occurred significantly more frequently than other types of MET alterations that co-occurred with kinase fusions.
Interestingly, an ADC patient harboring an EML4-ALK fusion also had an NTRK1 G595R mutation, which is considered to be an acquired resistance mutation to NTRK inhibitors (larotrectinib or entrectinib) 34 . However, whether this alteration is associated with ALK TKI resistance remains to be determined. We also found a patient carrying CD74-ROS1 and RET V804M alterations; however, the function of the concurrent RET V804M mutation requires further investigation.

RET fusions in lung cancer and the response to RET inhibitors
Since RET is a crucial therapeutic target in cancers, several multikinase inhibitors (MKIs) such as cabozantinib, vandetanib, alectinib, and apatinib have shown anti-RET activities. However, these MKIs have limited efficacy, and patients with RET fusions showed poor responses with short progression-free survival (PFS) times of generally less than 7.5 months 27 . In our cohort, targeted therapy information was available for 10 patients carrying different RET fusions (Fig. 4), among which nine patients were treated with MKIs and one was treated with the RET-selective inhibitor, pralsetinib. Of the nine patients who received MKI therapy, five showed no response. Among the remaining four patients, three KIF5B-RET-positive cases received cabozantinib treatment and one achieved a PFS longer than 6 months. There were three CCDC6-RET-positive patients treated with apatinib in this study, but only one showed a response and achieved a PFS of 10 months. We also identified a patient with a novel AFF2-RET fusion who was treated with pralsetinib and maintained a response for 9 months. The clinical history of that AFF2-RET patient is described below.
Novel fusions that demonstrated favorable clinical benefits of kinase inhibitors
Besides the well-described recurrent kinase fusions, we also identified 19 recurrent fusion types with low frequency (two occurrences) in our cohort (Fig. 2). To search for functional fusions, we focused only on cases with targeted therapy information available, which included two fusions, KIF13A-RET and ZC3HAV1-BRAF (Fig. 5a). The patient with the KIF13A-RET fusion was a 64-year-old male with stage IV ADC. After 15 months of chemotherapy, the patient's tumor biopsy was subjected to NGS, which revealed the potential driver fusion, KIF13A: exon18-RET: exon12. Subsequently, the patient was treated with cabozantinib, but failed to exhibit a response (Fig. 5b). The patient harboring the ZC3HAV1-BRAF fusion was a 56-year-old female with stage IV ADC. Her tissue and plasma biopsies were subjected to NGS, which identified the EGFR L858R mutation at mutation allele frequencies (MAFs) of 41.3% and 3.0%, respectively. The patient was administered gefitinib and achieved a PFS of 7 months. After the disease progressed, the patient was administered osimertinib, but only responded for 5 months. The tumor progressed and an additional metastatic site was identified; thus, the patient's plasma and metastatic tissue biopsies were subjected to a second NGS test. That test identified a ZC3HAV1: exon3-BRAF: exon10 fusion at MAFs of 3.1% and 5.8%, respectively, suggesting that the ZC3HAV1-BRAF fusion functioned as a resistance mechanism (Fig. 5c).
We also found 180 types of single kinase fusions, including 126 fusions that had not previously been reported. To scan for functional cases among the novel kinase fusions, we evaluated the therapeutic history of each patient and focused only on fusions without concurrent driver mutations in patients who benefited from TKI treatment targeting the fusion kinase (Fig. 5d). We identified two novel fusion types, AFF2-RET (AFF2: intron7-RET: exon11) and RORB-ALK (RORB: intron4-ALK: intron19).
The patient harboring the AFF2-RET fusion was a male with stage IV ADC. His plasma biopsy was subjected to NGS, and the novel AFF2-RET fusion was identified at a MAF of 4.1%. After chemotherapy for 11 months, the patient received pralsetinib treatment and achieved stable disease, which was maintained for 9 months. Additional investigation of the AFF2 domains identified a coiled-coil domain, which is capable of mediating dimerization (Fig. 5e). The patient carrying the RORB-ALK fusion was a 64-year-old male with stage IV ADC. His tumor biopsy was subjected to NGS, which revealed the potential driver fusion. Subsequently, the patient was treated with crizotinib and achieved a response for 24 months. After disease progression, the patient was treated with ceritinib and responded for 6 months. Structural analysis of RORB revealed a zinc finger domain that may lead to dimerization and activation of downstream ALK targets (Fig. 5f). The AFF2-RET and RORB-ALK fusions exhibited remarkable sensitivity to RET and ALK TKIs, respectively, which suggested that they were able to function as potentially druggable fusions.

DISCUSSION
In this study, we used DNA-seq data from 17,442 Chinese lung cancer patients to identify potential driver fusions and to analyze the clinical and genomic features of patients with different kinase fusion types. Overall, 6.66% (1162/17,442) of patients carried kinase fusions (recurrent fusions accounted for 5.2%). Of the 219 fusion types observed, 39 were recurrent while 180 were observed in single cases. The most frequent kinases identified in patients were ALK, RET, ROS1, EGFR, and FGFR3. An interesting observation was the identification of recurrent fusions that had not previously been reported in lung cancer, but were found in other cancer types, such as the VCL-ALK fusion in epithelioid fibrous histiocytoma 35 and renal cell carcinoma 36 , the FN1-ALK fusion in ovarian cancer 37 and gastrointestinal leiomyomas 38 , the NPM1-ALK fusion in anaplastic large cell lymphoma 39 , the VSTM2A-EGFR fusion in glioblastomas 40 , and the ERBB2-IKZF3 fusion in breast cancer 41 . Those results indicated that driver fusions in other cancers may also play crucial roles in lung cancer carcinogenesis. We also identified a novel recurrent fusion, CCT6A-EGFR, the function of which remains to be determined.
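The headline figures in this paragraph can be checked with simple arithmetic. The snippet below is only a sanity check using numbers quoted in the text; the `percent` helper is a hypothetical convenience function.

```python
def percent(part, whole, digits=2):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, digits)


fusion_positive = 1162  # patients with kinase fusions (from the text)
cohort_size = 17442     # patients profiled (from the text)
prevalence = percent(fusion_positive, cohort_size)

# Fusion types: recurrent plus single-case types give the total observed.
recurrent_types, single_types = 39, 180
total_types = recurrent_types + single_types
```

Here `prevalence` evaluates to 6.66 and `total_types` to 219, in agreement with the values stated above.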
Driver mutations often co-occur with kinase fusions and play critical roles during cancer progression. Thus, detailing the relationships between fusions and mutations may lead to the development of efficient treatment strategies. Therefore, we analyzed treatment-relevant driver alterations that co-occurred with the most frequent kinase fusions.
Further investigation of the breakpoints involved in different kinase fusions revealed preferences for the ligation sites of partners. Most of the fusions occurred in intron 11 and intron 17 of RET and FGFR3, respectively, but the breakpoints of ALK and ROS1 varied between different partners. Studies showed that translocations were not random, as the breakpoints in the fusion genes had both sequence and structure preferences 42 . It was reported that ALK fusion variants may affect clinical outcomes upon TKI treatment 43 .
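A breakpoint preference of this kind can be summarized by tallying breakpoint locations per kinase. The following is a minimal sketch with hypothetical helper names and toy records; it does not reproduce the authors' analysis or data.

```python
from collections import Counter, defaultdict


def breakpoint_profile(fusions):
    """Tally breakpoint sites per kinase.

    fusions: iterable of (kinase, site) pairs, e.g. ("RET", "intron 11").
    Returns a dict mapping each kinase to a Counter of sites.
    """
    profile = defaultdict(Counter)
    for kinase, site in fusions:
        profile[kinase][site] += 1
    return profile


def dominant_site(site_counts):
    """Return the most common site and the fraction of events it covers."""
    site, n = site_counts.most_common(1)[0]
    return site, n / sum(site_counts.values())


# Toy records, invented for illustration only.
toy = [
    ("RET", "intron 11"),
    ("RET", "intron 11"),
    ("RET", "intron 11"),
    ("RET", "intron 10"),
    ("FGFR3", "intron 17"),
]
profile = breakpoint_profile(toy)
ret_site, ret_fraction = dominant_site(profile["RET"])
```

A kinase whose dominant site covers most events (as with RET in the toy input) shows a strong breakpoint preference, whereas a flat distribution would indicate partner-dependent variability.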
By integrating mutation, copy number variation, and fusion data, we evaluated the genomic alterations co-occurring with each fusion, and focused mainly on alterations related to drug administration. Patients with CCDC6-RET and FGFR3-TACC3 had a higher prevalence of EGFR ex19del mutations (p < 0.01), which indicated that these fusions may function as resistance mechanisms to EGFR TKIs. It has been reported that CCDC6-RET can mediate osimertinib resistance in NSCLC, and the combination of osimertinib and RET inhibitors leads to a rapid response in those patients 44 . In our cohort, nine patients with EGFR alterations were treated with EGFR TKIs, among which seven were treated with osimertinib. However, two patients with CCDC6-RET were only treated with gefitinib and ectinib, respectively, which indicated that the CCDC6-RET fusion functioned as a resistance alteration against first- and third-generation TKIs. The therapeutic approach to targeting those fusions should be further studied. FGFR3-TACC3 could confer resistance to EGFR TKIs and often had coexisting EGFR activating mutations in NSCLC 11 , which is consistent with our findings. We also revealed that the frequency of patients with MET exon 14 skipping was higher than that of other MET alterations co-occurring with kinase fusions. However, the underlying mechanisms leading to such observations should be studied in detail.
Since only 6% of cancer patients can benefit from the existing drugs targeting fusions, the identification of novel druggable fusions is crucial for expanding the therapeutic options for cancer patients. In this study, we focused only on fusions with intact kinase domains, without concurrent driver mutations, and studied the clinical history of each patient. We identified two novel fusions, AFF2-RET and RORB-ALK, with demonstrated responses to TKI treatment. We analyzed the domains of the two fusions and found that both harbored domains that mediate protein dimerization and activate downstream signaling pathways. We also observed a patient with a novel AFF3-ALK fusion that lacked a dimerization domain (data not shown). As expected, that patient showed no response when treated with crizotinib.
This study has two main limitations. The first is that the therapeutic information was incomplete or missing, which is also one of the weaknesses of real-world studies 45 . Although we used a system with an error alert function to abstract medical records, errors might have occurred during information collection. The other limitation is that we did not validate the novel or concurrent fusions via other diagnostic approaches, such as reverse transcription polymerase chain reaction or RNA sequencing, owing to insufficient tissue. A recent study 46 showed that two non-canonical ALK fusions identified with DNA sequencing in a patient generated canonical EML4-ALK transcripts during mRNA maturation, as revealed via RNA sequencing, indicating that fusions detected at the DNA and RNA levels can differ. Therefore, other diagnostic approaches should be used for further validation of fusions observed with DNA sequencing.
In conclusion, we provided a comprehensive overview of the landscape of kinase fusions in a large Chinese lung cancer population, including lung ADC and SCC patients, and characterized the patterns of breakpoint locations and co-occurring mutations in patients with known kinase fusions. Furthermore, we reported novel kinase fusions and highlighted patients with recurrent kinase fusions with low frequency (two occurrences) that demonstrated favorable clinical outcomes on TKIs. Collectively, these findings not only advance our understanding of the spectrum of fusions involving kinases in lung cancer, but also have immediate implications for disease diagnosis and treatment.

Patients and samples
Using capture-based targeted NGS, this study retrospectively reviewed 17,442 Chinese NSCLC patients who underwent genetic testing at hospitals across China between June 2016 and July 2019, including The Second Affiliated Hospital of Zhejiang University School of Medicine, Fujian Medical University Cancer Hospital and Zhangzhou Zhengxing Hospital. Demographic and clinical data, including age, gender, histology type, pathological stage, metastasis sites, treatment regimens, and duration of TKI treatment were abstracted from the medical records provided by physicians. Genomic profiling was performed on formalin-fixed paraffin-embedded (FFPE) tumor and liquid biopsy specimens. All samples were sequenced in a Clinical Laboratory Improvement Amendments (CLIA)- and College of American Pathologists (CAP)-certified genomic testing facility (Nanjing Geneseeq Technology Inc., Nanjing, China). This study was approved by the ethical committee of each participating hospital and all patients provided written informed consent to participate.

Library preparation and sequencing
For targeted NGS, DNA was extracted and sequencing libraries were prepared according to protocols described previously 47 . Briefly, genomic DNA was extracted from FFPE tumor and liquid biopsies using a QIAamp DNA FFPE Tissue Kit and a QIAamp Circulating Nucleic Acid kit (Qiagen), respectively. Library preparations were performed using the KAPA Hyper Prep Kit (KAPA Biosystems). Target enrichment was performed using the xGen lockdown probes targeting 425 cancer-related genes (Geneseeq Prime). The target-enriched libraries were quantified by qPCR using the KAPA Library Quantification kit (KAPA Biosystems) and sequenced on HiSeq NGS platforms (Illumina) to generate 2 × 150-bp reads following the manufacturer's instructions.

Fusion detection and kinase fusion analysis
We used the Delly fusion calling tool 48 to identify the number of chimeric reads (sequencing paired ends mapped to different genes) and split reads (spanning a fusion breakpoint) from the targeted DNA-seq data. Fusions were removed if (1) their breakpoints involved intergenic loci or (2) they were supported by fewer than 3 split reads or fewer than 5 paired reads. Then, kinase fusions matched to the Kinase Database (including 538 kinase genes; http://kinase.com/web/current/) were evaluated for the conservation of the full catalytic kinase domain. All fusions were manually confirmed using the Integrative Genomics Viewer (IGV), and only kinase fusions retaining an intact kinase domain were included for analysis in this study. For patients with multiple tests, the same fusion was counted only once regardless of whether it was detected in multiple samples or different sample sources. Samples' therapeutic timelines, including treatment-naïve and post-TKI treatment, were considered unless this information was lacking.
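The filtering rules above can be sketched as a small post-processing step on fusion calls. This is an illustrative sketch only: it does not call Delly, the `FusionCall` record and its field names are hypothetical, and the de-duplication key ignores breakpoints, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionCall:
    """Hypothetical record for one fusion call in one sample."""
    patient_id: str
    gene5: str
    gene3: str
    split_reads: int
    paired_reads: int
    intergenic: bool            # True if a breakpoint lies in an intergenic locus
    kinase_domain_intact: bool  # True if the full catalytic domain is retained


def passes_filters(call, kinases, min_split=3, min_paired=5):
    """Apply the read-support, intergenic, and kinase-domain criteria."""
    if call.intergenic:
        return False
    if call.split_reads < min_split or call.paired_reads < min_paired:
        return False
    if call.gene5 not in kinases and call.gene3 not in kinases:
        return False
    return call.kinase_domain_intact


def unique_patient_fusions(calls, kinases):
    """Keep passing calls, counting each fusion once per patient."""
    seen = set()
    kept = []
    for call in calls:
        key = (call.patient_id, call.gene5, call.gene3)
        if passes_filters(call, kinases) and key not in seen:
            seen.add(key)
            kept.append(call)
    return kept
```

A call with 2 split reads or 4 paired reads is rejected under the default thresholds, and a fusion detected in both tissue and plasma from the same patient contributes a single entry.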

Mutation calling
Somatic alterations concurrent with kinase fusions were analyzed as previously described 49 . In brief, we used Trimmomatic for FASTQ file quality control (bases with a quality score below 15 and N bases were removed). Burrows-Wheeler Aligner (BWA v0.7.12) was used for mapping reads to the reference Human Genome (hg19). Local realignment around the indels and base quality score recalibration were performed with the Genome Analysis Toolkit (GATK 3.4.0). VarScan 2 was used for somatic mutation calling, requiring a MAF of at least 0.2% and at least three supporting reads. The Oncology Knowledge Base (OncoKB) was used to identify oncogenic alterations 50 . Information on targeted therapy-relevant hotspot alterations was also abstracted from the oncogenic alterations 51 .
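The two calling thresholds stated above (MAF of at least 0.2% and at least three supporting reads) can be expressed as a simple check. This sketch is not VarScan 2 and does not use its options; the function name and arguments are hypothetical, and MAF is assumed to be the fraction of reads supporting the variant.

```python
def passes_variant_thresholds(alt_reads, depth, min_maf=0.002, min_alt_reads=3):
    """Return True if a variant meets both the MAF and read-support cutoffs.

    alt_reads: reads supporting the variant allele
    depth: total reads covering the position
    """
    if depth <= 0:
        return False
    maf = alt_reads / depth
    return alt_reads >= min_alt_reads and maf >= min_maf
```

For example, 3 supporting reads at a depth of 1000 (MAF 0.3%) pass, whereas 3 supporting reads at a depth of 2000 (MAF 0.15%) fail the MAF cutoff and 2 supporting reads at a depth of 500 fail the read-support cutoff.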

Data analysis and statistics
The chi-squared test, Mann-Whitney U test, and Kruskal-Wallis H test were used for analyses of gender and age. A p-value < 0.05 was considered significant for all tests, unless otherwise indicated. Statistical analyses were performed in R (v.3.3.2) and Statistical Product and Service Solutions (SPSS).
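To illustrate the kind of gender comparison described here, the following is a minimal sketch of a Pearson chi-squared test on a 2 × 2 table using only the Python standard library. It is not the authors' R or SPSS code, it applies no continuity correction (R's `chisq.test` applies one by default for 2 × 2 tables), and the example counts are invented.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the p-value
    equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column sums to zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Invented counts: rows are two fusion groups, columns are male / female.
stat, p_value = chi_squared_2x2(10, 20, 20, 10)
significant = p_value < 0.05
```

The closed-form p-value holds only for one degree of freedom; larger tables would need a general chi-squared survival function.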

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The data presented in the study are deposited in the Genome Sequence Archive for Human (GSA-Human) repository, accession number HRA001045.