Genomic alterations and possible druggable mutations in carcinoma of unknown primary (CUP)

Carcinoma of unknown primary (CUP) is a heterogeneous metastatic disease in which the primary site of origin cannot be detected. Currently, chemotherapy is the only state-of-the-art treatment option for CUP patients. Molecular profiling of the tumour, particularly mutation detection, offers a new, personalized treatment approach for CUP using targeted agents. We analyzed the mutation and copy number alteration profiles of 1709 CUP samples deposited in the AACR Project Genomics Evidence Neoplasia Information Exchange (GENIE) cohort and explored potentially druggable mutations. We identified 52 significantly mutated genes (SMGs) among CUP samples, of which 13 (25%) were potentially targetable with drugs that are either approved for the known primary tumour or undergoing clinical trials. The most frequently mutated genes were TP53 (43%), KRAS (19.90%), KMT2D (12.60%), and CDKN2A (10.30%). Additionally, using pan-cancer analysis, we found similar TERT promoter variants in CUP and non-small cell lung cancer (NSCLC) samples, suggesting that these mutations may serve as a diagnostic marker for identifying the primary tumour in CUP. Taken together, mutation profiling of CUP tumours may open a new way of identifying druggable targets and, consequently, administering appropriate treatment in a personalized manner.

Of all patients diagnosed with cancer, 2% present with metastatic carcinoma of unknown primary site (CUP) 1 . CUP is defined as any metastatic epithelial tumour for which extensive clinical history, physical examination, radiological studies and histopathological investigations fail to identify the primary site 2 . According to the European Society for Medical Oncology (ESMO) guidelines for the treatment of patients with favourable-risk CUP, various chemotherapy regimens, alone or in combination with radiotherapy or hormonal therapy, are the only standard treatment 3 . Because of CUP tumour heterogeneity, clinical trials are challenging to perform, and the prognosis remains poor, with a median survival of less than 12 months and a 5-year survival of 14% 4 . Thus, there is an urgent need to improve treatment modalities and prolong the survival of patients with CUP 5 .
Personalized cancer medicine based on genomic technologies has opened new ways to treat various types of cancer through the identification of targetable mutations [6][7][8][9][10] . Recent studies have highlighted the crucial role of precision medicine in patient stratification and the selection of effective treatment in malignant cancers [11][12][13][14][15][16][17] . Moreover, several studies have reported improved overall survival in patients with advanced and metastatic cancers who received genetically matched targeted therapies 18,19 . In CUP tumours, this approach may improve treatment by targeting tumour-specific, druggable somatic variants in a personalized manner 4 . The AACR Project Genomics Evidence Neoplasia Information Exchange (GENIE) has recently collected genomic information, including mutations and copy number variation, for a wide range of solid tumours, including CUP, from both primary and metastatic tumours [20][21][22] . Using these public data, we analyzed the genomic mutations and copy number alterations of 1709 CUP samples to provide insight into the genetic makeup of these tumours and to determine potentially druggable targets.
included in this study. The sample type distribution in the GENIE cohort was 24,567 primary and 15,484 metastatic tumours. The hotspot regional mutations and copy number variations of these samples were available from GENIE and cBioPortal. According to the information provided by GENIE, we divided samples into more than 17 broader cancer types, including CUP samples (Fig. 1A). The cancer types containing the most samples were non-small cell lung cancer (9090 (15.3%)), breast invasive ductal carcinoma (8712 (14.7%)), colorectal cancer (5961 (10.0%)), glioma (3214 (5.4%)), melanoma (2492 (4.2%)) and prostate cancer (2214 (3.7%)). (Fig. 1D).
Significantly mutated genes (SMG) in CUP samples. We analyzed the most frequent genomic mutations of hotspot regions at the gene level in CUP samples in GENIE according to a previously developed method 9,10 . In total, 52 SMGs were identified (Fig. 2A; Supplementary Table 1). Among the SMGs, the mutation rates of TP53, KRAS, ARID1A, SMARCA4 and KMT2D were significantly higher than those of the other identified SMGs (Fig. 2B, Supplementary Table 1). Pathway enrichment analysis showed that the identified SMGs are involved in a wide range of cellular processes (Fig. 2C Zehir et al. 9 highlighting TERT promoter mutations across few primary tumours, we observed similar TERT promoter mutations among CUP samples (n = 91) (Fig. 2D). Although the clinical relevance of mutations in the TERT promoter remains incompletely understood, our results reaffirm the high prevalence of these alterations in patients with advanced solid tumours and suggest an association with disease progression and poor outcome. Additionally, the presence of similar TERT promoter mutations in CUP and NSCLC samples suggests that these mutations may serve as a diagnostic marker for identifying the primary tumour in CUP patients.
Table 4). Among CUP samples, deep deletions of TP53, RB1, CDKN2A, and STK11 and amplifications of KRAS and PIK3CA were observed. In a pan-cancer analysis, the most frequently amplified genes were KRAS and PIK3CA in breast cancer (66 and 114 cases) and non-small cell lung cancer (46 and 48 cases), TERT in non-small cell lung cancer (114 cases) and ATR in breast cancer (36 cases), while deletions of CDKN2A in glioma (676 cases) and of RB1 and TP53 in small cell lung cancer (15 cases) were observed across these 14 cancer types (Fig. 4C).
Among the genes with significantly altered copy numbers between CUP and primary tumours, a significant amplification of the TERT promoter was observed in both CUP and non-small cell lung cancer samples compared with glioma and breast primary tumours, suggesting that copy number variation of TERT may play a diagnostic role in identifying the origin of CUP tumours (Fig. 4D).

Mutation frequency of CUP-SMGs across 17 known primary tumours.
To identify similar and targetable mutation patterns in CUP, we compared the genomic alteration frequency of the identified CUP-SMGs across 17 primary cancer types in GENIE (Fig. 5A). The majority of CUP-SMG mutations were enriched in non-small cell lung cancer (4221 cases), colon cancer (4011 cases) and breast cancer (3376 cases) (Fig. 5A).
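As an illustration only, a per-cancer-type mutation frequency of the kind compared here could be tabulated as in the following Python sketch. The input layout (a mapping from sample ID to cancer type, plus a list of sample/gene mutation calls), the function name and the toy data are assumptions made for this example and do not reflect the GENIE file format or the authors' code.

```python
from collections import defaultdict


def mutation_frequency(sample_types, mutations):
    """Return {cancer_type: {gene: percent of samples with >= 1 mutation}}.

    sample_types: dict mapping sample_id -> cancer type (hypothetical layout)
    mutations:    iterable of (sample_id, gene) pairs, one per mutation call
    """
    # Number of samples per cancer type (the denominator)
    totals = defaultdict(int)
    for cancer_type in sample_types.values():
        totals[cancer_type] += 1

    # A sample is counted at most once per gene, however many calls it has
    mutated = defaultdict(set)
    for sample_id, gene in mutations:
        if sample_id in sample_types:
            mutated[(sample_types[sample_id], gene)].add(sample_id)

    freq = defaultdict(dict)
    for (cancer_type, gene), samples in mutated.items():
        freq[cancer_type][gene] = 100.0 * len(samples) / totals[cancer_type]
    return {cancer_type: dict(genes) for cancer_type, genes in freq.items()}


# Toy data, not taken from GENIE
sample_types = {"s1": "CUP", "s2": "CUP", "s3": "NSCLC", "s4": "NSCLC"}
mutations = [
    ("s1", "TP53"), ("s1", "TP53"),  # two calls in one sample count once
    ("s2", "KRAS"),
    ("s3", "TP53"), ("s4", "TP53"),
]
freq = mutation_frequency(sample_types, mutations)
print(freq)
```

Counting each sample once per gene keeps the result a proportion of mutated samples rather than a count of mutation calls, which matches how the percentages in this section are phrased.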
The most frequently mutated gene in this cohort was TP53 (44% of all samples) (Fig. 5B). Its mutations predominate in non-small cell lung cancer (46.36%; 2517 cases), colon cancer (65.55%; 2365 cases) and breast cancer (36.26%; 2060 cases) (Fig. 5B). KRAS is the second most commonly mutated gene, occurring frequently (> 10%) in most cancer types (pancreatic cancer: 74.6%, colon cancer: 44.24%, non-small cell lung cancer: 30.93%) except hepatobiliary carcinoma, cervical cancer, bladder cancer, thyroid cancer, melanoma, small-cell lung cancer, head and neck carcinoma, prostate cancer and breast cancer (Fig. 5B). PIK3CA mutations were frequent in breast cancer (36.7%) and cervical cancer (25.14%), being specifically enriched in luminal subtype tumours. Many cancer types carried mutations in chromatin remodelling genes. In particular, the histone-lysine N-methyltransferase genes KMT2D, KMT2C and KMT2B were mutated in bladder, lung and endometrial cancers, whereas KMT2A was mostly mutated in non-small cell lung cancer and colon cancer. Mutations in ARID1A are frequent in non-small cell lung cancer, colon cancer, bladder cancer and breast cancer, whereas mutations in KEAP1 and STK11 predominate in non-small cell lung cancer (8.62% and 11.75%, respectively) (Fig. 5B). KRAS mutations are typically mutually exclusive, with recurrent activating mutations (KRAS (Gly 12) and KRAS (Gly 13)) common in colon cancer, non-small cell lung cancer and pancreatic cancer. We compared the most common KRAS hotspot mutations between CUP and other KRAS-mutation-enriched cancer types (Fig. 5C). This comparison showed enrichment of G12D and G12R in pancreatic cancer, and of G12C, G12F and G13C in non-small cell lung cancer and CUP samples. These data highlight the similarity of KRAS hotspot mutations between CUP and NSCLC.
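The KRAS hotspot comparison can be pictured as a per-cancer-type tally of amino acid changes. The Python sketch below is a hypothetical illustration: the input layout, the function name and the toy calls are assumptions, and the numbers it prints are not results from this study.

```python
from collections import Counter, defaultdict


def hotspot_distribution(calls):
    """Return {cancer_type: {protein_change: fraction of that type's calls}}.

    calls: iterable of (cancer_type, protein_change) pairs for one gene,
           e.g. ("CUP", "G12C"); this layout is assumed for the sketch.
    """
    counts = defaultdict(Counter)
    for cancer_type, change in calls:
        counts[cancer_type][change] += 1

    distribution = {}
    for cancer_type, counter in counts.items():
        total = sum(counter.values())
        distribution[cancer_type] = {
            change: n / total for change, n in counter.items()
        }
    return distribution


# Toy KRAS calls, invented for illustration
calls = [
    ("CUP", "G12C"), ("CUP", "G12C"), ("CUP", "G12D"), ("CUP", "G13C"),
    ("Pancreatic", "G12D"), ("Pancreatic", "G12R"),
]
dist = hotspot_distribution(calls)
print(dist["CUP"])
```

Normalizing within each cancer type makes the hotspot profiles comparable even when the cohorts differ greatly in size.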
Targetable mutations and drug candidates. To identify or predict possible therapeutics based on the genomic alterations identified in SMGs in CUP samples, we performed a gene-drug association analysis using the PanDrugs platform 23 . The gene-drug associations are classified into two groups: "Drug targets", in which drugs can directly target genes that contribute to the disease phenotype, and "Biomarkers", in which genes represent a drug-response-associated status while their protein products are not targetable 23 . From 262 identified    (Fig. 5D). Interestingly, we found several FDA-approved drugs, namely Crizotinib (GScore: 0.76; DScore: 0.95), Copanlisib (GScore: 0.76; DScore: 0.92), Dabrafenib, Sorafenib, Vemurafenib, and Regorafenib, as the best candidates for targeting ALK/MET, PIK3CA, and BRAF, respectively (Fig. 5D; Supplementary Table 5). Moreover, various off-label and clinically investigated compounds for targeting mutated KRAS were identified, although the GScore and DScore of these compounds were not high (Supplementary Table 5). Everolimus (mTOR inhibitor) and Bortezomib (26S proteasome inhibitor) had the highest GScore and DScore among the drug candidates in this group (Supplementary Table 5). Taken together, these data highlight the presence of at least one druggable variant and the potential of genomic alteration-guided targeted therapy in CUP patients.

Discussion
Currently, combination chemotherapy regimens are considered the first-line therapy for CUP patients 24 . Personalized cancer therapy based on the identification of druggable mutations has encouraged mutational profiling of various types of tumours, including metastatic tumours such as CUP [25][26][27] . This study analyzed the most significantly mutated genes and identified the most prevalent variants in 1709 CUP samples. The gene-drug association analysis suggested that at least one of the identified variants is linked to known, approved targeted therapy agents or to therapeutics currently in clinical trials, highlighting the potential of a genomic alteration-based treatment approach for patients with CUP. In line with this concept, numerous clinical studies have reported durable treatment responses using mutation-matched targeted therapies directed at EGFR, BRAF, KIT, and MET 18,28-30 . Currently, the targeted therapy agents Crizotinib and Copanlisib are approved for the treatment of tumours that harbour mutations in ROS1/MET/ALK and PIK3CA, while therapeutic agents for the other identified variants, including the FGFR family, MYC, MET, and KRAS, are under investigation in ongoing clinical trials. A large proportion of the mutations detected in this study are associated with signal transduction pathways, apoptotic regulation, cell cycle progression, and the regulation of receptor tyrosine kinase signalling. These results are promising because the majority of available targeted drugs act on one of these pathways, which are commonly altered in various types of cancer with known primary tumours [31][32][33][34][35] . The most frequently mutated gene identified in this study was TP53 (43%, 743/1709), with numerous non-synonymous coding region variants.
Consistent with these data, previous studies demonstrated an association between TP53 mutations and metastatic progression in multiple cancer types, supporting the high TP53 mutation load reported in CUP 36,37 .
Other common variants detected in this cohort were observed in genes involved in activating and regulating key signal transduction pathways, including BRAF and KRAS. This is the first study to report various codon 12 variants of KRAS in CUP samples. The detection of codon 12 mutations in this cohort is consistent with the highly aggressive behaviour of CUP tumours 25,29 . Furthermore, characterizing the mutational status of KRAS has become clinically relevant in some malignancies because the presence of a KRAS mutation is known to promote resistance to some tyrosine kinase inhibitors [38][39][40][41] . Although no approved therapeutic agent is currently available to target and inhibit mutant KRAS activity, recent clinical studies reported a partial response in CUP patients with a KRAS(G12D) mutation treated with Trametinib (MEK inhibitor) 30,42 . In this study, we also observed the KRAS(G12C) variant in 25% of CUP samples. Recent promising results in NSCLC from Sotorasib (AMG-510), a specific covalent inhibitor of KRAS(G12C), suggest that this KRAS variant is a possible druggable target in CUP patients 43 . Moreover, targeting KRAS(G12C) with Sotorasib in advanced solid tumours showed encouraging anticancer activity, which might be useful in CUP 44 .
Similar to other studies, we also identified activating BRAF (V600E) mutations in 4.3% (74/1709) of cases [24][25][26] . This offers the potential of using BRAF inhibitors such as Vemurafenib and Dabrafenib for CUP with BRAF (V600E) mutations. In line with this, the gene-drug association analysis also yielded a high GScore and DScore for the BRAF inhibitors Dabrafenib and Vemurafenib for targeting the V600E variant identified in CUP samples. Moreover, a clinical study showed a complete clinical response in CUP patients treated with the BRAF(V600E)-targeted therapy Vemurafenib in combination with the immunotherapy agent Ipilimumab 45 .
MET mutations and ERBB2 (HER2) amplification were detected in 30 and 27 cases, respectively, suggesting the possibility of targeting these receptor tyrosine kinases 28 . Targeting MET with Crizotinib in patients without exon-14 skipping, combined with the HER2 inhibitor Trastuzumab, has been used successfully in CUP patients. The current success of combined HER2- and MET-targeted therapy with Trastuzumab (for cases with HER2 amplification) and Crizotinib in advanced and metastatic tumours, including HER2-amplified and MET-mutant CUP tumours, supports further evaluation of these genes as druggable targets in patients with CUP 46 . Our results support those of other CUP studies highlighting the value of sequencing techniques, particularly gene mutation detection, for identifying actionable targets 11,[24][25][26][27] .
Taken together, these data highlight the molecular heterogeneity of CUP tumours. The mutations detected across the majority of CUP cases included in this study highlight not only the genomic instability present in these tumours but also the potential application of targeted therapies for a significant proportion of patients with CUP, which might improve the prognosis and therapeutic decisions for these patients 12 .

Material and methods
Data collection. GENIE v5.0 provided the mutation, copy number variation, gene fusion and clinical information of 59,442 tumour samples 21 . Most onco-types were classified into 17 categories according to Oncotree (http://oncotree.mskcc.org/oncotree/). The onco-types not included in these 17 categories were excluded from our analysis. Raw data were downloaded from Synapse (syn17112456, https://www.synapse.org/) and provided by the GENIE project using either R commands or cBioPortal (https://genie.cbioportal.org/) 47,48 . The preprocessing protocols for these data are described in the GENIE-provided data guide.
Significantly mutated genes (SMG) analysis. The SMG analysis was performed according to previously developed criteria and protocols 20,21 . We used the MuSiC suite 49 to identify significant genes for CUP samples and also for pan-cancer tumours. This test assigns mutations to seven categories (AT transition, AT transversion, CG transition, CG transversion, CpG transition, CpG transversion and indel) and then uses statistical methods based on convolution, the hypergeometric distribution (Fisher's test P value < 0.05), and likelihood to combine the category-specific binomials into overall P values. Notably, genes with a cohort-level alteration frequency of ≥ 5% or a tumour type-specific alteration frequency of ≥ 30% were included in our analysis, while tumours with no mutations or with more than 500 mutations were excluded from this study. Differentially mutated sites were plotted using the MutationMapper module in cBioPortal (https://www.cbioportal.org/mutation_mapper).
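The seven mutation categories and the inclusion thresholds stated above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the MuSiC implementation: the function names, the input layouts and the precomputed `in_cpg` flag are hypothetical, and the statistical tests themselves are not reproduced.

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def classify_mutation(ref, alt, in_cpg):
    """Assign one of the seven categories named in the text.

    ref, alt: reference and alternate alleles ("-" denotes an empty allele)
    in_cpg:   True if the reference C/G base lies in a CpG dinucleotide
              (assumed to be precomputed from sequence context)
    """
    if len(ref) != 1 or len(alt) != 1 or "-" in (ref, alt):
        return "indel"
    kind = "transition" if (ref, alt) in TRANSITIONS else "transversion"
    if ref in ("A", "T"):
        base = "AT"
    else:
        base = "CpG" if in_cpg else "CG"
    return f"{base} {kind}"


def filter_samples(mutation_counts, max_mutations=500):
    """Keep tumours with at least one and at most max_mutations mutations."""
    return {s for s, n in mutation_counts.items() if 0 < n <= max_mutations}


def select_genes(cohort_freq, type_freq, cohort_min=5.0, type_min=30.0):
    """Keep genes altered in >= 5% of the cohort or >= 30% of any tumour type.

    cohort_freq: gene -> cohort-level alteration frequency in percent
    type_freq:   gene -> {tumour type: alteration frequency in percent}
    """
    selected = set()
    for gene, frequency in cohort_freq.items():
        per_type = type_freq.get(gene, {}).values()
        if frequency >= cohort_min or any(v >= type_min for v in per_type):
            selected.add(gene)
    return selected


# Toy inputs, invented for illustration
kept = filter_samples({"s1": 0, "s2": 12, "s3": 501, "s4": 500})
genes = select_genes(
    {"TP53": 43.0, "GENE_X": 1.0, "GENE_Y": 2.0},
    {"GENE_X": {"NSCLC": 35.0}, "GENE_Y": {"NSCLC": 10.0}},
)
print(sorted(kept), sorted(genes))
```

Here `GENE_X` passes only through the tumour type-specific threshold, which shows why the two criteria are combined with a logical "or".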
Copy number variation analysis. Copy number alteration data were available from AACR Project GENIE in cBioPortal. In the present study, we selected the 17 most common cancer types and compared their copy number variation frequencies with those of CUP samples. We calculated the changes in the average frequency of copy number variation (amplification and deletion) between CUP and pan-cancer samples using the R code provided in cBioPortal.
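A minimal Python sketch of an amplification and deletion frequency comparison is given below. It assumes discrete copy-number calls in the cBioPortal convention (2 for amplification, -2 for deep deletion); the function names, input layout and toy values are hypothetical, and this is not the R code referred to above.

```python
def cna_frequency(cna_calls):
    """Return gene -> (amplification fraction, deep-deletion fraction).

    cna_calls: gene -> list of discrete copy-number calls, one per sample,
               assumed to follow the -2/-1/0/1/2 encoding
    """
    frequencies = {}
    for gene, calls in cna_calls.items():
        n = len(calls)
        if n == 0:
            continue  # no profiled samples for this gene
        amplified = sum(1 for c in calls if c == 2)
        deleted = sum(1 for c in calls if c == -2)
        frequencies[gene] = (amplified / n, deleted / n)
    return frequencies


def frequency_difference(cup_freq, other_freq):
    """Difference (CUP minus comparison cohort) for genes present in both."""
    return {
        gene: (
            cup_freq[gene][0] - other_freq[gene][0],
            cup_freq[gene][1] - other_freq[gene][1],
        )
        for gene in cup_freq
        if gene in other_freq
    }


# Toy calls, invented for illustration
cup = cna_frequency({"KRAS": [2, 0, 0, 2], "CDKN2A": [-2, -2, 0, -1]})
other = cna_frequency({"KRAS": [0, 0, 0, 2], "CDKN2A": [0, 0, -2, 0]})
diff = frequency_difference(cup, other)
print(diff)
```

Shallow losses and low-level gains (-1 and 1) are deliberately not counted, so the sketch reports only high-level events.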
Mutual exclusivity and co-occurrence analysis. We used Fisher's exact test to identify pairs of SMGs with significant (Benjamini-Hochberg adjusted P value < 0.001) exclusivity or co-occurrence. We identified significant pairs by analyzing all CUP samples together. We then used a de novo driver exclusivity algorithm, Dendrix 50 , to identify sets of approximately mutually exclusive mutations across all samples. Mutual exclusivity and co-occurrence were plotted using Gitools software (version 2.3.1) 51 .
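The pairwise test described here can be illustrated with a small, dependency-free Python sketch: a two-sided Fisher's exact test on the 2x2 table of each gene pair, followed by Benjamini-Hochberg adjustment. The function names, the input layout (gene to set of mutated sample IDs) and the toy data are assumptions for this example. The sketch does not cover Dendrix or Gitools and is not the authors' pipeline.

```python
from itertools import combinations
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one
    total = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def pairwise_tests(gene_samples, all_samples):
    """Test every gene pair; return (g1, g2, tendency, P, adjusted P)."""
    rows = []
    for g1, g2 in combinations(sorted(gene_samples), 2):
        s1, s2 = gene_samples[g1], gene_samples[g2]
        a, b, c = len(s1 & s2), len(s1 - s2), len(s2 - s1)
        d = len(all_samples) - a - b - c
        tendency = "co-occurrence" if a * d > b * c else "exclusivity"
        rows.append((g1, g2, tendency, fisher_exact_two_sided(a, b, c, d)))
    adjusted = benjamini_hochberg([row[3] for row in rows])
    return [row + (q,) for row, q in zip(rows, adjusted)]


# Toy data: two genes mutated in disjoint halves of six samples
all_samples = {"s1", "s2", "s3", "s4", "s5", "s6"}
gene_samples = {"GENE_A": {"s1", "s2", "s3"}, "GENE_B": {"s4", "s5", "s6"}}
results = pairwise_tests(gene_samples, all_samples)
print(results)
```

The direction of each pair is taken from the sign of the log odds ratio (comparing a·d with b·c), while significance comes from the adjusted P value, which would then be compared with the 0.001 threshold.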

Data availability
The genomic data from the GENIE dataset used in this study are openly available for download at https://www.synapse.org, reference number [syn17112456] 21 . All data generated and described in this article are available from the corresponding web servers and portal and are freely available to download for noncommercial purposes, without breaching participant confidentiality.