Cancer cells retain genomic alterations that provide a selective advantage. The prediction and validation of advantageous alterations are major challenges in cancer genomics. Moreover, it is crucial to understand how the coexistence of specific alterations alters response to genetic and therapeutic perturbations. In the present study, we inferred functional alterations and preferentially selected combinations of events in >9,000 human tumors. Using a Bayesian inference framework, we validated computational predictions with high-throughput readouts from genetic and pharmacological screenings on 2,000 cancer cell lines. Mutually exclusive and co-occurring cancer alterations reflected, respectively, functional redundancies able to rescue the phenotype of individual target inhibition, or synergistic interactions, increasing oncogene addiction. Among the top scoring dependencies, co-alteration of the phosphoinositide 3-kinase (PI3K) subunit PIK3CA and the nuclear factor NFE2L2 was a synergistic evolutionary trajectory in squamous cell carcinomas. By integrating computational, experimental and clinical evidence, we provide a framework to study the combinatorial functional effects of cancer genomic alterations.
All data analyzed in this study are publicly available through different data portals: TCGA, https://gdc.cancer.gov/about-data/publications/pancanatlas; CCLE, https://portals.broadinstitute.org/ccle/about; CellModelPassports, https://cellmodelpassports.sanger.ac.uk/downloads; CCLP, http://cancer.sanger.ac.uk/cell_lines; AVANA and DEMETER2, https://depmap.org/portal; SCORE, https://score.depmap.sanger.ac.uk; CTRP, https://portals.broadinstitute.org/ctrp/?page=#ctd2BodyHome; GDSC, http://www.cancerrxgene.org. Data for the DRIVE dataset were obtained by private request to the contact author. A detailed description of all data sources is available in the Supplementary Note.
The latest version of the SELECT algorithm (v.1.6) is available at http://ciriellolab.org/select/select.html. The development version of SELECT is available in the Git repository https://bitbucket.org/cso_repo/select. The custom code implementing the Bayesian inference framework discussed in the manuscript was written in R and is available at https://bitbucket.org/cso_repo/eda.
We thank A. Sottoriva and G. Caravagna for providing inferred tumor phylogenies of the TRACERx cohort, and E. Oricchio and B. Correia for their critical reading of and feedback on our work. This work was supported by the Swiss National Science Foundation (grant no. 310030_169519). Additional support was provided by the Gabriella Giorgi Cavaglieri Foundation (to G.C.).
The authors declare no competing interests.
a) KRAS gene essentiality across the cancer cell lines in the AVANA dataset (Y axis; negative scores indicate greater gene essentiality). Cancer cell lines are classified according to KRAS mutations (n_G12 = 69, n_other_drivers = 16, n_neutral = 7, n_wt = 389). b) 3D spatial conformation of the KRAS protein (PDB ID: 5OCG). Amino acids hit by functional single nucleotide variants are colored.
Comparison of alteration frequencies for selected events (n = 545) between human primary samples in TCGA (n = 9083) and cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE, n = 1461).
Extended Data Fig. 3 Gene essentiality scores associated with putative functional and neutral mutations.
a) Systematic comparison of DEMETER2 gene essentiality scores in cell lines with functional (F) alterations for a given gene vs. cell lines wild type for the same gene. Y axis: P values of the ANOVA analysis, performed as described in the Supplementary Note, section "statistical framework for association studies". X axis: effect size. Red and blue dots represent significant (q-value < 0.05) oncogenes (OGs) and tumor suppressor genes (TSGs), respectively. Gray dots: q-value > 0.05. The exact number of independent biological cancer cell lines used to derive the statistics is reported in Supplementary Table 2. b) Systematic comparison of DEMETER2 gene essentiality scores in cell lines with putative neutral (N) alterations for a given gene vs. cell lines wild type for the same gene. See panel a for details about the plot. c) Systematic comparison of SCORE gene essentiality scores in cell lines with F mutations for a given gene vs. cell lines wild type for the same gene. See panel a for details about the plot. d) Systematic comparison of SCORE gene essentiality scores in cell lines with N mutations for a given gene vs. cell lines wild type for the same gene. See panel a for details about the plot.
Extended Data Fig. 4 Significant differences of gene essentiality scores are influenced by alteration frequency and co-mutations.
a) Histogram of the number of genes (Y axis) that were functionally altered in a given number of cancer cell lines (X axis) in any of the four screening datasets (AVANA, DEMETER2, DRIVE, SCORE). b) Gene essentiality scores from the AVANA dataset upon NF1 knock-out in NF1-altered and wild-type cell lines from the central nervous system (CNS; left, n_wt = 35, n_alt = 13) and skin melanoma (right, n_wt = 25, n_alt = 3). Cell lines harboring activating mutations in either BRAF, KRAS, or NRAS are highlighted. The thick central line of each box plot represents the median, the bounding box corresponds to the 25th–75th percentiles and the whiskers extend up to 1.5 times the interquartile range.
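The box plot convention described in this legend can be made concrete with a short sketch. This is a minimal illustration, not the authors' plotting code: the function name `box_plot_summary` is hypothetical, and the quartile interpolation rule (Python's "inclusive" method) is an assumption, since the legend does not state which rule was used.

```python
from statistics import quantiles


def box_plot_summary(values, whisker=1.5):
    """Summarize values as a box plot: median, quartile box, whiskers, outliers.

    Assumption: quartiles use linear interpolation over the observed range
    (method="inclusive"); the legend does not specify the rule.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers stop at the most extreme data points that lie within the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }
```

Points beyond the whiskers are reported separately as outliers, which is why a whisker can be shorter than 1.5 times the interquartile range.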
a) Tail ratio analysis for SELECT scores on the TCGA GAM. X axis: SELECT score threshold (x). Y axis: the ratio between the percentage of SELECT solutions (on the real GAM) with a score greater than or equal to the threshold x and the average percentage of SELECT solutions (on randomized GAMs) with a score greater than or equal to the threshold x. b) Density distributions of the distances between mutational signature profiles computed for each alteration event. Distributions were separately derived for mutually exclusive (ME) alterations (solid purple line), randomized ME alterations (dashed purple line), co-occurrent (CO) alterations (solid green line), and randomized CO alterations (dashed green line). c) For each testable ED (colored dots) between alterations X and Y, the plot shows the mean probability of clonal (X axis) and subclonal (Y axis) co-occurrence over the set of samples exhibiting both X and Y (double-altered samples). d) Detailed view of the probability of subclonal co-occurrence (color coded) in double-altered samples for the 10 EDs with the highest mean subclonal co-occurrence probability. Samples with probability greater than 0.1 are annotated with the corresponding tumor type.
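The tail ratio in panel a can be written down directly from its definition. The sketch below is illustrative only and is not taken from the SELECT code base: the function names and the toy inputs are hypothetical, and fractions are used in place of percentages (the ratio is the same).

```python
from typing import Sequence


def tail_fraction(scores: Sequence[float], threshold: float) -> float:
    """Fraction of scores greater than or equal to the threshold."""
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)


def tail_ratio(
    real_scores: Sequence[float],
    randomized_runs: Sequence[Sequence[float]],
    threshold: float,
) -> float:
    """Tail fraction on the real matrix divided by the mean tail fraction
    over the randomized matrices, at a given score threshold."""
    real = tail_fraction(real_scores, threshold)
    null = sum(tail_fraction(run, threshold) for run in randomized_runs) / len(
        randomized_runs
    )
    if null == 0.0:
        # No randomized solution reaches the threshold: the ratio is unbounded
        # (or undefined when the real tail is empty as well).
        return float("inf") if real > 0 else float("nan")
    return real / null
```

A ratio well above 1 at a given threshold means that scores at least that high are more frequent in the real data than in the randomized data.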
a) Archetype of the data structure and association analysis performed in this work. Samples (for example, cancer cell lines) are classified into four categories according to the presence or absence of functional alterations in gene x1 or x2. Each sample is annotated with a phenotype y (a real number). b) Output produced by the ANOVA and Bayesian statistical frameworks. The Bayesian framework returns posterior estimates of the effect sizes (that is, the mean values) of the phenotypes of each alteration class. c) Example of direct post-hoc analysis performed in either the ANOVA or the Bayesian setting. d) Example of indirect post-hoc analysis developed for the Bayesian framework.
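The data layout in panel a can be sketched as follows. This only illustrates the four-class structure: the class labels and function names are hypothetical, and the plain per-class sample means computed here are a stand-in, not the posterior estimates returned by the Bayesian framework.

```python
from collections import defaultdict
from statistics import mean


def alteration_class(x1_altered: bool, x2_altered: bool) -> str:
    """Assign a sample to one of four classes from the status of x1 and x2."""
    if x1_altered and x2_altered:
        return "alt/alt"
    if x1_altered:
        return "alt/wt"
    if x2_altered:
        return "wt/alt"
    return "wt/wt"


def class_means(samples):
    """Group (x1_altered, x2_altered, y) records by class and average y.

    Sample means are used here for illustration only; they are not the
    posterior effect-size estimates described in the legend.
    """
    groups = defaultdict(list)
    for x1_altered, x2_altered, phenotype in samples:
        groups[alteration_class(x1_altered, x2_altered)].append(phenotype)
    return {label: mean(values) for label, values in groups.items()}
```

Each record pairs the alteration status of the two genes with the phenotype y, for example a gene essentiality score.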
a) Example of synthetic positive (P) and negative (N) cases generated to mimic the real-case scenarios (as in panel b). Extensive sets of synthetic P and N cases were generated, with different combinations of the Dp, Np, and Nn parameters. Dp: the difference between the mean parameters of the normal distributions from which phenotype values for the samples in the red and purple classes are drawn. Np and Nn: the number of samples in the red and purple classes, for positive (P) and negative (N) cases, respectively. The synthetic dataset was used to assess and compare the ANOVA and Bayesian inference frameworks. Boxplots in this panel are used as symbolic examples and do not represent actual data. b) True and false positive rates for direct and indirect post-hoc tests, for synthetic sets of P and N cases with different effect sizes (Dp) and the same number of P and N samples (Np = Nn). c) True and false positive rates for direct and indirect post-hoc tests, for more extreme cases with small effect size (Dp) and a lower number of P samples (Np << Nn). d) True and false positive rates (left panel) and FDR (right panel) for synthetic sets of P and N cases with small effect size (Dp) and a lower number of P samples (Np << Nn). The thick central line of each box plot in all panels, with the exception of panel e, represents the median, the bounding box corresponds to the 25th–75th percentiles and the whiskers extend up to 1.5 times the interquartile range. The data in these boxplots are randomly drawn from normal distributions and do not represent actual data.
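A minimal sketch of how such synthetic cases might be generated is given below. It is written under stated assumptions and is not the authors' simulation code: the function names are hypothetical, both classes are given unit standard deviation, negative cases are assumed to have no mean difference, and the number of cases per set is a free parameter.

```python
import random


def synthetic_case(n_samples, delta, rng, sigma=1.0):
    """Draw phenotype values for two classes whose normal means differ by delta.

    Assumption: both classes share the same standard deviation (sigma).
    """
    reference = [rng.gauss(0.0, sigma) for _ in range(n_samples)]
    shifted = [rng.gauss(delta, sigma) for _ in range(n_samples)]
    return reference, shifted


def synthetic_dataset(dp, n_p, n_n, n_cases, seed=0):
    """Generate n_cases positive cases (mean difference dp, n_p samples per
    class) and n_cases negative cases (no mean difference, n_n samples per
    class). The seed makes the draws reproducible."""
    rng = random.Random(seed)
    positives = [synthetic_case(n_p, dp, rng) for _ in range(n_cases)]
    negatives = [synthetic_case(n_n, 0.0, rng) for _ in range(n_cases)]
    return positives, negatives
```

True and false positive rates then follow by applying a test to every case and counting how many positive and negative cases it calls significant.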
Extended Data Fig. 8 Assessing the functional impact of evolutionary dependencies in cancer cell lines.
a) Comparison of the difference between observed and expected overlap fraction (that is, the fraction of double-altered samples) in the TCGA (X axis) and CCLE (Y axis) datasets for all significant EDs detected by SELECT in the TCGA cohort. EDs with high weighted mutual information differences are highlighted in red. b) Significant associations between EDs and gene essentiality. For each ED (the knocked-out gene is in blue), we report the change in gene essentiality score determined in each screening where the ED could be tested (arrows: the tail is the value in single-altered cell lines, the arrowhead points to the value in double-altered cells; green: co-occurrence; purple: mutual exclusivity). Significant changes are annotated with a thick-line arrow. c) Detailed AVANA gene essentiality scores for cancer cell lines of intestinal lineage upon KRAS knock-out; cell lines are stratified according to the alterations in the KRAS and KMT2D genes (X axis, n_alt/alt = 2, n_alt/wt = 8, n_wt/alt = 3, n_wt/wt = 6). The thick central line of each box plot represents the median, the bounding box corresponds to the 25th–75th percentiles and the whiskers extend up to 1.5 times the interquartile range. d, e) The fraction of co-occurrent (green line) and mutually exclusive (purple line) EDs leading to decreased vs. increased cell fitness (0-centered to the median of the random EDs, gray distribution) for the (d) AVANA and (e) DRIVE datasets. One-sided P values are derived empirically by comparing the observed fraction of increased and decreased cell fitness to the null distribution expected for random EDs.
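An empirical one-sided P value of the kind described for panels d and e can be sketched as below. The function name is hypothetical, and the add-one correction is a common convention assumed here (it prevents a P value of exactly zero); the legend does not say whether the authors applied it.

```python
def empirical_p_value(observed, null_values, alternative="greater"):
    """One-sided empirical P value of an observed statistic against a null sample.

    Assumption: an add-one correction is applied to numerator and denominator,
    so the smallest attainable P value is 1 / (len(null_values) + 1).
    """
    if alternative == "greater":
        extreme = sum(v >= observed for v in null_values)
    elif alternative == "less":
        extreme = sum(v <= observed for v in null_values)
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return (extreme + 1) / (len(null_values) + 1)
```

Here `null_values` would hold the statistic computed on each set of random EDs and `observed` the value obtained for the real EDs.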
a) Schematic of the procedure to map EDs identified by SELECT in the pan-cancer and single tumor type studies to a given cohort of interest. b–d) Evolutionary axes inferred for (b) brain tumors (low grade glioma and glioblastoma cohorts), (c) gastric tumors (colorectal and stomach cancer cohorts), and (d) squamous cell carcinomas (lung squamous cell, head and neck, esophageal cancer cohorts). Axes comprise mutually exclusive (purple edges) and co-occurrent (green edges) EDs between altered oncogenes (red circles) and tumor suppressors (blue circles).
a) Oncoprint summarizing the alteration occurrences in TCGA lung cancer patients. Samples are sorted by the evolutionary axes and altered genes in each axis are shown separately. b) Number of TRACERx lung cancer patients with cancer genes functionally altered in the first clone (X axis) or in a subclone (Y axis), based on the trajectories inferred by the REVOLVER algorithm. c) Detailed gene essentiality scores in cell lines upon knock-out of NFE2L2, based on the alteration status of PIK3CA and NFE2L2. Gene essentiality scores were taken from DRIVE (left, n_driver/driver = 6, n_driver/wt = 6, n_wt/driver = 60, n_wt/wt = 216) and DEMETER2 (right, n_driver/driver = 8, n_driver/wt = 11, n_wt/driver = 108, n_wt/wt = 374). Cell lines from lung cancer lineage are highlighted in red. d) Detailed representation of drug sensitivity values (EC50 concentrations, Y axis) to BRD-K34222889 for cancer cell lines from the KRAS-STK11-KEAP1 or NFE2L2-PIK3CA evolutionary axes (X axis, n_KRAS+STK11+KEAP1 = 10, n_KRAS_only = 87, n_PIK3CA+NFE2L2 = 5, n_PIK3CA_only = 68). Cell lines from lung cancer lineage are highlighted in blue. The thick central line of each box plot represents the median, the bounding box corresponds to the 25th–75th percentiles and the whiskers extend up to 1.5 times the interquartile range.
Mina, M., Iyer, A., Tavernari, D. et al. Discovering functional evolutionary dependencies in human cancers. Nat Genet 52, 1198–1207 (2020). https://doi.org/10.1038/s41588-020-0703-5
Nature Genetics (2020)