Abstract
Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment. We used REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations, demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes.
Acknowledgements
This project was supported in part by US National Institutes of Health grants R01 CA154480, R01 CA121941, U01 CA176058, R01 CA109467 and U01 CA184898-02.
Author information
Contributions
J.W.K., O.B.B., J.P.M., J.S.B., W.C.H. and P.T. designed and conceptualized the method. O.B.B., O.A., C.B. and P.T. implemented the algorithm. J.W.K., D.A.B., J.R., Y.S., M.E.A., P.S.H., A.A., H.G., F.V., B.A.W., E.M.V.A., D.D.S., T.I.Z., R.B., L.A.G., C.M.J., D.J.K., J.P.M. and P.T. analyzed and interpreted results. A.R.A.-R., M.A., C.R., G.S., D.D., G.G., M.G., G.A., M.N., A.L., A.T. and P.T. provided expertise or work on specific issues regarding algorithmic approaches, data analysis, data preparation, data resources, benchmarking, validation datasets and method comparisons. J.W.K., O.B.B., M.E.A., J.R., J.P.M. and P.T. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 First iteration of REVEALER for the β-catenin activation example.
A) The target profile is a β-catenin/TCF4 reporter and the seed feature is the mutation status of β-catenin (CTNNB1.MC_MUT). REVEALER finds APC mutation as the top hit according to the Conditional Information Coefficient (CIC = 0.49). B) The clustered version of the same top scoring features shown in A.
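These captions report scores as a Conditional Information Coefficient (CIC) without defining it on this page. As background only, the following recalls Linfoot's informational measure of correlation, which rescales a mutual information I(X;Y) (in nats) to the interval [0, 1] and reduces to |ρ| for a bivariate normal pair with correlation ρ. The conditional variant on the last line is an illustrative assumption about how such a coefficient could be conditioned on a seed or summary feature; the symbols T (target), F (candidate feature) and S (seed or summary feature) are introduced here and are not taken from the source.

```latex
% Linfoot's informational measure of correlation
r(X,Y) = \sqrt{1 - e^{-2\,I(X;Y)}}

% Bivariate normal check: I(X;Y) = -\tfrac{1}{2}\ln(1-\rho^{2})
% \Rightarrow r(X,Y) = \sqrt{1 - (1-\rho^{2})} = |\rho|

% Conditional mutual information via the chain rule
I(T;F \mid S) = I(T;F,S) - I(T;S)

% Assumed conditional analogue (illustrative, not the authors' stated definition)
r(T,F \mid S) = \sqrt{1 - e^{-2\,I(T;F \mid S)}}
```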
Supplementary Figure 2 Second iteration of REVEALER for the β-catenin activation in cancer example.
A) Conditioning on the summary feature from the first iteration, which consists of the seed feature (β-catenin mutation: CTNNB1.MC_MUT) and APC mutation (APC.MC_MUT), REVEALER finds amplifications in 13q33 (feature ITGBL1_AMP) as the best match. B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 3 First iteration of REVEALER for the transcriptional NRF2 activation in lung cancer example.
The target is a profile of transcriptional activation of NRF2 and the seed feature is the NRF2 mutation and amplification status. A) REVEALER finds KEAP1 mutation as the top hit according to the Conditional Information Coefficient (CIC = 0.53). B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 4 Second iteration of REVEALER for the transcriptional NRF2 activation in lung cancer example.
A) Conditioning on the summary feature from the first iteration, which consists of the seed feature (NFE2L2_MUT + NFE2L2_AMP) and KEAP1 mutations, REVEALER finds amplifications in chr15q22/26 (feature OR4F13P_AMP) as the top hit. B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 5 Assessment of the REVEALER features for the transcriptional NRF2 activation in lung cancer example in an independent test dataset.
The dataset, from The Cancer Genome Atlas (TCGA), consists of 153 tumor samples (adenocarcinomas and squamous lung cancers). The four features (NFE2L2_MUT, NFE2L2_AMP, KEAP1_MUT and NOX5_AMP (15q23)) appear to generalize well to primary tumors and account for a substantial number of the tumor samples with NRF2 activation.
Supplementary Figure 6 First iteration of REVEALER for the drug sensitivity to a MEK-inhibitor example.
A) The target is the MEK-inhibitor PD0325901 sensitivity profile and there is no seed feature (NULLSEED). REVEALER identifies BRAF mutation as the top hit. B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 7 Second iteration of REVEALER for the drug sensitivity to a MEK-inhibitor example.
A) Conditioning on the summary feature from the first iteration, which consists of BRAF mutation, REVEALER finds KRAS mutations as the top hit. B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 8 Third iteration of REVEALER for the drug sensitivity to a MEK-inhibitor example.
A) Conditioning on the summary feature from the second iteration, which consists of BRAF and KRAS mutations, REVEALER finds NRAS mutations as the top hit. B) The clustered version of the same top scoring features shown in A.
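The three iterations captioned above follow a greedy pattern: start from a seed (here none), pick the feature that best matches the target among samples not yet explained, merge it into a summary feature, and repeat. The sketch below illustrates only that loop on toy data. It assumes the summary feature is the sample-wise OR of the features picked so far, and it swaps in a plain Pearson correlation over uncovered samples as the score, which is a stand-in and not the Conditional Information Coefficient. All function names, feature names and values are hypothetical.

```python
import numpy as np

def stand_in_score(target, feature, summary):
    """Pearson correlation of target and feature over samples the summary
    feature does not already cover.
    ASSUMPTION: a simple stand-in score, NOT the CIC used by REVEALER."""
    uncovered = summary == 0
    t, f = target[uncovered], feature[uncovered]
    if t.size < 2 or t.std() == 0 or f.std() == 0:
        return 0.0  # nothing left to correlate
    return float(np.corrcoef(t, f)[0, 1])

def greedy_complement(target, features, seed, n_iter):
    """Pick n_iter features one at a time, OR-merging each pick into the summary."""
    summary = seed.copy()
    picked = []
    for _ in range(n_iter):
        scores = {name: stand_in_score(target, f, summary)
                  for name, f in features.items() if name not in picked}
        best = max(scores, key=scores.get)
        picked.append(best)
        # ASSUMPTION: the summary feature is the sample-wise OR of seed and picks.
        summary = np.maximum(summary, features[best])
    return picked, summary

# Toy data (hypothetical): ten samples ordered by a made-up target profile.
target = np.array([1.0, 0.9, 0.8, 0.8, 0.7, 0.6, 0.1, 0.2, 0.0, 0.1])
features = {
    "feat_A":     np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0]),
    "feat_B":     np.array([0, 0, 1, 1, 0, 0, 0, 0, 0, 0]),
    "feat_C":     np.array([0, 0, 0, 0, 1, 1, 0, 0, 0, 0]),
    "feat_noise": np.array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1]),
}
null_seed = np.zeros(10, dtype=int)  # no seed feature, as in the NULLSEED case

picked, summary = greedy_complement(target, features, null_seed, n_iter=3)
print(picked)
print(summary.tolist())
```

On this toy input the three mutually exclusive features are selected in order of how strongly they track the target, and the noise feature is never picked. Restricting the score to uncovered samples is one simple way to favor complementary features; an information-based conditional score would serve the same role.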
Supplementary Figure 9 First iteration of REVEALER for the KRAS-dependency example.
A) The target is a profile of relative KRAS dependence and the seed feature is the mutation status of KRAS. REVEALER identifies a copy number gain across a region on chromosome 8q23-24 (feature NSMCE2_AMP) as the most complementary genomic alteration to KRAS mutation. B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 10 Second iteration of REVEALER for the KRAS-dependency example.
A) REVEALER identifies 9p21.2 amplification (feature LINGO2_AMP). B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 11 Third iteration of REVEALER for the KRAS-dependency example.
A) REVEALER identifies 9p12 deletion (feature FAM74A4). B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 12 Fourth iteration of REVEALER for the KRAS-dependency example.
A) REVEALER identifies 12p12.1 amplification (feature LINC00477). B) The clustered version of the same top scoring features shown in A.
Supplementary Figure 14 Simulated benchmark results.
A) Simulated generation of the target, seed and complementary feature. B) Four examples of simulated iteration instances showing the target, seed, complementary feature and the best of the random features. C) Empirical histogram and skew-t distribution fit to the genomic alteration frequencies in the REVEALER input feature dataset. D) Plot of the IC/CIC (left) and COR/PCOR (right) association metric values for each simulated instance in the benchmark. E) Probability of detecting the signal (complementary feature) vs. the noise (best random feature), as given by a logistic model, as a function of the CIC value of the complementary feature. F) Signal-to-noise ratios of the CIC/information-based (y-axis) vs. the PCOR/correlation-based (x-axis) metrics.
Supplementary Figure 15 Comparative results for different algorithms.
A) Comparative heatmaps of the top-feature results in the β-catenin activation example using REVEALER, the Elastic Net and Dendrix. B) Comparative heatmaps of the top-feature results in the NRF2 activation example using REVEALER, the Elastic Net and Dendrix. C) Comparative heatmaps of the top-feature results in the MEK-inhibition example using REVEALER, the Elastic Net and Dendrix. D) Comparative heatmaps of the top-feature results in the KRAS-dependency example using REVEALER, the Elastic Net and Dendrix. E) Summary ROC curves for the simulated data benchmark using the Elastic Net and mRMR feature selection methods, including all cases regardless of whether the seed was identified as one of the two top features.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–15 (PDF 11139 kb)
Supplementary Code
Supplementary Code: REVEALER (ZIP 498 kb)
About this article
Cite this article
Kim, J., Botvinnik, O., Abudayyeh, O. et al. Characterizing genomic alterations in cancer by complementary functional associations. Nat Biotechnol 34, 539–546 (2016). https://doi.org/10.1038/nbt.3527
DOI: https://doi.org/10.1038/nbt.3527