Characterizing genomic alterations in cancer by complementary functional associations

Abstract

Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment. We used REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations, demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes.

Figure 1: REVEALER information-based metrics.
Figure 2: REVEALER results for transcriptional activation of β-catenin in cancer.
Figure 3: REVEALER results for transcriptional NRF2 activation in lung cancer.
Figure 4: REVEALER results for the drug sensitivity to a MEK-inhibitor example.
Figure 5: REVEALER results for KRAS-dependency.
Figure 6: Simulated data results.

Acknowledgements

This project was supported in part by US National Institutes of Health grants R01 CA154480, R01 CA121941, U01 CA176058, R01 CA109467 and U01 CA184898-02.

Author information

Contributions

J.W.K., O.B.B., J.P.M., J.S.B., W.C.H. and P.T. designed and conceptualized the method. O.B.B., O.A., C.B. and P.T. implemented the algorithm. J.W.K., D.A.B., J.R., Y.S., M.E.A., P.S.H., A.A., H.G., F.V., B.A.W., E.M.V.A., D.D.S., T.I.Z., R.B., L.A.G., C.M.J., D.J.K., J.P.M. and P.T. analyzed and interpreted results. A.R.A.-R., M.A., C.R., G.S., D.D., G.G., M.G., G.A., M.N., A.L., A.T. and P.T. provided expertise or work on specific issues regarding algorithmic approaches, data analysis, data preparation, data resources, benchmarking, validation datasets and method comparisons. J.W.K., O.B.B., M.E.A., J.R., J.P.M. and P.T. wrote the manuscript.

Corresponding author

Correspondence to Pablo Tamayo.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 First iteration of REVEALER for the β-catenin activation example

A) The target profile is a β-catenin/TCF4 reporter and the seed feature is the mutation status of β-catenin (CTNNB1.MC_MUT). REVEALER finds APC mutation as the top hit according to the Conditional Information Coefficient (CIC = 0.49). B) The clustered version of the same top scoring features shown in A.
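For orientation, one way to write a coefficient of the kind reported in this caption is to rescale a conditional mutual information onto [0, 1) with Linfoot's informational measure of correlation. The notation below (target t, candidate feature x, seed or summary feature s) is assumed for this sketch, and the sum is written for discretized variables; the estimator used in the published method may differ, and a sign (for example, that of the correlation between t and x) can be attached to indicate direction.

```latex
\mathrm{CIC}(t, x \mid s) = \sqrt{1 - e^{-2\, I(t;\, x \mid s)}},
\qquad
I(t; x \mid s) = \sum_{t,\, x,\, s} p(t, x, s)\,
  \log \frac{p(s)\, p(t, x, s)}{p(t, s)\, p(x, s)}
```

With this scaling, a feature that carries no information about the target beyond the seed scores 0, and the score approaches 1 as the conditional mutual information grows.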

Supplementary Figure 2 Second iteration of REVEALER for the β-catenin activation in cancer example.

A) Conditioning on the summary feature after the first iteration, consisting of the seed feature (β-catenin mutation: CTNNB1.MC_MUT) and APC mutation (APC.MC_MUT), REVEALER finds amplifications in 13q33 (feature ITGBL1_AMP) as the best match. B) The clustered version of the same top scoring features shown in A.
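The iterative pattern in these captions (score candidates conditional on a seed, pick the top hit, fold it into a summary feature, repeat) can be illustrated with a minimal sketch. It is not the published implementation. It assumes a binarized target, a plug-in conditional mutual information computed from counts, an unsigned coefficient sqrt(1 - exp(-2 I)), and a summary feature formed as the logical OR of the seed and the selected features. The feature names and toy data are hypothetical.

```python
import math
from collections import Counter


def conditional_mi(t, x, s):
    """Plug-in estimate of I(t; x | s) in nats for equal-length discrete sequences."""
    n = len(t)
    n_txs = Counter(zip(t, x, s))
    n_ts = Counter(zip(t, s))
    n_xs = Counter(zip(x, s))
    n_s = Counter(s)
    mi = 0.0
    for (ti, xi, si), c in n_txs.items():
        mi += (c / n) * math.log(c * n_s[si] / (n_ts[(ti, si)] * n_xs[(xi, si)]))
    return max(mi, 0.0)  # clip tiny negative rounding errors


def cic(t, x, s):
    """Unsigned coefficient in [0, 1): sqrt(1 - exp(-2 * I(t; x | s)))."""
    return math.sqrt(1.0 - math.exp(-2.0 * conditional_mi(t, x, s)))


def greedy_complementary(target, seed, features, n_iter=2):
    """Pick the top-scoring feature per iteration, then OR it into the summary."""
    summary = list(seed)
    picked = []
    for _ in range(n_iter):
        candidates = {k: v for k, v in features.items() if k not in picked}
        if not candidates:
            break
        best = max(candidates, key=lambda k: cic(target, candidates[k], summary))
        picked.append(best)
        # Assumption: the summary feature is the union (logical OR) of selections.
        summary = [a | b for a, b in zip(summary, features[best])]
    return picked, summary


# Hypothetical toy data: 1 = phenotype high / alteration present, 0 = otherwise.
target = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
seed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
features = {
    "GENE_A_MUT": [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
    "GENE_B_AMP": [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    "GENE_C_MUT": [1, 0, 0, 0, 0, 0, 1, 1, 0, 0],
}
picked, summary = greedy_complementary(target, seed, features, n_iter=2)
print(picked, summary)
```

In this toy case the two selected features are mutually exclusive with the seed and with each other, and their union with the seed covers every phenotype-high sample. Because the coefficient here is unsigned, a real analysis would also need to check the direction of each association.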

Supplementary Figure 3 First iteration of REVEALER for the transcriptional NRF2 activation in lung cancer example.

The target profile is a profile of transcriptional activation of NRF2 and the seed feature is the NRF2 mutation and amplification status. A) REVEALER finds KEAP1 mutation as the top hit according to the Conditional Information Coefficient (CIC = 0.53). B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 4 Second iteration of REVEALER for the transcriptional NRF2 activation in lung cancer example.

A) Conditioning on the summary feature after the first iteration, consisting of the seed feature (NFE2L2_MUT + NFE2L2_AMP) and KEAP1 mutations, REVEALER finds amplifications in chr15q22/26 (feature OR4F13P_AMP) as the top hit. B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 5 Assessment of the REVEALER features for the transcriptional NRF2 activation in lung cancer example in an independent test dataset.

The dataset, from The Cancer Genome Atlas (TCGA), consists of 153 tumor samples (adenocarcinomas and squamous lung cancers). The four features (NFE2L2_MUT, NFE2L2_AMP, KEAP1_MUT and NOX5_AMP (15q23)) appear to generalize well to primary tumors and explain a significant number of the tumor samples with NRF2 activation.

Supplementary Figure 6 First iteration of REVEALER for the drug sensitivity to a MEK-inhibitor example.

A) The target is the MEK-inhibitor PD0325901 sensitivity profile and there is no seed feature (NULLSEED). REVEALER identifies BRAF mutation as the top hit. B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 7 Second iteration of REVEALER for the drug sensitivity to a MEK-inhibitor example.

A) Conditioning on the summary feature after the first iteration, consisting of BRAF mutation, REVEALER finds KRAS mutations as the top hit. B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 8 Third iteration of REVEALER for the drug sensitivity to a MEK-inhibitor example.

A) Conditioning on the summary feature after the second iteration, consisting of BRAF and KRAS mutations, REVEALER finds NRAS mutations as the top hit. B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 9 First iteration of REVEALER for the KRAS-dependency example.

A) The target profile is a profile of relative KRAS dependence and the seed feature is the mutation status of KRAS. REVEALER identifies a copy number gain across a region on chromosome 8q23-24 (feature NSMCE2_AMP) as the most complementary genomic alteration to KRAS mutation. B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 10 Second iteration of REVEALER for the KRAS-dependency example.

A) REVEALER identifies 9p21.2 amplification (feature LINGO2_AMP). B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 11 Third iteration of REVEALER for the KRAS-dependency example.

A) REVEALER identifies 9p12 deletion (feature FAM74A4). B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 12 Fourth iteration of REVEALER for the KRAS-dependency example.

A) REVEALER identifies 12p12.1 amplification (feature LINC00477). B) The clustered version of the same top scoring features shown in A.

Supplementary Figure 13 Empirical histograms and fitted skew-t distributions for the target profiles of the NRF2 and MEK-inhibition examples.

Supplementary Figure 14 Simulated benchmark results.

A) Simulated generation of the target, seed and complementary feature. B) Four examples of simulated iteration instances showing the target, seed, complementary feature and the best of the random features. C) Empirical histogram and skew-t distribution fit to the genomic alteration frequencies in the REVEALER input feature dataset. D) Plot of the IC/CIC (left) and COR/PCOR (right) association metric values for each simulated instance in the benchmark. E) Plot of the probability of detecting the signal (complementary feature) vs. the noise (best random feature), as provided by a logistic model, as a function of the CIC values of the complementary feature. F) Signal-to-noise ratios of the CIC/information-based (y-axis) vs. the PCOR/correlation-based (x-axis) metrics.

Supplementary Figure 15 Comparative results for different algorithms.

A) Comparative heatmaps of the top feature results in the β-catenin activation example using REVEALER, the Elastic Net and Dendrix. B) Comparative heatmaps of the top feature results in the NRF2 activation example using REVEALER, the Elastic Net and Dendrix. C) Comparative heatmaps of the top feature results in the MEK-inhibition example using REVEALER, the Elastic Net and Dendrix. D) Comparative heatmaps of the top feature results in the KRAS-dependency example using REVEALER, the Elastic Net and Dendrix. E) Summary ROC curves for the simulated data benchmark using the Elastic Net and mRMR feature selection methods, including all cases regardless of whether the seed was identified as one of the two top features.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–15 (PDF 11139 kb)

Supplementary Code

Supplementary Code: REVEALER (ZIP 498 kb)

About this article

Cite this article

Kim, J., Botvinnik, O., Abudayyeh, O. et al. Characterizing genomic alterations in cancer by complementary functional associations. Nat Biotechnol 34, 539–546 (2016). https://doi.org/10.1038/nbt.3527

  • DOI: https://doi.org/10.1038/nbt.3527
