Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking

Abstract

Extensive prior research has focused on somatic copy-number alterations (SCNAs) affecting cancer genes, yet the extent to which recurrent SCNAs exert their influence through rearrangement of cis-regulatory elements (CREs) remains unclear. Here we present a framework for inferring cancer-related gene overexpression resulting from CRE reorganization (e.g., enhancer hijacking) by integrating SCNAs, gene expression data and information on topologically associating domains (TADs). Analysis of 7,416 cancer genomes uncovered several pan-cancer candidate genes, including IRS4, SMARCA1 and TERT. We demonstrate that IRS4 overexpression in lung cancer is associated with recurrent deletions in cis, and we present evidence supporting a tumor-promoting role. We additionally pursued cancer-type-specific analyses and uncovered IGF2 as a target for enhancer hijacking in colorectal cancer. Recurrent tandem duplications intersecting with a TAD boundary mediate de novo formation of a 3D contact domain comprising IGF2 and a lineage-specific super-enhancer, resulting in high-level gene activation. Our framework enables systematic inference of CRE rearrangements mediating dysregulation in cancer.
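To make the idea of integrating SCNAs with TAD information more concrete, the following minimal sketch shows one way SCNA breakpoints could be binned by TAD so that carrier and non-carrier samples can later be compared for expression of genes in the same domain. It is an illustration only and is not taken from the paper: the function names, the half-open interval convention and all coordinates are assumptions made for this example.

```python
from bisect import bisect_right

def assign_to_tad(position, tad_starts, tad_ends):
    """Return the index of the TAD containing `position`, or None.

    Assumes TADs are sorted, non-overlapping, half-open intervals
    [start, end) on a single chromosome (a convention chosen for this sketch).
    """
    i = bisect_right(tad_starts, position) - 1
    if i >= 0 and position < tad_ends[i]:
        return i
    return None

def carriers_by_tad(breakpoints, tad_starts, tad_ends):
    """Map TAD index -> set of sample IDs with at least one breakpoint in it."""
    carriers = {}
    for sample, position in breakpoints:
        i = assign_to_tad(position, tad_starts, tad_ends)
        if i is not None:
            carriers.setdefault(i, set()).add(sample)
    return carriers

# Hypothetical TADs and SCNA breakpoints (sample ID, coordinate).
tad_starts = [0, 1000, 2500]
tad_ends = [1000, 2000, 4000]
breakpoints = [("s1", 150), ("s2", 1500), ("s1", 1999), ("s3", 2200), ("s3", 2500)]

result = carriers_by_tad(breakpoints, tad_starts, tad_ends)
```

In this toy input, the breakpoint at coordinate 2200 falls in the gap between the second and third TAD and is therefore not assigned to any domain.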

Figure 1: CESAM: framework for uncovering SCNAs driving gene dysregulation in cis.
Figure 2: Analysis of the TERT locus: a CESAM pan-cancer hit.
Figure 3: Recurrent SCNAs in cis are associated with a marked increase in IRS4 expression.
Figure 4: SCNAs associated with marked IGF2 locus overexpression in cis in CRC.
Figure 5: Verification of IGF2 enhancer hijacking and model for mechanism involving de novo contact domain formation.

Acknowledgements

This research project was funded in part through Network of Excellence funding by the European Commission (260791 to J.O.K. and A.H.), grants from the German Ministry for Science and Education (BMBF) (01KU1505F to J.O.K. and S.M.P.; 01ZX1303A to R.T. and M.P.; 01ZX1406 to M.P.), a European Research Council (ERC) Starting Grant (336045 to J.O.K.), the Danish Medical Research Council (DFF-4183-00233 to J.W.), the DFG (KFO 227/BA 4806/1-2 to C.R.B. and KFO 227/GL286/1-1 to H.G.), the Baden–Württemberg Stiftung (P-LS-ASII/33 to C.R.B. and H.G.), iMed (Helmholtz Initiative on Personalized Oncology to H.G.), the EU framework program Horizon2020 (TRANSCAN-2 ERA-NET to H.G.), and the German Cancer Aid (Colon-Resist-Net to C.R.B. and H.G.). S.M.W. received funding through an SNSF Early Postdoc Mobility Fellowship (P2ELP3_155365) and an EMBO Long-Term Fellowship (ALTF 755-2014). T.D. was supported by a scholarship from the German Cancer Research Center. S.M.D. was supported by the Heidelberg School of Oncology. The results reported here are in part based upon data generated by the TCGA Research Network (“URLs”), and we acknowledge the specimen donors as well as the research groups involved in the sampling, sequencing and processing of these data. We are grateful to the NCT Tissue Bank for providing samples in accordance with the regulations of the tissue bank and the approval of the ethics committee of Heidelberg University. We are grateful to the GeneCore, IT, mouse facility and Flow Cytometry core facilities at EMBL for excellent assistance. We thank E. Furlong and B. Klaus for valuable discussions during early stages of this project, N. Sidiropoulos for assistance with Hi-C plots, and N. Habermann for assistance with manuscript formatting and proofreading. pMD2.G and psPAX2 were gifts from D. Trono (EPFL, Lausanne, France).

Author information

Contributions

H.G. and J.O.K. share joint senior authorship. J.W. and J.O.K. developed the CESAM methodology; J.W., A.P.D., S.M.W., T.Z., S.E. and J.O.K. carried out computational pan-cancer analysis of the TCGA public resource data set; J.W., A.P.D., B.R.M., T.D., C.R.B., H.G. and J.O.K. designed experiments; A.P.D., B.R.M., A.M.S., T.D., B.R., T.E., G.B., R.T., M.P., A.R.H., A.H., C.R.B., J.W., H.G. and J.O.K. worked on experiments in primary cancer samples; A.P.D., B.R., A.M.S., T.E., B.R.M., J.W. and J.O.K. conducted experiments in cell lines; T.D., A.P.D., B.R.M., A.M.S., S.M.D., J.W., C.R.B., H.G. and J.O.K. conducted experiments in spheroid cultures; C.S., S.M.D., C.R.B. and H.G. carried out enhancer luciferase experiments; J.W., A.P.D., B.R.M., Y.C., T.D., C.R.B., H.G. and J.O.K. prepared manuscript display items; M.S., A.H., M.P., H.B., W.W., O.T.B., P.A.N., S.M.P., I.P., S.K.S. and E.T. provided clinical information and human cancer tissue; W.W. carried out pathology analysis of spheroids; Y.C., A.P.D. and S.M.W. performed mouse experiments; and R.S., M.J., A.P.D., Y.C., J.W. and J.O.K. designed mouse experiments.

Corresponding authors

Correspondence to Hanno Glimm or Jan O Korbel.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 P value distribution and relative distance to genomic features of CESAM hits.

(a) QQ plot depicting observed P values (-log10) in comparison to expected P values (-log10). (b) Shadow figure to Figure 1d. Relative proximity to genetic elements compared to background (Mann–Whitney U test) for CESAM hits (‘CESAM’) versus ‘CONTROL’.

Supplementary Figure 2 Pan-cancer and ACC-specific CESAM analysis of gene expression changes for TERT-locus-proximal genes.

Related to main Figure 2. (a) Pan-cancer CESAM analysis of expression for genes in the vicinity of TERT, which demonstrates TERT as plausible target. (b) ACC-specific CESAM analysis of expression for genes in the vicinity of TERT, which demonstrates TERT as plausible target. For tumor type abbreviations, see Supplementary Table 6.

Supplementary Figure 3 Recurrent deletions at a TAD boundary on chromosome X are associated with IRS4 dysregulation in different cancer types.

Related to main Figure 3. IRS4 (a gene transcribed from the (-) strand of the reference genome) is highlighted in yellow, and a nearby genomic region exhibiting clustered transcription factor binding sites is highlighted in gray. Each SCNA is accompanied by the expression fold change of IRS4 (shown adjacent to each SCNA). Samples harboring the cis deletion, but not samples without the cis deletion, exhibit H3K27ac chromatin marks at IRS4 as well as at the nearby candidate CRE (highlighted in gray; asterisks indicate significant differential H3K27ac marks between carriers and non-carriers). 4C-Seq experiments using the candidate CRE as a viewpoint demonstrate physical interaction with IRS4; the interaction intensity was similar in samples harboring the cis deletion and in samples lacking it. 4C-Seq experiments using the gene as a viewpoint likewise show interaction between IRS4 and the putative CRE in both deletion carriers and non-carriers. LUSC samples S00086, S00473 and S00166 were verified, using semi-quantitative RT-PCR, to exhibit high-level upregulation of IRS4, whereas S00478 and S00198 showed merely baseline-level expression (data not shown). For all remaining carrier and non-carrier samples, we verified outlier and non-outlier expression, respectively, by qPCR (Supplementary Table 3).

Supplementary Figure 4 IRS4 CESAM pan-cancer frequency and association with IRS2 and FGFR1.

Related to main Figure 3. (a) Fraction of donors per cancer type for which CESAM inferred IRS4 dysregulation in conjunction with SCNAs in cis in at least 3 donors. For tumor type abbreviations, see Supplementary Table 6. (b) IRS4-IRS2 gene expression correlation plot, showing significantly anticorrelated expression (P=0.008, Pearson correlation; r=-0.11). (c) Co-occurrence heatmap of samples exhibiting IRS4 CESAM hits versus FGFR1 amplification in LUSC samples (ctrl: control). Significant co-occurrence was seen with P=0.006 (Pearson’s chi-square test).
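As a rough illustration of how a co-occurrence test such as the one in panel (c) can be computed, the sketch below evaluates Pearson's chi-square statistic for a 2×2 table without continuity correction. The counts are hypothetical and are not the data behind the figure; the function name is likewise an assumption. For one degree of freedom, the P value follows from the complementary error function.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (statistic, P value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 d.f., P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = CESAM hit / control,
# columns = amplification present / absent.
chi2, p_value = chi_square_2x2(10, 20, 30, 40)
chi2_null, p_null = chi_square_2x2(10, 10, 10, 10)
```

With small expected cell counts, an exact test would usually be preferred over this approximation.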

Supplementary Figure 5 Cancer-type-specific CESAM analysis of gene expression changes for structural variants versus control of IRS4 locus-proximal genes.

Related to main Figure 3. IRS4, COL4A5, COL4A6, VSIG1, PSMD10 and ATG4A expression values (RSEM) for SVs and controls, shown for different cancer types as well as for the pan-cancer setting. IRS4 is the most consistently upregulated gene, and SCNAs appear to converge on IRS4 overexpression at this extended genomic locus, implicating IRS4 as a plausible candidate gene.

Supplementary Figure 6 Cancer-type-specific CESAM analysis of gene expression changes for IRS4-locus-proximal genes.

Related to main Figure 3 and Supplementary Figure 5. IRS4, COL4A5, COL4A6, VSIG1, PSMD10 and ATG4A expression values (RSEM) for deletions (del), duplications (dup), amplifications (amp) and controls for different cancer types as well as in a pan-cancer setting. IRS4 is the most consistently upregulated gene.

Supplementary Figure 7 Tumor progression of transplanted IRS4-overexpressing HCC-15 cells and mock control in mouse model.

Related to main Figure 3. (a) Boxplots depicting mouse tumor progression curves of HCC-15 cells containing IRS4-expressing lentiviral constructs (pLenti-IRS4) versus mock control (pLenti-empty HCC-15); 1 × 10⁶ cells were injected per group; P=0.046 in the first experiment and P=0.03 in the second (two-tailed t-test computed at the last measured time point, day 39); N=8 for each group in the first experiment, and N=9 for the control and N=12 for the IRS4-overexpressing sample in the second experiment. (b) RT-qPCR of the control HCC-15 and IRS4-HCC-15 tumors confirming IRS4 overexpression. (c) Flow cytometry of tumors derived from HCC-15 cells carrying the IRS4-IRES-GFP overexpression vector. (d) Representative immunohistochemistry experiment showing IRS4 overexpression in the harvested tumors (bottom panel) but not in the control tumors (top panel).

Supplementary Figure 8 Structural variant analysis of chromosome X from LUSC samples associated with IRS4 gene overexpression and controls.

Related to main Figure 3. Read depth plot and somatic SVs from mate-pair sequencing data of LUSC samples exhibiting IRS4 overexpression as well as LUSC control samples with normal IRS4 expression levels. The location of the IRS4 gene is indicated with a green arrow. Read depth changes and SV types are shown as colored graphs. IRS4 alterations included simple as well as more complex SV events, some of which showed evidence for chromosome shattering (also known as chromothripsis). IRS4 expression fold-change is shown for each sample.

Supplementary Figure 9 Proposed model for IRS4 overexpression by active chromatin spreading.

Related to main Figure 3. Proposed model for IRS4 overexpression by active chromatin spreading, resulting from clustered deletions at a TAD boundary in cis of IRS4.

Supplementary Figure 10 Tumor-type-specific analysis and frequency of CESAM hits per tumor type.

Related to main Table 1.

Supplementary Figure 11 Recurrent somatic duplications at the IGF2 locus associated with IGF2 overexpression.

Related to main Figure 5. (a) Recurrent somatic duplications at the IGF2 locus associated with IGF2 overexpression encompass a contact domain boundary and a non-cognate annotated super-enhancer in the adjacent contact domain, but do not encompass the known IGF2 cognate enhancer (light blue). H3K27ac peaks show the presence of a non-cognate enhancer in the contact domain adjacent to the IGF2 locus. H3K27me3 marks for WT samples and samples harboring the duplication show absence of repressive chromatin at the adjacent non-cognate enhancer. 4C-Seq experiments using IGF2 as the viewpoint demonstrate a marked physical interaction between the IGF2 locus and the non-cognate enhancer in samples with the recurrent tandem duplication, but not in samples lacking the tandem duplication (WT). By comparison, no physical interaction is seen between IGF2 and its known cognate enhancer in either tandem duplication or WT samples. 4C-Seq experiments using the non-cognate enhancer as viewpoint verify the marked physical interaction with IGF2 in tandem duplication carriers and not in WT samples. Collectively, these results demonstrate that hijacking of a non-cognate super-enhancer, mediated by a contact-domain-spanning recurrent SCNA, drives overexpression of the IGF2 locus (see also Figure 5). CTCF marks for WT samples and samples harboring the duplication are consistent with the presence of TAD boundaries normally separating IGF2 and the non-cognate (super) enhancer. (b) IGF2 expression fold change versus copy number ratio (tumor/normal).

Supplementary Figure 12 Enhancer validation in primary CRC-derived spheroid cultures.

Related to main Figure 5. (a) Additional experiments to characterize the mechanism of gene dysregulation were pursued in (non-TCGA based) spheroid cultures obtained from primary CRC. (b) Distribution of IGF2 expression measurements in additional primary samples and spheroids. Two spheroids with marked overexpression, denoted CRCP5S and CRCP7S, were used for further characterization along with controls lacking IGF2 expression. Expression values are based on qPCR. (c) Functional activity of enhancer regions spanning the IGF2-interacting distal super-enhancer locus. Selected parts of the enhancer regions found to be interacting with the IGF2 locus in tandem-duplicated spheroid cells were amplified by PCR, cloned into a firefly luciferase reporter construct and co-transfected with a renilla luciferase control construct into the colon cancer cell line HCT116 as well as HeLa (cervical cancer cell line) cells used as a control. 48 h post transfection, cells were lysed and luciferase activities were measured. Firefly luciferase signal was normalized to renilla luciferase signal and displayed as fold activity normalized to empty vector control (Ctrl) (Mean ± SEM in triplicate).
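The normalization described in panel (c) can be sketched as follows: each firefly reading is divided by its renilla reading, the ratio is expressed relative to the mean ratio of the empty-vector control, and the replicates are summarized as mean ± SEM. This is a minimal sketch under stated assumptions, not the authors' analysis script; the function name and all readings are hypothetical, and the uncertainty of the control itself is not propagated.

```python
from math import sqrt
from statistics import mean, stdev

def fold_activity(firefly, renilla, ctrl_firefly, ctrl_renilla):
    """Return (mean, SEM) of fold activity relative to the empty-vector control.

    Each argument is a list of replicate readings; firefly[i] and renilla[i]
    come from the same well.
    """
    ratios = [f / r for f, r in zip(firefly, renilla)]
    ctrl_ratio = mean(f / r for f, r in zip(ctrl_firefly, ctrl_renilla))
    folds = [ratio / ctrl_ratio for ratio in ratios]
    sem = stdev(folds) / sqrt(len(folds))  # sample SD divided by sqrt(n)
    return mean(folds), sem

# Hypothetical triplicate readings (arbitrary luminescence units).
enhancer_firefly = [400.0, 440.0, 360.0]
enhancer_renilla = [100.0, 100.0, 100.0]
ctrl_firefly = [100.0, 100.0, 100.0]
ctrl_renilla = [100.0, 100.0, 100.0]

fold_mean, fold_sem = fold_activity(
    enhancer_firefly, enhancer_renilla, ctrl_firefly, ctrl_renilla
)
```

Dividing by the renilla signal corrects for well-to-well differences in transfection efficiency and cell number before the comparison with the empty vector is made.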

Supplementary Figure 13 Single-copy super-enhancer juxtaposing tandem duplications result in IGF2 overexpression.

Related to main Figure 5. (a) Four IGF2 single-copy gene duplications not associated with stark levels of IGF2 overexpression (orange bars) are shown next to IGF2 single-copy tandem duplications leading to ~300-fold IGF2 upregulation. Based on their position relative to preexisting contact domains/TADs (black bars), 3/4 of these larger duplications are not able to bring IGF2 and the super-enhancer (SE) depicted in Fig. 5 into a single de novo contact domain; the lack of associated IGF2 overexpression is hence consistent with our new model of IGF2 dysregulation by enhancer hijacking (Fig. 5d). For the fourth duplication, juxtaposition of IGF2 and the SE is inferred to occur at a larger distance. (b) Gene expression fold change for single-copy duplicated genes on chromosome 11. Single-copy IGF2 duplications not inferred to bring the SE and IGF2 into a de novo contact domain (denoted class A) and duplications inferred to juxtapose the SE and IGF2, bringing both elements into a de novo contact domain (class B), are depicted separately. For the 58 single-copy duplicated genes observed on chromosome 11 in CRC samples, the median gene expression increase was ~1.4-fold. (c) Panel supporting (b), showing a significant difference in IGF2 expression between class A and class B single-copy IGF2 duplications (P=0.01; Wilcoxon rank-sum test). (d) RNA expression of IGF2 (y-axis) for class A (green dots) and class B (purple dots) versus the estimated distance in base pairs (bp) between IGF2 and the SE upon tandem duplication (x-axis).
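The class A versus class B comparison in panel (c) uses a Wilcoxon rank-sum test. As an illustration of how such a test works on small groups, the sketch below computes an exact two-sided P value by enumerating all possible rank assignments. It assumes no tied values, the function name is hypothetical, and the expression values are invented for the example rather than taken from the study.

```python
from itertools import combinations


def rank_sum_exact(a, b):
    """Exact two-sided Wilcoxon rank-sum test for small samples without ties.

    Returns (rank sum of group a, P value).
    """
    pooled = sorted(a + b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # assumes no ties
    n, total_n = len(a), len(a) + len(b)

    observed = sum(rank[value] for value in a)
    expected = n * (total_n + 1) / 2  # mean rank sum under the null
    observed_dev = abs(observed - expected)

    # Enumerate every way of assigning n of the ranks to group a and count
    # assignments at least as far from the expectation as the observed one.
    count = extreme = 0
    for ranks in combinations(range(1, total_n + 1), n):
        count += 1
        if abs(sum(ranks) - expected) >= observed_dev:
            extreme += 1
    return observed, extreme / count


# Hypothetical expression fold changes for two duplication classes.
class_a = [1.2, 0.9, 1.5, 1.1]
class_b = [250.0, 310.0, 280.0, 400.0]
w, p = rank_sum_exact(class_a, class_b)
print(w, round(p, 4))
```

Exhaustive enumeration is only practical for small groups; larger samples are usually handled with a normal approximation or a statistics library.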

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–13, Supplementary Table 7 and Supplementary Note (PDF 3005 kb)

Supplementary Table 1

Full list of CESAM hits in tumor-specific analysis (sheet 1) and tumor-type abbreviations (XLSX 44 kb)

Supplementary Table 2

SCNAs and fold-change expression from TERT, IRS4 and IGF2 CESAM analyses (XLSX 58 kb)

Supplementary Table 3

Xenograft mouse experiment data with IRS4-overexpressing HCC-15 LUSC cell lines and control. Data include tumor volume measurements, statistics and qPCR measurements. (XLSX 37 kb)

Supplementary Table 4

RT-qPCR data from LUSC samples to identify IRS4 overexpressors (separate Excel file). First, all 94 samples were tested in duplicate with IRS4 primers only; the results are provided in the “IRS4 Ct raw” column. Then, the 15 highest IRS4-expressing samples (taking the known RIN value into account) and 7 representative normal-expression samples were selected, and tests were repeated in another experiment using both IRS4 and globulin primers. The fold-change results after normalization with the housekeeping gene are reported. (XLSX 9 kb)

Supplementary Table 5

Patient characteristics and RT-qPCR data from CRC samples to identify IGF2 overexpressors (XLSX 12 kb)

Supplementary Table 6

Luciferase enhancer reporter experiment measurements (XLSX 12 kb)

Supplementary Table 8

Primer sequences (XLSX 10 kb)

About this article

Cite this article

Weischenfeldt, J., Dubash, T., Drainas, A. et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat Genet 49, 65–74 (2017). https://doi.org/10.1038/ng.3722
