Structural variants drive context-dependent oncogene activation in cancer

Xu, Zhichao; Lee, Dong-Sung; Chandran, Sahaana; Le, Victoria T.; Bump, Rosalind; Yasis, Jean; Dallarda, Sofia; Marcotte, Samantha; Clock, Benjamin; Haghani, Nicholas; Cho, Chae Yun; Akdemir, Kadir C.; Tyndale, Selene; Futreal, P. Andrew; McVicker, Graham; Wahl, Geoffrey M.; Dixon, Jesse R.

doi:10.1038/s41586-022-05504-4

Article
Published: 07 December 2022

Nature volume 612, pages 564–572 (2022)

Abstract

Higher-order chromatin structure is important for the regulation of genes by distal regulatory sequences^1,2. Structural variants (SVs) that alter three-dimensional (3D) genome organization can lead to enhancer–promoter rewiring and human disease, particularly in the context of cancer^3. However, only a small minority of SVs are associated with altered gene expression^4,5, and it remains unclear why certain SVs lead to changes in distal gene expression and others do not. To address these questions, we used a combination of genomic profiling and genome engineering to identify sites of recurrent changes in 3D genome structure in cancer and determine the effects of specific rearrangements on oncogene activation. By analysing Hi-C data from 92 cancer cell lines and patient samples, we identified loci affected by recurrent alterations to 3D genome structure, including oncogenes such as MYC, TERT and CCND1. By using CRISPR–Cas9 genome engineering to generate de novo SVs, we show that oncogene activity can be predicted by using ‘activity-by-contact’ models that consider partner region chromatin contacts and enhancer activity. However, activity-by-contact models are only predictive of specific subsets of genes in the genome, suggesting that different classes of genes engage in distinct modes of regulation by distal regulatory elements. These results indicate that SVs that alter 3D genome organization are widespread in cancer genomes and begin to illustrate predictive rules for the consequences of SVs on oncogene activation.

**Fig. 1: TAD fusion events from Hi-C data in cancer samples.**

**Fig. 2: Interdomain rearrangements in patient tumour samples.**

**Fig. 3: Engineered rearrangements and *MYC* gene activation.**

**Fig. 4: Quantitative models of *MYC* expression in the context of engineered rearrangements.**

**Fig. 5: Genome-wide ABC models across cell lines.**

Data availability

All data generated as part of this study are available through the Gene Expression Omnibus (GEO) database with accession number GSE147123.

Code availability

All code used as part of this study is available through GitHub (https://github.com/dixonlab/).

References

1. Dekker, J. & Mirny, L. The 3D genome as moderator of chromosomal communication. Cell 164, 1110–1121 (2016).
2. Yu, M. & Ren, B. The three-dimensional organization of mammalian genomes. Annu. Rev. Cell Dev. Biol. 33, 265–289 (2017).
3. Spielmann, M., Lupianez, D. G. & Mundlos, S. Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467 (2018).
4. Ghavi-Helm, Y. et al. Highly rearranged chromosomes reveal uncoupling between genome topology and gene expression. Nat. Genet. 51, 1272–1282 (2019).
5. Akdemir, K. C. et al. Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer. Nat. Genet. 52, 294–305 (2020).
6. Levine, M., Cattoglio, C. & Tjian, R. Looping back to leap forward: transcription enters a new era. Cell 157, 13–25 (2014).
7. Leder, P. et al. Translocations among antibody genes in human cancer. Science 222, 765–771 (1983).
8. Taub, R. et al. Translocation of the c-myc gene into the immunoglobulin heavy chain locus in human Burkitt lymphoma and murine plasmacytoma cells. Proc. Natl Acad. Sci. USA 79, 7837–7841 (1982).
9. Weischenfeldt, J. et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74 (2017).
10. Northcott, P. A. et al. Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature 511, 428–434 (2014).
11. Peifer, M. et al. Telomerase activation by genomic rearrangements in high-risk neuroblastoma. Nature 526, 700–704 (2015).
12. Groschel, S. et al. A single oncogenic enhancer rearrangement causes concomitant EVI1 and GATA2 deregulation in leukemia. Cell 157, 369–381 (2014).
13. Hnisz, D. et al. Activation of proto-oncogenes by disruption of chromosome neighborhoods. Science 351, 1454–1458 (2016).
14. Wang, X. et al. Genome-wide detection of enhancer-hijacking events from chromatin interaction data in rearranged genomes. Nat. Methods 18, 661–668 (2021).
15. Flavahan, W. A. et al. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature 529, 110–114 (2016).
16. Flavahan, W. A. et al. Altered chromosomal topology drives oncogenic programs in SDH-deficient GISTs. Nature 575, 229–233 (2019).
17. Dixon, J. R. et al. Integrative detection and analysis of structural variation in cancer genomes. Nat. Genet. 50, 1388–1398 (2018).
18. Barutcu, A. R. et al. RUNX1 contributes to higher-order chromatin organization and gene regulation in breast cancer cells. Biochim. Biophys. Acta 1859, 1389–1397 (2016).
19. Harewood, L. et al. Hi-C as a tool for precise detection and characterisation of chromosomal rearrangements and copy number variation in human tumours. Genome Biol. 18, 125 (2017).
20. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
21. Wang, Z. et al. The properties of genome conformation and spatial gene interaction and regulation networks of normal and malignant human cell types. PLoS One 8, e58793 (2013).
22. Taberlay, P. C. et al. Three-dimensional disorganization of the cancer genome occurs coincident with long-range genetic and epigenetic alterations. Genome Res. 26, 719–731 (2016).
23. Guo, Y. et al. CRISPR inversion of CTCF sites alters genome topology and enhancer/promoter function. Cell 162, 900–910 (2015).
24. Akdemir, K. C. et al. Somatic mutation distributions in cancer genomes vary with three-dimensional chromatin structure. Nat. Genet. 52, 1178–1188 (2020).
25. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
26. Burton, J. N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 31, 1119–1125 (2013).
27. Engreitz, J. M., Agarwala, V. & Mirny, L. A. Three-dimensional genome architecture influences partner selection for chromosomal translocations in human disease. PLoS One 7, e44196 (2012).
28. Naumova, N. et al. Organization of the mitotic chromosome. Science 342, 948–953 (2013).
29. Seaman, L. et al. Nucleome analysis reveals structure-function relationships for colon cancer. Mol. Cancer Res. 15, 821–830 (2017).
30. Marcotte, R. et al. Functional genomic landscape of human breast cancer drivers, vulnerabilities, and resistance. Cell 164, 293–309 (2016).
31. Ngoc, P. C. T. et al. Identification of novel lncRNAs regulated by the TAL1 complex in T-cell acute lymphoblastic leukemia. Leukemia 32, 2138–2151 (2018).
32. Harenza, J. L. et al. Transcriptomic profiling of 39 commonly-used neuroblastoma cell lines. Sci. Data 4, 170033 (2017).
33. Korbel, J. O. & Campbell, P. J. Criteria for inference of chromothripsis in cancer genomes. Cell 152, 1226–1236 (2013).
34. Baca, S. C. et al. Punctuated evolution of prostate cancer genomes. Cell 153, 666–677 (2013).
35. Cortes-Ciriano, I. et al. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing. Nat. Genet. 52, 331–341 (2020).
36. Stephens, P. J. et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27–40 (2011).
37. Forcato, M. et al. Comparison of computational methods for Hi-C data analysis. Nat. Methods 14, 679–685 (2017).
38. Davidson, I. F. & Peters, J. M. Genome folding through loop extrusion by SMC complexes. Nat. Rev. Mol. Cell Biol. 22, 445–464 (2021).
39. Veloza, L., Ribera-Cortada, I. & Campo, E. Mantle cell lymphoma pathology update in the 2016 WHO classification. Ann. Lymphoma 3, 2616–2695 (2019).
40. Canela, A. et al. Genome organization drives chromosome fragility. Cell 170, 507–521 e518 (2017).
41. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
42. Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
43. Zhang, Y. et al. Transcriptionally active HERV-H retrotransposons demarcate topologically associating domains in human pluripotent stem cells. Nat. Genet. 51, 1380–1388 (2019).
44. Dixon, J. R. et al. Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331–336 (2015).
45. Schuijers, J. et al. Transcriptional dysregulation of MYC reveals common enhancer-docking mechanism. Cell Rep. 23, 349–360 (2018).
46. Shi, J. et al. Role of SWI/SNF in acute leukemia maintenance and enhancer-mediated Myc regulation. Genes Dev. 27, 2648–2662 (2013).
47. Fulco, C. P. et al. Systematic mapping of functional enhancer-promoter connections with CRISPR interference. Science 354, 769–773 (2016).
48. Beroukhim, R. et al. The landscape of somatic copy-number alteration across human cancers. Nature 463, 899–905 (2010).
49. Delgado, M. D. & Leon, J. Myc roles in hematopoiesis and leukemia. Genes Cancer 1, 605–616 (2010).
50. Li, Y. et al. Patterns of somatic structural variation in human cancer genomes. Nature 578, 112–121 (2020).
51. Zhang, Y. et al. Spatial organization of the mouse genome and its role in recurrent chromosomal translocations. Cell 148, 908–921 (2012).
52. Arber, D. A. et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood 127, 2391–2405 (2016).
53. Doyle, L. A. Sarcoma classification: an update based on the 2013 World Health Organization classification of tumors of soft tissue and bone. Cancer 120, 1763–1774 (2014).
54. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).
55. Zimmerman, M. W. et al. MYC drives a subset of high-risk pediatric neuroblastomas and is activated through mechanisms including enhancer hijacking and focal enhancer amplification. Cancer Discov. 8, 320–335 (2018).
56. Fulco, C. P. et al. Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).
57. Zhou, J. Sequence-based modeling of three-dimensional genome architecture from kilobase to chromosome scale. Nat. Genet. 54, 725–734 (2022).
58. Lambert, S. A. et al. The human transcription factors. Cell 172, 650–665 (2018).
59. Bradner, J. E., Hnisz, D. & Young, R. A. Transcriptional addiction in cancer. Cell 168, 629–643 (2017).
60. Khurana, E. et al. Role of non-coding sequence variants in cancer. Nat. Rev. Genet. 17, 93–108 (2016).
61. Lupianez, D. G. et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015).
62. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
63. Whyte, W. A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319 (2013).
64. Zuin, J. et al. Nonlinear control of transcription through enhancer-promoter interactions. Nature 604, 571–577 (2022).
65. Ran, F. A. et al. Genome engineering using the CRISPR–Cas9 system. Nat. Protoc. 8, 2281–2308 (2013).
66. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
67. Selvaraj, S., Dixon, J. R., Bansal, V. & Ren, B. Whole-genome haplotype reconstruction using proximity-ligation and shotgun sequencing. Nat. Biotechnol. 31, 1111–1118 (2013).
68. Edge, P., Bafna, V. & Bansal, V. HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies. Genome Res. 27, 801–812 (2017).
69. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
70. Anders, S., Pyl, P. T. & Huber, W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
71. Haas, B. J. et al. Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods. Genome Biol. 20, 213 (2019).
72. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).
73. Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003 (2012).
74. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).
75. Nakamura, Y. et al. Targeting of super-enhancers and mutant BRAF can suppress growth of BRAF-mutant colon cancer cells via repression of MAPK signaling pathway. Cancer Lett. 402, 100–109 (2017).
76. Diaferia, G. R. et al. Dissection of transcriptional and cis-regulatory control of differentiation in human pancreatic cancer. EMBO J. 35, 595–617 (2016).
77. Abraham, B. J. et al. Small genomic insertions form enhancers that misregulate oncogenes. Nat. Commun. 8, 14385 (2017).
78. Kalender Atak, Z. et al. Identification of cis-regulatory mutations generating de novo edges in personalized cancer gene regulatory networks. Genome Med. 9, 80 (2017).
79. Ryan, R. J. et al. Detection of enhancer-associated rearrangements reveals mechanisms of oncogene dysregulation in B-cell lymphoma. Cancer Discov. 5, 1058–1071 (2015).
80. Perreault, A. A., Sprunger, D. M. & Venters, B. J. Epigenetic and transcriptional profiling of triple negative breast cancer. Sci. Data 6, 190033 (2019).
81. Franco, H. L. et al. Enhancer transcription reveals subtype-specific gene expression programs controlling breast cancer pathogenesis. Genome Res. 28, 159–170 (2018).
82. Wang, Y. et al. CDK7-dependent transcriptional addiction in triple-negative breast cancer. Cell 163, 174–186 (2015).
83. Feld, C. et al. Combined cistrome and transcriptome analysis of SKI in AML cells identifies SKI as a co-repressor for RUNX1. Nucleic Acids Res. 46, 3412–3428 (2018).
84. Singh, A. A. et al. Optimized ChIP-seq method facilitates transcription factor profiling in human tumors. Life Sci. Alliance 2, e201800115 (2019).
85. Liu, N. Q. et al. The non-coding variant rs1800734 enhances DCLK3 expression through long-range interaction and promotes colorectal cancer progression. Nat. Commun. 8, 14418 (2017).
86. Wan, L. et al. ENL links histone acetylation to oncogenic gene expression in acute myeloid leukaemia. Nature 543, 265–269 (2017).
87. Saito, S. et al. Eradication of central nervous system leukemia of T-cell origin with a brain-permeable LSD1 inhibitor. Clin. Cancer Res. 25, 1601–1611 (2019).
88. Mansour, M. R. et al. Oncogene regulation. An oncogenic super-enhancer formed through somatic mutation of a noncoding intergenic element. Science 346, 1373–1377 (2014).
89. Eliades, P. et al. High MITF expression is associated with super-enhancers and suppressed by CDK7 inhibition in melanoma. J. Invest. Dermatol. 138, 1582–1590 (2018).
90. Boeva, V. et al. Heterogeneity of neuroblastoma cell identity defined by transcriptional circuitries. Nat. Genet. 49, 1408–1413 (2017).
91. Cohen, A. J. et al. Hotspots of aberrant enhancer activity punctuate the colorectal cancer epigenome. Nat. Commun. 8, 14400 (2017).
92. Valenciaga, A. et al. Transcriptional targeting of oncogene addiction in medullary thyroid cancer. JCI Insight 3, e122225 (2018).
93. Chen, P. et al. Symbiotic macrophage-glioma cell interactions reveal synthetic lethality in PTEN-null glioma. Cancer Cell 35, 868–884 e866 (2019).
94. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
95. Alexa, A. & Rahnenführer, J. topGO: Enrichment analysis for gene ontology. R package version 2.48.0 https://doi.org/10.18129/B9.bioc.topGO (2022).

Acknowledgements

We thank A. Saghatelian and A. Deshpande for contributing cell lines to this study. We thank A. Kim for sharing the mCherry-modified version of the pX458 plasmid. We thank T. Popay for helpful comments on the manuscript. This work was supported by NIH grant DP5OD023071 to J.R.D. and by the Leona M. and Harry B. Helmsley Charitable Trust (grant no. 2017-PG-MED001) to J.R.D. Work in the laboratory of G.M.W. was supported, in part, by the National Institutes of Health/National Cancer Institute (grant no. R35 CA197687) and the Breast Cancer Research Foundation (BCRF). This work was also supported by the Flow Cytometry Core Facility of the Salk Institute and the NGS Core Facility of the Salk Institute with funding from NIH-NCI CCSG (grant no. P30 014195). We thank the UC San Diego Biorepository and Tissue Technology team, who shared resources for biospecimen collection. Work carried out at the UC San Diego Moores Cancer Center Comprehensive Biorepository was supported by the National Cancer Institute (grant no. NCI P30CA23100).

Author information

These authors contributed equally: Zhichao Xu, Dong-Sung Lee

Authors and Affiliations

Gene Expression Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
Zhichao Xu, Sahaana Chandran, Victoria T. Le, Rosalind Bump, Jean Yasis, Sofia Dallarda, Samantha Marcotte, Benjamin Clock, Nicholas Haghani, Chae Yun Cho, Geoffrey M. Wahl & Jesse R. Dixon
Department of Life Sciences, University of Seoul, Seoul, South Korea
Dong-Sung Lee
Department of Genomic Medicine, UT MD Anderson Cancer Center, Houston, TX, USA
Kadir C. Akdemir & P. Andrew Futreal
Department of Neurosurgery, UT MD Anderson Cancer Center, Houston, TX, USA
Kadir C. Akdemir
Integrative Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, USA
Selene Tyndale & Graham McVicker

Contributions

Z.X., D.-S.L. and J.R.D. conceived and designed the study. Z.X., V.T.L., R.B., S.C., J.Y., S.D., S.M., B.C., N.H., C.Y.C., S.T. and J.R.D. conducted the experiments. D.-S.L., Z.X. and J.R.D. led the data analysis. K.C.A. and P.A.F. contributed to the analysis of structural variation in patient tumour samples. G.M.W. and G.M. contributed to and helped supervise the experimental design. Z.X., D.-S.L. and J.R.D. wrote the manuscript. All authors read and approved the manuscript.

Corresponding author

Correspondence to Jesse R. Dixon.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Charles Lin, Ekta Khurana and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Identification of rearrangements based on Hi-C data.

a, Pie chart showing all 4,543 rearrangements identified and the cell line or patient tumor sample from which each is derived. The order in the pie chart starts with A172 cells and proceeds counter-clockwise. b, Resolution of structural variant calls from Hi-C. Calls are first identified at low resolution and then progressively refined. The resolution reported is the highest resolution at which a given structural variant is identified. c, Chromatin interaction maps from mixed lineage leukemia cell lines with known MLL/KMT2A rearrangements. The maps show the presence of translocations on chromosome 4 in MV4;11 cells (left), chromosome 6 in ML2 cells (middle), and chromosome 9 in MOLM13 cells (right). d, Heat maps showing known disease-defining translocations from five mantle cell lymphoma cell lines (Rec-1, Mino, Maver, Jeko, Granta). e, Heat maps showing known disease-defining translocations in two chronic myelogenous leukemia cell lines (K562 and KBM7).

Extended Data Fig. 2 Features associated with TAD fusion events.

a, Pie chart showing the fraction of intra-chromosomal vs. inter-chromosomal structural variant predictions. b, The number of observed intra-chromosomal (blue) or inter-chromosomal (red) rearrangements identified in each cell line. c, -log₁₀(p-values) for the observed frequency of intra-chromosomal rearrangements for each chromosome in each cell line under the null hypothesis that rearrangements are randomly distributed across chromosomes. The dashed line shows the threshold for significance accounting for multiple testing using a Bonferroni correction (p = 2.5 × 10⁻⁵). d, Example of high-frequency local rearrangements on chromosome 9 in U343 cells. Below the matrix is an arc plot of predicted rearrangements. e, Example of high-frequency local rearrangements along chromosome 15 in SNU-C1 cells (shown in the upper right-hand half of the matrix) in comparison with data from chromosome 15 in LoVo cells (lower left-hand half), where no rearrangements are observed. Below the matrix is an arc plot of predicted rearrangements. f, Results of cross validation of the neural network. The violin plots show the distribution of the accuracy and false discovery rate (FDR) across all 82 samples. g, Bar plots showing the percentage of domains containing oncogenes (based on the COSMIC Cancer Gene Census) in domains identified as being part of fusion TADs (blue) versus those not identified in fusion TADs (gray). P-value is calculated by Fisher’s exact test. h, Bar plots showing the percentage of domains that contain enhancers for domains that contain TAD fusion events (blue) or do not (gray). The domain/enhancer analysis was performed for each domain in each cell type. P-value is calculated by Fisher’s exact test. i, Violin plots showing the distribution of the frequency of enhancers in domains that show TAD fusion events (blue) versus those that do not (gray). P-value is calculated from the two-sided Wilcoxon rank sum test. j, Bar plots showing the percentage of domains that contain super-enhancers for domains that contain TAD fusion events (blue) or do not (gray). The domain/super-enhancer analysis was performed for each domain in each cell type. P-value is calculated by Fisher’s exact test. k, Violin plots showing the number of END-seq reads per kb for TADs that contain super-enhancers (blue) versus those that do not (gray).
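
The Bonferroni threshold in panel c can be read as a per-test cut-off equal to the family-wise α divided by the number of tests. The caption reports only the resulting value (p = 2.5 × 10⁻⁵), so the α and test count in this minimal sketch are illustrative assumptions chosen to reproduce that value, not figures taken from the study; the function name is hypothetical.

```python
import math

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cut-off under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

# Assumed values (not stated in the caption): alpha = 0.05 across 2,000 tests.
threshold = bonferroni_threshold(0.05, 2000)

# Height at which such a threshold would be drawn on a -log10(p) axis.
dashed_line = -math.log10(threshold)
```

Under these assumed inputs the cut-off matches the reported 2.5 × 10⁻⁵, which corresponds to roughly 4.6 on a -log₁₀ scale.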

Extended Data Fig. 3 TAD fusion events at the MYC locus.

a, The number of called domains in each of five cell lines (hESC, HCC38, MV4;11, NCI-H1437, DLD-1) and the number of domains after merging unique boundaries (Merged). b, Quantile-quantile plot for evaluating the false discovery rate for recurrent TAD fusion identification. The observed p-values (y-axis) are estimated using a Poisson model accounting for the overall frequency of rearrangements and the size of the domain. Randomized p-values are generated from these expected values (x-axis). This randomization analysis was repeated 1,000 times to estimate the FDR at different p-value cut-offs. c, Hi-C data over the MYC locus in five cell types used for generating the merged TAD boundary set. The locations of TAD calls are shown as black bars below each heat map. This includes the TAD calls for each cell type as well as the across-cell merged calls (“Union set”). d, Estimated copy number of the MYC gene for samples with a TAD fusion event at the MYC locus versus those without. The copy number is estimated from the total number of Hi-C reads over the 100 kb bin surrounding the MYC gene divided by the median read count per 100 kb bin in each cell line. e, Circos plot showing the translocation partner region of each predicted TAD fusion event at the MYC locus. f, Examples of identified TAD fusion events at the MYC locus in two cell lines.
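
The copy number estimate in panel d is a ratio of read counts, which can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name and the read counts are hypothetical, and the caption does not say whether the ratio is rescaled afterwards (for example by ploidy), so the sketch returns the raw ratio.

```python
from statistics import median

def relative_copy_number(bin_counts, target_bin):
    """Reads in the target 100 kb bin divided by the median count per 100 kb bin."""
    med = median(bin_counts)
    if med == 0:
        raise ValueError("median bin count is zero; cannot normalise")
    return bin_counts[target_bin] / med

# Hypothetical Hi-C read counts per 100 kb bin for one cell line;
# index 3 stands in for the bin surrounding the gene of interest.
counts = [100, 110, 90, 400, 105]
ratio = relative_copy_number(counts, 3)
```

Using the median rather than the mean keeps the denominator stable when a few bins are strongly amplified.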

Extended Data Fig. 4 Inter-TAD rearrangements at the MYC locus in human patient tumor samples.

a, Bar plot showing the frequency of patient samples containing inter-TAD rearrangements at the MYC locus by tumor type. b, Fraction of PCAWG samples with SVs at the MYC locus based on copy number. Samples are stratified into low-copy (<=2), mid-copy (>2 and <=6), and high-copy (>6). c, Violin plots showing MYC expression for PCAWG samples stratified by copy number and the presence or absence of an SV at the MYC locus. P-values are calculated using the Kruskal-Wallis test. d, RNA-seq expression of the MYC gene from patient samples with matched structural variant calls for samples with no high-level copy number alterations at the MYC gene (copy <= 6). Samples are separated into those that contain an inter-TAD rearrangement at the MYC locus (blue) and those that do not (black). P-value is from the two-sided Wilcoxon rank sum test. e, RNA-seq expression of the MYC gene from patient samples with matched structural variant calls that are copy neutral at the MYC gene (copy <= 2). Samples are separated into those that contain an inter-TAD rearrangement at the MYC locus (blue) and those that do not (black). P-value is from the two-sided Wilcoxon rank sum test. f, Circos plot of all inter-TAD rearrangements at the MYC locus. The Circos plot is zoomed in on cytoband 8q24.21 to show the MYC locus at a higher resolution. The positions of TAD calls (black) and genes (green) are marked below the track.
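
The copy-number strata in panel b translate directly into a small classifier. The sketch below uses the cut-offs given in the caption (<=2, >2 and <=6, >6); the function name and the example values are hypothetical.

```python
def copy_number_class(copy_number):
    """Assign a sample to the low-, mid- or high-copy stratum from panel b."""
    if copy_number <= 2:
        return "low"
    if copy_number <= 6:
        return "mid"
    return "high"

# Hypothetical per-sample copy numbers at the locus of interest.
samples = {"sample_a": 2, "sample_b": 4.5, "sample_c": 9}
strata = {name: copy_number_class(cn) for name, cn in samples.items()}
```

Panels d and e then correspond to keeping only the samples outside the high-copy stratum (copy <= 6) or only the low-copy stratum (copy <= 2), respectively.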

Extended Data Fig. 5 Engineered rearrangements in SK-N-DZ cells.

a, Hi-C heat maps between chromosomes 7 and 8 in SK-N-SH cells (left) and SK-N-DZ cells (right). SK-N-SH cells have an endogenous t(7;8) translocation that creates a TAD fusion event at the locus, while SK-N-DZ cells have no rearrangements at the MYC locus in wild-type cells. b, Schematic for engineering rearrangement strategy. Guide RNAs targeting a locus ~300 kb downstream from the MYC gene and Guide RNAs targeting the partner region are cloned into a vector expressing Cas9. Guides are expressed either as single guides on plasmid with different fluorescent proteins or as dual guides on a plasmid with a single fluorescent protein. Cells are sorted and plated as single cells into 96 well plates. These can then be screened by PCR over the potential breakpoint to identify engineered clones. c, Sanger sequencing of PCR products from different engineered clones. The sequences that align to chromosome 7 are highlighted in green, while the sequences that align to chromosome 8 are highlighted in purple. d, Similar to Fig. 4b, validation of the engineered t(7;8) translocation by chromosome painting. e, MYC expression in cell lines containing endogenous or engineered rearrangements at the MYC locus including the non-rearranged SK-N-DZ parent cell line (purple), engineered clones classified as “Non-activating” (light blue), engineered clones classified as “MYC-activating” (dark red), Neuroblastoma cell lines with endogenous MYC rearrangements (green), and non-Neuroblastoma cell lines with MYC rearrangements (black). f, Scatter plot showing MYC expression (y-axis) and estimated MYC copy number (x-axis). g, Scatter plot showing MYC expression (y-axis) and estimated MYCN copy number (x-axis). h, Scatter plot showing MYC expression (y-axis) and MYCN expression (x-axis). 
i, FACS plots of mClover2 fluorescence in SK-N-DZ cells with a T2A-mClover2 reporter knocked into the 3′ end of the MYC gene (pink) and in a line derived from this MYC reporter with an engineered translocation between chromosomes 1 and 8 (green). j, Heat map of the chromosome 1 translocation to chromosome 8 with box showing H3K27ac ChIP-seq data over the partner region. The small inset box on the ChIP-seq track shows the enhancer targeted for deletion. k, FACS showing mClover2 fluorescence levels in the original chromosome 1 and chromosome 8 MYC reporter translocation (red) and in the same line with the targeted enhancer deletion (blue). The gate shows the region classified as “mClover2 low”. An example of the gating strategy is also shown, including gating for single cells and mCherry-positive cells (FSC – forward scatter, SSC – side scatter, A – area, W – width). l, Percentage of “mClover2 low” cells in the control (red) and deletion (blue) cells. P-value is from a two-sided Student’s t-test. m, MYC RPKM of clones with the enhancer deletion on the wild-type allele and on the MYC-translocated allele. P-value is from a two-sided t-test with equal variance.

Extended Data Fig. 6 Models for activation in engineered rearrangements.

a, Example plot showing the method for calculating the ABC score for MYC with rearranged partner sites. Interaction frequency between the MYC promoter and H3K27ac peaks in the partner region (“contact”) is multiplied by the strength of the H3K27ac signal (“activity”) at each peak across the partner region to obtain a final score for each peak. This signal is then summed across all peaks over the partner region. Of note, this example plot only shows the calculations for the six strongest H3K27ac peaks in the partner region, whereas the actual score is calculated using all H3K27ac peaks. b, Receiver Operating Characteristic (ROC) curve for the TAD-delimited ABC model. Shown above the plot is the area under the curve (AUC). c, ROC curve for an ABC model where contacts are measured from genome-wide average interaction frequencies. d, Plots showing ABC scores for genes neighboring MYC. Above the plot is the Pearson correlation coefficient for each gene between the gene’s ABC score and expression. e, Heat map of the TAD surrounding MYC as well as the location and relative position of the genes shown in panel d. f, Scatter plot showing ABC scores and summed enhancer activity within 3 Mb for every gene in 30 cancer cell lines. g, Scatter plot showing ABC scores and summed interaction within 3 Mb for every gene in 30 cancer cell lines. h, The number of enhancers per gene linked by the marginal ABC score >= 0.1 for ABC-correlated and non-correlated genes. Gray lines show the paired values for each cell line comparing ABC-correlated and non-correlated genes. P-value is from a paired Wilcoxon test. i, Percentage of ABC responsive (blue) and protein-coding genes classified as transcription factors. Protein-coding genes are from the Gencode reference annotation. P-value is from Fisher’s exact test. j, Percentage of ABC responsive (blue) and protein-coding genes classified as oncogenes according to the Cosmic cancer gene census. P-value is from Fisher’s exact test.
k, Normalized interaction frequency as a function of distance for Hi-C interactions at 10 kb resolution in SK-N-DZ cells. Interaction frequency decays exponentially as a function of distance. l, Enhancer activity based on H3K27ac ChIP-seq as quantified by the ROSE super-enhancer calling algorithm for all enhancers in SK-N-DZ cells. Enhancers are displayed ranked according to strength. Super-enhancers show exponentially stronger enhancer activity compared with typical enhancers. m, Enhancer activity required to achieve the equivalent activity-by-contact score for the median enhancer at 20 kb in SK-N-DZ cells as a function of genomic distance. Shown as a dashed line is the minimal enhancer strength categorized as a “super-enhancer” in SK-N-DZ cells by the ROSE algorithm. Due to the exponential decay in interaction frequency, after ~300 kb the only enhancers capable of producing an ABC score equivalent to the median enhancer at 20 kb are super-enhancers.
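The calculations described in panels a and m can be sketched in a few lines of Python. This is a minimal illustration following the wording of the legend, not the authors' code: the function names, the toy decay function and all numbers are hypothetical, and any normalization applied in the actual analysis is omitted. Panel a is taken as the sum over partner-region peaks of activity multiplied by contact. Panel m is taken as solving A_required × C(d) = A_median × C(20 kb) for A_required, where C is the contact frequency at a given distance.

```python
from typing import Callable, Sequence


def abc_score(activity: Sequence[float], contact: Sequence[float]) -> float:
    """Sum of activity x contact over all H3K27ac peaks in the partner region."""
    if len(activity) != len(contact):
        raise ValueError("activity and contact must have one value per peak")
    return sum(a * c for a, c in zip(activity, contact))


def required_activity(
    distance_bp: float,
    reference_activity: float,
    contact_at: Callable[[float], float],
    reference_distance_bp: float = 20_000,
) -> float:
    """Activity an enhancer at distance_bp needs to match the score of a
    reference enhancer (for example the median enhancer) at 20 kb."""
    return reference_activity * contact_at(reference_distance_bp) / contact_at(distance_bp)


# Hypothetical per-peak values, for illustration only.
peak_activity = [2.0, 1.0]
peak_contact = [0.5, 0.25]
print(abc_score(peak_activity, peak_contact))


# Toy contact-versus-distance function (an assumption, not fitted to Hi-C data).
def toy_contact(distance_bp: float) -> float:
    return 1.0 / distance_bp


print(required_activity(200_000, reference_activity=1.0, contact_at=toy_contact))
```

Passing the contact function in as an argument keeps the sketch independent of any particular decay model, so a curve measured from Hi-C data such as the one in panel k could be substituted for `toy_contact`.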

Supplementary information

Reporting Summary

Supplementary Table 1 Cell line and sequencing summary.

Supplementary Table 2 Hi-C-defined structural variants.

Supplementary Table 3 Inter-TAD MYC rearrangement survival analysis.

Supplementary Table 4 Inter-TAD MYC rearrangement partner domain frequency.

Supplementary Table 5 ABC-correlated genes.

Supplementary Table 6 gRNA and PCR primer sequences.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xu, Z., Lee, DS., Chandran, S. et al. Structural variants drive context-dependent oncogene activation in cancer. Nature 612, 564–572 (2022). https://doi.org/10.1038/s41586-022-05504-4

Received: 21 September 2020
Accepted: 01 November 2022
Published: 07 December 2022
Issue Date: 15 December 2022
DOI: https://doi.org/10.1038/s41586-022-05504-4
