Structural variants drive context-dependent oncogene activation in cancer

Abstract

Higher-order chromatin structure is important for the regulation of genes by distal regulatory sequences1,2. Structural variants (SVs) that alter three-dimensional (3D) genome organization can lead to enhancer–promoter rewiring and human disease, particularly in the context of cancer3. However, only a small minority of SVs are associated with altered gene expression4,5, and it remains unclear why certain SVs lead to changes in distal gene expression and others do not. To address these questions, we used a combination of genomic profiling and genome engineering to identify sites of recurrent changes in 3D genome structure in cancer and determine the effects of specific rearrangements on oncogene activation. By analysing Hi-C data from 92 cancer cell lines and patient samples, we identified loci affected by recurrent alterations to 3D genome structure, including oncogenes such as MYC, TERT and CCND1. By using CRISPR–Cas9 genome engineering to generate de novo SVs, we show that oncogene activity can be predicted by using ‘activity-by-contact’ models that consider partner region chromatin contacts and enhancer activity. However, activity-by-contact models are only predictive of specific subsets of genes in the genome, suggesting that different classes of genes engage in distinct modes of regulation by distal regulatory elements. These results indicate that SVs that alter 3D genome organization are widespread in cancer genomes and begin to illustrate predictive rules for the consequences of SVs on oncogene activation.

Fig. 1: TAD fusion events from Hi-C data in cancer samples.
Fig. 2: Interdomain rearrangements in patient tumour samples.
Fig. 3: Engineered rearrangements and MYC gene activation.
Fig. 4: Quantitative models of MYC expression in the context of engineered rearrangements.
Fig. 5: Genome-wide ABC models across cell lines.

Data availability

All data generated as part of this study are available through the Gene Expression Omnibus (GEO) database under accession number GSE147123.

Code availability

All code used as part of this study is available through GitHub (https://github.com/dixonlab/).

References

  1. Dekker, J. & Mirny, L. The 3D genome as moderator of chromosomal communication. Cell 164, 1110–1121 (2016).
  2. Yu, M. & Ren, B. The three-dimensional organization of mammalian genomes. Annu. Rev. Cell Dev. Biol. 33, 265–289 (2017).
  3. Spielmann, M., Lupianez, D. G. & Mundlos, S. Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467 (2018).
  4. Ghavi-Helm, Y. et al. Highly rearranged chromosomes reveal uncoupling between genome topology and gene expression. Nat. Genet. 51, 1272–1282 (2019).
  5. Akdemir, K. C. et al. Disruption of chromatin folding domains by somatic genomic rearrangements in human cancer. Nat. Genet. 52, 294–305 (2020).
  6. Levine, M., Cattoglio, C. & Tjian, R. Looping back to leap forward: transcription enters a new era. Cell 157, 13–25 (2014).
  7. Leder, P. et al. Translocations among antibody genes in human cancer. Science 222, 765–771 (1983).
  8. Taub, R. et al. Translocation of the c-myc gene into the immunoglobulin heavy chain locus in human Burkitt lymphoma and murine plasmacytoma cells. Proc. Natl Acad. Sci. USA 79, 7837–7841 (1982).
  9. Weischenfeldt, J. et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74 (2017).
  10. Northcott, P. A. et al. Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature 511, 428–434 (2014).
  11. Peifer, M. et al. Telomerase activation by genomic rearrangements in high-risk neuroblastoma. Nature 526, 700–704 (2015).
  12. Groschel, S. et al. A single oncogenic enhancer rearrangement causes concomitant EVI1 and GATA2 deregulation in leukemia. Cell 157, 369–381 (2014).
  13. Hnisz, D. et al. Activation of proto-oncogenes by disruption of chromosome neighborhoods. Science 351, 1454–1458 (2016).
  14. Wang, X. et al. Genome-wide detection of enhancer-hijacking events from chromatin interaction data in rearranged genomes. Nat. Methods 18, 661–668 (2021).
  15. Flavahan, W. A. et al. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature 529, 110–114 (2016).
  16. Flavahan, W. A. et al. Altered chromosomal topology drives oncogenic programs in SDH-deficient GISTs. Nature 575, 229–233 (2019).
  17. Dixon, J. R. et al. Integrative detection and analysis of structural variation in cancer genomes. Nat. Genet. 50, 1388–1398 (2018).
  18. Barutcu, A. R. et al. RUNX1 contributes to higher-order chromatin organization and gene regulation in breast cancer cells. Biochim. Biophys. Acta 1859, 1389–1397 (2016).
  19. Harewood, L. et al. Hi-C as a tool for precise detection and characterisation of chromosomal rearrangements and copy number variation in human tumours. Genome Biol. 18, 125 (2017).
  20. Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
  21. Wang, Z. et al. The properties of genome conformation and spatial gene interaction and regulation networks of normal and malignant human cell types. PLoS One 8, e58793 (2013).
  22. Taberlay, P. C. et al. Three-dimensional disorganization of the cancer genome occurs coincident with long-range genetic and epigenetic alterations. Genome Res. 26, 719–731 (2016).
  23. Guo, Y. et al. CRISPR inversion of CTCF sites alters genome topology and enhancer/promoter function. Cell 162, 900–910 (2015).
  24. Akdemir, K. C. et al. Somatic mutation distributions in cancer genomes vary with three-dimensional chromatin structure. Nat. Genet. 52, 1178–1188 (2020).
  25. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
  26. Burton, J. N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 31, 1119–1125 (2013).
  27. Engreitz, J. M., Agarwala, V. & Mirny, L. A. Three-dimensional genome architecture influences partner selection for chromosomal translocations in human disease. PLoS One 7, e44196 (2012).
  28. Naumova, N. et al. Organization of the mitotic chromosome. Science 342, 948–953 (2013).
  29. Seaman, L. et al. Nucleome analysis reveals structure-function relationships for colon cancer. Mol. Cancer Res. 15, 821–830 (2017).
  30. Marcotte, R. et al. Functional genomic landscape of human breast cancer drivers, vulnerabilities, and resistance. Cell 164, 293–309 (2016).
  31. Ngoc, P. C. T. et al. Identification of novel lncRNAs regulated by the TAL1 complex in T-cell acute lymphoblastic leukemia. Leukemia 32, 2138–2151 (2018).
  32. Harenza, J. L. et al. Transcriptomic profiling of 39 commonly-used neuroblastoma cell lines. Sci. Data 4, 170033 (2017).
  33. Korbel, J. O. & Campbell, P. J. Criteria for inference of chromothripsis in cancer genomes. Cell 152, 1226–1236 (2013).
  34. Baca, S. C. et al. Punctuated evolution of prostate cancer genomes. Cell 153, 666–677 (2013).
  35. Cortes-Ciriano, I. et al. Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing. Nat. Genet. 52, 331–341 (2020).
  36. Stephens, P. J. et al. Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27–40 (2011).
  37. Forcato, M. et al. Comparison of computational methods for Hi-C data analysis. Nat. Methods 14, 679–685 (2017).
  38. Davidson, I. F. & Peters, J. M. Genome folding through loop extrusion by SMC complexes. Nat. Rev. Mol. Cell Biol. 22, 445–464 (2021).
  39. Veloza, L., Ribera-Cortada, I. & Campo, E. Mantle cell lymphoma pathology update in the 2016 WHO classification. Ann. Lymphoma 3, 2616–2695 (2019).
  40. Canela, A. et al. Genome organization drives chromosome fragility. Cell 170, 507–521 e518 (2017).
  41. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
  42. Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
  43. Zhang, Y. et al. Transcriptionally active HERV-H retrotransposons demarcate topologically associating domains in human pluripotent stem cells. Nat. Genet. 51, 1380–1388 (2019).
  44. Dixon, J. R. et al. Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331–336 (2015).
  45. Schuijers, J. et al. Transcriptional dysregulation of MYC reveals common enhancer-docking mechanism. Cell Rep. 23, 349–360 (2018).
  46. Shi, J. et al. Role of SWI/SNF in acute leukemia maintenance and enhancer-mediated Myc regulation. Genes Dev. 27, 2648–2662 (2013).
  47. Fulco, C. P. et al. Systematic mapping of functional enhancer-promoter connections with CRISPR interference. Science 354, 769–773 (2016).
  48. Beroukhim, R. et al. The landscape of somatic copy-number alteration across human cancers. Nature 463, 899–905 (2010).
  49. Delgado, M. D. & Leon, J. Myc roles in hematopoiesis and leukemia. Genes Cancer 1, 605–616 (2010).
  50. Li, Y. et al. Patterns of somatic structural variation in human cancer genomes. Nature 578, 112–121 (2020).
  51. Zhang, Y. et al. Spatial organization of the mouse genome and its role in recurrent chromosomal translocations. Cell 148, 908–921 (2012).
  52. Arber, D. A. et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood 127, 2391–2405 (2016).
  53. Doyle, L. A. Sarcoma classification: an update based on the 2013 World Health Organization classification of tumors of soft tissue and bone. Cancer 120, 1763–1774 (2014).
  54. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).
  55. Zimmerman, M. W. et al. MYC drives a subset of high-risk pediatric neuroblastomas and is activated through mechanisms including enhancer hijacking and focal enhancer amplification. Cancer Discov. 8, 320–335 (2018).
  56. Fulco, C. P. et al. Activity-by-contact model of enhancer-promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).
  57. Zhou, J. Sequence-based modeling of three-dimensional genome architecture from kilobase to chromosome scale. Nat. Genet. 54, 725–734 (2022).
  58. Lambert, S. A. et al. The human transcription factors. Cell 172, 650–665 (2018).
  59. Bradner, J. E., Hnisz, D. & Young, R. A. Transcriptional addiction in cancer. Cell 168, 629–643 (2017).
  60. Khurana, E. et al. Role of non-coding sequence variants in cancer. Nat. Rev. Genet. 17, 93–108 (2016).
  61. Lupianez, D. G. et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015).
  62. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
  63. Whyte, W. A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319 (2013).
  64. Zuin, J. et al. Nonlinear control of transcription through enhancer-promoter interactions. Nature 604, 571–577 (2022).
  65. Ran, F. A. et al. Genome engineering using the CRISPR–Cas9 system. Nat. Protoc. 8, 2281–2308 (2013).
  66. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
  67. Selvaraj, S., Dixon, J. R., Bansal, V. & Ren, B. Whole-genome haplotype reconstruction using proximity-ligation and shotgun sequencing. Nat. Biotechnol. 31, 1111–1118 (2013).
  68. Edge, P., Bafna, V. & Bansal, V. HapCUT2: robust and accurate haplotype assembly for diverse sequencing technologies. Genome Res. 27, 801–812 (2017).
  69. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  70. Anders, S., Pyl, P. T. & Huber, W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
  71. Haas, B. J. et al. Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods. Genome Biol. 20, 213 (2019).
  72. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).
  73. Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003 (2012).
  74. ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature 578, 82–93 (2020).
  75. Nakamura, Y. et al. Targeting of super-enhancers and mutant BRAF can suppress growth of BRAF-mutant colon cancer cells via repression of MAPK signaling pathway. Cancer Lett. 402, 100–109 (2017).
  76. Diaferia, G. R. et al. Dissection of transcriptional and cis-regulatory control of differentiation in human pancreatic cancer. EMBO J. 35, 595–617 (2016).
  77. Abraham, B. J. et al. Small genomic insertions form enhancers that misregulate oncogenes. Nat. Commun. 8, 14385 (2017).
  78. Kalender Atak, Z. et al. Identification of cis-regulatory mutations generating de novo edges in personalized cancer gene regulatory networks. Genome Med. 9, 80 (2017).
  79. Ryan, R. J. et al. Detection of enhancer-associated rearrangements reveals mechanisms of oncogene dysregulation in B-cell lymphoma. Cancer Discov. 5, 1058–1071 (2015).
  80. Perreault, A. A., Sprunger, D. M. & Venters, B. J. Epigenetic and transcriptional profiling of triple negative breast cancer. Sci. Data 6, 190033 (2019).
  81. Franco, H. L. et al. Enhancer transcription reveals subtype-specific gene expression programs controlling breast cancer pathogenesis. Genome Res. 28, 159–170 (2018).
  82. Wang, Y. et al. CDK7-dependent transcriptional addiction in triple-negative breast cancer. Cell 163, 174–186 (2015).
  83. Feld, C. et al. Combined cistrome and transcriptome analysis of SKI in AML cells identifies SKI as a co-repressor for RUNX1. Nucleic Acids Res. 46, 3412–3428 (2018).
  84. Singh, A. A. et al. Optimized ChIP-seq method facilitates transcription factor profiling in human tumors. Life Sci. Alliance 2, e201800115 (2019).
  85. Liu, N. Q. et al. The non-coding variant rs1800734 enhances DCLK3 expression through long-range interaction and promotes colorectal cancer progression. Nat. Commun. 8, 14418 (2017).
  86. Wan, L. et al. ENL links histone acetylation to oncogenic gene expression in acute myeloid leukaemia. Nature 543, 265–269 (2017).
  87. Saito, S. et al. Eradication of central nervous system leukemia of T-cell origin with a brain-permeable LSD1 inhibitor. Clin. Cancer Res. 25, 1601–1611 (2019).
  88. Mansour, M. R. et al. Oncogene regulation. An oncogenic super-enhancer formed through somatic mutation of a noncoding intergenic element. Science 346, 1373–1377 (2014).
  89. Eliades, P. et al. High MITF expression is associated with super-enhancers and suppressed by CDK7 inhibition in melanoma. J. Invest. Dermatol. 138, 1582–1590 (2018).
  90. Boeva, V. et al. Heterogeneity of neuroblastoma cell identity defined by transcriptional circuitries. Nat. Genet. 49, 1408–1413 (2017).
  91. Cohen, A. J. et al. Hotspots of aberrant enhancer activity punctuate the colorectal cancer epigenome. Nat. Commun. 8, 14400 (2017).
  92. Valenciaga, A. et al. Transcriptional targeting of oncogene addiction in medullary thyroid cancer. JCI Insight 3, e122225 (2018).
  93. Chen, P. et al. Symbiotic macrophage-glioma cell interactions reveal synthetic lethality in PTEN-null glioma. Cancer Cell 35, 868–884 e866 (2019).
  94. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
  95. Alexa, A. & Rahnenfuhrer, J. topGO: Enrichment analysis for gene ontology. R package version 2.48.0 https://doi.org/10.18129/B9.bioc.topGO (2022).

Acknowledgements

We thank A. Saghatelian and A. Deshpande for contributing cell lines to this study, A. Kim for sharing the mCherry-modified version of the pX458 plasmid, and T. Popay for helpful comments on the manuscript. This work was supported by NIH grant DP5OD023071 and by Leona M. and Harry B. Helmsley Charitable Trust grant no. 2017-PG-MED001 to J.R.D. Work in the laboratory of G.M.W. was supported, in part, by the National Institutes of Health/National Cancer Institute (grant no. R35 CA197687) and the Breast Cancer Research Foundation (BCRF). This work was also supported by the Flow Cytometry Core Facility and the NGS Core Facility of the Salk Institute, with funding from NIH-NCI CCSG (grant no. P30 014195). We thank UC San Diego Biorepository and Tissue Technology, which shared resources for biospecimen collection. The work carried out at the UC San Diego Moores Cancer Center Comprehensive Biorepository was supported by the National Cancer Institute (grant no. NCI P30CA23100).

Author information

Contributions

Z.X., D.-S.L. and J.R.D. conceived and designed the study. Z.X., V.T.L., R.B., S.C., J.Y., S.D., S.M., B.C., N.H., C.Y.C., S.T. and J.R.D. conducted the experiments. D.-S.L., Z.X. and J.R.D. led the data analysis. K.C.A. and P.A.F. contributed to the analysis of structural variation in patient tumour samples. G.M.W. and G.M. contributed to and helped supervise the experimental design. Z.X., D.-S.L. and J.R.D. wrote the manuscript. All authors read and approved the manuscript.

Corresponding author

Correspondence to Jesse R. Dixon.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Charles Lin, Ekta Khurana and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Identification of rearrangements based on Hi-C data.

a, Pie chart showing all 4,543 rearrangements identified and the cell line or patient tumour sample from which each is derived. The order in the pie chart starts with A172 cells and proceeds counter-clockwise. b, Resolution of structural variant calls from Hi-C. Calls are first identified at low resolution and then progressively refined. The resolution reported is the highest resolution at which a given structural variant is identified. c, Chromatin interaction maps from mixed lineage leukemia cell lines with known MLL/KMT2A rearrangements. The maps show the presence of translocations on chromosome 4 in MV4;11 cells (left), chromosome 6 in ML2 cells (middle), and chromosome 9 in MOLM13 cells (right). d, Heat maps showing known disease-defining translocations from five mantle cell lymphoma cell lines (Rec-1, Mino, Maver, Jeko, Granta). e, Heat maps showing known disease-defining translocations in two chronic myeloid leukemia cell lines (K562 and KBM7).

Extended Data Fig. 2 Features associated with TAD fusion events.

a, Pie chart showing the fraction of intra-chromosomal versus inter-chromosomal structural variant predictions. b, The number of observed intra-chromosomal (blue) or inter-chromosomal (red) rearrangements identified in each cell line. c, -log10 (p-values) for the observed frequency of intra-chromosomal rearrangements for each chromosome in each cell line under the null hypothesis that rearrangements are randomly distributed across chromosomes. The dashed line shows the threshold for significance accounting for multiple testing using a Bonferroni correction (p = 2.5 × 10−5). d, Example of high-frequency local rearrangements on chromosome 9 in U343 cells. Below the matrix is an arc plot of predicted rearrangements. e, Example of high-frequency local rearrangements along chromosome 15 in SNU-C1 cells (shown in the upper right-hand half of the matrix) in comparison with data from chromosome 15 in LoVo cells (lower left-hand half), where no rearrangements are observed. Below the matrix is an arc plot of predicted rearrangements. f, Results of cross validation of the neural network. The violin plots show the distribution of the accuracy and false discovery rate (FDR) across all 82 samples. g, Bar plots showing the percentage of domains containing oncogenes (based on the Cosmic Cancer Gene census) in domains identified as being part of fusion TADs (blue) versus those not identified in fusion TADs (grey). P-value is calculated by Fisher’s exact test. h, Bar plots showing the percentage of domains that contain enhancers for domains that contain TAD fusion events (blue) or do not (grey). The domain/enhancer analysis was performed for each domain in each cell type. P-value is calculated by Fisher’s exact test. i, Violin plots showing the distribution of the frequency of enhancers in domains that show TAD fusion events (blue) versus those that do not (grey). P-value is calculated from the two-sided Wilcoxon Rank Sum test. j, Bar plots showing the percentage of domains that contain super-enhancers for domains that contain TAD fusion events (blue) or do not (grey). The domain/super-enhancer analysis was performed for each domain in each cell type. P-value is calculated by Fisher’s exact test. k, Violin plots showing the number of END-seq reads per kb for TADs that contain super-enhancers (blue) versus those that do not (grey).
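The Bonferroni threshold in panel c can be illustrated with a short calculation. This is a minimal sketch, not the authors' code: the legend does not state the family-wise alpha or the number of tests, so the values below (alpha = 0.05, one test per autosome per sample, 92 samples) are assumptions chosen because they give a threshold close to the reported 2.5 × 10−5.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Assumed values, not taken from the legend: a family-wise alpha of 0.05
# and one test per chromosome per sample (92 samples x 22 autosomes).
ASSUMED_ALPHA = 0.05
ASSUMED_N_TESTS = 92 * 22

threshold = bonferroni_threshold(ASSUMED_ALPHA, ASSUMED_N_TESTS)
print(f"Per-test threshold: {threshold:.2e}")
```

Other combinations of alpha and test count could produce a similar threshold, so the agreement does not confirm how the tests were counted.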

Extended Data Fig. 3 TAD fusion events at the MYC locus.

a, The number of called domains in each of five cell lines (hESC, HCC38, MV4;11, NCI-H1437, DLD-1) and the number of domains after merging unique boundaries (Merged). b, Quantile-quantile plot for evaluating the false discovery rate for recurrent TAD fusion identification. The observed p-values (y-axis) are estimated using a Poisson model accounting for the overall frequency of rearrangements and the size of the domain. Randomized p-values are generated from these expected values (x-axis). This randomization analysis was repeated 1,000 times to estimate the FDR at different p-value cut-offs. c, Hi-C data over the MYC locus in the five cell types used for generating the merged TAD boundary set. The locations of TAD calls are shown as black bars below each heat map. This includes the TAD calls for each cell type as well as the across-cell merged calls (“Union set”). d, Estimated copy number of the MYC gene for samples with a TAD fusion event at the MYC locus versus those without one. The copy number is estimated from the total number of Hi-C reads over the 100 kb bin surrounding the MYC gene divided by the median read count per 100 kb bin in each cell line. e, Circos plot showing the translocation partner region of each predicted TAD fusion event at the MYC locus. f, Examples of identified TAD fusion events at the MYC locus in two cell lines.
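The read-count ratio in panel d and the Poisson test in panel b can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names and toy counts are hypothetical, and the legend gives no formula for the Poisson expectation, so scaling the total event count by domain size over genome size is an assumption.

```python
import math
from statistics import median


def relative_copy_number(bin_counts, gene_bin_index):
    """Reads in the gene's 100 kb bin divided by the median per-bin read count."""
    med = median(bin_counts)
    if med <= 0:
        raise ValueError("median bin count must be positive")
    return bin_counts[gene_bin_index] / med


def expected_events(total_events, domain_size_bp, genome_size_bp):
    """Assumed null expectation: events spread in proportion to domain size."""
    return total_events * domain_size_bp / genome_size_bp


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    term = math.exp(-expected)  # P(X = 0)
    below = 0.0
    for i in range(observed):
        below += term  # accumulate P(X = i) for i < observed
        term *= expected / (i + 1)  # advance to P(X = i + 1)
    return max(0.0, 1.0 - below)


# Toy per-bin Hi-C read counts (hypothetical); index 3 stands in for the MYC bin.
toy_counts = [100, 120, 90, 400, 110]
print(relative_copy_number(toy_counts, 3))

# Toy recurrence test: 6 fusion events observed in a 2 Mb domain, given
# 500 events genome-wide and a 3 Gb genome (all values hypothetical).
lam = expected_events(500, 2_000_000, 3_000_000_000)
print(poisson_upper_tail(6, lam))
```

The ratio is relative to the sample's median bin, so converting it to an absolute copy number would require knowing the sample's baseline ploidy.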

Extended Data Fig. 4 Inter TAD rearrangements at the MYC locus in human patient tumor samples.

a, Bar plot showing the frequency of patient samples containing inter-TAD rearrangements at the MYC locus by tumour type. b, Fraction of PCAWG samples with SVs at the MYC locus based on copy number. Samples are stratified into low-copy (<=2), mid-copy (>2 and <=6), and high-copy (>6). c, Violin plots showing MYC expression for PCAWG samples stratified by copy number and the presence or absence of an SV at the MYC locus. P-values are calculated using the Kruskal-Wallis test. d, RNA-seq expression of the MYC gene from patient samples with matched structural variant calls for samples with no high-level copy number alterations at the MYC gene (copy <= 6). Samples are separated into those that contain an inter-TAD rearrangement at the MYC locus (blue) and those that do not (black). P-value is from the two-sided Wilcoxon Rank Sum test. e, RNA-seq expression of the MYC gene from patient samples with matched structural variant calls that are copy neutral at the MYC gene (copy <= 2). Samples are separated into those that contain an inter-TAD rearrangement at the MYC locus (blue) and those that do not (black). P-value is from the two-sided Wilcoxon Rank Sum test. f, Circos plot of all inter-TAD rearrangements at the MYC locus. The Circos plot is zoomed in on cytoband 8q24.21 to show the MYC locus at a higher resolution. The positions of TAD calls (black) and genes (green) are marked below the track.
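The copy-number strata in panel b translate directly into a small classifier. This is a minimal sketch using the cut-offs given in the legend; the function name and labels are hypothetical.

```python
def copy_number_class(copy_number: float) -> str:
    """Assign a sample to the low, mid or high copy-number stratum."""
    if copy_number <= 2:
        return "low"   # low-copy: <= 2
    if copy_number <= 6:
        return "mid"   # mid-copy: > 2 and <= 6
    return "high"      # high-copy: > 6


for cn in (2, 2.5, 6, 9):
    print(cn, copy_number_class(cn))
```

With these boundaries, the subsets used in panels d (copy <= 6) and e (copy <= 2) correspond to the low and mid strata combined and to the low stratum alone, respectively.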

Extended Data Fig. 5 Engineered rearrangements in SK-N-DZ cells.

a, Hi-C heat maps between chromosomes 7 and 8 in SK-N-SH cells (left) and SK-N-DZ cells (right). SK-N-SH cells have an endogenous t(7;8) translocation that creates a TAD fusion event at the locus, while SK-N-DZ cells have no rearrangements at the MYC locus in wild-type cells. b, Schematic of the strategy for engineering rearrangements. Guide RNAs targeting a locus ~300 kb downstream from the MYC gene and guide RNAs targeting the partner region are cloned into a vector expressing Cas9. Guides are expressed either as single guides on plasmids with different fluorescent proteins or as dual guides on a plasmid with a single fluorescent protein. Cells are sorted and plated as single cells into 96-well plates. These can then be screened by PCR over the potential breakpoint to identify engineered clones. c, Sanger sequencing of PCR products from different engineered clones. The sequences that align to chromosome 7 are highlighted in green, while the sequences that align to chromosome 8 are highlighted in purple. d, Similar to Fig. 4b, validation of the engineered t(7;8) translocation by chromosome painting. e, MYC expression in cell lines containing endogenous or engineered rearrangements at the MYC locus, including the non-rearranged SK-N-DZ parent cell line (purple), engineered clones classified as “Non-activating” (light blue), engineered clones classified as “MYC-activating” (dark red), neuroblastoma cell lines with endogenous MYC rearrangements (green), and non-neuroblastoma cell lines with MYC rearrangements (black). f, Scatter plot showing MYC expression (y-axis) and estimated MYC copy number (x-axis). g, Scatter plot showing MYC expression (y-axis) and estimated MYCN copy number (x-axis). h, Scatter plot showing MYC expression (y-axis) and MYCN expression (x-axis). i, FACS plots of mClover2 fluorescence in SK-N-DZ cells with a T2A-mClover2 reporter knocked into the 3′ end of the MYC gene (pink) and in a line derived from this MYC reporter with an engineered translocation between chromosomes 1 and 8 (green). j, Heat map of the chromosome 1 translocation to chromosome 8, with a box showing H3K27ac ChIP-seq data over the partner region. The small inset box on the ChIP-seq track shows the enhancer targeted for deletion. k, FACS showing mClover2 fluorescence levels in the original chromosome 1 and chromosome 8 MYC reporter translocation (red) and in the same line with the targeted enhancer deletion (blue). The gate shows the region classified as “mClover2 low”. An example of the gating strategy is also shown, including gating for single cells and mCherry-positive cells (FSC – forward scatter, SSC – side scatter, A – area, W – width). l, Percentage of “mClover2 low” cells in the control (red) and deletion (blue) cells. P-value is from a two-sided Student’s t-test. m, MYC RPKM of clones with the enhancer deletion on the wild-type allele and on the MYC-translocated allele. P-value is from a two-sided t-test with equal variance.

Extended Data Fig. 6 Models for activation in engineered rearrangements.

a, Example plot showing the method for calculating the ABC score for MYC with rearranged partner sites. Interaction frequency between the MYC promoter and H3K27ac peaks in the partner region (“contact”) is multiplied by the strength of the H3K27ac signal (“activity”) at each peak across the partner region to obtain a final score for each peak. This signal is then summed across all peaks over the partner region. Of note, this example plot only shows the calculations for the six strongest H3K27ac peaks in the partner region, whereas the actual score is calculated using all H3K27ac peaks. b, Receiver operating characteristic (ROC) curve for the TAD-delimited ABC model. Shown above the plot is the area under the curve (AUC). c, ROC curve for an ABC model where contacts are measured from genome-wide average interaction frequencies. d, Plots showing ABC scores for genes neighbouring MYC. Above each plot is the Pearson correlation coefficient between the gene’s ABC score and its expression. e, Heat map of the TAD surrounding MYC as well as the location and relative position of the genes shown in panel d. f, Scatter plot showing ABC scores and summed enhancer activity within 3 Mb for every gene in 30 cancer cell lines. g, Scatter plot showing ABC scores and summed interaction within 3 Mb for every gene in 30 cancer cell lines. h, The number of enhancers per gene linked by a marginal ABC score >= 0.1 for ABC-correlated and non-correlated genes. Grey lines show the paired values for each cell line comparing ABC-correlated and non-correlated genes. P-value is from a paired Wilcoxon test. i, Percentage of ABC-responsive genes (blue) and protein-coding genes classified as transcription factors. Protein-coding genes are from the Gencode reference annotation. P-value is from Fisher’s exact test. j, Percentage of ABC-responsive genes (blue) and protein-coding genes classified as oncogenes according to the Cosmic cancer gene census. P-value is from Fisher’s exact test. k, Normalized interaction frequency as a function of distance for Hi-C interactions at 10 kb resolution in SK-N-DZ cells. Interaction frequency decays exponentially as a function of distance. l, Enhancer activity based on H3K27ac ChIP-seq as quantified by the ROSE super-enhancer calling algorithm for all enhancers in SK-N-DZ cells. Enhancers are displayed ranked according to strength. Super-enhancers show exponentially stronger enhancer activity compared with typical enhancers. m, Enhancer activity required to achieve the equivalent activity-by-contact score for the median enhancer at 20 kb in SK-N-DZ cells as a function of genomic distance. Shown as a dashed line is the minimal enhancer strength categorized as a “super-enhancer” in SK-N-DZ cells by the ROSE algorithm. Owing to the exponential decay in interaction frequency, beyond ~300 kb the only enhancers capable of producing an ABC score equivalent to that of the median enhancer at 20 kb are super-enhancers.
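The score described in panel a and the equivalence calculation in panel m can be sketched in a few lines. This is a minimal illustration of the arithmetic as the legend describes it (activity multiplied by contact, summed over peaks), not the authors' code: the class and function names and all numeric values are hypothetical, and no normalization across competing elements is applied because the legend does not mention one.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    activity: float  # H3K27ac signal strength at the peak ("activity")
    contact: float   # interaction frequency with the promoter ("contact")


def abc_score(peaks):
    """Sum of activity x contact over all H3K27ac peaks in the partner region."""
    return sum(p.activity * p.contact for p in peaks)


def required_activity(ref_activity, ref_contact, contact_at_distance):
    """Activity needed at a given contact level to match a reference enhancer's score."""
    if contact_at_distance <= 0:
        raise ValueError("contact_at_distance must be positive")
    return ref_activity * ref_contact / contact_at_distance


# Toy partner region with three peaks (hypothetical values).
partner_peaks = [Peak(10.0, 0.5), Peak(4.0, 0.25), Peak(2.0, 0.1)]
print(abc_score(partner_peaks))

# Toy version of panel m: a reference enhancer with activity 1.0 and contact 0.8
# would need to be about eight times stronger if its contact fell to 0.1.
print(required_activity(1.0, 0.8, 0.1))
```

Because the required activity scales inversely with contact, a steep decay of contact with distance translates into a steep rise in the enhancer strength needed to hold the score constant, which is the relationship panel m plots.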

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xu, Z., Lee, DS., Chandran, S. et al. Structural variants drive context-dependent oncogene activation in cancer. Nature 612, 564–572 (2022). https://doi.org/10.1038/s41586-022-05504-4

  • DOI: https://doi.org/10.1038/s41586-022-05504-4
