Genome-wide detection of enhancer-hijacking events from chromatin interaction data in rearranged genomes

Wang, Xiaotao; Xu, Jie; Zhang, Baozhen; Hou, Ye; Song, Fan; Lyu, Huijue; Yue, Feng

doi:10.1038/s41592-021-01164-w

Article
Published: 03 June 2021

Xiaotao Wang¹,
Jie Xu¹,
Baozhen Zhang¹ (ORCID: orcid.org/0000-0002-2307-9773),
Ye Hou¹,
Fan Song¹,
Huijue Lyu¹ &
Feng Yue¹,² (ORCID: orcid.org/0000-0002-7954-5462)

Nature Methods volume 18, pages 661–668 (2021)

Abstract

Recent efforts have shown that structural variations (SVs) can disrupt three-dimensional genome organization and induce enhancer hijacking, yet no computational tools exist to identify such events from chromatin interaction data. Here, we develop NeoLoopFinder, a computational framework to identify the chromatin interactions induced by SVs, including interchromosomal translocations, large deletions and inversions. Our framework can automatically resolve complex SVs, reconstruct local Hi-C maps surrounding the breakpoints, normalize copy number variation and allele effects, and predict chromatin loops induced by SVs. We applied NeoLoopFinder to Hi-C data from 50 cancer cell lines and primary tumors and identified tens of recurrent genes associated with enhancer hijacking. To experimentally validate NeoLoopFinder, we deleted the hijacked enhancers in prostate adenocarcinoma cells using CRISPR–Cas9, which significantly reduced expression of the target oncogene. In summary, NeoLoopFinder enables identification of critical oncogenic regulatory elements that can potentially reveal therapeutic targets.

**Fig. 1: Overall design of the NeoLoopFinder framework.**

**Fig. 2: Detection of neoloops in 50 cancer cell lines or patient samples.**

**Fig. 3: Cancer-type specificity of neoloop-involved genes.**

**Fig. 4: Analysis of expression of genes in neoloops.**

**Fig. 5: Deletion of hijacked enhancers reduced oncogene expression in prostate adenocarcinoma.**

Data availability

The cancer Hi-C datasets analyzed in this study are summarized in Supplementary Table 1. Details of all the other datasets collected for the validation and downstream analysis are summarized in Supplementary Table 7. Data used for survival analysis in leukemia and gastric cancer were downloaded from the cBioPortal for Cancer Genomics (https://www.cbioportal.org)⁴⁵. The list of cancer-related genes was obtained from the Bushman Lab (http://www.bushmanlab.org/assets/doc/allOnco_May2018.tsv). The 4C-seq data for the MYC gene promoter in SK-N-MC cells and the Hi-C data generated for wild-type and enhancer-deleted LNCaP cells have been deposited in the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) under accession code GSE161493. Source data are provided with this paper.

Code availability

The NeoLoopFinder source code is publicly available on GitHub at https://github.com/XiaoTaoWang/NeoLoopFinder. The code is also available on Code Ocean (https://doi.org/10.24433/CO.1323561.v1).

References

1. Futreal, P. A. et al. A census of human cancer genes. Nat. Rev. Cancer 4, 177–183 (2004).
2. Consortium, E. P. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
3. Bernstein, B. E. et al. The NIH Roadmap Epigenomics Mapping Consortium. Nat. Biotechnol. 28, 1045–1048 (2010).
4. Weischenfeldt, J. et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat. Genet. 49, 65–74 (2017).
5. Spielmann, M., Lupianez, D. G. & Mundlos, S. Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467 (2018).
6. Groschel, S. et al. A single oncogenic enhancer rearrangement causes concomitant EVI1 and GATA2 deregulation in leukemia. Cell 157, 369–381 (2014).
7. Drier, Y. et al. An oncogenic MYB feedback loop drives alternate cell fates in adenoid cystic carcinoma. Nat. Genet. 48, 265–272 (2016).
8. Franke, M. et al. Formation of new chromatin domains determines pathogenicity of genomic duplications. Nature 538, 265–269 (2016).
9. Northcott, P. A. et al. The whole-genome landscape of medulloblastoma subtypes. Nature 547, 311–317 (2017).
10. Yang, M. et al. 13q12.2 deletions in acute lymphoblastic leukemia lead to upregulation of FLT3 through enhancer hijacking. Blood https://doi.org/10.1182/blood.2019004684 (2020).
11. Ooi, W. F. et al. Integrated paired-end enhancer profiling and whole-genome sequencing reveals recurrent CCNE1 and IGF2 enhancer hijacking in primary gastric adenocarcinoma. Gut 69, 1039–1052 (2020).
12. Martin-Garcia, D. et al. CCND2 and CCND3 hijack immunoglobulin light-chain enhancers in cyclin D1(−) mantle cell lymphoma. Blood 133, 940–951 (2019).
13. Haller, F. et al. Enhancer hijacking activates oncogenic transcription factor NR4A3 in acinic cell carcinomas of the salivary glands. Nat. Commun. 10, 368 (2019).
14. Zimmerman, M. W. et al. MYC drives a subset of high-risk pediatric neuroblastomas and is activated through mechanisms including enhancer hijacking and focal enhancer amplification. Cancer Discov. 8, 320–335 (2018).
15. Ryan, R. J. et al. Detection of enhancer-associated rearrangements reveals mechanisms of oncogene dysregulation in B-cell lymphoma. Cancer Discov. 5, 1058–1071 (2015).
16. Northcott, P. A. et al. Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature 511, 428–434 (2014).
17. He, B. et al. Diverse noncoding mutations contribute to deregulation of cis-regulatory landscape in pediatric cancers. Sci. Adv. 6, eaba3064 (2020).
18. Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
19. Wang, S. et al. HiNT: a computational method for detecting copy number variations and translocations from Hi-C data. Genome Biol. 21, 73 (2020).
20. Dixon, J. R. et al. Integrative detection and analysis of structural variation in cancer genomes. Nat. Genet. 50, 1388–1398 (2018).
21. Chakraborty, A. & Ay, F. Identification of copy number variations and translocations in cancer cells from Hi-C data. Bioinformatics 34, 338–345 (2018).
22. Rao, S. S. P. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
23. Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003 (2012).
24. Yang, T. et al. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Genome Res. 27, 1939–1949 (2017).
25. Wu, H. J. & Michor, F. A computational strategy to adjust for copy number in tumor Hi-C data. Bioinformatics 32, 3695–3701 (2016).
26. Vidal, E. et al. OneD: increasing reproducibility of Hi-C samples with abnormal karyotypes. Nucleic Acids Res. 46, e49 (2018).
27. Servant, N., Varoquaux, N., Heard, E., Barillot, E. & Vert, J. P. Effective normalization for copy number variation in Hi-C data. BMC Bioinformatics 19, 313 (2018).
28. Salameh, T. J. et al. A supervised learning framework for chromatin loop detection in genome-wide contact maps. Nat. Commun. 11, 3428 (2020).
29. Fudenberg, G. et al. Formation of chromosomal domains by loop extrusion. Cell Rep. 15, 2038–2049 (2016).
30. Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA 112, E6456–E6465 (2015).
31. Crane, E. et al. Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).
32. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
33. Derderian, C., Orunmuyi, A. T., Olapade-Olaopa, E. O. & Ogunwobi, O. O. PVT1 signaling is a mediator of cancer progression. Front. Oncol. 9, 502 (2019).
34. Quereda, V. et al. Therapeutic targeting of CDK12/CDK13 in triple-negative breast cancer. Cancer Cell 36, 545–558 e547 (2019).
35. Parolia, A. et al. Distinct structural classes of activating FOXA1 alterations in advanced prostate cancer. Nature 571, 413–418 (2019).
36. Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97 (2016).
37. Spangle, J. M. et al. PI3K/AKT signaling regulates H3K4 methylation in breast cancer. Cell Rep. 15, 2692–2704 (2016).
38. Baena, E. et al. ETV1 directs androgen metabolism and confers aggressive prostate cancer in targeted mice and patients. Genes Dev. 27, 683–698 (2013).
39. Gasi, D. et al. Overexpression of full-length ETV1 transcripts in clinical prostate cancer due to gene translocation. PLoS ONE 6, e16332 (2011).
40. Kragesteen, B. K. et al. Dynamic 3D chromatin architecture contributes to enhancer specificity and limb morphogenesis. Nat. Genet. 50, 1463–1473 (2018).
41. Despang, A. et al. Functional dissection of the Sox9-Kcnj2 locus identifies nonessential and instructive roles of TAD architecture. Nat. Genet. 51, 1263–1271 (2019).
42. Ay, F., Bailey, T. L. & Noble, W. S. Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts. Genome Res. 24, 999–1011 (2014).
43. Wang, Y. et al. The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol. 19, 151 (2018).
44. Boeva, V. et al. Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data. Bioinformatics 28, 423–425 (2012).
45. Liu, J. et al. An integrated TCGA pan-cancer clinical data resource to drive high-quality survival outcome analytics. Cell 173, 400–416 e411 (2018).
46. Wang, X. T., Cui, W. & Peng, C. HiTAD: detecting the structural and functional hierarchies of topologically associating domains from chromatin interactions. Nucleic Acids Res. 45, e163 (2017).
47. Abdennur, N. & Mirny, L. A. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics 36, 311–316 (2020).
48. Xu, W. et al. CoolBox: a interactive genomic data explorer for Jupyter Notebook. Preprint at bioRxiv https://doi.org/10.1101/614222 (2019).
49. Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
50. Kaul, A., Bhattacharyya, S. & Ay, F. Identifying statistically significant chromatin contacts from Hi-C data with FitHiC2. Nat. Protoc. 15, 991–1012 (2020).

Acknowledgements

F.Y. is supported by NIH grants R35GM124820 and R01HG009906. We thank H. Yang and the rest of the Yue laboratory for discussion.

Author information

Baozhen Zhang
Present address: Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Division of Etiology, Peking University Cancer Hospital and Institute, Beijing, China

Authors and Affiliations

Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
Xiaotao Wang, Jie Xu, Baozhen Zhang, Ye Hou, Fan Song, Huijue Lyu & Feng Yue
Robert H. Lurie Comprehensive Cancer Center of Northwestern University, Chicago, IL, USA
Feng Yue

Contributions

F.Y. conceived, designed and supervised the project. X.W. implemented the algorithm and performed the data analysis. J.X. performed the 4C-seq experiment in SK-N-MC cells. B.Z. performed the CRISPR–Cas9 deletion experiments while working at Northwestern University. Y.H. performed Hi-C experiments in wild-type and M1-deleted LNCaP cells. F.S. processed the 4C-seq data. H.L. contributed to the proofreading and editing of the manuscript. X.W. and F.Y. wrote the manuscript with input from all the authors.

Corresponding author

Correspondence to Feng Yue.

Ethics declarations

Competing interests

F.Y. and X.W. are listed as inventors of a provisional patent titled ‘Detection of chromatin interactions in re-arranged genomes’. F.Y. is a cofounder of Sariant Therapeutics, Inc.

Additional information

Peer review information: Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. L. Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Evaluation of the CNV segmentation and CNV normalization module in NeoLoopFinder.

a–g, We implemented the same generalized additive model (GAM) used by HiNT-CNV to estimate the copy number profile directly from Hi-C. For CNV segmentation, we applied a different algorithm based on Hidden Markov Model (HMM). a, We compared the copy number profiles estimated by NeoLoopFinder with the CNV profiles computed by Control-FREEC with whole genome sequencing (WGS) data. Each dot represents a 25kb bin. Bins with zero reads were excluded from the calculation. b, Similar to a, but for the comparison between NeoLoopFinder and HiNT-CNV. c, The number of CNV segments identified by NeoLoopFinder and HiNT-CNV, with or without WGS support. Only segments with a copy number ratio larger than 1.5 or smaller than 0.3 were considered in the calculation. d, The fraction of WGS-detected CNV segments that are recalled by NeoLoopFinder or HiNT-CNV. e, Comparison of F₁ scores between NeoLoopFinder and HiNT-CNV in eight cancer cell lines. f, Comparisons of CNV segments inferred from HiNT-CNV and NeoLoopFinder based on Hi-C data in K562 cells. We compared their results with the CNV segments computed by Control-FREEC with whole genome sequencing (WGS) data. Results from both HiNT-CNV and NeoLoopFinder are similar to Control-FREEC. However, for more fragmented regions (green circles), NeoLoopFinder’s performance is better. g, A similar example in SK-N-MC cells. h–i, Comparison of different Hi-C normalization methods in K562 (Resolution: 10kb). Hi-C contact heatmaps and copy number variation profiles are shown for two example regions: ‘chr22: 22,340,000 – 24,200,000’ (h) and ‘chr9: 130,000,000 – 131,280,000’ (i). The CAIC method is excluded from the analysis due to memory error (Supplementary Table 2).
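
Panels c–e can be connected through a short calculation. The sketch below assumes that precision is the share of called segments with WGS support (panel c) and that recall is the share of WGS-detected segments that are recovered (panel d). The function names and the example counts are hypothetical, and the caption does not state how segments were matched between callers.

```python
def passes_ratio_filter(copy_number_ratio, gain=1.5, loss=0.3):
    """Keep segments whose copy number ratio is above 1.5 or below 0.3."""
    return copy_number_ratio > gain or copy_number_ratio < loss


def f1_score(n_called, n_called_supported, n_truth, n_truth_recalled):
    """Harmonic mean of precision and recall computed from segment counts."""
    precision = n_called_supported / n_called if n_called else 0.0
    recall = n_truth_recalled / n_truth if n_truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values, for illustration only.
ratios = [0.2, 0.9, 1.1, 1.8]
kept = [r for r in ratios if passes_ratio_filter(r)]
score = f1_score(n_called=10, n_called_supported=8, n_truth=12, n_truth_recalled=6)
```

With these made-up counts, precision is 0.8 and recall is 0.5, so the F₁ score sits between the two and closer to the lower value.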

Extended Data Fig. 2 Evaluating the performance of CNV normalization of Hi-C data by simulation.

The inputs in our simulation are Hi-C data in GM12878 (normal lymphoblastic cells) and K562 (chronic myeloid leukemia) cells, and the CNV profiles in K562 cells. a, Our algorithm learns the trans-/cis-scaling factors separately for all possible copy number pairs from K562 Hi-C data. The CNV effects of K562 are then imposed on GM12878 Hi-C by linearly transforming the signals with the factor of corresponding copy number pairs. The resulting simulated GM12878 Hi-C matrix with K562 CNV is highly similar to the original K562 Hi-C matrix. b, We applied ICE and the newly designed CNV normalization method in this project to the simulated matrix from Supplementary Fig. 2a (GM12878 Hi-C matrix with CNVs in K562). By visual inspection, the CNV normalized Hi-C is more similar to the original GM12878. c, We used HiCRep to calculate the Stratum-adjusted Correlation Coefficients (SCCs) between ICE normalized and CNV normalized matrix to the original GM12878 Hi-C data. The distributions of SCC scores are presented in box-and-whisker plots, where the box represents the interquartile range (IQR, Q3-Q1), the horizontal thick line represents the median, the upper whisker extends to the last datum less than Q3 + 1.5×IQR, and the lower whisker extends to the first datum greater than Q1-1.5×IQR. Each dot represents an individual chromosome.
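
The simulation step in panel a, where CNV effects are imposed by linearly scaling contacts according to the copy numbers of the two interacting bins, might look roughly like the sketch below. It is a simplification under stated assumptions: a single set of factors is used instead of separate trans and cis factors, the factor values are invented, and `impose_cnv` is a hypothetical name and not part of NeoLoopFinder.

```python
import numpy as np


def impose_cnv(matrix, copy_numbers, factors):
    """Multiply each contact by the factor assigned to its copy number pair.

    `factors` maps a sorted (copy number, copy number) tuple to a scaling
    factor; pairs that are missing are left unscaled (an assumption).
    """
    out = np.array(matrix, dtype=float)  # copy, so the input stays untouched
    cn = [int(c) for c in copy_numbers]
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            pair = tuple(sorted((cn[i], cn[j])))
            out[i, j] *= factors.get(pair, 1.0)
    return out


# Hypothetical toy input: three bins, the last one with a copy number gain.
normal = np.ones((3, 3))
copy_numbers = [2, 2, 3]
factors = {(2, 2): 1.0, (2, 3): 1.5, (3, 3): 2.0}
simulated = impose_cnv(normal, copy_numbers, factors)
```

Dividing by the same factors instead of multiplying would undo the scaling, which is the intuition behind comparing the CNV-normalized matrix with the original GM12878 map in panels b and c.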

Extended Data Fig. 3 Complex SV detection based on Hi-C maps.

a, Illustration of how we reconstruct the Hi-C map surrounding breakpoints. There are four orientation types of inter-chromosomal translocations. b, Overall workflow of the complex SV assembling module in NeoLoopFinder. c, In the first step of the pipeline, we determine the full extent of the rearranged fragments (green box) for the input SV breakpoints within the checking window (gray box; by default extending 5 Mb from the breakpoints). d, The algorithm for determining rearranged fragments. First, correlation matrices are calculated by rows (top) and columns (bottom) of the contact matrix separately within the checking window; then the rearranged fragment boundaries are determined by checking the first principal component profile (PC1) of the correlation matrices. e,f, Determination of the R² cutoff for SV filtering. e, The distribution of R² across all large SVs (intra-chromosomal rearrangements larger than 1 Mb and inter-chromosomal translocations). The numbers were summarized from all 50 cancer samples. f, The fraction of SVs retained after filtering as a function of the R² cutoff. Data were merged from eight cancer cell lines: A549, Caki2, K562, LNCaP, NCI-H460, PANC-1, SK-N-MC, and T47D.
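
The idea in panel d can be illustrated with a toy example. The sketch below computes a row-wise or column-wise correlation matrix for a checking window and takes its leading eigenvector as the PC1 profile. Using the first sign change of PC1 as the fragment boundary is an assumption made here for illustration; the caption does not specify the exact rule, and the function names are hypothetical.

```python
import numpy as np


def pc1_profile(window, by="rows"):
    """Return the leading eigenvector of the row (or column) correlation matrix."""
    m = np.asarray(window, dtype=float)
    if by == "columns":
        m = m.T
    corr = np.nan_to_num(np.corrcoef(m))  # constant rows give NaN; set to 0
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    return eigenvectors[:, -1]  # eigh sorts eigenvalues in ascending order


def first_sign_change(pc1):
    """Index of the first bin whose PC1 sign differs from the previous bin."""
    changes = np.nonzero(np.diff(np.sign(pc1)) != 0)[0]
    return int(changes[0]) + 1 if changes.size else None


# Hypothetical window: bins 0-3 and bins 4-7 have opposite contact patterns.
left = [5, 5, 5, 5, 1, 1, 1, 1]
right = [1, 1, 1, 1, 5, 5, 5, 5]
window = np.array([left] * 4 + [right] * 4)
boundary = first_sign_change(pc1_profile(window, by="rows"))
```

In this toy window the PC1 profile flips sign between bin 3 and bin 4, so the putative boundary is placed at bin 4. Real Hi-C windows are noisier, and a smoothing or thresholding step would likely be needed.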

Extended Data Fig. 4 Examples of complex SVs assembled by NeoLoopFinder using Hi-C data.

All these complex SVs were also reported in our previous work, where the structure of each assembly was manually reconstructed by combining optical mapping, Hi-C and WGS. a–c, Assembly of a complex SV in K562 cells. a, The abnormal signals on the original Hi-C map indicate the complex SV structures between chromosome 13 and chromosome 9 in K562 cells. b, The correct lengths and orders of the rearranged regions B₂ (chr13: 92.3M-92.66M), B₃ (chr13: 93.2M-93.37M), C (chr13: 107.84M-108M) and D (chr9: 130.72M-131.28M) can be automatically identified and assembled by NeoLoopFinder solely based on Hi-C data. c, Linear regression of the global distance averaged contact frequencies and local distance averaged contact frequencies within the indicated contact regions, for example, ‘D-C’ represents the contacts between region D and region C in b. d–f, Assembly of another complex SV in K562 cells. g–i, Reconstruction of a complex SV in T47D cells.
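
The regression in panel c compares how contact frequency decays with distance genome-wide and within a given contact region. A minimal sketch of that comparison is shown below, assuming both maps are square matrices binned at the same resolution. The helper names and the toy matrices are hypothetical, and contacts that span a breakpoint (such as 'D-C') would need distances measured along the assembly, which this sketch does not handle.

```python
import numpy as np


def distance_average(matrix, max_distance):
    """Mean contact frequency at each distance (in bins) from 1 to max_distance."""
    m = np.asarray(matrix, dtype=float)
    return np.array([np.diagonal(m, offset=d).mean() for d in range(1, max_distance + 1)])


def regress(global_avg, local_avg):
    """Least-squares line local = slope * global + intercept, with its R^2."""
    slope, intercept = np.polyfit(global_avg, local_avg, 1)
    residuals = local_avg - (slope * global_avg + intercept)
    total = local_avg - local_avg.mean()
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum(total ** 2)
    return slope, intercept, r2


# Hypothetical toy maps: contacts decay with distance, and the local
# region carries half the signal of the genome-wide background.
idx = np.arange(20)
global_map = 1.0 / (np.abs(idx[:, None] - idx[None, :]) + 1.0)
local_map = 0.5 * global_map
slope, intercept, r2 = regress(distance_average(global_map, 10),
                               distance_average(local_map, 10))
```

A slope below 1 with a high R² is what one would expect when a region follows the usual distance decay but at reduced intensity.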

Extended Data Fig. 5 Cancer-specific allele normalization and neo-TAD identification.

a–c, An example in SK-N-MC cells. As shown in a, there is a 1.5 Mb deletion event (chr8: 127.88M, +; chr8: 129.37M, -). Because this is a heterozygous deletion, the chromatin interactions in area A3, between loci (127.28M – 127.88M) and loci (129.37M – 129.97M), occur only on the allele that carries the deletion. In contrast, chromatin interactions within area A1 (or A2) come from all alleles, so the overall intensities in area A3 are much weaker than in A1 or A2. The signals therefore need to be normalized before neo-TADs and neo-loops can be predicted. a, Hi-C matrix without cancer-specific allele normalization. The neo-TAD is undetectable, as shown by the directionality index (DI) track. b, Reconstructed Hi-C map after cancer-specific allele normalization. The neo-TAD becomes detectable, and the number of detected neo-loops also increases. c, Linear regression of the local distance averaged contact frequencies and the global distance averaged contact frequencies in different regions (A1, A2 and A3) of the Hi-C map in a. d,e, Detection of neo-TADs in 50 cancer cell lines or patient samples. d, The number of neo-TADs detected in each sample. e, Aggregate analysis of neo-TADs and distribution of breakpoint locations. Hi-C signals were distance-normalized, averaged, and centered at neo-TAD midpoints.
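
For readers unfamiliar with the DI track, the sketch below follows the directionality index definition of Dixon et al. (2012): for each bin, A is the contact sum with upstream bins, B the sum with downstream bins, E = (A + B)/2, and DI = sign(B − A) × ((A − E)²/E + (B − E)²/E). The window size and the toy matrix are hypothetical, and this is not necessarily how NeoLoopFinder computes the track internally.

```python
import numpy as np


def directionality_index(matrix, window):
    """Directionality index per bin, using `window` bins on each side."""
    m = np.asarray(matrix, dtype=float)
    n = m.shape[0]
    di = np.zeros(n)
    for i in range(n):
        upstream = m[i, max(0, i - window):i].sum()        # A
        downstream = m[i, i + 1:i + 1 + window].sum()      # B
        expected = (upstream + downstream) / 2.0           # E
        if expected == 0 or upstream == downstream:
            continue  # no directional bias; leave DI at 0
        di[i] = np.sign(downstream - upstream) * (
            (upstream - expected) ** 2 / expected
            + (downstream - expected) ** 2 / expected
        )
    return di


# Hypothetical map with two 4-bin domains.
toy = np.ones((8, 8))
toy[:4, :4] = 10
toy[4:, 4:] = 10
di = directionality_index(toy, window=3)
```

The switch from negative DI at the end of the first domain to positive DI at the start of the second is the signature of a domain boundary. When one allele contributes much weaker signal, as in area A3, this switch can be masked, which is why the normalization matters.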

Extended Data Fig. 6 Comparison of loops predicted by FitHiC2 and NeoLoopFinder in SV regions.

a–c, Neo-loops predicted by NeoLoopFinder and FitHiC2 in LNCaP cells. a, Global view of the Hi-C map of chromosomes 7 and 14 shows that there is an inter-chromosomal translocation (marked by arrow). b, High-resolution Hi-C map showing contact frequencies between ETV1 and its hijacked enhancers on chr14. Blue circles indicate neo-loops identified by NeoLoopFinder. c, Significant interactions (blue circles) identified by FitHiC2. d–g, For this analysis, we used all the loops from ten cancer cell lines with DNase-Seq data available in ENCODE data portal (MCF7, A549, LNCaP, T47D, HL-60, KBM7, RPMI 8226, SK-N-MC, SW480 and K562). d, Upset plot of chromatin loops detected by NeoLoopFinder and FitHiC2 within SV regions. e, Chromatin accessibility around anchors of NeoLoopFinder-unique loops and FitHiC2-unique loops. f-g, Aggregate Peak Analysis (APA) for NeoLoopFinder-unique loops (f) or FitHiC2-unique loops (g).

Extended Data Fig. 7 Reconstruction of complex SVs in 50 cancer samples.

a, Number of assembled complex SVs in each sample. Only samples with complex SVs detected are shown. b, Size distributions of complex SV fragments, simple inversions and simple deletions/duplications. Data were merged from eight cancer cell lines: A549, Caki2, K562, LNCaP, NCI-H460, PANC-1, SK-N-MC, and T47D. For the boxplot, the box represents the interquartile range (IQR, Q3-Q1), the white dot represents the median, the upper whisker extends to the last datum less than Q3 + 1.5×IQR, and the lower whisker extends to the first datum greater than Q1-1.5×IQR. c, Percentage of transition between different SV types in a complex SV assembly, averaged from our analysis in 50 cancer cell lines/tissues.

Extended Data Fig. 8 Examples of neo-loop-involved genes.

a, (left) Reconstructed Hi-C map for the translocation (chr19: 6.62M, -; chr11: 118.91M, -) in a T-ALL (T-cell acute lymphoblastic leukemia) patient. Blue circles indicate the predicted neo-loops. (middle) Kaplan–Meier survival analysis of TCGA leukemia patients with high (top 35%) and low (bottom 35%) expression of the UPK2 gene. The means and the 95% confidence intervals are shown for each group of patients. The p-value was calculated using a two-sided log-rank test. (right) Log2-transformed copy number ratios of the UPK2 gene in patients with high (top 35%) or low (bottom 35%) expression. The horizontal bar in each violin plot represents the median. Higher expression is not accompanied by higher copy number. The p-value was computed using a two-sided Mann–Whitney U test. b,c, Similar examples in a gastric cancer patient (T2000877). The Kaplan–Meier survival and copy-number analyses were performed using TCGA stomach adenocarcinoma patient data.
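
The grouping and survival estimate described for the middle panel can be sketched as follows. This is a generic illustration and not the authors' analysis code: it splits samples into the top and bottom 35% by expression and computes a Kaplan–Meier product-limit curve, while the confidence intervals and the log-rank test are omitted. All names and the example data are hypothetical.

```python
import numpy as np


def split_by_expression(expression, fraction=0.35):
    """Boolean masks for the top and bottom `fraction` of samples by expression."""
    expr = np.asarray(expression, dtype=float)
    high = expr >= np.quantile(expr, 1.0 - fraction)
    low = expr <= np.quantile(expr, fraction)
    return high, low


def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, survival probability)."""
    t = np.asarray(times, dtype=float)
    e = np.asarray(events, dtype=bool)
    survival, curve = 1.0, []
    for event_time in np.unique(t[e]):
        at_risk = np.sum(t >= event_time)
        deaths = np.sum((t == event_time) & e)
        survival *= 1.0 - deaths / at_risk
        curve.append((float(event_time), float(survival)))
    return curve


# Hypothetical follow-up times; event 0 marks a censored patient.
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
high, low = split_by_expression(np.arange(1, 11))
```

In practice, one curve would be computed for each expression group and the two curves compared with a log-rank test.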

Extended Data Fig. 9 Genes with hijacked enhancers are up-regulated in cancer.

Quantile-normalized gene expression signals (Transcripts Per Kilobase Million, TPM) are compared between cancer cells and corresponding normal cells. Here, genes with hijacked enhancers are defined as expressed neo-loop-involved genes (TPM > 1 in cancer cells) with at least one DNase-seq peak in the other anchor of the neo-loop. Each dot represents an individual gene. The p-values were computed using the two-sided Wilcoxon signed-rank test.
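
The selection rule stated above can be written as a simple filter. The sketch assumes a hypothetical input layout in which each neo-loop record names the involved gene and the coordinates of the opposite anchor, and DNase-seq peaks are (chromosome, start, end) tuples; none of these names or values come from the paper.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if two half-open intervals share at least one base."""
    return start_a < end_b and start_b < end_a


def genes_with_hijacked_enhancers(neoloops, tpm, dnase_peaks, min_tpm=1.0):
    """Expressed neo-loop genes whose opposite anchor holds a DNase-seq peak."""
    selected = set()
    for loop in neoloops:
        gene = loop["gene"]
        chrom, start, end = loop["other_anchor"]
        if tpm.get(gene, 0.0) <= min_tpm:
            continue  # gene is not expressed in the cancer cells
        if any(c == chrom and overlaps(start, end, s, e) for c, s, e in dnase_peaks):
            selected.add(gene)
    return sorted(selected)


# Hypothetical records, for illustration only.
neoloops = [
    {"gene": "GENE_A", "other_anchor": ("chr14", 1000, 2000)},
    {"gene": "GENE_B", "other_anchor": ("chr14", 5000, 6000)},
    {"gene": "GENE_C", "other_anchor": ("chr14", 1000, 2000)},
]
tpm = {"GENE_A": 12.0, "GENE_B": 30.0, "GENE_C": 0.4}
dnase_peaks = [("chr14", 1500, 1700)]
hijacked = genes_with_hijacked_enhancers(neoloops, tpm, dnase_peaks)
```

Here only `GENE_A` qualifies: `GENE_B` has no peak in its opposite anchor and `GENE_C` falls below the expression threshold.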

Extended Data Fig. 10 Application of the NeoLoopFinder framework to developmental diseases with genomic rearrangements.

a, CHi-C map reconstruction and neo-loop detection for an inversion event (inv1) in the mouse forelimb at embryonic day 11.5 (E11.5). Data were downloaded from Kragesteen BK et al. Nature Genetics 2018. (left) CHi-C map of the wild-type forelimb. The Pitx1 gene shows weak interactions with the Pen enhancer. Blue circles indicate the predicted chromatin loops by Peakachu. (middle) Original CHi-C map of the forelimb that contains a homozygous inversion of a 113-kb fragment containing Pen. (right) Reconstructed CHi-C map for the inversion. Note the neo-loop between Pitx1 and the Pen enhancer was correctly detected by NeoLoopFinder. b, CHi-C map reconstruction and neo-loop detection for a duplication event (Dup-C) in the mouse limb buds at E12.5. Data were downloaded from Franke M et al. Nature 2016. The duplicated region (blue and green arrows) contains both the Sox9 enhancers (marked by H3K27ac peaks) and the Kcnj2 gene. The rightmost panel shows the reconstructed chromatin interaction map near the duplication breakpoints. Yellow circles highlight the Kcnj2-involved neo-loops. There are also two more predicted neo-loops in this region (blue circles). c, CHi-C map reconstruction and neo-loop detection for an inversion event (InvC) in E12.5 limb buds. Data were downloaded from Despang A et al. Nature Genetics 2019. The inverted region (green bar in the middle panel) contains Sox9 enhancers (marked by H3K27ac peaks) and the TAD boundary separating the Sox9 enhancers and the Kcnj2 gene. The rightmost panel shows the reconstructed map for the whole region. The blue circles indicate the detected neo-loops, and yellow circles highlight the Kcnj2-involved neo-loops.

Supplementary information

Supplementary Information

Supplementary Tables 2, 6, 9 and 10.

Reporting Summary

Supplementary Table 1

Details of the 50 cancer Hi-C datasets analyzed in this study.

Supplementary Table 3

List of large SVs detected in each sample.

Supplementary Table 4

Genomic coordinates of the detected neoloops in each sample.

Supplementary Table 5

List of neoloop-involved genes identified in each sample.

Supplementary Table 7

Details of the datasets collected for the validation and downstream analysis, including WGS, ChIP–seq, DNase-seq, RNA-seq and Capture Hi-C datasets in various cell lines or tissues.

Supplementary Table 8

List of the annotated enhancer-hijacking events in 11 cancer cell lines: A549, K562, LNCaP, MCF7, T47D, HepG2, SK-MEL-5, NCI-H460, PANC-1, HT-1080 and C4-2B.

Source data

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 7

Statistical source data.

Source Data Extended Data Fig. 8

Statistical source data.

Source Data Extended Data Fig. 9

Statistical source data.

About this article

Cite this article

Wang, X., Xu, J., Zhang, B. et al. Genome-wide detection of enhancer-hijacking events from chromatin interaction data in rearranged genomes. Nat Methods 18, 661–668 (2021). https://doi.org/10.1038/s41592-021-01164-w

Received: 23 June 2020
Accepted: 22 April 2021
Published: 03 June 2021
Issue Date: June 2021
DOI: https://doi.org/10.1038/s41592-021-01164-w
