The CCCTC-binding factor (CTCF) works together with the cohesin complex to drive the formation of chromatin loops and topologically associating domains, but its role in gene regulation has not been fully defined. Here, we investigated the effects of acute CTCF loss on chromatin architecture and transcriptional programs in mouse embryonic stem cells undergoing differentiation to neural precursor cells. We identified CTCF-dependent enhancer–promoter contacts genome-wide and found that they disproportionately affect genes that are bound by CTCF at the promoter and are dependent on long-distance enhancers. Disruption of promoter-proximal CTCF binding reduced both long-range enhancer–promoter contacts and transcription, which were restored by artificial tethering of CTCF to the promoter. Promoter-proximal CTCF binding is correlated with the transcription of over 2,000 genes across a diverse set of adult tissues. Taken together, the results of our study show that CTCF binding to promoters may promote long-distance enhancer-dependent transcription at specific genes in diverse cell types.
All datasets generated in this study have been deposited to the Gene Expression Omnibus (GEO) with accession number GSE94452. The Hi-C dataset analyzed in Extended Data Fig. 3g–i was provided by B. Bruneau (GSE98671). Accession codes for the mouse tissue datasets used in Fig. 5 and Extended Data Fig. 10 are listed in Supplementary Table 6; datasets are available from the ENCODE portal (https://www.encodeproject.org/). Source data are provided with this paper.
PLAC-seq and the other analyses in this study were performed by combining public software as described in the Methods.
Heintzman, N. D. et al. Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112 (2009).
Long, H. K., Prescott, S. L. & Wysocka, J. Ever-changing landscapes: transcriptional enhancers in development and evolution. Cell 167, 1170–1187 (2016).
Shen, Y. et al. A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116–120 (2012).
Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Yu, M. & Ren, B. The three-dimensional organization of mammalian genomes. Annu. Rev. Cell Dev. Biol. 33, 265–289 (2017).
Benabdallah, N. S. et al. Decreased enhancer-promoter proximity accompanying enhancer activation. Mol. Cell 76, 473–484.e7 (2019).
Alexander, J. M. et al. Live-cell imaging reveals enhancer-dependent Sox2 transcription in the absence of enhancer proximity. Elife 8, e41769 (2019).
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
Phillips-Cremins, J. E. et al. Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell 153, 1281–1295 (2013).
Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
Nora, E. P. et al. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell 169, 930–944 (2017).
Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA 112, E6456–E6465 (2015).
Fudenberg, G. et al. Formation of chromosomal domains by loop extrusion. Cell Rep. 15, 2038–2049 (2016).
Monahan, K. et al. Role of CCCTC binding factor (CTCF) and cohesin in the generation of single-cell diversity of protocadherin-α gene expression. Proc. Natl Acad. Sci. USA 109, 9125–9130 (2012).
Zhang, X. et al. Fundamental roles of chromatin loop extrusion in antibody class switching. Nature 575, 385–389 (2019).
Lee, J., Krivega, I., Dale, R. K. & Dean, A. The LDB1 complex co-opts CTCF for erythroid lineage-specific long-range enhancer interactions. Cell Rep. 19, 2490–2502 (2017).
Rao, S. S. P. et al. Cohesin loss eliminates all loop domains. Cell 171, 305–320.e24 (2017).
Arzate-Mejía, R. G., Recillas-Targa, F. & Corces, V. G. Developing in 3D: the role of CTCF in cell differentiation. Development 145, dev137729 (2018).
Stik, G. et al. CTCF is dispensable for immune cell transdifferentiation but facilitates an acute inflammatory response. Nat. Genet. 52, 655–661 (2020).
Nishimura, K., Fukagawa, T., Takisawa, H., Kakimoto, T. & Kanemaki, M. An auxin-based degron system for the rapid depletion of proteins in nonplant cells. Nat. Methods 6, 917–922 (2009).
Holland, A. J., Fachinetti, D., Han, J. S. & Cleveland, D. W. Inducible, reversible system for the rapid and complete degradation of proteins in mammalian cells. Proc. Natl Acad. Sci. USA 109, E3350–E3357 (2012).
Krietenstein, N. et al. Ultrastructural details of mammalian chromosome architecture. Mol. Cell 78, 554–565 (2020).
Fang, R. et al. Mapping of long-range chromatin interactions by proximity ligation-assisted ChIP–seq. Cell Res. 26, 1345–1348 (2016).
Mumbach, M. R. et al. HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat. Methods 13, 919–922 (2016).
Dowen, J. M. et al. Control of cell identity genes occurs in insulated neighborhoods in mammalian chromosomes. Cell 159, 374–387 (2014).
Krijger, P. H. et al. Cell-of-origin-specific 3D genome structure acquired during somatic cell reprogramming. Cell Stem Cell 18, 597–610 (2016).
Bonev, B. et al. Multiscale 3D genome rewiring during mouse neural development. Cell 171, 557–572.e24 (2017).
Li, Y. et al. CRISPR reveals a distal super-enhancer required for Sox2 expression in mouse embryonic stem cells. PLoS ONE 9, e114485 (2014).
Li, G. et al. Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 84–98 (2012).
Diao, Y. et al. A tiling-deletion-based genetic screen for cis-regulatory element identification in mammalian cells. Nat. Methods 14, 629–635 (2017).
Thiecke, M. J. et al. Cohesin-dependent and -independent mechanisms mediate chromosomal contacts between promoters and enhancers. Cell Rep. 32, 107929 (2020).
Landolt, R. M., Vaughan, L., Winterhalter, K. H. & Zimmermann, D. R. Versican is selectively expressed in embryonic tissues that act as barriers to neural crest cell migration and axon outgrowth. Development 121, 2303–2312 (1995).
Wu, Y. et al. Versican V1 isoform induces neuronal differentiation and promotes neurite outgrowth. Mol. Biol. Cell 15, 2093–2104 (2004).
Chiariello, A. M., Annunziatella, C., Bianco, S., Esposito, A. & Nicodemi, M. Polymer physics of chromosome large-scale 3D organisation. Sci. Rep. 6, 29775 (2016).
Bianco, S. et al. Polymer physics predicts the effects of structural variants on chromatin architecture. Nat. Genet. 50, 662–667 (2018).
Herz, H. M. et al. Enhancer-associated H3K4 monomethylation by Trithorax-related, the Drosophila homolog of mammalian Mll3/Mll4. Genes Dev. 26, 2604–2620 (2012).
Hu, D. et al. The MLL3/MLL4 branches of the COMPASS family function as major histone H3K4 monomethylases at enhancers. Mol. Cell. Biol. 33, 4745–4754 (2013).
Yan, J. et al. Histone H3 lysine 4 monomethylation modulates long-range chromatin interactions at enhancers. Cell Res. 28, 387 (2018).
He, Y. et al. Spatiotemporal DNA methylome dynamics of the developing mouse fetus. Nature 583, 752–759 (2020).
Martínez, O. & Reyes-Valdés, M. H. Defining diversity, specialization, and gene specificity in transcriptomes through information theory. Proc. Natl Acad. Sci. USA 105, 9709–9714 (2008).
Lee, D. P. et al. Robust CTCF-based chromatin architecture underpins epigenetic changes in the heart failure stress-gene response. Circulation 139, 1937–1956 (2019).
Wutz, G. et al. ESCO1 and CTCF enable formation of long chromatin loops by protecting cohesinSTAG1 from WAPL. Elife 9, e52091 (2020).
Hsieh, T. S. et al. Resolving the 3D landscape of transcription-linked mammalian chromatin folding. Mol. Cell 78, 539–553.e8 (2020).
Weintraub, A. S. et al. YY1 is a structural regulator of enhancer-promoter loops. Cell 171, 1573–1588.e28 (2017).
Beagan, J. A. et al. YY1 and CTCF orchestrate a 3D chromatin looping switch during early neural lineage commitment. Genome Res. 27, 1139–1152 (2017).
Caputo, L. et al. The Isl1/Ldb1 complex orchestrates genome-wide chromatin organization to instruct differentiation of multipotent cardiac progenitors. Cell Stem Cell 17, 287–299 (2015).
Monahan, K., Horta, A. & Lomvardas, S. LHX2- and LDB1-mediated trans interactions regulate olfactory receptor choice. Nature 565, 448–453 (2019).
Schoenfelder, S. et al. The pluripotent regulatory circuitry connecting promoters to their long-range interacting elements. Genome Res. 25, 582–597 (2015).
Hnisz, D., Shrinivas, K., Young, R. A., Chakraborty, A. K. & Sharp, P. A. A phase separation model for transcriptional control. Cell 169, 13–23 (2017).
Wang, H. et al. Widespread plasticity in CTCF occupancy linked to DNA methylation. Genome Res. 22, 1680–1688 (2012).
Renda, M. et al. Critical DNA binding interactions of the insulator protein CTCF: a small number of zinc fingers mediate strong binding, and a single finger-DNA interaction controls binding at imprinted loci. J. Biol. Chem. 282, 33336–33345 (2007).
Leitch, H. G. et al. Naive pluripotency is associated with global DNA hypomethylation. Nat. Struct. Mol. Biol. 20, 311–316 (2013).
Ficz, G. et al. FGF signaling inhibition in ESCs drives rapid genome-wide demethylation to the epigenetic ground state of pluripotency. Cell Stem Cell 13, 351–359 (2013).
Maurano, M. T. et al. Role of DNA methylation in modulating transcription factor occupancy. Cell Rep. 12, 1184–1195 (2015).
Nanan, K. K. et al. TET-catalyzed 5-carboxylcytosine promotes CTCF binding to suboptimal sequences genome-wide. Iscience 19, 326–339 (2019).
Katainen, R. et al. CTCF/cohesin-binding sites are frequently mutated in cancer. Nat. Genet. 47, 818–821 (2015).
Kaiser, V. B., Taylor, M. S. & Semple, C. A. Mutational biases drive elevated rates of substitution at regulatory sites across cancer types. PLoS Genet. 12, e1006207 (2016).
Levine, D. A. & The Cancer Genome Atlas Research Network. Integrated genomic characterization of endometrial carcinoma. Nature 497, 67–73 (2013).
Gribnau, J., Hochedlinger, K., Hata, K., Li, E. & Jaenisch, R. Asynchronous replication timing of imprinted loci is independent of DNA methylation, but consistent with differential subnuclear localization. Genes Dev. 17, 759–773 (2003).
Strubing, C. et al. Differentiation of pluripotent embryonic stem cells into the neuronal lineage in vitro gives rise to mature inhibitory and excitatory neurons. Mech. Dev. 53, 275–287 (1995).
Bain, G., Kitchens, D., Yao, M., Huettner, J. E. & Gottlieb, D. I. Embryonic stem cells express neuronal properties in vitro. Dev. Biol. 168, 342–357 (1995).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Anders, S., Pyl, P. T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Hu, M. et al. HiCNorm: removing biases in Hi-C data via Poisson regression. Bioinformatics 28, 3131–3133 (2012).
Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101 (2016).
Crane, E. et al. Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).
Vian, L. et al. The energetics and physiological impact of cohesin extrusion. Cell 175, 292–294 (2018).
Juric, I. et al. MAPS: model-based analysis of long-range chromatin interactions from PLAC-seq and HiChIP experiments. PLoS Comput. Biol. 15, e1006982 (2019).
Bhattacharyya, S., Chandra, V., Vijayanand, P. & Ay, F. Identification of significant chromatin contacts from HiChIP data by FitHiChIP. Nat. Commun. 10, 4221 (2019).
Kremer, K. & Grest, G. S. Dynamics of entangled linear polymer melts: a molecular‐dynamics simulation. J. Chem. Phys. 92, 5057–5086 (1990).
Plimpton, S. Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 117, 1–19 (1995).
Yang, T. et al. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Genome Res. 27, 1939–1949 (2017).
Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
We thank E. Nora and B. Bruneau for exchanging datasets and reagents. We would like to give special thanks to S. Kuan for operating the sequencing instruments and T. Liu and Z. Ye for helping with experiments. We would like to acknowledge the help of V. Lobanenkov and A. Desai for giving helpful advice and the help of F. Yue, X. Wang, I. Juric and A. Abnousi for sharing computational pipelines. We would also like to give special thanks to R. Raviram, R. Fang, Y. Zhang, A. Schmitt and S. Chee for sharing helpful information and protocols, as well as all of the other members of the Ren laboratory. This work was supported by the Ludwig Institute for Cancer Research (B.R.), NIH (1U54DK107977-01) (B.R.), NIH (1U54DK107965) (H.Z.), a Ruth L. Kirschstein Institutional National Research Award from the National Institute for General Medical Sciences (T32 GM008666) (J.D.H.) and a Postdoc fellowship from the TOYOBO Biotechnology Foundation (N.K.).
B.R. is a co-founder of Arima Genomics and Epigenome Technologies.
Peer review information: Peer reviewer reports are available. Anke Sparmann was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
a, b, Western blots showing AID-tagged CTCF and wild type CTCF (a) and the expression of TIR1 protein in two mESC clones (b). TIR1 expression in these clones that went through multiple passages was comparable to that in control cells with lower passage number. Uncropped images are available as source data online. c, Western blot showing acute depletion of CTCF protein after 24 and 48 hours of auxin treatment. Uncropped images are available as source data online. d, Heatmaps showing CTCF ChIP-seq signals centered at all regions of CTCF peaks identified in the control cells and CTCF occupancy at the same regions in CTCF-depleted cells at each time point of differentiation. e, Venn-diagram comparing the number of CTCF ChIP-seq peaks identified in control and CTCF-depleted ESCs at each time point. f, Histogram showing the number of CTCF binding regions on the y-axis and the associated CTCF ChIP-seq signal level on the x-axis. The CTCF signal levels in control cells and auxin treated cells were calculated for CTCF peak regions identified in the control cells. g, Heatmaps comparing Rad21 ChIP-seq signals centered at all regions of Rad21 peaks identified in control and CTCF-depleted ESCs at each time point (left, blue heat map). CTCF occupancy is also shown (right, red heat map). h, Growth curves of mouse ESCs with or without auxin treatment. Data are plotted as averages ± standard deviation (n = 5 independent experiments). i, Bright-field microscopy images of mouse ESC colonies before and after auxin treatment. j, Cell cycle analysis by flow cytometry using propidium iodide staining in control ESCs and after 24, 48, and 96 hours of auxin treatment.
Extended Data Fig. 2 Transcriptional changes during neural differentiation of control and CTCF-depleted mES cells.
a, Gene expression profiles of pluripotency marker genes (Pou5f1, Sox2, Nanog) and examples of genes important for nervous system development that failed to be induced upon CTCF loss (Neurog1, Neurod4, Vcan, Pax6, Tubb3 (Tuj1), Rbfox3 (NeuN)) in control and CTCF-depleted cells during differentiation from ESC to NPC and 2 days after washing out auxin in NPCs. b, Gene expression profiles of Pcdhga and Hoxc gene clusters during multiple days of auxin treatment in ESCs and during differentiation from ESC to NPC in control and CTCF-depleted cells followed by washing out auxin in NPCs. c, Transcriptional changes between control ESCs and NPCs (day 4). Differentially up-regulated and down-regulated genes are plotted in red and blue, respectively (fold change > 2, FDR < 0.05). d, Top two enriched GO terms of the sets of differentially expressed genes upon CTCF loss are shown along with p values (Fisher’s exact test).
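The thresholds quoted in panel c (fold change > 2, FDR < 0.05) can be illustrated with a minimal sketch. It assumes a Benjamini-Hochberg correction and per-gene mean expression values; the study itself cites DESeq2 and edgeR for differential expression, so this is only a simplified stand-in, and the function names and the input layout are hypothetical.

```python
from math import log2

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def classify_genes(genes, min_fold=2.0, max_fdr=0.05):
    """Label each gene 'up', 'down' or 'unchanged'.

    genes: list of (name, mean_control, mean_treated, p_value) tuples
    (hypothetical layout; expression means must be positive).
    """
    fdr = benjamini_hochberg([g[3] for g in genes])
    calls = {}
    for (name, control, treated, _), q in zip(genes, fdr):
        lfc = log2(treated / control)
        if q < max_fdr and lfc > log2(min_fold):
            calls[name] = "up"
        elif q < max_fdr and lfc < -log2(min_fold):
            calls[name] = "down"
        else:
            calls[name] = "unchanged"
    return calls

# Illustrative values only, not data from the study.
example = [
    ("geneA", 10.0, 50.0, 0.001),
    ("geneB", 40.0, 10.0, 0.002),
    ("geneC", 10.0, 12.0, 0.5),
    ("geneD", 10.0, 30.0, 0.2),
]
calls = classify_genes(example)
```

A gene is called only when it passes both cutoffs, so a threefold change with a weak p-value (geneD here) stays "unchanged".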
a, APA on Hi-C peak loci (> 100-kb looping range) on convergent CTCF binding sites identified in control ESCs (n = 3185) and NPCs (n = 3686) and on Hi-C peak loci that have no CBSs (n = 2874 (ESCs), n = 2940 (NPCs)). Scores on the bottom represent focal enrichment of peak pixel against pixels in its lower left. b, Aggregate boundary analysis showing average change in boundary strength between samples. Each triangle is a contact map of the difference in the average contact profile at TAD boundaries between two time points. The bottom column shows difference in the average boundary profile between the two control samples. c, Scatter plots of insulation scores at TAD boundaries in control and auxin treated ESCs (left) and NPCs (right). A higher score denotes lower insulation. d, Boxplots showing insulation scores at TAD boundaries that overlapped with housekeeping genes and CBSs, with other genes and CBSs, and TAD boundaries without CBSs in control and auxin treated ESCs and NPCs. All boxplots hereafter are defined as: Central bar, median; lower and upper box limits, 25th and 75th percentiles, respectively; whiskers, minimum and maximum value within the range of (1st quartile-1.5*(3rd quartile- 1st quartile)) to (3rd quartile+1.5*(3rd quartile- 1st quartile)). *** p value < 0.001, two-tailed t-test. e, The number of TAD boundaries (left), stripes (middle), and insulated neighborhoods (INs) in control, CTCF-depleted, and auxin washout cells. Hatched bars indicate overlap with control cells. f, Hi-C contact frequencies at each genomic distance. g–i, Comparison of Hi-C datasets generated in this study and by Nora et al. (ref. 13). Scatter plots of insulation scores at all TAD boundaries (g). Number of TAD boundaries in control and CTCF-depleted cells from both studies. Hatched bars indicate the overlap with control cells (h). 
Genome browser snapshots showing Hi-C contact heatmaps, TAD boundaries, directionality indices (DIs), and insulation scores analyzed in the two independent studies at the same genomic region in control and CTCF-depleted cells (i).
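The boxplot convention stated in panel d of the legend above (median bar, quartile box, whiskers at the most extreme values within 1.5 interquartile ranges of the box) can be written out as a short sketch. The quantile interpolation method is an assumption, since plotting packages differ on this point, and the function name is hypothetical.

```python
import statistics

def boxplot_stats(values):
    """Summarize values using the whisker rule described in the legend.

    Quartiles use the 'inclusive' interpolation of statistics.quantiles,
    which is an assumption; the legend does not specify a method.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Illustrative values only, not insulation scores from the study.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

With these inputs the value 100 falls outside the upper fence, so the upper whisker stops at 9 and 100 is reported separately.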
Extended Data Fig. 4 Features of E-P and P-P contacts that change during neural differentiation in control cells.
a, Histogram showing the number of significantly induced (red) and reduced (blue) E-P contacts between ESCs and NPCs and their genomic distances. *** p value < 0.001, Pearson’s Chi-squared test. b, Scatter plots showing changes of H3K27ac and H3K4me1 ChIP-seq signals at distal elements that display significantly induced (red) or reduced (blue) E-P or P-P contacts during neural differentiation. c, Genome browser snapshots of Sox2 (top) and Dnmt3b (bottom) loci. Arcs show changes of H3K4me3 PLAC-seq contacts on active elements and promoters between ESCs and NPCs (see Methods for details). The colors of arcs represent degrees of interaction change between samples (blue to red, -/+log10(p-value)) (Fisher’s exact test). Promoter regions of Sox2 and Dnmt3b and interacting enhancer regions are shown in green and yellow shadows, respectively. CTCF, H3K4me1, H3K27ac, H3K4me3, H3K27me3 ChIP-seq and RNA-seq in ESCs and NPCs (day 4) are also shown. d, Scatter plots showing changes of E-P or P-P contacts anchored on up-regulated (left) and down-regulated (right) genes between ESCs and NPCs. Genomic distances between their two loop anchor sites are plotted on x-axis. Significantly induced and reduced chromatin contacts are shown as red and blue dots, respectively (FDR < 0.05). e, Histogram showing the number of genes and the number of their interacting distal elements in NPCs. Genes without significant chromatin contacts were removed in this analysis. f, Schematic representation of the AIC model to compute the correlation between changes of multiple E-P contacts and gene expression levels. H3K27ac and H3K4me1 peaks are shown as red and yellow peaks, respectively, and regions where these two types of peaks overlap are defined as active elements (red colored regions). Promoter-centered chromatin contacts on these active elements are shown as red arcs (active contacts) and other chromatin contacts are shown as blue arcs (inactive contacts). 
AIC ratio and value were calculated as indicated on the bottom (see Methods for details). g, Scatter plots showing changes of AIC values and gene expression levels in differentially expressed genes during neural differentiation with linear approximation. h, (Left) Schematic representation of a model to calculate AIC values using only P-P or E-P contacts. Promoter-centered chromatin contacts on active enhancers are shown as yellow arcs and chromatin contacts on other promoters are shown as green arcs. AIC ratios and values of P-P contacts and E-P contacts to other inactive contacts were calculated as shown in panel f. (Right) Box plots showing changes of the AIC values of P-P and E-P contacts in differentially expressed genes. The number of data points is indicated on the bottom. *** p value < 0.001, two-tailed t-test. i, Histogram of the number of significant PLAC-seq peaks (FDR < 0.01) on P-P and E-P pairs anchored on active and inactive genes (top and bottom 25% of gene expression) in ESCs and NPCs. j, Histogram showing the number of significant PLAC-seq peaks (FDR < 0.01) on P-P and E-P pairs anchored on active distal elements (presence of H3K4me1 and H3K27ac) and repressive distal elements (presence of H3K4me1 and H3K27me3, but not H3K27ac peaks) in ESCs and NPCs. Schematic representation of each type of chromatin contact is shown on the bottom. k, Average enrichments of H3K27me3 ChIP-seq signals on TSSs and TESs of genes that interact with repressive distal enhancers identified in panel (j). H3K27me3 ChIP-seq signals on other genes are shown as control.
Extended Data Fig. 5 Changes of chromatin contacts upon acute CTCF loss illuminate relationship between CTCF-dependent chromatin contacts and gene regulation.
a, b, Scatter plots showing changes of H3K4me3 PLAC-seq contacts (y-axis) on convergently oriented CBSs and their loop ranges (x-axis). Chromatin contacts in CTCF-depleted cells were compared to the chromatin contacts in control cells in ESC (a) and NPC stage (day 4) (b). The plots were classified based on whether they are on promoters and enhancers (E-P and P-P) (left) or not (right). c, Histograms showing the number of significantly changed E-P(P-P) contacts upon CTCF loss and their genomic distances in ESC (left) and NPC (right) stages. Significantly induced and reduced contacts are shown as red and blue bars, respectively. P value: Pearson’s Chi-squared test for the comparison of the number of chromatin contacts that were long-range (≥ 100 kb) or not between the ESCs and NPCs. d, Scatter plots showing changes of E-P(P-P) contacts anchored on CTCF-dependent down-regulated genes in ESC (left) and NPC stage (right). Chromatin contacts were classified based on whether they were E-P or P-P contacts (red vs blue dots). Their genomic ranges are plotted on the x-axis. The numbers of reduced E-P and P-P contacts are also shown (p value < 0.05). e, Heatmaps and dotplots showing gene expression changes (fold change, FC) of genes that lost P-P contacts upon CTCF loss in ESC (left) and NPC (right) stages. Each gene pair interacting through CTCF-dependent P-P contacts is shown as either Gene A or Gene B. Gene A has a lower log2-FC than Gene B. Pearson correlation coefficients (r) between the gene expression changes of the two paired genes are shown on the bottom of the heatmaps. Blue and green dots, down-regulated gene A and B (FDR < 0.05); gray and yellow dots, stably regulated gene A and B; light blue and red dots, up-regulated gene A and B (FDR < 0.05).
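The gene-pair comparison in panel e can be sketched as follows: each pair is ordered so that Gene A carries the lower log2 fold change, and a Pearson correlation coefficient is then computed across pairs. The function names and the input values are hypothetical, and the sketch does not reproduce the authors' pipeline.

```python
from math import sqrt

def order_pairs(pairs):
    """Order each (lfc_1, lfc_2) pair so that Gene A has the lower log2-FC."""
    return [(min(x, y), max(x, y)) for x, y in pairs]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Illustrative log2 fold changes for three promoter-promoter pairs.
pairs = order_pairs([(0.5, -1.0), (-2.0, -0.5), (0.1, 0.3)])
gene_a = [a for a, _ in pairs]
gene_b = [b for _, b in pairs]
r = pearson_r(gene_a, gene_b)
```

Because every pair is sorted before correlating, Gene A is by construction never higher than Gene B, which is worth keeping in mind when interpreting r.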
Extended Data Fig. 6 CTCF loss does not measurably alter histone modification at promoters and enhancers.
a–c, Scatter plots showing the changes of H3K27ac (a) and H3K4me1 (b) ChIP-seq signal levels upon CTCF loss on all significant peak regions in ESCs (left) and NPCs (right). The changes of H3K4me3 ChIP-seq signal levels on all peak regions on transcription start sites (TSSs) are also shown (c). d, Boxplots showing the changes of H3K27ac (top) and H3K4me1 (bottom) ChIP-seq signal levels on distal element loci of all analyzed E-P contacts. The changes upon CTCF depletion in ESC (left) and NPC stage (middle) and the changes during neural differentiation (right) are shown. The numbers of data points are also indicated on the bottom. NS not significant, *** p value < 0.001, two-tailed t-test.
a, b, Enrichment analysis of CTCF-dependent reduced E-P and P-P contacts (top), CTCF-independent E-P and P-P contacts (middle), and CTCF-dependent induced E-P and P-P contacts (bottom). Chromatin contacts were categorized based on the distance from the loop anchor sites on the distal element side (vertical columns) or promoter side (horizontal columns) to the nearest CBS (a) or based on the number of CBSs around loop anchor sites (10 kb bin ±5 kb) on the distal element side (vertical columns) or promoter side (horizontal columns) (b). Enrichment values are shown by odds ratio (scores in boxes) and p-values (color) in ESCs (left) and NPCs (right) (see Methods). c, Average enrichment of CTCF ChIP-seq signals on TSSs of CTCF-dependent up-regulated (red) or down-regulated (blue) and CTCF-independent genes (gray) in ESCs (left) and NPCs (right). d, Histograms showing the number of reduced CTCF-dependent E-P and P-P contacts (p value < 0.05) anchored on CTCF-dependent down-regulated gene promoters with CBSs (TSS±5 kb) in ESCs (left) and NPCs (right). Chromatin contacts were classified based on whether their interacting distal elements were anchored on convergent CTCF or not (within 10 kb bin). e, f, Enrichment analysis of E-P and P-P contacts anchored on CTCF-dependent down-regulated genes categorized based on the distance from the loop anchor sites on the distal element side (vertical columns) or promoter side (horizontal columns) to the nearest CBS (e). The same enrichment analysis categorized based on the number of CBSs around loop anchor sites (10 kb bin ±5 kb) on distal element side (vertical columns) or promoter side (horizontal columns) (f). Enrichment values shown by odds ratio (scores in boxes) and p-values (color) in ESCs (left) and NPCs (right) (see Methods). g, Genome browser snapshots of the Baiap2 locus. Arcs show changes of chromatin contacts on E-P and on CBSs. 
The colors of arcs represent change from control cells to CTCF-depleted cells (blue to red, −/+log10(p-value)). Promoter regions of Baiap2 and interacting enhancer regions are shown in green and yellow shadows, respectively. CTCF, H3K4me1, H3K27ac, H3K4me3, H3K27me3 ChIP-seq, and RNA-seq in control and CTCF-depleted NPCs, and TAD boundaries in control cells are also shown. h, Boxplots showing the number of CBSs located between two anchor sites of significantly induced (red) or reduced (blue) E-P contacts upon CTCF loss, and CTCF-independent E-P contacts (gray). The numbers of data points are indicated on the bottom. *** p value < 0.001, two-tailed t-test. i, The fraction of E-P and P-P contacts that overlapped with TAD boundaries in ESCs (left) and NPCs (right). The numbers of data points are indicated on the bottom. * p value < 0.05, *** p value < 0.001, Pearson’s Chi-squared test.
a, Genome browser snapshots of the Vcan locus. Arcs show changes of chromatin contacts anchored on the Vcan promoter, distal enhancer, and CBSs identified between wild type NPCs and NPCs in which promoter-proximal CTCF motif sequences were deleted. The colors of arcs represent degrees of interaction change upon the deletion of CTCF motif sequences (blue to red, -/+log10(p-value)). The promoter region and interacting enhancer region are shown in green and yellow shadows, respectively. CTCF, H3K27ac, H3K4me1, H3K4me3, and H3K27me3 ChIP-seq, and TAD boundaries in wild type NPCs are also shown. b, Schematic representation of the dCas9-CTCF rescue experiments. c, Western blot of cell lysates expressing dCas9-CTCF or dCas9 control plasmids. Uncropped images are available as source data online. d, Snapshots of heatmaps around Vcan showing mapped reads of PLAC-seq in dCas9-CTCF (top and bottom strand) and dCas9 control cell lines. Peaks of chromatin contacts between the Vcan gene promoter and the downstream distal element are shown in zoom-in. e, SBS model showing triplet interactions between the Vcan promoter (black), distal enhancer (blue) and CBSs that weaken upon CTCF depletion (white arrows). Heatmaps from each viewpoint in control and CTCF-depleted NPCs are shown. CTCFs (+) (browns) and CTCFs (-) (reds) are convergently oriented. f, Hi-C contact maps (left) of the Vcan locus in control and auxin treated NPCs and the SBS polymer model (right) (HiCRep stratum adjusted correlation SCC=0.76 and SCC=0.62 respectively). Genomic positions of Vcan promoter (black), distal enhancer (blue) and relevant motif-oriented CBSs (brown and red) are shown by colored triangles. g, SBS derived 3D structures of the Vcan locus in control and CTCF-depleted NPCs, with relevant elements indicated by colored beads (color as in (b)).
a, Boxplots showing the distance from TSS to the nearest enhancer (left) and promoter (right) region in ESCs and NPCs. Red: CTCF-dependent up-regulated genes, blue: CTCF-dependent down-regulated genes, gray: CTCF-independent stably regulated genes. The numbers of genes analyzed in each group are indicated on the bottom. *** p value < 0.001, ** p value < 0.01, * p value < 0.05, two-tailed t-test. b, Average enrichment of H3K27ac ChIP-seq signals on TSSs of CTCF-dependent up-regulated (red), and down-regulated (blue) genes and CTCF-independent genes (gray) in ESCs (left) and NPCs (right). Average enrichment of H3K27me3 ChIP-seq signals on TSSs and TESs are also shown (bottom). c, d, Genome browser snapshot of the Sox2 locus (c) whose reduction of expression level was moderate 24 or 48 hours after CTCF depletion in ESCs in RNA-seq and qPCR (d). The arcs show PLAC-seq contact counts in control (top) and CTCF-depleted ESCs (middle) at every 10-kb bin. Changes of chromatin contacts on enhancers and the Sox2 promoter are also shown (bottom). Sox2 gene promoter and interacting super enhancer are shown in green and yellow shadows, respectively. CTCF, H3K4me1, H3K27ac, H3K4me3, and H3K27me3 ChIP-seq, RNA-seq in control and CTCF-depleted ESCs are shown. The error bars in the right panel of (d) indicate standard deviation of 8 independent experiments. RPKM values were calculated from two RNA-seq replicates. NS not significant, * p value < 0.05, ** p value < 0.01, *** p value < 0.001, two-tailed t-test. e, Enrichment analysis of CTCF-dependent down-regulated (left) and up-regulated (right) genes categorized based on the distance to the nearest interacting enhancer (vertical columns) and the number of enhancers around TSS (<200 kb) (horizontal columns) in ESCs. Enrichment values are shown by odds ratio (scores in boxes) and p-values (color). 
The distance to the nearest interacting enhancer is represented by the shortest genomic distance of significant PLAC-seq peaks on enhancers and promoters (p-value < 0.01; see Fig. 4b for the same analysis in NPCs). f, Model for the general features of CTCF-dependent down-regulated genes (top), up-regulated genes (middle), and CTCF-independent genes (bottom). g, Venn diagram showing overlap between CTCF-dependent genes and Mll3/4-dependent genes in NPCs. Statistical significance is computed based on Fisher’s exact test. Odds ratio represents the strength of association.
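The overlap statistics named in panels e and g (odds ratio and Fisher’s exact test on a 2×2 table) can be illustrated with a minimal sketch. This is not the authors' code: the function name `fisher_enrichment`, the one-sided (enrichment) alternative, and the toy counts are all assumptions made for illustration, since the legend does not state the sidedness of the test or the underlying counts.

```python
from math import comb

def fisher_enrichment(a, b, c, d):
    """One-sided Fisher's exact test on a 2x2 table (hypothetical helper).

    a: genes in both sets, b: genes only in set A,
    c: genes only in set B, d: genes in neither set.
    Returns (sample odds ratio, P(overlap >= a) under the hypergeometric null).
    """
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    n = a + b + c + d      # all genes considered
    row1 = a + b           # size of set A
    col1 = a + c           # size of set B
    max_overlap = min(row1, col1)
    # Sum hypergeometric probabilities for overlaps at least as large as observed.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, max_overlap + 1))
    return odds_ratio, tail / comb(n, col1)

# Toy counts only; these are not values from the study.
odds, p = fisher_enrichment(3, 1, 1, 3)
print(odds, round(p, 4))  # 9.0 0.2429
```

An odds ratio above 1 indicates that the two gene sets overlap more than expected by chance; a two-sided test, if that is what was used, would give a different p-value for the same table.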
a, Histogram showing frequencies of genomic regions with 2 or more CBSs in all 9 analyzed tissues, classified based on GC content levels. Black line shows fold change between the two groups. Total numbers of genomic regions analyzed in each group are indicated on the bottom. *** p value < 0.001, two-tailed t-test. b, Histogram showing frequencies of CTCF motif sequences and their PhastCons conservation scores. c, Histogram showing the fractions of genes whose promoter-proximal CTCF binding motifs were in the same direction as the orientation of transcription. The fractions in CTCF-dependent down- and up-regulated genes and CTCF-independent genes in ESCs and NPCs are shown. The numbers of genes analyzed in each group are indicated on the bottom. * p value < 0.05, ** p value < 0.01, *** p value < 0.001, Pearson’s Chi-squared test. d, Heatmap showing lineage-specific DNA methylation levels at CBSs (motif sequences ±100 bp) in promoters of genes shown in Fig. 5c. The DNA methylation levels at multiple CBSs in the same promoter region (TSS±10 kb) were averaged. Lineage specificity of the DNA methylation levels shown in the heatmap is calculated as log2(DNA methylation level/average methylation level of all tissues). The heatmap was sorted by the correlation coefficient between CTCF ChIP-seq signal levels and DNA methylation levels across multiple tissues in each group. Each correlation coefficient is shown in the scatter plots (right) (r<−0.5, highlighted in blue). e, Boxplots showing length of lineage-specific genes with CTCF-occupied promoters that had a high correlation coefficient (> 0.6) in Fig. 5b, c. Forebrain-specific genes and other lineage-specific genes are shown at right and middle, respectively. All genes whose RNA-seq RPKM value is more than 1 in at least one tissue sample were used as control (left). The numbers of genes analyzed in each group are indicated on the bottom. NS not significant, *** p value < 0.001, two-tailed t-test.
f, Boxplots showing gene length of CTCF-dependent down-regulated, up-regulated and CTCF-independent genes in ESCs and NPCs. The numbers of genes analyzed in each group are indicated on the bottom. NS not significant, *** p value < 0.001, two-tailed t-test. g, Volcano plots showing the gene expression changes of the forebrain-specific CTCF-occupied genes between control cells and CTCF-depleted cells in ESCs (left) and NPCs (right). h, Volcano plots showing gene expression changes of heart-tissue-specific CTCF-occupied genes between control heart tissue and CTCF knockout heart tissue.
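The lineage-specificity score described for panel d, log2(DNA methylation level/average methylation level of all tissues), and the per-gene correlation with CTCF ChIP-seq signal can be sketched as follows. This is an illustrative sketch rather than the authors' pipeline: the function names, the choice of Pearson correlation (the legend says only "correlation coefficient"), and the toy values are assumptions.

```python
import math

def lineage_specificity(levels):
    """log2(level / mean level across tissues) for each tissue.

    levels: dict mapping tissue name -> DNA methylation level (must be > 0;
    a pseudocount would be needed for zeros, which the legend does not specify).
    """
    mean = sum(levels.values()) / len(levels)
    return {tissue: math.log2(v / mean) for tissue, v in levels.items()}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Toy values only; tissue names and numbers are hypothetical.
methylation = {"forebrain": 0.2, "heart": 0.8, "liver": 0.8}
ctcf_signal = {"forebrain": 10.0, "heart": 2.0, "liver": 2.0}

scores = lineage_specificity(methylation)
tissues = sorted(methylation)
r = pearson([methylation[t] for t in tissues],
            [ctcf_signal[t] for t in tissues])
print(scores["forebrain"] < 0, r < -0.5)  # True True
```

In this toy case the forebrain CBS is hypomethylated relative to the tissue average and CTCF signal is anti-correlated with methylation, which is the pattern the legend highlights with r<−0.5.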
Supplementary Figures 1 and 2 and Supplementary Note.
List of NGS sample information.
List of CTCF ChIP–seq peaks in control and CTCF-depleted cells.
Gene expression changes upon CTCF loss at each time point during neural differentiation.
List of TAD boundaries and stripes in control and CTCF-depleted cells.
List of differentially changed enhancer–promoter (promoter–promoter) interactions.
List of ENCODE datasets that were used for Fig. 5 and Extended Data Fig. 10.
About this article
Cite this article
Kubo, N., Ishii, H., Xiong, X. et al. Promoter-proximal CTCF binding promotes distal enhancer-dependent gene activation. Nat Struct Mol Biol (2021). https://doi.org/10.1038/s41594-020-00539-5