Cohesin catalyses the folding of the genome into loops that are anchored by CTCF (ref. 1). The molecular mechanism of how cohesin and CTCF structure the 3D genome has remained unclear. Here we show that a segment within the CTCF N terminus interacts with the SA2–SCC1 subunits of human cohesin. We report a crystal structure of SA2–SCC1 in complex with CTCF at a resolution of 2.7 Å, which reveals the molecular basis of the interaction. We demonstrate that this interaction is specifically required for CTCF-anchored loops and contributes to the positioning of cohesin at CTCF binding sites. A similar motif is present in a number of established and newly identified cohesin ligands, including the cohesin release factor WAPL (refs. 2, 3). Our data suggest that CTCF enables the formation of chromatin loops by protecting cohesin against loop release. These results provide fundamental insights into the molecular mechanism that enables the dynamic regulation of chromatin folding by cohesin and CTCF.
Coordinates are available from the PDB under accession number 6QNX for the SA2–SCC1–CTCF complex. The generated Hi-C, RNA sequencing and ChIP–seq data have been deposited in GEO, accession number GSE126637. Any other relevant data are available from the corresponding authors upon reasonable request.
Dekker, J. & Mirny, L. The 3D genome as moderator of chromosomal communication. Cell 164, 1110–1121 (2016).
Hara, K. et al. Structure of cohesin subcomplex pinpoints direct shugoshin–Wapl antagonism in centromeric cohesion. Nat. Struct. Mol. Biol. 21, 864–870 (2014).
Shintomi, K. & Hirano, T. Releasing cohesin from chromosome arms in early mitosis: opposing actions of Wapl–Pds5 and Sgo1. Genes Dev. 23, 2224–2236 (2009).
Merkenschlager, M. & Nora, E. P. CTCF and cohesin in genome folding and transcriptional gene regulation. Annu. Rev. Genomics Hum. Genet. 17, 17–43 (2016).
Rowley, M. J. & Corces, V. G. Organizational principles of 3D genome architecture. Nat. Rev. Genet. 19, 789–800 (2018).
Yatskevich, S., Rhodes, J. & Nasmyth, K. Organization of chromosomal DNA by SMC complexes. Annu. Rev. Genet. 53, 445–482 (2019).
Alipour, E. & Marko, J. F. Self-organization of domain structures by DNA-loop-extruding enzymes. Nucleic Acids Res. 40, 11202–11212 (2012).
Fudenberg, G. et al. Formation of chromosomal domains by loop extrusion. Cell Rep. 15, 2038–2049 (2016).
Sanborn, A. L. et al. Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc. Natl Acad. Sci. USA 112, E6456–E6465 (2015).
Haarhuis, J. H. I. et al. The cohesin release factor WAPL restricts chromatin loop extension. Cell 169, 693–707 (2017).
Nora, E. P. et al. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell 169, 930–944 (2017).
Wutz, G. et al. Topologically associating domains and chromatin loops depend on cohesin and are regulated by CTCF, WAPL, and PDS5 proteins. EMBO J. 36, 3573–3599 (2017).
Guo, Y. et al. CRISPR inversion of CTCF sites alters genome topology and enhancer/promoter function. Cell 162, 900–910 (2015).
de Wit, E. et al. CTCF binding polarity determines chromatin looping. Mol. Cell 60, 676–684 (2015).
Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
Vietri Rudan, M. et al. Comparative Hi-C reveals that CTCF underlies evolution of chromosomal domain architecture. Cell Rep. 10, 1297–1309 (2015).
Rao, S. S. P. et al. Cohesin loss eliminates all loop domains. Cell 171, 305–320 (2017).
Gassler, J. et al. A mechanism of cohesin-dependent loop extrusion organizes zygotic genome architecture. EMBO J. 36, 3600–3618 (2017).
Rubio, E. D. et al. CTCF physically links cohesin to chromatin. Proc. Natl Acad. Sci. USA 105, 8309–8314 (2008).
Xiao, T., Wallace, J. & Felsenfeld, G. Specific sites in the C terminus of CTCF interact with the SA2 subunit of the cohesin complex and are required for cohesin-dependent insulation activity. Mol. Cell. Biol. 31, 2174–2183 (2011).
Pezzi, N. et al. STAG3, a novel gene encoding a protein involved in meiotic chromosome pairing and location of STAG3-related genes flanking the Williams-Beuren syndrome deletion. FASEB J. 14, 581–592 (2000).
Orgil, O. et al. A conserved domain in the Scc3 subunit of cohesin mediates the interaction with both Mcd1 and the cohesin loader complex. PLoS Genet. 11, e1005036 (2015).
Roig, M. B. et al. Structure and function of cohesin’s Scc3/SA regulatory subunit. FEBS Lett. 588, 3692–3702 (2014).
Forbes, S. A. et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 45, D777–D783 (2017).
Beckouët, F. et al. Releasing activity disengages cohesin’s Smc3/Scc1 interface in a process blocked by acetylation. Mol. Cell 61, 563–574 (2016).
Gandhi, R., Gillespie, P. J. & Hirano, T. Human Wapl is a cohesin-binding protein that promotes sister-chromatid resolution in mitotic prophase. Curr. Biol. 16, 2406–2417 (2006).
Kueng, S. et al. Wapl controls the dynamic association of cohesin with chromatin. Cell 127, 955–967 (2006).
Liu, H., Rankin, S. & Yu, H. Phosphorylation-enabled binding of SGO1–PP2A to cohesin protects sororin and centromeric cohesion during mitosis. Nat. Cell Biol. 15, 40–49 (2013).
Ouyang, Z. et al. Structure of the human cohesin inhibitor Wapl. Proc. Natl Acad. Sci. USA 110, 11355–11360 (2013).
Krystkowiak, I. & Davey, N. E. SLiMSearch: a framework for proteome-wide discovery and annotation of functional modules in intrinsically disordered regions. Nucleic Acids Res. 45, W464–W469 (2017).
Lawrence, M. S. et al. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495–501 (2014).
Flavahan, W. A. et al. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature 529, 110–114 (2016).
Hnisz, D. et al. Activation of proto-oncogenes by disruption of chromosome neighborhoods. Science 351, 1454–1458 (2016).
Ouyang, Z., Zheng, G., Tomchick, D. R., Luo, X. & Yu, H. Structural basis and IP6 requirement for Pds5-dependent cohesin dynamics. Mol. Cell 62, 248–259 (2016).
Chan, K. L. et al. Cohesin’s DNA exit gate is distinct from its entrance gate and is regulated by acetylation. Cell 150, 961–974 (2012).
Buheitel, J. & Stemmann, O. Prophase pathway-dependent removal of cohesin from human chromosomes requires opening of the Smc3–Scc1 gate. EMBO J. 32, 666–676 (2013).
Eichinger, C. S., Kurze, A., Oliveira, R. A. & Nasmyth, K. Disengaging the Smc3/kleisin interface releases cohesin from Drosophila chromosomes during interphase and mitosis. EMBO J. 32, 656–665 (2013).
Sedeño Cacciatore, Á. & Rowland, B. D. Loop formation by SMC complexes: turning heads, bending elbows, and fixed anchors. Curr. Opin. Genet. Dev. 55, 11–18 (2019).
Tang, Z. et al. CTCF-mediated human 3D genome architecture reveals chromatin topology for transcription. Cell 163, 1611–1627 (2015).
Nagy, G. et al. Motif oriented high-resolution analysis of ChIP-seq data reveals the topological order of CTCF and cohesin proteins on DNA. BMC Genomics 17, 637 (2016).
Kschonsak, M. et al. Structural basis for a safety-belt mechanism that anchors condensin to chromosomes. Cell 171, 588–600 (2017).
Ganji, M. et al. Real-time imaging of DNA loop extrusion by condensin. Science 360, 102–105 (2018).
Li, Y. et al. Structural basis for Scc3-dependent cohesin recruitment to chromatin. eLife 7, e38356 (2018).
Studier, F. W. Protein production by auto-induction in high density shaking cultures. Protein Expr. Purif. 41, 207–234 (2005).
Bowler, M. W. et al. MASSIF-1: a beamline dedicated to the fully automatic characterization and data collection from crystals of biological macromolecules. J. Synchrotron Radiat. 22, 1540–1547 (2015).
Svensson, O., Malbet-Monaco, S., Popov, A., Nurizzo, D. & Bowler, M. W. Fully automatic characterization and data collection from crystals of biological macromolecules. Acta Crystallogr. D 71, 1757–1767 (2015).
Svensson, O., Gilski, M., Nurizzo, D. & Bowler, M. W. Multi-position data collection and dynamic beam sizing: recent improvements to the automatic data-collection algorithms on MASSIF-1. Acta Crystallogr. D 74, 433–440 (2018).
Kabsch, W. Integration, scaling, space-group assignment and post-refinement. Acta Crystallogr. D 66, 133–144 (2010).
Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta Crystallogr. D 67, 235–242 (2011).
McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D 66, 486–501 (2010).
Adams, P. D. et al. PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr. D 66, 213–221 (2010).
Chen, V. B. et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. D 66, 12–21 (2010).
Yin, M. et al. Molecular mechanism of directional CTCF recognition of a diverse range of genomic sites. Cell Res. 27, 1365–1377 (2017).
Rhodes, J. D. P. et al. Cohesin can remain associated with chromosomes during DNA replication. Cell Rep. 20, 2749–2755 (2017).
Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259 (2015).
Yang, T. et al. HiCRep: assessing the reproducibility of Hi-C data using a stratum-adjusted correlation coefficient. Genome Res. 27, 1939–1949 (2017).
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003 (2012).
Lévy-Leduc, C., Delattre, M., Mary-Huard, T. & Robin, S. Two-dimensional segmentation for analyzing Hi-C data. Bioinformatics 30, i386–i392 (2014).
Crane, E. et al. Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).
Flyamer, I. M. et al. Single-nucleus Hi-C reveals unique chromatin reorganization at oocyte-to-zygote transition. Nature 544, 110–114 (2017).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10 (2011).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).
Feng, J., Liu, T., Qin, B., Zhang, Y. & Liu, X. S. Identifying ChIP–seq enrichment using MACS. Nat. Protoc. 7, 1728–1740 (2012).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Amemiya, H. M., Kundaje, A. & Boyle, A. P. The ENCODE blacklist: identification of problematic regions of the genome. Sci. Rep. 9, 9354 (2019).
Mathelier, A. et al. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles. Nucleic Acids Res. 42, D142–D147 (2014).
Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).
Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).
Anders, S., Pyl, P. T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Landgraf, C. et al. Protein interaction networks by proteome peptide scanning. PLoS Biol. 2, e14 (2004).
This work was funded by EMBL. J.H.I.H., Á.S.C. and B.D.R. were supported by an ERC CoG (772471 ‘CohesinLooping’), M.S.v.R. was supported by the Boehringer Ingelheim Fonds and H.T. and E.d.W. were supported by an ERC StG (637587 ‘HAP-PHEN’). H.T. and E.d.W. are part of the Oncode Institute, which is partly financed by the Dutch Cancer Society. We thank the staff at the ESRF beamline Massif-1; T. Gibson for advice concerning short linear motifs; J. Rhodes and K. Nasmyth for reagents and advice on Halo tagging; R. van der Weide for advice and bioinformatic analyses; and R. Kerkhoven and the NKI Genomics Core Facility for sequencing.
The authors declare no competing interests.
Peer review information Nature thanks Victor Corces, Karl-Peter Hopfner and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
a, Domain architecture of CTCF. CTCF fragments tested for SA2–SCC1 binding by GST pulldown analysis are indicated. The region that retains SA2–SCC1 is highlighted in magenta. b, Summary data showing results of GST pulldowns. The input and the bound fractions were analysed by SDS–PAGE. CTCF fragments that bind SA2–SCC1 are shown in magenta. The experiment was repeated once. c, ITC curves. The binding stoichiometry (N) and dissociation constants (Kd) are indicated. The experiment was repeated three times, with consistent results. d, Fo − Fc omit electron-density Fourier map contoured at 3σ. e, LIGPLOT representation of the interaction between the CTCF peptide and SA2–SCC1. The CTCF peptide is shown in magenta, SA2 in blue and SCC1 in green bonds.
a, Multiple sequence alignment of SA2 (here denoted STAG2) orthologues and paralogues. *Key amino-acid residues that engage CTCF. b, Missense mutation frequencies plotted onto the SA2 structure. R370 (a hotspot in SA2) is indicated. The inset shows an overview of the mutation hotspot (R370 of SA2), Y226 and F228 of CTCF, and S334, K335, R338 and L341 of SCC1. c, ITC progress curves of binding between WAPL(423–463) and SA2–SCC1. d, Competition between SGO1 and CTCF for SA2–SCC1 binding. SA2–SCC1 was incubated with GST–CTCF(86–267). Increasing amounts (lanes 4–8; molar ratios are indicated) of the SGO1 peptide phosphorylated at T346 (spanning residues 331–349) were added, and the input and the bound fractions were analysed by SDS–PAGE. The experiment was repeated twice. One representative example is shown. e, Domain architecture and sequence alignments of cohesin regulators that contain F/YXF motifs. Putative CES-interacting residues are highlighted in red. f, Regular expression motif used to query the human and yeast proteomes for factors containing F/YXF motifs. Regular expression syntax: letters denote a specific amino acid; square brackets denote a subset of allowed amino acids; curly brackets denote length variability.
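The regular-expression query described in panel f can be illustrated with a minimal Python sketch. The pattern below is a hypothetical simplification (F or Y, any residue, then F); the exact expression used in the study, including any flanking constraints or length-variable segments in curly brackets, is not given in this legend. The toy sequences are invented for illustration and are not real proteins.

```python
import re

# Hypothetical simplified pattern for an F/YXF motif:
#   [FY] -> square brackets: F or Y is allowed at this position
#   .    -> any residue (the "X" of F/YXF)
#   F    -> a letter: exactly phenylalanine
# A zero-width lookahead is used so that overlapping hits are reported.
FYXF = re.compile(r"(?=([FY].F))")


def find_motifs(sequence):
    """Return (1-based position, matched tripeptide) for every F/YXF hit."""
    return [(m.start() + 1, m.group(1)) for m in FYXF.finditer(sequence.upper())]


# Toy sequences (assumptions, for illustration only)
toy_proteome = {
    "toy_protein_1": "MSADYDFKLG",
    "toy_protein_2": "MKKLLAAGG",
}

hits = {name: find_motifs(seq) for name, seq in toy_proteome.items()}
for name, found in hits.items():
    print(name, found)
```

A proteome-wide search would apply the same function to every sequence in a FASTA file and would usually restrict hits to disordered regions, as tools such as SLiMSearch do; that filtering is not shown here.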
a, Schematic of CRISPR–Cas9-based generation of CTCFY226A/F228A cells. The guide targets cleavage of exon 1 of the CTCF gene. The repair oligonucleotide renders the gene noncleavable by Cas9, and simultaneously introduces mutations in the codons that encode Y226 and F228. b, The CTCFY226A/F228A mutation was confirmed by Sanger sequencing, including a silent mutation at position 229. c, Western blot depicting Halo-tagged SCC1 in wild-type and CTCFY226A/F228A cells. The parental wild-type cells are included as a control. This experiment was performed once. d, Representative images of cells in G1 and G2, as indicated by their nuclear and cytoplasmic localization of DHB–iRFP, respectively. e, Chromatin-bound levels of CTCF and SMC1 analysed by western blot. Histone H4 is used as a control for the chromatin fraction. The CTCFY226A/F228A mutation does not evidently affect overall CTCF and cohesin levels on chromatin. WCE, whole-cell extract; CB, chromatin-bound fraction. This experiment was performed twice with similar results. f, Relative SCC1–Halo fluorescence intensity quantified in the unbleached area directly after photobleaching, as a proxy for the chromatin-bound fraction of SCC1. This nondiffusive fraction is not evidently affected by the CTCFY226A/F228A mutation. Individual cells of three independent experiments are plotted as dots and their mean is indicated (21 wild-type cells and 17 CTCFY226A/F228A cells were scored).
a, Schematic of a Hi-C matrix displaying DNA–DNA contacts across a genomic region that includes two TADs. TADs in general are flanked by inwards-pointing CTCF sites (magenta arrows). Signal close to the diagonal line reflects short-range contacts, and contacts that span longer distances are found further away from the diagonal. The contacts within a TAD are formed by cohesin complexes (blue circles). Cohesin builds loops that it can enlarge until it encounters CTCF. Some TADs are enriched for contacts between the two CTCF sites that lie at their boundaries. These contacts are referred to as CTCF-anchored loops. b, Aggregate TAD analysis depicting the average contact frequency across TADs defined in wild-type cells. c, Heat map of the insulation score (ref. 61) at TAD borders, as defined for wild-type cells. d, Aggregate peak analysis as in Fig. 3c, using two independent library preparations per genotype. e, Aggregate TAD analysis for wild-type and CTCFY226A/F228A cells as in b. f, Heat map of insulation scores at TAD borders for wild-type and CTCFY226A/F228A cells as in c.
a, Hi-C contact matrix of region chromosome 16: 77000000–78300000 at 10-kb resolution for the wild-type cell line (bottom triangles) and the CTCFY226A/F228A cell line (top triangles). CTCF sites are depicted below; those selected for qPCR are shown in colour. Red triangles indicate sites with a forward motif and blue triangles indicate sites with a reverse motif. The numbers underneath indicate the qPCR primer pairs shown in b. Primer pair 11 (indicated with *) is at a locus devoid of SCC1 and CTCF. b, ChIP–qPCR analysis of SCC1 (cohesin) enrichment at the aforementioned CTCF sites and control locus (*) in wild-type and CTCFY226A/F228A cells. The mean of three independent ChIP experiments is shown with the s.d. c, ChIP–seq tracks for SCC1 and CTCF at region chromosome 16: 77000000–78300000 in wild-type and CTCFY226A/F228A cells. The loci used for ChIP–qPCR analysis are indicated below the SCC1 ChIP–seq tracks. RPKM, reads per kilobase per million reads. d, ChIP–qPCR analysis of CTCF abundance at loci 1–7, as described in Fig. 3d. Analysis includes IgG as a control. The mean of two independent ChIP experiments is shown. Details of replicates are given in the Methods. e, ChIP–qPCR analysis of CTCF abundance at loci 8–12, as described in Extended Data Fig. 4a. Analysis includes IgG as a control. The mean of two independent ChIP experiments is shown. Details of replicates are given in the extended methods.
Extended Data Fig. 6 Compartmentalization is largely maintained in cells that contain the CTCFY226A/F228A mutation.
a, Hi-C contact matrices of the q-arm of chromosome 2 at 500-kb resolution. The corresponding compartment scores are plotted above. b, Genome-wide comparison of compartment scores for wild-type and CTCFY226A/F228A cells. Pearson correlation = 0.97. c, Saddle plots representing the interaction between A and B compartments. d, A region of chromosome 1 (55500000–59500000) at 10-kb resolution that contains no obvious CTCF-anchored loops. e, Relative contact probability profiles for wild-type and CTCFY226A/F228A mutant cells (left), compared to previously published contact profiles (ref. 12) upon degradation of CTCF (middle) or SCC1 (right). The contact probability profile is affected only slightly in the CTCFY226A/F228A mutants, similar to the effects of CTCF depletion.
a, Plot depicting the log2-transformed fold change in gene expression in relation to the mean of the normalized counts for each gene. Differentially expressed genes (adjusted P value < 0.05, two-tailed Wald test adjusted for multiple testing using the Benjamini–Hochberg procedure) are shown in red. Gene names are included for the 40 genes with the highest fold change. b, Western blot assessing knockdown of CTCF and the cohesin subunit SMC1 upon transfection with a control siRNA targeting luciferase (luc) or siRNAs targeting CTCF or SMC1A. This experiment was performed twice with similar results. c, Colony-formation assay of wild-type and CTCFY226A/F228A cells upon transfection with a control siRNA targeting luciferase or siRNAs targeting CTCF or SMC1A. CTCF remains essential for viability in CTCFY226A/F228A cells. This experiment was performed four times with similar results. d, Peptide array annotation (top left), binding of SA2–SCC1 (top right) or SA2(F371A)–SCC1 mutant (bottom left) and antibody control (bottom right). Three independent experiments were done, with consistent results. One representative example is shown. e, Amino acid sequences of the peptides. Predicted lead-anchoring residues are coloured red.
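The multiple-testing adjustment named in panel a (Benjamini–Hochberg) can be sketched in a few lines of Python. This is a generic illustration of the procedure, not the DESeq2 implementation used for the analysis, and the P values below are invented toy numbers rather than data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down to the smallest, scaling each by
    # n / rank and keeping a running minimum so that the adjusted values
    # never increase as the raw P values decrease.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Toy P values (assumptions, for illustration only)
toy_p = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(toy_p)

# Genes would be called differentially expressed at adjusted P < 0.05
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In practice the raw P values would come from the per-gene Wald tests, and the 0.05 threshold matches the cut-off stated in the legend.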
About this article
Cite this article
Li, Y., Haarhuis, J.H.I., Sedeño Cacciatore, Á. et al. The structural basis for cohesin–CTCF-anchored loops. Nature 578, 472–476 (2020). https://doi.org/10.1038/s41586-019-1910-z