Transcription factors orchestrate dynamic interplay between genome topology and gene regulation during cell reprogramming

  • Nature Genetics volume 50, pages 238–249 (2018)
  • doi:10.1038/s41588-017-0030-7


Chromosomal architecture is known to influence gene expression, yet its role in controlling cell fate remains poorly understood. Reprogramming of somatic cells into pluripotent stem cells (PSCs) by the transcription factors (TFs) OCT4, SOX2, KLF4 and MYC offers an opportunity to address this question but is severely limited by the low proportion of responding cells. We have recently developed a highly efficient reprogramming protocol that synchronously converts somatic into pluripotent stem cells. Here, we used this system to integrate time-resolved changes in genome topology with gene expression, TF binding and chromatin-state dynamics. The results showed that TFs drive topological genome reorganization at multiple architectural levels, often before changes in gene expression. Removal of locus-specific topological barriers can explain why pluripotency genes are activated sequentially, instead of simultaneously, during reprogramming. Together, our results implicate genome topology as an instructive force for implementing transcriptional programs and cell fate in mammals.


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References


  1. 1.

    Buganim, Y., Faddah, D. A. & Jaenisch, R. Mechanisms and models of somatic cell reprogramming. Nat. Rev. Genet. 14, 427–439 (2013).

  2. 2.

    Apostolou, E. & Hochedlinger, K. Chromatin dynamics during cellular reprogramming. Nature 502, 462–471 (2013).

  3. 3.

    de Laat, W. & Duboule, D. Topology of mammalian developmental enhancers and their regulatory landscapes. Nature 502, 499–506 (2013).

  4. 4.

    Gorkin, D. U., Leung, D. & Ren, B. The 3D genome in transcriptional regulation and pluripotency. Cell Stem Cell 14, 762–775 (2014).

  5. 5.

    Dekker, J. & Mirny, L. The 3D genome as moderator of chromosomal communication. Cell 164, 1110–1121 (2016).

  6. 6.

    Denker, A. & de Laat, W. The second decade of 3C technologies: detailed insights into nuclear organization. Genes Dev. 30, 1357–1382 (2016).

  7. 7.

    Dixon, J. R., Gorkin, D. U. & Ren, B. Chromatin domains: the unit of chromosome organization. Mol. Cell 62, 668–680 (2016).

  8. 8.

    Cavalli, G. & Misteli, T. Functional implications of genome topology. Nat. Struct. Mol. Biol. 20, 290–299 (2013).

  9. 9.

    Boettiger, A. N. et al. Super-resolution imaging reveals distinct chromatin folding for different epigenetic states. Nature 529, 418–422 (2016).

  10. 10.

    Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).

  11. 11.

    Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).

  12. 12.

    Wang, S. et al. Spatial organization of chromatin domains and compartments in single chromosomes. Science 353, 598–602 (2016).

  13. 13.

    Vieux-Rochas, M., Fabre, P. J., Leleu, M., Duboule, D. & Noordermeer, D. Clustering of mammalian Hox genes with other H3K27me3 targets within an active nuclear domain. Proc. Natl. Acad. Sci. USA 112, 4672–4677 (2015).

  14. 14.

    Lin, Y. C. et al. Global changes in the nuclear positioning of genes and intra- and interdomain genomic interactions that orchestrate B cell fate. Nat. Immunol. 13, 1196–1204 (2012).

  15. 15.

    Stevens, T. J. et al. 3D structures of individual mammalian genomes studied by single-cell Hi-C. Nature 544, 59–64 (2017).

  16. 16.

    Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).

  17. 17.

    Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).

  18. 18.

    Sexton, T. et al. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472 (2012).

  19. 19.

    Lupiáñez, D. G. et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015).

  20. 20.

    Symmons, O. et al. Functional and topological characteristics of mammalian regulatory domains. Genome Res. 24, 390–400 (2014).

  21. 21.

    Andrey, G. et al. A switch between topological domains underlies HoxD genes collinearity in mouse limbs. Science 340, 1234167 (2013).

  22. 22.

    Montavon, T. et al. A regulatory archipelago controls Hox genes transcription in digits. Cell 147, 1132–1145 (2011).

  23. 23.

    Symmons, O. et al. The Shh topological domain facilitates the action of remote enhancers by reducing the effects of genomic distances. Dev. Cell 39, 529–543 (2016).

  24. 24.

    Deng, W. et al. Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell 149, 1233–1244 (2012).

  25. 25.

    Hug, C. B., Grimaldi, A. G., Kruse, K. & Vaquerizas, J. M. Chromatin architecture emerges during zygotic genome activation independent of transcription. Cell 169, 216–228.e19 (2017).

  26. 26.

    Ke, Y. et al. 3D chromatin structures of mature gametes and structural reprogramming during mammalian embryogenesis. Cell 170, 367–381.e20 (2017).

  27. 27.

    Du, Z. et al. Allelic reprogramming of 3D chromatin architecture during early mammalian development. Nature 547, 232–235 (2017).

  28. 28.

    Apostolou, E. et al. Genome-wide chromatin interactions of the Nanog locus in pluripotency, differentiation, and reprogramming. Cell Stem Cell 12, 699–712 (2013).

  29. 29.

    Ghavi-Helm, Y. et al. Enhancer loops appear stable during development and are associated with paused polymerase. Nature 512, 96–100 (2014).

  30. 30.

    Soufi, A., Donahue, G. & Zaret, K. S. Facilitators and impediments of the pluripotency reprogramming factors’ initial engagement with the genome. Cell 151, 994–1004 (2012).

  31. 31.

    Di Stefano, B. et al. C/EBPα poises B cells for rapid reprogramming into induced pluripotent stem cells. Nature 506, 235–239 (2014).

  32. 32.

    Di Stefano, B. et al. C/EBPα creates elite cells for iPSC reprogramming by upregulating Klf4 and increasing the levels of Lsd1 and Brd4. Nat. Cell Biol. 18, 371–381 (2016).

  33. 33.

    Buganim, Y. et al. Single-cell expression analyses during cellular reprogramming reveal an early stochastic and a late hierarchic phase. Cell 150, 1209–1222 (2012).

  34. 34.

    Bar-Nur, O. et al. Small molecules facilitate rapid and synchronous iPSC generation. Nat. Methods 11, 1170–1176 (2014).

  35. 35.

    Ruetz, T. & Kaji, K. Routes to induced pluripotent stem cells. Curr. Opin. Genet. Dev. 28, 38–42 (2014).

  36. 36.

    Schmidl, C., Rendeiro, A. F., Sheffield, N. C. & Bock, C. ChIPmentation: fast, robust, low-input ChIP-seq for histones and transcription factors. Nat. Methods 12, 963–965 (2015).

  37. 37.

    Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).

  38. 38.

    Heinz, S., Romanoski, C. E., Benner, C. & Glass, C. K. The selection and function of cell type-specific enhancers. Nat. Rev. Mol. Cell Biol. 16, 144–154 (2015).

  39. 39.

    Whyte, W. A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307–319 (2013).

  40. 40.

    Nutt, S. L. & Kee, B. L. The transcriptional regulation of B cell lineage commitment. Immunity 26, 715–725 (2007).

  41. 41.

    Martello, G. & Smith, A. The nature of embryonic stem cells. Annu. Rev. Cell Dev. Biol. 30, 647–675 (2014).

  42. 42.

    Blinka, S., Reimer, M. H. Jr., Pulakanti, K. & Rao, S. Super-enhancers at the Nanog locus differentially regulate neighboring pluripotency-associated genes. Cell Rep. 17, 19–28 (2016).

  43. 43.

    Crane, E. et al. Condensin-driven remodelling of X chromosome topology during dosage compensation. Nature 523, 240–244 (2015).

  44. 44.

    Ong, C. T. & Corces, V. G. CTCF: an architectural protein bridging genome topology and function. Nat. Rev. Genet. 15, 234–246 (2014).

  45. 45.

    Van Bortle, K. et al. Insulator function and topological domain border strength scale with architectural protein occupancy. Genome Biol. 15, R82 (2014).

  46. 46.

    Levasseur, D. N., Wang, J., Dorschner, M. O., Stamatoyannopoulos, J. A. & Orkin, S. H. Oct4 dependence of chromatin structure within the extended Nanog locus in ES cells. Genes Dev. 22, 575–580 (2008).

  47. 47.

    Li, Y. et al. CRISPR reveals a distal super-enhancer required for Sox2 expression in mouse embryonic stem cells. PLoS One 9, e114485 (2014).

  48. 48.

    Zhou, H. Y. et al. A Sox2 distal enhancer cluster regulates embryonic stem cell differentiation potential. Genes Dev. 28, 2699–2711 (2014).

  49. 49.

    Krijger, P. H. et al. Cell-of-origin-specific 3D genome structure acquired during somatic cell reprogramming. Cell Stem Cell 18, 597–610 (2016).

  50. 50.

    Dixon, J. R. et al. Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331–336 (2015).

  51. 51.

    de Wit, E. et al. The pluripotent genome in three dimensions is shaped around pluripotency factors. Nature 501, 227–231 (2013).

  52. 52.

    Meshorer, E. et al. Hyperdynamic plasticity of chromatin proteins in pluripotent embryonic stem cells. Dev. Cell 10, 105–116 (2006).

  53. 53.

    Pasque, V. & Plath, K. X chromosome reactivation in reprogramming and in development. Curr. Opin. Cell Biol. 37, 75–83 (2015).

  54. 54.

    Giorgetti, L. et al. Structural organization of the inactive X chromosome in the mouse. Nature 535, 575–579 (2016).

  55. 55.

    Deng, X. et al. Bipartite structure of the inactive mouse X chromosome. Genome Biol. 16, 152 (2015).

  56. 56.

    Minajigi, A. et al. A comprehensive Xist interactome reveals cohesin repulsion and an RNA-directed chromosome conformation. Science 349, aab2276 (2015).

  57. 57.

    Schoenfelder, S. et al. Preferential associations between co-regulated genes reveal a transcriptional interactome in erythroid cells. Nat. Genet. 42, 53–61 (2010).

  58. 58.

    Liu, Z. et al. 3D imaging of Sox2 enhancer clusters in embryonic stem cells. eLife 3, e04236 (2014).

  59. 59.

    Therizols, P. et al. Chromatin decondensation is sufficient to alter nuclear organization in embryonic stem cells. Science 346, 1238–1242 (2014).

  60. 60.

    Wijchers, P. J. et al. Cause and consequence of tethering a subTAD to different nuclear compartments. Mol. Cell 61, 461–473 (2016).

  61. 61.

    Pope, B. D. et al. Topologically associating domains are stable units of replication-timing regulation. Nature 515, 402–405 (2014).

  62. 62.

    Zullo, J. M. et al. DNA sequence-dependent compartmentalization and silencing of chromatin at the nuclear lamina. Cell 149, 1474–1487 (2012).

  63. 63.

    Schwarzer, W. et al. Two independent modes of chromosome organization are revealed by cohesin removal. Nature 551, 51–56 (2017).

  64. 64.

    Nora, E. P. et al. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell 169, 930–944.e22 (2017).

  65. 65.

    van den Berg, D. L. et al. An Oct4-centered protein interaction network in embryonic stem cells. Cell Stem Cell 6, 369–381 (2010).

  66. 66.

    Donohoe, M. E., Silva, S. S., Pinter, S. F., Xu, N. & Lee, J. T. The pluripotency factor Oct4 interacts with Ctcf and also controls X-chromosome pairing and counting. Nature 460, 128–132 (2009).

  67. 67.

    Wei, Z. et al. Klf4 organizes long-range chromosomal interactions with the Oct4 locus in reprogramming and pluripotency. Cell Stem Cell 13, 36–47 (2013).

  68. 68.

    Beagrie, R. A. et al. Complex multi-enhancer contacts captured by genome architecture mapping. Nature 543, 519–524 (2017).

  69. 69.

    Carey, B. W., Markoulaki, S., Beard, C., Hanna, J. & Jaenisch, R. Single-gene transgenic mouse strains for reprogramming adult somatic cells. Nat. Methods 7, 56–59 (2010).

  70. 70.

    Boiani, M., Eckardt, S., Schöler, H. R. & McLaughlin, K. J. Oct4 distribution and level in mouse clones: consequences for pluripotency. Genes Dev. 16, 1209–1219 (2002).

  71. 71.

    van Oevelen, C. et al. C/EBPα activates pre-existing and de novo macrophage enhancers during induced pre-B cell transdifferentiation and myelopoiesis. Stem Cell Rep. 5, 232–247 (2015).

  72. 72.

    Stadhouders, R. et al. Multiplexed chromosome conformation capture sequencing for rapid genome-scale high-resolution detection of long-range chromatin interactions. Nat. Protoc. 8, 509–524 (2013).

  73. 73.

    Brouwer, R. W., van den Hout, M. C., van IJcken, W. F., Soler, E. & Stadhouders, R. Unbiased interrogation of 3D genome topology using chromosome conformation capture coupled to high-throughput sequencing (4C-Seq). Methods Mol. Biol. 1507, 199–220 (2017).

  74. 74.

    Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).

  75. 75.

    McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).

  76. 76.

    Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  77. 77.

    Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  78. 78.

    Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  79. 79.

    Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  80. 80.

    Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).

  81. 81.

    Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  82. 82.

    Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).

  83. 83.

    Klein, F. A. et al. FourCSeq: analysis of 4C sequencing data. Bioinformatics 31, 3085–3091 (2015).

  84. 84.

    Serra, F. et al. Automatic analysis and 3D-modelling of Hi-C data using TADbit reveals structural features of the fly chromatin colors. PLoS Comput. Biol. 13, e1005665 (2017).

  85. 85.

    Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  86. 86.

    Ay, F. et al. Identifying multi-locus chromatin contacts in human cells using tethered multiple 3C. BMC Genomics 16, 121 (2015).

  87. 87.

    Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003 (2012).

  88. 88.

    Ribeiro de Almeida, C. et al. The DNA-binding protein CTCF limits proximal Vκ recombination and restricts κ enhancer interactions to the immunoglobulin κ light chain locus. Immunity 35, 501–513 (2011).

  89. 89.

    Chen, X. et al. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106–1117 (2008).

  90. 90.

    Schwickert, T. A. et al. Stage-specific control of early B cell development by the transcription factor Ikaros. Nat. Immunol. 15, 283–293 (2014).

Acknowledgements


We thank D. Higgs, J. Hughes, J. Davies and Z. Duan for advice on Hi-C technology; C. Schmidl for ChIPmentation advice; C. van Oevelen for help with CTCF ChIP–seq; C. Segura for mouse-colony management; T. Tian for bone marrow collection; the CRG Genomics Core Facility and the CRG-CNAG Sequencing Unit for sequencing; and members of the laboratory of T.G. for discussions. This work was supported by the European Research Council under the 7th Framework Programme FP7/2007-2013 (ERC Synergy Grant 4D-Genome, grant agreement 609989 to T.G., G.J.F., M.A.M.-R. and M.B.) and the Ministerio de Educacion y Ciencia, SAF.2012-37167. R.S. was supported by an EMBO Long-term Fellowship (ALTF 1201-2014) and a Marie Curie Individual Fellowship (H2020-MSCA-IF-2014). We also acknowledge support from ‘Centro de Excelencia Severo Ochoa 2013-2017’ (SEV-2012-0208) and AGAUR to the CRG.

Author information

Author notes

  1. Ralph Stadhouders and Enrique Vidal contributed equally to this work.

Affiliations


  1. Gene Regulation, Stem Cells and Cancer Program, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain

    • Ralph Stadhouders, Enrique Vidal, François Serra, Bruno Di Stefano, François Le Dily, Javier Quilez, Antonio Gomez, Clara Berenguer, Yasmina Cuartero, Guillaume J. Filion, Miguel Beato, Marc A. Marti-Renom & Thomas Graf
  2. Universitat Pompeu Fabra (UPF), Barcelona, Spain

    • Ralph Stadhouders, Enrique Vidal, François Serra, Bruno Di Stefano, François Le Dily, Javier Quilez, Antonio Gomez, Clara Berenguer, Yasmina Cuartero, Jochen Hecht, Guillaume J. Filion, Miguel Beato, Marc A. Marti-Renom & Thomas Graf
  3. Structural Genomics Group, CNAG-CRG, BIST, Barcelona, Spain

    • François Serra, François Le Dily, Yasmina Cuartero & Marc A. Marti-Renom
  4. Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University and Harvard Medical School, Cambridge, MA, USA

    • Bruno Di Stefano
  5. Institut de Biologie de l’Ecole Normale Supérieure (IBENS), CNRS UMR8197, INSERM U1024, Paris, France

    • Samuel Collombet
  6. Genomics Unit, CRG, BIST, Barcelona, Spain

    • Jochen Hecht
  7. Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain

    • Marc A. Marti-Renom
  8. Department of Pulmonary Medicine, Erasmus MC, Rotterdam, the Netherlands

    • Ralph Stadhouders


Contributions


R.S. and T.G. conceived the study and wrote the manuscript with input from all coauthors. R.S. performed molecular biology, RNA-seq, ChIP–seq, ChIPmentation, ATAC–seq, 4C–seq and in situ Hi-C experiments. R.S., E.V., F.S., J.Q., A.G., S.C. and M.A.M.-R. performed bioinformatic analyses. R.S., E.V., F.S. and M.A.M.-R. integrated and visualized data. B.D.S. performed reprogramming experiments with help from R.S. and C.B.; R.S., F.L.D. and Y.C. optimized and implemented in situ Hi-C technology. J.H. performed high-throughput sequencing. F.L.D., G.J.F., M.B. and M.A.M.-R. provided valuable advice, and T.G. supervised the research.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Ralph Stadhouders or Marc A. Marti-Renom or Thomas Graf.

Integrated supplementary information

  1. Supplementary Figure 1 Transcriptome and epigenome dynamics during reprogramming.

    (a) Genome browser view of Sall4 gene expression measured by RNA-Seq data (two biological replicates per timepoint). Bar graph insert depicts qRT-PCR measurements of Sall4 expression in two independent biological replicate reprogramming experiments (bars indicate mean values). (b) Scatterplot of RPKM gene expression values (n = 16,332 genes) for biological replicates 1 and 2 (iPS samples shown). (c) Pearson correlation (R²) values between RNA-Seq replicates (n = 16,332 genes) for all timepoints. (d) Genome browser views of the Ctsg and Rag genes with H3K4Me2 ChIPmentation (red) or ATAC-Seq (blue) profiles during reprogramming. Bar graphs below show gene expression dynamics (bars indicate mean values, n = 2). (e) qRT-PCR measurements of Nanog (top) and Sox2 (bottom) expression (mean values, n = 2) using primers that detect both mRNA and primary transcripts. Red rectangles indicate area depicted in smaller zoom-in graph on the right. (f) Normalized genome-wide H3K4Me2 (marking active chromatin) coverage per timepoint. (g) Fraction of Oct4 binding sites in PSCs overlapping with an ATAC-Seq peak (‘ATAC+’) during a representative reprogramming time course. Absolute numbers of sites are shown. (h) ATAC-Seq and (i) H3K4Me2 coverage profiles for Oct4 binding sites in PSCs inside (left, n = 821) and outside (right, n = 31,869) PSC superenhancers (SEs) during reprogramming. Error bars in the figure denote 95% CI.

  2. Supplementary Figure 2 Subnuclear compartmentalization dynamics during reprogramming.

    (a) Representative in-situ Hi-C contact maps (50 kb resolution) of a 22.5 Mb region on chromosome 3. (b) Pearson correlation coefficient (R²) heatmap of PC1 value comparisons between timepoints. (c) Line chart depicting genome fractions assigned to A or B compartments at the different timepoints. Regions that could not be assigned (PC1 = 0, e.g. telomeric regions) are shown in gray. (d) Overall contact enrichment for 100 kb bins within the A (left) or B (middle) compartment or between A and B (right) compartments during reprogramming. (e) Fraction of the genome that switches compartment at any point during the time course. Bar graph depicts switching percentages per timepoint. (f) Overlay of principal component analyses for gene expression (blue) and compartmentalization (red) dynamics reveals similar trajectories. Sample sizes are indicated in Fig. 1d and Fig. 2c. (g) Gene expression changes for genes in bins that switch compartment at any timepoint (n = 2,676 for A-to-B; n = 2,667 for B-to-A) or do not switch (‘stable’, n = 21,027) during reprogramming (*P<2.2e-16, Wilcoxon rank-sum test). (h) Gene ontology terms associated with the two categories of switching genes. (i) Average absolute PC1 score of switching or non-switching (‘stable’) bins as a function of their distance to the nearest A/B compartment border. Sample sizes as in panel g. (j) Average distance to the nearest compartment border of non-switching stable bins divided by the average distance of the two types of switching bins. Switching bins are significantly closer to borders than stable bins at all timepoints (Poisson regression, P<4.97e-31). (k) Cartoon summarizing characteristics of compartment switching dynamics: compartmentalization dynamics are highest in regions of low PC1 and near compartment domain borders. Error bars in all plots denote 95% CI.

  3. Supplementary Figure 3 Relationship between subnuclear compartmentalization and gene expression changes.

    (a,b) Comparison of gene expression and PC1 dynamics for key B cell (panel a) and pluripotency (panel b) genes (n = 25). Genes were grouped into those stably associated with the A compartment (left, n = 10) and those that switch (right, n = 15). Pie charts depict changes in compartment status for these genes during reprogramming. (c,d) Gene expression (top) and PC1 (bottom) kinetics for downregulated genes (<-0.5 log2, panel c) or upregulated genes (>0.5 log2, panel d) between reprogramming endpoints. Genes were grouped into those stably associated with the A compartment (left; n = 6,119 for upregulated genes, n = 6,696 for downregulated genes) and those that switch (right; n = 1,191 for upregulated genes, n = 1,755 for downregulated genes). Grey shading marks first timepoint of significant change (versus B, *P<0.01, Wilcoxon rank-sum test). Boxplots on the right depict the extent of expression change (PSC versus B) for the two groups of genes. (e) Gene expression clusters of genes stably upregulated during reprogramming at different stages. Line graphs on the right depict average kinetics, gray shading marks first timepoint of significant change. (f) Gene expression (top) and PC1 (bottom) kinetics for stably upregulated genes (from two clusters shown in panel e; n = 64 for the left plot, n = 86 for the right plot) that switch compartment preceding transcriptional upregulation. Gray shading indicates timepoint at which switching was completed. (g) Change in PC1 value (relative to B cells) during reprogramming for bins containing PSC superenhancers (n = 262, P = 0.0004, Wilcoxon rank-sum test). Error bars denote SEM.

  4. Supplementary Figure 4 Integrated kinetics of gene expression, compartmentalization and chromatin state.

    (a) Dynamics of average gene expression versus PC1 (left) and H3K4Me2 levels versus PC1 (right) for all 20 individual switching clusters (see Fig. 2g). Arrows indicate timepoints where the correlation between either expression or H3K4Me2 and PC1 is lost. Sample sizes are shown in panel b. (b) Summarized gene ontology (GO) annotation of the 20 switching clusters grouped by switching type and with relationship class (gene expression versus PC1; concom. = concomitant) indicated. Error bars in all plots denote SEM.

  5. Supplementary Figure 5 TAD dynamics during reprogramming.

    (a) Number of TAD borders identified per timepoint for each biological replicate. (b) TAD border reproducibility between replicates as measured by the Jaccard index. (c) Average enrichment of Ctcf and transcription start sites (TSS) at borders (compared to their genome-wide distribution, n = 3,100) in two replicate datasets for B cells and PSCs (*P<2.2e-16, Wilcoxon rank-sum test). (d) In-situ Hi-C contact maps (50 kb resolution) centered on TAD border 999, which is progressively lost during reprogramming. Black arrows indicate position of TAD border calls per timepoint. Bar graphs on the right show insulation score (I-score) values for both independent biological replicates. (e) Number of TAD borders reproducibly called per timepoint. Invariant borders were present at all timepoints; variable borders were lost/acquired during reprogramming. (f) Boxplots showing TAD size distributions (n = 3,100) during reprogramming. (g) Genome browser view of the Sox2 locus (2.2 Mb region, centered on Sox2) with H3K4Me2 (red) and Ctcf binding (dark grey, peaks indicated by black rectangles) dynamics indicated below. Sox2 gene location is indicated in blue on top, superenhancer (SE) position in red and neighboring genes as black rectangles. (h) Boxplots showing gene expression (RPKM) dynamics for border regions that were acquired (‘gained’, top) or decommissioned (‘lost’, bottom) during reprogramming. Borders were further grouped according to the timepoint at which they appeared/disappeared. Very few borders are lost at the Bα and D2 stages, leaving <5 genes available for downstream analysis; these analyses were therefore omitted. (i) Proportion of borders where gene expression is on average upregulated, downregulated or not changed. Borders were separated based on whether they were gained or lost during reprogramming. (j) Gene expression dynamics at transcriptionally modulated border regions (divided in up or downregulated groups per timepoint) gained or lost during reprogramming (#P<0.1, *P<0.05 versus B cells; unpaired two-tailed t-test).

  6. Supplementary Figure 6 Cell-type-specific genes reside near dynamic TAD borders.

    (a) Insulation score (I-score) dynamics for TAD borders stably gained (n = 431), lost (n = 124) or invariant (n = 2,185) during reprogramming. Error bars denote 95% CI; percentages indicate the proportion of all borders that belong to the various classes. (b) Boxplots depicting I-score of borders harboring no Ctcf sites, 1-5 Ctcf sites or >5 Ctcf sites for indicated timepoints. (c) Meta-border plots for all borders that gain I-score, do not change I-score or lose I-score. (d) Principal component analysis (PCA) and unsupervised hierarchical clustering of I-score values (n = 3,100). (e) Boxplot showing the average distance of pluripotency genes (n = 25) or all other genes (n = 16,307) to the nearest TAD border. (f) Gene ontology terms significantly associated with genes found within dynamic (top) or stable (bottom) border regions in both independent biological replicates.

  7. Supplementary Figure 7 Insulation strength changes precede Nanog and Sox2 activation.

    (a) Conventional 4C-Seq analysis (representative experiment shown) of the Dppa3-Nanog locus at early reprogramming timepoints using the Nanog promoter as a viewpoint. Border region defined by Hi-C and superenhancer (SE) are indicated in blue. (b) 2.25 Mb in-situ Hi-C contact maps (50 kb resolution) centered on the Sox2 gene and its superenhancer (SE) for both independent biological replicate reprogramming experiments. TAD border calls per timepoint are indicated by black arrows. Note the progressive insulation of Sox2 and its SE into a smaller domain as the gene is activated (indicated by a black arrow in the PSC maps). (c-e) Kinetics of mean H3K4Me2 (panel c), D-score (panel d) and PC1 (panel e) changes at dynamic borders harboring genes that are either upregulated (n = 22) or downregulated (n = 21) after I-score changes are initiated. Shading denotes SEM.

  8. Supplementary Figure 8 TAD connectivity dynamics during reprogramming.

    (a) Average expression of genes (plotted as an expression percentile) in TADs (n = 1,664) having a low (-0.26;-0.02), average (-0.02;0.1) or high (0.1;0.6) relative domain score (D-score). (b) Boxplots showing relative D-score values for TADs in the A (n = 953-1,039) or B (n = 1,141-1,227) compartment at each timepoint. Statistical significance was assessed using a Wilcoxon rank-sum test. (c) Percentage of expression variance explained by TADs (relative to a linear model, see Supplemental Materials for a detailed explanation) for each timepoint. (d) Collection of all PCA trajectories generated in this study. Points denote average data from two biological replicates. (e) Average D-score and PC1 kinetics during reprogramming for clusters of TADs that gain (n = 705, left) or lose (n = 869, right) D-score. Pearson correlation coefficients (R) are indicated. (f) Top dynamic TADs gaining (n = 279, upper half) or losing (n = 252, lower half) D-score during reprogramming. Line graphs show mean PC1 values for switching and non-switching TADs. Percentages of A-to-B and B-to-A switching for both groups of TADs are depicted by triangles. Tables show selected gene ontology (GO) terms for the genes within the corresponding TADs. (g) Fraction of TADs that switch compartment in groups of TADs with low (0-0.02), average (0.02-0.07) or high (>0.07) absolute changes in D-score for both independent biological replicate experiments. Error bars in all plots denote SEM.

  9. Supplementary Figure 9 Chromatin-loop dynamics during reprogramming.

    (a) Meta-loop analysis at 5 kb resolution of B cell or PSC loops (ref. 49). Area shown is centered on the respective TF binding sites (+/- 50 kb). (b) Gene ontology (GO) annotation of the genes within B cell (left) or PSC (right) specific loops. In-situ Hi-C data from two independent biological replicate reprogramming experiments were pooled for these analyses.

  10. Supplementary Figure 10 Transcription-factor dynamics at hotspots of topological change.

    (a) Compartment switching induced by C/EBPα (B-to-Bα, top) or OSKM (Bα-to-D2, bottom). Line graphs depict average expression changes of genes located in regions that have stably switched. (b) GO annotation of genes (n = 358) that stably switch B-to-A compartment at the Bα-D2 transition. (c) Average gene expression changes of the genes (n = 10) associated with the gene ontology (GO) term ‘Embryo Development’ that stably switch B-to-A compartment at the B-Bα transition. (d) Klf4 binding enrichment (over the genome-wide average) at the 20 switching clusters shown in Fig.2g. Mean values with 95% CI are shown. (e) Percentage of TAD border regions (n = 3,100) bound by Oct4 at each timepoint. (f) Oct4 (left) and Klf4 (right) enrichment kinetics at border regions that are already targeted by these factors at D2 (n = 37 for Oct4, n = 22 for Klf4) or border regions not yet targeted at D2 (n = 147 for Oct4, n = 162 for Klf4). (g) C/EBPα (left) or Oct4 (right) enrichment at border regions bound by indicated transcription factors at the earliest timepoint (n = 123 for C/EBPα, n = 37 for Oct4) or unbound regions (n = 61 for C/EBPα, n = 147 for Oct4). Mean values +/- SD are indicated, as well as individual data points. P values were calculated using a Wilcoxon rank-sum test. (h) Venn diagram showing the overlap between the number of dynamic borders bound by Oct4 (at D2), Klf4 (at D2) and C/EBPα (at Bα). (i) Kinetics of key transcriptional, epigenomic and topological events during somatic cell reprogramming. Light-to-dark color intensity range signifies quantitative differences. Ect., ectopic; chr., chromosome.

  11. Supplementary Figure 11 Replication timing and genome compartmentalization dynamics.

    (a) Comparison of A/B compartmentalization in B cells (representative experiment shown) and replication timing in a B cell line (CH12 Repli-chip data obtained from the ENCODE consortium) for chromosome 2. Note the extremely high correlation between positive PC1 values (i.e. A compartment domains) and positive replication timing signal (i.e. early replication timing domains). (b) Residence time of switching 100kb genomic bins in either the A or B compartment (as measured in timepoints during reprogramming). See Methods for a detailed description of the analysis procedure employed.
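
The compartment legends above rely on the sign of the first principal component (PC1) of the Hi-C data: positive PC1 marks the A compartment, negative PC1 the B compartment, and PC1 = 0 marks bins that could not be assigned. A minimal Python sketch of that convention, and of flagging bins that switch compartment at any point in a time course, is given below. It is illustrative only and is not the authors' pipeline: the function names, the toy PC1 values and the decision to ignore unassigned timepoints when calling a switch are all assumptions.

```python
from typing import Dict, List


def call_compartment(pc1: float) -> str:
    """Assign a bin to 'A' (PC1 > 0), 'B' (PC1 < 0) or 'NA' (PC1 == 0)."""
    if pc1 > 0:
        return "A"
    if pc1 < 0:
        return "B"
    return "NA"


def switching_bins(pc1_by_timepoint: Dict[str, List[float]]) -> List[int]:
    """Return indices of bins called 'A' at one timepoint and 'B' at another.

    Assumption: unassigned ('NA') timepoints are ignored, so a bin only
    counts as switching if it is seen in both compartments.
    """
    timepoints = list(pc1_by_timepoint)
    n_bins = len(pc1_by_timepoint[timepoints[0]])
    switching = []
    for i in range(n_bins):
        states = {call_compartment(pc1_by_timepoint[t][i]) for t in timepoints}
        if {"A", "B"} <= states:
            switching.append(i)
    return switching


# Toy PC1 values for four genomic bins at three timepoints (made-up numbers).
pc1 = {
    "B":   [0.8, -0.5, -0.2, 0.0],
    "D2":  [0.7, -0.4,  0.1, 0.0],
    "PSC": [0.9, -0.6,  0.3, 0.0],
}

switched = switching_bins(pc1)
fraction = len(switched) / len(pc1["B"])
print(switched, fraction)  # [2] 0.25
```

In this toy input only the third bin moves from B to A, so one of four bins (25%) is reported as switching; a real analysis would operate on genome-wide PC1 tracks per timepoint and replicate.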

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–11

  2. Life Sciences Reporting Summary

  3. Supplementary Table 1

    In-situ Hi-C dataset statistics.

  4. Supplementary Table 2

    Key cell identity genes for B lymphocytes and pluripotent stem cells.