Orphan CpG islands amplify poised enhancer regulatory activity and determine target gene responsiveness

Abstract

CpG islands (CGIs) represent a widespread feature of vertebrate genomes, being associated with ~70% of all gene promoters. CGIs control transcription initiation by conferring nearby promoters with unique chromatin properties. In addition, there are thousands of distal or orphan CGIs (oCGIs) whose functional relevance is barely known. Here we show that oCGIs are an essential component of poised enhancers that augment their long-range regulatory activity and control the responsiveness of their target genes. Using a knock-in strategy in mouse embryonic stem cells, we introduced poised enhancers with or without oCGIs within topologically associating domains harboring genes with different types of promoters. Analysis of the resulting cell lines revealed that oCGIs act as tethering elements that promote the physical and functional communication between poised enhancers and distally located genes, particularly those with large CGI clusters in their promoters. Therefore, by acting as genetic determinants of gene–enhancer compatibility, CGIs can contribute to gene expression control under both physiological and potentially pathological conditions.

Fig. 1: Genetic properties and functional relevance of oCGIs associated with PEs.
Fig. 2: Modular engineering of PEs reveals major regulatory functions for oCGIs.
Fig. 3: Characterization of the epigenetic, topological and regulatory features of the PE Sox1(+35) modules engineered within the Gata6-TAD.
Fig. 4: Genes with CpG-poor promoters do not show long-range responsiveness to PEs.
Fig. 5: Promoters with large CGI clusters are particularly responsive to distal PEs.
Fig. 6: oCGIs and TAD boundaries enable PEs to specifically induce their target genes.
Fig. 7: Proposed model for the role of oCGIs as amplifiers of PE regulatory activity and determinants of PE–gene compatibility.

Data availability

All the 4C–seq data generated in this study are available through the GEO (GSE156465). All the generated transgenic ESC lines are available upon request.

References

  1. Spitz, F. & Furlong, E. E. M. Transcription factors: from enhancer binding to developmental control. Nat. Rev. Genet. 13, 613–626 (2012).

  2. Kvon, E. Z. Using transgenic reporter assays to functionally characterize enhancers in animals. Genomics 106, 185–192 (2015).

  3. Furlong, E. E. M. & Levine, M. Developmental enhancers and chromosome topology. Science 361, 1341–1345 (2018).

  4. Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).

  5. Laugsch, M. et al. Modeling the pathological long-range regulatory effects of human structural variation with patient-specific hiPSCs. Cell Stem Cell 24, 736–752.e12 (2019).

  6. Rao, S. S. P. et al. Cohesin loss eliminates all loop domains. Cell 171, 305–320.e24 (2017).

  7. Nora, P. et al. Targeted degradation of CTCF decouples local insulation of chromosome domains from genomic compartmentalization. Cell 169, 930–944 (2017).

  8. Ghavi-Helm, Y. et al. Highly rearranged chromosomes reveal uncoupling between genome topology and gene expression. Nat. Genet. 51, 1272–1282 (2019).

  9. Kraft, K. et al. Serial genomic inversions induce tissue-specific architectural stripes, gene misexpression and congenital malformations. Nat. Cell Biol. 21, 305–310 (2019).

  10. Kikuta, H. et al. Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates. Genome Res. 17, 545–555 (2007).

  11. Arnold, C. D. et al. Genome-wide assessment of sequence-intrinsic enhancer responsiveness at single-base-pair resolution. Nat. Biotechnol. 35, 136–144 (2016).

  12. Haberle, V. et al. Transcriptional cofactors display specificity for distinct types of core promoters. Nature 570, 122–126 (2019).

  13. Spielmann, M., Lupiáñez, D. G. & Mundlos, S. Structural variation in the 3D genome. Nat. Rev. Genet. 19, 453–467 (2018).

  14. Cruz-Molina, S. et al. PRC2 facilitates the regulatory topology required for poised enhancer function during pluripotent stem cell differentiation. Cell Stem Cell 20, 689–705.e9 (2017).

  15. Rada-Iglesias, A. et al. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283 (2011).

  16. Deaton, A. M. & Bird, A. CpG islands and the regulation of transcription. Genes Dev. 25, 1010–1022 (2011).

  17. Bell, J. S. K. & Vertino, P. M. Orphan CpG islands define a novel class of highly active enhancers. Epigenetics 12, 449–464 (2017).

  18. Illingworth, R. S. et al. Orphan CpG islands identify numerous conserved promoters in the mammalian genome. PLoS Genet. 6, e1001134 (2010).

  19. Steinhaus, R., Gonzalez, T., Seelow, D. & Robinson, P. N. Pervasive and CpG-dependent promoter-like characteristics of transcribed enhancers. Nucleic Acids Res. 48, 5306–5317 (2020).

  20. Bogdanović, O. et al. Active DNA demethylation at enhancers during the vertebrate phylotypic period. Nat. Genet. 48, 417–426 (2016).

  21. Long, H. K. et al. Epigenetic conservation at gene regulatory elements revealed by non-methylated DNA profiling in seven vertebrates. eLife 2, e00348 (2013).

  22. Lenhard, B., Sandelin, A. & Carninci, P. Metazoan promoters: emerging characteristics and insights into transcriptional regulation. Nat. Rev. Genet. 13, 233–245 (2012).

  23. Williams, K. et al. TET1 and hydroxymethylcytosine in transcription and DNA methylation fidelity. Nature 473, 343–349 (2011).

  24. Blackledge, N. P. et al. Variant PRC1 complex-dependent H2A ubiquitylation drives PRC2 recruitment and polycomb domain formation. Cell 157, 1445–1459 (2014).

  25. Aljazi, M. B., Gao, Y., Wu, Y., Mias, G. I. & He, J. Cell signaling coordinates global PRC2 recruitment and developmental gene expression in murine embryonic stem cells. iScience 23, 101646 (2020).

  26. Habibi, E. et al. Whole-genome bisulfite sequencing of two distinct interconvertible DNA methylomes of mouse embryonic stem cells. Cell Stem Cell 13, 360–369 (2013).

  27. Zylicz, J. J. et al. Chromatin dynamics and the role of G9a in gene regulation and enhancer silencing during early mouse development. eLife 4, e09571 (2015).

  28. Lee, S. M. et al. Intragenic CpG islands play important roles in bivalent chromatin assembly of developmental genes. Proc. Natl Acad. Sci. USA 114, E1885–E1894 (2017).

  29. Bolt, C. C. & Duboule, D. The regulatory landscapes of developmental genes. Development 147, dev171736 (2020).

  30. Blackledge, N. P. & Klose, R. CpG island chromatin. Epigenetics 2294, 147–152 (2011).

  31. Turberfield, A. H. et al. KDM2 proteins constrain transcription from CpG island gene promoters independently of their histone demethylase activity. Nucleic Acids Res. 47, 9005–9023 (2019).

  32. Arab, K. et al. GADD45A binds R-loops and recruits TET1 to CpG island promoters. Nat. Genet. 51, 217–223 (2019).

  33. Diez, R. & Storey, K. G. Markers in vertebrate neurogenesis. Nat. Rev. Neurosci. 2, 835–839 (2001).

  34. Bentovim, L., Harden, T. T. & DePace, A. H. Transcriptional precision and accuracy in development: from measurements to models and mechanisms. Development 144, 3855–3866 (2017).

  35. Boyes, J. & Bird, A. DNA methylation inhibits transcription indirectly via a methyl-CpG binding protein. Cell 64, 1123–1134 (1991).

  36. Klemm, S. L., Shipony, Z. & Greenleaf, W. J. Chromatin accessibility and the regulatory epigenome. Nat. Rev. Genet. 20, 207–220 (2019).

  37. You, J. S. et al. OCT4 establishes and maintains nucleosome-depleted regions that provide additional layers of epigenetic regulation of its target genes. Proc. Natl Acad. Sci. USA 108, 14497–14502 (2011).

  38. Stadler, M. B. et al. DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature 480, 490–495 (2011).

  39. Kim, T.-K. et al. Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182–187 (2010).

  40. Mas, G. & Di Croce, L. The role of Polycomb in stem cell genome architecture. Curr. Opin. Cell Biol. 43, 87–95 (2016).

  41. Yan, J. et al. Histone H3 lysine 4 monomethylation modulates long-range chromatin interactions at enhancers. Cell Res. 28, 204–220 (2018).

  42. Denholtz, M. et al. Long-range chromatin contacts in embryonic stem cells reveal a role for pluripotency factors and polycomb proteins in genome organization. Cell Stem Cell 13, 602–616 (2013).

  43. Wang, J. et al. A protein interaction network for pluripotency of embryonic stem cells. Nature 444, 364–368 (2006).

  44. Pachano, T., Crispatzu, G. & Rada-Iglesias, A. Polycomb proteins as organizers of 3D genome architecture in embryonic stem cells. Brief. Funct. Genomics 18, 358–366 (2019).

  45. Bantignies, F. et al. Polycomb-dependent regulatory contacts between distant Hox loci in Drosophila. Cell 144, 214–226 (2011).

  46. Isono, K. et al. SAM domain polymerization links subnuclear clustering of PRC1 to gene silencing. Dev. Cell 26, 565–577 (2013).

  47. Loubiere, V., Papadopoulos, G. L., Szabo, Q., Martinez, A. M. & Cavalli, G. Widespread activation of developmental gene expression characterized by PRC1-dependent chromatin looping. Sci. Adv. 6, eaax4001 (2020).

  48. Benabdallah, N. S. et al. Decreased enhancer-promoter proximity accompanying enhancer activation. Mol. Cell 76, 473–484 (2019).

  49. Lim, B., Heist, T., Levine, M. & Fukaya, T. Visualization of transvection in living Drosophila embryos. Mol. Cell 70, 287–296.e6 (2018).

  50. Beck, S. et al. Implications of CpG islands on chromosomal architectures and modes of global gene regulation. Nucleic Acids Res. 46, 4382–4391 (2018).

  51. Liu, S. et al. From 1D sequence to 3D chromatin dynamics and cellular functions: a phase separation perspective. Nucleic Acids Res. 46, 9367–9383 (2018).

  52. Kurup, J. T., Han, Z., Jin, W. & Kidder, B. L. H4K20me3 methyltransferase SUV420H2 shapes the chromatin landscape of pluripotent embryonic stem cells. Development 147, dev188516 (2020).

  53. Andersson, R., Sandelin, A. & Danko, C. G. A unified architecture of transcriptional regulatory elements. Trends Genet. 31, 426–433 (2015).

  54. Lloret-Llinares, M. et al. The RNA exosome contributes to gene expression regulation during stem cell differentiation. Nucleic Acids Res. 46, 11502–11513 (2018).

  55. Local, A. et al. Identification of H3K4me1-associated proteins at mammalian enhancers. Nat. Genet. 50, 73–82 (2018).

  56. Etchegaray, J. P. et al. The histone deacetylase SIRT6 restrains transcription elongation via promoter-proximal pausing. Mol. Cell 75, 683–699 (2019).

  57. Hirabayashi, S. et al. NET-CAGE characterizes the dynamics and topology of human transcribed cis-regulatory elements. Nat. Genet. 51, 1369–1379 (2019).

  58. Schoenfelder, S. et al. Polycomb repressive complex PRC1 spatially constrains the mouse embryonic stem cell genome. Nat. Genet. 47, 1179–1186 (2015).

  59. Butler, J. E. F. & Kadonaga, J. T. Enhancer–promoter specificity mediated by DPE or TATA core promoter motifs. Genes Dev. 15, 2515–2519 (2001).

  60. Gómez-Marín, C. et al. Evolutionary comparison reveals that diverging CTCF sites are signatures of ancestral topological associating domains borders. Proc. Natl Acad. Sci. USA 112, 7542–7547 (2015).

  61. O’Brien, L. L. et al. Transcriptional regulatory control of mammalian nephron progenitors revealed by multi-factor cistromic analysis and genetic studies. PLoS Genet. 14, e1007181 (2018).

  62. Catarino, R. R. & Stark, A. Assessing sufficiency and necessity of enhancer activities for gene expression and the mechanisms of transcription activation. Genes Dev. 32, 202–223 (2018).

  63. Lupiáñez, D. G. et al. Disruptions of topological chromatin domains cause pathogenic rewiring of gene-enhancer interactions. Cell 161, 1012–1025 (2015).

  64. Kragesteen, B. K. et al. Dynamic 3D chromatin architecture contributes to enhancer specificity and limb morphogenesis. Nat. Genet. 50, 1463–1473 (2018).

  65. Li, X. & Noll, M. Compatibility between enhancers and promoters determines the transcriptional specificity of gooseberry and gooseberry neuro in the Drosophila embryo. EMBO J. 13, 400–406 (1994).

  66. Zabidi, M. A. et al. Enhancer-core-promoter specificity separates developmental and housekeeping gene regulation. Nature 518, 556–559 (2015).

  67. Mahmoudi, T., Katsani, K. R. & Verrijzer, C. P. GAGA can mediate enhancer function in trans by linking two separate DNA molecules. EMBO J. 21, 1775–1781 (2002).

  68. Calhoun, V. C. & Levine, M. Long-range enhancer-promoter interactions in the Scr-Antp interval of the Drosophila Antennapedia complex. Proc. Natl Acad. Sci. USA 100, 9878–9883 (2003).

  69. Calhoun, V. C., Stathopoulos, A. & Levine, M. Promoter-proximal tethering elements regulate enhancer-promoter specificity in the Drosophila Antennapedia complex. Proc. Natl Acad. Sci. USA 99, 9243–9247 (2002).

  70. Boyle, S. et al. A central role for canonical PRC1 in shaping the 3D nuclear landscape. Genes Dev. 34, 931–949 (2020).

  71. Perino, M. et al. MTF2 recruits Polycomb Repressive Complex 2 by helical-shape-selective DNA binding. Nat. Genet. 50, 1002–1010 (2018).

  72. Beltran, M. et al. The interaction of PRC2 with RNA or chromatin is mutually antagonistic. Genome Res. 26, 896–907 (2016).

  73. Crispatzu, G. et al. The chromatin, topological and regulatory properties of pluripotency-associated poised enhancers are conserved in vivo. Preprint at bioRxiv https://doi.org/10.1101/2021.01.18.427085 (2021).

  74. Shrinivas, K. et al. Enhancer features that drive formation of transcriptional condensates. Mol. Cell 75, 549–561.e7 (2019).

  75. Dimitrova, E. et al. FBXL19 recruits CDK-Mediator to CpG islands of developmental genes priming them for activation during lineage commitment. eLife 7, e37084 (2018).

  76. Long, H. K., Blackledge, N. P. & Klose, R. J. ZF-CxxC domain-containing proteins, CpG islands and the chromatin connection. Biochem. Soc. Trans. 41, 727–740 (2013).

  77. Mastrangelo, I. A., Courey, A. J., Wall, J. S., Jackson, S. P. & Hough, P. V. C. DNA looping and Sp1 multimer links: a mechanism for transcriptional synergism and enhancement. Proc. Natl Acad. Sci. USA 88, 5670–5674 (1991).

  78. Su, W., Jackson, S., Tjian, R. & Echols, H. DNA looping between sites for transcriptional activation: self-association of DNA-bound Sp1. Genes Dev. 5, 820–826 (1991).

  79. Hartl, D. et al. CG dinucleotides enhance promoter activity independent of DNA methylation. Genome Res. 29, 554–563 (2019).

  80. Wang, Y. et al. The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol. 19, 151 (2018).

  81. Bonev, B. et al. Multiscale 3D genome rewiring during mouse neural development. Cell 171, 557–572.e24 (2017).

  82. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).

  83. Liu, T. et al. Cistrome: an integrative platform for transcriptional regulation studies. Genome Biol. 12, R83 (2011).

  84. Gouti, M. et al. In vitro generation of neuromesodermal progenitors reveals distinct roles for wnt signalling in the specification of spinal cord and paraxial mesoderm identity. PLoS Biol. 12, e1001937 (2014).

  85. Matsuda, K. & Kondoh, H. Dkk1-dependent inhibition of Wnt signaling activates Hesx1 expression through its 5′ enhancer and directs forebrain precursor development. Genes Cells 19, 374–385 (2014).

  86. Yao, X. et al. Tild-CRISPR allows for efficient and precise gene knockin in mouse and human cells. Dev. Cell 45, 526–536.e5 (2018).

  87. Giresi, P. G., Kim, J., McDaniell, R. M., Iyer, V. R. & Lieb, J. D. FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin. Genome Res. 17, 877–885 (2007).

  88. Requena, F. et al. NOMePlot: analysis of DNA methylation and nucleosome occupancy at the single molecule. Sci. Rep. 9, 8140 (2019).

  89. Gardiner-Garden, M. & Frommer, M. CpG islands in vertebrate genomes. J. Mol. Biol. 196, 261–282 (1987).

  90. Madeira, F. et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 47, W636–W641 (2019).

  91. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  92. Karolchik, D. et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 32, 493–496 (2004).

  93. Ewels, P., Magnusson, M., Lundin, S. & Käller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048 (2016).

  94. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  95. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).

  96. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  97. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  98. Feng, J., Liu, T., Qin, B., Zhang, Y. & Liu, X. S. Identifying ChIP-seq enrichment using MACS. Nat. Protoc. 7, 1728–1740 (2012).

  99. Pagès, H. BSgenome: software infrastructure for efficient representation of full genomes and their SNPs. R package version 1.56.0 (2020).

  100. Wang, J. et al. Nascent RNA sequencing analysis provides insights into enhancer-mediated gene regulation. BMC Genomics 19, 633 (2018).

  101. Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).

  102. Cliff, N. Dominance statistics: ordinal analyses to answer ordinal questions. Psychol. Bull. 114, 494–509 (1993).

  103. Macbeth, G., Razumiejczyk, E. & Ledesma, R. D. Cliff's Delta Calculator: a non-parametric effect size program for two groups of observations. Univ. Psychol. 10, 545–555 (2011).

  104. Bush, S. J., McCulloch, M. E. B., Summers, K. M., Hume, D. A. & Clark, E. L. Integration of quantitated expression estimates from polyA-selected and rRNA-depleted RNA-seq libraries. BMC Bioinformatics 18, 301 (2017).

  105. Abdennur, N. & Mirny, L. A. Cooler: scalable storage for Hi-C data and other genomically labeled arrays. Bioinformatics 36, 311–316 (2020).

  106. Flyamer, I. M., Illingworth, R. S. & Bickmore, W. A. Coolpup.py: versatile pile-up analysis of Hi-C data. Bioinformatics 36, 2980–2985 (2020).

  107. Bailey, T. L. et al. MEME Suite: tools for motif discovery and searching. Nucleic Acids Res. 37, 202–208 (2009).

  108. Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for bisulfite-seq applications. Bioinformatics 27, 1571–1572 (2011).

  109. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  110. Kent, W. J., Zweig, A. S., Barber, G., Hinrichs, A. S. & Karolchik, D. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26, 2204–2207 (2010).

  111. Zhao, H. et al. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics 30, 1006–1007 (2014).

  112. Pope, B. D. et al. Topologically associating domains are stable units of replication-timing regulation. Nature 515, 402–405 (2014).

Acknowledgements

We thank the Rada-Iglesias laboratory members for insightful comments and critical reading of the manuscript. T.P. is supported by a doctoral fellowship from the DAAD (Germany). V.S.-G. is supported by a doctoral fellowship from the University of Cantabria (Spain). Work in the Rada-Iglesias laboratory was supported by the EMBO Young Investigator Programme; CMMC intramural funding (Germany); the German Research Foundation (DFG) (Research Grant no. RA 2547/2-1); ‘Programa STAR-Santander Universidades, Campus Cantabria Internacional de la convocatoria CEI 2015 de Campus de Excelencia Internacional’ (Spain); the Spanish Ministry of Science, Innovation and Universities (Research Grant nos. PGC2018-095301-B-I00 and RED2018-102553-T REDEVNEURAL 3.0); and the European Research Council (ERC CoG ‘PoisedLogic’; grant no. 862022). The Landeira laboratory is funded by grants from the Spanish Ministry of Science and Innovation (grant nos. BFU2016-75233-P and PID2019-108108GB-I00) and the Andalusian Regional Government (grant no. PC-0246-2017).

Author information

Contributions

T.P. and A.R.-I. conceptualized the project. Experimental investigations were performed by T.P., T.E., M.M.-F., H.G.A., P.R., M.M.-S. and E.H. T.P., V.S.-G. and T.B. performed data analyses. T.P. and A.R.-I. wrote, reviewed and edited the manuscript. S.C.-M., W.F.J.v.I., D.L. and A.R.-I. were responsible for obtaining resources. A.R.-I. was responsible for supervision and funding acquisition.

Corresponding author

Correspondence to Alvaro Rada-Iglesias.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Genetics thanks Darío Lupiáñez, Robin Andersson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Genetic and epigenetic features of the oCGIs associated with PEs.

a, Comparison of CpG%, observed/expected CpG ratio, GC% and sequence length between random regions (n = 436000), NMIs associated with PE-distal (PE-NMIs; n = 345) and NMIs associated with the devTSS (devTSS-NMIs; n = 1476) (Methods). The P values were calculated using two-sided unpaired Wilcoxon tests with Bonferroni correction for multiple testing; black numbers indicate median fold-changes; green numbers indicate non-negligible Cliff's delta effect sizes. The coloured area of the violin plot represents the expression values distribution and the center line represents the median. b, H3K27me3 ChIP-seq levels (refs. 14, 24) around: PE-distal with overlapping TFBS/p300 peaks and CAP-CGIs (n = 135), PE-distal with TFBS/p300 peaks separated by 1 bp–1 kb from CAP-CGIs (n = 65), PE-distal with TFBS/p300 peaks separated by 1–3 kb from CAP-CGIs (n = 53), PE-distal without CAP-CGIs within 3 kb (n = 254) and AEs without CAP-CGIs within 3 kb (n = 8115). c, Percentage of CpG methylation at CAP-CGIs associated with PE-distal (PE-CAP-CGI; n = 276) and CAP-CGIs associated with the TSS of developmental genes (devTSS-CAP-CGI; n = 1926) in the indicated cell types (Methods). d, For the identification of the PE Sox1(+35)CGI deletion, primer pairs flanking each of the deletion breakpoints (1 + 3 and 4 + 2), located within the deleted region (5 + 6) or amplifying a large or small fragment depending on the absence or presence of the deletion (1 + 2) were used. e, H3K27me3 levels at PE Sox1(+35) were measured by ChIP-qPCR in WT ESCs and in n = 2 independent PE Sox1(+35)CGI−/− ESC clones using primers adjacent to the deleted region. The bars display the mean of n = 3 technical replicates (black dots). f, Independent biological replicate for the data presented in Fig. 1d. Sox1 expression was investigated by RT-qPCR in ESCs and AntNPCs with the indicated genotypes. n = 2 independent PE Sox1 CGI−/− ESC clones (circles and diamonds) and n = 1 PE Sox1−/− clone were studied. For each cell line, n = 2 replicates of the AntNPC differentiation were performed. Expression values were normalized to two housekeeping genes (Eef1a and Hprt) and are presented as fold-changes with respect to WT ESCs. The coloured area of the violin plot represents the expression values distribution and the center line represents the median.
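The sequence metrics and the effect size named in panel a can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names `cpg_metrics` and `cliffs_delta` are hypothetical, CpG% is assumed here to mean the percentage of bases that lie within a CpG dinucleotide, and the observed/expected ratio follows the Gardiner-Garden and Frommer definition (ref. 89).

```python
def cpg_metrics(seq: str) -> dict:
    """Return GC%, CpG% and the observed/expected CpG ratio of a DNA sequence."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        raise ValueError("empty sequence")
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_percent = 100.0 * (c + g) / n
    # Assumption: CpG% = share of bases that are part of a CpG dinucleotide.
    cpg_percent = 100.0 * 2 * cpg / n
    # Gardiner-Garden & Frommer: (number of CpG * N) / (number of C * number of G)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return {"gc_percent": gc_percent,
            "cpg_percent": cpg_percent,
            "obs_exp_cpg": obs_exp}


def cliffs_delta(xs, ys) -> float:
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (len(xs) * len(ys))."""
    if not xs or not ys:
        raise ValueError("both groups must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))
```

For the toy sequence `CGCGAATT`, `cpg_metrics` gives a GC% of 50.0, a CpG% of 50.0 and an observed/expected ratio of 4.0. Cliff's delta ranges from −1 to 1, with 0 indicating complete overlap between the two groups.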

Extended Data Fig. 2 Engineering of PE modules within the Gata6-TAD and Foxa2-TAD.

a, Epigenomic and genomic features of two previously characterized PEs14 (PE Six3(133); PE Lmx1b(+59)) in which the oCGIs overlap with conserved sequences bound by p300 and, thus, likely to contain relevant TFBS. b, The different PE Sox1(+35) insertions were identified using primer pairs flanking the insertion borders (1 + 3 and 4 + 2; 1 + 5 and 6 + 2; 1 + 3 and 6 + 2), amplifying potential duplications (4 + 3, 3 + 2 and 4 + 1; 6 + 5, 5 + 2 and 6 + 1) and amplifying a large or small fragment depending on the absence or presence of the insertion (1 + 2), respectively. The PCR results obtained for WT ESCs and for two ESC clonal lines with homozygous insertions of the PE Sox1(+35) modules in the Gata6-TAD are shown. c, Independent biological replicate for the data presented in Fig. 2b. d-e, Strategy used to insert the PE Wnt8b(+21) (d) or the PE Sox1(+35) (e) components into the Gata6-TAD (d) or Foxa2-TAD (e), respectively. The right panels shows the TADs in which Gata6 (d) or Foxa2 (e) are included according to publically available Hi-C data80,81, with the red triangle indicating the integration site of the PE modules, approximately 100 Kb downstream of Gata6 (d) or Foxa2 (e). f-g, For identifying the successful insertion of the different PE Sox1(+35) (f) or PE Wnt8b(+21) (g) modules, primer pairs flanking the insertion borders (1 + 3 and 4 + 2; 1 + 5 and 6 + 2; 1 + 3 and 6 + 2), amplifying potential duplications (4 + 3, 3 + 2 and 4 + 1; 6 + 5, 5 + 2 and 6 + 1) and amplifying a large or small fragment depending on the absence or presence of the insertion (1 + 2), respectively, were used. The PCR results obtained for two ESC clonal lines with homozygous insertions of the indicated PE modules in the Foxa2-TAD (f) or Gata6-TAD (g), respectively, are shown. h-i, Independent biological replicates for the data shown in Fig. 2c (h) and Fig. 2d (i). 
In (c), (h) and (i), the expression differences between AntNPCs with the TFBS + CGI module and AntNPCs with the other PE modules were calculated using two-sided non-paired t-tests (**: foldchange>2 & p<0.001; *: foldchange> 2 & p<0.05; ns: not significant; fold-change<2 or p>0.05).
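
The significance labels defined above combine a fold-change threshold with a p-value threshold. A minimal sketch of that labelling rule follows; it assumes the fold-change is given as a plain ratio and that the p-value has already been obtained from the two-sided non-paired t-test. The function name is hypothetical.

```python
def significance_label(fold_change, p_value):
    """Map a fold-change and a t-test p-value to the labels used in the legend.

    **: fold-change > 2 and p < 0.001
    *:  fold-change > 2 and p < 0.05
    ns: fold-change < 2 or p > 0.05
    Values exactly at a threshold are treated as not passing it, which is
    an assumption because the legend does not define the boundary cases.
    """
    if fold_change > 2 and p_value < 0.001:
        return "**"
    if fold_change > 2 and p_value < 0.05:
        return "*"
    return "ns"


print(significance_label(3.0, 0.0005))  # passes both thresholds at the stricter level
print(significance_label(1.5, 0.0001))  # small p-value, but fold-change below 2
```

Both conditions must hold for a difference to be flagged, so a very small p-value with a fold-change below 2 is still reported as not significant.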

Extended Data Fig. 3 PEs are enriched in CpG-rich motifs and are bound by CxxC-domain containing proteins.

a, Comparison of the TF motifs enriched in either PEs with a CAP-CGI within 3kb or active enhancers without CAP-CGIs within 3kb. Motif enrichment analyses were performed with Homer82 (left) and AME107 (right). b, ChIP-seq signals for KDM2B31 (upper panel) and TET132 (lower panel) are shown around: PE-distal with overlapping TFBS/p300 peaks and CAP-CGIs (n = 135), PE-distal with TFBS/p300 peaks separated by 1bp-1kb from CAP-CGIs (n = 65), PE-distal with TFBS/p300 peaks separated by 1-3kb from CAP-CGIs (n = 53) and PE-distal without CAP-CGIs within 3kb (n = 254). ChIP-seq profile plots were generated using either the p300 peaks (left) or the CAP-CGIs (right) associated with the PEs as midpoints.

Extended Data Fig. 4 Engineering of ESC lines containing the PE Sox1(+35) TFBS and an artificial CGI within the Gata6-TAD.

a, Strategy used to insert the PE Sox1(+35)TFBS alone or together with an aCGI into the Gata6-TAD. The upper left panel shows the epigenomic and genetic features of the PE Sox1(+35). The lower left panel shows the PE Sox1(+35) modules inserted into the Gata6-TAD. The right panel shows the Gata6-TAD according to publicly available Hi-C data80,81. The red triangle indicates the integration site of the PE Sox1(+35) modules approximately 100 Kb downstream of Gata6. b, For the identification of the PE Sox1(+35)TFBS+aCGI insertion, primer pairs flanking the insertion borders (1 + 3 and 4 + 2), amplifying potential duplications (4 + 3 and 4 + 4) and amplifying a large or small fragment depending on the absence or presence of the insertion (1 + 2), respectively, were used. The PCR results obtained for two ESC clonal lines with homozygous insertions of PE Sox1(+35)TFBS+aCGI in the Gata6-TAD are shown. c, Independent biological replicate for the data presented in Fig. 2f. The expression differences between AntNPCs with the TFBS+CGI module and AntNPCs with the other PE modules were calculated using two-sided non-paired t-tests (*: foldchange> 2 & p<0.05; ns: not significant; fold-change<2 or p>0.05). d, For the identification of the aCGI insertion alone, primer pairs flanking the insertion borders (1 + 3 and 4 + 2), amplifying potential duplications (4 + 3 and 4 + 4) and amplifying a large or small fragment depending on the absence or presence of the insertion (1 + 2), respectively, were used. The PCR results obtained from two ESC clonal lines with heterozygous insertions of aCGI in the Gata6-TAD are shown. e, The expression of Gata6 and Sox1 was measured by RT-qPCR in cells that were either WT or heterozygous for the aCGI insertion in the Gata6-TAD (two different clones; circles and diamonds). For each cell line, n = 2 replicates of the AntNPC differentiation were performed. 
The results obtained in n = 2 independent biological replicates are presented in each panel (Rep1 and Rep2).

Extended Data Fig. 5 Gata6 expression patterns in cell lines with the PE Sox1(+35) modules inserted within the Gata6-TAD.

a, Gata6 and Sox1 expression was measured by RT-qPCR in ESCs and at intermediate stages of AntNPC differentiation (Day 3 and Day 4). The analysed cells were either WT or homozygous for the insertions of the different PE Sox1(+35) modules within the Gata6-TAD. For the cells with the PE module insertions, n = 1 clonal cell line was studied. For each cell line, n = 2 replicates of the AntNPC differentiation were performed. Expression values were normalized to two housekeeping genes (Eef1a and Hprt) and are presented as fold-changes with respect to WT ESCs. b, Quantification of cells expressing GATA6 or SOX1 according to immunofluorescence assays such as those shown in Fig. 2g. The analysed cells were either WT or homozygous for the insertions of the different PE Sox1(+35) modules within the Gata6-TAD. c, The expression patterns of GATA6 (upper panel) and SOX1 (lower panel) were investigated by immunofluorescence in WT ESCs or AntNPCs that were either WT, homozygous for the insertion of the PE Sox1(+35)TFBS + aCGI in the Gata6-TAD or heterozygous for the insertion of the aCGI alone in the Gata6-TAD. Nuclei were stained with DAPI. Scale bar = 100µm. d, Quantification of cells expressing GATA6 or SOX1 according to the immunofluorescence assays described in (c). In (b) and (d), the bars display the mean of n = 3 technical replicates (black dots).

Extended Data Fig. 6 Epigenetic and topological characterization of the Gata6-TAD cell lines.

a, Bisulfite sequencing data presented in Fig. 3a for the indicated Gata6-TAD cell lines. The circles correspond to individual CpG dinucleotides located within the TFBS module. Unmethylated CpGs are shown in white, methylated CpGs in black and not-covered CpGs in gray. b, Chromatin accessibility at the endogenous PE Sox1(+35) and the Gata6-TAD insertion site (P1 and P2) was measured by FAIRE-qPCR in cells with the indicated genotypes. c, DNA methylation and nucleosome occupancy at the TFBS were simultaneously analyzed by NOMe-PCR in the indicated Gata6-TAD ESC lines. In the upper panels, the black and white circles represent methylated or unmethylated CpG sites, respectively. In the lower panels, the blue or white circles represent accessible or inaccessible GpC sites for the GpC methyltransferase, respectively. Red bars represent inaccessible regions large enough to accommodate a nucleosome. The dotted line indicates where the TFBS starts. The grey shaded area represents a nucleosome-depleted region. d, Scatter plots showing population-averaged nucleosome occupancy (red) and DNA methylation (black) levels within the TFBS in the indicated Gata6-TAD ESC lines. The grey shaded area represents a nucleosome-depleted region. e-f, H3K4me1, H3K4me3, H2AK119ub, CBX7 and PHC1 levels at the endogenous PE Sox1(+35) and the Gata6-TAD insertion site (P1 and P2) were measured by ChIP-qPCR in cells with the indicated genotypes. ChIP-qPCR signals were calculated as described in Fig. 3. g, 4C-seq experiments were performed using the Gata6 promoter as a viewpoint in AntNPCs with the indicated genotypes. h, Pile-up plots showing average Hi-C7,52 signals in ESCs between two groups of PE-gene pairs: PEs and developmental genes with CGI-rich promoters; PEs and genes with CGI-poor promoters. For each PE-gene pair, both the PE and the gene were located within the same TAD. 
Left panels include all the considered PE-gene pairs (n = 401 pairs for developmental genes; n = 900 for CGI-poor promoters); middle panels include PE-gene pairs with the same genomic size in the two groups (n = 401 pairs); right panels consist of PE-gene pairs with the same genomic size and genes with expression levels <1 FPKM9 (n = 290 pairs) (Methods).

Extended Data Fig. 7 Generation of cell lines with engineered PE Sox1(+35) modules within the Gria1-TAD and global characterization of H3K27ac and eRNA levels at active enhancers.

a, ESC clonal lines with insertions of the different PE Sox1(+35) modules were identified using primer pairs flanking the insertion borders (1 + 3 and 4 + 2; 1 + 5 and 6 + 2; 1 + 3 and 6 + 2), amplifying potential duplications (4 + 3, 3 + 2 and 4 + 1; 6 + 5, 5 + 2 and 6 + 1) and amplifying a large or small fragment depending on the absence or presence of the insertion (1 + 2), respectively. The PCR results obtained for WT ESCs or two ESC clonal lines with homozygous insertions of the different PE Sox1(+35) modules in the Gria1-TAD are shown. b, Independent biological replicate for the data presented in Fig. 4b. The expression differences between AntNPCs with the TFBS + CGI module and AntNPCs with the other PE modules were calculated using two-sided non-paired t-tests (ns: not significant; fold-change<2 or p>0.05). c, Bisulfite sequencing analyses of ESC lines with the indicated PE Sox1(+35) modules inserted in the Gria1-TAD. The circles correspond to individual CpG dinucleotides located within the TFBS: unmethylated CpGs (white), methylated CpGs (black) and not-covered CpGs (gray) are shown. The plot on the right summarizes the DNA methylation levels measured within the TFBS in the indicated ESC lines. d, Active enhancers (AEs) identified in ESCs based on the presence of distal H3K27ac peaks were classified into three categories (Methods): Class I (AEs in TADs containing only poorly expressed genes; n = 271 (left); n = 340 (middle, right)); Class II (AEs in TADs with at least one highly expressed gene; n = 271 (left); n = 2353 (middle); n = 340 (right)); Class III (AEs whose closest gene in the same TAD is highly expressed; n = 271 (left); n = 1262 (middle); n = 340 (right)). The violin plots show the H3K27ac and eRNA levels in ESCs for each AE category. 
P-values were calculated using unpaired Wilcoxon tests with Bonferroni correction for multiple testing; the numbers in black indicate the median fold-changes between the indicated groups; the coloured numbers correspond to Cliff's delta effect sizes: negligible (red) and non-negligible (green). In the left and right panels, eRNA levels for the three enhancer classes are compared after correcting for H3K27ac differences (Methods).
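
For readers unfamiliar with the effect size and multiple-testing correction mentioned above, the following minimal sketch computes Cliff's delta between two groups and applies a Bonferroni correction to a list of p-values. The 0.147 cut-off for a negligible effect is a commonly used convention and is an assumption here, since the legend does not state the threshold applied; all function names and example values are hypothetical.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x in xs, y in ys)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


def is_negligible(delta, threshold=0.147):
    """Classify an effect as negligible; the threshold is an assumed convention."""
    return abs(delta) < threshold


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up signal values for two enhancer classes
class_a = [1.0, 2.0, 3.0]
class_b = [4.0, 5.0, 6.0]
delta = cliffs_delta(class_a, class_b)
print(delta, is_negligible(delta))
print(bonferroni([0.01, 0.04, 0.5]))
```

Cliff's delta ranges from -1 to 1 and depends only on the ordering of values between the groups, which makes it a natural companion to a rank-based test such as the Wilcoxon test.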

Extended Data Fig. 8 Generation and characterization of cell lines with PE insertions at the Gria1 and Sox7/Rp1l1 TADs.

a, H2AK119ub and SUZ12 levels at the endogenous PE Sox1(+35), the Gria1 promoter and the Gria1-TAD insertion site (P1 and P2; Fig. 4d) were measured by ChIP-qPCR in ESCs with the indicated genotypes. ChIP-qPCR signals were calculated as in Fig. 3. b, ESC clonal lines in which a pCGI was inserted 380bp upstream of the Gria1-TSS in cells with the indicated PE Sox1(+35) modules 100Kb upstream from Gria1 were identified using the indicated primer pairs. PCR results for clonal ESC lines with the indicated double homozygous insertions are shown. c, eRNA levels at the endogenous PE Sox1(+35) and the Gria1-TAD insertion site (P1 and P2) were measured by RT-qPCR in cells with the indicated genotypes. Expression values were calculated as in Fig. 3. d, Strategy to insert the indicated PE Sox1(+35) modules 380bp upstream (red triangle) of the Gria1-TSS. e, ESC clonal lines with the PE Sox1(+35) modules 380bp upstream of the Gria1-TSS were identified using the indicated primer pairs. PCR results for ESC clonal lines with homozygous insertions of the indicated PE Sox1(+35) modules are shown. f, Independent biological replicate for the data presented in Fig. 5e. g, ESC clonal lines with the PE Sox1(+35) modules within the Sox7/Rp1l1-TAD were identified using primers flanking the insertion borders (1 + 3 and 4 + 2; 1 + 3 and 6 + 2), amplifying potential duplications (4 + 3, 3 + 2 and 4 + 1) and amplifying a large or small fragment depending on the absence or presence of the insertion (1 + 2), respectively. PCR results for ESC clonal lines with homozygous insertions of the indicated PE Sox1(+35) modules are shown. h, Independent biological replicate for the data presented in Fig. 5g. In (a) and (c), the bars display the mean of n = 3 technical replicates (black dots). 
In (f) and (h), the expression differences between AntNPCs with the TFBS + CGI module or the other PE modules were calculated using two-sided non-paired t-tests (***: foldchange> 2 & p<0.0001; ns: not significant; fold-change<2 or p>0.05).

Extended Data Fig. 9 Generation of ESC lines with structural variants.

a, ESC lines with the Six3/Six2 TAD boundary deletion were identified using primers flanking the deleted region (1 + 3 and 4 + 2), amplifying the deleted fragment (5 + 6) and amplifying a large or small fragment depending on the absence or presence of the deletion (1 + 2), respectively. The PCR results for two ESC clonal lines with 36Kb homozygous deletions (del36) are shown. b, ESC lines with the Six3/Six2 inversion were identified using primer pairs flanking the inverted region (1 + 3, 4 + 2, 1 + 4 and 3 + 2) and amplifying potential duplications (4 + 3, 3 + 3 and 4 + 4). The PCR results for two ESC clonal lines with 110Kb homozygous inversions (inv110) are shown. c, Epigenomic and genetic features of a CTCF binding site112 (CBS; highlighted in grey) located upstream of the PE Six3(-133) (highlighted in yellow). d, ESC lines with the CBS deletion were identified using primers flanking the deleted region (1 + 2) or located in the CBS (3 + 4). The PCR results for two ESC clonal lines with homozygous CBS deletions are shown. e, The expression of Six3 and Six2 was measured by RT-qPCR in cells with the indicated genotypes. For each of the engineered structural variants, n = 2 independent clonal cell lines were generated (circles and diamonds). In each plot, the number of circles and/or diamonds corresponds to the number of AntNPC differentiations performed. The results obtained in n = 2 independent biological replicates are presented in each panel (Rep1 and Rep2). Expression values are presented as fold-changes with respect to WT ESCs. f, ESC lines with the Lmx1a-TAD boundary inversion were identified using primers flanking the inverted region (1 + 3, 4 + 2, 1 + 4 and 3 + 2) and amplifying potential deletions (1 + 4) or duplications (4 + 3, 3 + 3 and 4 + 4). The PCR results for three ESC clonal lines with 260 Kb homozygous inversions (inv260) are shown.

Extended Data Fig. 10 Examples of human congenital diseases caused by structural variants that disrupt developmental loci with PE-associated oCGIs.

a, Upper panel: heterozygous inversion in a patient with Branchio-oculo-facial syndrome (BOFS)5. Lower panel: epigenomic and genetic features of TFAP2A neural crest (NC) cognate enhancers (left), 6q16.2 genes (middle) and TFAP2A (right). In the lower left panel, enhancer reporter assays in chicken embryos are shown for two representative TFAP2A enhancers5. Computational CGIs and NMIs are represented as green rectangles. The inversion places one TFAP2A allele into a novel TAD and impairs its normal expression in NC cells due to the physical disconnection from its enhancers. TFAP2A has a promoter with a large CGI cluster that is marked with a broad H3K27me3 domain in ESCs. Some TFAP2A NC enhancers are associated with oCGIs and marked with H3K27me3 in ESCs. Moreover, this inversion places genes originally found within the 6q16.2 locus in proximity of the TFAP2A NC enhancers within a shuffled domain. The promoters of these 6q16.2 genes (i.e., GPR63 and NDUFAF4) contain a short CGI centered on their TSSs. In agreement with our findings, none of the 6q16.2 genes is responsive to the TFAP2A NC enhancers5. b, Upper panel: deletion found in families with brachydactyly involving a TAD boundary located between the EPHA4 and the PAX3 loci63. Lower panel: epigenomic and genetic features of the Epha4 cognate enhancers in the mouse E11.5 limb (left) and in human ESCs (right). A representative reporter assay in E11.5 mouse embryos for the hs1507 element is shown in the middle63. The deletion includes EPHA4, a gene highly expressed in the developing limb, and the TAD boundary separating the EPHA4 and PAX3 TADs. As a result, enhancers that control EPHA4 expression in the limb establish ectopic interactions with PAX3 (that is, enhancer adoption) and strongly induce its expression in the limb. The PAX3 promoter contains a large CGI cluster and is marked with H3K27me3 in ESCs, while one of the major EPHA4 enhancers (hs1507) is associated with an oCGI and is marked with H3K27me3 in ESCs. 
The high responsiveness of PAX3 to the EPHA4 enhancers is in agreement with our findings.

Supplementary information

Reporting Summary

Peer Review Information

Supplementary Data 1

List of oligonucleotides and antibodies.

Supplementary Data 2

List of knock-in donor sequences.

Cite this article

Pachano, T., Sánchez-Gaya, V., Ealo, T. et al. Orphan CpG islands amplify poised enhancer regulatory activity and determine target gene responsiveness. Nat Genet 53, 1036–1049 (2021). https://doi.org/10.1038/s41588-021-00888-x
