R-loops are DNA–RNA hybrids enriched at CpG islands (CGIs) that can regulate chromatin states (refs. 1–8). How R-loops are recognized and interpreted by specific epigenetic readers is unknown. Here we show that GADD45A (growth arrest and DNA damage protein 45A) binds directly to R-loops and mediates local DNA demethylation by recruiting TET1 (ten-eleven translocation 1). Studying the tumor suppressor TCF21 (ref. 9), we find that the antisense long noncoding RNA (lncRNA) TARID (TCF21 antisense RNA inducing promoter demethylation) forms an R-loop at the TCF21 promoter. Binding of GADD45A to the R-loop triggers local DNA demethylation and TCF21 expression. TARID transcription, R-loop formation, DNA demethylation, and TCF21 expression proceed sequentially during the cell cycle. Oxidized DNA demethylation intermediates are enriched at genomic R-loops and their levels increase upon RNase H1 depletion. Genomic profiling in embryonic stem cells identifies thousands of R-loop-dependent TET1 binding sites at CGIs. We propose that GADD45A is an epigenetic R-loop reader that recruits the demethylation machinery to promoter CGIs.

Data availability

TARID sequence information is available from GenBank under KF484512.1. The TET1 ChIP–seq data are available from GEO under accession number GSE104067.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


References

  1. Castellano-Pozo, M. et al. R loops are linked to histone H3 S10 phosphorylation and chromatin condensation. Mol. Cell 52, 583–590 (2013).
  2. Nakama, M. et al. DNA-RNA hybrid formation mediates RNAi-directed heterochromatin formation. Genes Cells 17, 218–233 (2012).
  3. Groh, M., Lufino, M. M., Wade-Martins, R. & Gromak, N. R-loops associated with triplet repeat expansions promote gene silencing in Friedreich ataxia and fragile X syndrome. PLoS Genet. 10, e1004318 (2014).
  4. Sanz, L. A. et al. Prevalent, dynamic, and conserved R-loop structures associate with specific epigenomic signatures in mammals. Mol. Cell 63, 167–178 (2016).
  5. Skourti-Stathaki, K., Kamieniarz-Gdula, K. & Proudfoot, N. J. R-loops induce repressive chromatin marks over mammalian gene terminators. Nature 516, 436–439 (2014).
  6. Skourti-Stathaki, K., Proudfoot, N. J. & Gromak, N. Human senataxin resolves RNA/DNA hybrids formed at transcriptional pause sites to promote Xrn2-dependent termination. Mol. Cell 42, 794–805 (2011).
  7. Powell, W. T. et al. R-loop formation at Snord116 mediates topotecan inhibition of Ube3a-antisense and allele-specific chromatin decondensation. Proc. Natl Acad. Sci. USA 110, 13938–13943 (2013).
  8. Ginno, P. A. et al. R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol. Cell 45, 814–825 (2012).
  9. Arab, K. et al. Long noncoding RNA TARID directs demethylation and activation of the tumor suppressor TCF21 via GADD45A. Mol. Cell 55, 604–614 (2014).
  10. Barreto, G. et al. Gadd45a promotes epigenetic gene activation by repair-mediated DNA demethylation. Nature 445, 671–675 (2007).
  11. Cortellino, S. et al. Thymine DNA glycosylase is essential for active DNA demethylation by linked deamination-base excision repair. Cell 146, 67–79 (2011).
  12. Schmitz, K. M. et al. TAF12 recruits Gadd45a and the nucleotide excision repair complex to the promoter of rRNA genes leading to active DNA demethylation. Mol. Cell 33, 344–353 (2009).
  13. Schäfer, A. et al. Ing1 functions in DNA demethylation by directing Gadd45a to H3K4me3. Genes Dev. 27, 261–273 (2013).
  14. Kienhöfer, S. et al. GADD45a physically and functionally interacts with TET1. Differentiation 90, 59–68 (2015).
  15. Li, Z. et al. Gadd45a promotes DNA demethylation through TDG. Nucleic Acids Res. 43, 3986–3997 (2015).
  16. Sun, Q. et al. R-loop stabilization represses antisense transcription at the Arabidopsis FLC locus. Science 340, 619 (2013).
  17. Hobson, D. J., Wei, W., Steinmetz, L. M. & Svejstrup, J. Q. RNA polymerase II collision interrupts convergent transcription. Mol. Cell 48, 365–374 (2012).
  18. Cerritelli, S. M. & Crouch, R. J. Ribonuclease H: the enzymes in eukaryotes. FEBS J. 276, 1494–1505 (2009).
  19. Wu, H. et al. Dual functions of Tet1 in transcriptional regulation in mouse embryonic stem cells. Nature 473, 389–393 (2011).
  20. Ma, D. K. et al. Neuronal activity-induced Gadd45b promotes epigenetic DNA demethylation and adult neurogenesis. Science 323, 1074–1077 (2009).
  21. Gan, W. et al. R-loop-mediated genomic instability is caused by impairment of replication fork progression. Genes Dev. 25, 2041 (2011).
  22. Bhatia, V. et al. BRCA2 prevents R-loop accumulation and associates with TREX-2 mRNA export factor PCID2. Nature 511, 362–365 (2014).
  23. Boguslawski, S. J. et al. Characterization of monoclonal antibody to DNA·RNA and its application to immunodetection of hybrids. J. Immunol. Methods 89, 123–130 (1986).
  24. Ehrich, M. et al. Quantitative high-throughput analysis of DNA methylation patterns by base-specific cleavage and mass spectrometry. Proc. Natl Acad. Sci. USA 102, 15785–15790 (2005).
  25. Kinney, S. M. et al. Tissue-specific distribution and dynamic changes of 5-hydroxymethylcytosine in mammalian genomes. J. Biol. Chem. 286, 24685–24693 (2011).
  26. Yu, K. et al. R-loops at immunoglobulin class switch regions in the chromosomes of stimulated B cells. Nat. Immunol. 4, 442–451 (2003).
  27. Schomacher, L. et al. Neil DNA glycosylases promote substrate turnover by Tdg during DNA demethylation. Nat. Struct. Mol. Biol. 23, 116–124 (2016).
  28. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
  29. Li, H. et al. The sequence alignment/map format and samtools. Bioinformatics 25, 2078–2079 (2009).
  30. Ramirez, F. et al. deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 44, W160–W165 (2016).
  31. Shen, L., Shao, N., Liu, X. & Nestler, E. Ngs.plot: Quick mining and visualization of next-generation sequencing data by integrating genomic databases. BMC Genomics 15, 284 (2014).
  32. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
  33. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
  34. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
  35. Ignatiadis, N., Klaus, B., Zaugg, J. B. & Huber, W. Data-driven hypothesis weighting increases detection power in genome-scale multiple testing. Nat. Methods 13, 577–580 (2016).
  36. Neri, F. et al. Single-base resolution analysis of 5-formyl and 5-carboxyl cytosine reveals promoter DNA methylation dynamics. Cell Rep. 10, 674–683 (2015).
  37. Pefanis, E. et al. RNA exosome-regulated long non-coding RNA transcription controls super-enhancer activity. Cell 161, 774–789 (2015).
  38. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
  39. Domcke, S. et al. Competition between DNA methylation and transcription factors determines binding of NRF1. Nature 528, 575–579 (2015).


Acknowledgements

We acknowledge the support of O. Muecke and C. Plass in DNA methylation analysis, and the support of the DKFZ FACS and the IMB Microscopy and Genomics core facilities. We thank V. Vastolo for mouse embryos. I.G. was supported by the Helmholtz Foundation and by grants from the Deutsche Forschungsgemeinschaft (GR475/22-1, SFB1036), the CellNetworks Cluster of Excellence (EcTop 5), the DKFZ-MOST programme, and the Baden-Württemberg Stiftung (NCRNA_025). C.N. was supported by the European Research Council (ERC) DNAdemethylase.

Author information

Author notes

  1. These authors contributed equally: Khelifa Arab, Ingrid Grummt, Christof Niehrs.


Affiliations

  1. Institute of Molecular Biology (IMB), Mainz, Germany

    Khelifa Arab, Emil Karaulanov, Michael Musheev, Philipp Trnka, Andrea Schäfer & Christof Niehrs

  2. Division of Molecular Biology of the Cell II, German Cancer Research Center and DKFZ-ZMBH Alliance, Heidelberg, Germany

    Khelifa Arab & Ingrid Grummt

  3. Division of Molecular Embryology, German Cancer Research Center and DKFZ-ZMBH Alliance, Heidelberg, Germany

    Khelifa Arab & Christof Niehrs



Contributions

K.A. conceived the study, carried out most of the experiments and analysed the data. K.A., I.G., and C.N. designed the experimental work and wrote the manuscript. M.M. performed LC–MS/MS analyses, and E.K. performed the bioinformatics analyses. P.T. and A.S. generated and characterized HA-tagged Tet1 mESCs.

Competing interests

The authors declare no competing interests.

Corresponding authors

Correspondence to Khelifa Arab, Ingrid Grummt or Christof Niehrs.

Integrated supplementary information

  1. Supplementary Figure 1 TARID forms an R-loop at the TCF21 promoter.

    (a), Top, the TCF21 locus is shown with CG content (%), TCF21 (blue) and TARID (red). Bottom, GC skew plot (red line) and GC content (green line) at the 5ʹ end of TCF21. Gray shading highlights the elevated GC skew score. (b), DRIP–qPCR analysis of R-loops at the CGI promoter of TCF21 in H387 cells transfected with plasmids encoding RNase H1 (RNH1 +) or GFP (–). The scheme above illustrates the structure of the TCF21 locus and the qPCR amplicons used. RPL13A (orange) was used as a positive control. (c), RNase H1 downregulates expression of RPL13A. RT–qPCR analysis of RPL13A mRNA in primary skin fibroblasts (PSF) and HEK293TARIDwt cells transfected with plasmids encoding GFP (Ctrl), RNase H1 (RNH1), or the hybrid-binding (HB) domain of RNase H1. RNA levels were normalized to those of HPRT1 mRNA. (d), Top, scheme of the TCF21 TSS (see also Fig. 1d). Bottom, native bisulfite sequencing reads showing the protected (+) and unprotected (–) DNA strand, sequenced from PCR products (a and b) and (c and d), shown in Fig. 1c. The R-loop region is shaded in yellow, and the positions of C-to-T (C->T) conversions are indicated by red lines (top) and black arrows. The experiment was carried out once. In (b and c), data are shown as the mean ± s.d., n = 3 biological replicates, two-tailed t test; **P < 0.01.

  2. Supplementary Figure 2 GADD45A exhibits binding preference for R-loops.

    (a), R-loop probes used for in vitro binding. 5ʹ-radiolabeled RNA or DNA was annealed to the complementary DNA strand or to both DNA strands to form a DNA:RNA hybrid or an R-loop. Where indicated, the products were treated with RNase H1 or RNase A. Products were resolved by PAGE and visualized by phosphorimaging. (b,c), Pulldown assays showing GADD45A binding to TCF21-unrelated R-loops in vitro. Bead-bound S9.6 antibody or FLAG-tagged GADD45A (GA45a), PTB or GFP was incubated with a radiolabeled generic R-loop without (b) or with (c) overhanging RNA, and bound probes were visualized by phosphorimaging. Where indicated, the samples were incubated with RNase A (5 pg/ml, 120 min) or with RNase H1 (0.05 U/μl, 2 h) before PAGE. cR-loop marks the position of the R-loop after cleavage with RNase A. (d), EMSA showing binding of GADD45A to DNA–RNA heteroduplexes in vitro. A radiolabeled DNA:RNA hybrid was incubated with increasing amounts of GADD45A and analyzed by PAGE. (e), Competitive EMSA showing no displacement of GADD45A from R-loops by unlabeled ssDNA (left), RNA (middle) or dsRNA (right). (f), GADD45A is associated with cellular R-loops. Lysates from PSFs expressing FLAG-tagged GADD45A (GA45a), PTB or GFP were incubated with S9.6 antibody or mouse IgGs, and co-precipitated proteins were visualized on western blots (left). 5% of the lysates used for IP is shown at the right. In (a–f), experiments were repeated twice with similar results. All blot images were cropped (see Supplementary Figs. 10c, 11 and 12a).

  3. Supplementary Figure 3 TARID transcription, DNA demethylation and TCF21 expression proceed sequentially.

    (a), FACS sorting of PSFs. Left, dot plot showing histograms of side scatter (SSC-H) and forward scatter (FSC-H) as well as the gate employed (colored, P1); middle, histogram of Hoechst-stained PSFs (VL450/50: detection parameter in nm); right, sorted G1, S and G2/M cell populations. (b), RT–qPCR analysis of TARID and TCF21 mRNA in FACS-sorted HEK293TARIDwt cells. RNA levels were normalized to those of HPRT1 mRNA. (c), TARID expression precedes DNA demethylation. Top, scheme showing the region around the TSS of TCF21 and the location of the CpGs analyzed. Bottom, MassARRAY analysis of DNA methylation at the TSS of TCF21 in the indicated cell lines. Cells were sorted by FACS into G1, S or G2/M phase. The methylation level of each CpG is shown for the indicated cells. (d), FACS analysis of cell cycle phases in synchronized PSFs. The time course of cell cycle progression from G1/S to G2/M phase after release from thymidine block. In (b and c), data are shown as the mean ± s.d., n = 3 biological replicates, two-tailed t test; **P < 0.01. In (a and d), experiments were repeated twice with similar results.

  4. Supplementary Figure 4 TARID transcription, DNA demethylation and TCF21 expression proceed sequentially.

    (a), Top, RT–qPCR analysis of TARID and TCF21 mRNA in thymidine/nocodazole-arrested PSFs in G1, S, G2 and M phase. Bottom, FACS analysis of cell cycle phases in synchronized PSFs (left) and RT–qPCR analysis of mRNA levels of cell cycle phase markers (right). RNA levels were normalized to those of HPRT1 mRNA. (b), DRIP–qPCR analysis at a negative-control region of TCF21 (blue amplicon) in synchronized PSFs showing no R-loop formation at the indicated time points after release into the cell cycle. (c), RT–qPCR expression analysis of TET2 and TET3 mRNAs in synchronized PSFs. RNA levels were normalized to those of HPRT1 mRNA. (d), ChIP–qPCR of DNMT3A and DNMT3B at the TCF21 locus during the cell cycle. Top, scheme showing the position of ChIP–qPCR amplicons at the TCF21 locus; amplicons monitoring the promoter region (blue) and the control region (red) are shown. Bottom, ChIP–qPCR analysis of DNMT3A and DNMT3B in synchronized PSFs during G1, S and G2/M phase. In (a–d), data are shown as the mean ± s.d., n = 3 biological replicates.

  5. Supplementary Figure 5 Characterization of HA-Tet1-tagged mESC lines.

    (a), LC–MS/MS analysis of genomic 5mC, 5hmC, 5fC and 5caC in HEK293T cells after siRNA-mediated knockdown of RNases H1, H2A, H2B and H2C in cells overexpressing HA-TET1; data are shown as the mean ± s.d., n = 3 biological replicates, two-tailed t test; *P < 0.05, **P < 0.001 or indicated as not significant (n.s.). (b), Lysates of wild-type and G418-selected mESC clones expressing FLAG/HA-tagged TET1 were immunoprecipitated with FLAG antibody, and precipitated proteins were analyzed on western blots using HA antibody. Top, levels of FLAG/HA-TET1 in 10% input used for IP together with the loading control α-tubulin. FLAG/HA-TET1 was present in four of seven selected clones (bottom). All blot images were cropped (see Supplementary Fig. 12b). (c), LC–MS/MS analysis of genomic 5mC, 5hmC, 5fC and 5caC in clones of WT (1–3) and FLAG/HA-tagged Tet1 (HA-Tet1) (4, 6). (d), RT–qPCR analysis of Tet1, Oct4, Sox2 and Nanog mRNAs in clones expressing WT or FLAG/HA-tagged Tet1. RNA levels were normalized to those of Gapdh mRNA. In (b–d), experiments were repeated once with similar results.

  6. Supplementary Figure 6 RNase H1 impairs TET1 binding.

    (a), Principal-components analysis (PCA) of the ChIP–seq samples based on coverage of the 90,482 consensus TET1 peaks (n = 3 biological replicates). (b), Venn diagram showing the overlap of consensus TET1 peaks with previously published TET1 peaks (Nature 473, 389–393, 2011). (c), ChIP–qPCR for TET1 ChIP–seq validation at TET1-bound target gene promoters as well as negative controls. Data are shown as the mean ± s.d., n = 3 biological replicates, two-tailed t test; *P < 0.05, **P < 0.01 or indicated as not significant (n.s.). (d), Table showing enrichment of R-loops and TARID-like antisense lncRNAs overlapping RNH1-sensitive TET1 peaks located at TSS CGIs (3,294) versus all TET1 peaks at TSS CGIs (10,882). Indicated are the fold enrichment (fold enrich) and the corresponding one-sided Fisher’s exact test (FET) P value. (e), UCSC browser screenshot of an RNH1-sensitive TET1 peak associated with the TSS of Gadd45a and a TARID-like lncRNA (E230016M11Rik). Other tracks show the CpG islands and R-loops (Mol. Cell 63, 167–178, 2016) as well as the CpG methylation levels in mESCs. In (a and e), experiments were repeated twice with similar results.

  7. Supplementary Figure 7 RNase H1–sensitive TET1 peaks show enriched NRF1 binding.

    (a), Table of the top ten enriched transcription factor motifs using HOMER (Mol. Cell 38, 576–589, 2010) in the TET1 RNH1-sensitive peaks (‘Target Sequences’, n = 3,294) compared to the rest of the TET1 peaks located at TSS CGIs (‘Background Sequences’, n = 7,588). Motif enrichment P and q values are calculated by HOMER using cumulative binomial distribution with autonormalization of sequence biases (see URLs). (b), Heat maps with summary plots of NRF1 ChIP–seq (Nature 528, 575–579, 2015) signal intensity centered at the TET1 peaks using deepTools (Nucleic Acids Res. 44, W160–W165, 2016). NRF1 binding is shown for TET1 RNH1-sensitive peaks on the left versus the rest of TET1 peaks at TSS CGIs on the right.

  8. Supplementary Figure 8 Validation of R-loops at human ortholog gene promoters in primary skin fibroblasts.

    (a), DRIP–qPCR validation of R-loops at selected promoter gene candidates in human PSFs without (Mock) or with (RNH1) RNase H1 treatment. Data are shown as the mean ± s.d., n = 3 biological replicates, two-tailed t test; *P < 0.05 or indicated as not significant (n.s.). (b), RT–qPCR analysis of GADD45A and GADD45B knockdown efficiency in human PSFs treated with siRNA against GADD45A and GADD45B mRNA or with scrambled siRNA (siCtrl). RNA levels were normalized to those of HPRT1 mRNA. Note that GADD45G levels in PSFs are low and were unaffected by siGADD45A and siGADD45B. Data are shown as the mean ± s.d., n = 4 biological replicates. (c), Overlap of all versus RNH1-sensitive TET1 peaks at TSS CGIs with published TDG ChIP–seq peaks in mESCs (Cell Rep. 10, 674–683, 2015) (see GEO data set GSE55657).

  9. Supplementary Figure 9

    Original uncropped gels

  10. Supplementary Figure 10

    Original uncropped gels

  11. Supplementary Figure 11

    Original uncropped gels

  12. Supplementary Figure 12

    Original uncropped gels
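
The legend of Supplementary Figure 6d reports a fold enrichment together with a one-sided Fisher’s exact test P value for features overlapping the 3,294 RNH1-sensitive TET1 peaks relative to all 10,882 TET1 peaks at TSS CGIs. The following is a minimal sketch of how such a statistic can be computed, assuming the subset is tested against the full peak set as a hypergeometric draw. The function names and the overlap counts in the usage example are hypothetical and are not taken from the article; only the two peak totals come from the legend.

```python
from math import comb


def fold_enrichment(hits_in_subset, subset_size, hits_in_all, total):
    """Ratio of the hit fraction in the subset to the hit fraction overall."""
    return (hits_in_subset / subset_size) / (hits_in_all / total)


def fisher_one_sided(hits_in_subset, subset_size, hits_in_all, total):
    """One-sided (enrichment) Fisher's exact test P value.

    Equals P(X >= hits_in_subset) for X drawn from a hypergeometric
    distribution: subset_size peaks sampled without replacement from
    `total` peaks, of which `hits_in_all` carry the feature.
    """
    upper = min(subset_size, hits_in_all)
    denominator = comb(total, subset_size)
    tail = sum(
        comb(hits_in_all, i) * comb(total - hits_in_all, subset_size - i)
        for i in range(hits_in_subset, upper + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Peak totals from the legend of Supplementary Figure 6d.
    total_peaks = 10_882
    sensitive_peaks = 3_294
    # Hypothetical overlap counts, chosen only for illustration.
    all_with_rloop = 4_000
    sensitive_with_rloop = 1_600
    print(fold_enrichment(sensitive_with_rloop, sensitive_peaks,
                          all_with_rloop, total_peaks))
    print(fisher_one_sided(sensitive_with_rloop, sensitive_peaks,
                           all_with_rloop, total_peaks))
```

Exact integer binomial coefficients are used so that the tail probability does not suffer from floating-point underflow at these set sizes; a statistics library would normally be preferred in practice.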

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–12

  2. Reporting Summary

  3. Supplementary Table 1

    RNH1-sensitive TET1 peaks and the TARID-like lncRNAs associated with them

  4. Supplementary Table 2

    Oligonucleotides used in this study
