Metazoan promoters are enriched in secondary DNA structure-forming motifs, such as G-quadruplexes (G4s). Here we describe ‘G4access’, an approach to isolate and sequence G4s associated with open chromatin via nuclease digestion. G4access is antibody- and crosslinking-independent and enriches for computationally predicted G4s (pG4s), most of which are confirmed in vitro. Using G4access in human and mouse cells, we identify cell-type-specific G4 enrichment correlated with nucleosome exclusion and promoter transcription. G4access allows measurement of variations in G4 repertoire usage following treatment with G4 ligands, HDAC inhibitors and G4 helicase inhibitors. Applying G4access to cells from reciprocal hybrid mouse crosses suggests a role for G4s in the control of active imprinting regions. Consistently, we also observed that G4access peaks are unmethylated, while methylation at pG4s correlates with nucleosome repositioning on DNA. Overall, our study provides a new tool for studying G4s in cellular dynamics and highlights their association with open chromatin and transcription, and their antagonism to DNA methylation.
This work was supported by JCA lab (grant ANR-20-CE12-0023), FRM (grant AJE20130728183), INCA PLbIO (grant 2020-117) and CNRS 80prime 2021 (grant DeciphG4). This project has received financial support from the CNRS through the MITI interdisciplinary programs. C.E. was supported in part by an ARC grant (retour postdoc). We thank B. Loriod and the Transcriptomics and Genomics Marseille-Luminy (TGML) platform for sequencing the G4access samples. We are grateful to D. Monchaud for providing us with G4-interfering molecules in exploratory experiments. TGML is a member of the France Génomique Consortium (ANR-10-INBS-0009). E.G.O., T.M. and S.B. were supported by grants from the Epigenesys Labex of excellence, and E.G.O. in part by ANR-18-CE12-0019. We acknowledge the financial support from the France Génomique National Infrastructure, funded as part of the ‘Investissement d’Avenir’ program managed by the Agence Nationale de la Recherche (contract ANR-10-INBS-09), for the MGX sequencing platform facility in Montpellier. We are also grateful to the Genotoul Bioinformatics Platform Toulouse Midi-Pyrenees for computing and storage resources. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. We also thank P. Navarro and E. Kremer for critical reading of the manuscript and the Raman-Livaja lab for help in the yeast extract preparation.
The authors declare no competing interests.
Peer review information
Nature Genetics thanks F. Brad Johnson and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.
a- G4FS at all promoters are associated with open regions upstream of TSSs. The graph shows nucleosome and G4H2.0 densities in Raji cells, as well as the locations of the upstream and downstream peaks of G4 and of the nucleosome dip at all CGI-containing promoters. b- qPCR quantification of a model G4 (human MAZ locus) in G4access preparations. This G4-containing fragment is enriched in the 3 cell lines at various levels of MNase digestion, as indicated. MNase activity was controlled by measuring the level of the mononucleosome fractions (see Fig. 1b). c- FRET melting profiles comparing physiological (red) and MNase (black) digestion conditions. The fluorescence level reflects denaturation of the G4 structure. d- Table of test sequences and G4Hunter scores. Tm and ∆Tm are indicated for all sequences except Myc, because of its complex melting and very high stability. Note that all G4s are highly stable in the MNase buffer at room temperature or 37 °C (blue bar). e- Correlation plots of G4access merged signals compared to individual biological replicates.
a- Comparison of G4access signal and G4-ChIP at a selected area of the genome (KRAS locus, chr12: 25.330.000-25.560.000). b- Venn diagram of overlapping G4access peaks in the 3 model cell lines (Fisher tests of the overlaps, p-value < 1 × 10⁻⁴). c- Venn diagram of overlapping G4access and G4-ChIP peaks in the HaCaT and K562 cell lines (Fisher tests of the overlaps, p-value < 1 × 10⁻⁴). d- G4Hunter prediction scores in G4access performed in 3 human cell lines and comparison to published G4 ChIP-seq in 2 of these cell lines. For the sake of comparison, all fragments were resized to 90 bp in G4-ChIP, G4access peaks and genomic DNA (40.000 annotations; see Methods). All distributions are highly significant compared to random selections (not shown) using a two-sided Wilcoxon test (p-value < 2 × 10⁻¹⁶). e- G4Hunter prediction scores compared to shuffled sequences of the same sizes and nucleotide compositions and to random sequences (see Methods; all differences in the distributions of G4access-associated scores are highly significant compared to random and shuffled selections using a two-sided Wilcoxon test, p-value < 2 × 10⁻¹⁶).
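To make the scoring behind panels d and e concrete, the following Python sketch computes a G4Hunter-style score and builds a shuffled control with the same nucleotide composition. It assumes the commonly described scoring scheme (each G in a run of n consecutive Gs scores +min(n, 4), each C scores the negative equivalent, A and T score 0, and the sequence score is the mean over all bases). The function names `g4hunter_score` and `shuffled_control` are hypothetical, and this is not the authors' pipeline, which relies on the published G4Hunter tools.

```python
import random
from itertools import groupby


def g4hunter_score(seq: str) -> float:
    """Mean per-base G4Hunter-style score of a sequence (assumed scheme)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    total = 0
    # groupby yields maximal runs of identical consecutive bases
    for base, run in groupby(seq):
        n = len(list(run))
        if base == "G":
            total += n * min(n, 4)  # every G in the run scores min(n, 4)
        elif base == "C":
            total -= n * min(n, 4)  # C runs score negatively (G4 on the other strand)
    return total / len(seq)


def shuffled_control(seq: str, rng: random.Random) -> str:
    """Shuffle bases: same length and composition, runs are broken up."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    telomeric = "GGGTTAGGGTTAGGGTTAGGG"
    control = shuffled_control(telomeric, random.Random(0))
    print(round(g4hunter_score(telomeric), 3), round(g4hunter_score(control), 3))
```

Because shuffling preserves base composition but disrupts G runs, a drop in score for the shuffled control reflects G-track organization rather than G content alone, which is the point of the comparison in panel e.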
Extended Data Fig. 3 G4access genomic localization, sequence characterization and association to gene expression programs.
a- G4 subtypes identified in the 3 cell lines (see Methods). b- Comparison of the partition of G4access and G4 ChIP regions in the human genome. The control bars represent the genomic distribution of G4FS at various stringencies (G4Hunter scores of 1.2, 1.5 and 2.0). TES represents transcription end sites at gene units. c- Analysis of the number of GG or GGG tracks found in G4-ChIP or G4access peak datasets (n = 11563, 44412, 12216, 13320 and 9031). d- Number of Gs found in the G-tracks of the predicted G4s in the G4-ChIP or G4access datasets, with at least 2 Gs per track. e- Distributions of GC and CpG contents at promoters associated with G4access peaks (K562 n = 8343, HaCaT n = 4090, Raji n = 4465, all genes n = 20314). f- Gene ontology analyses, using the DAVID database, of the genes whose promoters are associated with G4access peaks in K562, Raji and HaCaT cells (DAVID, modified Fisher exact p-value). g- Gene expression levels, expressed as fragments per kb per million (FPKM), in chromatin RNA-seq datasets in K562 and Raji cells (n = 4660, 8569, 32355, 31779, 4659, 8601, 32753 and 31434). Box plots represent minimal and maximal values, first and third quartiles and the median value.
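The G-track counts of panels c and d can be illustrated with a short Python sketch. It assumes a G-track is a maximal run of at least `min_len` consecutive Gs on the strand provided; the helper name `g_tracks` is hypothetical and the authors' exact definition is given in their Methods.

```python
import re


def g_tracks(seq: str, min_len: int = 2) -> list[str]:
    """Return maximal runs of at least min_len consecutive Gs, in order."""
    pattern = re.compile(r"G{%d,}" % min_len)
    return [m.group(0) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    seq = "GGGTTAGGATGGGGC"
    tracks = g_tracks(seq)
    # Number of tracks (panel c) and number of Gs per track (panel d)
    print(len(tracks), [len(t) for t in tracks])
```

Calling `g_tracks(seq, min_len=3)` restricts the count to GGG-or-longer tracks; scanning the reverse complement in the same way would cover G4s predicted on the opposite strand.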
Extended Data Fig. 4 Motifs associated to G4access and G4 ChIP peaks in the 3 model cell lines (2 in the case of ChIP) at TSS and all sites as indicated.
The sequence logos and statistics associated with this analysis were generated using the MEME algorithm. Presented motifs are ranked by occurrence (top 3). MEME-ChIP E-values are displayed.
a- Principle of the ThT and NMM G4 determination. b- Cumulative percentage of regions validated in FRET-MC above a given G4Hunter threshold among the sequences selected from G4access. c- Experimental fluorescence for NMM experiments. The G4 threshold is indicated at 125 (a.u.). d- Experimental fluorescence for ThT experiments. The G4 threshold is indicated at 200 (a.u.).
Extended Data Fig. 6 G4access measures G-quadruplex dynamics in response to cell treatments with a G4 ligand.
a- Genome browser view illustrating the effect of pyridostatin (PDS, 10 µM for 30 min) on G4access peak dynamics in Raji cells (chr1: 203.500.000-205.500.000). The zoomed area shows the ATP2B4 promoter, in which the main G4access signal redistributes from strong to weak G4FS. b- DESeq analysis of G4access signal following 30 min of PDS treatment. The promoter-proximal (TSS) and non-promoter G4s are indicated in red and blue, respectively (DESeq, p-value < 0.05). c- G4access score density is shifted toward weaker G4s following PDS treatment. d- G4seq score density is shifted toward weaker G4s following PDS treatment, although to a lesser extent than for G4access.
Extended Data Fig. 7 Nucleosome and Pol II features at G4access peaks, with or without strong G4 predictions.
a- Average profiles of G4access regions depending on their nucleosome depletion level (relates to Fig. 4). Metaprofiles of MNase-seq (nucleosome midpoints), G4access and Pol II ChIP-seq centered on G4access summits in the 4 groups defined in Fig. 4c. The corresponding signals for the H3K4me3 and H3K4me1 ChIP-seq in Raji cells are also shown (right panels); a relatively high H3K4me3/me1 ratio is indicative of a promoter feature, as seen for group 1 and, to a lesser extent, group 2. b- Features of G4access signals below the G4 formation threshold. G4access signals were selected above (G4Hunter > 1.2; n = 9047 regions) or below (<1.0; n = 3492 regions) the threshold for G4 formation in all genomic locations and analyzed for nucleosome positioning/density, G4access signals and Pol II loading. G4-forming sequences are strongly associated with nucleosome depletion and positioning. c- G4Hunter prediction scores in nucleosome-depleted regions (NDRs, see Methods) associated or not with G4access peaks. A random selection of genomic areas of the same size is indicated in light gray. While the distributions of scores at G4access-associated NDRs differ highly significantly from random selections (two-sided Wilcoxon test, p-value < 2.2 × 10⁻¹⁶), the distributions of G4Hunter scores at other NDRs are not significantly different from random selections.
a- G4Hunter (G4H1.2) and chromatin landscape (ATAC-seq and MNase-seq density or positioning) profiling in K562 cells at sites with common or specific G4access and G4-ChIP peaks, as indicated. b- G4Hunter (G4H1.2) and chromatin landscape (ATAC-seq) profiling in HaCaT cells. Groups were defined as in Extended Data Fig. 2c, and the genomic datasets used are listed in Supplementary Table 1.
Extended Data Fig. 9 G4access dynamics in response to nucleosome perturbation by the HDAC inhibitor TSA.
a- TSA treatment for 24 hours leads to an increase in H3K9 acetylation. Western blots of VCP and total H3 (loading controls) and of H3K9ac in 3 independent replicates are shown. b- Representative examples of G4access decrease associated with NDR closure at the MFSD2A promoter (chr1: 40.418.000-40.424.000) and the chr16: 19.503.827-19.506.304 genomic region. c- TSA treatment for 24 hours leads to a global decrease in chromatin accessibility at NDRs, associated with a decrease in G4access signal. Metaprofiles of G4access (left) and MNase-seq density and positioning (right) are shown at all TSSs (top) and non-TSS (bottom) sites.
Extended Data Fig. 10 Application of the G4access procedure in organisms with less genomic G4 densities.
a- Comparison of G4Hunter prediction frequencies per kb (upper table) and densities (lower table and graph in the right panel) in 3 distinct organisms (human, D. melanogaster and S. cerevisiae). b- G4 prediction scores in G4access and in an equivalent selection of random DNA fragments in the 3 organisms. c- Motif search (MEME) at promoter and non-promoter sites, ranked by occurrence, in flies and yeast. MEME-ChIP E-values are displayed. d- Distribution of the G4 subtypes in G4access peaks in flies and yeast, as in Fig. 1h. In yeast, the majority of G4access peaks are non-G4-forming sequences. e- Examples of G4access, ATAC-seq and Pol II ChIP-seq signals in Drosophila (chr3L: 18.755.000-18.772.500) and yeast (chrIV: 766.800-771.500). The isolated peaks for G4access and ATAC, and the G4H1.2 annotations, are indicated below the signal tracks.
Cite this article
Esnault, C., Magat, T., Zine El Aabidine, A. et al. G4access identifies G-quadruplexes and their associations with open chromatin and imprinting control regions. Nat Genet 55, 1359–1369 (2023). https://doi.org/10.1038/s41588-023-01437-4