BANP opens chromatin and activates CpG-island-regulated genes

Grand, Ralph S.; Burger, Lukas; Gräwe, Cathrin; Michael, Alicia K.; Isbel, Luke; Hess, Daniel; Hoerner, Leslie; Iesmantavicius, Vytautas; Durdu, Sevi; Pregnolato, Marco; Krebs, Arnaud R.; Smallwood, Sébastien A.; Thomä, Nicolas; Vermeulen, Michiel; Schübeler, Dirk

doi:10.1038/s41586-021-03689-8

Article
Published: 07 July 2021

Nature volume 596, pages 133–137 (2021)

Abstract

The majority of gene transcripts generated by RNA polymerase II in mammalian genomes initiate at CpG island (CGI) promoters^1,2, yet our understanding of their regulation remains limited. This is in part due to the incomplete information that we have on transcription factors, their DNA-binding motifs and which genomic binding sites are functional in any given cell type^3,4,5. In addition, there are orphan motifs without known binders, such as the CGCG element, which is associated with highly expressed genes across human tissues and enriched near the transcription start site of a subset of CGI promoters^6,7,8. Here we combine single-molecule footprinting with interaction proteomics to identify BTG3-associated nuclear protein (BANP) as the transcription factor that binds this element in the mouse and human genome. We show that BANP is a strong CGI activator that controls essential metabolic genes in pluripotent stem and terminally differentiated neuronal cells. BANP binding is repelled by DNA methylation of its motif in vitro and in vivo, which epigenetically restricts most binding to CGIs and accounts for differential binding at aberrantly methylated CGI promoters in cancer cells. Upon binding to an unmethylated motif, BANP opens chromatin and phases nucleosomes. These findings establish BANP as a critical activator of a set of essential genes and suggest a model in which the activity of CGI promoters relies on methylation-sensitive transcription factors that are capable of chromatin opening.

**Fig. 1: BANP binds the CGCG element in vivo.**

**Fig. 2: BANP is sensitive to DNA methylation.**

**Fig. 3: BANP drives the expression of essential genes.**

**Fig. 4: BANP opens chromatin at CGIs.**

Data availability

Next-generation sequencing data have been deposited at the Gene Expression Omnibus with accession number GSE155604. Mass spectrometry data have been deposited at the ProteomeXchange Consortium through the PRIDE partner repository with the identifier PXD024794.

References

Mohn, F. & Schübeler, D. Genetics and epigenetics: stability and plasticity during cellular differentiation. Trends Genet. 25, 129–136 (2009).
Deaton, A. M. & Bird, A. CpG islands and the regulation of transcription. Genes Dev. 25, 1010–1022 (2011).
Lambert, S. A. et al. The human transcription factors. Cell 172, 650–665 (2018).
Vaquerizas, J. M., Kummerfeld, S. K., Teichmann, S. A. & Luscombe, N. M. A census of human transcription factors: function, expression and evolution. Nat. Rev. Genet. 10, 252–263 (2009).
Slattery, M. et al. Absence of a simple code: how transcription factors read the genome. Trends Biochem. Sci. 39, 381–399 (2014).
Ernst, J. et al. Genome-scale high-resolution mapping of activating and repressive nucleotides in regulatory regions. Nat. Biotechnol. 34, 1180–1190 (2016).
FitzGerald, P. C., Shlyakhtenko, A., Mir, A. A. & Vinson, C. Clustering of DNA sequences in human promoters. Genome Res. 14, 1562–1574 (2004).
Pique-Regi, R. et al. Accurate inference of transcription factor binding from DNA sequence and chromatin accessibility data. Genome Res. 21, 447–455 (2011).
Yang, J. G., Madrid, T. S., Sevastopoulos, E. & Narlikar, G. J. The chromatin-remodeling enzyme ACF is an ATP-dependent DNA length sensor that regulates nucleosome spacing. Nat. Struct. Mol. Biol. 13, 1078–1083 (2006).
Lienert, F. et al. Identification of genetic elements that autonomously determine DNA methylation states. Nat. Genet. 43, 1091–1097 (2011).
Pardo, C. E., Darst, R. P., Nabilsi, N. H., Delmas, A. L. & Kladde, M. P. Simultaneous single-molecule mapping of protein–DNA interactions and DNA methylation by MAPit. Curr. Protoc. Mol. Biol. 95, 21.22.1–21.22.18 (2011).
Sönmezer, C. et al. Molecular co-occupancy identifies transcription factor binding cooperativity in vivo. Mol. Cell 81, 255–267 (2021).
Makowski, M. M. et al. Global profiling of protein–DNA and protein-nucleosome binding affinities using quantitative mass spectrometry. Nat. Commun. 9, 1653 (2018).
Saksouk, N. et al. Redundant mechanisms to form silent chromatin at pericentromeric regions rely on BEND3 and DNA methylation. Mol. Cell 56, 580–594 (2014).
Dai, Q. et al. The BEN domain is a novel sequence-specific DNA-binding domain conserved in neural transcriptional repressors. Genes Dev. 27, 602–614 (2013).
Dai, Q. et al. Common and distinct DNA-binding and regulatory activities of the BEN-solo transcription factor family. Genes Dev. 29, 48–62 (2015).
Khan, A. & Prasanth, S. G. BEND3 mediates transcriptional repression and heterochromatin organization. Transcription 6, 102–105 (2015).
Sathyan, K. M., Shen, Z., Tripathi, V., Prasanth, K. V. & Prasanth, S. G. A BEN-domain-containing protein associates with heterochromatin and represses transcription. J. Cell Sci. 124, 3149–3163 (2011).
Rampalli, S., Pavithra, L., Bhatt, A., Kundu, T. K. & Chattopadhyay, S. Tumor suppressor SMAR1 mediates cyclin D1 repression by recruitment of the SIN3/histone deacetylase 1 complex. Mol. Cell. Biol. 25, 8415–8429 (2005).
Sreenath, K. et al. Nuclear matrix protein SMAR1 represses HIV-1 LTR mediated transcription through chromatin remodeling. Virology 400, 76–85 (2010).
Domcke, S. et al. Competition between DNA methylation and transcription factors determines binding of NRF1. Nature 528, 575–579 (2015).
Baylin, S. B. & Jones, P. A. Epigenetic determinants of cancer. Cold Spring Harb. Perspect. Biol. 8, a019505 (2016).
Berman, B. P. et al. Regions of focal DNA hypermethylation and long-range hypomethylation in colorectal cancer coincide with nuclear lamina-associated domains. Nat. Genet. 44, 40–46 (2011).
Mahpour, A., Scruggs, B. S., Smiraglia, D., Ouchi, T. & Gelman, I. H. A methyl-sensitive element induces bidirectional transcription in TATA-less CpG island-associated promoters. PLoS ONE 13, e0205608 (2018).
Wang, T. et al. Identification and characterization of essential genes in the human genome. Science 350, 1096–1101 (2015).
McDonald, E. R., III et al. Project DRIVE: a compendium of cancer dependencies and synthetic lethal relationships uncovered by large-scale, deep RNAi screening. Cell 170, 577–592 (2017).
Nabet, B. et al. The dTAG system for immediate and target-specific protein degradation. Nat. Chem. Biol. 14, 431–441 (2018).
Muhar, M. et al. SLAM-seq defines direct gene-regulatory functions of the BRD4–MYC axis. Science 360, 800–805 (2018).
Gaidatzis, D., Burger, L., Florescu, M. & Stadler, M. B. Analysis of intronic and exonic reads in RNA-seq data characterizes transcriptional and post-transcriptional regulation. Nat. Biotechnol. 33, 722–729 (2015).
Dahlet, T. et al. Genome-wide analysis in the mouse embryo reveals the importance of DNA methylation for transcription integrity. Nat. Commun. 11, 3153 (2020).
Thoma, E. C. et al. Ectopic expression of neurogenin 2 alone is sufficient to induce differentiation of embryonic stem cells into mature neurons. PLoS ONE 7, e38651 (2012).
Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr. Protoc. Mol. Biol. 109, 21.29.1–21.29.9 (2015).
Fu, Y., Sinha, M., Peterson, C. L. & Weng, Z. The insulator binding protein CTCF positions 20 nucleosomes around its binding sites across the human genome. PLoS Genet. 4, e1000138 (2008).
Vierstra, J. et al. Global reference mapping of human transcription factor footprints. Nature 583, 729–736 (2020).
Wilson, B. C. et al. Intellectual disability-associated factor Zbtb11 cooperates with NRF-2/GABP to control mitochondrial function. Nat. Commun. 11, 5469 (2020).
Stielow, B. et al. The SAM domain-containing protein 1 (SAMD1) acts as a repressive chromatin regulator at unmethylated CpG islands. Sci. Adv. 7, eabf2229 (2021).
Weber, M. et al. Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nat. Genet. 39, 457–466 (2007).
Iurlaro, M. et al. Mammalian SWI/SNF continuously restores local accessibility to chromatin. Nat. Genet. 53, 279–287 (2021).
Schick, S. et al. Acute BAF perturbation causes immediate changes in chromatin accessibility. Nat. Genet. 53, 269–278 (2021).
Hartl, D. et al. CG dinucleotides enhance promoter activity independent of DNA methylation. Genome Res. 29, 554–563 (2019).
Mohn, F. et al. Lineage-specific polycomb targets and de novo DNA methylation define restriction and potential of neuronal progenitors. Mol. Cell 30, 755–766 (2008).
Zhang, Y. et al. Rapid single-step induction of functional neurons from human pluripotent stem cells. Neuron 78, 785–798 (2013).
Lowary, P. T. & Widom, J. New DNA sequence rules for high affinity binding to histone octamer and sequence-directed nucleosome positioning. J. Mol. Biol. 276, 19–42 (1998).
Khan, A. et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. 46, D260–D266 (2018).
Feng, Y. Q. et al. Site-specific chromosomal integration in mammalian cells: highly efficient CRE recombinase-mediated cassette exchange. J. Mol. Biol. 292, 779–785 (1999).
Gaidatzis, D., Lerch, A., Hahne, F. & Stadler, M. B. QuasR: quantification and annotation of short reads in R. Bioinformatics 31, 1130–1132 (2015).
Ostapcuk, V. et al. Activity-dependent neuroprotective protein recruits HP1 and CHD4 to control lineage-specifying genes. Nature 557, 739–743 (2018).
Cox, J. et al. Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805 (2011).
Cox, J. et al. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol. Cell. Proteomics 13, 2513–2526 (2014).
Hubner, N. C. et al. Quantitative proteomics combined with BAC TransgeneOmics reveals in vivo protein interactions. J. Cell Biol. 189, 739–754 (2010).
Tyanova, S. et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat. Methods 13, 731–740 (2016).
Tusher, V. G., Tibshirani, R. & Chu, G. Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl Acad. Sci. USA 98, 5116–5121 (2001).
Wang, Y. et al. Reversed-phase chromatography with multiple fraction concatenation strategy for proteome profiling of human MCF10A cells. Proteomics 11, 2019–2026 (2011).
Perez-Riverol, Y. et al. The PRIDE database and related tools and resources in 2019: improving support for quantification data. Nucleic Acids Res. 47, D442–D450 (2019).
Gräwe, C., Makowski, M. M. & Vermeulen, M. PAQMAN: protein-nucleic acid affinity quantification by mass spectrometry in nuclear extracts. Methods 184, 70–77 (2020).
Corces, M. R. et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).
Cui, K. & Zhao, K. Genome-wide approaches to determining nucleosome occupancy in metazoans using MNase-seq. Methods Mol. Biol. 833, 413–419 (2012).
Gaidatzis, D. et al. DNA sequence explains seemingly disordered methylation levels in partially methylated domains of mammalian genomes. PLoS Genet. 10, e1004143 (2014).
Barisic, D., Stadler, M. B., Iurlaro, M. & Schübeler, D. Mammalian ISWI and SWI/SNF selectively mediate binding of distinct transcription factors. Nature 569, 136–140 (2019).
Abdulrahman, W. et al. A set of baculovirus transfer vectors for screening of affinity tags and parallel expression strategies. Anal. Biochem. 385, 383–385 (2009).
Huber, W. et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods 12, 115–121 (2015).
Lawrence, M., Gentleman, R. & Carey, V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics 25, 1841–1842 (2009).
Ginno, P. A. et al. A genome-scale map of DNA methylation turnover identifies site-specific dependencies of DNMT and TET activity. Nat. Commun. 11, 2680 (2020).
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Héberlé, É. & Bardet, A. F. Sensitivity of transcription factors to DNA methylation. Essays Biochem. 63, 727–741 (2019).
Buck-Koehntop, B. A. et al. Molecular basis for recognition of methylated and specific DNA sequences by the zinc finger protein Kaiso. Proc. Natl Acad. Sci. USA 109, 15229–15234 (2012).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Arnold, P. et al. Modeling of epigenome dynamics identifies transcription factors that mediate Polycomb targeting. Genome Res. 23, 60–73 (2013).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
Neumann, T. et al. Quantification of experimentally induced nucleotide conversions in high-throughput sequencing datasets. BMC Bioinformatics 20, 258 (2019).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. J. 17, 10 (2011).
Stadler, M. B. et al. DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature 480, 490–495 (2011).
Blattler, A. et al. Global loss of DNA methylation uncovers intronic enhancers in genes showing expression changes. Genome Biol. 15, 469 (2014).
Xuan Lin, Q. X. et al. MethMotif: an integrative cell specific database of transcription factor binding motifs coupled with DNA methylation profiles. Nucleic Acids Res. 47, D145–D154 (2019).
Hon, G. C. et al. Global DNA hypomethylation coupled to repressive chromatin domain formation and gene silencing in breast cancer. Genome Res. 22, 246–258 (2012).
The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Venables, W. N. & Ripley, B. D. Modern Applied Statistics with S 4th edn (Springer, 2002).
Hahne, F. & Ivanek, R. Visualizing genomic data using Gviz and Bioconductor. Methods Mol. Biol. 1418, 335–351 (2016).
Alexa, A., Rahnenführer, J. & Lengauer, T. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics 22, 1600–1607 (2006).
Lawrence, M. et al. Software for computing and annotating genomic ranges. PLoS Comput. Biol. 9, e1003118 (2013).
Fenouil, R. et al. CpG islands and GC content dictate nucleosome depletion in a transcription-independent manner at mammalian promoters. Genome Res. 22, 2399–2408 (2012).

Acknowledgements

We thank P. Papasaikas for help with the SLAM-seq analysis; M. Frederiksen and N. Leroy from the Novartis Institutes of Biomedical Research for providing the dTAG13 compound; and M. Lorincz and members of the D.S. laboratory for critical feedback on the manuscript. D.S. and N.T. acknowledge support from the Novartis Research Foundation, the Swiss National Science Foundation (310030B_176394 to D.S. and 31003A_179541 to N.T.) and the European Research Council under the European Union’s (EU) Horizon 2020 research and innovation programme grant agreements (ReadMe-667951 and DNAaccess-884664 to D.S. and CsnCRL-666068 and NucEM-884331 to N.T.). M.V. is part of the Oncode Institute, which is partly funded by the Dutch Cancer Society. R.S.G., A.K.M. and S.D. acknowledge EMBO Long-Term Fellowships. R.S.G. and L.I. acknowledge the EU Horizon 2020 Research and Innovation Program under the Marie Sklodowska-Curie grant (705354 to R.S.G. and 748760 to L.I.). A.K.M. acknowledges the Human Frontier Science Program. L.I. acknowledges the National Health and Medical Research Council CJ Martin Fellowship APP1148380. A.R.K. acknowledges support from the European Molecular Biology Laboratory, Deutsche Forschungsgemeinschaft (KR 5247/1-1) and a Swiss National Fund Ambizione grant (PZOOP3_161493).

Author information

Arnaud R. Krebs
Present address: Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
These authors contributed equally: Ralph S. Grand, Lukas Burger

Authors and Affiliations

Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
Ralph S. Grand, Lukas Burger, Alicia K. Michael, Luke Isbel, Daniel Hess, Leslie Hoerner, Vytautas Iesmantavicius, Sevi Durdu, Marco Pregnolato, Arnaud R. Krebs, Sébastien A. Smallwood, Nicolas Thomä & Dirk Schübeler
Swiss Institute of Bioinformatics, Basel, Switzerland
Lukas Burger
Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, The Netherlands
Cathrin Gräwe & Michiel Vermeulen
School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
Luke Isbel
Faculty of Science, University of Basel, Basel, Switzerland
Marco Pregnolato & Dirk Schübeler

Contributions

R.S.G., L.B. and D.S. conceived and planned the experiments. R.S.G. performed all experiments, performed SMF analysis and contributed to initial data analysis. L.B. performed comprehensive computational data analysis. C.G. and M.V. validated the affinity purification and performed PAQMAN analysis. A.K.M. performed and N.T. supervised protein purification and biochemistry assays. L.I. assisted with genomics and biochemistry assays. D.H. and V.I. performed mass spectrometry quantification and initial data processing. L.H. assisted with western blots, MNase-seq and cell line maintenance. S.D. performed and analysed the immunofluorescence experiments. M.P. and A.R.K. assisted in the establishment of the SMF method. S.A.S. advised on and oversaw the generation of next-generation-sequencing data. D.S. supervised the project. R.S.G., L.B. and D.S. interpreted the results and wrote the manuscript.

Corresponding author

Correspondence to Dirk Schübeler.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Eric Mendenhall and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 BANP binds the orphan CGCG element in the mouse genome.

a, Footprint created by REST bound to its motif (red) and CpG methylation around the bound motif (black). Motif indicated by grey rectangle in the middle. b, No footprint over the scrambled REST motif (red) and corresponding CpG methylation at this inserted construct (black). Individual biological replicates are shown (n = 2). Red line is the mean. c, ChIP–seq read counts for REST at two genomic loci with a REST motif (top and middle) and one without (bottom). SMF amplicon indicated in blue, REST motif in grey¹². d, Footprinting of the corresponding loci in c. A footprint of around 30 bp was detected over the REST motif (top and middle) compared to a site without a motif. The transcription factor footprint is distinguishable from the neighbouring nucleosome footprint by size—around 30 bp compared to around 150 bp in width. Individual biological replicates are shown (n = 4). Red line is the mean. e, f, Same as Fig. 1b but including CpG methylation. Individual biological replicates are shown (n = 2). Red line is the mean. g, Quantitative mass spectrometry (PAQMAN) determines affinity of BANP for the CGCG element to be around 18.5 nM. Binding curves were generated by fitting the parameters of the Hill equation to determine the relative equilibrium dissociation constant (K_d^app). Data are the mean of three experiments (n = 3), error bars represent standard error of the mean. h, Reproducibility of enrichments at peaks (n = 1302) for three independent BANP ChIP–seq replicates (R1–R3) from wild-type mouse ES cells. Pearson correlation coefficients are indicated. i, Top motif found by de novo motif search in the top 500 peaks of each ChIP–seq replicate (R1–R3) using HOMER. j, 6mer enrichments as measured by Pearson residuals (Methods) at the top 500 peaks inside CpG islands (CGIs) and outside of CGIs (nonCGI). The five sub-6mers of the highest-scoring motif in i (TCTCGCGAGA), TCTCGC, CTCGCG, TCGCGA, CGCGAG and GCGAGA are marked in red. 
R1–R3 indicate biological replicates. k, BANP motif instances of varying motif scores were predicted genome-wide (Methods) and the fraction of predicted motifs that overlap common peaks (peaks identified in all three replicates, Methods) was determined for equally spaced bins of motif scores (missing bins do not contain any predicted sites). The chosen cut-off of 12.5 is indicated by a dashed line. l, The fraction of common BANP peaks in varying bins of BANP enrichment that contain a predicted BANP motif using the cut-off of 12.5 defined in k. m, Reproducibility of BANP enrichments at predicted BANP motifs (n = 1207) as defined in k. Pearson correlation coefficients are indicated. R1–R3 indicate biological replicates. n, GO enrichment of genes that contain a bound BANP motif. The top 30 most significant GO categories are shown.
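
The Hill fit mentioned in panel g can be illustrated with a minimal Python sketch. It assumes the standard Hill form, fraction bound = c^n / ((K_d^app)^n + c^n). The function name `hill_fraction_bound`, the default Hill coefficient of 1 and the example concentrations are illustrative assumptions and are not taken from the PAQMAN analysis; only the approximate K_d^app of 18.5 nM comes from the legend.

```python
def hill_fraction_bound(conc_nm, kd_app_nm, hill_coeff=1.0):
    """Fraction of DNA probe bound at a given protein concentration.

    Standard Hill form (an assumption about the exact parametrization):
        bound = c^n / (Kd^n + c^n)
    """
    if conc_nm < 0 or kd_app_nm <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    if conc_nm == 0:
        return 0.0
    ratio = (conc_nm / kd_app_nm) ** hill_coeff
    return ratio / (1.0 + ratio)


# Approximate apparent Kd quoted in panel g (nM).
KD_APP_NM = 18.5

# Hypothetical titration points (nM), chosen only for illustration.
for conc in (1.0, 18.5, 200.0):
    bound = hill_fraction_bound(conc, KD_APP_NM)
    print(f"{conc:6.1f} nM -> fraction bound {bound:.3f}")
```

At a concentration equal to K_d^app the predicted fraction bound is 0.5 for any Hill coefficient, which is why K_d^app serves as a compact summary of affinity.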

Extended Data Fig. 2 BANP is methylation-sensitive in mouse and human cells.

a, BANP binding versus percentage methylation of the CpGs in the BANP motif. n indicates the number of motifs per bin. R indicates Pearson correlation coefficient. Black lines correspond to median, boxes to first and third quartile and whiskers to the maximum and minimum values of the distribution after removal of outliers, in which outliers are defined as more than 1.5 × (interquartile range) away from the box (Methods). b, Fraction of variance in BANP binding explained by a linear model that incorporates either motif score or methylation of the motif, or both (Methods). c, True versus predicted BANP ChIP–seq enrichments at predicted motifs (as defined in Extended Data Fig. 1k) for a linear model that uses motif score (left), methylation (middle) or both motif score and methylation (right) as predictors. Fraction of variance explained is indicated as R². d, PAQMAN using a methylated (Meth.) BANP motif reduces affinity by more than 16-fold compared to an unmethylated (unmeth.) motif. Binding curves were generated by fitting the parameters of the Hill equation to determine the relative equilibrium dissociation constant (K_d^app). Data are the mean of three experiments (n = 3), error bars represent standard error of the mean. e, Top motif found by HOMER in the top 500 peaks of each replicate (R1–R3) in DNMT TKO cells. f, Reproducibility of changes in BANP binding in DNMT TKO versus wild-type ES cells at predicted BANP motifs (n = 1,207). Pearson correlation coefficient is indicated. R1–R3 indicate biological replicates. g, Change in BANP binding in DNMT TKO versus wild-type ES cells compared to the methylation level of the motif (WGBS) in wild-type ES cells at predicted BANP motifs (n = 1,207). h, Distribution of motif scores as a function of change in BANP binding in DNMT TKO versus wild-type ES cells. Box plots as in a. Notches extend to ± 1.58 × (interquartile range/sqrt(n)). 
i, Single-locus examples of BANP binding in wild-type ES cells and DNMT TKO cells at promoters with a methylated BANP motif. Methylation of the CpGs in the motif is indicated by the colour of the circles above the motif. Colour range from white (0% methylation) to black (100% methylation). For Tex13b, the circle represents the average methylation of both CpGs as the coverage was too low to quantify each CpG separately. j, Expression changes versus changes in BANP binding between wild-type ES cells and DNMT TKO cells at genes with a predicted BANP motif in their promoter. For the definition of gene–motif pairings, see Methods. k, Superose 6 increase 10/300 GL size exclusion chromatography profile of full-length (FL) BANP protein. Peak fractions were analysed by SDS–PAGE and stained by Coomassie (inset) showing protein size and high purity. l, Electrophoretic mobility shift assay of full-length BANP binding to the unmethylated (left), methylated (middle) or scrambled (right) BANP motif (n = 2 replicates). m, Reproducibility of changes in BANP binding at peaks between the human cancer cell line HCT116 and HCC1954. R indicates the Pearson correlation coefficient. n, Top motif found by de novo motif search (HOMER) in the top 500 peaks of the first replicate of both cell types. Motifs found for the remaining replicates are very similar (data not shown). o, The fraction of peaks that contain a BANP motif as a function of peak strength in the first replicate of both cell types. Peaks were sorted by read counts and binned into groups of 250 peaks (each bar representing one group). Although we identified between around 14,000–24,000 peaks, only the top bins show a high fraction of peaks with motif (results very similar for the remaining replicates, data not shown). The additional peaks are likely to be false positives owing to an open chromatin bias in the ChIP–seq data as shown for HCT116 in p. 
p, BANP binding versus DNaseI in HCT116 cells in 1-kb tiling windows of chromosome 1. There is a global correlation of ChIP–seq and DNaseI signal, which probably explains the large number of peaks without a BANP motif. q, GO enrichment of genes that contain a bound BANP motif. The 30 most significant GO categories are shown and GO categories were grouped as in Extended Data Fig. 1n. Similarities to the mouse GO analysis are indicated by coloured bars and arrows (Extended Data Fig. 1n). r, Differential methylation versus differential binding at BANP motifs bound in at least one of the two cell types. Single locus displayed in Fig. 2d is circled. R indicates Pearson correlation coefficient. s, t, BANP binding and DNA methylation in HCT116 and HCC1954 cells at a BANP motif at a differentially methylated CGI shore (s) or in a CGI that lies in a partially methylated domain (t). Methylation of the CpGs in the motif is indicated by the colour of the circles above the motif. Colour range from white (0% methylation) to black (100% methylation). u, Negative correlation between methylation in the BANP motif and BANP binding at CGI promoter sites in several human cancer cell lines of different origin, using DNaseI hypersensitivity as an indicator of BANP binding. Both methylation and DNaseI are shown relative to the average level across all cell types. Only sites that are bound in at least one cell line are shown (Methods). Pearson correlation coefficients are indicated.
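
The box-plot conventions described in panels a and h (outliers beyond 1.5 × interquartile range from the box, whiskers at the extreme remaining values, notches at ± 1.58 × interquartile range/sqrt(n)) can be written out as a short Python sketch. The quartile interpolation rule (`statistics.quantiles` with `method="inclusive"`), the use of all n observations for the notch, the function name `box_plot_stats` and the example values are assumptions made for illustration; the plotting software used for the figures may compute quartiles slightly differently.

```python
import math
import statistics


def box_plot_stats(values):
    """Summarize one group of values using the legend's box-plot rules."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")

    # Quartiles; the interpolation method is an assumption.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Outliers lie more than 1.5 x IQR away from the box.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    kept = [v for v in data if lower_fence <= v <= upper_fence]

    # Notch half-width: 1.58 x IQR / sqrt(n), with n = all observations
    # (an assumption; the legend does not say whether outliers count).
    notch = 1.58 * iqr / math.sqrt(len(data))

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(kept),
        "whisker_high": max(kept),
        "notch_low": median - notch,
        "notch_high": median + notch,
        "n_outliers": len(data) - len(kept),
    }


# Made-up example values; 50 falls outside the upper fence.
example = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
for key, value in box_plot_stats(example).items():
    print(f"{key}: {value}")
```

In this made-up example the value 50 is treated as an outlier, so the upper whisker stops at 9 rather than extending to the maximum of the raw data.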

Extended Data Fig. 3 Inducible BANP depletion by targeted degradation allows the loss of function of this essential gene to be studied.

a, Activation of a luciferase reporter gene by one, two or three copies of the BANP motif after transient transfection into mouse ES cells. b, BANP lethality score from genome-wide CRISPR screens across more than 500 cell lines. A gene with a score below −0.5 is considered a common essential gene (Broad Institute (https://depmap.org/portal/)). Black lines correspond to median, boxes to first and third quartile and whiskers to the maximum and minimum values of the distribution. c, Full-sized western blot of BANP in wild-type and DNMT TKO cell lines before and after addition of the dTAG demonstrates the reduced level of BANP due to tagging, and the absence of protein following induced degradation by the addition of the dTAG13 compound to the medium (n = 3 replicates). Arrowhead on right indicates the target protein. d, The cell-cycle phase distribution determined by BrdU incorporation followed by flow cytometry analysis of wild-type and DNMT TKO cells before and after endogenous tagging of the Banp gene (n = 2 replicates). e, Immunofluorescence visualizes BANP degradation in mouse ES cells (n = 2 replicates). a.u., arbitrary units. f, Quantification of cell death in wild-type and TKO cells after inducing BANP depletion by the addition of the dTAG13 compound. Individual data points are shown and the bars represent the mean of three biological replicates. g, Level of s4U incorporation at different time points of a BANP degradation time course, in which D stands for the time of induced BANP degradation and T for the time of incorporation. Percentage refers to the fraction of Ts converted to Cs. Although cells untreated with s4U show very low percentages (wild-type untreated R1–R3), the percentages increase with increasing incorporation time. Only genes with at least a total count of 50 reads overlapping Ts in all replicates are shown. R1–R3 indicate biological replicates. 
Black lines correspond to median, boxes to first and third quartile and whiskers to the maximum and minimum values of the distribution after removal of outliers, in which outliers are defined as more than 1.5 × (interquartile range) away from the box. n = 13,801 genes. h, Change in gene expression level of BANP-bound and unbound genes between wild-type and BANP degron-tagged cells. There is no consistent global difference between BANP bound and unbound genes. Same y range as Fig. 3c for comparison. Box plots as in g. Notches extend to ± 1.58 × (interquartile range/sqrt(n)). i, Change in RNA across a BANP degradation time course showing the response of unbound genes. Same y range as Fig. 3c for comparison. Box plots as in g. j, Beeswarm plot of expression changes (log₂) after 6 h of dTAG treatment (versus untreated) for all genes with a bound BANP motif belonging to one of the groups of GO categories as defined in Extended Data Fig. 1n.
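
The box plot conventions stated in this legend (outliers more than 1.5 × interquartile range from the box, whiskers at the extremes of the remaining values, notches at ± 1.58 × interquartile range/sqrt(n)) can be written out as a short sketch. The quartile interpolation method and the use of all values for n are assumptions, as the legend does not specify them; `box_plot_summary` is a hypothetical helper name.

```python
import math
import statistics
from typing import Dict, Sequence


def box_plot_summary(values: Sequence[float]) -> Dict[str, float]:
    """Box plot statistics following the conventions described in the legend.

    Assumptions: quartiles use linear interpolation (method="inclusive"),
    and n in the notch formula counts all values, including outliers.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Values further than 1.5 x IQR from the box are treated as outliers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    kept = [v for v in data if low_fence <= v <= high_fence]
    # Notch half-width around the median.
    notch = 1.58 * iqr / math.sqrt(len(data))
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(kept),
        "whisker_high": max(kept),
        "notch_low": median - notch,
        "notch_high": median + notch,
    }
```

For the values 1, 2, 3, 4, 5 and 100, the value 100 lies beyond the upper fence, so the upper whisker ends at 5.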

Extended Data Fig. 4 Correlation heat maps for the wild-type ES cell RNA-seq time course.

a–f, Pearson correlations between samples for all quantifiable genes on exon level (a), intron level (b) and in SLAM-seq (c). d–f, Same as a–c but for all quantifiable genes with a BANP motif in their promoter. To remove correlations due to varying gene lengths, counts in all three measures were converted to log₂ RPKM values before determining the correlation coefficients. r1–r3 indicate biological replicates. Samples named as in Extended Data Fig. 3g. a–f illustrate high reproducibility between replicates. d–f indicate that changes on the transcriptional level at BANP target genes occur fast whereas changes on the mRNA level are delayed. Note that the correlation structure in SLAM-seq is also influenced by the varying incorporation times.
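
As an illustration of the length normalization mentioned above, the following sketch converts counts to log₂ RPKM and computes a Pearson correlation between two samples. The pseudocount is an assumption added to keep zero counts finite (the legend does not say how zeros were handled), and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def log2_rpkm(
    counts: Sequence[float],
    lengths_bp: Sequence[float],
    total_reads: float,
    pseudocount: float = 1.0,
) -> List[float]:
    """log2 of reads per kilobase per million mapped reads.

    RPKM = count * 1e9 / (feature length in bp * total mapped reads).
    Assumption: a pseudocount is added to the RPKM value before the log.
    """
    return [
        math.log2(c * 1e9 / (length * total_reads) + pseudocount)
        for c, length in zip(counts, lengths_bp)
    ]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)
```

Applying `pearson` to the `log2_rpkm` vectors of every pair of samples would yield one entry of a correlation heat map per pair.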

Extended Data Fig. 5 Reproducibility and comparison of exonic, intronic and SLAM-seq signal during a time course of BANP degradation.

Top three rows show reproducibility of log₂ changes relative to untreated for exonic, intronic and SLAM-seq, respectively. The first two replicates are shown in each case (R1 and R2). Bottom three rows show the same changes comparing SLAM-seq to intronic, intronic to exonic and SLAM-seq to exonic signal, respectively. Average of all replicates is shown. Comparisons indicate a high degree of similarity between intronic and SLAM-seq signal and a delayed exonic response. In all figures, Pearson correlations were calculated on all genes with a promoter that overlaps a bound BANP motif, highlighted in red.

Extended Data Fig. 6 The downregulation of BANP target genes is also detected at the protein level.

a, Pearson correlations of log₂ changes in protein levels at different time points relative to untreated for all quantifiable genes (left) or for all quantifiable genes with a BANP motif in their promoter (right). Reproducible changes can be clearly observed after 10 h. R1–R3 indicate biological replicates. b, Scatter plots showing the correlation between RNA (exonic) and protein levels across a BANP degradation time course (n = 8,128 genes). In all panels, Pearson correlations were calculated on all genes with a promoter that overlaps a bound BANP motif (n = 357 genes), shown in red. As the aim of these comparisons is to see how mRNA changes of BANP targets are reflected at the protein level, but BANP itself, which has a BANP motif in its promoter, has been degraded at the protein level, it was removed from the comparisons. Its protein level changes are shown in c (top). The 10-h time point is missing as it was not measured in RNA. c, Depletion of BANP (top) and an essential BANP target gene, TUBGCP5 (bottom), relative to untreated cells across the BANP degradation time course. d, Total proteome showing the downregulation of BANP target genes at the protein level over a BANP degradation time course. BANP itself was removed as in b. Black lines correspond to median, boxes to first and third quartile and whiskers to the maximum and minimum values of the distribution after removal of outliers, in which outliers are defined as more than 1.5 × (interquartile range) away from the box. Notches extend to ± 1.58 × (interquartile range/sqrt(n)). Bound: n = 357 genes, unbound: n = 7,861 genes. e, Western blot for the essential BANP target gene TUBGCP5 over a degradation time course (n = 2 replicates). Arrowhead on right indicates the target protein. f, Quantification of the TUBGCP5 protein level in e normalized to the loading control. Individual replicates are shown, and the bars represent the mean.

Extended Data Fig. 7 BANP regulates a similar set of genes in DNMT TKO cells and is necessary and sufficient to drive expression of TKO-specific bound genes.

a–d, Same as Extended Data Fig. 4a, b, d, e, but for a DNMT TKO RNA-seq time course. e, f, Same as top two rows in Extended Data Fig. 5, but for a DNMT TKO RNA-seq time course. g, Comparison of the RNA response (exonic level) to BANP removal in wild-type versus TKO cell lines. In all panels, Pearson correlations were calculated on all genes with a promoter that overlaps a bound (bound in either wild-type or DNMT TKO) BANP motif, which are shown in red. All annotated promoters were used in order not to bias the analysis towards promoters with Pol II signal in wild-type (Methods, ‘Annotations’). h, Expression changes (relative to wild-type cells) in the DNMT TKO degron cell line across a BANP degradation time course for the genes that gain binding and increase expression in DNMT TKO cells (Extended Data Fig. 2i, j). The three genes are inactivated in response to BANP removal in DNMT TKO cells, which is a combination of reduced BANP levels in the dTAG line (dTAG untreated) and induced degradation by the addition of the dTAG13 compound (dTAG 1–6 h). Initial expression levels of these genes in wild-type ES cells are below 0.1 RPKM and can thus be considered inactive. Bars show means of n = 6 for TKO, n = 2 for TKO dTAG 1h and otherwise n = 3 biological replicates. Individual replicates are shown as dots. Error bars denote ±1 standard deviation.

Extended Data Fig. 8 BANP binding in neurons is mostly conserved compared to ES cells but also shows cell-type-specific binding.

a, Immunofluorescence of mouse ES cells and derived neurons stained with Hoechst and calcein-AM (n = 3 replicates). b, Reproducibility of BANP enrichments at predicted BANP motifs (n = 1,207) as defined in Extended Data Fig. 1k. Pearson correlation coefficients are indicated. R1–R3 indicate biological replicates. UI, ES cells with uninduced Ngn2 construct. c, Scatter plot of the change in BANP binding from ES cells to neurons at predicted BANP-binding sites. R1 and R2 indicate biological replicates. d, Change in RNA compared to the change in BANP binding between mouse ES cells and neurons at predicted BANP motifs (n = 1,207). The Pearson correlation coefficient is indicated.

Extended Data Fig. 9 BANP-bound genes are rapidly downregulated in neurons.

a, Western blot of BANP in neurons demonstrates absence of protein following induced degradation by the addition of the dTAG13 compound to the medium (n = 3 replicates, replicate 1 shown). Arrowhead on right indicates the target protein. b, c, Same as Extended Data Fig. 4a, b, d, e, but for a neuron RNA-seq time course. Ey wt, wild-type ES cells. wtBANP, wild-type ES cells with the BANP dTAG. UI, uninduced, I, induced. D, dTAG-treated. d, e, Same as top two rows in Extended Data Fig. 5, but for a neuron RNA-seq time course. f, Scatter plots of the change in gene expression after BANP degradation in ES cells versus neurons. R indicates Pearson correlation coefficient. Bound BANP motifs are shown in red.

Extended Data Fig. 10 Open chromatin and phased nucleosomes around BANP-bound motifs in CGIs are linked to gene activity.

a, Hierarchically clustered correlation heat map of ATAC-seq signal at predicted BANP motifs across a BANP degradation time course. The main change in signal occurs already in the first hour of degradation. b, Average ATAC-seq profiles around bound BANP motifs across the time course, illustrating (as in a) that the main change occurs within the first hour. For each time point, there are two replicates shown in the same colour. Signal smoothed over 51 nt. c, Accessibility change relative to untreated at bound BANP motifs after removal of BANP. Same as inset in Fig. 4b, but for all time points. Dots represent individual replicates, bars the mean of the two replicates. d, Scatter plots of MNase-seq signal at predicted BANP motifs in untreated, 1-h-BANP-degraded (1 h) and 4-h-BANP-degraded (4 h) cells (n = 2 replicates combined per condition). The main change occurs within the first hour (see also e). The Pearson correlation coefficient is indicated. e, Scatter plot comparing changes in MNase-seq signal at predicted BANP motifs after 1 h and 4 h of BANP degradation relative to untreated, indicating little change from 1 h to 4 h. Pearson correlation coefficient indicated. f, Nucleosome phasing around the top 100 BANP or CTCF-bound motifs in CGI promoters. Profiles are oriented in the 5′ to 3′ direction to the corresponding genes. This highly organized chromatin is at odds with previous suggestions of low nucleosomal density at CGIs⁸⁴, which we speculate reflects inefficient amplification of GC-rich sequences in first-generation sequencing reagents. g, Changes in MNase-seq versus changes in ATAC-seq signal after 1 h of BANP degradation at BANP motifs. Sites that lose accessibility tend to gain nucleosome signal. Pearson correlation coefficient indicated. h, i, Changes in ATAC-seq (h) and MNase-seq (i) after 1 h of BANP degradation versus BANP binding strength. Loss in accessibility and gain in nucleosomal signal occur mostly at bound sites. 
Pearson correlation coefficient indicated. j–l, Change in expression at 6 h versus untreated (exonic) compared to BANP binding (j), change in accessibility (k) or nucleosomal signal (l) after 1 h of BANP degradation at predicted BANP motifs. For the definition of gene–motif pairings, see Methods. Exonic changes at 6 h were used as they are similar to intronic changes at 1 h (Extended Data Fig. 5), but allow for the quantification of a larger number of genes. Accessibility and expression changes are positively correlated, whereas nucleosomal signal and expression changes are negatively correlated. P values were determined via an approximate permutation test (two-sided, n = 458 in all cases, Methods). m, The changes in expression (RNA, P = 2.5 × 10⁻¹⁶, robust F test, two-sided), accessibility (ATAC-seq, P = 0.015, robust F test, two-sided) and nucleosome positioning (MNase-seq, P = 1.3 × 10⁻¹⁰, robust F test, two-sided) after removal of BANP increase significantly with increasing binding strength (Methods). Unbound, below twofold enriched (IP/IgG). Weak, log₂ enrichment (IP/IgG) between 1 and 4. Strong, log₂ enrichment larger than 4. Box plots as in Extended Data Fig. 6d. n, Linear model to predict changes in expression (exonic) after 6 h of BANP degradation versus untreated cells using BANP binding, ATAC-seq changes after 1 h of BANP degradation (ATAC-seq), MNase-seq changes after 1 h of BANP degradation (MNase-seq) and distance of the BANP motif to TSS, a binary variable that indicates whether the motif lies within 100 nt upstream of the TSS (Methods). Only bound motifs were used (log₂(IP/IgG) ≥ 1) and, to be able to cleanly assign motifs to genes, only genes with promoters that contained one bound motif and for which the motif did not overlap with any other promoter were used (n = 321). The models were evaluated via fivefold cross-validation (Methods). 
Left, fraction of variance explained using only BANP binding (binding), binding and distance to TSS (binding + TSSdist) or BANP binding, distance to TSS, MNase-seq and ATAC-seq signal (binding + TSSdist + chromatin). Coloured dots refer to the performance of each model in each partition of the cross-validation. Chromatin information increases the predictive power of the model, as is evident from the larger average fraction of explained variance of ‘Binding + TSSdist + chromatin’ (averaged over all five partitions) as well as the fact that ‘Binding + TSSdist + chromatin’ outperforms ‘Binding’ in all partitions and ‘Binding + TSSdist’ in 4 out of 5 partitions. Middle, inferred coefficients for the full model. Colours refer to the different partitions. Right, true expression changes versus predicted expression changes when using the average coefficients (averaged over all partitions) for prediction. In all panels, ATAC-seq, MNase-seq and ChIP–seq signal are quantified in a 201-bp window centred around the motif.
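
The evaluation scheme in panel n, a linear model scored by fivefold cross-validation, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation (see Methods): ordinary least squares stands in for the fitting procedure, the fraction of variance explained is taken as 1 − SS_res/SS_tot on each held-out partition, partitions are assigned by a seeded shuffle, and all function names are hypothetical.

```python
import random
from typing import List, Sequence


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares with intercept; returns [intercept, coef_1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented normal equations [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda k: abs(M[k][c]))
        M[c], M[pivot] = M[pivot], M[c]
        if abs(M[c][c]) < 1e-12:
            raise ValueError("design matrix is singular")
        for k in range(p):
            if k != c:
                factor = M[k][c] / M[c][c]
                M[k] = [a - factor * b for a, b in zip(M[k], M[c])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict(coefs: Sequence[float], X: Sequence[Sequence[float]]) -> List[float]:
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in X]


def explained_variance(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Assumed definition: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


def cross_validate(X, y, k: int = 5, seed: int = 0) -> List[float]:
    """Fit on k-1 partitions, score on the held-out one; one score per partition."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        coefs = fit_linear([X[i] for i in train], [y[i] for i in train])
        preds = predict(coefs, [X[i] for i in held_out])
        scores.append(explained_variance([y[i] for i in held_out], preds))
    return scores
```

Comparing the per-partition scores of models fitted on different feature subsets (for example binding alone versus binding plus chromatin features) mirrors the comparison shown in the left panel.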

Supplementary information

Supplementary Tables

This file contains Supplementary Tables 1–6. Supplementary Table 1: Antibodies and dilutions used in this study. Supplementary Table 2: Motif sequences used for RMCE insertion and footprinting. A footprintable GpC was added to both ends of each motif to maximize the ability to detect a footprint (red). Supplementary Table 3: Oligonucleotide sequences used for affinity purification. Supplementary Table 4: Oligonucleotides used for PAQMAN assay. Supplementary Table 5: Motif sequences used for RMCE insertion and luciferase assays. Supplementary Table 6: BANP degradation time and s4U incorporation time for the SLAM-seq time course.

Reporting Summary

Peer Review File

About this article

Cite this article

Grand, R.S., Burger, L., Gräwe, C. et al. BANP opens chromatin and activates CpG-island-regulated genes. Nature 596, 133–137 (2021). https://doi.org/10.1038/s41586-021-03689-8

Received: 22 August 2020
Accepted: 03 June 2021
Published: 07 July 2021
Issue Date: 05 August 2021
DOI: https://doi.org/10.1038/s41586-021-03689-8
