Polycomb-mediated repression of gene expression is essential for development, with a pivotal role played by trimethylation of histone H3 lysine 27 (H3K27me3), which is deposited by Polycomb Repressive Complex 2 (PRC2). The mechanism by which PRC2 is recruited to target genes has remained largely elusive, particularly in vertebrates. Here we demonstrate that MTF2, one of the three vertebrate homologs of Drosophila melanogaster Polycomblike, is a DNA-binding, methylation-sensitive PRC2 recruiter in mouse embryonic stem cells. MTF2 directly binds to DNA and is essential for recruitment of PRC2 both in vitro and in vivo. Genome-wide recruitment of the PRC2 catalytic subunit EZH2 is abrogated in Mtf2 knockout cells, resulting in greatly reduced H3K27me3 deposition. MTF2 selectively binds regions with a high density of unmethylated CpGs in a context of reduced helix twist, which distinguishes target from non-target CpG islands. These results demonstrate instructive recruitment of PRC2 to genomic targets by MTF2.
Hauri, S. et al. A high-density map for navigating the human Polycomb complexome. Cell Rep. 17, 583–595 (2016).
Di Croce, L. & Helin, K. Transcriptional regulation by Polycomb group proteins. Nat. Struct. Mol. Biol. 20, 1147–1155 (2013).
Margueron, R. & Reinberg, D. The Polycomb complex PRC2 and its mark in life. Nature 469, 343–349 (2011).
Comet, I., Riising, E. M., Leblanc, B. & Helin, K. Maintaining cell identity: PRC2-mediated regulation of transcription and cancer. Nat. Rev. Cancer 16, 803–810 (2016).
Simon, J. A. & Kingston, R. E. Occupying chromatin: Polycomb mechanisms for getting to genomic targets, stopping transcriptional traffic, and staying put. Mol. Cell 49, 808–824 (2013).
Blackledge, N. P., Rose, N. R. & Klose, R. J. Targeting Polycomb systems to regulate gene expression: modifications to a complex story. Nat. Rev. Mol. Cell Biol. 16, 643–649 (2015).
Brockdorff, N. Noncoding RNA and Polycomb recruitment. RNA 19, 429–442 (2013).
Bauer, M., Trupke, J. & Ringrose, L. The quest for mammalian Polycomb response elements: are we there yet? Chromosoma 125, 471–496 (2016).
Kassis, J. A. & Brown, J. L. Polycomb group response elements in Drosophila and vertebrates. Adv. Genet. 81, 83–118 (2013).
Ringrose, L., Rehmsmeier, M., Dura, J. M. & Paro, R. Genome-wide prediction of Polycomb/Trithorax response elements in Drosophila melanogaster. Dev. Cell 5, 759–771 (2003).
Grijzenhout, A. et al. Functional analysis of AEBP2, a PRC2 Polycomb protein, reveals a Trithorax phenotype in embryonic development and in ESCs. Development 143, 2716–2723 (2016).
Cao, R. et al. Role of hPHF1 in H3K27 methylation and Hox gene silencing. Mol. Cell. Biol. 28, 1862–1872 (2008).
Sarma, K., Margueron, R., Ivanov, A., Pirrotta, V. & Reinberg, D. Ezh2 requires PHF1 to efficiently catalyze H3 lysine 27 trimethylation in vivo. Mol. Cell. Biol. 28, 2718–2731 (2008).
Inouye, C., Remondelli, P., Karin, M. & Elledge, S. Isolation of a cDNA encoding a metal response element binding protein using a novel expression cloning procedure: the one hybrid system. DNA Cell Biol. 13, 731–742 (1994).
Li, X. et al. Mammalian polycomb-like Pcl2/Mtf2 is a novel regulatory component of PRC2 that can differentially modulate polycomb activity both at the Hox gene cluster and at Cdkn2a genes. Mol. Cell. Biol. 31, 351–364 (2011).
Casanova, M. et al. Polycomblike 2 facilitates the recruitment of PRC2 Polycomb group complexes to the inactive X chromosome and to target loci in embryonic stem cells. Development 138, 1471–1482 (2011).
Walker, E., Manias, J. L., Chang, W. Y. & Stanford, W. L. PCL2 modulates gene regulatory networks controlling self-renewal and commitment in embryonic stem cells. Cell Cycle 10, 45–51 (2011).
Cai, L. et al. An H3K36 methylation-engaging Tudor motif of polycomb-like proteins mediates PRC2 complex targeting. Mol. Cell 49, 571–582 (2013).
Musselman, C. A. et al. Molecular basis for H3K36me3 recognition by the Tudor domain of PHF1. Nat. Struct. Mol. Biol. 19, 1266–1272 (2012).
Brien, G. L. et al. Polycomb PHF19 binds H3K36me3 and recruits PRC2 and demethylase NO66 to embryonic stem cell genes during differentiation. Nat. Struct. Mol. Biol. 19, 1273–1281 (2012).
Ballaré, C. et al. Phf19 links methylated Lys36 of histone H3 to regulation of Polycomb activity. Nat. Struct. Mol. Biol. 19, 1257–1265 (2012).
Hunkapiller, J. et al. Polycomb-like 3 promotes polycomb repressive complex 2 binding to CpG islands and embryonic stem cell self-renewal. PLoS Genet. 8, e1002576 (2012).
Mendenhall, E. M. et al. GC-rich sequence elements recruit PRC2 in mammalian ES cells. PLoS Genet. 6, e1001244 (2010).
Li, H. et al. Polycomb-like proteins link the PRC2 complex to CpG islands. Nature 549, 287–291 (2017).
Choi, J. et al. DNA binding by PHF1 prolongs PRC2 residence time on chromatin and thereby promotes H3K27 methylation. Nat. Struct. Mol. Biol. 24, 1039–1047 (2017).
van Heeringen, S. J. et al. Principles of nucleation of H3K27 methylation during embryonic development. Genome Res. 24, 401–410 (2014).
Lee, D., Karchin, R. & Beer, M. A. Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 21, 2167–2180 (2011).
Marks, H. et al. The transcriptional and epigenomic foundations of ground state pluripotency. Cell 149, 590–604 (2012).
Ying, Q. L. et al. The ground state of embryonic stem cell self-renewal. Nature 453, 519–523 (2008).
Liu, X. et al. Distinct features of H3K4me3 and H3K27me3 chromatin domains in pre-implantation embryos. Nature 537, 558–562 (2016).
Kloet, S. L. et al. The dynamic interactome and genomic targets of Polycomb complexes during stem-cell differentiation. Nat. Struct. Mol. Biol. 23, 682–690 (2016).
Coulson, M., Robert, S., Eyre, H. J. & Saint, R. The identification and localization of a human gene with sequence similarity to Polycomblike of Drosophila melanogaster. Genomics 48, 381–383 (1998).
O’Connell, S. et al. Polycomblike PHD fingers mediate conserved interaction with enhancer of zeste protein. J. Biol. Chem. 276, 43065–43073 (2001).
Long, H. K. et al. Epigenetic conservation at gene regulatory elements revealed by non-methylated DNA profiling in seven vertebrates. Elife 2, e00348 (2013).
Landeira, D. et al. Jarid2 is a PRC2 component in embryonic stem cells required for multi-lineage differentiation and recruitment of PRC1 and RNA Polymerase II to developmental regulators. Nat. Cell Biol. 12, 618–624 (2010).
Schoeftner, S. et al. Recruitment of PRC1 function at the initiation of X inactivation independent of PRC2 and silencing. EMBO J. 25, 3110–3122 (2006).
van der Heijden, T., van Vugt, J. J., Logie, C. & van Noort, J. Sequence-based prediction of single nucleosome positioning and genome-wide nucleosome occupancy. Proc. Natl Acad. Sci. USA 109, E2514–E2522 (2012).
Mathelier, A. et al. DNA shape features improve transcription factor binding site predictions in vivo. Cell Syst. 3, 278–286.e274 (2016).
Yang, L. et al. Transcription factor family-specific DNA shape readout revealed by quantitative specificity models. Mol. Syst. Biol. 13, 910 (2017).
Vizán, P., Beringer, M., Ballaré, C. & Di Croce, L. Role of PRC2-associated factors in stem cells and disease. FEBS J. 282, 1723–1735 (2015).
Savla, U., Benes, J., Zhang, J. & Jones, R. S. Recruitment of Drosophila Polycomb-group proteins by Polycomblike, a component of a novel protein complex in larvae. Development 135, 813–817 (2008).
Simossis, V. A. & Heringa, J. The PRALINE online server: optimising progressive multiple alignment on the web. Comput. Biol. Chem. 27, 511–519 (2003).
Wiśniewski, J. R., Zougman, A., Nagaraj, N. & Mann, M. Universal sample preparation method for proteome analysis. Nat. Methods 6, 359–362 (2009).
Rappsilber, J., Mann, M. & Ishihama, Y. Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nat. Protoc. 2, 1896–1906 (2007).
Bogdanović, O. et al. Active DNA demethylation at enhancers during the vertebrate phylotypic period. Nat. Genet. 48, 417–426 (2016).
Cox, J. & Mann, M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372 (2008).
Smits, A. H., Jansen, P. W., Poser, I., Hyman, A. A. & Vermeulen, M. Stoichiometry of chromatin-associated protein complexes revealed by label-free quantitative mass spectrometry-based proteomics. Nucleic Acids Res. 41, e28 (2013).
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Shao, Z., Zhang, Y., Yuan, G. C., Orkin, S. H. & Waxman, D. J. MAnorm: a robust model for quantitative comparison of ChIP-Seq data sets. Genome Biol. 13, R16 (2012).
Georgiou, G. & van Heeringen, S. J. fluff: exploratory analysis and visualization of high-throughput sequencing data. PeerJ 4, e2209 (2016).
van Heeringen, S. J. & Veenstra, G. J. GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments. Bioinformatics 27, 270–271 (2011).
Chiu, T. P. et al. DNAshapeR: an R/Bioconductor package for DNA shape prediction and feature encoding. Bioinformatics 32, 1211–1213 (2016).
Zhou, T. et al. DNAshape: a method for the high-throughput prediction of DNA structural features on a genomic scale. Nucleic Acids Res. 41, W56–W62 (2013).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Vizcaíno, J. A. et al. 2016 update of the PRIDE database and its related tools. Nucleic Acids Res. 44, D447–D456 (2016).
We thank H. Koseki (RIKEN Research Center for Allergy and Immunology, Japan) for sharing the Mtf2GT/GT and Mtf2Δ/Δ mESC lines and C. Fisher (MRC Clinical Sciences Centre, Imperial College School of Medicine, UK) for sharing the Jarid2−/− mESCs. We are grateful to M. Makowski for help and discussion. We thank P. Jansen, S. Kloet, A. H. Smits and L. N. Nguyen for advice and technical support with mass spectrometry, E. Janssen-Megens for help with Illumina sequencing, S. Wardle for help with ChIP and G. Georgiou for help with Python scripting. This work has been financially supported by the People Program (Marie Curie Actions) of the European Union’s Seventh Framework Program FP7 under grant agreement number 607142 (DevCom). Research in the group of H.M. is supported by a grant from the Netherlands Organization for Scientific Research (NWO-VIDI 864.12.007).
The authors declare no competing financial interests.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Integrated Supplementary Information
a, DNA pulldown-mass spectrometry of 2i mESC nuclear extract using region 2 baits shows highly specific enrichment of PRC2 core subunits and associated proteins on the WT pulldown bait. Proteins on the right side of the plot are enriched in the wild-type bait dataset. Highlighted proteins are enriched in both region 1 (Fig. 1b) and region 2 pulldowns. Each condition was measured in three independent experiments and the FDR was calculated from a two-tailed t-test. b, DNA pulldown-western blot of 2i mESC nuclear extract with region 1 baits. Recruitment of PRC2 is inhibited by DNA methylation (mC) but not by RNase treatment. mC indicates methylated cytosines in the CG of the TGCGCAAA kmer on both strands. Similar results were obtained in two independent pulldowns. c, DNA pulldown-western blots of nuclear extract from mESC using baits containing the same kmer but different flanking regions (Supplementary Table 1), representing different genomic locations. Blots represent at least two independent pulldowns. d, DNA pulldown-western blot of recombinant myc-tagged MTF2, myc-tagged C17ORF96 and both recombinant proteins. Recombinant MTF2 binding is specific to the WT DNA bait and mirrors the EZH2 recruitment pattern. C17ORF96 does not bind DNA in either the presence or the absence of MTF2. The input lane represents 1% of the starting material. Blots represent three independent pulldowns. e, Coomassie staining of purified GST-MTF2 (left), and EMSA showing direct binding of GST-MTF2 to region 1 probes. Blots represent at least three independent pulldowns. Uncropped gels are available as Supplementary Information.
a, Whole-proteome mass spectrometry quantification (n = 2) in 2i vs. serum of PRC2 components enriched in 2i DNA pulldown. MTF2 and PRC2 core components show higher expression in 2i. Dotted lines represent two-fold change (FC). Dashed lines represent log2 FC of 1. The vertical line between the dots indicates the mean FC. b,c, DNA pulldown-mass spectrometry of serum-grown wild-type mESC nuclear extract using region 1 (b) and region 2 (c) baits, showing specific enrichment of PRC2 core proteins and associated proteins on the WT pulldown bait. Proteins on the right side of the plot are enriched in the wild-type bait dataset. Each condition was measured in three independent experiments. FDR from two-tailed t-test. d, Stoichiometry of proteins present in DNA pulldowns from serum-grown mESC. MTF2 is the only non-core protein consistently enriched with a stoichiometric ratio to EZH2 in serum-grown cells. Each condition was measured in three independent experiments. Dots represent means, error bars the standard deviation. e, DNA pulldown-western blot of serum mESC nuclear extract from the indicated cell lines using region 1 baits, confirming loss of PRC2 recruitment in Mtf2GT/GT mESC. This experiment was reproduced multiple times, three times by mass spectrometry (cf. panel b) and two times by western blot. f, Whole-proteome mass spectrometry quantification of the PRC2 components in (b,c) in wild-type and Mtf2GT/GT mESC (n = 2). Absence of MTF2 does not affect PRC2 abundance, but only its recruitment to DNA. Dashed lines represent log2 FC of 1. The vertical line between the dots indicates the mean FC. N.D. = not detected in Mtf2GT/GT mESC.
Supplementary Figure 3 MTF2 domains C-terminal to PHD2 are dispensable for PRC2 recruitment and not sufficient for MTF2–PRC2 interaction.
a, DNA pulldown of E14 (wild-type), Mtf2GT/GT and Mtf2Δ/Δ mESC with region 1 bait. PRC2 recruitment is lost in Mtf2GT/GT cells, while EZH2 from Mtf2Δ/Δ mESC shows normal recruitment. The blot is representative of three independent pulldown experiments. Uncropped gels are available as Supplementary Information. b,c, Interaction proteomics of rescue constructs in the Mtf2GT/GT background. MTF2 isoform 2 (b) binds PRC2, while the construct encoding only the two PHD domains (c) does not. Each condition was measured in three independent experiments. FDR from two-tailed t-test.
a, ChIP-qPCR of MTF2 in wild-type and Mtf2GT/GT mESC showing MTF2 antibody specificity. This ChIP was confirmed in 2i-grown cells. b, Scatterplot of RPKM values for high-confidence peaks for ChIP-seq replicates, showing high reproducibility of ChIP-seq samples.
a,b, Heatmap (left) and boxplot quantification (right) of RPKM-normalized ChIP-seq signal on EZH2 (a) or H3K27me3 (b) high-confidence peaks (n = 7,213 and n = 6,294, respectively). MTF2 is required for PRC2 recruitment to the majority of its targets. c, RPKM quantification of ChIP-seq signal on repetitive elements, as defined by the UCSC genome browser RepeatMasker track (n = 4,048,423 regions). d, Additional examples of ChIP-seq signal (reads per million) on peaks belonging to each of the clusters shown in Fig. 4a. Boxplots represent the interquartile range (IQR, box), the median (central bar), and 1.5 IQR (whiskers). The ChIP-seq experiments were performed in duplicate, except MTF2 in Eed−/−, where the second replicate was excluded because of low recovery and library quality.
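The RPKM normalization and the boxplot conventions used in this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `rpkm` and `boxplot_summary` are hypothetical, the quartile method (inclusive interpolation) is an assumption, and the whiskers are assumed to extend to the most extreme data points lying within 1.5 IQR of the box.

```python
from statistics import quantiles


def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads."""
    if region_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("region length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)


def boxplot_summary(values):
    """Return (lower whisker, Q1, median, Q3, upper whisker).

    Assumption: whiskers end at the most extreme observations that are
    still within 1.5 * IQR of the box edges.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    lower_whisker = min(v for v in data if v >= lower_fence)
    upper_whisker = max(v for v in data if v <= upper_fence)
    return lower_whisker, q1, median, q3, upper_whisker


# Hypothetical read counts on six 500 bp peaks in a library of 20 million reads.
signal = [rpkm(count, 500, 20_000_000) for count in (10, 20, 30, 40, 50, 1000)]
print(boxplot_summary(signal))
```

Dividing by both region length and library size is what makes signal comparable between peaks of different width and between samples sequenced to different depth.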
Data were mapped in the same way and RPKM normalized. a, Heatmaps are centered on the high-confidence MTF2 peaks identified in this study. Li et al. reported an analysis based on transcription start sites instead of peaks, and showed the Hoxd cluster as a track example (cf. panel b). b, Genome browser view showing an example of the tracks (ChIP signal as reads per million) on the Hoxd locus.
Supplementary Figure 7 Identification of additional bound k-mers and characterization of DNA shape within MTF2 peaks.
a, Performance of the kmer-SVM algorithm for the identification of MTF2 peaks compared to H3K27me3-negative unmethylated DNA, quantified as Receiver Operating Characteristic Area Under the Curve (ROC-AUC, perfect prediction AUC = 1, random AUC = 0.5). G + C percentage and CpG richness (k = 1, k = 2) have decent predictive power. The algorithm performance, however, steeply increases with longer kmers, with trinucleotide-based prediction closely approaching the performance obtained with k = 6 or k = 7 (n = 8,092 regions). b, Sequence logo of positive-scoring kmers (kmer-SVM of MTF2 peak summits, k = 8, weight > 1.5) containing four bases to the left of the CpG dinucleotide, aligned on the CpG. There is a preference for G or C in front of the CpG. c, Enrichment of high- and low-scoring kmers containing the GCG motif in MTF2 peak summits compared to unmethylated BioCap regions. d, DNA pulldown of EZH2 and MTF2 with bait region 5 containing the CGCCCGG and TGCGCGCG kmers (cf. panel c) that are also found in the MRE3-4 element of the Mt1 gene (ref. 14). Blots are representative of three independent pulldowns. Uncropped gels are available as Supplementary Information. e, DNA shape parameter distribution of MTF2 and BioCap regions (n = 6,357 and n = 48,247, respectively) aligned on the 5′ border of the peaks. Predicted DNA shape of MTF2-bound regions differs from the genomic context more than that of non-Polycomb-targeted CpG islands. f, CpG density of MTF2 peak summits (100 bp) and unmethylated (BioCap) or genomic regions (n = 24,309). Boxplots represent IQR, central bar the median, whiskers are 1.5 IQR.
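How a ROC-AUC such as the one in panel a is read, and how a CpG density as in panel f can be counted, may be sketched in Python as follows. This is a deliberately simplified stand-in and not the kmer-SVM used in the study: a single count (CpG occurrences) serves as the score, the AUC is computed directly as the probability that a bound region outscores a control region, and the sequences and function names are invented for illustration.

```python
def count_kmer(seq, kmer):
    """Count occurrences of kmer in seq, allowing overlaps (case-insensitive)."""
    seq, kmer = seq.upper(), kmer.upper()
    k = len(kmer)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer)


def roc_auc(pos_scores, neg_scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair, so a score that carries no
    information gives 0.5 and a perfect separation gives 1.0.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Toy sequences, hypothetical and far shorter than the 100 bp summit windows.
bound = ["GCGCGCGTCGCG", "ACGCGGCGCGTT"]
control = ["ATTACGTTAACG", "TTTTAAGCATTT"]

cpg_bound = [count_kmer(s, "CG") for s in bound]
cpg_control = [count_kmer(s, "CG") for s in control]
print("CpG counts:", cpg_bound, cpg_control)
print("ROC-AUC using CpG count as the score:", roc_auc(cpg_bound, cpg_control))
```

Replacing `"CG"` with a longer k-mer gives the corresponding single-feature score for that k-mer; an SVM combines many such counts with learned weights.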
a, Performance of shape-based prediction of MTF2 binding, using a Random Forest-based machine learning algorithm (Methods). The ROC-AUC values (> 0.7 for all parameters) show that it is possible to identify bound GCGs using the predicted DNA shape only (n = 58,903 GCGs). b, DNA shape value distribution of all the GCG trinucleotides occurring in MTF2 peak summits (colored areas) compared with sequence-composition-matched control regions obtained by randomly shuffling the sequences in the positive set while keeping constant both the number and the position of GCGs (grey areas). Dark shades represent 10 percentiles around the median, lighter shades the interquartile range (IQR). The shape of MTF2-bound GCGs (n = 58,903) is highly specific and depends on flanking regions. c,d, Prediction of the DNA shape of region 6 containing the TGAGCGCG kmer (c) and region 7 containing the CCCGGGCG kmer (d) used for pulldown in Fig. 5d. Bandplots represent the IQR of DNA shape parameters of top-scoring GCG-containing kmers shown in Fig. 5a (n = 2,017). Points represent DNA shape values of non-qualifying GCGs contained in the kmers and of the qualifying GCGs in the flanking regions. For clarity only the most relevant positions are shown. Number and position of GCG trinucleotides along the 30 bp baits are shown above the panels. Only positions with GCGs are shown for clarity. e, Comparison of the predicted shape of region 1, region 6 and region 7 with the original and reverse-complemented sequence used for the MTF2-DNA co-crystal (ref. 24). Bandplots represent the IQR of DNA shape parameters of top-scoring GCG-containing kmers shown in Fig. 5a. Points represent DNA shape values of qualifying GCGs shown in panels c,d, contained in region 1 and in the DNA crystallized with MTF2. For clarity only the most relevant positions are shown. Number and position of GCG trinucleotides along the 30 bp baits are shown above the panels. Only positions with GCGs are shown for clarity.
f, GCG density of MTF2 peak summits (100 bp) and unmethylated (BioCap) or genomic regions (all groups n = 8,092 regions). Boxplots represent IQR, central bar the median, whiskers are 1.5 IQR. g, Quantification of shape-qualifying GCGs for each shape parameter, showing a higher occurrence of properly shaped GCGs in MTF2-bound regions (all groups n = 8,092 regions). Qualifying GCGs were defined as matching the IQR of the shape parameter from position −2 to position +1 of the GCG.
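The definition of a qualifying GCG given in panel g can be sketched in Python. The sketch assumes that "matching the IQR" means that the predicted shape value lies within the interquartile range of the reference (bound) GCGs at each of the four positions from −2 to +1. The data layout, the function names and the helix-twist values are hypothetical, and the quartile method is an assumption; in practice the predicted shape values would come from a shape-prediction tool.

```python
from statistics import quantiles

# Positions relative to the GCG that are compared, as stated in the legend.
OFFSETS = (-2, -1, 0, 1)


def iqr_bands(reference_profiles):
    """Compute the (Q1, Q3) band of a shape parameter at each offset.

    reference_profiles: list of dicts mapping offset -> predicted shape
    value, one dict per reference GCG (hypothetical layout).
    """
    bands = {}
    for offset in OFFSETS:
        values = [profile[offset] for profile in reference_profiles]
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        bands[offset] = (q1, q3)
    return bands


def qualifies(profile, bands):
    """True if the profile lies inside the IQR band at every offset."""
    return all(bands[o][0] <= profile[o] <= bands[o][1] for o in OFFSETS)


# Made-up helix-twist values (degrees) for five reference GCGs.
reference = [{o: twist for o in OFFSETS} for twist in (32.0, 33.0, 34.0, 35.0, 36.0)]
bands = iqr_bands(reference)

candidate_in = {-2: 34.0, -1: 33.5, 0: 35.0, 1: 33.0}
candidate_out = {-2: 34.0, -1: 33.5, 0: 35.0, 1: 36.0}
print(qualifies(candidate_in, bands), qualifies(candidate_out, bands))
```

Requiring all four positions to fall inside the band is the strictest reading of the legend; a per-position or averaged criterion would be an equally simple variation of `qualifies`.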
Perino, M., van Mierlo, G., Karemaker, I.D. et al. MTF2 recruits Polycomb Repressive Complex 2 by helical-shape-selective DNA binding. Nat Genet 50, 1002–1010 (2018). https://doi.org/10.1038/s41588-018-0134-8