DNA sequence and chromatin modifiers cooperate to confer epigenetic bistability at imprinting control regions

Genomic imprinting is regulated by parental-specific DNA methylation of imprinting control regions (ICRs). Despite an identical DNA sequence, ICRs can exist in two distinct epigenetic states that are memorized throughout unlimited cell divisions and reset during germline formation. Here, we systematically study the genetic and epigenetic determinants of this epigenetic bistability. By iterative integration of ICRs and related DNA sequences to an ectopic location in the mouse genome, we first identify the DNA sequence features required for maintenance of epigenetic states in embryonic stem cells. The autonomous regulatory properties of ICRs further enabled us to create DNA-methylation-sensitive reporters and to screen for key components involved in regulating their epigenetic memory. Besides DNMT1, UHRF1 and ZFP57, we identify factors that prevent switching from methylated to unmethylated states and show that two of these candidates, ATF7IP and ZMYM2, are important for the stability of DNA and H3K9 methylation at ICRs in embryonic stem cells.

Genomic imprinting is regulated by parental-specific DNA methylation of imprinting control regions (ICRs). Despite an identical DNA sequence, ICRs can exist in two distinct epigenetic states that are memorized throughout unlimited cell divisions and reset during germline formation. Here, we systematically study the genetic and epigenetic determinants of this epigenetic bistability. By iterative integration of ICRs and related DNA sequences to an ectopic location in the mouse genome, we first identify the DNA sequence features required for maintenance of epigenetic states in embryonic stem cells. The autonomous regulatory properties of ICRs further enabled us to create DNA-methylation-sensitive reporters and to screen for key components involved in regulating their epigenetic memory. Besides DNMT1, UHRF1 and ZFP57, we identify factors that prevent switching from methylated to unmethylated states and show that two of these candidates, ATF7IP and ZMYM2, are important for the stability of DNA and H3K9 methylation at ICRs in embryonic stem cells.
Epigenetic regulation of gene activity depends on multiple layers of chromatin modifications that are maintained during DNA replication 1,2 . By definition, these epigenetic mechanisms act independently of the DNA sequence at the genomic sites they occupy. However, several studies have highlighted a contribution of DNA sequence to the regulation and maintenance of chromatin modifications, preventing a clear distinction between epigenetic and genetic control of gene activity [3][4][5][6][7][8] . Genomic imprinting is an epigenetic phenomenon, where DNA methylation marks on either the maternal or paternal ICRs dictate parental-specific activity of transcripts in cis [9][10][11] . ICRs inherit parental-specific DNA methylation marks from either the oocyte or sperm, which are then propagated in all somatic tissues of the next generation 9 . The inheritance of differential epigenetic states on the parental chromosomes, despite identical DNA sequence, identical Article https://doi.org/10.1038/s41588-022-01210-z of mESCs for more than 20 passages, or upon integration to a different RMCE position in the genome, and also following in vitro differentiation of mESCs to neuronal progenitors (Extended Data Fig. 2a-c). Furthermore, DNA methylation of the ectopic Airn ICR was still retained at high levels after cultivation of mESCs in 2i medium for 10 days, despite the global reduction in 5-methylcytosine resulting from acquiring a naïve stem cell state 23-26 (Extended Data Fig. 2a,d).
Besides DNA methylation, endogenous ICRs further display differential histone modifications (Extended Data Fig. 1a), whereby the methylated ICR is decorated by H3K9me3 and the unmethylated ICR by H3K4me2 (refs. 14,22,27 ). We performed chromatin immunoprecipitation (ChIP) quantitative PCR (qPCR) for H3K9me3 and H3K4me2 and compared the enrichment of these marks at the RMCE integrations of the Airn and Kcnq1ot1 ICRs with their endogenous counterparts (Fig. 1d). The unmethylated ICRs at the RMCE site showed lack of H3K9me3 and increased H3K4me2, whereas the premethylated ICRs revealed the opposite pattern, with increased H3K9me3 and absence of H3K4me2 (Fig. 1d). Previous studies identified the DNA-methylation-specific KRAB-Znf protein ZFP57 to be required for maintenance of DNA methylation and H3K9me3 at endogenous ICRs 16,18,21 . This regulation is recapitulated at the ectopic ICR, as CRISPR-Cas9 deletion of Zfp57 in mESCs results in rapid and complete loss of DNA methylation at both ectopic and endogenous sites (Extended Data Fig. 2e).

Epigenetic bistability depends on DNA sequence
We set out to test if the ICR DNA sequence is required for epigenetic memory. First, we aimed to identify if smaller ICR fragments would also efficiently memorize preset DNA methylation patterns and repeated the same experiments with four smaller fragments from the Airn ICR ( Fig. 2a and Extended Data Fig. 2f). None of the tested fragments could faithfully recapitulate the differential methylation maintenance. The same was observed for the paternally methylated H19 ICR (Extended Data Fig. 2g,h). Previous studies focusing on non-ICR regulatory regions (promoters, CpG islands or enhancers) have revealed that CpG density, GC content and/or nucleotide sequence can influence establishment of DNA methylation patterns [3][4][5][6] . Based on their CpG density and GC content, the ICRs tested here are in the range of genomic elements overlapping with unmethylated CpG island promoters (Extended Data Fig. 3a). To investigate if the CpG density and GC content of the ICRs contribute to the maintenance of methylated and unmethylated states, we selected four genomic regions that are highly similar to the Airn ICR in size, GC%, CpG number and distribution (Extended Data Fig.  3b-d). These 'Airn-like' elements failed to maintain the differential methylation and adopted a hypomethylated state like their endogenous counterpart, suggesting that DNA sequence length, CpG density and GC content are not sufficient to establish two distinct epigenetic states (Extended Data Fig. 3e,f).
To further distinguish the direct requirement of DNA sequence from CpG and GC content, we generated a synthetic DNA element based on the Airn ICR sequence, where we permutated the inter-CpG DNA sequences until 78% mismatch was reached (Extended Data Fig. 4a,b). Importantly, the permutation of the original sequence retained the local GC content and the number and position of the original CpGs. This replacement removed all inter-CpG DNA sequence information, allowing us to distinguish the contribution of DNA sequence from CpG frequency and distribution. We repeated the RMCE experiments with this 'shuffled' Airn ICR and observed that it failed to maintain the preset epigenetic state (Fig. 2b). In both cases, DNA methylation reached an intermediate value of 40.3% for the unmethylated and 23.5% for the premethylated insertion, with disordered methylation patterns (Fig. 2b). The sequence alterations further led to reduced establishment of the H3K9me3 and H3K4me2 at the RMCE site, independently of the preset methylation state (Fig. 2c).
The shuffling of the inter-CpG DNA sequence in the Airn ICR disrupted all ZFP57 binding motifs, which might explain the observed chromosomal location and exposure to the same regulatory factors in the nucleus, make ICRs a great model to study the individual contribution of DNA sequence and chromatin modifications to epigenetic memory.
Several factors and mechanisms have been identified that regulate the maintenance of DNA methylation at ICRs. Once methylation marks have been deposited in the germline 12 , the maintenance methyltransferase DNMT1 and its accessory protein UHRF1 are responsible for the maintenance of methylation during DNA replication 13 . In addition, several factors have been identified to regulate H3K9me3 at the DNA-methylated ICRs, including SETDB1, KAP1 and G9A [14][15][16] . Importantly. the KRAB zinc-finger factor ZFP57 binds the methylated hexanucleotide DNA sequence TGCmCGC and recruits KAP1 and other associated factors to establish a feedback between DNA methylation and H3K9me3 at ICRs 16,17 . Indeed, binding of ZFP57 and recruitment of KAP1 are crucial steps in regulating imprints, as knockout (KO) of Zfp57 in mice results in loss of almost all imprints and embryonic lethality 18-20 , and ZFP57 is required for maintenance of DNA methylation and H3K9me3 at ICRs in cellular systems 16,18,21 .
Although the factors that control DNA and histone methylation at ICRs have been widely investigated, the DNA sequence properties of ICRs have not been explored in detail. Furthermore, it is also not known if additional key players contribute to the epigenetic maintenance at ICRs. By iterative integration of ICR DNA sequences to the same genomic site in mouse embryonic stem cells (mESCs), we show that ICRs are autonomous genetic elements that can recapitulate the epigenetic states observed at the endogenous locations. Using this setup, we show that by presetting DNA methylation, we can establish two opposing epigenetic states that are faithfully propagated by the ectopic ICR. This DNA-methylation-dependent switch is unique to ICRs. Systematic integrations of variant and synthetic ICRs allowed us to identify the sequence requirements that are necessary and sufficient for this switch-like behavior. Furthermore, by using the ectopic ICRs as DNA-methylation-sensitive reporters in loss-of-function genetic screens, we confirm DNMT1, UHRF1 and ZFP57 as the core epigenetic regulators of genomic imprinting. In addition, we identify ATF7IP and ZMYM2 as factors involved in regulating maintenance of epigenetic states at ICRs.

Autonomous ICRs memorize preestablished epigenetic states
We hypothesized that the DNA sequence of ICRs should contain sufficient information to establish and maintain the distinct epigenetic states observed on the parental alleles (Extended Data Fig. 1a). We selected four ICRs from the Airn, Kcnq1ot1, Zrsr1 and H19 imprinting clusters and used recombinase-mediated cassette exchange (RMCE 3 ) to integrate them individually into the genome of mESCs (Fig. 1a). To mimic the differential DNA methylation states of the ICRs, we performed RMCE in parallel for unmethylated ICRs and ICRs that were premethylated by the bacterial CpG methyltransferase M.SssI ( Fig. 1a and Extended Data Fig. 1b). As a control sequence, we used the Igf2r DMR (differentially methylated region), a promoter that acquires differential DNA methylation only during differentiation 22 . Furthermore, we included a set of inactive gene promoters (Hes3, Tcl1 and Syt1), which were previously shown to be protected from de novo DNA methylation when integrated to the same RMCE site 3 (Fig. 1b).
After successful integration, we measured DNA methylation at the RMCE site by bisulfite conversion PCR (bsPCR). All four ICRs maintained their preestablished DNA methylation status at the ectopic site, although in some cases, minor de novo methylation at the unmethylated ICRs was observed (Fig. 1b,c and Extended Data Fig. 1c-e). In contrast, maintenance of preestablished DNA methylation was not observed for the Igf2r DMR and the control promoter elements ( Fig.1b and Extended Data Fig. 1f-i). The differential DNA methylation states at the ectopic Airn ICR were stably maintained after prolonged cultivation Article https://doi.org/10.1038/s41588-022-01210-z lack of maintenance, in agreement with an in silico evaluation of ZFP57 binding to wild-type and shuffled Airn ICR sequence using BPNet 28 (Extended Data Fig. 4c,d). Accordingly, we wanted to investigate if ZFP57 motifs are sufficient for the maintenance of the epigenetic state. Therefore, we restored the ZFP57 binding motifs in the shuffled ICR (Extended Data Fig. 4e) and introduced this methylated and unmethylated DNA element to the RMCE site in mESCs. Although the unmethylated version failed to maintain the hypomethylated state, the premethylated ICR was able to maintain a fully hypermethylated state (Fig. 2d). Given these observations, we wondered if the requirement for ZFP57 binding sites is dependent on the cellular context, especially as Zfp57 gene expression is tissue specific 18,29 . Therefore we introduced the shuffled Airn ICR to RMCE-competent mouse erythroleukemia (MEL) cells 30 and performed targeted bisulfite sequencing. Both methylated and unmethylated shuffled ICRs retained the preset DNA methylation patterns (Fig. 2e), indicating that in MEL cells, CpG content is sufficient for the memory of DNA methylation states.

Ectopic ICRs establish epigenetic silencing in cis
Endogenous ICRs are cis-regulating elements that dictate the allelic expression of nearby transcripts based on their DNA methylation state 9 . We first wanted to test if ICR sequences can silence three different reporter constructs in presence of DNA methylation when integrated together to the RMCE site ( Fig. 3a and Extended Data Fig. 5a). We selected three commonly used constitutive promoters (pCAGGS, hEF1alpha and hPGK) and showed that they can maintain expression of a GFP reporter at the RMCE integration site in absence of ICRs (Extended Data Fig. 5b). Next, we measured the ability of three methylated ICRs (Airn, Kcnq1ot1 and Peg10) to stably repress these promoters at the same RMCE integration site (Fig. 3b). All tested ICR sequences showed stable repression in combination with the Ef1alpha and hPGK promoters. In contrast, the methylated promoters without ICRs, or in combination with the Dazl promoter, which is known to be regulated in a DNA-methylation-dependent manner 31 , were not able to maintain a repressed state (Fig. 3b) Mat.
n/a n/a n/a Article https://doi.org/10.1038/s41588-022-01210-z composite promoter can overcome the epigenetic repression induced by some ICRs (Fig. 3b). The DNA-methylation-dependent repression was maintained over longer periods, as measured by GFP activity in multiple clonally derived populations after 16, 23 and 30 days (Extended Data Fig. 5c). The same methylated ICR-dependent repression was observed for the paternally methylated H19 ICR (Extended Data Fig. 5d).
This setup allowed us to test the contribution of DNA methylation and sequence on the silencing potential of ICRs. For this, we made use of the Airn-pEF1a-GFP reporter construct that showed stable maintenance of GFP expression when inserted unmethylated and stable silencing when inserted methylated (Fig. 3c). When we replaced the Airn ICR with the shuffled Airn version, we observed loss of silencing in most of the measured clones already after 16 days and even more after prolonged cultivation, suggesting that methylation-dependent silencing in cis requires an intact ICR sequence (Fig. 3c). Finally, we introduced the shuffled Airn sequence containing reconstituted ZFP57 binding sites. Although the unmethylated version led to stochastic loss of transcriptional activity, the methylated construct gave rise to stable repression of the nearby promoter for multiple generations, indicating that ZFP57 binding is not only required for the maintenance of epigenetic memory at ICRs but also sufficient for epigenetic silencing in cis (Fig. 3c).
To test if the DNA methylation of ICRs is required for the repression of the nearby promoter, we challenged the established reporter cell lines by culturing them in 2i and 2i + vitamin C media. Both conditions reduce genome-wide DNA methylation levels 23-25 , whereas addition of vitamin C results in further removal of DNA methylation from ICRs and repetitive elements 26,32 . GFP repression was maintained in 2i medium; however, repression was progressively lost in presence of 2i + vitamin C (Extended Data Fig. 5e-g). To further test the dependency on DNA methylation for maintaining the repressed state at the ICR reporters, we performed KO experiments of the general DNA methylation maintenance factors Uhrf1 and Dnmt1 (ref. 33 ). As expected, removal of DNA methylation in these KO cells led to a reactivation of the ICR reporter within 7 days (Extended Data Fig. 6a,b). The low percentage of cells that show GFP reactivation in these assays is due to low KO efficiency in the CRISPR-targeted pool of cells. Therefore, we cultured the Airn-ICR reporter in presence of the DNMT1 inhibitor GSK-3484862 (ref. 34 ) for 2 days. We observed complete reactivation with over 95% of cells expressing GFP ( Fig. 3d and Extended Data Fig. 6c). The use of this DNMT1 inhibitor further allowed us to test if the reactivation is reversible; therefore, we removed GSK-3484862 from the medium and continued cultivation for 7 more days after washout ( Fig. 3d and Extended Data Fig. 6c). We observed no resilencing of activated reporters, indicating that once the ICR is switched on, it cannot revert to a silent state.

CRISPR screens identify regulators of epigenetic memory at ICRs in mESCs
After establishing multiple ICR-specific reporter cell lines, we wanted to screen for proteins required for maintenance of repressive ICR states. We first established the CRISPR screen workflow using a targeted library against 1,051 chromatin-related factors with 6 Table 1). Additionally, other heterochromatin-associated factors like Cbx1, Cbx5, Atrx, Daxx and Setdb1 were enriched in the GFP-positive fraction. The list of high-confidence hits that were repeatedly found in all screens was enriched for Zfp57, Uhrf1 and Dnmt1, whereas other hits were identified in individual ICR reporter cell lines (Fig. 4c). We redesigned an extended CRISPR library (EpiTF) consisting of 20,470 guide RNAs against 4,095 genes encoding nuclear factors to cover a large fraction of the KRAB zinc-finger protein family and repeated the screen using the Airn ICR reporter (Supplementary Table 1). Despite the increased complexity of the library, we did not identify additional transcription factors to play a role in the maintenance of Airn reporter silencing ( Fig. 4c and Extended Data Fig. 8d,e). Several candidates identified in more than one screen were tested by single-KO validation. Zfp57, Uhrf1 and Dnmt1 showed consistent upregulation in all three reporter lines, whereas other candidates resulted in lower or stochastic reactivation in some of the tested reporter lines (Extended Data Fig. 8f).

ATF7IP and ZMYM2 colocalize to endogenous ICRs
Two factors were identified in at least three different screens ( Fig. 4c and Extended Data Fig. 8g): ATF7IP, responsible for SETDB1-mediated silencing of transposable elements 35-37 , as well as ZMYM2, an ATF7IP-interacting factor associated with growth restriction of human pluripotent cells 38,39 . Given their association with H3K9me3 and reported involvement in transcriptional silencing of repetitive elements, we tested their contribution to regulation of epigenetic maintenance at ICRs. In addition human ATF7IP was recently identified to be a repressor of paternally expressed imprinted genes and required for silencing sperm-specific genes 40 . We first wanted to see if these factors indeed localize to the endogenous ICRs and analyzed existing mESC ChIP-seq datasets available for SETDB1 (ref. 41  Article https://doi.org/10.1038/s41588-022-01210-z SETDB1 (Fig. 5b). Notable exceptions are MCTS2/H13, where ATF7IP is absent, and H19, which shows a reduced localization of ZMYM2. As a general trend, we observe that ATF7IP and ZMYM2 always colocalize in presence of SETDB1 and ZFP57, suggesting that they localize to ICRs as part of the H3K9me3 machinery. Interestingly, genome-wide analysis of ATF7IP, ZMYM2, ZFP57 and SETDB1 peaks indicates that this colocalization is not always observed outside of ICRs. Although the majority (85%) of the few ATF7IP peaks that we detected overlap with ZFP57 and SETDB1 sites, only 30% of ZMYM2 peaks colocalize with ZFP57 and SETDB1 (Extended Data Fig. 9a). ZMYM2 peaks outside of ZFP57/SETDB1 sites show lower H3K9me3 and DNA methylation compared to peaks overlapping with ZFP57/SETDB1, suggesting that ZMYM2 is involved in multiple regulatory pathways independently of SETDB1 (Extended Data Fig. 9b,c). Regardless of this binding, we see a reduction of ATF7IP and ZMYM2 localization to the Airn ICRs in absence of ZFP57 (Extended Data Fig. 9d).
To further interrogate the link between ATF7IP and ZMYM2 at ICRs, we recruited the proximity biotin ligase TurboID 44 to methylated ICRs via a ZFP57-TurboID fusion protein expressed from the RMCE site and performed BioID as previously described 45 (Fig. 5c). As a background control, we generated a cell line expressing just the NLS-TurboID (nTurbo) and included a cell line expressing only the KRAB domain of ZFP57 fused to the TurboID ligase to distinguish between proteins that interact with ZFP57 when not bound to chromatin. Mass-spectrometric detection of enriched proteins included several factors previously associated with ZFP57 (KAP1, CBX3, CBX5 and MORC3). Among them, we detected ATF7IP (Fig. 5c, Extended Data Fig. 9e and Supplementary  Table 1), supporting the results obtained from the CRISPR screen and genome-wide analysis. In the case of ZMYM2, we could not detect the protein in the biotinylated fraction or the background sample, suggesting that its enrichment was either below the detection limit or not specifically interacting with ZFP57.

ATF7IP and ZMYM2 regulate epigenetic memory at endogenous ICRs in mESCs
Next, we wanted to test if absence of these factors that are expressed during early mouse development influences the epigenetic state at endogenous ICRs, and we generated KO mESCs for Atf7ip and Zmym2 using CRISPR-Cas9 (Extended Data Fig. 9f,g). Whole-genome bisulfite sequencing (WGBS) revealed a reduction of DNA methylation at the majority of analyzed ICRs, despite limited loss of methylation genome-wide ( Fig. 5d and Extended Data Fig. 10a-c). Peg13 and Meg3/ Rian ICRs retained DNA methylation in both KO cell lines, whereas H13/ Mcts and Gnas/Nespas specifically retained methylation in absence of ZMYM2 and Zrsr1/Commd1 and H19 in absence of ATF7IP. Loss of ICR methylation was further confirmed by targeted bisulfite sequencing around the binding sites of ATF7IP and ZMYM2 at the Airn, Kcnq1ot1 and Peg10 ICRs (Extended Data Fig. 10d). Finally, we profiled H3K9me3 in the same KO cell lines and observed loss of H3K9me3 at all ICRs, except for Peg13 and Meg3/Rian, which retained H3K9me3 in both KO lines. In addition, Zrsr1/Commd1 and H19 retained H3K9me3 in Atf7ip KO cells ( Fig. 5e and Extended Data Fig. 10e). The concordant changes in DNA methylation and H3K9me3 at ICRs in the absence of ATF7IP or ZMYM2 indicate that these factors are required for preventing switching of ICRs from methylated to unmethylated states in mESCs.

Discussion
Here, we set out to study the genetic and epigenetic determinants that allow ICRs to maintain their differential DNA methylation. Toward this, we isolated ICRs from their endogenous chromosomal context and inserted them into a heterologous position in the mESC genome. When integrated unmethylated, the tested ICRs maintained a DNA-methylation-free and euchromatic state, suggesting sequence-specific mechanisms that prevent de novo methylation. This behavior is similar to CpG island promoters, which are protected   [45][46][47][48][49] . Importantly, this switching between two opposing chromatin states based on DNA methylation was not observed for non-ICR promoters and other DNA sequences of similar size, CpG density or GC content, suggesting that specialized properties of the full-length ICR are required for this 'epigenetic bistability'. The ectopic ICRs enabled to systematically study the DNA sequences and chromatin regulatory factors required for creating and maintaining epigenetic memory at ICRs in a controlled genomic environment. Through introducing synthetic ICRs with modified DNA sequences, we observe that GC content and CpG density is not sufficient for encoding bistability in mESCs but that additional sequences, such as ZFP57 binding motifs, play an important role in maintaining DNA and H3K9 methylation. This finding is in line with previous work, where mutations of the methylated CpGs of the ZFP57 recognition motif resulted in loss of methylation maintenance over the entire Snrpn ICR 21 . In addition, we show that ZFP57 binding is not only required but also sufficient for the epigenetic memory at the methylated Airn ICR state in mESCs. In the case of the unmethylated allele, the same sequence changes result in loss of protection from de novo methylation, suggesting sequence-specific mechanisms that protect from de novo methylation, potentially similar to those observed at regulatory regions of nonimprinted genes 3,5 . Nevertheless, because maintenance of differential Airn ICR methylation in MEL cells was independent of DNA sequences outside of CpGs, we suggest a cell-type-specific requirement for sequence-specific factors involved in epigenetic maintenance. In the case of ZFP57, this would be in line with its restricted transcriptional activity to germ cells and during early development 18,29 .
Having identified the robust establishment and maintenance of heterochromatin at methylated ICRs, we could generate reporter cell lines that respond to DNA methylation. In contrast to previous strategies that used the Snrpn promoter to report changes in methylation at endogenous gene promoters 50,51 , our cell lines directly report regulatory changes at the introduced ICRs. We used these reporters to screen for factors required for maintenance of the repressed state. Targeted CRISPR screens identify Dnmt1, Uhrf1 and Zfp57 as the most relevant genes required to maintain the DNA methylation status at all tested ICRs, confirming the suitability of our setup. In addition, our functional screens identified, and thus validate, additional factors that have been     Article https://doi.org/10.1038/s41588-022-01210-z described to regulate H3K9me3 throughout the genome and to associate with ICRs, including DAXX, ATRX, CBX1 and CBX5 (refs. 14,52,53 ). Among the obtained hits, we identified ATF7IP and ZMYM2 as factors involved in the maintenance of epigenetic repression at ICR reporters. ATF7IP and SETDB1 show functional overlap in the regulation of endogenous retroviral elements, with ATF7IP acting as a cofactor of SETDB1 by stimulating its enzymatic activity, protecting it from proteasomal degradation and facilitating its nuclear localization 35,37,54,55 . Although loss of SETDB1 is lethal in mESCs, the absence of ATF7IP reduces levels of SETDB1, sufficient for viability but insufficient to maintain all repressed sites in the genome 37,56 . The C-terminal fibronectin type-III domain of ATF7IP has been shown to interact with ZMYM2 (also ZFP198), and this interaction was suggested to be important for the silencing of a few germline-specific genes, including the imprinted FKBP6 gene 39,57 . ZMYM2 was also described to interact with H3K9me3-marked chromatin 58,59 and furthermore required for endogenous retroviral element silencing, thereby preventing the transition to two-cell-like cells in mESC culture 43,54 . The role of ZMYM2 in restricting potency is further supported by the fact that ZMYM2 is required for exit from pluripotency 38,60 . We show that ATF7IP and ZMYM2 colocalize together with ZFP57 and SETDB1 at the majority of endogenous ICRs in mESCs and are required for the memory of the epigenetic state at methylated ICRs. This is in line with a publication showing a role of ATF7IP in regulating sperm-specific genes and paternally expressed imprinted genes, including Peg13 in human parthenogenetic ESCs 40 .
Our results indicate that, in mESCs, ATF7IP could play a broader role in regulating all methylated ICRs, independently of the parental origin.
If and how these two factors contribute to maintenance of imprints during zygote formation and early development remains to be tested. In mESCs, their absence resulted in impaired maintenance fidelity and sporadic loss of H3K9me3 at multiple ICRs, independently of their parental origin. We suggest that this destabilizes the repressive feedback loop between DNA methylation and H3K9me3, allowing switching of the ICR to the default unmethylated state. While we observe differences in regulatory activities of ATF7IP and ZMYM2 at some ICRs (for example, Mcts2/H13, Zrsr1/Commd1 and H19), it remains to be determined if this is due to a specificity of these factors toward these ICRs. Alternatively, this could reflect stochasticity in ICR switching to an unmethylated state in absence of either factor, which is memorized in clonally derived cells.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41588-022-01210-z. recognize a methylated hexanucleotide to affect chromatin and DNA methylation of imprinting control regions.

Cell line generation
Targeted cell line integrations in mESCs were obtained through RMCE using either electroporation of 2 × 10 6 cells with the Amaxa Nucleofector (Lonza) or Lipofectamine 3000 (Invitrogen) transfections of 2.5 × 10 4 cells. All RMCE vectors were cotransfected with a CRE-expressing plasmid at a ratio of 1:0.6 µg, using either a total of 40 µg plasmid for the Amaxa Nucleofector kit or 1 µg for Lipofectamine 3000. Two days after transfection, cells were selected with 3 µM Ganciclovir for more than 8 days. The obtained cell lines were kept as pools and when necessary clonal cell lines were obtained through limited dilution. Pools or clonal cell lines were genotyped using integration site specific PCRs. The parental cell line for all reporter cell lines used in the CRISPR screens contains a stably expressed Cas9 gene from the Rosa26 locus, obtained by TALEN-mediated integration as previously described 62 . Single-clone KO cell lines were obtained by CRISPR-Cas9 using the px330-hSpCas9 (Addgene, 42230) plasmid together with a pRR-Puro recombination reporter 62 . A total of 1 µg plasmid DNA at a ratio of 1:0.1 of px330 to pRR-Puro was transfected using Lipofectamine 3000. Puromycin selection was started 36 h after transfection for 36-48 h at a concentration of 2 µg ml −1 . KO cell lines were validated using targeting site-specific PCR. RMCE in MEL cells was performed using Lipofectamine 3000 (Invitrogen), plating 5 × 10 5 cells in 6-well plates for suspension cells. A total of 2.5 µg plasmid DNA, using the same ration as described before, was transfected according to the manufacturer's instruction. After 48 h, cells were transferred into T75 flasks, and cells that underwent recombination were selected with 5 µM Ganciclovir containing media for more than 8 days.

Reporter cell line generation
A backbone containing two inverted loxP sites 3 was used to clone several empty reporter vectors containing a 60-bp universal entry site with a central EcoRV restriction site, followed by a promoter (pCAGGS, hPGK and Ef1alpha) that drives an eGFP or mScarlet gene for the ChroMM and EpiTF screens, respectively, followed by a downstream BGH-poly(A) and a WPRE sequence. Individual ICR or control sequences were amplified from genomic DNA (Supplementary Table 1). Gibson assembly was performed according to the NEB Gibson Assembly Master Mix protocol. In vitro methylation was performed with up to 40 µg plasmid DNA using the NEB M.SssI methyltransferase in two consecutive reactions of at least 4 h with 600 µM SAM (NEB, B9003S) and 1.5 U M.SssI (NEB, M0226L) per microgram DNA. Complete methylation of plasmids was confirmed by using the CpG methylation sensitive restriction enzyme HpaII (NEB) and a methylation insensitive control reaction with MspI (NEB). Cell lines were generated as described before. Individual clones were genotyped using PCR with primers spanning the loxP sites. Methylation of the integrated reporter construct was validated on selected clones.

Flow cytometry and fluorescence-activated cell sorting (FACS)
Flow cytometry data acquisition was performed on a BD FACSCanto II or a BD LSR Fortessa cell analyzer. FACS was performed with a BD FACSAria III cell sorter. Data analysis was done with FlowJo (version 10.7) or BD FACSDiva (

In silico sequence analysis using BPnet
BPnet 28 (version 0.0.23) was used to determine sequence motif and context of ZPF57 binding in mESCs. ZFP57 ChIP-seq data and corresponding input files 42 were aligned to the mouse genome (NCBI Build 37 mm9, July 2007) using bowtie2 (version 2.3.5.1) after removal of adapters using trimgalore (version 0.6.6). Aligned reads were filtered for PCR duplicates using Picard (version 2.23.9), and only reads with a mapping quality (MAPQ) > 40 were kept for further analysis. All replicates were merged before peak calling using MACS2 (version 2.1.1.20160309) with the following parameters: callpeak -g mm-keep-dup all -q 0.05call-summits. Reads mapped to the positive and negative strand of the merged datasets were split into individual files and trimmed to the 5′ base as input tracks for BPnet. A model was trained with the default bpnet9 architecture (https://github.com/kundajelab/bpnet), using chromosomes 1, 8 and 9 as test set, and peaks on chromosomes 2, 3 and 4 as validation sets. Peaks on chromosomes X and Y were excluded from model training. After calculation of the contribution scores with BPnet's DeepLIFT method, motifs were determined using BPnet's TF-MoDISco method. To determine contribution scores on the Airn and shuffled Airn sequences, the input DNA was one-hot encoded before subjecting them to the trained model to generate ZFP57 binding predictions. For the walking mutations, 10 nt of the shuffled sequence was swapped with the original Airn sequence and shifted by 1 bp per prediction.

Bisulfite PCR and sequencing
Up to 2 µg genomic DNA, or the total amount to eluted material from ChIP, was used for bisulfite conversion using the EpiTect Bisulfite Kit (Qiagen). Bisulfite PCR was carried out using the PhusionU polymerase (Thermo Fisher Scientific) with the primers indicated in Supplementary  Table 1 using the following conditions: initial denaturation at 95 °C for 5 min, followed by 45 cycles of 1 min at 95 °C, 1 min at 50-60 °C (dependent on the primer pair) and 1 min at 72 °C, followed by 5 min of final extension at 72 °C. Amplicons were cloned into the CloneJET vector (Thermo Fisher Scientific), sequenced by Sanger sequencing and analyzed using QUMA 63 .

Targeted bisulfite sequencing
Targeted bisulfite sequencing libraries were made from equimolar pooled bisulfite PCR fragments. Two independent PCR reactions were run per target with annealing temperatures at 50 °C and 58 °C to mitigate amplification bias. Indexed libraries were prepared using Article https://doi.org/10.1038/s41588-022-01210-z the NEBNext Ultra II kit (NEB) starting from 10 ng pooled amplicons according to the manufacturer's protocol. Sequencing was done on an Illumina NovaSeq6000 machine with 150-bp paired-end reads. Fastq files were trimmed using trim_galore (version 0.6.6) and alignment was performed with Bismark (version 0.23.0) with the parameter non_directional. CpG methylation was extracted using the Bismark methylation extractor and average CpG methylation was calculated in R, excluding CpGs that were covered less than 500 times.

Whole-genome bisulfite library preparation and sequencing
WGBS of Atf7ip and Zmym2 KO mESCs was performed as described previously 64

ChIP-qPCR
qPCR reactions were run as technical duplicates on a Rotor Gene Q machine (Roche) using the KAPA SYBR Fast universal qPCR kit (Sigma) in 10-µl reactions with 1 µl eluted DNA for the IP material or 1 µl of a 1:10 dilution of the input material. Delta Ct values were calculated over input, followed by delta Ct and fold change over an intergenic region. Primers are listed in Supplementary Table 1. Corresponding plots were generated with Prism (5.0a).

ChIP-seq library preparation
ChIP-seq libraries were prepared using the NEBNext ChIP-seq Library Prep Master Mix set for Illumina (NEB, E6240) or NEBNext Ultra II Kit, following the manufacturer's protocol. Final libraries were visualized and quantified on a 2200 TapeStation System (Agilent) and pooled with equal molar ratios before sequencing on an Illumina NovaSeq6000 machine with 150-bp paired-end reads.

Genome-wide datasets and analysis
Published mESC genome-wide datasets were obtained from GEO (WGBS 65 ; H3K9me3, H3K4me3, H3K36me3 and H3K27me3 (ref. 66  and ATF7IP 39 . Sequencing reads from published datasets and ChIP-seq reads generated in this study were filtered for low-quality reads as well as adaptor sequences using trimgalore (version 0.6.6) and mapped to the mouse genome (NCBI Build 37 mm9, July 2007). Mapping of H3K9me3, H3K4me3, H3K36me3, H3K27me3, DNase-seq and RNA-seq was done with QuasR (1.30.0) in R with standard qAlign() settings. Wig tracks were obtained with QuasR qExportWig() command and visualized using the UCSC genome browser (https://genome.ucsc. edu). Mapping of WGBS data was done with QuasR using qAlign() with following settings: genome = 'BSgenome.Mmusculus.UCSC.mm9', aligner = 'Rbowtie' and bisulfite = 'dir'. CpG methylation calls were extracted using qMeth() and filtered to contain only CpGs covered at least 10×. ChIP-seq peak coordinates were obtained using MACS2 (version 2.1.1.20160309) with the following parameters: callpeak -g mmkeep-dup all -q 0.05-call-summits. Coordinates were imported into R as GenomicRanges objects and peaks larger than 1 kb were removed from further analysis. Overlaps between peaks were calculated using the findOverlaps() function in R, with maxgap=1000 L. Heatmaps over ICRs and peak regions were generated with genomation() in R using the ScoreMatrixList() and multiHeatMatrix() functions.

CRISPR libraries and screens
The ChromMM and EpiTFs library was constructed as a subpool of the Vienna sgRNA library as described previously 68 . Lentivirus was produced in HEK293T (obtained from G. Schwank, University of Zurich) cells as described 69 . Nonconcentrated virus was titrated with different amounts following the same transduction procedure used for the actual CRISPR screens. Transduction was performed with 1.25 × 10 6 cells seeded in gelatin-coated 6-well plates in embryonic stem cell medium containing 8 µg ml −1 polybrene (Merck), spinning for 60 min at 500 × g at 37 °C. After centrifugation, cells were incubated for 12 h at 37 °C, before transferring them on multiple 15-cm plates and culturing them for another 24 hours. After 36 h, transduced cells were selected using FACS. For the ChromMM library cells were stained against the CD90.1 cell surface marker using an APC-conjugated antibody (Invitrogen, 17-0900-82, 1 µl per 15 million cells), gating on APC-positive and GFP-negative single cells. For the EpiTF library, cells were gated on GFP-positive and mScarlet negative single cells. After the sort, cells were seeded sparsely on multiple 15-cm dishes and only passaged once after 4 days to avoid bottlenecks in the library representation. On day 10 after transduction, GFP-positive cells were sorted and further processed for genomic DNA extraction using the DNeasy Blood and Tissue kit (Qiagen). All screens were performed with at least 30 million cells and a low multiplicity of infection between 0.1 and 0.2, yielding a total cell number of at least 3 million cells and a guide representation of at least 450× per guide after the first sort. For the final sort, the same number of initially transduced cells were used for the sort and kept as Article https://doi.org/10.1038/s41588-022-01210-z the reference pool. The screen with the EpiTF library was performed as described above, however 90 million cells were used for transduction at an multiplicity of infection of 0.2. Initially, all screens were performed as technical duplicates or triplicates with individual reporter clones and later repeated once more as independent experiments. To score essential and growth-restricting genes, the pooled cells at indicated days were compared to the initial plasmid library.

Library preparation for CRISPR screen
Library preparation was done for the entire amount of extracted genomic DNA in two consecutive PCR amplification steps. In the first PCR, the integrated guide sequences were amplified using the Herculase II Fusion DNA polymerase (Agilent) according to the manufacturer's instruction with a maximum of 500 ng DNA input per 50 µl reaction using library specific primers with 3′ adapter sequences for barcoding (Supplementary Table 1). The PCR mix contained 1.5% DMSO and had a final concentration of 3 nM of MgCl 2 . Amplified products were first purified using the MinElute Gel extraction kit (Qiagen) and potential primer dimers were removed using AmpureXP beads (Beckmann) at a ratio of 0.7× volume. Sample specific barcoding was done in a second PCR using NEBNext Multiplex Oligos (NEB) and the NEBNext Q5 Hot Start HiFi PCR Master mix (NEB) according to the manufacturer's manual with 10% of the eluted product from the first PCR and 7 amplification cycles. For the EpiTF library, barcoding was done with the i5 primers from the NEBNext Multiplex Oligo kit (NEB) and a custom primer that carries the P7 sequence (Supplementary Table 1). Sequencing was done on an Illumina NovaSeq6000 or a MiSeq machine, specifying a 10-bp index read 1 for the EpiTF library.

CRISPR screen analysis
Demultiplexing was performed using the standard pipeline of Illumina. For the EpiTF library, demultiplexing was only performed on the i5 index, as the i7 index contains the UMI 68 . Fastq files were trimmed to only include the guide RNA sequence using cutadapt (version 3.10) specifying -g 5′-TAGCTCTTAAAC...GGTGTTTCGTC-3′ for the linked adapter sequences in the lentivirus backbone for the ChromMM library or -g aaacaccg…gtttaaga for the EpiTF library. Alignment was done using bowtie2 (version 2.3.5.1) against a reference genome built from the sgRNA sequences, specifying the following alignment parameters: -k 1-very-sensitive. BAM files were converted into bed files using bedtools (version 2.27.1) bamtobed function. Bed files were imported into R to create a count matrix for MAGeCK (version 0.5.9.2). For the final analysis, counts from technical replicates as well as different GFP high and GFP low bins were aggregated. MAGeCK was run withnorm-method set to total and run against the unsorted pool as the control sample, specifying the independent replicates.

CRISPR screen validation
Single-guide validation was done with one guide RNA that was included in the library and one independently designed guide with high on-target and low off-target activity, as described in ref. 70 (Supplementary Table 1). Guides were cloned into the px459 backbone (Addgene, 62988) which allows for puromycin selection. For this, 1,000 cells were seeded in gelatin-coated wells of a 96-well plate 1 day before transfection. Next, 100 ng plasmid DNA was transfected per well as technical replicates using Lipofectamine 3000 (Invitrogen). After 36 h, transfected cells were selected for 36-48 h with 2 µg ml −1 puromycin using untransfected cells as a control. Reactivation of the reporter was evaluated 12 days after transfection using flow cytometry, and GFP reactivation was quantified over cells transfected with nontargeting control guides.

Western blot
For western blotting, 20-35 µg protein was separated on 6% or 10% polyacrylamide gels and transferred on polyvinylidene fluoride membranes using the TransBlot Turbo system (Bio-Rad). For antibody-based staining, the membrane was washed once with TBS-T (10 mM Tris, pH 8.0, 150 mM NaCl and 0.1% Tween-20), blocked with 5% non-fat dry milk in TBS-T and stained with primary antibodies against ATF7IP (Bethyl, A300-169A), ZMYM2 (Bethyl, A301-711A-M), or LAMIN B1 (Santa Cruz Biotechnology, 374015) at 4 °C overnight. Next day, membranes were washed three times with TBS-T for 10 min before incubation with species-specific horseradish peroxidase-conjugated secondary antibodies for 1 h at room temperature. After additional three washes with TBS-T for 10 min each, signal was detected using the Amersham ECL Western blotting detection reagent (GE Healthcare Life Sciences; RPN2109) and exposure on Amersham Hyperfilm ECL (GE Healthcare Life Sciences; 28906836) in a darkroom.

Cell line generation for proximity ligation experiments
Cell lines were generated as described in Villasenor et al. 45 .
The coding sequence of the BioID2 enzyme of the original entry vector was exchanged for the coding sequence of the TurboID enzyme 44 . Cells were either transfected with the entry vector, containing only the TurboID with a nuclear localization sequence, the full-length mouse ZFP57 cDNA sequence cloned upstream of the TurboID, or the KRAB domain of ZFP57 as annotated on UniProt. All cells were validated using western blot of cells incubated with 50 µM biotin (Sigma-Aldrich) for 12 h as previously described with minor adjustments to accommodate the biotin detection. In short, membranes were blocked in 5% BSA in TBS containing 0.1% Triton X-100 for 1 h and stained with streptavidin-horseradish peroxidase (1:20,000) in TBS containing 0.1% Triton X-100 overnight at 4 °C. The membrane was washed twice with TBS containing 0.3% Triton X-100, twice with TBS containing 0.3% Triton X-100 and additional 500 mM NaCl, before one final wash with TBS containing 0.3% Triton X-100 for 10 min at room temperature each.

Proximity ligation using TurboID
TurboID samples were prepared as described in Villaseñor et al. 45 . In brief, cells were grown as quadruplicates on 15-cm plates, incubated with 50 µM biotin (Sigma-Aldrich) for 12 h upon 70% confluency and harvested with trypsin. In the following, samples were handled at 4 °C or on ice. Cell pellets were swelled in 5× volume of nuclear extraction buffer 1 (NEB1; 10 mM HEPES, pH 7.5, 10 mM KCl, 1 mM EDTA, 1.5 mM MgCl 2 , 1 mM dithiothreitol (DTT), 1× EDTA-free complete protease inhibitor cocktail (PIC; Roche) for 10 min, before spinning at 2,000 × g for 10 min. Cells were homogenized using a loose Dounce pistil in 2× volumes of NEB1. Nuclei were collected by centrifugation at 2,000 × g for 10 min, resuspended in 1× volume nuclear extraction buffer 2 with 450 mM NaCl (NEB2; 20 mM HEPES, pH 7.5, 0.2 mM EDTA, 1.5 mM MgCl 2 , 20% glycerol, 1 mM DTT and 1× PIC) and homogenized 10 more times with a tight Dounce pistil, followed by an incubation for 1 h with overhead rotation. Debris was removed by centrifugation at 2,000 × g for 10 min before adjusting the salt concentration of the

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
Sequencing data have been deposited to NCBI GEO under the following accession number GSE176461; The mass spectrometry proteomic data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD034918. Source data are provided with this paper.

Code availability
All the analyses were performed using previously published or developed tools. No custom code was developed or used.

e)
Rank plot for the screen using the Airn pEF1a reporter in serum using the EpiTF library. Dashed horizontal line indicates the p-value threshold of 0.05 obtained using MAGeCK RRA (robust rank aggregation). f) Validation of potential candidates using single transfections of guide RNAs against the indicated gene. GFP expression was measured 12 days after transfection. Transfections were performed in technical replicates. Potential candidates were targeted with one independent guide and one guide from the ChromMM library. Data is showing % of positive cells in pools after CRISPR targeting. g) Network representation for all potential candidates using the STRING database. Pink edge indicates experimentally determined interactions; cyan edge indicates known interactions from curated databases. Green, red, and blue edges indicate predicted interactions based on gene neighbourhood, gene fusion, and gene co-occurrence, respectively. Yellow and black edges are predicted interactions based on textmining and co-expression, respectively.
Article https://doi.org/10.1038/s41588-022-01210-z Extended Data Fig. 9 | ATF7IP and ZMYM2 co-localize with ZFP57 at ICRs. a) Peak overlap analysis showing percentage of ZMYM2 or ATF7IP peaks coinciding with SETDB1, ZFP57, or genomic sites co-bound by ZFP57 and SETDB1. b) Heatmap indicating ZMYM2 binding at ZMYM2 peaks and separated by peaks overlapping SETB1 and SETB1-independent peaks. H3K9me3 ChIP-seq signal is shown for the same peak sets. c) WGBS methylation analysis at independent ZMYM2 and ATF7IP peaks (N = 159). ZMYM2 peaks are separated by sites overlapping (N = 4201) and non-overlapping (N = 10292) to SETDB1. Box plots denote the interquartile range as a box (IQR) and the lowest and highest values within the range of 1.5 x IQR around the box as whiskers. d) ChIP-qPCR for ATF7IP and ZMYM2 binding at the endogenous Airn ICR in wild type and Zfp57-KO cells. ChIP enrichment is normalized to 5% input and calibrated to a background genomic site (intergenic). Shown are independent technical replicates. e) Volcano plots indicating proteins enriched in the ZFP57_dZNF-TurboID (KRAB domain only) over a background TurboID cell line (nTurbo). Statistically enriched proteins are indicated (FDR-corrected two-tailed t-test: FDR = 0.05, s0 = 1, n = 4 technical replicates). f) Expression levels for ZFP57, ATF7IP and ZMYM2, measured at different timepoints during early embryo development. Shown are RPKM-normalized reads obtained from ref. 73 . g) Immunoblot detection of ATF7IP and ZMYM2 in wild type and KO cells using specific antibodies. Lamin B1 is used as loading control. Asterisk denotes the Zmym2-KO clone used for WGBS analysis. Experiment was repeated at least three times. Molecular weight markers are indicated.

Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
n/a Confirmed The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly The statistical test(s) used AND whether they are one-or two-sided Only common tests should be described solely by name; describe more complex techniques in the Methods section.
A description of all covariates tested A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons A full description of the statistical parameters including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals) For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors and reviewers. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Portfolio guidelines for submitting code & software for further information.