Casilio-ME: Enhanced CRISPR-based DNA demethylation by RNA-guided coupling methylcytosine oxidation and DNA repair pathways

We have developed a methylation editing toolbox, Casilio-ME, that enables not only RNA-guided methylcytosine editing by targeting TET1 to genomic sites, but also by co-delivering TET1 and protein factors that couple methylcytosine oxidation to DNA repair activities, and/or promote TET1 to achieve enhanced activation of methylation-silenced genes. Delivery of TET1 activity by Casilio-ME1 robustly altered the CpG methylation landscape of promoter regions and activated methylation-silenced genes. We augmented Casilio-ME1 to simultaneously deliver the TET1-catalytic domain and GADD45A (Casilio-ME2) or NEIL2 (Casilio-ME3) to streamline removal of oxidized cytosine intermediates to enhance activation of targeted genes. Using two-in-one effectors or modular effectors, Casilio-ME2 and Casilio-ME3 remarkably boosted gene activation and methylcytosine demethylation of targeted loci. We expanded the toolbox to enable a stable and expression-inducible system for broader application of the Casilio-ME platforms. This work establishes an advanced platform for editing DNA methylation to enable transformative research investigations interrogating DNA methylomes.


INTRODUCTION
DNA methylation is part of the multifaceted epigenetic modifications of chromatin that shape cellular differentiation, gene expression, and maintenance of cellular homeostasis. Aberrant DNA methylation is implicated in various diseases including cancer, imprinting disorders and neurological diseases 1 . Developing tools to directly edit the methylation state of a specific genomic locus is of significant importance both for studying the biology of DNA methylation as well as for development of therapies to treat DNA methylation-associated diseases.
In mammalian cells, the 5-methylcytosine (5mC) epigenetic mark, generated by covalent linkage of a methyl group to the 5th position of the cytosine ring of CpG sequences, is catalyzed by one of the three canonical DNA methyltransferases DNMT1, DNMT3A and DNMT3B 2,3,4 . DNA methylation is dynamic and involves demethylation pathways which erase 5mC to restore unmethylated DNA. Active demethylation involves the ten-eleven translocation (TET) family of methylcytosine dioxygenases that iteratively oxidize 5mC into 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC) intermediates 5 . 5fC and 5caC are subsequently processed by the base-excision repair (BER) machinery to restore unmethylated cytosines. Restoration of an intact DNA base is initiated by DNA glycosylases that excise damaged bases to generate an apurinic/apyrimidinic site (AP site) for processing by the rest of the BER machinery. Thymine DNA glycosylase (TDG)-based BER has been functionally linked to TET1-mediated demethylation, suggesting an interplay between TET1 and enzymes of the BER machinery to actively erase 5mC marks. TDG acts on 5fC and 5caC, and the NEIL1 and NEIL2 DNA glycosylase/AP-lyase activities facilitate restoration of unmethylated cytosine by displacing TDG from the AP site to create a single-strand DNA break substrate for further BER processing 6,7,8,9,10,11,12 . Interestingly, DNA demethylation is enhanced by GADD45A (Growth Arrest and DNA-Damage-inducible Alpha), a multi-faceted nuclear protein involved in maintenance of genomic stability, DNA repair and suppression of cell growth 13,14,15 . GADD45A interacts with TET1 and TDG, and was suggested to play a role in coupling 5mC oxidation to DNA repair 16,17 .
Advances in artificial transcription factor (ATF) technologies have enabled direct control of gene expression and epigenetic states 18,19,20 . CRISPR/Cas9-based technologies allow much flexibility and scalability because the specificity is programmable by a single guide RNA (sgRNA) 21,22 . Tethering of TET1 or DNMT3a to genomic targets by use of ATFs has been shown to allow targeted removal or deposition of DNA methylation 23,24,25,26,27,28,29 . However, these ATF systems have inherent limitations in enabling multiplexed targeting, effector multimerization or formation of protein complexes at the targeted sequence. We recently developed the Casilio system which uses an extended sgRNA scaffold to assemble protein factors at target sites, enabling multimerization, differential multiplexing 30 , and potentially stoichiometric complex formation.
Here we develop an advanced DNA methylation editing technology which allows targeted bridging of TET1 activity to BER machinery to efficiently alter the epigenetic state of CpG targets and activate methylation-silenced genes. Casilio-DNA Methylation Editing (ME) platforms enable targeted delivery of the TET1 effector alone (Casilio-ME1) or in association with GADD45A (Casilio-ME2) or NEIL2 (Casilio-ME3) to achieve enhanced 5mC demethylation and gene activation. We showed that Casilio-ME-mediated delivery of TET1 activity to gene promoters induced robust cytosine demethylation within the targeted CpG island (CGI) and activation of gene expression.
When systematically compared to other reported methylation editing systems, Casilio-ME showed superior activities in mediating transcriptional activation of a methylation-silenced gene and 5mC demethylation. The ability of Casilio-ME to mediate co-delivery of TET1 activity along with other protein factors that enhance turnover of oxidized cytosine intermediates paves the way for new areas of research to efficiently address the cause-effect relationships of DNA methylation in normal and pathological processes.

Casilio-ME1 mediates delivery of TET1 activity to a specific genomic locus
Casilio-ME1 is a three-component DNA Methylation Editing platform built on Casilio which uses nuclease-deficient Cas9 (dCas9), an effector module made of a Pumilio/FBF (PUF) domain linked to an effector protein, and a modified sgRNA containing PUF-binding sites (PBS) (Fig. 1a) 30 . The dCas9/sgRNA ribonucleoprotein complex binds DNA targets without cutting to serve as an RNA-guided DNA binding vehicle whose specificity is dictated by the spacer sequence of the sgRNA and a short protospacer adjacent motif located on the target DNA. PUF-tethered effectors are recruited to the ribonucleoprotein complex via binding to their cognate PBS present on the sgRNA scaffold. PUF domains are found in members of an evolutionarily conserved family of eukaryotic RNA-binding proteins whose specificity is encoded within their structural tandem repeats, each of which recognizes a single ribonucleobase 31 . PUF domains can be programmed to bind to any 8-mer RNA sequence, e.g., PUFa and PUFc used in this study were designed to bind PBSa (UGUAUGUA) and PBSc (UUGAUGUA), respectively 30,31 . Multiple PBS added in tandem to the 3' end of the sgRNA allow concurrent recruitment of multiple PUF-effectors to targeted DNA sequences without interfering with dCas9 targeting, and therefore allow amplification of the response to associated effector modules 30 .
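The relationship between PBS copies on the sgRNA and the number of PUF-effector docking sites can be illustrated with a short Python sketch. Only the two 8-mer motifs come from the text; the function names and the three-nucleotide linker are hypothetical placeholders and do not describe the actual Casilio scaffold design.

```python
# Minimal sketch: build a tandem PBS tail and count PUF-binding sites in it.
# The 8-mer motifs are those quoted in the text; everything else is illustrative.
PBS_MOTIFS = {"PBSa": "UGUAUGUA", "PBSc": "UUGAUGUA"}


def build_pbs_tail(pbs_name, copies, linker):
    """Join `copies` of one PBS motif with a linker (hypothetical spacer sequence)."""
    motif = PBS_MOTIFS[pbs_name]
    return linker.join([motif] * copies)


def count_pbs_sites(rna):
    """Count exact 8-mer matches of each PBS motif, scanning every start position."""
    counts = {}
    for name, motif in PBS_MOTIFS.items():
        width = len(motif)
        counts[name] = sum(
            1 for i in range(len(rna) - width + 1) if rna[i:i + width] == motif
        )
    return counts


# Five PBSa copies, as in the sgRNAs described in the text; "GCC" is an assumed linker.
tail = build_pbs_tail("PBSa", 5, linker="GCC")
sites = count_pbs_sites(tail)
```

A non-empty linker matters for the counting step: PBSa repeated back to back contains overlapping UGUAUGUA windows, so a naive scan of an unspaced tail would report more sites than copies.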
To enable targeted cytosine demethylation and subsequent activation of methylation-silenced genes, we built the Casilio-ME1 TET1 effector as a protein fusion of the hTET1 catalytic domain (TET1(CD)) to the carboxyl end of PUFa (Fig. 1a). We chose as a target the MLH1 promoter region that is part of a large CGI whose aberrant hypermethylation induces MLH1 silencing in 10-30% of colorectal and other cancers 32,33 . MLH1 is silenced in HEK293T cells and therefore represents a clinically relevant model for developing Casilio-ME.
To test the system, cells were transiently transfected with plasmids encoding the Casilio-ME1 components: the PUFa-TET1(CD) effector, dCas9 and six MLH1-promoter-targeting sgRNAs each containing five copies of PBSa (Fig. 1a). This resulted in robust MLH1 activation as indicated by the fold changes in MLH1 mRNA in cells collected on day 3 post transfection (Fig. 1b, c upper panel). In contrast, MLH1 activation was not obtained with a non-targeting sgRNA (NT-sgRNA) (Fig. 1b), indicating that Casilio-ME1-mediated MLH1 activation requires specific targeting of the PUFa-TET1(CD) module directed by the programmable sgRNAs.
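For readers unfamiliar with how such mRNA fold changes are typically derived from TaqMan data, the following Python sketch applies the common 2^(-ΔΔCt) calculation with an endogenous control such as GAPDH. The text does not state which quantitation formula was used, so the method, the function name and the Ct values below are assumptions chosen purely for illustration.

```python
# Minimal sketch of relative quantitation by the 2^(-ddCt) method.
# Assumes ~100% amplification efficiency for both assays; Ct values are made up.
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Return target expression in the sample relative to the control condition."""
    # Normalize the target gene to the endogenous control within each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # A lower normalized Ct means more transcript, hence the negative exponent.
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical example: the target crosses threshold 5 cycles earlier after
# treatment while the endogenous control is unchanged.
example = fold_change(25.0, 18.0, 30.0, 18.0)
```

Under these assumptions a 5-cycle shift corresponds to a 32-fold increase; real data would also need replicate handling and an efficiency check, which this sketch omits.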
Evidence that the Casilio-ME1-induced activation of MLH1 results from TET1-mediated 5mC demethylation came from high throughput bisulfite sequencing (BSeq) of MLH1 amplicons derived from the cells analyzed in Fig. 1b. BSeq showed that targeted delivery of the TET1(CD) effector induces a profound decrease in CpG methylation frequency within the MLH1 promoter region (Fig. 1c lower panel). Demethylation activity was prominently higher within CpGs neighboring MLH1-sgRNA sites (Fig. 1c lower panel (arrows)), and seemed to spread away from these sites, albeit with relatively reduced activities.
These data indicate that Casilio-ME1 mediates delivery of TET1 activity to promoter regions to induce 5mC demethylation within the targeted CGI and subsequent activation of the methylation-silenced gene.

Comparison of Casilio-ME1 with other TET1 delivery systems
Although other technologies enabling targeted delivery of TET1 activity to genomic loci have been reported to induce activation of methylation-silenced genes 23,24,26,29 , a direct comparison of their efficiency is lacking. Here we compared the efficiency of Casilio-ME1 in altering expression of a methylation-regulated gene with that of alternative technologies for 5mC demethylation that are based on TALEs, dCas9/MS2 or dCas9/SunTag systems 23,24,26 (Fig. 1d). To this end, we assembled four TALE-TET1(CD) fusions, each designed to bind to one of the four MLH1-sgRNA target sequences used for the dCas9-based delivery systems. Relative quantitation of MLH1 mRNA indicated that the SunTag, TALE or MS2 based systems only achieved 63%, 7% or 1%, respectively, of the Casilio-ME1-mediated activation level (Fig. 1e). BSeq analysis of the MLH1 promoter comparing Casilio-ME1 and SunTag systems showed that Casilio-ME1 induced stronger demethylation at most of the CpG sites examined (Fig. S1). These results suggest that Casilio-ME1-mediated delivery of TET1 activity to a CGI target enables stronger 5mC demethylation and gene activation compared to published systems.

Casilio-ME2: co-delivery of TET1(CD) and GADD45A enhances gene activation
Active 5mC erasure is a two-step process initiated by TET1-mediated iterative 5mC oxidations followed by base-excision (BER) or nucleotide-excision (NER) repair conversions of oxidized intermediates to cytosines 5,15 . Thus, coupling these two steps could streamline 5mC active erasure to efficiently activate methylation-silenced genes.
Thus, coupling of GADD45A and TET1(CD) as a two-in-one effector enhances TET1-mediated activation of methylation-silenced genes.

Casilio-ME3: co-delivery of TET1(CD) and base excision repair enzymes
TDG, NEIL1 and NEIL2 have been functionally linked to active DNA demethylation as they are involved in the initial step of removing oxidized cytosines 5fC and 5caC produced by TET1 activities 6,7,8,9,11,12 . Because initiating repair of oxidized cytosines by the BER machinery might be a rate limiting step to TET1-mediated activation of methylation-silenced genes, we reasoned that coupling TET1 activities with DNA glycosylases could facilitate 5mC active erasure and enhance subsequent gene activation.
To determine whether the enhanced gene activation obtained with Casilio-ME3.3 and ME3.4 systems requires co-targeting of TET1(CD) and NEIL2 to a genomic site and does not result from NEIL2 overexpression, we disabled targeting of NEIL2 effector modules by using sgRNAs comprising PBSa but lacking the PBSc required for targeting PUFc-based NEIL2 effectors (Fig. S5a). Cells transfected with Casilio-ME3.3 or ME3.4 components comprising sgRNAs that lacked PBSc tethering sites showed no significant gains in TET1-mediated MLH1 activation, indicating that enhanced TET1-mediated gene activation requires co-targeting of NEIL2 and TET1(CD) modules via an RNA scaffold (Fig. S5b). Thus, these data show that co-delivery of NEIL2 and TET1(CD) to genomic loci synergistically promotes TET1-mediated gene activation, likely via facilitated coupling of 5mC demethylation and BER activities to efficiently restore unmethylated cytosine to targeted sites.

Comparison of Casilio-ME platforms
Casilio-ME2 and Casilio-ME3 platforms showed an enhanced activation of a methylation-silenced gene compared to Casilio-ME1. Here we sought to compare these platforms to one another in their efficiencies to activate MLH1 and alter the methylation landscape of the targeted CGI. The comparison included the previously reported dCas9-TET1 as an alternative system for reference 25 (Fig. S6b). Only background MLH1 mRNA levels could be detected with TET1(CD)-dead mutants of the Casilio-ME platforms (Casilio-dME) (Fig. S6a, b), indicating that TET1 activity is required for gene activation and that delivery of GADD45A or NEIL2 without TET1 oxidative activity is not sufficient for activating methylation-silenced genes.
To ask whether the augmented MLH1 activation of Casilio-ME2.2 and Casilio-ME3.1 came from an increased efficiency in 5mC erasure, we performed BSeq and oxidative BSeq (oxBSeq) by high throughput amplicon sequencing of MLH1 promoter regions derived from cells transfected with Casilio-ME components or dCas9-TET1. Analysis of 5mC frequencies within the MLH1 CGI showed that Casilio-ME2.2 and Casilio-ME3.1 produced higher demethylation activities compared to Casilio-ME1 and dCas9-TET1 (Fig. 4a upper panel, b). This is consistent with the observed higher accumulation trends of the 5mC-oxidation products 5hmC and bisulfite-converted CpGs (5fC, 5caC and C) (Fig. 4). Interestingly, a noticeable trend appears to exist when looking at the levels of 5mC-oxidation products: Casilio-ME2.2 produced more bisulfite-converted CpGs (5fC, 5caC and C), whereas Casilio-ME3.1 produced more 5hmC (Fig. 4b, c). This apparent difference in demethylation patterns could explain the relative efficiencies of Casilio-ME2.2 and ME3.1 in enhancing MLH1 activation. The higher accumulation of 5hmC in the NEIL2-based Casilio-ME platform could be explained by NEIL2 competing with TET1(CD) for processing 5fC and 5caC substrates to potentially steer TET1 activity more toward the 5mC substrate. Alternatively, TET1 activity may be promoted in the presence of NEIL2 or NEIL2-associated proteins. For Casilio-ME2.2, the observed demethylation profiles are consistent with GADD45A promoting TET1 activity and/or recruiting BER to the target site, leading to accumulation of bisulfite-converted CpGs (5fC, 5caC and C).
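The three categories discussed above (5mC, 5hmC, and bisulfite-converted CpGs) follow from how the two chemistries read each base: 5mC and 5hmC both resist conversion in BSeq, whereas only 5mC resists it in oxBSeq. The Python sketch below shows this standard subtraction for a single CpG. It is an illustration of the general principle rather than the analysis pipeline used here; the function name, the clamping of negative differences and the example fractions are assumptions.

```python
# Minimal sketch: partition one CpG into 5mC, 5hmC and converted fractions
# from the unconverted (read as C) fractions measured by BSeq and oxBSeq.
def partition_cpg(bs_unconverted, oxbs_unconverted):
    """Inputs are fractions of reads showing C at the CpG in each library."""
    for value in (bs_unconverted, oxbs_unconverted):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    five_mc = oxbs_unconverted  # only 5mC survives oxidative bisulfite treatment
    # 5hmC is the excess protection seen in BSeq; sampling noise can make the
    # difference slightly negative, so it is clamped at zero (an assumption).
    five_hmc = max(bs_unconverted - oxbs_unconverted, 0.0)
    converted = 1.0 - bs_unconverted  # C, 5fC and 5caC all read as T
    return {"5mC": five_mc, "5hmC": five_hmc, "converted": converted}


# Hypothetical CpG: 80% unconverted in BSeq, 60% unconverted in oxBSeq.
example = partition_cpg(0.8, 0.6)
```

With these illustrative inputs the CpG splits into roughly 60% 5mC, 20% 5hmC and 20% converted bases. Note that the converted class cannot distinguish 5fC, 5caC and unmodified C, which is why the text groups them together.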
Evidence that the enhanced gene activation obtained with Casilio-ME2.2 and Casilio-ME3.1 required fully active GADD45A or NEIL2, respectively, was obtained when point mutations were introduced to alter key functional features or inactivate the corresponding proteins. GADD45A lacks any obvious enzymatic activity; however, previous reports pointed us to key amino acids required for chromatin interaction (G39A) or dimerization/self-association (L77E) of the protein 34,35 . Catalytically inactive NEIL2 with (C291S) or (R310Q) mutations located at the zinc finger domain required for NEIL2 binding to DNA substrate were also reported 36 . Introduction of these point mutations to Casilio-ME2.2 or Casilio-ME3.1 abrogated the enhanced MLH1 activation (Fig. S7a). The reduced MLH1 activations obtained were not due to protein destabilization caused by amino-acid changes introduced to GADD45A and NEIL2 (Fig. S7b, c). Interestingly, Casilio-ME3.1 containing the NEIL2(R310Q) mutation seemed to retain a weak enhancement that is likely attributable to residual catalytic and DNA-binding activities of the R310Q NEIL2 mutant (Fig. S7a) 36 .
The enhanced demethylation activities, taken together with the fact that the boost in MLH1 activation mediated by Casilio-ME2.2 and Casilio-ME3.1 required TET1 oxidative activity and functionally active GADD45A or NEIL2 enhancer proteins, is consistent with the idea that these platforms might facilitate bridging oxidative removal of 5mC to DNA repair pathways to efficiently restore unmethylated cytosine to targeted loci.

Evaluation of potential off-target activity and mutagenicity of Casilio-ME platforms
The CRISPR/dCas9 system inherently tolerates some mismatches between guide RNAs and genomic loci, which can give rise to off-target effects 37,38 . To evaluate Casilio-ME platforms for potential off-target effects, we performed reduced representation bisulfite sequencing (RRBS) of genomic DNA extracted from cells transfected with Casilio-ME components, dCas9-TET1 or SunTag systems in the presence of either MLH1 or non-targeting sgRNAs. Pairwise comparisons between all samples, including untransfected cells, gave similar correlations. The correlations were within the same range as previously reported for RRBS replicates 39 , suggesting that off-target activities associated with the Casilio-ME platforms, if any, do not exceed those of alternative 5mC editing systems (Fig. S8a).
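A pairwise comparison of this kind can be sketched in Python as follows. The text does not specify the correlation statistic or the CpG filtering applied, so the use of Pearson correlation over CpGs covered in both samples, the data layout and the sample names are all assumptions made for illustration.

```python
# Minimal sketch: pairwise Pearson correlation of per-CpG methylation fractions.
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(samples):
    """samples maps sample name -> {cpg_id: methylation fraction}."""
    result = {}
    for a, b in combinations(sorted(samples), 2):
        shared = sorted(set(samples[a]) & set(samples[b]))  # CpGs seen in both
        result[(a, b)] = pearson(
            [samples[a][c] for c in shared], [samples[b][c] for c in shared]
        )
    return result


# Hypothetical toy data; identifiers and values are invented.
toy = {
    "untransfected": {"cpg1": 0.1, "cpg2": 0.5, "cpg3": 0.9, "cpg4": 0.7},
    "ME1_NT": {"cpg1": 0.1, "cpg2": 0.5, "cpg3": 0.9},
    "ME1_MLH1": {"cpg1": 0.9, "cpg2": 0.5, "cpg3": 0.1},
}
correlations = pairwise_correlations(toy)
```

In a real RRBS analysis one would also apply a minimum read-coverage filter per CpG before correlating, which is omitted here.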
To evaluate further the specificity of the Casilio-ME platforms, we performed systems could represent off-target effects or reflect potential transcriptome changes subsequent to MLH1 reactivation (Fig. S8b).
Recruitment of BER associated proteins via Casilio-ME platforms might introduce mutations to targeted sites. To evaluate potential mutagenicity of Casilio-ME, we performed deep sequencing of the MLH1 locus, a 1kb targeted region that comprises the promoter and part of the first exon, and compared sequence identity distribution among reads to untransfected cells. Casilio-ME transfected cells showed no significant difference in sequence identity within MLH1 reads, ruling out the possibility of Casilio-ME platforms introducing mutations to targeted sites (Fig. S9b, c).

Portability of the Casilio-ME platforms
To show that Casilio-ME platforms enable efficient activation of other 5mC-silenced MLH1 mRNA level also increased in response to increasing amounts of doxycycline ( Fig. S11b). Without doxycycline added only background levels of MLH1 were detected and no detectable amounts of Casilio-ME1 protein components were observed in Western blot analysis of protein extracts from transfected cells (Fig. S11b, c). This DIP_Casilio-ME1 will enable establishment of isogenic cell lines that can be used to study different target CGIs in a tunable manner by supplying different target-specific sgRNAs and adjusting doxycycline dosage.

DISCUSSION
The present study establishes a modular RNA-guided DNA methylation editing platform that not only recruits the TET1 effector to initiate DNA demethylation by 5mC oxidations, but also delivers protein factors to facilitate coupling 5mC oxidation to DNA repair pathways to effectively restore intact DNA to targeted sites. Such dual delivery enhanced 5mC demethylation at CGI target and augmented gene activations when compared to TET1(CD) delivered alone. In addition to the robustness of the platform, the modular design of Casilio-ME allows a high degree of tunability and flexibility in editing 5mC epigenetic marks.
Turnover of 5fC and 5caC by DNA repair machinery lags behind TET1-mediated 5mC oxidations as these intermediates accumulate before getting converted to unmethylated cytosine 40 . Coupling TET1 activity with BER or NER pathways could accelerate 5fC and 5caC turnover, thereby enhancing activation of methylation-silenced genes. Consistent with this idea, Casilio-ME2 and Casilio-ME3 platforms designed to facilitate coupling TET1 and DNA repair activities gave an enhanced gene activation and CpG demethylation of targeted sites. This enhanced gene activation requires TET1 catalytic activity, fully functional GADD45A or NEIL2 proteins and co-targeting relevant effectors in close proximity to genomic target sites.
Previous studies revealed interesting functional and physical interactions among proteins involved in oxidizing 5mC and removal of oxidized cytosine intermediates via BER or NER. NEIL2 promotes substrate turnover by TDG during DNA demethylation 12 . GADD45A physically interacts with TET1 or TDG and seems to promote TET1 activity, and enhances removal of 5fC and 5caC by TDG 14,16,17 . GADD45A also recruits repair enzymes such as the 3′-NER endonuclease XPG to genomic DNA sites 41,42 . As GADD45A is devoid of any enzymatic activity, it was proposed to function as a liaison protein to physically couple 5mC oxidation with DNA repair 16 . Consistent with these observations, Casilio-mediated co-targeting of TET1(CD) with GADD45A or NEIL2 within close proximity of their substrates enhanced 5mC demethylation and activation of methylation-silenced genes. However, the addition of TDG to Casilio-ME modules failed to augment gene activation, possibly because the TDG protein fusions tested here might not be functional or other factor(s) might be required for TDG to produce enhanced gene activation.
Different levels of activation of methylation-silenced genes could be obtained by using one of the three flavors of Casilio-ME and by varying doxycycline concentrations with the DIP_Casilio-ME1 platform described here. This equips Casilio-ME platforms with a unique capability to fine-tune gene activation. These Casilio-ME platforms significantly expand 5mC editing capability to efficiently address the cause-effect relationships of methylcytosine epigenetic marks in numerous biological and pathological systems.

Quantitative RT-PCR analysis
Harvested cells were washed with Dulbecco's phosphate-buffered saline (dPBS), centrifuged at 125 x g for 5 min and then the flash-frozen pellets were stored at -80ºC.
Extracted RNA (500 ng to 2 µg) was used as template to make cDNA libraries using a High Capacity RNA-to-cDNA kit (Applied Biosystems). TaqMan gene expression assays were designed using GAPDH (Hs03929097, VIC) as an endogenous control and CDH1

Bisulfite and oxidative bisulfite sequencing (BSeq and oxBSeq)
Bisulfite and oxidative bisulfite conversion experiments were performed by using the EpiTect Fast DNA Bisulfite Kit, True Methyl oxBS Module and genomic DNA according to manufacturers' instructions (Qiagen, NuGen respectively). Bisulfite-treated DNA served as template to PCR-amplify three DNA fragments 350-400 bp long that cover the entire MLH1 promoter region using ZymoTaq PreMix according to manufacturer's instructions (Zymo Research). The MLH1 PCR fragments were then cloned by SLIC into EcoRI-linearized pUC19 plasmid using T4 DNA polymerase 45 . Ten independent positive clones for each sample were then subjected to Sanger sequencing to determine methylation profiles based on bisulfite-mediated cytosine to thymine conversion frequency of individual CpGs. MLH1 amplicons obtained from bisulfite-converted DNA templates and from unconverted DNA were subjected to high throughput sequencing (2 x 250 paired-end reads) conducted at GENEWIZ (South Plainfield, NJ, USA). Fifty to 120 thousand reads were obtained per sample. Sequence analysis of the plasmids extracted from MLH1 clones to determine methylation frequencies was performed by using BiQ Analyzer 3 with minimal bisulfite conversion rate and sequence identity set to 97% and 95%, respectively 46 . Reads from high throughput amplicon sequencing, on the other hand, were analyzed for 5mC and 5hmC by using BiQ Analyzer HiMod with minimal read quality score, alignment score, sequence identity and bisulfite conversion rate set to 30, 1000, 0.9 and 0.9, respectively 47 . Mean bisulfite conversion rates in the retained sequences were >0.98. BiQ Analyzer HiMod was used without sequence identity filter to analyze sequence integrity of MLH1 amplicons derived from genomic DNA without bisulfite treatment.
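The logic of calling methylation from cytosine-to-thymine conversion, together with a read-level conversion-rate filter, can be sketched in Python as below. This is not BiQ Analyzer's implementation: it assumes reads are already aligned to the unconverted reference with no indels, estimates the conversion rate only from non-CpG cytosines, and uses hypothetical function names and toy sequences. The 0.9 threshold mirrors the minimal conversion rate quoted above for the high throughput reads.

```python
# Minimal sketch: per-CpG methylation frequency from aligned bisulfite reads.
def cpg_positions(ref):
    """0-based indices of the C in every CG dinucleotide of the reference."""
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def conversion_rate(ref, read):
    """Fraction of non-CpG reference cytosines read as T (None if uninformative)."""
    non_cpg = [i for i, base in enumerate(ref) if base == "C" and ref[i:i + 2] != "CG"]
    calls = [read[i] for i in non_cpg if read[i] in "CT"]
    if not calls:
        return None
    return sum(c == "T" for c in calls) / len(calls)


def methylation_frequencies(ref, reads, min_conversion=0.9):
    """Return {CpG index: fraction of retained reads showing C (unconverted)}."""
    kept = []
    for read in reads:
        if len(read) != len(ref):
            continue  # sketch handles only gap-free, full-length alignments
        rate = conversion_rate(ref, read)
        if rate is not None and rate >= min_conversion:
            kept.append(read)
    freqs = {}
    for i in cpg_positions(ref):
        calls = [r[i] for r in kept if r[i] in "CT"]
        freqs[i] = sum(c == "C" for c in calls) / len(calls) if calls else None
    return freqs


# Toy reference with CpGs at indices 1 and 7 and one non-CpG C at index 4.
reference = "ACGTCATCGA"
toy_reads = [
    "ACGTTATTGA",  # converted at index 4; CpG 1 methylated, CpG 7 unmethylated
    "ACGTTATCGA",  # converted at index 4; both CpGs methylated
    "ACGTCATCGA",  # unconverted at index 4, so the read is filtered out
]
frequencies = methylation_frequencies(reference, toy_reads)
```

In this toy case the third read fails the conversion filter, leaving the first CpG fully methylated and the second methylated in half of the retained reads.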

Reduced representation bisulfite sequencing (RRBS)
Library preparation for RRBS was performed according to manufacturer's instructions (Diagenode). Briefly, 100 ng of genomic DNA for each sample was enzymatically digested, end-repaired and ligated with an adaptor. Samples with different adaptors were then pooled together and subjected to bisulfite treatment followed by