Site-specific manipulation of Arabidopsis loci using CRISPR-Cas9 SunTag systems

Understanding genomic functions requires site-specific manipulation of loci via efficient protein effector targeting systems. However, few approaches for targeted manipulation of the epigenome are available in plants. Here, we adapt the dCas9-SunTag system to engineer targeted gene activation and DNA methylation in Arabidopsis. We demonstrate that a dCas9-SunTag system utilizing the transcriptional activator VP64 drives robust and specific activation of several loci, including protein coding genes and transposable elements, in diverse chromatin contexts. In addition, we present a CRISPR-based methylation targeting system for plants, utilizing a SunTag system with the catalytic domain of the Nicotiana tabacum DRM methyltransferase, which efficiently targets DNA methylation to specific loci, including the FWA promoter, triggering a developmental phenotype, and the SUPERMAN promoter. These SunTag systems represent valuable tools for the site-specific manipulation of plant epigenomes.



De novo DNA methylation in Arabidopsis
The de novo establishment of DNA methylation in Arabidopsis is carried out by the RNA-directed DNA methylation (RdDM) pathway 1 [3][4][5]. Importantly, these maintenance pathways ensure the faithful inheritance of DNA methylation states throughout meiotic generations in plants.

DNA demethylation
DNA methylation is a stable epigenetic mark that can be efficiently propagated through generations. However, in some instances such as during development, or during a response to stress, methyl marks are removed in order to alter gene expression states. DNA demethylation can occur through both a passive mechanism and an active mechanism. During passive demethylation, methylation is diluted out as successive rounds of DNA replication progress and the maintenance machinery fails to copy the methylated state of parental strands. During active demethylation, DNA methylation is enzymatically removed from specific genomic loci.
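The dilution that underlies passive demethylation can be illustrated with a minimal sketch. It assumes a simplified model that is not taken from the text: a symmetric site whose parental strands are retained at each replication, with each newly synthesized strand methylated with probability `maintenance_efficiency` when its template is methylated. The function name and parameter are hypothetical and chosen only for illustration.

```python
def methylated_fraction(rounds, maintenance_efficiency=0.0, initial=1.0):
    """Fraction of strands methylated at a site after `rounds` replications.

    Assumed model: each round keeps every parental strand and adds one new
    strand per parent; the new strand is methylated with probability
    `maintenance_efficiency` if its template is methylated. The fraction is
    therefore multiplied by (1 + maintenance_efficiency) / 2 per round.
    """
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    per_round = (1.0 + maintenance_efficiency) / 2.0
    return initial * per_round ** rounds


# With no maintenance, methylation halves at every round of replication.
for n in range(4):
    print(n, methylated_fraction(n))
```

Under this model, fully efficient maintenance keeps the mark indefinitely, whereas a complete failure of maintenance halves it at each division, which is the passive route described above.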
Active demethylation in plants is achieved through the removal of a methylated base, involving glycosylase activity and the base excision repair pathway 1. The family of DNA glycosylases in Arabidopsis includes ROS1, DME, DML2, and DML3. In mammals, one mechanism to demethylate 5mC is through hydroxylation of the methyl group. TET1 and other members of the ten-eleven translocation family can catalyze the formation of the 5-hydroxymethylcytosine mark from 5-methylcytosine. 5-hydroxymethylcytosine can then be actively or passively removed 1.

Epialleles in plants
The

CRISPR-Cas9-mediated epigenome engineering
Recent advances in genome engineering have allowed researchers to carefully dissect the functions of individual genes, and gene families, by inducing site-specific mutations at loci of interest. Numerous systems have been developed for this purpose, such as those involving zinc finger (ZF) proteins, transcription activator-like (TAL) effectors, and the clustered regularly interspaced short palindromic repeats (CRISPR)-Cas systems 11,12.
ZFs are DNA binding domains that can recognize approximately three base pairs of DNA. These ZFs can be engineered to bind a specific sequence by changing a limited number of amino acid residues within the domain. They can be engineered into an array with specific amino acid residues within each domain that can subsequently recognize a target DNA sequence. ZF proteins can then be coupled to nuclease domains to create zinc finger nucleases (ZFNs) that can cleave a target and induce mutations 11,12 . In addition to creating site-specific nucleases, ZFs can be coupled to other effector proteins, such as transcriptional modulators, in order to alter gene expression states. Similar to ZFs, TAL effectors consist of a repetitive array of highly conserved amino acid repeats. These repeats contain two hypervariable amino acids, and depending on the residues, each repeat is able to recognize a specific nucleotide. TAL effectors can thus be designed to target a specific genomic region, and can also be coupled with effectors like nucleases to induce changes at these regions 11,12 . Although ZFs and TAL effectors are highly effective in many cases, they can be laborious to design and verify, and are difficult to use for multiplexing.
Recent advances surrounding the CRISPR-Cas9 system have made genome editing experiments simpler to design and more specific, and have made multiplexing approaches more feasible.
CRISPR-Cas9 is derived from the type II CRISPR-Cas system. These systems endow bacteria with adaptive immunity against viruses and other invaders. In contrast to ZFs and TAL effectors, the CRISPR-Cas9 system is RNA guided, and does not require the engineering of specific protein residues to recognize particular target DNA sequences. Cas9 is a DNA endonuclease that requires two domains to cleave DNA: an HNH domain and a RuvC-like domain 12. Cas9 is guided to its target site by a CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA) duplex, where the 20-nucleotide spacer region of the crRNA recognizes specific targets. This spacer region can be replaced with another sequence in order to target Cas9 to any locus for cleavage and subsequent functional studies. The crRNA:tracrRNA duplex has since been fused into a single guide RNA (sgRNA), so that only one RNA is required for genome editing applications. In addition, before the guide RNA can pair with a complementary DNA sequence, Cas9 requires the presence of a protospacer adjacent motif (PAM) sequence next to the target site 12. Expressing multiple guide RNAs simultaneously also allows for multiplexing to edit numerous loci at once, which can be advantageous for many research applications, such as obtaining null alleles of a family of redundant genes.
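The targeting rule described above, a 20-nucleotide sequence lying immediately next to a PAM, can be sketched as a simple sequence scan. This is an illustrative example only, not the guide design procedure used in this work. It assumes the NGG PAM of the commonly used Streptococcus pyogenes Cas9, which the text does not specify, and the function names are hypothetical.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_candidate_targets(seq, spacer_len=20):
    """List candidate sites as (strand, start, protospacer, pam) tuples.

    Assumption: the PAM is NGG and lies directly 3' of the protospacer.
    `start` is a 0-based offset on the scanned strand, so for "-" hits it
    is counted along the reverse complement of the input.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    hits = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        # The lookahead lets overlapping candidate sites all be reported.
        for match in pattern.finditer(scanned):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Toy sequence: 20 A's followed by a TGG PAM.
for hit in find_candidate_targets("A" * 20 + "TGG"):
    print(hit)
```

A scan of this kind only enumerates PAM-adjacent candidates; choosing guides in practice would also involve checking for similar sequences elsewhere in the genome.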
In addition to its use for DNA modifications, the CRISPR-Cas9 system can also be used to recruit protein effectors of interest to target sites. Mutating the HNH and RuvC domains of Cas9 removes its ability to cleave DNA, producing catalytically dead Cas9 (dCas9) 12. Thus, when fused to an effector protein, dCas9 recruits the effector to a sequence complementary to the guide RNA. For example, dCas9 and multiple next-generation derivatives of dCas9 have been coupled to transcriptional activators to induce expression changes 13. One such next-generation system is the SunTag system. It consists of two modules: one consisting of dCas9 fused to a tandem repeat epitope (GCN4) tail, and a second consisting of a single chain variable fragment (scFv) GCN4 antibody fused to superfolder-GFP (sfGFP) followed by VP64 14,15. This system has been used to overexpress genes in mammalian cell lines 14,15. The SunTag system has also been adapted for DNA demethylation and methylation targeting in mammals [16][17][18]. However, there had been no CRISPR-Cas9-based tools developed to alter DNA methylation patterns in order to engineer the epigenome in plants. I have developed a plant-specific SunTag system for DNA demethylation with the TET1 catalytic domain, a plant-specific SunTag system for gene expression modulation with VP64, and have coupled SunTag to a DNA methylation effector in order to specifically and efficiently target DNA methylation in plants. This system provides a toolkit for highly robust and site-specific modulation of the epigenome, and furthermore, provides a novel means to systematically study the causal relationships between gene expression and DNA methylation in plants.

Conclusions
These CRISPR-Cas9 SunTag systems represent a powerful new set of tools for the manipulation of gene expression in plants. By recruiting various effector proteins to loci of interest with the SunTag system, the effects of directly altering expression states and epigenetic marks can be studied at specific loci, rather than by relying on epigenetic mutants, which can have numerous indirect effects.
As described in Chapters 2 and 3, SunTag TET1cd and SunTag VP64 can be utilized to demethylate and activate transposable elements. Thus, these tools can be used to activate specific copies or families of TEs in order to observe the effects of TE activation on genomic integrity and the effects of mobile element insertions into new genomic loci. Mutants that lead to altered epigenomes can also be used for this purpose; however, they give only an aggregate view of the biological consequences of reactivating many TEs and genes simultaneously. In addition to studying the biological effects of TE upregulation, TE activation can also be used as an approach for directed evolution to endow genes with novel functions that are beneficial to plant development.
As described in Chapter 3, targeting SunTag VP64 to the FWA promoter and to the 5' end of EVD and ATR led to the activation of these loci. Additionally, there was a decrease or complete loss of methylation observed at the targeted regions. This is a novel observation, which suggests that gene expression and silencing pathways are competing, and targeted activation is interfering with methylation maintenance directly or through an indirect mechanism. Thus, SunTag VP64 can be used to activate methylated genes and TEs, and reduce or abolish methylation near the 5' end of coding regions.
As shown in Chapter 3, SunTag NtDRMcd is able to target DNA methylation to specific genomic loci. This new tool can be used to study the effects of ectopic methylation upon gene expression at loci that usually lack or have minimal amounts of DNA methylation. By targeting SunTag NtDRMcd to different regions within a promoter, regulatory elements can be identified by observing which of the methylated targets had a corresponding effect on gene expression. In addition, by modulating the amount of targeted DNA methylation, such as by varying the number of guide RNAs that are tiled or targeted to a certain region, the expression of a target gene may be fine-tuned, such that lower methylation levels may lead to a slight reduction in gene expression, whereas a large patch of promoter methylation may lead to complete silencing.
The initial version of SunTag NtDRMcd exhibited genome-wide off-target methylation activity. However, an improved version of the construct lacking a nuclear localization signal (NLS) led to limited off-target methylation. This is a novel development that extends beyond plant epigenome engineering, and can also be applied to targeting tools in mammalian systems. The SunTag NtDRMcd system can thus be used to target DNA methylation to a particular locus, and potentially create novel epialleles that can be propagated through meiotic generations in the absence of the transgene encoding the SunTag system. As a result, SunTag-mediated methylation targeting may be a valuable tool for the improvement of crops. For example, new epialleles may be generated that confer bacterial or viral disease resistance to important crops, such as by silencing the expression of genes utilized by bacteria or viruses during pathogenesis. In addition, expression levels of genes in the host plant may be fine-tuned in order to avoid developmental and pleiotropic effects upon the plant as a result of repression, thus maximizing total yield.
These SunTag tools provide effective methods to alter the expression profiles of specific loci in plants and have broad applications in basic research and plant biotechnology.