Introduction

DNA methylation refers to the enzymatic transfer of a methyl group to specific nucleotides within the DNA sequence. In eukaryotes, this modification almost exclusively affects cytosines. Although clearly ancestral, cytosine methylation is not universally present in the eukaryotic tree of life. Thus, the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe, as well as the nematode Caenorhabditis elegans have no DNA methyltransferases (DNA MTases; Goll and Bestor, 2005). Furthermore, Drosophila contains a single, enigmatic DNA MTase-like protein (Goll and Bestor, 2005) and it is unclear if this species actually methylates DNA. By contrast, DNA methylation is readily detected in plants and mammals, where it is critical for normal development and genome stability. Although numerous hypotheses have been proposed (Bird, 1995; Yoder et al., 1997; Martienssen, 1998; Regev et al., 1998; Colot and Rossignol, 1999; Suzuki and Bird, 2008), it is still a mystery as to why these organisms cannot dispense with cytosine methylation when many lower eukaryotes can.

In plants and mammals, most methylated cytosines are found over repeat elements (Goll and Bestor, 2005; Suzuki and Bird, 2008; Law and Jacobsen, 2010) and loss of this modification is associated with transcriptional reactivation as well as increased mobilization of transposable elements (TEs; Slotkin and Martienssen, 2007). These observations likely reflect the ancestral role of cytosine methylation in the defence against invasive DNA. Methylation of repeat elements is also thought to have been exapted recurrently during eukaryotic evolution to exert other essential functions, notably in the epigenetic regulation of genes. Thus, genomic imprinting, which results in parent-of-origin-dependent expression, may have evolved in plants and mammals from situations in which methylation of repeat elements influences the activity of neighbouring genes (Barlow, 1993; Martienssen, 1998; Youngson and Whitelaw, 2008; Berger and Chaudhury, 2009; Köhler and Weinhofer-Molisch, 2009). Non-repeat sequences can be methylated too, and methylation of such sequences in the context of gene promoters often correlates with transcriptional silencing in plants and mammals (Henderson and Jacobsen, 2007; Suzuki and Bird, 2008; Ooi et al., 2009). However, the exact function(s), if any, of much of the DNA methylation found outside of repeat elements remains unclear (Henderson and Jacobsen, 2007; Suzuki and Bird, 2008).

Thanks to a near-complete genome sequence annotated to very high standards, a comprehensive set of genomics tools and powerful genetics, the flowering plant Arabidopsis has rapidly become a prime model for the study of DNA methylation and its inheritance patterns in higher eukaryotes. Here, we review our current understanding of DNA methylation in Arabidopsis, with particular emphasis on the interplay between the mechanisms that enable the establishment and maintenance of this modification over repeat elements within and between generations. The role of RNAi in the incremental methylation and silencing of repeat elements over successive generations is highlighted. We argue that paramutation, first described in maize, is an extreme manifestation of this RNAi-dependent pathway.

Sequence context and genomic distribution of DNA methylation

In all plant species examined to date, cytosine methylation is not restricted to CG sites; it also affects CHG and CHH sites, where H=A, T or C. Two studies have combined bisulfite treatment of genomic DNA, which converts unmethylated cytosines to uracils but leaves methylated cytosines intact, with next-generation sequencing to provide unprecedented genome-wide views of DNA methylation at a single base resolution in Arabidopsis (Cokus et al., 2008; Lister et al., 2008). Consistent with HPLC measurements (Rozhon et al., 2008), these two studies reported methylation of 6–7% of all Cs. In total, 55% of methylcytosines are within CG sites, the rest being equally partitioned between CHG and CHH sites. These values reflect major differences in DNA methylation patterns between genes and repeat elements (see below), the relatively small number of such elements in the Arabidopsis genome (15–20%; AGI, 2000; Buisine et al., 2008) and a typically higher level of methylation at CG than at CHG and CHH sites (>90%, 30–80% and <40%, respectively). Thus, 23% of CG sites, but only 7–9% of CHG and 2% of CHH sites, are methylated in Arabidopsis.
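The three sequence contexts and the logic of bisulfite conversion can be illustrated with a minimal Python sketch. The function names (`cytosine_context`, `bisulfite_convert`) and the example sequence are hypothetical and chosen only for illustration. The sketch considers a single strand and writes converted cytosines as T, which is how uracils appear in reads after amplification.

```python
def cytosine_context(seq, i):
    """Classify the cytosine at index i of seq (read 5'->3') as CG, CHG or CHH.

    Returns None when the context cannot be determined (end of sequence
    or a non-ACGT base). H stands for A, T or C.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    following = seq[i + 1:i + 3]
    if following[:1] == "G":
        return "CG"
    if len(following) < 2 or following[0] not in "ACT":
        return None
    if following[1] == "G":
        return "CHG"
    if following[1] in "ACT":
        return "CHH"
    return None


def bisulfite_convert(seq, methylated_positions):
    """Convert unmethylated cytosines to T; methylated cytosines stay C."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq.upper())
    )


example = "CGACAGCTTCAA"  # hypothetical sequence
contexts = {i: cytosine_context(example, i)
            for i, base in enumerate(example) if base == "C"}
converted = bisulfite_convert(example, methylated_positions={0, 3})
print(contexts)
print(converted)
```

Comparing the converted read with the reference identifies the methylated positions: any C that survives conversion was methylated, and its context follows from the one or two bases downstream.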

As in other plant species, most DNA methylation in Arabidopsis aligns with repeat elements, which tend to cluster within pericentromeric regions (Zhang et al., 2006; Zilberman et al., 2007; Cokus et al., 2008; Lister et al., 2008). Irrespective of their location in the genome, however, repeat elements are typically methylated at CG, CHG and CHH sites over their entire length. Furthermore, this dense DNA methylation strongly correlates with dimethylation of histone H3 lysine 9 (H3K9me2), a classic heterochromatic mark, and transcriptional silencing (Roudier et al., 2009). Finally, DNA methylation of repeat elements is consistent across Arabidopsis accessions (Vaughn et al., 2007; Zhang et al., 2008), indicating that it is stably deposited over these sequences and/or that its absence is strongly counter selected.

Early genome-wide mapping of DNA methylation in Arabidopsis using tiling arrays revealed that, in addition to repeat elements, 20–30% of Arabidopsis genes are methylated (Zhang et al., 2006; Zilberman et al., 2007). This gene methylation is restricted to only part of the transcribed region and is associated with expression rather than silencing (Zilberman et al., 2007). In addition, the two whole-genome bisulfite sequencing studies cited above established that gene body methylation is restricted almost exclusively to CG sites (Cokus et al., 2008; Lister et al., 2008). Furthermore, genes appear to display variable DNA methylation between Arabidopsis accessions (Vaughn et al., 2007; Zhang et al., 2008). Although current evidence suggests that gene body methylation is a by-product of transcription by RNA Polymerase II (Pol II) and has limited functional consequences (Roudier et al., 2009; Law and Jacobsen, 2010), the extent to which it varies in relation to transcriptional activity during development, or in response to environmental changes, remains to be determined.

Establishment of DNA methylation over repeat elements

The establishment of DNA methylation over specific sequences is a first and critical step in conferring biological significance to this modification. The observation that viroid double-stranded RNA can trigger methylation of homologous DNA provided the initial clue that in plants, RNA is implicated in this step (Wassenegger et al., 1994). Genetic and molecular studies, carried out mainly in Arabidopsis, have since revealed the existence of an RNAi-dependent de novo DNA methylation pathway in plants, which affects both transgenes and endogenous repeat elements (Huettel et al., 2007; Law and Jacobsen, 2010). In the current model of RNA-directed DNA methylation (RdDM), illustrated in Figure 1a, repeat elements are first transcribed by RNA polymerase IV (Pol IV), one of two Pol II-related plant-specific RNA polymerases (Lahmy et al., 2010), to generate long single-stranded RNAs. These are converted by RNA-DEPENDENT RNA POLYMERASE 2 (RDR2) into long double-stranded RNAs, which are in turn processed by the RNase III enzyme DICER-LIKE 3 into 24-nt small interfering RNAs (siRNAs). These siRNAs are loaded onto a silencing complex containing ARGONAUTE 4 (AGO4) and interact at target loci with complementary RNA transcripts produced by the second non-canonical RNA polymerase Pol V, which is closely related to Pol IV (Wierzbicki et al., 2008, 2009). In an alternative version of the model, Pol V serves instead as an anchoring platform for the direct pairing of AGO4-charged siRNAs to their complementary DNA sequences (Wierzbicki et al., 2008). In the next step, the DNA MTase DOMAINS REARRANGED METHYLTRANSFERASE 2 (DRM2), which is the most active Arabidopsis homolog of mammalian de novo DNA MTases (Cao et al., 2000; Goll and Bestor, 2005), is recruited by the Pol V–AGO4–siRNA complex in a sequence-specific manner to establish methylation at CG, CHG and CHH sites (Cao and Jacobsen, 2002b; Cao et al., 2003). 
Pol V and AGO4 also contribute to the accumulation of siRNAs at some loci (Figure 1a), suggesting either a reinforcement loop (Kanno et al., 2005; Pontier et al., 2005; Mosher et al., 2008) or a role for these proteins in siRNA stabilization. Several additional proteins are implicated in de novo DNA methylation, such as the SRA- and SET-domain-containing histone methyltransferases SUPPRESSOR OF VARIEGATION 3-9 HOMOLOGUE 2 (SUVH2) and SUVH9 as well as the putative transcription elongation and chromatin-remodelling factors SUPPRESSOR OF TY INSERTION 5-LIKE (SPT5L, also known as KTF1) and DEFECTIVE IN RNA-DIRECTED DNA METHYLATION (DRD1), but their precise function is less well understood (Law and Jacobsen, 2010). Finally, genome-wide bisulfite and small RNA sequencing data suggest that methylation occurs on the DNA strand of the same polarity (Lister et al., 2008), which may hint at how siRNAs guide DNA methylation.

Figure 1

Establishment and maintenance of DNA methylation. (a) Proposed model of de novo methylation by the RdDM pathway. Primary RNA transcripts are thought to be produced by RNA polymerase Pol IV (or perhaps Pol II, not shown) and converted by the RNA-dependent RNA polymerase RDR2 into long dsRNAs. Intra- or intermolecular long dsRNAs could also be produced from inverted repeats or as a result of sense/antisense transcription, respectively (not shown). The long dsRNAs are then processed by the RNase III enzyme DCL3 into 24-nt siRNAs, which are loaded into a silencing complex containing AGO4. Formation of the siRNA-loaded AGO4 complex, in concert with transcription of the target locus by the RNA polymerase Pol V, would lead to the recruitment of the DNA MTase DRM2 to mediate de novo DNA methylation of the target locus in all sequence contexts. Transcripts produced by either Pol IV or Pol V and corresponding to the methylated region would then be used to amplify the production of siRNAs, creating a reinforcing loop. All cytosines on the two DNA strands are shown as methylated for simplicity. However, not all cytosines within a target sequence are expected to become methylated at once. Black and coloured Cs represent unmethylated and methylated (m) sites, respectively (CG, red; CHG, blue; CHH, green). (b) Proposed model of maintenance of DNA methylation. After DNA replication, the newly synthesized strand (grey) is unmethylated. The SRA- and RING-domain-containing protein VIM1 is thought to recognize hemi-methylated CG sites and help recruit the DNA MTase MET1 to these sites. Maintenance of CHG methylation is thought to involve a reinforcing loop between the plant-specific DNA MTase CMT3 and the H3K9me2 methyltransferase SUVH4/KYP. CHH methylation is propagated in a locus-specific manner by the constant action of RdDM (represented only by AGO4 and DRM2 for simplicity) and/or by CMT3 and MET1 (not shown). Colours as in (a).

Most steps of this model are based on genetic evidence and await biochemical validation. For example, the nature of the primary transcripts required to initiate RNAi remains ill defined, and the mechanism by which DRM2 is recruited to target loci is unknown. Furthermore, as shown for AGO4, AGO6 and AGO9, members of a given protein family may participate to different degrees in RdDM as a result of functional diversification (Zheng et al., 2007; Havecker et al., 2010). Moreover, the observation that Pol IV and Pol V are not involved in the production of strand-specific clusters of 24-nt siRNAs that match long inverted duplications (Kasschau et al., 2007; Zhang et al., 2007) implies that at least in this case, siRNAs must derive from transcripts produced by one of the three canonical RNA polymerases, presumably Pol II. In fact, transcripts produced by Pol II are likely to be the initial trigger for RdDM in other situations as well, notably immediately after TE insertion, with the Pol IV/Pol V loop serving only in a second step, after Pol II activity has ceased or has been altered by the process of heterochromatin formation. In agreement with this scenario, transcription by Pol II appears to be necessary for Pol IV- and Pol V-mediated silencing of intergenic low-copy-number loci (Zheng et al., 2009). Finally, a direct role for Pol II in RdDM would parallel the situation in fission yeast where Pol II is involved in the RNAi-dependent deposition of the heterochromatic mark H3K9me2 at pericentromeric repeats (Grewal and Jia, 2007; Kloc and Martienssen, 2008).

DNA methylation maintenance mechanisms

The first eukaryotic DNA MTase to be identified was mouse Dnmt1, which is considered a maintenance methyltransferase because of its higher affinity in vitro for hemi-methylated than for unmethylated CGs (Goll and Bestor, 2005). Indeed, this feature of Dnmt1 provided the first experimental evidence that once established, DNA methylation could be propagated during each cell cycle by the recognition of the hemi-methylated nature of newly replicated DNA, as originally proposed (Holliday and Pugh, 1975; Riggs, 1975). Arabidopsis encodes several homologs of Dnmt1, among which DNA METHYLTRANSFERASE 1 (MET1) is responsible for maintaining most CG methylation (Figure 1b; Henderson and Jacobsen, 2007). Numerous additional proteins are required for the maintenance of CG methylation (Law and Jacobsen, 2010), including the SRA- and RING-domain-containing protein VARIANT IN METHYLATION 1 in Arabidopsis and its homolog Uhrf1 (also known as NP95) in mammals, which are thought to recruit MET1/Dnmt1 to hemi-methylated CG sites (Bostick et al., 2007; Sharif et al., 2007; Woo et al., 2007, 2008).

Maintenance of CHG methylation is mostly effected by CHROMOMETHYLASE 3 (CMT3), a plant-specific DNA MTase (Henikoff and Comai, 1998; Lindroth et al., 2001; Cokus et al., 2008) and requires in addition SUVH4 (also known as kryptonite, KYP), the main histone methyltransferase involved in histone H3K9 dimethylation (Jackson et al., 2002; Malagnac et al., 2002). A reinforcing loop between these modifications is suggested by the fact that the chromodomain of CMT3 and the SRA domain of SUVH4 bind H3K9me2 and methylated CHG sites, respectively (Figure 1b; Lindroth et al., 2004; Johnson et al., 2007). Other histone methyltransferases, including SUVH5 and SUVH6, may be involved in similar reinforcing loops (Ebbs et al., 2005; Ebbs and Bender, 2006). To date it is not known whether the hemi-methylated status of CHG sites after passage of the replication fork serves as an additional cue for maintaining methylation at these sites.

In the case of CHH methylation, template-based maintenance can be excluded because of the asymmetry in the CHH sequence. Instead, perpetuation of CHH methylation is effected mainly by RdDM (Figure 1b), with the DNA MTase DRM2 (and, in some instances, CMT3) ensuring its re-establishment after each round of replication (Cao and Jacobsen, 2002a). Consistent with this, small RNA deep-sequencing data indicate that a large fraction of methylated repeat elements are characterized by an abundance of matching 24-nt siRNAs throughout development (Kasschau et al., 2007; Lister et al., 2008; Mosher et al., 2009; Slotkin et al., 2009). However, in contrast to the maintenance activity of MET1, which typically results in over 80% methylation at any given CG site (Cokus et al., 2008; Lister et al., 2008), the action of RdDM at CHH sites rarely leads to more than 40% methylation. Furthermore, CHH methylation is not completely abolished in drm1drm2cmt3 triple mutants (Cokus et al., 2008; Lister et al., 2008). This and other observations suggest that some CHH methylation is in fact laid down by MET1 or additional DNA MTases, presumably in an RNAi-independent manner (Henderson et al., 2006; Cokus et al., 2008; Lister et al., 2008; Teixeira et al., 2009).
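The asymmetry argument can be checked by enumeration. The following Python sketch (helper names are hypothetical) asks, for every concrete CG, CHG and CHH site, whether the opposite strand carries a cytosine paired with the last base of the site. Only such a mirrored cytosine could yield a hemi-methylated site after replication and so provide a template for maintenance.

```python
from itertools import product

H = "ACT"  # IUPAC code H: any base except G
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def sites(context):
    """List every concrete sequence matching a context (CG, CHG or CHH)."""
    pools = {"CG": ("C", "G"), "CHG": ("C", H, "G"), "CHH": ("C", H, H)}
    return ["".join(p) for p in product(*pools[context])]


def has_mirrored_cytosine(site):
    """True if the opposite strand has a C paired with the site's last base."""
    return reverse_complement(site)[0] == "C"


for context in ("CG", "CHG", "CHH"):
    members = sites(context)
    mirrored = sum(has_mirrored_cytosine(s) for s in members)
    print(context, len(members), "sites,", mirrored, "with a mirrored cytosine")
```

Every CG and CHG site ends in G and therefore has a mirrored cytosine, whereas none of the nine CHH sites does, which is why CHH methylation has to be re-established after each round of replication rather than copied from the parental strand.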

In addition to the factors listed above, the ATPase SWI2/SNF2 chromatin remodeler DECREASE IN DNA METHYLATION 1 (DDM1) has an essential role in the maintenance of high DNA methylation levels over repeat elements (Vongs et al., 1993; Jeddeloh et al., 1999; Lippman et al., 2004; Teixeira et al., 2009). Although the mechanism of action of DDM1 remains mysterious, studies carried out on the mammalian homolog lymphoid-specific helicase (LSH) suggest that it may serve as a recruiting factor for DNA MTases and histone deacetylases at target loci (Myant and Stancheva, 2008).

Active DNA demethylation

In the absence of maintenance and de novo DNA methylation activity, DNA methylation is progressively lost through replication, in a process called passive demethylation. Active demethylation on the other hand refers to enzymatic mechanism(s) that ultimately lead to the replacement of methylcytosines by cytosines in DNA (Zhu, 2009). Although active demethylation was first documented in mammals, the enzymes and mechanisms involved are only beginning to be elucidated in those species (Bhutani et al., 2009; Gehring et al., 2009b; Popp et al., 2010). By contrast, much more is known about this process in Arabidopsis, which possesses four methylcytosine DNA glycosylases that mediate active DNA demethylation through a DNA base excision repair process (Zhu, 2009; Gehring et al., 2009b). These are REPRESSOR OF SILENCING 1 (ROS1), DEMETER (DME), DEMETER-LIKE 2 (DML2) and DML3. Efficiency of methylcytosine excision in vitro differs between these four proteins and in relation to sequence context (Gehring et al., 2006; Morales-Ruiz et al., 2006; Penterman et al., 2007b). Moreover, DME is expressed in the central cell during the later stages of female gametogenesis (Choi et al., 2002) and is essential for the extensive maternal-specific DNA demethylation of repeat elements that may underlie genomic imprinting in the endosperm (Hsieh et al., 2009; Gehring et al., 2009a). In contrast, the three other DNA demethylase genes are ubiquitously expressed and their protein products seem to be targeted to a limited number of loci (Penterman et al., 2007b; Lister et al., 2008). Although the targeting mechanism is unknown, genetic evidence suggests that small RNAs are involved (Mosher et al., 2008; Zheng et al., 2008; Hsieh et al., 2009).
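The passive route mentioned at the start of this section follows from semi-conservative replication. Each round doubles the number of DNA strands while adding no methylated ones, so the methylated fraction halves. A minimal Python sketch, assuming a complete absence of maintenance and de novo activity (the function name is hypothetical):

```python
def methylated_strand_fraction(replications, start=1.0):
    """Fraction of strands still methylated at a site after n replications.

    Assumes no maintenance and no de novo methylation: parental strands
    keep their methyl groups, newly synthesized strands carry none.
    """
    if replications < 0:
        raise ValueError("replications must be non-negative")
    return start * 0.5 ** replications


for n in range(4):
    print(n, methylated_strand_fraction(n))
```

Under these assumptions a fully methylated site retains methylation on one strand in eight after three replications. Active demethylation, by contrast, does not depend on DNA replication.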

Interplay between DNA methylation and demethylation mechanisms

As expected from their distinct roles, the RdDM and maintenance DNA methylation pathways are, to a large extent, genetically independent. Thus, CG methylation is nearly unchanged in RdDM mutants, and significant amounts of CHH methylation and 24-nt siRNAs persist in ddm1 and met1 mutant plants (Jacobsen et al., 2000; Mathieu et al., 2007; Lister et al., 2008; Blevins et al., 2009; Slotkin et al., 2009; Teixeira et al., 2009). However, both decreases and increases in siRNA abundance are observed in ddm1 and met1, presumably as a result of the widespread transcriptional reactivation of TEs which occurs in these two mutant backgrounds (Lippman et al., 2004; Zhang et al., 2006; Zilberman et al., 2007; Lister et al., 2008). The massive accumulation of 21-nt small RNAs matching ATHILA sequences is also particularly striking (Lister et al., 2008; Slotkin et al., 2009; Teixeira et al., 2009) and likely reflects the mounting of a strong post-transcriptional silencing response against this family of retroelements.

Unexpectedly, genetic evidence indicates a clear relationship between the DNA demethylation pathway and RdDM. Thus, the local DNA hypermethylation observed in ros1dml2dml3 is frequently associated with overaccumulation of matching siRNAs (Lister et al., 2008). Moreover, many second-site suppressors of ros1-induced hypermethylation and silencing are components of the RdDM machinery (Zheng et al., 2007; Penterman et al., 2007a; He et al., 2009a, 2009b). These observations imply that at some loci, DNA methylation patterns result from the opposing action of RdDM and DNA demethylation. The identification of the small RNA-binding protein ROS3 as a component of the DNA demethylation machinery (Zheng et al., 2008) should help to define the number of loci targeted by the two pathways, as well as the factors involved in the balance between methylation and demethylation at each locus.

Transgenerational inheritance of DNA methylation patterns

In mammals, DNA methylation patterns are reset genome wide at least twice, in the germ cells and the embryo. Resetting is best documented in the embryo, where it is thought to involve a massive wave of both active and passive DNA demethylation immediately after fertilization, followed by rapid de novo DNA methylation (Reik, 2007). In plants, there is no evidence for a similar widespread resetting of DNA methylation in germ cells or embryos. For instance, Arabidopsis mutants defective in RdDM or active DNA demethylation still exhibit near-normal CG methylation within and across generations (Tran et al., 2005; Zhang et al., 2006; Penterman et al., 2007b; Cokus et al., 2008; Lister et al., 2008). Conversely, met1- or ddm1-induced hypomethylation can be stably inherited at numerous loci throughout the genome for at least eight generations after outcrossing of the mutant alleles (Johannes et al., 2009; Reinders et al., 2009). Moreover, plants seem more prone to the inheritance of DNA methylation defects than mammals (Richards, 2006; Whitelaw and Whitelaw, 2008). Collectively, these observations indicate that DNA methylation patterns tend to be propagated across generations in plants, rather than re-established anew at each generation as in mammals.

Nonetheless, DNA methylation and silencing can be restored in an RNAi-dependent manner over a subset of repeat sequences following ddm1-induced hypomethylation (Johannes et al., 2009; Teixeira et al., 2009). Typically, restoration takes place over several generations (Teixeira et al., 2009), which is consistent with the frequent progressivity of TE inactivation in maize and transgene silencing in many plant species (Chandler and Stam, 2004; Slotkin and Martienssen, 2007). Furthermore, because little restoration of DNA methylation and silencing can be detected during vegetative growth (FKT and VC, unpublished data), this activity seems mainly restricted to the reproductive phase of the Arabidopsis life cycle. This is in agreement with a series of observations pointing to an essential role of RNAi in reinforcing silencing of TEs in germ cells and the embryo (Mosher and Melnyk, 2010). Specifically, in the vegetative nucleus of the pollen grain, where the level of DDM1 protein is particularly low, TEs are reactivated, producing a pattern of siRNA accumulation similar to that seen in ddm1 (Slotkin et al., 2009). These siRNAs are then transported by still unknown mechanisms to the two sperm cells (Slotkin et al., 2009) where they presumably participate in either RdDM or post-transcriptional silencing depending on their size (Schoft et al., 2009; Slotkin et al., 2009). Like other transcripts produced in pollen (Bayer et al., 2009), some or all of these siRNAs could also be carried over to the zygote and the endosperm, where they would exert functions similar to those postulated in sperm cells (Figure 2a). On the female side, siRNAs accumulate in the central cell and endosperm (Mosher et al., 2009), possibly as a result of active and widespread demethylation of the maternal genome (Hsieh et al., 2009; Gehring et al., 2009a). 
In a process analogous to that hypothesized for pollen, some of these siRNAs would be transported into the zygote and developing embryo to reinforce DNA methylation and silencing of matching TEs (Figure 2a; Hsieh et al., 2009). Transfer of epigenetic information from the endosperm into the zygote has in fact already been proposed as a possible explanation for why de novo DNA methylation of newly integrated transgenes requires fertilization (Chan et al., 2006b). This requirement for fertilization is also supported by transcriptome data indicating that many components of RdDM are most highly expressed in developing seeds (Figure 2b; http://jsp.weigelworld.org/expviz/expviz.jsp). Collectively, these findings suggest that at most targets, RdDM can only enforce limited de novo DNA methylation (and silencing) during each reproductive cycle. The level of epigenetic control achieved in the embryo as a result of this round of RdDM would then be stably maintained throughout vegetative growth, mainly by other pathways, and would serve as a starting point for additional DNA methylation and silencing by RdDM during the next reproductive stage. This process would repeat itself until maximal levels of DNA methylation and silencing have been attained, at which stage RdDM should become largely superfluous (Figure 2c).

Figure 2

A model for the progressive acquisition/restoration of DNA methylation over multiple generations. (a) siRNAs are produced from the vegetative nucleus of the pollen grain and are transported (green arrows) to the two sperm cells. They also accumulate in the central cell and endosperm and would likewise be transported to the zygote and embryo to instruct de novo DNA methylation. (b) Many components of RdDM are most highly expressed during seed development, including DRM2 (At5g14620), RDR2 (At4g11130), AGO4 (At2g27040), NRPD1 (At1g63020), NRPE1 (At2g40030), DRD1 (At2g16390), and DMS3 (At3g49250). (c) RdDM is largely dispensable for the maintenance of DNA methylation at most repeat elements. However, RdDM is critical for restoring wild-type DNA methylation after severe loss. This restoration is progressive across several generations because RdDM can only enforce limited de novo DNA methylation during each reproductive cycle. Newly reached DNA methylation levels would be maintained during vegetative growth mainly by other pathways, and would provide a starting point for further DNA methylation by RdDM during the next reproductive phase. This process would also apply for the progressive methylation and silencing of newly inserted TEs.
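The stepwise logic of this model can be made concrete with a toy calculation. The Python sketch below is purely illustrative. It assumes that each reproductive cycle lets RdDM methylate a fixed fraction (`gain_per_cycle`) of the sites that are still unmethylated and that the level reached is maintained without loss during vegetative growth. The function name, the parameter values and the saturating update rule are all assumptions; none is a measured quantity.

```python
def restoration_trajectory(start, gain_per_cycle, target, max_generations=50):
    """Methylation level of a locus at each generation until target is reached.

    start          -- initial methylation level (0 to 1), e.g. after ddm1
    gain_per_cycle -- fraction of remaining unmethylated sites that RdDM
                      methylates in one reproductive cycle (hypothetical)
    target         -- level regarded as fully restored
    """
    levels = [start]
    while levels[-1] < target and len(levels) <= max_generations:
        current = levels[-1]
        # Reproductive phase: limited de novo methylation by RdDM.
        # Vegetative phase: the new level is simply maintained.
        levels.append(current + gain_per_cycle * (1.0 - current))
    return levels


weak = restoration_trajectory(start=0.1, gain_per_cycle=0.5, target=0.9)
strong = restoration_trajectory(start=0.1, gain_per_cycle=0.9, target=0.9)
print(len(weak) - 1, "generations with a weak RdDM target")
print(len(strong) - 1, "generation(s) with a strong RdDM target")
```

With these arbitrary values the weakly targeted locus needs four generations to pass the threshold and the strongly targeted one a single generation, which mirrors the locus-to-locus variation in restoration speed that the model is meant to explain.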

A role for small RNAs in instructing the silencing of TEs specifically in the germ-line has also been documented in Drosophila and mammals (Aravin and Bourc'his, 2008; Malone and Hannon, 2009). In at least one case, in Drosophila, progressive effects over successive generations have been reported (Josse et al., 2007), providing interesting parallels with the scenario presented above.

As mentioned earlier, DNA methylation of repeat elements is mostly constant among Arabidopsis accessions (Vaughn et al., 2007; Zhai et al., 2008; Zhang et al., 2008), yet some highly methylated repeat elements seem to produce hardly any siRNAs (Kasschau et al., 2007; Lister et al., 2008; Teixeira et al., 2009). This raises the question of how such repeat elements maintain DNA methylation over evolutionary time. Maintenance could occur through some unknown process. Alternatively, persistence of DNA methylation may be ensured by constant, low-efficiency RdDM, or by stochastic, high-efficiency RdDM triggered by transcriptional reactivation. Under the latter hypothesis, TE relics and repeat elements that have lost their capacity to be transcribed should ultimately lose DNA methylation over time, in agreement with the observation that old TEs tend to be unmethylated in the Arabidopsis genome (Hollister and Gaut, 2009).

Determinants of RdDM strength

Full restoration of DNA methylation by RdDM after ddm1-induced hypomethylation typically requires two to five generations, depending on the target locus (Teixeira et al., 2009). Such variation in RdDM strength between target loci may result from variable degrees of competition between RdDM and DNA demethylation (see above), or differences in sequence composition (Matzke et al., 2004; Huettel et al., 2007), chromatin/DNA methylation states and abundance of matching siRNAs. Evidence that these last two factors have particularly important roles comes mainly from studies carried out with the two Arabidopsis genes FLOWERING WAGENINGEN A (FWA) and SUPPRESSOR OF drm1 drm2 cmt3 (SDC).

The gene FWA is methylated and transcriptionally silent throughout the plant life cycle, except in the endosperm, where the two copies of the maternal allele are demethylated and expressed (Kinoshita et al., 2007). Methylation is restricted to the promoter and 5′ untranslated region of FWA, which are composed of tandem repeats generating siRNAs (Lippman et al., 2004; Kinoshita et al., 2007). In addition, transformation experiments have shown that newly integrated FWA transgenes become de novo methylated and silenced through RdDM (Cao and Jacobsen, 2002b; Chan et al., 2004, 2006b). However, when FWA transgenes are introduced by transformation into plants defective in RdDM, they adopt an unmethylated and active state that prevents them from being targeted by RdDM in the progeny of crosses with wild type (Chan et al., 2004). Similarly, endogenous FWA, which becomes hypomethylated and reactivated to varying degrees in ddm1, can efficiently regain full methylation and silencing after outcrossing of ddm1, but only when ddm1-induced hypomethylation and reactivation are moderate (Kakutani, 1997; Johannes et al., 2009). When ddm1 effects are more severe, which occurs sporadically in advanced lines, stable fwa epialleles are produced and these have never been found to revert spontaneously (Kakutani, 1997; Johannes et al., 2009). However, siRNAs matching the FWA tandem repeats are still detected in plants with fwa epialleles, albeit at lower levels compared to wild type (Lippman et al., 2004; Chan et al., 2006b; Lister et al., 2008). Collectively, these findings suggest that impairment of RdDM at FWA involves an additional barrier that is not strictly coupled to siRNA production and which likely resides in chromatin itself (Chan et al., 2006b). Thus, it is tempting to speculate that strong Pol II activity over FWA directly interferes with RdDM, with the possible help of DNA demethylases.
Nonetheless, fwa epialleles can be readily reverted to FWA by forcing RdDM targeting through the production of large amounts of siRNAs from transgenes (Kinoshita et al., 2007). Moreover, establishment of silencing and DNA methylation over a naïve FWA transgene is greatly aided when the endogenous FWA locus is itself silent and methylated (Chan et al., 2006b). These results are in keeping with other findings indicating that production or accumulation of siRNAs in trans can contribute in determining the strength of RdDM (Matzke et al., 2004; Huettel et al., 2007).

The gene SDC contains a set of methylated tandem repeats within its promoter that have similar base composition to the FWA tandem repeats (Henderson and Jacobsen, 2008). Unlike these, however, the SDC tandem repeats are mainly targeted by RdDM, with little additional contribution from the MET1/DDM1 DNA methylation maintenance pathway. Perhaps as a consequence, DNA methylation and silencing of SDC are immediately restored in the F1 progeny of drm1drm2cmt3 mutant plants crossed with wild type (Henderson and Jacobsen, 2008). This differs from the situation observed for repeat elements that are mainly targeted by the MET1/DDM1-dependent pathway, as these do not detectably regain DNA methylation before the F2 in crosses of ddm1 with wild type (Teixeira et al., 2009). Furthermore, indirect evidence suggests that transformation of drm1drm2cmt3 mutant plants with DRM2 or CMT3 transgenes can result in immediate restoration of SDC silencing (Chan et al., 2006a). Thus, the presence of a silent and methylated SDC allele, such as that provided through crosses with wild type, may not be necessary for the efficient restoration of SDC silencing by the RdDM pathway.

Variation in RdDM strength and paramutation

The term paramutation was first introduced in the 1950s after observations in maize that specific alleles can interact when brought together in a cross, to produce a meiotically heritable change in expression of one of the alleles (Chandler and Stam, 2004). Although this process has long remained mysterious, the recent discovery that siRNA biogenesis is involved is providing important insights (Chandler and Alleman, 2008; Hollick, 2009; Arteaga-Vazquez and Chandler, 2010) and several observations suggest that paramutation is in fact an extreme manifestation of RdDM or a related RNAi-dependent chromatin targeting pathway. The best characterized and most striking case of paramutation concerns a single allele of the maize b1 locus, which encodes a transcription factor involved in anthocyanin biosynthesis. This allele exists in two epiallelic forms, B-I and B′, which are strongly and weakly active, respectively. Difference in expression is associated with differential DNA methylation and chromatin accessibility of a set of seven tandem repeats that act as enhancers and that are located 100 kb upstream of the b1 transcription start site (Stam et al., 2002; Louwers et al., 2009). These seven tandem repeats are responsible for mediating in F1 heterozygotes the transfer of the expression state of paramutagenic B′ to paramutable B-I with 100% penetrance and in an RNAi-dependent manner. The paramutated form of B-I, designated B′*, is stably inherited independently of B′ and is itself paramutagenic, which makes it indistinguishable from B′ (Chandler and Stam, 2004). Moreover, whereas B-I reverts spontaneously to B′ at high frequency (1–10%) and can thus be considered a metastable epiallele, B′ is highly stable and only becomes more active in RNAi mutant backgrounds (Dorweiler et al., 2000; Alleman et al., 2006; Woodhouse et al., 2006). 
These findings suggest that the tandem repeats of B-I and B′ are, respectively, just below and well above the threshold required for RNAi-dependent chromatin targeting, presumably as a consequence of differences in chromatin states (Stam et al., 2002; Louwers et al., 2009). Although the available evidence seems to indicate otherwise (Arteaga-Vazquez and Chandler, 2010), it is also reasonable to assume that the tandem repeats of B′ and B-I produce distinct amounts of siRNAs, at least during the reproductive phase, when the RNAi-dependent chromatin targeting pathway could be most active, as in Arabidopsis (Figure 2). Efficient targeting of the B-I repeats in F1 progeny would therefore result from a combination of two factors: the provision in trans of high amounts of siRNAs by the B′ repeats and a particularly responsive chromatin state at the B-I repeats. In this scenario, paramutation of B-I to B′* would stabilize the weakly active state by way of a self-reinforcing loop between chromatin and RNAi-dependent targeting, as suggested for FWA (Chan et al., 2006b). However, repeats at B′/B-I more closely resemble those at SDC than those at FWA in constantly requiring the RNAi-dependent pathway for maintenance of the chromatin state associated with weak or no expression. Thus, the variable degree of RNAi-dependent chromatin targeting, as well as the interplay between this process and RNAi-independent processes, may dictate the stability and paramutagenic or paramutable properties of epialleles.

Concluding remarks

Although the mechanisms involved in the trans-generational heritability of DNA methylation affecting repeat elements are beginning to be elucidated in plants, many questions remain in addition to those already mentioned in this review. For instance, despite the clear involvement of RdDM in the restoration of wild-type methylation of repeat elements after methylation loss and in the de novo methylation of transgenes, the contribution of RdDM to the establishment of epigenetic control over newly inserted TEs is still poorly documented. This is an important consideration, as chromatin states are likely to differ substantially between TEs or transgenes that are newly inserted and TEs that have become transcriptionally reactivated. Furthermore, plant genomes often contain a few transcriptionally active elements. How such elements can remain stably active for many generations despite the presence of related or identical silent copies is unknown. Other outstanding questions concern the exact molecular nature of the progressive DNA methylation and silencing established by RdDM during the sexual phase of the Arabidopsis life cycle and the reason for the apparent absence of further DNA methylation and silencing during vegetative growth, despite the presence of siRNAs. Thanks to the power of Arabidopsis genetics and genomics and to the rapid development of biochemical approaches for the study of RdDM, we can expect answers to such questions in the near future.