DNA methylation is among the best studied epigenetic modifications and is essential to mammalian development. Although the methylation status of most CpG dinucleotides in the genome is stably propagated through mitosis, improvements to methods for measuring methylation have identified numerous regions in which it is dynamically regulated. In this Review, we discuss key concepts in the function of DNA methylation in mammals, stemming from more than two decades of research, including many recent studies that have elucidated when and where DNA methylation has a regulatory role in the genome. We include insights from early development, embryonic stem cells and adult lineages, particularly haematopoiesis, to highlight the general features of this modification as it participates in both global and localized epigenetic regulation.
DNA methylation is essential for mammalian development and has notable roles in gene silencing, protection against spurious repetitive element activity, genomic stability during mitosis and parent-of-origin imprinting.
DNA methylation functions to coordinately silence some promoter and enhancer classes that are typified by low overall CpG density. By contrast, many CpG islands, particularly those at promoters, remain generally and constitutively protected from DNA methylation through numerous opposing cis- and trans-based mechanisms.
Embryonic stem cells (ESCs) remain one of the most highly studied in vitro models for dissecting epigenetic mechanisms during cellular differentiation. They are unique in their ability to retain the molecular characteristics of their cellular state in the absence of DNA methylation; without it, however, they cannot differentiate into embryonic lineages and they simultaneously gain extra-embryonic potential.
ESCs are also unique in their heterogeneous ability to maintain repetitive element silencing through DNA methyltransferase 1 (DNMT1) maintenance activity alone and rely on the DNMT3 enzymes to re-silence these elements continuously de novo. The pre-implantation stage of development is one of the few stages at which endogenous transposable elements are actively regulated.
Promoters of germline-specific genes are coordinately repressed by DNMT3B in embryonic lineages and are reactivated during the global demethylation that accompanies primordial germ cell (PGC) specification.
Haematopoiesis has served as a benchmark model for the function of DNA methylation in adult stem cells and in later lineage specification. Several notable regulatory principles for DNA methylation were originally or comprehensively described in this system, including the gating of lymphoid versus myeloid fates, the dynamics of intragenic CpG island methylation in transcript selection and demethylation as a lineage stabilizer by providing a mitotically heritable memory of transcription factor binding.
DNA methylation is globally erased during specification of the germ line and during fertilization, at which point the paternal genome is specifically demethylated. These two key events have numerous related cascades of chromatin re-organization that suggest similar epigenetic mechanisms for erasure and re-establishment of global nuclear organization, although the relationship between these events remains incompletely defined.
Although they have remained mysterious for more than two decades, the targets and mechanisms for DNA methylation erasure in the paternal genome as well as for protection of the maternal genome during fertilization are becoming clearer. For example, there are distinct temporal windows for global hydroxymethylation followed by erasure to unmodified cytosines occurring later in zygotic progression. The epigenetic determinants that retain silencing at repetitive element classes and imprints are less clear, but they are likely to be discovered as the dynamics and relationships of other epigenetic regulators are refined.
Development from a zygote into a complex, multicellular adult organism requires an array of shared and specific cellular processes. The regulation of gene expression is primarily encoded in cis and is directed by transcription factors. However, genes are also regulated by heritable, covalent modifications to DNA and histones that often help to shape developmental decisions. How these modifications are interpreted and inherited and how they influence genomic output before, during or after a cell-fate transition are fundamental questions to our understanding of development.
Methylation of the fifth position of cytosine is one of the best studied and most mechanistically understood epigenetic modifications and is well conserved among most plant, animal and fungal models1. In mammals, cytosine methylation is primarily restricted to the symmetrical CpG context2,3. Three conserved enzymes, DNA methyltransferase 1 (DNMT1), DNMT3A and DNMT3B, are responsible for its deposition and maintenance and are essential for normal development4,5. Mammalian genomes are globally CpG-depleted and, of the roughly 28 million CpGs in the human genome, 60–80% are generally methylated. Less than 10% of CpGs occur in CG-dense regions that are termed CpG islands; these are prevalent at transcription start sites of housekeeping and developmental regulator genes6, where they are largely resistant to DNA methylation. This bimodal landscape represents the global perspective on our understanding of DNA methylation. In this model, most bulk genomic methylation patterns are static across tissues and throughout life, changing only in localized contexts as specific cellular processes are activated or shut down. The notable exceptions are in the germ line and during pre-implantation development, where rapid demethylation of the paternal genome at fertilization is followed by a depletion in both parental genomes over early embryonic progression. How DNA methylation is globally and locally modulated, and for what purpose, remain compelling questions, as only a few robust and general rules have been formulated.
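The contrast between a globally CpG-depleted genome and CG-dense islands can be made concrete with a short calculation. The following minimal sketch computes GC content and the observed-to-expected CpG ratio of a sequence. The thresholds used (length of at least 200 bp, GC content above 50% and a ratio above 0.6) are commonly used sequence-based criteria for CpG islands and are an assumption here, not definitions taken from this Review; the function names are illustrative only.

```python
def cpg_stats(seq):
    """Return (gc_content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact number.
    cpg = seq.count("CG")
    gc_content = (c + g) / n
    # Expected CpG count if C and G were placed independently: (C * G) / N.
    expected = (c * g) / n
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply assumed sequence-only thresholds; not a definition from the text."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and obs_exp > min_obs_exp
```

A sequence-only test of this kind says nothing about methylation status; it only flags regions whose CpG density is high relative to their base composition.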
Since its original postulation as an epigenetic regulator7,8, numerous assays have been developed to study cytosine methylation, including methylation-sensitive restriction enzyme mapping, deamination of unmethylated cytosines with sodium bisulphite or enrichment with targeting antibodies9. High-throughput sequencing now enables complete methylomes to be elucidated, and methylation has been mapped at base-pair resolution across development from zygote to terminally differentiated adult cells10. Novel transgenic systems have confirmed functions for DNA methylation in multiple different lineages and have identified developmental windows in which DNA methylation is essential.
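The logic behind bisulphite-based measurement can be sketched in a few lines: unmethylated cytosines are read as thymine after conversion, whereas methylated cytosines are still read as cytosine, so comparing a converted read with the reference reveals the methylation status of each CpG. The toy example below considers a single strand and assumes complete conversion and a perfectly aligned, error-free read of the same length as the reference. The function names and the representation of methylated positions as a set of indices are assumptions made for illustration.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate conversion of one strand: unmethylated C is read as T."""
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_cpg_methylation(reference, converted_read):
    """Return {position: True/False} for each CpG cytosine in the reference."""
    reference = reference.upper()
    converted_read = converted_read.upper()
    if len(reference) != len(converted_read):
        raise ValueError("read and reference must have the same length")
    calls = {}
    for i in range(len(reference) - 1):
        if reference[i:i + 2] == "CG":
            base = converted_read[i]
            if base == "C":
                calls[i] = True    # protected from conversion: methylated
            elif base == "T":
                calls[i] = False   # converted: unmethylated
            # any other base is uninformative and is skipped
    return calls
```

Real pipelines must additionally handle both strands, incomplete conversion and sequencing errors, none of which is modelled here.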
In this Review, we first discuss genetic features that are sensitive to regulation through DNA methylation and highlight links to other epigenetic modifiers. We then explore, in greater detail, specific examples in which DNA methylation is dynamic or essential for developmental transitions, using embryonic stem cells (ESCs) and the haematopoietic system to demonstrate rules that may extend to other lineages. These comparatively stable transitions are then contrasted against two developmental periods, primordial germ cell (PGC) specification and the early embryo, in which DNA methylation levels are globally reset. By comparing what is currently known of local and global changes, we hope to clarify the governing principles of DNA methylation in mammalian development and set these roles in the context of the larger canon of epigenetic regulation.
Regulatory targets of DNA methylation
During mitosis, DNMT1 faithfully propagates symmetrically methylated CpGs through recognition of the nascent strand opposite a previously methylated position (Box 1). Although maintenance ensures epigenetic inheritance at established positions, there are many instances in which methylation must be specifically targeted and others in which methylation must be inhibited or removed. Before exploring dynamic regulation within selected cellular systems, we provide a breakdown of DNA methylation as it acts on promoters, repetitive elements and parent-specific imprints.
Maintaining unmethylated promoters. Most CpGs in mammalian genomes remain methylated during development, but CpG islands found at promoters of many housekeeping or developmentally regulated genes are constitutively hypomethylated. Nearly half of the unmethylated CpG islands identified in mammalian genomes do not occur at annotated promoters, although many of these so-called 'orphan' CpG islands show similar epigenetic features. However, those CpG islands that occur at intragenic regions are more frequently methylated during development and may contribute more nuanced regulatory functions11.
At promoter CpG islands, maintaining a hypomethylated state requires that DNA methyltransferases be actively and continuously excluded (Fig. 1a). Classical studies established that the unmethylated state of promoter CpG islands is strongly influenced by transcription factor binding; CpG islands can progressively accrue heritable methylation if they are truncated or depleted of known transcription factor binding sites12,13. Moreover, transfer of an SP1 binding site into an endogenously methylated locus induced appreciable local demethylation, confirming the dominance of transcription factor binding over DNA methylation in this context12. Short, cis-acting sequences are often sufficient to recapitulate an in vivo unmethylated state, even without measurable transcription14. The genome-wide analysis of transcription factor binding and DNA methylation confirms that these examples highlight a regulatory principle that extends to enhancer elements, in which CpG density is often not appreciably higher than the genomic average15.
Histone modifications and variants at CpG islands are also important (Fig. 1a). Unmethylated, CpG-rich regions are bound by CXXC finger protein 1 (CFP1; also known as CXXC1), which recruits histone H3 lysine 4 (H3K4) methyltransferases and is sufficient to maintain ectopic, CG-rich transgenes that lack promoter features or transcription in an unmethylated state16. However, Cfp1-knockout cells lose local H3K4 trimethylation (H3K4me3) without changes in expression or promoter DNA methylation17; this could reflect a difference between de novo assembly of euchromatin and the retention of redundant protective mechanisms at already active loci. In addition, binding of the MLL family H3K4 methyltransferases protects promoters of developmental genes from DNA methylation, and this is also likely to be instructed through their CXXC domains18. Mechanistically, the DNMT3 enzymes each contain an ATRX–DNMT3–DNMT3L (ADD) domain that recognizes unmodified H3 and is allosterically inhibited by H3K4 methylation19,20. For activating and repressive mechanisms that act independently of DNA methylation, some other epigenetic modifications may assist in preventing methylation through mutual antagonism. Like H3K4 methylation, the histone variant H2A.Z is strongly enriched at unmethylated, active promoters21. When DNA methylation is chemically inhibited by azacytidine, H2A.Z-containing nucleosomes encroach into adjacent regions22. This supports a model in which epigenetic modifications and histone variants associated with transcription (such as H2A.Z) may be limited to their functional targets by surrounding DNA methylation21,22.
CpG island promoters are not usually repressed by DNA methylation and are instead silenced by H3K27 methylation, which may also protect them from spurious DNA methylation23,24. When DNMTs are inhibited, H3K27 methylation spreads from CpG-rich loci into peripheral regions that are normally DNA methylated; this spreading of H3K27 methylation might compensate for the loss of global silencing23,25. Large-scale screening of cancer or age-related disease cohorts revealed a general trend for regions that are normally regulated by H3K27 methylation to be frequently hypermethylated, suggesting that transcriptionally repressed CpG-island-containing genes may be less stably protected from DNA methylation26. Although numerous mechanisms maintain the asymmetric distribution of methylation, additional proofreading enzymes, such as the TET dioxygenases, cytosine deaminases and base excision repair (BER) enzymes, are also tightly coupled to regulatory complexes that are associated with promoters or enhancers and may prevent infrequent aberrations that could otherwise trigger hypermethylation (Box 1).
In addition to epigenetic inhibition, many housekeeping gene promoters have an asymmetric GC distribution downstream of their transcriptional start sites27. This GC skew may inhibit de novo methyltransferase recruitment by forming a guanine-rich, single-stranded DNA loop (known as an R-loop), through complementary base pairing between nascent RNA and the template strand27. Inversion of this GC asymmetry negates RNA–DNA helix formation and induces de novo methylation27. At both the structural and sequence levels, CpG islands are normally protected through their intrinsic recruitment of antagonistic euchromatin or opposing facultative heterochromatin.
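Strand asymmetry of this kind is often summarized as a GC skew value, (G − C)/(G + C), computed in windows along one strand. The sketch below assumes that convention; the window and step sizes are arbitrary illustrative defaults and the function names are hypothetical, not taken from the studies cited above.

```python
def gc_skew(seq):
    """(G - C) / (G + C) for one strand; 0.0 if the window has no G or C."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_gc_skew(seq, window=100, step=50):
    """Return [(start, skew), ...] for windows sliding along the sequence."""
    last_start = max(len(seq) - window, 0)
    return [
        (start, gc_skew(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]
```

Positive values indicate an excess of G over C on the strand analysed; reverse-complementing the sequence flips the sign, which is the in silico counterpart of the inversion experiment described above.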
De novo methylation at repressed promoters. Although many CpG island promoters remain unmethylated, some repressed promoters, particularly those of lower CpG density, acquire DNA methylation during development. Specific examples are discussed in the following sections, but many general events are associated with promoter silencing. The low affinity and catalytic activity of DNMT1 at unmethylated DNA limit its de novo methyltransferase activity28,29. However, both DNMT3A and DNMT3B target promoters in complex with other epigenetic repressors, including histone deacetylases (HDACs) and methyltransferases associated with repressive H3K9 methylation. Transcription factors such as retinoic-acid-response nuclear receptors in the context of ESC differentiation30 often participate in proper targeting. Cooperative heterochromatin assembly at promoters is mediated through interactions between epigenetic silencers and the nucleosome remodeller lymphoid-specific helicase (LSH; also known as HELLS), which may provide nucleosomes in previously depleted regions to act as a template for epigenetic silencing31,32,33. LSH is co-recruited with the H3K9 dimethyltransferase G9A (also known as EHMT2) in complex with DNMT3A or DNMT3B (Refs 34,35). The catalytic activity of G9A stabilizes and accelerates repression, but its binding activity is often sufficient for DNA methylation35,36. The requirement of both H3K9 methylation and DNA methylation for complete, stable promoter silencing suggests a model in which H3K9 methylation initiates heterochromatin formation and DNA methylation ensures long-term silencing37 (Fig. 1b).
Pericentromeric repeats. Most of the sequence in mammalian genomes is non-coding but includes many features with latent transcriptional potential, including pericentromeric repeats (which instruct centromeric assembly) and parasitic repetitive elements (discussed below). DNA methylation has important roles in the maintenance of these genomic components that may reflect its most conserved function across species (Fig. 1c,d).
Pericentromeric minor and major satellite elements extend from the centromere in thousands to tens of thousands of tandem copies38. These elements have latent transcriptional potential, the repression of which is essential for proper chromosome alignment, segregation and integrity during mitosis. This repression is required for normal development, as demonstrated by the heritable, autosomal-recessive disease immunodeficiency, centromeric instability and facial anomaly (ICF) syndrome, which is caused by missense mutations in DNMT3B (Refs 4,39). SUV39H1 deposits H3K9 methylation, directs DNMT3B to major satellites and is sufficient for silencing at these regions38,40. DNMT3B-null cells exhibit hyperacetylation at minor satellite repeats, leading to chromosomal mispairing and lagging anaphase bridges during mitosis41. DNMT3B is directly recruited by centromere protein C (CENPC; also known as CENPC1) and functions downstream of pericentromeric nucleosome assembly, which is instructed by CENPB (Ref. 42). DNMT3B can be retained at these regions through metaphase, which suggests a continued requirement for silencing throughout mitosis41. These data highlight a crucial, specific role for DNMT3B recruitment in maintaining centromere-proximal heterochromatin to facilitate proper cell division.
Transposable elements. Endogenous transposable elements constitute nearly 40% of mammalian genomes and consist of three major classes: long interspersed nuclear elements (LINEs), short interspersed nuclear elements (SINEs) and long terminal repeat (LTR)-containing endogenous retroviruses43. Full-length LINE and LTR elements encode strong promoters that must be constitutively repressed to prevent their activity, and generally these regions are constitutively hypermethylated. However, in ESCs, DNMT1 activity alone is often not sufficient to maintain DNA methylation stably at these sites, and DNMT3 enzyme recruitment and activity are also required44. LTR sequences are silenced by tripartite-motif-containing protein 28 (TRIM28)-mediated recruitment of the H3K9 methyltransferase SETDB1. Recruitment of SETDB1 by TRIM28 appears to be a general epigenetic mechanism that is targeted to specific sequences through zinc finger proteins such as ZFP809, which itself is specific for proviral promoters45,46,47,48,49. The histone methyltransferase activity of SETDB1 acts upstream of DNMT recruitment, and DNA methylation appears to function as a secondary stabilizer50. Mechanistically, silencing also involves G9A to initiate heterochromatin assembly and DNMT3-like (DNMT3L) to recruit de novo methyltransferases36,51,52. The components required for de novo repetitive element silencing are largely restricted in their expression to discrete developmental windows, including early embryogenesis (and thus also ESCs). In adult tissues, the de novo silencing machinery is often not present, and DNA methylation is crucial to direct continued repression by recruiting methyl-CpG-binding protein 2 (MECP2) in complex with histone deacetylases (HDACs)53,54,55,56,57. 
Repeat activity is prevalent only in restricted developmental windows: the early embryo and germline development, particularly gametogenesis, in which a unique mode of epigenetic regulation involves PIWI-interacting RNAs (Box 2).
Targeting imprints. DNA methylation is the classically assigned instructional modification for germline imprint control regions (ICRs), although mounting evidence implicates other epigenetic modifiers in guiding DNA methylation to imprinted loci58. For instance, different rates of de novo methylation are observed at previously imprinted maternal and paternal loci, indicating that an epigenetic memory persists to reinstruct methylation59,60. The H19 locus has an ICR that has been well-characterized in mice. During male gametogenesis, the maternal allele retains transcriptional activity and CTCF binding, which delays its methylation until this interaction is abrogated61. In embryos generated from Dnmt3l−/− females, which lack imprints, a few alleles are stochastically remethylated at their maternal ICRs after fertilization, suggesting that other epigenetic machinery can recognize and rescue methylation defects in trans62. In a mechanism similar to that for LTR silencing, TRIM28 is specifically targeted to imprints by the zinc finger protein ZFP57, and both proteins are essential for maintaining several imprints in the embryo63,64. In ESCs, ZFP57 recognizes a methylated CpG-containing motif and recruits a complex containing TRIM28, the DNMT3s and UHRF1 (Refs 63,65) (Fig. 1e). TRIM28 and ZFP57 therefore provide additional control in specific contexts where DNA methylation may be otherwise unstable. Further exploration of TRIM28 targeting through zinc finger proteins may reveal a more detailed understanding of trans-mediated, allele-specific silencing and may uncover several missing components to the full imprinting mechanism.
Pluripotency and differentiation
Mouse ESCs are one of the most extensively studied systems for dissecting epigenetic mechanisms. The transcriptional circuitry associated with pluripotency is rapidly silenced on differentiation — in part through de novo methylation — as embryonic programmes are resolved towards specific lineages. Moreover, pluripotency represents a unique developmental window in which repetitive elements can be silenced de novo.
Tolerance to global demethylation. DNA methylation has a crucial, yet not fully understood, role in ESC commitment but not in maintenance or establishment of pluripotency. Neither the molecular signature of pluripotency nor self-renewal is affected by its complete erasure. ESCs that lack all three DNA methyltransferases remain viable and show no notable aneuploidy66. ESCs that are specifically depleted of either maintenance or de novo methylation machinery lose nearly all global methylation, albeit at markedly disparate rates and steady-state global values: loss of Dnmt1 induces a rapid loss of methylation that stabilizes to ∼20% of the normal value, whereas Dnmt3a−/− Dnmt3b−/− ESCs lose nearly all methylation over progressive divisions, indicating that the DNMT3 enzymes provide additional robustness to the inheritance of DNA methylation67,68. However, reintroduction of DNMT1 into Dnmt1-knockout ESCs restores previous methylation patterns, except imprints69.
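The contrasting kinetics described above for loss of maintenance versus loss of de novo activity can be pictured with a deliberately simple mean-field model. It is a toy sketch, not a model from the cited studies: at each division the parental strand keeps its marks, the nascent strand inherits a mark with probability `fidelity`, and unmethylated sites then gain methylation de novo with probability `de_novo`. Both parameter names and any values passed to them are illustrative assumptions and are not fitted to data.

```python
def simulate_methylation(m0, fidelity, de_novo, divisions):
    """Mean-field toy model of CpG methylation across cell divisions.

    m0        : starting fraction of methylated CpG sites (0..1)
    fidelity  : probability that maintenance copies a mark onto the nascent
                strand at a site that is methylated on the template strand
    de_novo   : per-division probability that an unmethylated site gains
                methylation independently of the template
    Returns the list [m_0, m_1, ..., m_divisions].
    """
    levels = [m0]
    m = m0
    for _ in range(divisions):
        # Old strand keeps its marks; new strand inherits them with `fidelity`.
        m = m * (1 + fidelity) / 2
        # De novo activity then acts on the sites that remain unmethylated.
        m = m + de_novo * (1 - m)
        levels.append(m)
    return levels
```

In this sketch, setting `fidelity` to zero with a small `de_novo` value gives a rapid drop to a non-zero plateau, whereas a high but imperfect `fidelity` with `de_novo` set to zero gives a gradual decline towards zero. This qualitatively resembles the contrast between Dnmt1-deficient and Dnmt3a−/− Dnmt3b−/− ESCs, although the model ignores locus-specific effects and active demethylation.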
Although stem cell molecular identity is not impaired in the absence of DNA methylation, differentiation is almost completely inhibited. Methylation-free ESCs cannot upregulate germ-layer-associated markers and do not efficiently silence pluripotency factors68. Acute deletion of the DNMT3 enzymes does not completely inhibit differentiation, suggesting that DNA methylation levels themselves, and not necessarily de novo silencing, may be responsible for this phenotype67,68.
DNA methylation also participates in buffering extra-embryonic commitment68,70. Dnmt1-knockout ESCs have unique trophectodermal lineage potential, which is caused in part by upregulation of the transcription factor ELF5 (Ref. 70). In vitro, DNA methylation seems to influence the choice between embryonic and extra-embryonic potential through reciprocal methylation of Elf5 in ESCs or Nanog and Oct4 in trophectodermal stem cells (Fig. 2a), although neither set of genes is methylated in vivo until after the extra-embryonic or embryonic fate is specified71.
Non-CpG methylation and de novo methylation activity. ESCs show extensive non-CpG methylation, most prominently at CpA dinucleotides. This methylation probably reflects a state of hyperactive de novo methyltransferase activity, as it lacks the symmetry required for replication-based maintenance by DNMT1 (Ref. 2). Although it is globally observed, non-CpG methylation is particularly enriched at certain genomic features and co-occurs with high levels of CpG methylation3. De novo non-CG methylation is enhanced by DNMT3L, which may direct de novo activity during pluripotency, but it is silenced on differentiation72,73.
The increased activity of the DNMT3 enzymes in pluripotent cells relative to differentiated cells co-occurs with unstable repetitive element silencing and may compensate for diminished or inhibited DNA methylation maintenance at these elements. The rate of demethylation in Dnmt3a−/− Dnmt3b−/− ESCs is more rapid for LINEs, minor satellites and some LTRs than for intracisternal A-type particles (IAPs) or imprints, indicating that certain elements may be especially prone to activation by escaping silencing66. These dynamically methylated regions are more frequently hemimethylated in pluripotent cells, and this fluctuation could be propagated through replication and could lead to complete demethylation72. At LINEs, for example, protection is supervised by DNMT3A and probably DNMT3L (Ref. 72). LINEs are also frequently hydroxymethylated within ESCs, and this modification might be an intermediate for either passive dilution of methylation during division or active catalytic removal74,75. The inability of DNMT1 activity alone to maintain methylation patterns at repetitive elements probably reflects a crucial function for the DNMT3 enzymes to counteract repetitive element expression specifically within pluripotent cells74.
In ESCs, fluctuations in the epigenetic status of repetitive elements suggest dual regulation by DNA and histone methylation50. Small subpopulations of ESCs concurrently reactivate LTR and LINE sequences and can contribute to extra-embryonic tissues in chimaeras, a phenotype that is strikingly similar to that of Dnmt1-null ESCs76. Although a direct link between repetitive element activity, DNA demethylation and potency has not yet been identified, the extra-embryonically competent subpopulation within ESCs is expanded in G9A and TRIM28-deficient cells, suggesting that the turnover of DNA methylation at these repetitive elements may also participate76.
Promoter methylation on differentiation. Many pluripotency-associated promoters — including those of Oct4 (also known as Pou5f1), Nanog and germline-specific genes — are silenced by hypermethylation on differentiation. Repressor binding initiates silencing, and this is followed by G9A-mediated H3K9 methylation, heterochromatin protein 1 (HP1) recruitment and finally de novo DNA methylation77. The regulatory regions of Oct4 during silencing have been studied closely and, although DNMT3A and DNMT3B show equal potential to initiate methylation at the proximal enhancer, DNMT3A more robustly triggers stable inheritance78. Intriguingly, differentiating LSH-deficient ESCs appear to methylate normally but with decreased consistency among neighbouring CpGs78. As promoter silencing usually rapidly proceeds from the assembly of heterochromatin and recruitment of DNMTs to complete hypermethylation, this discrepancy between neighbouring CpG methylation values implies that the transition from early to terminal silencing is decoupled in the absence of LSH.
On the genome scale, nucleosome-depleted regions associated with cell-type-specific regulation show pluripotency factor binding and DNA hypomethylation in ESCs79. During differentiation, DNA methylation co-occurs with nucleosome assembly, thus inhibiting transcription factor binding79. Oct4 silencing can be ectopically induced through artificial targeting of HP1α, which instructs H3K9 methylation followed by DNA methylation80. In this system, de novo silencing outside pluripotent cells remains heritable after removal of the targeting initiator, but in ESCs, removal leads to simultaneous Oct4 reactivation and demethylation80. Assembly of silenced chromatin at active genes can also occur through ectopic, modular recruitment of KRAB repressive domain fusion proteins, similarly to silencing by TRIM28 (Ref. 37). Furthermore, loss of linker histone H1, which recruits HP1α and initiates heterochromatin formation, prevents silencing and stalls differentiation81.
Silencing of germline-specific genes requires DNA methylation downstream of sequence-specific transcriptional repressors and occurs early during the onset of differentiation. These genes are misregulated in somatic cells of Dnmt1-null mouse embryos, suggesting that maintenance of methylation is crucial for continual repression82. The transcription factor E2F6 participates in promoter silencing, and its knockout reactivates genes that are normally hypermethylated and silent across embryonic tissues83. E2F6 directly recruits DNMT3B, and the knockout of either factor deregulates a similar set of genes84,85. Moreover, in somatic cells, these promoters are depleted of other chromatin modifications associated with silencing, such as H3K9 methylation86. The presence of hydroxymethylcytosine at these promoters within ESCs and during specification of the germ line (see below), as well as their notable demethylation after nuclear transfer of somatic cells into enucleated oocytes, also suggest that this promoter set is directly and dynamically regulated through DNA methylation87,88,89. Although similar mechanisms probably occur at other genes, germline gene silencing represents one of the most robust, coordinated promoter methylation events during ESC differentiation and embryonic development.
Mammalian X-chromosome inactivation is also associated with pluripotency exit and targeted DNA methylation90. Inactivation is mediated upstream of DNA methylation by the non-coding RNA X-inactivation-specific transcript (Xist) and Polycomb-group-mediated H3K27 methylation90. Promoter methylation on the inactivated X chromosome occurs with different dynamics for different sets of genes but, like germline genes, is largely reliant on DNMT3B recruitment91.
Regulation through adult lineages
Compared with ESCs, the influence of DNA methylation in adult stem cells and lineages requires a more nuanced appraisal in vivo to circumvent the early gestational or postnatal lethality of Dnmt gene deletion. As such, studying the roles of DNA methylation has often relied on conditional knockout models. The haematopoietic system remains one of the most carefully dissected lineages: an almost complete hierarchy is at hand, and multiple assays are available to quantify phenotypes. This system has provided some of the most comprehensive information on how DNA methylation functions within adult lineages, from stem cell to terminally differentiated cell, and may prove to be representative of other developmental transitions.
Haematopoietic stem cells and lymphoid versus myeloid fates. By and large, DNA methylation profiles across haematopoiesis are extraordinarily similar, particularly when compared against other tissues, suggesting that DNA methylation may have a larger role in lineage specification than in progression92,93,94. However, the importance of DNA methylation within quiescent haematopoietic stem cells (HSCs) is evident in conditional Dnmt1-knockout mice, which suffer from self-renewal defects and dramatic misregulation of myeloid versus lymphoid compartments95,96 (Fig. 2b). Although knockout HSCs arrest, they upregulate myeloid-progenitor-associated factors, and irradiated mice reconstituted with Dnmt1 hypomorphic HSCs show a skew towards myeloid fates96. Comprehensive mapping of DNA methylation at different stages of haematopoiesis confirms that lymphoid progenitors are specifically methylated at myeloid transcription factor binding sites; this might protect the evolutionarily younger lymphoid lineage from entering a 'default' myeloid state92,94,97,98. DNA methylation may generally serve to balance alternative fates within adult stem cells. For example, similar observations have been made in neurogenesis, in which glia-associated transcription factors but not neuronal-lineage-specific genes are hyperactive in DNMT1-depleted neural progenitor cells99.
Although progressive methylation accumulates in the lymphoid lineages, and entry into this fate is protected by DNMT1, the roles for DNMT3A and DNMT3B remain less clear. In mice, conditional knockout of both Dnmt3a and Dnmt3b results in minimal phenotypic effects on HSCs, although serial transplantation can lead to renewal defects that resemble the Dnmt1-knockout phenotype and that are likely to be a consequence of global loss of methylation100. As such, it appears that the DNMT3 enzymes are not essential for normal HSC function, but they might participate in the stable silencing of specific regions that otherwise retain the potential for activation if they are appropriately challenged. Serial transplantation of Dnmt3a-null HSCs results in hyperproliferation and retained expression of multipotency-associated factors in terminally differentiated lineages101. Unlike cells with both Dnmt3a and Dnmt3b knocked out, these serially transplanted, Dnmt3a-null HSCs show prominent CpG island hypermethylation, and differentiated progeny also show a background of more global hypomethylation; such an aberrant landscape is similarly observed during proliferation-dependent transformation101. Interestingly, Dnmt3a reintroduction results in only partial rescue, indicating that Dnmt3a-null HSCs are irrevocably transformed101. Although the specific function of DNMT3A in the proper maintenance of HSC quiescence remains unknown, it probably provides robust, intransigent silencing of stem-cell-associated regulatory elements that can only be abrogated under appropriate conditions.
Targeted demethylation also seems to participate in HSC differentiation102,103. For example, progression from an early progenitor to a granulocyte–macrophage progenitor stage in mice is accompanied by promoter demethylation of numerous genes, including growth arrest and DNA-damage-inducible 45 alpha (Gadd45a), which is concurrently upregulated98. Intriguingly, mutation of Tet2 is common in many forms of myeloid leukaemia, and deletion of this gene during in vitro HSC differentiation or after in vivo transplantation results in increased proliferation of the HSC compartment and myeloid fate skewing, suggesting that demethylation during lineage progression may be essential for the carefully regulated exit from multipotency to more specialized cell types104,105,106,107. TET2 has also been shown to promote CEBPα-directed transdifferentiation from lymphoid pre-B cells to macrophages by inducing epigenetic changes associated with active enhancers at hypermethylated regulatory elements108. In vivo, the DNA methylation changes that occur as precursor cells differentiate to dendritic or macrophage cells do not appear to require cell division109, making transcription-factor-directed TET2 recruitment a likely candidate for mediating targeted demethylation during haematopoiesis.
DNA methylation in later stages of differentiation. Lymphoid cells at later stages of differentiation also depend on DNA methylation. Conditional Dnmt1-knockout in naive B and T cells hinders their proliferative capacity, similarly to the observations in HSCs110,111,112. These phenotypes contrast with those observed during the terminal stages of erythropoiesis, in which the final stages before enucleation are accompanied by a subtle, division-dependent, global decrease in methylation113. Lymphoid cells must remain able to proliferate and to respond to extracellular signals, so maintenance methylation is likely to be more important in immune cell regulation than during erythropoiesis. For example, Dnmt1-null, naive CD4 T cells upregulate sets of cytokines that are normally silenced and methylated110. By contrast, CD4 T cell cytokines are upregulated in Dnmt1-null CD8 T cells114. In both cases, the misregulated cytokines are not in excess compared with the cell's proper set, highlighting the function of DNA methylation as a lineage buffer114.
Specific lymphoid classes also use alternative lineage silencing to distinguish unique programmes. In TH2 CD4 cells, interleukin 4 (Il4) is activated by binding of the transcription factor GATA3, H3K4 methylation and passive demethylation; it is spuriously activated in other lineages in the absence of DNA methylation maintenance115. The Il2 promoter is rapidly demethylated in response to binding of the transcription factor OCT1; in this case, demethylation occurs so closely after transcriptional activation that it may involve an active process (that is, it is independent of DNA replication)116,117. Il2 promoter demethylation stabilizes OCT1 binding and ensures that secondary activation in ensuing cell progeny is more rapid and more intense117. A conserved enhancer element at the forkhead box P3 (Foxp3) locus functions similarly during regulatory T cell proliferation. In this case, core binding factor beta (CBFβ) and the transcription factor RUNX1 bind in conjunction with rapid demethylation, providing an open window for FOXP3 binding that in turn stabilizes lineage progression over ensuing divisions118 (Fig. 3). Localized demethylation at enhancer elements is therefore not only associated with transcription factor binding but also with stabilization of these interactions to ensure robust expression of the target gene after its activation.
Outside promoters and enhancers, most methylation differences during haematopoietic differentiation are observed at intragenic, often exonic, CpG islands93. However, the regulatory consequences of this observation are not fully understood. It seems likely that intragenic methylation may coordinate differential expression through alternative promoters or splicing, in part because DNA methylation affects the kinetics and stability of RNA polymerase II (RNA Pol II) elongation93,119. It is at these regions within the gene body that DNA methylation might show the strongest correspondence with expression changes and, counterintuitively, is more strongly associated with transcriptional activity120. During lymphocyte maturation, a weak, generally excluded exon within CD45 (also known as PTPRC) is specifically included by demethylation of a downstream intron; this demethylation event stabilizes CTCF binding and slows Pol II elongation, ensuring that the upstream exon is incorporated into the final transcript121. This provides a clear example of subtle changes in DNA methylation that act outside promoters and stabilize lineage choice.
The germ line and early embryo
Primordial germ cell specification and global demethylation. The relative stability of the bulk methylome throughout somatic cell commitment contrasts dramatically with the specification of the germ line, when global DNA demethylation occurs. In mice, PGCs are spatially confined around embryonic day (E)6.5 in the proximal epiblast and are specified by PR domain zinc finger protein 1 (PRDM1; also known as BLIMP1), which interacts with the arginine methyltransferase PRMT5 to silence somatic genes122,123,124. Commitment proceeds at ∼E7.5 as PGCs begin migrating along the embryonic–extra-embryonic interface to the developing gonad; this coincides with the reactivation of multiple pluripotency-associated factors125,126,127. Before their migration, PGCs show a somatic methylation pattern that reflects their embryonic origin. They become grossly demethylated over a window of ∼1 day from E10.5–11.5 as measured by locus-specific bisulphite sequencing or around E9.5 as assessed by immunohistochemical analysis128,129. Genome-wide studies confirm that demethylation during PGC specification is almost complete, the exception being IAPs and a few novel LTR sequences that escape complete erasure but that are still less methylated when compared with somatic cells129,130,131,132,133,134. Most promoters of germline-specific genes are also hypermethylated outside gametes, the early embryo and pluripotent cells, and their demethylation during PGC specification corresponds with their eventual expression82,85,135,136. Although many candidates have been proposed to act as the primary catalyst, how demethylation occurs remains unknown and needs to be addressed to define completely the cause and context of epigenetic reprogramming in the germ line.
DNA demethylation co-occurs with multiple global epigenetic remodelling events during PGC reprogramming (Fig. 4). Between E8.5 and E9.5, PGCs upregulate DNA-binding factor Stella (also known as DPPA3 or PGC7) and asynchronously arrest in the G2 phase of the cell cycle. TET1 is expressed during PGC specification and is a potential candidate mediator for global DNA demethylation, but it is difficult to reconcile this with the germline competence of Tet1-null mice137,138. Recent evidence suggests that TET1 may not specifically be essential for global DNA demethylation but may facilitate the activation of germline-associated genes during PGC progression139. Global hydroxymethylcytosine (hmC) observed in PGCs appears to be dependent on TET1 activity and notable enrichment is observed at germline gene promoters and ICRs, which are specifically demethylated during this phase88,134. Deamination may also participate as activation-induced cytidine deaminase (Aid)-knockout PGCs exhibit higher methylation signatures genome-wide compared with wild-type PGCs, but they are still dramatically demethylated132. Specified PGCs show marked downregulation of UHRF1, which destabilizes DNMT1 and prevents its localization to replication foci, as well as downregulated DNMT3 expression, suggesting that aspects of global demethylation within PGCs may be facilitated by cell division in the absence of methyltransferase activity88,134,140,141. A complete model of the roles of catalytic demethylators, hydroxymethylation, cell division and DNA demethylation remains to be fully assembled.
PGCs also globally erase H3K9me2, and this process is assisted by rapid downregulation of the binding partner of G9A, GLP (also known as EHMT1)142. Simultaneously, histone chaperones associated with replication-independent exchange may be preferentially recruited143. During demethylation, single-strand DNA breaks and BER enzyme activity co-occur across the genome, possibly participating in stabilizing the demethylation process137. DNA demethylation is delayed when BER components are chemically inhibited, supporting a relationship between the global DNA damage response and DNA demethylation, although the direct link between these two processes remains speculative137. Intriguingly, H3K9 demethylation and DNA demethylation are accompanied by a 'pulse' of global H3K27 methylation, which possibly compensates for the loss of these other silencers (as it does in DNMT-deficient ESCs)23,128,142. Establishing the hierarchy of epigenetic events during germline reprogramming remains an ongoing endeavour.
Remethylation of the germ line. Demethylation dynamics between male and female PGCs are almost identical; however, the time and place at which the bulk genome is remethylated follows sex-specific timelines. Female gametes accumulate methylation after meiosis I arrest and do not reach their full global levels until sexual maturation144. By contrast, in male gametes the bulk genome is remethylated before birth and ensuing meioses134,145. However, DNMT3L and DNMT3A are essential for de novo methylation in both sexes144,146,147,148,149,150,151,152. Failure to re-establish DNA methylation in male gametes causes severe spermatogenesis defects and sterility, although fertilization-competent oocyte production is not hindered by the global loss of DNA methylation and maternal-effect lethality is almost exclusively conferred as a consequence of defective imprinting146,147. Oocyte development may not require DNA methylation per se, but Lsh knockout during oogenesis induces a similar phenotype to that observed in Dnmt3l−/− males, including arrest, misregulated synapsis, double-strand breaks and abnormally low repetitive element methylation153. Therefore, the maintenance of heterochromatin at certain repetitive elements is equally essential for oogenesis, although it is possibly acquired through different mechanisms.
Oocytes and sperm retain large differences in their methylation status, particularly at certain repetitive element classes, including LINEs and some LTR promoters, which are more methylated in sperm144,152, whereas oocytes are mostly hypermethylated at IAPs154. Sex differences in de novo methylation probably involve molecular components that are unique to one sex during gametogenesis, but the exact mechanisms remain unknown. Intriguingly, the PIWI-associated RNA (piRNA)-binding protein MILI (also known as PIWIL2; Box 2) is expressed in both types of gamete, and Mili-knockout testes show disproportionate IAP, LTR and LINE activity130,145,155,156. MIWI2 (also known as PIWIL4) acts downstream of MILI and is associated with LINE silencing; as MIWI2 is exclusively expressed during male gametogenesis, this link could contribute to the discrepancy in repetitive element silencing between gametes155,157,158.
DNA methylation and the early embryo. On fertilization, the hypermethylated sperm undergoes a rapid, almost complete loss of methylation that was originally described through immunohistochemistry as occurring before the onset of replication159. Methylation-sensitive restriction digestion and bisulphite sequencing confirmed that the most dramatically demethylated repetitive element classes, such as LINE and certain LTR classes, reduce their values to levels near those in the oocyte, in which these elements are hypomethylated, whereas IAPs and other resistant elements remain methylated10,131,154,160,161. After this initial pulse of paternally targeted demethylation, global methylation is further depleted after ensuing divisions and reaches a minimum before embryonic specification in the blastocyst10. Since its original description, careful dissection of demethylation across pronuclear stages using bisulphite sequencing has pinpointed DNA synthesis as the period during which demethylation to unmodified cytosine is most dramatic162. This information from bisulphite sequencing contrasts with prior immunohistochemical observations and indicates that intermediate modifications, such as hmC, precede demethylation to unmodified cytosine; this last step may be closely tied to replication162. Before DNA replication, the paternal methylome is globally oxidized by TET3, which, like TET1, contains a CXXC domain that probably confers specificity for as yet undefined targets163,164,165. The oxidation of bulk methylcytosine in the paternal genome occurs shortly after rechromatinization, before DNA synthesis, and may be intimately linked to the acetylation of nascently incorporated histones. The relationship between cytosine hydroxymethylation and histone acetylation may be coupled by the histone acetyltransferase elongator complex protein 3 (ELP3)166.
Intriguingly, Tet3-null mice confer a maternally dominant mid-gestational lethality, and few embryos develop to term; this is the first phenotype to have been associated with abnormal paternal demethylation165. Globally, the hmC signal is halved after the first division and is further reduced after subsequent divisions, suggesting that at fertilization the active catalysis to cytosine is likely to be restricted to specific targets, whereas most of the bulk demethylation probably proceeds through replicative loss163,164,167. Intriguingly, BER complex recruitment, single-strand breaks and phosphorylated H2A.X are enriched at different phases of zygotic progression, first appearing immediately after fertilization and emerging again during DNA replication, particularly within the paternal pronucleus, possibly tying these additional complexes to the demethylation process137,162,168.
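The replicative loss described above can be illustrated with a simple dilution model. The following is a minimal sketch, not a model taken from the cited studies: it assumes that a modified cytosine is never written de novo, that each round of replication pairs every parental strand with a new strand, and that the mark is copied onto the new strand with a fixed probability. The function name passive_dilution and the parameter maintenance_efficiency are hypothetical and introduced here only for illustration.

```python
def passive_dilution(initial_fraction, divisions, maintenance_efficiency=0.0):
    """Per-strand fraction of modified cytosines after 0..divisions replications.

    Assumptions (illustrative only): no de novo modification, and after each
    replication the new strand inherits the mark from a modified parental
    strand with probability `maintenance_efficiency`.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must lie between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must lie between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")

    # Half of all strands are parental (fraction f still modified) and half are
    # newly synthesized (fraction maintenance_efficiency * f modified).
    factor = (1.0 + maintenance_efficiency) / 2.0

    levels = [initial_fraction]
    for _ in range(divisions):
        levels.append(levels[-1] * factor)
    return levels


if __name__ == "__main__":
    # No maintenance: the signal halves with every division.
    print(passive_dilution(1.0, 3))
    # Perfect maintenance: the signal is stably propagated.
    print(passive_dilution(1.0, 3, maintenance_efficiency=1.0))
```

With maintenance_efficiency set to 0, a fully modified starting population gives 1.0, 0.5, 0.25 and 0.125 over three divisions, which is the halving pattern reported for the global hmC signal; with perfect maintenance the level does not change. Intermediate values would correspond to partial maintenance, which this sketch treats as uniform across the genome even though the text indicates that specific elements are protected.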
Similarly to remodelling in the germ line, DNA demethylation after fertilization is coupled both to histone exchange and to novel chromatin regulation169. A key difference, however, is that the maternal genome appears to be generally static during paternal genome remodelling (Fig. 5). The maternal pronucleus of the developing zygote shows strong H3K9 methylation, whereas the remodelled paternal genome instead accumulates H3K27 methylation and Polycomb repressive complex 1 (PRC1) components, which, like hmC, are serially diluted over the first few divisions170,171. The maternal genome is protected by Stella, which recognizes and binds H3K9 methylation172. In the absence of Stella, the epigenetic events of paternal reprogramming, including TET3-mediated hydroxymethylation and BER activity, are recruited to both pronuclei137,162,164,165,172,173. Maternal methylation is retained both at known imprints that persist through adulthood and at novel, pre-implantation-specific imprints, which are lost later in cleavage10.
Compared to PGCs, the early embryo shows considerably higher, more stable DNA methylation at a number of genomic features, including IAPs and other retroelements131. Whereas the hmC signal is enriched on the paternal genome and is halved throughout cleavage, these methylated regions must either be protected from demethylation or recognized by other silencing complexes that can rescue DNA methylation. Many elements that display some demethylation during DNA replication are immediately remethylated, suggesting either delayed maintenance or compensation by maternally contributed DNMT3A162,174. In the early embryo, an oocyte-specific isoform of DNMT1, called DNMT1O, is responsible for maintenance, whereas the transcription start site of the somatic isoform is maternally methylated and not expressed until the four-cell stage144,174,175. UHRF1 and DNMT3A can recognize hemihydroxymethylated substrates, whereas DNMT1 cannot, although these data are currently limited to in vitro biochemistry176,177,178. Methyl-CpG-binding domain protein 3 (MBD3) might participate in bridging hmC to epigenetic silencing by recruiting the nucleosome remodelling and histone deacetylation (NURD) complex179,180. HDAC complexes might participate in re-silencing hydroxymethylated targets, as cytosine oxidation and histone acetylation seem to be tightly coupled166. Notably, Mbd3-null ESCs cannot differentiate into embryonic cell fates but do gain extra-embryonic potential, whereas Tet1-null ESCs retain their differentiation ability but exhibit a subtle extra-embryonic skew138,181,182,183; this bias for extra-embryonic fates is similar to that of Dnmt1 mutants, implying that a tight relationship among many epigenetic silencers is required to establish and to protect the embryonic lineage.
The recent availability of genome-wide DNA methylation data has extended to complete lineage hierarchies and clarifies early locus-specific observations, linking dynamic regions to the phenotypes derived from classical mouse genetics. However, the exact relationship between DNA methylation and these phenotypes remains a complex problem. Evaluating the part that CpG methylation plays in diverse developmental contexts, where causality may span from global patterns to a single base, will require systematic integration of cutting edge genomic assays and classical genetic approaches.
In many instances, DNA methylation acts as a buffer to stabilize decisions made by transcription factors, ensuring that they are precise and robust. High-throughput perturbation strategies, such as those made possible through zinc finger nuclease and TALEN-based genomic editing technologies, could further dissect the mechanism and underlying circuitry of these transcriptional silencing events. These technologies may also allow independent sequences to be screened in vivo, leading to the identification of underlying, instructional cis-acting elements that have been, to date, comparatively obscure. High-precision measurements of CpG methylation at per-base resolution will lead to a refined understanding of its normal variability and stability, potentially pinpointing the events behind the progressive accumulation of aberrations as observed during tumorigenesis or ageing.
In several exceptional contexts, notably at fertilization and in germline specification, DNA methylation is extraordinarily and globally dynamic, although the evolutionary stimulus behind these events, their root cause and overt function, remain unclear. Examining the drift inherent to DNA methylation, and its underlying, mutable base, across generations may eventually explain complex hereditary phenotypes. Hydroxymethylation has only recently been confirmed as a modification of developmental importance, and its notable presence in the germ line and at fertilization makes it a compelling candidate for a role in epigenetic transgenerational inheritance. How DNA methylation and other epigenetic silencers participate in normal development is now seen with an unprecedented clarity, although how these processes are intimately coordinated remains a complicated problem to parse into discrete, hierarchical relationships. In summary, it is exciting to look back and note the fundamental insights that have already been acquired and to see from these works the future path towards a complete understanding of mammalian DNA methylation.
We thank members of the Meissner laboratory as well as M. M. Chan, A. Regev, T. S. Mikkelsen, R. P. Koche, H. Gu, A. Gnirke, and P. Boyle for discussion and insight. A.M. is supported by the Pew Charitable Trusts, US National Institutes of Health (NIH) grants (U01ES017155 and P01GM099117) and is a New York Stem Cell Foundation (NYSCF) Robertson Investigator.
- CpG islands
Regions of several hundred to approximately two thousand base pairs that are frequently found at promoters and that exhibit strong enrichment for CpG dinucleotides. Those CpG islands at promoters are predominantly unmethylated across cell types.
- Symmetrically methylated CpGs
The presence of a methyl group on carbon 5 of the cytosine base is typically found on both bases within the palindromic CpGs of opposing DNA strands, reflecting successful maintenance methylation during DNA synthesis.
- CXXC domain
A conserved Cys–X–X–Cys domain that is frequently found in developmental epigenetic regulators that bind unmethylated CpG-containing DNA and often instruct the modification of histones in ways that oppose DNA methylation.
- H2A.Z
A histone H2A variant that is incorporated into euchromatic nucleosomes and is believed to contribute to the rapid exchange of histones in active genomic loci.
- 5-azacytidine
A cytosine analogue that incorporates into synthesizing DNA and covalently binds DNA methyltransferases (DNMTs), sequestering and inhibiting their function. It is used in some cancer therapies.
- TET dioxygenases
Enzymes that associate with DNA and use α-ketoglutarate and Fe2+ cofactors to mediate the oxidation of methylated cytosine to hydroxy-, formyl- or carboxymethylcytosine, which are potential demethylation intermediates.
- Lymphoid-specific helicase
(LSH). A SWI/SNF-like helicase believed to participate in the remodelling of nucleosomes that is necessary to initiate heterochromatin assembly and de novo methylation.
- Minor satellite
A repetitive array of AT-rich sequences that instructs the formation of the centromeric nucleosomes to form the kinetochore and to permit sister chromatid pairing. Centromere protein B (CENPB) mediates the assembly of CENPA-containing nucleosomes.
- Lagging anaphase bridges
Delayed sister chromatid segregation during mitosis as a consequence of faulty centromere formation or connection to the mitotic spindle; a frequent cause of chromosome loss and aneuploidy.
- Long interspersed nuclear elements
(LINEs). A type of repetitive element that encodes the necessary proteins for integration into the genome of its reverse-transcribed RNA transcript. It preferentially integrates within gene-poor regions.
- Trophectodermal lineage
Cells that contribute to placental tissues. They are derived from the external component of blastocyst stage embryos; this represents the first restriction in cellular potency during mammalian development.
- Intracisternal A-type particles
(IAPs). These are notable class II long terminal repeat (LTR)-containing retroelements that are specific to mice and that retain high methylation levels throughout development.
- Hemimethylated
Asymmetric methylation of only one cytosine within opposing CpGs. If not remethylated during S phase by DNA methyltransferase 1 (DNMT1), hemimethylated DNA can lead to loss of mitotic inheritance after a subsequent round of replication.
- Hydroxymethylation
Oxidation of the methyl group as mediated by the TET dioxygenases in an α-ketoglutarate-dependent reaction. Hydroxymethylated residues can be further oxidized to formyl- and carboxymethylcytosine.
- Germline-specific genes
Genes of unique gametogenic function that are tightly regulated outside gametes or the early embryo by DNA methylation. They are specifically demethylated during primordial germ cell specification.
About this article
Nature Communications (2019)