In mammals, all somatic development originates from lineage segregation in early embryos. However, the dynamics of transcriptomes and epigenomes acting in concert with initial cell fate commitment remain poorly characterized. Here we report a comprehensive investigation of transcriptomes and base-resolution methylomes for early lineages in peri- and postimplantation mouse embryos. We found allele-specific and lineage-specific de novo methylation at CG and CH sites that led to differential methylation between embryonic and extraembryonic lineages at promoters of lineage regulators, gene bodies, and DNA-methylation valleys. By using Hi-C experiments to define chromatin architecture across the same developmental period, we demonstrated that both global demethylation and remethylation in early development correlate with chromatin compartments. Dynamic local methylation was evident during gastrulation, which enabled the identification of putative regulatory elements. Finally, we found that de novo methylation patterning does not strictly require implantation. These data reveal dynamic transcriptomes, DNA methylomes, and 3D chromatin landscapes during the earliest stages of mammalian lineage specification.
In mammals, early lineage specification in preimplantation and postimplantation embryonic development generates founder tissues for all subsequent somatic development1. The first lineage specification starts at the morula stage, when the inner cell mass (ICM) and the trophectoderm (TE) begin to segregate2. The ICM contains both cells of the epiblast lineage, which give rise to the entire fetus, and cells of the primitive endoderm lineage, which form visceral endoderm (VE) and parietal endoderm3,4. VE becomes the chief metabolic component of the visceral yolk sac, and parietal endoderm contributes to the transient parietal yolk sac4. TE contains progenitor cells for trophoblasts, which form the majority of the fetal-origin part of the placenta3. In mice, by embryonic day 6.5 (E6.5), the anterior epiblast gives rise to ectoderm, and the posterior proximal epiblast develops into the primitive streak, which then forms mesoderm and endoderm5. The resulting three germ layers contain virtually all progenitors for the future body plan6.
Notably, early cell fate commitment is accompanied by extensive epigenetic reprogramming2. For example, drastic demethylation and remethylation of DNA take place during early embryogenesis7. DNA methylation plays critical roles in gene repression, genomic imprinting, and X chromosome inactivation8. Deficiency in DNA methyltransferases (DNMTs) often leads to lethality or sterility7. Interestingly, defects in extraembryonic tissues, which provide both nutrients and developmental cues for embryonic development, are frequently found in mice deficient in DNMTs9,10,11. A large portion of DNA methylation in gametes is removed during preimplantation development7. The methylome in postimplantation embryos then forms part of the epigenetic basis for the entire body plan. DNA methylome reprogramming in preimplantation embryos has been studied extensively12,13,14,15,16. However, because of the limited materials available and the difficulty of tissue isolation in early embryos, lineage-specific regulation of transcriptomes and epigenomes in peri- and postimplantation embryos is poorly characterized. Here we conducted a comprehensive analysis of transcriptomes and whole-genome DNA methylomes at single-base resolution for major lineages that arise before and after implantation. This analysis, together with Hi-C experiments probing higher-order chromatin structure during the same period, provides unprecedented spatiotemporal views for the establishment of the molecular architecture regulating early cell fate commitment and body plan in mammals.
Mapping global transcriptomes and DNA methylomes during early lineage specification
To study the transcriptional programs and epigenomes involved in early lineage segregation, we carefully dissected various tissues from peri- and postimplantation mouse embryos (DBA/2N male × C57BL/6N female mice) using methods described previously17,18 (Fig. 1a, Supplementary Fig. 1a, Methods). These included ICM (E3.5 and E4.0), mural TE (E3.5), VE (E5.5 and E6.5), epiblast (E5.5 and E6.5), ectoderm (E7.5), endoderm (E7.5), mesoderm (E7.5), and primitive streak (PS) (E7.5). We chose TE and VE as representatives of extraembryonic tissues. Analysis of lineage-marker genes and transcriptomes strongly supported the correct identities of these tissues and dynamic transcription landscapes in early lineages (Supplementary Fig. 1b–e, Supplementary Table 1). E3.5 ICM and E4.0 ICM were grouped together and were distantly connected with all ICM-derived tissues from mice at E5.5 to E7.5 (Supplementary Fig. 1e). Both the marker genes and the global transcriptome of endoderm were similar to those of E6.5 VE (Supplementary Fig. 1c–e); this is in line with the notion that VE also contributes to the endoderm lineage19. Taken together, these data demonstrate the high quality of the early tissues we isolated and lineage-specific transcription landscapes in early development.
Next, to examine the dynamics of DNA methylation during early lineage commitment, we developed a low-input method for genome-wide DNA-methylation profiling: STEM-seq (small-scale TELP-enabled methylome sequencing) (Supplementary Fig. 2a). This approach reduces DNA loss because bisulfite conversion is performed before TELP-mediated DNA amplification, a highly sensitive library-preparation method20. Our data showed that STEM-seq could accurately determine DNA methylomes with as little as 10 ng of genomic DNA, or 500 cells (Supplementary Fig. 2b–e). Next, we profiled high-depth methylomes for early lineages (with 2–3 replicates) by STEM-seq (sequencing information is provided in Supplementary Table 2). We first focused on CG methylation. The methylome data generally showed excellent replicate reproducibility (Supplementary Fig. 3a) and genome coverage for CG sites (Supplementary Fig. 3b,c). A global view of methylomes revealed large hypomethylated regions around the Hoxa gene cluster, as expected21 (Fig. 1b). We also observed dynamic DNA methylation near developmentally regulated genes including Hnf4a (VE/endoderm marker), Pou5f1 (also known as Oct4), and Tdgf1 (epiblast markers), which are reciprocally methylated in epiblast or VE (Supplementary Fig. 3d). Notably, these promoters showed intermediate levels of methylation in endoderm, consistent with the mixed origin of endoderm from both epiblast and VE19. We then investigated whether the global CG methylome of each tissue reflects its spatiotemporal relationship by conducting a hierarchical clustering analysis of methylomes for early embryos, as well as for somatic tissues21 and mouse embryonic stem cells (mESCs)22 (Fig. 1c). We found that E3.5 ICM, E3.5 TE, and E4.0 ICM, which are all hypomethylated, clustered together away from all other lineages (Fig. 1c). The methylomes of endoderm, ectoderm, and mesoderm were much closer to each other than to the methylomes of the derivative somatic tissues. 
These data suggest that substantial epigenome drift occurs between embryonic progenitor and somatic tissue.
Dynamic lineage-specific methylation at CG and CH sites
The segregation of ICM and TE is the first lineage-specification event in embryos2. Genome-wide, we identified a total of 208 and 47 promoters that were hypermethylated in ICM and TE, respectively (Supplementary Table 3, Methods). The majority of genes that were differentially expressed between ICM and TE did not show differences in promoter methylation (Supplementary Fig. 3e). For example, both Pou5f1 and Tdgf1 were expressed at high levels in ICM but not in TE, yet their promoters remained unmethylated in both lineages (Supplementary Fig. 3d). However, both promoters are methylated in TE-derived placenta21 (Supplementary Fig. 3d), which indicates that DNA methylation is involved in maintaining these lineage regulators but not in initially silencing them. We then asked how de novo methylation occurs in concert with the specification of epiblast and VE. From E4.0 to E6.5, DNA methylation increased considerably genome-wide in epiblast, but it increased to a lesser extent in VE (Fig. 1d, Supplementary Fig. 3f). This was accompanied by epiblast-specific sharp upregulation of Dnmt3a, Dnmt3b, and Dnmt3l at E5.5, with Dnmt3l likely undergoing autorepression through promoter methylation at E6.523 (Supplementary Fig. 3g,h). In addition to CG methylation, CH methylation was relatively enriched in oocytes but was barely detected in sperm and after the four-cell stage, although it reappeared in E5.5 epiblast (Fig. 1e). Unlike CG methylation, which showed further increases, CH methylation decreased from E6.5 to E7.5 (Fig. 1d, e). This is consistent with the reduced expression of Dnmt-family genes (Supplementary Fig. 3g) and the fact that CH methylation cannot be maintained by DNMT124. CH methylation in early embryos preferentially occurs in TACAG sequences (Fig. 1e and data not shown), similar to what is observed in embryonic stem cells24. In sum, these data indicate that lineage-specific de novo methylation of CG and CH sites correlates with the activities of DNMT proteins. 
Because CH methylation levels were much lower than CG methylation levels (Fig. 1e), we focused mainly on CG methylation in subsequent analyses, unless otherwise noted.
Allele-specific de novo methylation highlights conserved gene body methylation
In preimplantation development, the two parental genomes undergo differential demethylation7. We asked whether the two parental alleles are also subjected to distinct de novo methylation in postimplantation embryos. We first validated our allele-specific analyses with methylome data from gametes13 and imprinted loci (Supplementary Fig. 4a,b, Methods). Notably, in E3.5 ICM, the maternal genome, but not the paternal genome, was hypermethylated in gene bodies, showing an oocyte-like methylome pattern25 (Fig. 2a). The parental methylomes quickly became symmetrical by E5.5 as a result of allele-specific acquisition of DNA methylation that was anticorrelated with the starting levels of DNA methylation in E4.0 ICM (Fig. 2a,b). CG and CH methylation both occurred preferentially in active gene bodies in VE and epiblast from E5.5 to E6.5 (Fig. 2a,c). In E6.5 epiblast, the gene body CG methylation pattern became attenuated, probably as a result of a saturation effect. These data show both allele-specific methylation and conserved gene body methylation in postimplantation embryos.
Differential methylation of promoters and DNA methylation valleys between epiblast and VE
We then asked whether the distinct methylomes in embryonic and extraembryonic tissues might regulate lineage-specific transcription programs. Because the parental methylomes became similar after E5.5, we conducted the analyses without separating the alleles. We identified promoters that were hypermethylated in E6.5 epiblast (n = 2,936) or VE (n = 242) (Supplementary Table 3). Of the corresponding genes, only a small fraction (6.4% and 20.2%, respectively) showed consistent changes in expression (threefold downregulation in hypermethylated tissues), and we considered these as possible ‘DNA methylation effectors’ (Supplementary Table 3). Genes that were specifically hypermethylated in epiblast included many VE markers, such as members of the apolipoprotein family Apoa1, Apoa4, Apoa5, Apob, and Apoc2 (Fig. 3a). In contrast, several key epiblast marker genes such as Pou5f1, Nanog, and Tdgf1 were hypermethylated in VE (Fig. 3a). Aside from these methylation effectors, the rest of the differentially methylated genes were largely silenced in both epiblast and VE (Supplementary Fig. 5a). Among these genes, those hypermethylated in VE were strongly enriched for developmental genes (P = 5.83 × 10⁻⁵) and transcription factors (P = 7.41 × 10⁻⁷), including the Hox genes (Hoxb2, Hoxb3, and Hoxd12), Nkx2-5, Nkx2-6, Prdm14, and Hand1. We did not observe this for genes that were hypermethylated in epiblast, which were overwhelmingly enriched for the olfactory receptor gene family (P << 0.001; fold enrichment, 4.72) (Supplementary Fig. 5a). Thus, DNA methylation is likely to be engaged in reciprocal gene silencing of lineage regulators (or future regulators) between epiblast and VE.
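The effector criterion described above (at least threefold lower expression in the tissue in which the promoter is hypermethylated) can be illustrated with a short sketch. This is not the analysis code used in this study; the function names, the pseudocount of 1, and the example expression values are hypothetical and serve only to make the criterion concrete.

```python
def is_methylation_effector(expr_hyper, expr_other, fold=3.0, pseudocount=1.0):
    """Return True if expression in the hypermethylated tissue is at least
    `fold` times lower than in the other tissue.

    The pseudocount (an assumption, not taken from the text) avoids division
    by zero for genes that are silent in the hypermethylated tissue.
    """
    return (expr_other + pseudocount) / (expr_hyper + pseudocount) >= fold


def effector_fraction(genes, fold=3.0):
    """Fraction of genes meeting the effector criterion.

    `genes` is a list of (expr_in_hypermethylated_tissue, expr_in_other_tissue).
    """
    if not genes:
        return 0.0
    hits = sum(is_methylation_effector(h, o, fold=fold) for h, o in genes)
    return hits / len(genes)


# Hypothetical expression values for four promoter-hypermethylated genes.
example_genes = [(0.0, 10.0), (5.0, 6.0), (1.0, 5.0), (10.0, 10.0)]
example_fraction = effector_fraction(example_genes)
```

In this toy input, two of the four genes pass the threshold; the remaining genes would fall into the class described above as differentially methylated without a consistent expression change.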
We found it intriguing that the promoters of developmental genes were preferentially methylated in VE. Careful examination revealed that these hypermethylated regions extended beyond promoters (Fig. 3b). Previously, we and others found that developmental genes tend to reside in large domains of hypomethylated regions, termed DNA methylation valleys (DMVs)26 or DNA methylation canyons27. Using a previously described approach26, we identified 842–900 DMVs in E6.5 epiblast, ectoderm, PS, mesoderm, and endoderm (Methods). We were not able to call DMVs in other lineages that were globally hypomethylated. Indeed, DMVs in early embryos were similarly enriched for developmental genes and Polycomb targets (Supplementary Fig. 5b,c). By identifying all hypermethylated regions in E6.5 VE and comparing them with those in E6.5 epiblast (Methods), we confirmed that promoters and CpG islands (CGIs) were preferentially methylated in VE (Fig. 3c). In epiblast, DMVs with trimethylation of histone H3 at Lys27 (H3K27me3) also gained partial DNA methylation at E5.5. Unlike those in VE, these DMVs quickly lost DNA methylation at E6.5 and remained relatively hypomethylated in somatic tissues (Fig. 3b,d, Supplementary Fig. 5d). Similar patterns were observed for an epiblast methylome dataset generated via reduced-representation bisulfite sequencing28 (Fig. 3b,d). The changes of DNA methylation were most evident for non-CGI regions in DMVs, but they were also found in CGIs (Supplementary Fig. 5e). In fact, CGIs in DMVs were preferentially methylated in VE compared with other CGIs (Supplementary Fig. 5f). Because the DNA methylation oxidase genes Tet1 and Tet2 were expressed at high levels in peri- and/or postimplantation embryos (Supplementary Fig. 5g), we asked whether they are involved in demethylation of DMVs. 
To explore this, we generated Tet1/Tet2 double-knockout (DKO) mice (by crossing Tet1-knockout mice and Tet2-knockout mice; Methods) and isolated E6.5 epiblast for STEM-seq analysis. Indeed, DMVs from Tet1/Tet2 DKO mice showed increased DNA methylation compared with that in wild-type mice (Fig. 3b,d), indicating that DMVs undergo TET-mediated demethylation in epiblast at E6.5. The active demethylation of DMVs raises the possibility that hypomethylation of DMVs is important for maintaining the transcriptional plasticity of the associated developmental genes.
Lineage-specific methylation is associated with chromatin higher-order structure
The differential methylation between VE and epiblast was not limited to promoters and DMVs. A chromosome-wide view showed that such differences also existed in much larger regions (Fig. 4a). For example, whereas epiblast showed relatively even methylation across the chromosome, VE showed megabase-sized hypomethylated domains (Fig. 4a), a feature that resembled partially methylated domains (PMDs) in placenta29. Chromatin is known to be spatially organized into two types of large compartments, A and B, which show preferential physical interaction within each class but not between classes30. Compartments A and B generally match open chromatin domains with high gene densities and closed chromatin domains with low gene densities, respectively30. We asked whether the PMDs in VE correlate with such chromatin compartments. Using sisHi-C, a low-input Hi-C method31 (Methods), we investigated higher-order chromatin organization for E3.5 ICM31, E6.5 epiblast, E6.5 VE, and E7.5 ectoderm (Supplementary Table 2). We found that the three-dimensional chromatin interaction patterns were globally similar to one another in early lineages, as well as to those in mESCs32 (Fig. 4b). This was also true for ‘topological domains’ (Fig. 4b) defined by directionality index32 (Supplementary Fig. 6a,b), P(s) curves (which reflect the relationship of genomic distances and chromatin interaction frequencies) (Supplementary Fig. 6c), and chromatin compartments (Fig. 4a). These data indicate that higher-order chromatin structure is established as early as in ICM and is largely conserved from E3.5 to E7.5. We then sought to identify all PMDs and highly methylated domains (HMDs) in E6.5 VE (Methods). Indeed, we found that HMDs and PMDs in VE correlated with chromatin compartments A and B, respectively (Fig. 4a,c). One interesting question is whether the higher-order chromatin structure modulates DNA methylation, or vice versa. As chromatin organization is already established in ICM (Fig. 4a,b), where the genome is globally hypomethylated, it is unlikely that DNA methylation regulates chromatin compartments. To test whether the preferential DNA methylation in compartment A in VE was simply due to higher transcriptional activities, we examined DNA-methylation levels in active gene bodies, inactive gene bodies, and intergenic regions in each compartment. In VE, active gene bodies were preferentially methylated in compartments A and B, which is in line with gene-body-dependent DNA methylation. However, inactive gene bodies and intergenic regions showed considerable levels of DNA methylation only in compartment A, and not in compartment B (Supplementary Fig. 7a), which suggests that compartment-correlated DNA methylation in VE may be independent of transcription.
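The stratified comparison described above, in which methylation is summarized separately for each combination of compartment and region class, can be sketched as follows. This is an illustrative sketch only; the record layout, the function name, and the example values are hypothetical and are not taken from the analysis pipeline of this study.

```python
from collections import defaultdict


def mean_methylation_by_group(records):
    """Average methylation level per (compartment, region_class) group.

    `records` is an iterable of (compartment, region_class, level), where
    `level` is a methylation fraction between 0 and 1. The layout is a
    hypothetical one chosen for illustration.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of levels, count]
    for compartment, region_class, level in records:
        acc = totals[(compartment, region_class)]
        acc[0] += level
        acc[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Hypothetical per-region values, for illustration only.
example_records = [
    ("A", "intergenic", 0.6),
    ("A", "intergenic", 0.8),
    ("B", "intergenic", 0.2),
    ("A", "active_gene_body", 0.9),
    ("B", "active_gene_body", 0.7),
]
example_means = mean_methylation_by_group(example_records)
```

Separating region classes before comparing compartments is what allows a transcription-linked effect (active gene bodies) to be distinguished from a compartment-linked effect (inactive gene bodies and intergenic regions).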
Notably, all regions in epiblast seemed to acquire similar levels of DNA methylation in compartments A and B (Supplementary Fig. 7a). It is unclear why compartment-specific methylation was absent in epiblast. One possibility is that chromatin in compartment A is more accessible for DNMTs, but in epiblast excessive DNMT machinery leads to equal methylation in compartment B. Because CH methylation occurred at comparatively lower levels that were far from saturation, we asked whether CH methylation might be correlated with chromatin compartments in both lineages. Indeed, unlike CG methylation, CH methylation occurred preferentially in compartment A in both epiblast and VE (Supplementary Fig. 7b). As a result, CG and CH methylation were highly correlated in both E5.5 VE (R = 0.83) and E6.5 VE (R = 0.80), but showed weaker correlation in E5.5 epiblast (R = 0.37) and virtually no correlation in E6.5 epiblast (R = –0.02) (Supplementary Fig. 7c). Taken together, our data indicate that lineage-specific de novo methylation correlates with chromatin compartment and differential expression of Dnmt genes.
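The correlation coefficients reported above can be illustrated with a minimal sketch, assuming that R denotes a Pearson correlation computed over paired CG and CH methylation levels in genomic bins. The text does not state the statistic or the bin size, so both are assumptions here, and the function name and example values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two paired observations are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-bin methylation levels (CG fraction, CH fraction).
cg_levels = [0.10, 0.20, 0.30, 0.40]
ch_levels = [0.01, 0.02, 0.03, 0.04]
r_value = pearson_r(cg_levels, ch_levels)
```

A lineage in which CG methylation is saturated across bins would have little variance in the CG values, which is one way a low R, as in E6.5 epiblast, could arise even when CH methylation still varies by compartment.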
Paternal demethylation in preimplantation embryos correlates with chromatin compartment
Given that de novo methylation is associated with chromatin higher-order structure, we asked whether this is also true for genome demethylation in preimplantation embryos. Surprisingly, we found that compartment A, but not compartment B, was preferentially demethylated on the paternal genome (Fig. 4d,e). This compartment-specific demethylation also explains the differential background methylation levels near active and inactive genes on the paternal genome in preimplantation embryos (Fig. 2a, Supplementary Fig. 7d). To determine whether such demethylation depends on TET3, a methylcytosine oxidase that preferentially demethylates the paternal genome33, we analyzed a published methylome comparing wild-type and Tet3-knockout zygotes34. Although TET3 indeed showed a preference for compartment A (Supplementary Fig. 7e), its effect seemed to be moderate, thus indicating the presence of additional regulators for compartment-specific demethylation35. By contrast, the demethylation on the maternal allele seemed to be relatively uniform, enabling the inheritance of an oocyte methylome pattern to blastocysts (Fig. 4d,e). The allele-specific compartment-correlated methylome of ICM was clearly different from those of mESCs (Supplementary Fig. 7f). Methylomes of both primed and naive (cultured in 2i medium) mESCs showed little correlation with chromatin compartment. In sum, these data demonstrate that both demethylation and de novo methylation are associated with chromatin higher-order structure.
Dynamic methylation identifies putative cis-regulatory elements during gastrulation
Although the global methylome is largely established by E6.5 (Fig. 1d), we asked whether dynamic DNA methylation occurs at individual loci after that point. Previously, it was shown that unmethylated regions (UMRs) and low-methylation regions (LMRs) preferentially mark cis-regulatory elements such as promoters and enhancers, respectively36. We therefore sought to identify UMRs and LMRs in early embryos as previously described37. In total, we identified 17,204–17,898 UMRs and 24,039–32,019 LMRs in ectoderm, PS, mesoderm, endoderm, and E6.5 epiblast (Supplementary Table 4). We did not carry out similar analyses in earlier lineages because of the difficulty of LMR/UMR calling in globally hypomethylated genomes. As validation, we found that the locations of UMRs were strongly enriched for promoters and were largely invariant among different lineages (Supplementary Fig. 8a). In contrast, LMRs were much more dynamic, indicating putative enhancers36. Furthermore, large fractions of UMRs and LMRs in E6.5 epiblast (94% and 58%, respectively) overlapped with DNase hypersensitivity sites in mESCs38 (Supplementary Fig. 8b). The epiblast tissue-specific LMRs (tsLMRs) showed hypermethylation in Tet1/Tet2 DKO mutant E6.5 epiblast (Fig. 5a), indicating the involvement of TET proteins in the demethylation of these putative enhancers. Using GREAT analysis39, we found that tsLMRs (Supplementary Table 4) were preferentially located near genes involved in corresponding lineage specification (Fig. 5a). We then determined which regulators may function at LMRs by searching for their DNA motifs in these regions (Fig. 5b). For instance, the motif of POU5F1 was enriched in epiblast, ectoderm, and, to a lesser extent, PS LMRs, whereas SOX2 was enriched mainly in ectoderm LMRs. This is consistent with their expression patterns as determined in this study (Supplementary Fig. 1d) and in previous work40,41,42 (it is worth noting that Pou2f1 and Pou3f1 are also weakly expressed at these stages (data not shown)). In fact, conditional depletion of Pou5f1 in postimplantation embryos leads to deficient cell proliferation in PS41. FOXA1, FOXA2, and GATA4 (the motifs of GATA family members were highly similar; data not shown) were enriched in endoderm tsLMRs, consistent with their pivotal roles in endoderm differentiation43,44,45. Taken together, these results demonstrate that the dynamic DNA methylation at LMRs correlates with lineage identities during gastrulation.
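The overlap statistic reported above (the fraction of UMRs or LMRs that coincide with DNase hypersensitivity sites) can be sketched as follows. This is a simple illustration under stated assumptions: intervals are half-open (start inclusive, end exclusive), any shared base counts as an overlap, and the function names and coordinates are hypothetical. The overlap criterion actually used in this study is not specified in the text.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def fraction_overlapping(regions, sites):
    """Fraction of `regions` overlapping at least one interval in `sites`.

    Both arguments are lists of (chrom, start, end) tuples. A linear scan per
    chromosome is used for clarity; genome-scale inputs would normally call
    for sorted intervals or an interval tree.
    """
    if not regions:
        return 0.0
    sites_by_chrom = {}
    for chrom, start, end in sites:
        sites_by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in regions:
        candidates = sites_by_chrom.get(chrom, [])
        if any(intervals_overlap(start, end, s, e) for s, e in candidates):
            hits += 1
    return hits / len(regions)


# Hypothetical coordinates, for illustration only.
example_lmrs = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
example_dhs = [("chr1", 150, 160), ("chr1", 400, 500)]
example_overlap = fraction_overlapping(example_lmrs, example_dhs)
```

With half-open coordinates, the region ending at position 400 and the site starting at position 400 are adjacent rather than overlapping, so only one of the three example regions is counted.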
Next, we asked whether these LMRs in early lineages are retained in somatic tissues. Using published datasets21,46, we found that tsLMRs from early embryos showed significant overlap with putative enhancers in E14.5 and somatic tissues (Supplementary Fig. 8c). However, the enrichment decreased as development proceeded, suggesting gradual drift of the epigenome. UMRs and LMRs enriched in early embryos but not in somatic tissues (Supplementary Table 5) included those at the promoters of pluripotency genes such as Pou5f1, Nanog, and Tdgf1 (Supplementary Fig. 8d). Distal UMRs and LMRs specific for early embryos were preferentially located near many developmental regulator genes such as Lin28a, Sall4, and Dnmt3b (Supplementary Fig. 8d,e). Taken together, these data demonstrate that dynamic DNA methylation occurs at lineage-specific putative enhancers during gastrulation.
Global lineage methylome patterning does not strictly require implantation
Because de novo methylation is accompanied by the implantation of embryos, we asked whether implantation is required for establishment of the DNA methylome. Notably, mouse embryos can grow through the early stages of organogenesis in vitro47. Thus, we isolated E4.0 embryos in vivo; cultured them in vitro using established protocols47,48; and collected the embryos at days 1, 2 and 4 for STEM-seq and RNA-seq analyses. The in vitro–cultured (IVC) embryos developed more slowly than their in vivo counterparts, retaining a blastocyst-like shape after 2 d of culture (data not shown) and then adopting a postimplantation-embryo-like morphology by day 4 (Supplementary Fig. 9a). Despite the delayed development, de novo methylation occurred in embryos after 1 d of IVC culture (IVC + 1d) and in IVC + 2d embryos (Fig. 5c). For day 4 embryos, we segregated and collected epiblast-like and VE-like tissues (on the basis of morphology) (Methods). Lineage-marker analysis and global transcriptome clustering analysis showed that IVC + 4d epiblast and VE resembled E5.5 epiblast and VE in vivo (Supplementary Fig. 9b,c). In IVC + 4d VE, DNA methylation continued to increase at a relatively steady rate. However, the acquisition of DNA methylation in IVC + 4d epiblast was much faster (Fig. 5c) and was closely accompanied by sharp upregulation of Dnmt3b (Supplementary Fig. 9d). These data indicate the presence of a default and progressive methylation-patterning process that is accelerated by dramatic upregulation of Dnmt genes (especially Dnmt3b) preferentially in epiblast. We noted that the methylation patterns of IVC + 4d epiblast and IVC + 4d VE largely recapitulated those of their in vivo counterparts, both in a chromosome-wide analysis (Supplementary Fig. 10a) and in gene bodies (Supplementary Fig. 10b). We observed compartment-dependent methylation patterns in early-stage IVC embryos as well (especially IVC + 2d embryos) (Supplementary Fig. 10a). 
Notably, compared with their counterparts in vivo (both E5.5 and E6.5), IVC + 4d epiblast and VE showed higher global methylation overall (Fig. 5c). In addition, a detailed analysis showed that aberrant hypermethylation in IVC + 4d epiblast was located preferentially in DMVs and CGIs (Supplementary Fig. 10c), which raises the possibility that these regions are highly sensitive to environmental changes. Taken together, these data indicate that a similar mechanism may govern de novo methylation and lineage-specific methylation patterning both in vivo and in vitro, and that this mechanism does not strictly require implantation.
Lineage segregation during pre- and postimplantation development gives rise to the earliest fate-committed cell types and the founder tissues for complete body development. These events also provide models for studying cell fate determination from naive pluripotency to primed states for differentiation49. However, the transcription circuitry and epigenetic regulation in these processes in vivo remain poorly understood. Here, by using several complementary approaches with carefully dissected early lineages, we obtained a comprehensive view of transcriptome, methylome, and 3D chromatin organization during early lineage specification. Our work identified extensive stage-specific and lineage-specific patterning of DNA methylomes during the initial cell fate commitment. Lineage-specific methylomes were particularly evident for embryonic and extraembryonic tissues. It is tempting to speculate that such differential methylomes may provide an epigenetic barrier not only between embryonic and extraembryonic tissues in the fetus, but also between extraembryonic fetal tissues and maternal tissues. It is likely that methylome patterning is regulated by multiple factors. First, we found that transcription-dependent gene body methylation exists in both embryonic and extraembryonic lineages, which suggests that it is an evolutionarily conserved mechanism50. However, gene body methylation is relatively transient in embryonic tissues, as regions beyond active gene bodies also become methylated eventually, probably as a result of highly active DNMTs. Second, amid global de novo methylation, DMVs were unexpectedly demethylated in epiblast but not in VE, via a process that involves the TET proteins. As DMVs are preferentially located near promoters of developmental genes and transcription factors26, the hypomethylation of DMVs may be essential to maintain the plasticity of developmental regulators for rapid response to signals. 
Finally, both demethylation and de novo methylation in early development were strongly correlated with higher-order chromatin structure. We speculate that higher-order chromatin structure may regulate the accessibility of chromatin to DNMTs and regulators of demethylation, especially when their availability is limited. Importantly, compartment-wide PMDs are also a hallmark of cancer and immortalized cell lines51,52. It would be interesting to investigate whether the presence of PMDs in these cells is also attributable to downregulation of DNMTs. These data show that de novo methylation seems to be a pervasive process regulated by inherited methylation from previous stages, lineage-specific expression of DNA methylation machinery, gene activity, and 3D chromatin organization (Supplementary Fig. 10d). In addition, a recent study reported that the differential methylation patterning between embryonic and extraembryonic tissues is likely driven by WNT and FGF signaling53. Taken together, our results provide an unprecedented view of transcription circuitry and epigenetic landscapes in early lineage specification. Investigation of this molecular architecture and its highly dynamic reprogramming should help researchers decipher the regulatory foundation for initial cell fate commitment and body plan in mammalian development.
For collection of E3.5 and E4.0 tissues, 6-week-old C57BL/6N female mice were injected with pregnant mare serum gonadotropin followed by human chorionic gonadotropin before being mated with DBA/2N male mice. The first day that a vaginal plug was observed was considered as E0.5.
Fertilized embryos were flushed out from the uterus with HEPES-buffered CZB medium at defined times. Immunosurgery was performed as reported previously56 to remove TE and isolate ICM. Briefly, after pronase treatment to remove the zona pellucida, blastocysts were incubated with DMEM containing rabbit anti-mouse serum (1:10) for 30 min and then washed three times in DMEM plus 10% FBS. The resulting embryos were exposed to guinea pig complement (1:5 in DMEM) for 10 min, washed three times, and then pipetted under a microscope to carefully remove TE cells. To collect TE, we manually bisected blastocysts and kept the part opposite the ICM, as described previously57. The derivatives of TE at later stages were not investigated because of the difficulty of cleanly separating them from maternal tissues after embryo implantation.
E5.5–E7.5 tissues were collected via previously described methods17,58. Briefly, female mice were mated naturally, and the first day that the vaginal plug was observed was considered as E0.5. After embryos were dissected from the uterus and decidua, they were transferred into a dish containing DMEM plus 10% FBS, and Reichert’s membrane was removed with syringe needles. Embryonic regions were separated from extraembryonic tissues and transferred into pancreatic and trypsin enzyme solution at room temperature for 2–10 min. For E5.5 and E6.5 embryos, we obtained VE by gently sucking the embryonic part into a capillary pipet two or three times, which detached the VE from the embryo; the rest of the embryonic region was collected as epiblast. To dissect the three germ layers from E7.5 embryos, we first collected the endoderm in the same way as the VE. Next, glass needles were inserted parallel to the PS to cut off both mesoderm wings. Finally, the J-shaped PS was cut off from the lateral side of the ectoderm where the mesoderm attached, and the rest was collected as ectoderm.
Tet1 +/− mice (B6;129S4-Tet1 tm1.1Jae/J) and Tet2 −/− mice (B6;129S-Tet2 tm1.1Iaai/J) were purchased from The Jackson Laboratory. After mating Tet1 +/−;Tet2 −/− heterozygotes, we collected E6.5 epiblast from embryonic regions from Tet1/Tet2 DKO embryos as described above. Extraembryonic regions were used for genotyping.
In vitro culture of mouse embryos was carried out as previously described48,59. Briefly, 6-week-old C57BL/6N female mice were injected with hormone and mated with DBA/2N male mice. E4.0 embryos were flushed out of the uterus with HEPES-buffered CZB medium and cultured for 4 d in a 35-mm Falcon plastic dish that contained 2 ml of CMRL 1066 supplemented with 1 mM glutamine, 1 mM sodium pyruvate, and 20% FBS. As described previously, the embryonic region was cut off with a needle and subjected to trypsin and pancreatic enzyme digestion followed by mechanical dissection to separate epiblast from VE.
STEM-seq library preparation and sequencing
The detailed STEM-seq procedure is described below.
(1) Early lineage samples were lysed with 10 µl of lysis buffer (10 mM Tris-HCl, pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1 mM EDTA, pH 8.0, NP-40 0.5%) and 1 µl of protease K (Roche; 10910000) for at least 3 h at 55 °C. The reaction was then heat-inactivated for 1 h at 72 °C. After lysis, spike-in λ-DNA (Promega; D150A) was added at a mass ratio of 1/200. The reaction (20 µl) was then treated with 1 µl of dsDNA Fragmentase (NEB; M0348AA) for 30 min.
(2) The digested DNA was directly treated with bisulfite conversion reagent in a 140-µl reaction with the EpiTect Fast Bisulfite Conversion Kit (Qiagen; 59824) according to a modified protocol: denature for 8 min at 95 °C, incubate at 60 °C for 25 min, and repeat the procedure.
(3) The converted DNA was subjected to column purification and desulfonation on MinElute DNA spin columns (Qiagen; 59824) with carrier RNA (Qiagen; 59824) according to the manufacturer’s instructions. The purified DNA was eluted in 30 µl of elution buffer.
(4) The converted DNA was then subjected to TELP library preparation as previously described20. Specifically, 28 µl of purified converted DNA was mixed with 1 µl of 1 mM dCTP and 1 µl of ExTaq buffer, incubated at 95 °C for 1 min, and then quickly cooled on ice. Next, 1 µl of terminal transferase (NEB; M0315L) was added to the mix, and the reaction was incubated at 37 °C for 30 min, after which 1 µl of 1 mM dATP was added and the mixture was incubated at 37 °C for 5 min before being inactivated at 80 °C for 20 min. This was followed by DNA extension with a poly-G-containing primer. Specifically, the previous reaction (~30 µl) was added to a 30-µl reaction mixture (6.2 µl of ddH2O, 12 µl of 5 × KAPA buffer A, 5 µl of 2.5 mM dNTP, 6 µl of 2 µM poly-G primer, and 0.8 µl of KAPA2G polymerase (KE5507)). This was followed by a PCR reaction: 95 °C for 3 min, (47 °C for 1 min, 68 °C for 2 min) × 16 cycles, 72 °C for 10 min, and a pause at 4 °C until the next step. The mixture was digested with 2 µl of Exonuclease I (NEB; M0293L) and 6 µl of Exonuclease I buffer at 37 °C for 50 min and was inactivated at 80 °C for 10 min. A one-third volume (23 µl) of 4 × B&W buffer (40 mM Tris-HCl, pH 8.0, 2 mM EDTA, 4 M NaCl) was then added to the 69 µl of digested mixture. Each reaction system was supplemented with 10 µl of prewashed streptavidin beads (prewashed with 1 × B&W buffer three times and resuspended with 10 µl of 1 × B&W buffer). The mixture with beads was mixed in a Thermomixer (Eppendorf) at 1,400 r.p.m. (5 s on, 10 s off) at 23 °C for 30 min, and beads were washed once with 120 µl of 1 × B&W buffer and three times with 120 µl of EBT buffer (10 mM Tris-HCl, pH 8.0, 0.02% Triton X-100). The washed beads were resuspended with 20 µl of ligation reaction, including 8.4 µl of EB buffer (10 mM Tris-HCl, pH 8.0), 0.6 µl of 10 µM TA adaptor, 10 µl of 2 × Quick ligase buffer, and 1 µl of Quick ligase (NEB; M2200L).
The ligation mixture was rotated at 4 °C overnight, then moved to room temperature for 10 min and washed with 120 µl of EBT buffer three times. DNA was eluted in 35.5 µl of H2O in a Thermomixer at 66 °C (1,400 r.p.m., 5 s on, 10 s off for 30 min). The eluted 35.5 µl of DNA was added to the 14.5-µl reaction mixture (5 µl of 2.5 mM dNTP, 5 µl of 10 × ExTaq buffer, 0.5 µl of ExTaq (RR006), 2 µl of 20 µM P1_FL and 2 µl of 20 µM index primer). This was followed by a PCR reaction: 95 °C for 3 min, (95 °C for 30 s, 58 °C for 30 s, 72 °C for 30 s) × 12 cycles, 72 °C for 10 min, and a pause at 4 °C until the next step. The resulting libraries were size-selected with AMPure XP according to the manufacturer’s standard protocol and subjected to deep sequencing.
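The volumes quoted in step (4) can be checked with simple arithmetic. The sketch below only re-adds the numbers given in the text; the dictionary and function names are illustrative and are not part of the protocol.

```python
# Component volumes in µl, copied from step (4) of the protocol text.
EXTENSION_MIX = {
    "ddH2O": 6.2,
    "5x KAPA buffer A": 12.0,
    "2.5 mM dNTP": 5.0,
    "2 uM poly-G primer": 6.0,
    "KAPA2G polymerase": 0.8,
}
INDEX_PCR_MIX = {
    "2.5 mM dNTP": 5.0,
    "10x ExTaq buffer": 5.0,
    "ExTaq": 0.5,
    "20 uM P1_FL": 2.0,
    "20 uM index primer": 2.0,
}


def total_volume(mix):
    """Sum the component volumes of a reaction mix (µl)."""
    return round(sum(mix.values()), 2)


def final_fold(stock_fold, stock_volume, sample_volume):
    """Fold concentration of a buffer after adding it to a sample."""
    return stock_fold * stock_volume / (stock_volume + sample_volume)
```

With these numbers the extension mix totals 30 µl, the index PCR mix totals 14.5 µl (50 µl with the 35.5-µl eluate), and 23 µl of 4 × B&W buffer added to 69 µl of sample gives a 1 × final concentration.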
RNA-seq library preparation and sequencing
Total RNAs from various lineages isolated from E5.5–E7.5 embryos were extracted with the RNeasy Plus micro kit (Qiagen; 74034) according to the manufacturer’s protocol. For ICM and TE, cells were directly lysed in hypotonic lysis buffer (Amresco; M334) without RNA extraction. The cDNA libraries were then generated via the Smart-seq2 method60. After reverse transcription reaction with oligo-dT primers and preamplification, cDNAs were sheared by Covaris and subjected to Illumina TruSeq library preparation. All libraries were sequenced on an Illumina HiSeq 1500 according to the manufacturer’s instructions.
Generation of sisHi-C library and sequencing
sisHi-C libraries were produced as described31. Briefly, samples were cross-linked with 1% formaldehyde at room temperature for 10 min. Formaldehyde was quenched with glycine for 10 min at room temperature. After being washed with 1 × PBS, the embryos were lysed on ice and the chromatin was solubilized with 0.5% SDS. The nuclei were digested with MboI at 37 °C overnight. After fill-in with biotin-14-dCTP, the fragments were ligated in a small volume. Reversal of cross-linking, DNA purification, and sonication were done sequentially. The biotin-labeled DNA was then pulled down with Dynabeads MyOne Streptavidin C1 (Life Technologies). The fragments that included a ligation junction were subjected to Illumina library preparation. Fourteen cycles of PCR amplification were performed with ExTaq (Takara), and the products were purified and size-selected with AMPure XP beads. All libraries were sequenced on an Illumina HiSeq 1500 according to the manufacturer’s instructions.
STEM-seq data processing
All STEM-seq datasets were mapped to the mm9 reference genome with BSSeeker2 (ref. 61). Because STEM-seq libraries contain poly C at the ends of reads, we used scripts to remove poly G from the beginning of read 2 for paired-end mapping. Alignments were performed with the following parameters in addition to the default parameters: --bt2-p 8 --XS 0.2,3 --a CCCCCC --m 4. Multi-mapped reads and PCR duplicates were removed. We also removed the reads marked by BSSeeker2 as unconverted (--XS 0.2,3) and reads with mapped region lengths shorter than 30 bp. After validating the reproducibility between replicates, we pooled data from replicates for subsequent analyses.
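The poly-G trimming of read 2 could look roughly like the following. This is a minimal sketch and not the script used in the study; the function name and the handling of the quality string are assumptions.

```python
def trim_leading_poly_g(seq, qual):
    """Remove the run of G bases at the start of a read.

    seq  : read sequence (read 2 of a STEM-seq pair)
    qual : quality string of the same length
    Returns the trimmed (seq, qual) pair.
    """
    n = 0
    while n < len(seq) and seq[n] == "G":
        n += 1
    return seq[n:], qual[n:]
```

A read consisting entirely of G is reduced to an empty string, so a downstream length filter (for example the 30-bp minimum on the mapped region mentioned above) would discard it.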
Quantification of CG and CH methylation
For each CG site, the methylation level was calculated as the total methylated counts (combining Watson and Crick strands) divided by the total counts across all reads covering that CG. Because the CH site is usually asymmetrical, CH methylation was calculated separately for each strand. The bisulfite conversion error rate was subtracted from the CG or CH methylation level. If the methylation value was less than the error rate, the methylation value for that site was set as 0.
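The calculation described above can be written compactly. The sketch below is illustrative: the function names are hypothetical, and returning `None` for uncovered sites is an assumption.

```python
def corrected_level(methylated, total, error_rate):
    """Methylation level with the bisulfite conversion error rate subtracted.

    Values that fall below the error rate are set to 0.
    Returns None when no read covers the site.
    """
    if total == 0:
        return None
    return max(methylated / total - error_rate, 0.0)


def cg_methylation(meth_watson, total_watson, meth_crick, total_crick, error_rate):
    """CG sites are symmetric, so counts from both strands are combined."""
    return corrected_level(meth_watson + meth_crick,
                           total_watson + total_crick,
                           error_rate)


def ch_methylation(methylated, total, error_rate):
    """CH sites are usually asymmetric, so each strand is scored on its own."""
    return corrected_level(methylated, total, error_rate)
```

For example, a CG site with 3 of 5 methylated reads on one strand and 4 of 5 on the other has a raw level of 0.7; with an error rate of 0.01 the reported value is 0.69.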
Allele assignment of sequencing reads
To generate strain-specific genomes by considering SNP information, we downloaded SNP tables for the DBA/2J and C57BL/6N strains from the Sanger Institute Mouse Genome Project. We generated DBA/2J and C57BL/6N genomes by substituting corresponding bases from the mm9 genome. Please note that because we used the DBA/2N strain instead of the DBA/2J strain, we verified the identity of the strain by sequencing its genome. The genomes of DBA/2N and DBA/2J are very similar, and 99.4% of SNPs identified in the DBA/2N strain (compared with the reference genome) were the same as those found in the DBA/2J strain identified by the Sanger Institute.
To minimize the mapping bias introduced by the two parental alleles, we aligned all STEM-seq reads to the genomes of the C57BL and DBA strains separately with BSseeker2.8, using the following parameters: --bt2-p 8 --XS 0.2,3 --a CCCCCC --m 4. SNP information from both reads in the pair was summed and used. If the SNP contained a cytosine, its bisulfite-converted form (T) was also considered. SNPs that became non-informative (i.e., could not be distinguished from the opposite allele) after bisulfite conversion were discarded. When multiple SNPs were present in a read (or a read pair), the parental origin was determined by votes from all SNPs, and the read was assigned to the allele that received at least two-thirds of the total votes.
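The voting rule can be illustrated as follows. This is a simplified sketch rather than the pipeline used here: it treats only reads from the strand on which unmethylated C is read as T (the other strand would need the analogous G-to-A rule), and all function names are hypothetical.

```python
from fractions import Fraction


def observable_bases(base):
    """Bases a genomic base may appear as after bisulfite conversion."""
    return {"C", "T"} if base == "C" else {base}


def snp_is_informative(b6_base, dba_base):
    """A SNP is kept only if the alleles stay distinguishable after conversion."""
    return observable_bases(b6_base).isdisjoint(observable_bases(dba_base))


def assign_read(observations, threshold=Fraction(2, 3)):
    """Assign a read (or read pair) to a parental strain by SNP votes.

    observations: list of (read_base, b6_base, dba_base), one per SNP covered
    by either mate. Returns "C57BL", "DBA", or None if unassigned.
    """
    votes = {"C57BL": 0, "DBA": 0}
    for read_base, b6_base, dba_base in observations:
        if not snp_is_informative(b6_base, dba_base):
            continue  # e.g. a C/T SNP cannot be resolved after conversion
        if read_base in observable_bases(b6_base):
            votes["C57BL"] += 1
        elif read_base in observable_bases(dba_base):
            votes["DBA"] += 1
    total = votes["C57BL"] + votes["DBA"]
    if total == 0:
        return None
    for strain, n in votes.items():
        if Fraction(n, total) >= threshold:
            return strain
    return None
```

Exact fractions are used so that a 2:1 split meets the two-thirds threshold without floating-point rounding.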
RNA-seq data processing
RNA-seq reads were mapped to the mm9 reference genome by TopHat (version 2.0.11)62. Cufflinks (version 2.0.2)62 was used to calculate the gene expression levels, with the refFlat database from the UCSC Genome Browser used as reference.
Hi-C data processing
Sequencing reads were mapped, processed, and iteratively corrected with HiC-Pro as described previously63,64. Briefly, the read pairs were mapped to the mm9 reference genome in a two-step approach with Bowtie2 (ref. 65). Then the invalid read pairs, including dangling ends, self-circle ligations, and duplicates, were discarded. The genome was divided into bins of specific lengths to generate the contact maps. We used 100-kb and 40-kb bins to investigate global chromatin contacts and local domain contacts, respectively. Hi-C interaction heat maps were generated from the normalized interaction maps with HiCPlotter (ref. 66). We carried out A/B compartment segmentation with a 100-kb interaction matrix using a previously described method30. After validating the reproducibility between replicates for each cell type, we pooled data from replicates for subsequent analyses.
Validation of STEM-seq and RNA-seq datasets
To compare MethylC-seq and STEM-seq or to make comparisons between STEM-seq replicates, we calculated the average methylation values for 2-kb bins across the entire genome. Bins that had values in both samples were selected, and the Pearson correlation was calculated between samples or replicates. For RNA-seq samples, the Spearman correlation coefficients were calculated for FPKM values across all genes in the genome between replicates.
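The binned comparison can be sketched as below. It assumes that the bin value is the mean of per-site methylation levels (pooling read counts per bin would be an alternative), and the function names are illustrative.

```python
from math import sqrt


def bin_means(sites, bin_size=2000):
    """Average methylation per genomic bin.

    sites: iterable of (chrom, position, methylation_level).
    Returns {(chrom, bin_index): mean level}; bins without data are absent.
    """
    sums = {}
    for chrom, pos, level in sites:
        key = (chrom, pos // bin_size)
        total, n = sums.get(key, (0.0, 0))
        sums[key] = (total + level, n + 1)
    return {key: total / n for key, (total, n) in sums.items()}


def pearson_shared_bins(a, b):
    """Pearson correlation over bins that have values in both samples."""
    keys = sorted(set(a) & set(b))
    n = len(keys)
    if n < 2:
        return None
    x = [a[k] for k in keys]
    y = [b[k] for k in keys]
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined for a constant vector
    return sxy / sqrt(sxx * syy)
```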
Hierarchical clustering of DNA methylomes
The average methylation value was calculated in a 1-kb window for the entire genome for each tissue/cell type. Hierarchical clustering was done with Cluster 3.0 (ref. 67) with the parameter --e 2 (Pearson correlation). Java Treeview was used to visualize the clustering result. The methylomes of somatic tissues were obtained from a previous study21.
Identification of differentially methylated CG sites
CG sites covered by at least five reads were selected13. Two-tailed Fisher’s exact test was performed to evaluate the significance of differentially methylated CG sites between two stages. Only CG sites with P < 0.05 and changes in CG methylation levels between two stages greater than 0.2 were identified and used for downstream analyses.
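The site-level test can be sketched in pure Python. This is an illustration under stated assumptions rather than the code used in the study: the two-tailed P value is taken as the sum of all table probabilities no larger than that of the observed table (the common convention), and the function names are hypothetical.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


def is_differential_cg(meth1, total1, meth2, total2,
                       min_cov=5, alpha=0.05, min_delta=0.2):
    """Apply the coverage, P value, and methylation-difference cutoffs."""
    if total1 < min_cov or total2 < min_cov:
        return False
    delta = abs(meth1 / total1 - meth2 / total2)
    p = fisher_exact_two_tailed(meth1, total1 - meth1, meth2, total2 - meth2)
    return p < alpha and delta > min_delta
```

For instance, 8 of 10 methylated reads at one stage versus 1 of 6 at the other gives P ≈ 0.035 and a difference of about 0.63, so the site passes both cutoffs.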
CH methylation motif analysis
The CH sites covered by at least ten reads were sorted by their methylation levels. The top 5,000 sites were selected, and sequences within ± 5 bp around the CH sites were subjected to a motif analysis with WebLogo 3.0 (ref. 68).
Identification of differentially methylated promoters between E3.5 ICM and TE, and between E6.5 Epi and VE
First, we calculated the methylation level of each gene promoter (within 500 bp of the transcription start site) in the two samples. Genes whose promoter methylation level was greater than 0.35 in one sample and at least twofold greater than that in the other sample were identified as having differentially methylated promoters.
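The promoter rule amounts to two comparisons per gene. In the sketch below the function name is hypothetical, and reading "twofold greater" as a ratio of at least 2 is an assumption.

```python
def differential_promoter(level_a, level_b, min_level=0.35, min_fold=2.0):
    """Classify one promoter from its methylation level in two samples.

    Returns "A" or "B" for the sample in which the promoter is hypermethylated,
    or None if the promoter does not meet the cutoffs.
    """
    if level_a > min_level and level_a >= min_fold * level_b:
        return "A"
    if level_b > min_level and level_b >= min_fold * level_a:
        return "B"
    return None
```

Writing the fold test as a product rather than a ratio avoids division by zero when a promoter is completely unmethylated in one sample.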
Analysis of differentially methylated regions between VE and epiblast
We first identified differentially methylated CG sites between VE and epiblasts as described above. Then we identified differentially methylated bins (2-kb) containing at least three differentially methylated CG sites. These bins were further merged into differentially methylated regions if they were no more than 2 kb away. To determine the genomic distribution of hypermethylated regions, we segmented the genome into transcription start sites, exons, introns, transcription end sites, and intergenic regions using annotations combining the RefSeq, UCSC Known Gene, Ensembl, and GENCODE databases. To assess the enrichment of hypermethylated regions in a given category, we generated a set of random regions with lengths equal to those of each individual hypermethylated region. The numbers of regions that fell into each category were calculated, and the enrichment was computed as the log ratio of the observed number to that for random regions.
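The binning and merging steps can be sketched as follows. The use of fixed, non-overlapping 2-kb bins and half-open coordinates is an assumption, and the function name is illustrative.

```python
def call_dmrs(dm_sites, bin_size=2000, min_sites=3, max_gap=2000):
    """Merge bins rich in differentially methylated CGs into regions.

    dm_sites: iterable of (chrom, position) for differentially methylated CGs.
    Returns a sorted list of (chrom, start, end) with half-open coordinates.
    """
    counts = {}
    for chrom, pos in dm_sites:
        key = (chrom, pos // bin_size)
        counts[key] = counts.get(key, 0) + 1

    # Keep bins with enough differentially methylated sites.
    bins = sorted((chrom, idx * bin_size, (idx + 1) * bin_size)
                  for (chrom, idx), n in counts.items() if n >= min_sites)

    # Join bins on the same chromosome separated by no more than max_gap.
    regions = []
    for chrom, start, end in bins:
        if regions and regions[-1][0] == chrom and start - regions[-1][2] <= max_gap:
            regions[-1] = (chrom, regions[-1][1], end)
        else:
            regions.append((chrom, start, end))
    return regions
```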
Analysis of DMVs
Identification of gene-dense regions and gene deserts
The genome was split into 1-Mb bins, and genes located in each bin were counted. Gene-dense or gene-desert regions were identified as those with more than ten genes or no more than one gene in each bin, respectively.
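A minimal sketch of this classification is given below. It assumes that each gene is assigned to a bin by a single coordinate (for example its start position); the text does not state how genes spanning a bin boundary were handled, and the names used are hypothetical.

```python
def classify_gene_density(gene_starts, chrom_sizes, bin_size=1_000_000):
    """Label each 1-Mb bin as gene-dense, gene-desert, or intermediate.

    gene_starts: iterable of (chrom, position), one entry per gene.
    chrom_sizes: {chrom: length in bp}, needed so that empty bins are counted.
    """
    counts = {}
    for chrom, pos in gene_starts:
        key = (chrom, pos // bin_size)
        counts[key] = counts.get(key, 0) + 1

    labels = {}
    for chrom, size in chrom_sizes.items():
        n_bins = -(-size // bin_size)  # ceiling division
        for idx in range(n_bins):
            n = counts.get((chrom, idx), 0)
            if n > 10:          # more than ten genes
                labels[(chrom, idx)] = "dense"
            elif n <= 1:        # no more than one gene
                labels[(chrom, idx)] = "desert"
            else:
                labels[(chrom, idx)] = "intermediate"
    return labels
```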
Identification of PMDs and HMDs for VE
The PMDs and HMDs in VE were identified as previously described52. We calculated the average methylation level for each 10-kb bin, and included only bins with at least 20 CpGs. Because of the different global methylation levels for different cell types, we used different cutoffs for PMD and HMD identification. Specifically, hypomethylated bins (mCG/CG ≤ 0.3 for E5.5 VE and mCG/CG ≤ 0.4 for E6.5 VE) and hypermethylated bins (mCG/CG ≥ 0.6 for E5.5 VE and mCG/CG ≥ 0.7 for E6.5 VE) were identified and merged into PMDs and HMDs, respectively. We also excluded the promoter regions (± 2.5 kb) for PMDs.
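The stage-specific thresholding can be sketched as follows. Merging is assumed to join only directly adjacent bins with the same label, and the promoter exclusion step is omitted; the names are illustrative.

```python
# (hypomethylated cutoff, hypermethylated cutoff) per sample, from the text.
CUTOFFS = {"E5.5 VE": (0.3, 0.6), "E6.5 VE": (0.4, 0.7)}


def call_domains(bins, stage, min_cpgs=20):
    """Call PMDs and HMDs from 10-kb bins.

    bins: list of (chrom, start, end, n_cpgs, mean_mCG), sorted by position.
    Returns {"PMD": [(chrom, start, end), ...], "HMD": [...]}.
    """
    low, high = CUTOFFS[stage]
    domains = {"PMD": [], "HMD": []}
    prev_label = prev_chrom = prev_end = None
    for chrom, start, end, n_cpgs, level in bins:
        label = None
        if n_cpgs >= min_cpgs:  # bins with too few CpGs are not scored
            if level <= low:
                label = "PMD"
            elif level >= high:
                label = "HMD"
        if label is not None:
            regions = domains[label]
            if label == prev_label and chrom == prev_chrom and start == prev_end:
                regions[-1] = (chrom, regions[-1][1], end)  # extend the domain
            else:
                regions.append((chrom, start, end))
        prev_label, prev_chrom, prev_end = label, chrom, end
    return domains
```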
Identification of allelically expressed genes
To minimize the mapping bias introduced by the sequence differences between the two parental alleles, we aligned all sequencing reads to the genomes of the C57BL/6N and DBA/2J strains (mm9) separately. We examined all SNPs with high-quality base-calling (Phred score ≥ 30) and assigned each read to its parental origin. Only SNP information from both paired reads was retained. If multiple SNPs were present in a read, we assigned the read to the parental origin that received at least two-thirds of the total votes from all SNPs. The assigned reads mapped to exons were quantified by HTSeq-count (ref. 70). Allele-specific genes were identified on the basis of at least threefold change between the numbers of reads assigned to maternal or paternal alleles with P < 10⁻³.
Identification of topologically associating domains
We used a directionality index and a hidden Markov model (HMM) to identify topologically associating domains (TADs) as previously described32. We used a 40-kb bin resolution and 2-Mb window size to calculate the directionality index score. We defined TAD boundaries as the middle bin (40 kb) between two consecutive TADs identified by HMM with distances of no more than 400 kb.
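For reference, the directionality index of the cited method (ref. 32) compares contacts made upstream (A) and downstream (B) of each bin, with E = (A + B)/2 and DI = sign(B − A) × [(A − E)²/E + (B − E)²/E]. The sketch below computes this score only; the HMM step is not shown, and the function name and the list-of-lists matrix format are assumptions.

```python
def directionality_index(matrix, window_bins=50):
    """Directionality index for each bin of one chromosome.

    matrix: square contact matrix (list of lists) at 40-kb resolution.
    window_bins: bins on each side; 2 Mb / 40 kb = 50.
    """
    n = len(matrix)
    scores = []
    for i in range(n):
        upstream = sum(matrix[i][max(0, i - window_bins):i])        # A
        downstream = sum(matrix[i][i + 1:i + 1 + window_bins])      # B
        expected = (upstream + downstream) / 2                      # E
        if expected == 0 or upstream == downstream:
            scores.append(0.0)  # no contacts, or no directional bias
            continue
        sign = 1.0 if downstream > upstream else -1.0
        scores.append(sign * ((upstream - expected) ** 2 / expected
                              + (downstream - expected) ** 2 / expected))
    return scores
```

Strongly negative scores followed by strongly positive scores mark the transition from upstream-biased to downstream-biased contacts, which is the signal the HMM uses to place domain boundaries.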
Identification of compartments A and B
Compartments A and B were identified as described previously30, with several modifications. For each stage, we used normalized 100-kb interaction matrices in this analysis. Bins that had no interactions with any other bins were removed, and the expected interaction matrices were generated via a previously described window sliding approach71 (bin size, 400 kb; step size, 100 kb). The resulting correlation matrices were subjected to principal component analysis. Principal component 1 of the correlation matrix and the gene density of genome mm9 were used to generate compartments A and B.
P(s) curve analysis
The P(s) curve was calculated as previously described72, using 100-kb-resolution normalized interaction matrices. First, we used 1.15 as an increasing factor to generate logarithmically spaced bins (100 kb, 100 kb × 1.15, 100 kb × 1.15², and so on). Next, for each bin we counted all the numbers of interactions in the corresponding distances. To calculate the probability (P(s)), we divided the total numbers of interactions generated in the last step for each bin by the total number of possible region pairs. Finally, the P(s) values were normalized to enable the sum over the range of the distances to be 1.
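The four steps above can be sketched for a single chromosome as follows. The sparse dictionary input and the function name are assumptions made for illustration.

```python
from bisect import bisect_right


def ps_curve(contacts, n_bins, resolution=100_000, factor=1.15):
    """Contact probability as a function of genomic distance.

    contacts: {(i, j): normalized count} for bin pairs on one chromosome.
    n_bins: number of bins on the chromosome.
    Returns a list of (distance_start, distance_end, P) per distance bin.
    """
    # Step 1: logarithmically spaced distance edges starting at one bin width.
    max_dist = (n_bins - 1) * resolution
    edges = [float(resolution)]
    while edges[-1] <= max_dist:
        edges.append(edges[-1] * factor)
    n_dist = len(edges) - 1

    # Step 2: sum interactions falling into each distance bin.
    observed = [0.0] * n_dist
    for (i, j), count in contacts.items():
        sep = abs(j - i)
        if sep > 0:
            observed[bisect_right(edges, sep * resolution) - 1] += count

    # Step 3: divide by the number of possible region pairs at those distances.
    possible = [0] * n_dist
    for sep in range(1, n_bins):
        possible[bisect_right(edges, sep * resolution) - 1] += n_bins - sep
    raw = [o / p if p else 0.0 for o, p in zip(observed, possible)]

    # Step 4: normalize so that the values sum to 1.
    total = sum(raw)
    return [(edges[k], edges[k + 1], raw[k] / total if total else 0.0)
            for k in range(n_dist)]
```

At 100-kb resolution the shortest distance bins are narrower than one matrix bin, so some of them contain no possible pairs and are reported as 0.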
Analyses of LMRs, UMRs and tissue-specific LMRs
The methylomes of E6.5 epiblast and E7.5 germ layer samples were segmented with an HMM as previously described36,37. UMRs, LMRs, and fully methylated regions were identified accordingly. LMRs that were unique to a lineage were identified as tsLMRs. The functional enrichment for genes near tsLMRs was analyzed with the GREAT tool39. HOMER55 was used to identify potential transcription factor motifs in LMRs.
Identification of early embryo enriched UMRs/LMRs and their predicted target genes
We first combined all UMRs and LMRs identified in the five early lineages (E6.5 epiblast, E7.5 ectoderm, E7.5 PS, E7.5 mesoderm, and E7.5 endoderm). We then selected those regions with lower methylation levels in early lineages (average mCG/CG ≤ 0.4) and higher methylation levels (mCG/CG ≥ 0.5) in at least two-thirds of total somatic tissues (≥8). Regions overlapping annotated promoters (RefSeq) (within 2.5 kb) were identified as promoter UMRs/LMRs, and the rest were classified as distal UMRs/LMRs. To identify the possible gene targets of distal UMRs/LMRs, we examined all genes within 200 kb of each UMR/LMR and calculated the Spearman correlation between methylation values and expression levels for each UMR/LMR–gene pair across all early lineages and somatic tissues. UMR/LMR–gene pairs that showed strong negative correlation (R < –0.4) were selected for downstream analysis as previously described73.
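The pairing of distal UMRs/LMRs with candidate genes can be illustrated with a pure-Python Spearman correlation. The sketch assumes that the candidate genes passed in have already been restricted to those within 200 kb of the region; the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return None  # undefined when one variable is constant
    return sxy / sqrt(sxx * syy)


def putative_targets(region_methylation, expression_by_gene, cutoff=-0.4):
    """Genes whose expression is anticorrelated with methylation of a region.

    Both inputs list values across the same samples, in the same order.
    """
    targets = []
    for gene, expression in expression_by_gene.items():
        rho = spearman(region_methylation, expression)
        if rho is not None and rho < cutoff:
            targets.append(gene)
    return sorted(targets)
```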
Life Sciences Reporting Summary
Further information on experimental design is available in the Life Sciences Reporting Summary.
All sequencing data, including the STEM-seq, MethylC-seq, RNA-seq, and sisHi-C datasets, are available through the Gene Expression Omnibus (GEO) under accession GSE76505.
Rossant, J. & Tam, P. P. Emerging asymmetry and embryonic patterning in early mouse development. Dev. Cell 7, 155–164 (2004).
Zernicka-Goetz, M., Morris, S. A. & Bruce, A. W. Making a firm decision: multifaceted regulation of cell fate in the early mouse embryo. Nat. Rev. Genet. 10, 467–477 (2009).
Rossant, J. & Tam, P. P. Blastocyst lineage formation, early embryonic asymmetries and axis patterning in the mouse. Development 136, 701–713 (2009).
Bielinska, M., Narita, N. & Wilson, D. B. Distinct roles for visceral endoderm during embryonic mouse development. Int. J. Dev. Biol. 43, 183–205 (1999).
Arnold, S. J. & Robertson, E. J. Making a commitment: cell lineage allocation and axis patterning in the early mouse embryo. Nat. Rev. Mol. Cell Biol. 10, 91–103 (2009).
Lawson, K. A., Meneses, J. J. & Pedersen, R. A. Clonal analysis of epiblast fate during germ layer formation in the mouse embryo. Development 113, 891–911 (1991).
Smith, Z. D. & Meissner, A. DNA methylation: roles in mammalian development. Nat. Rev. Genet. 14, 204–220 (2013).
Bird, A. DNA methylation patterns and epigenetic memory. Genes Dev. 16, 6–21 (2002).
Bourc’his, D., Xu, G. L., Lin, C. S., Bollman, B. & Bestor, T. H. Dnmt3L and the establishment of maternal genomic imprints. Science 294, 2536–2539 (2001).
Branco, M. R. et al. Maternal DNA methylation regulates early trophoblast development. Dev. Cell 36, 152–163 (2016).
McGraw, S. et al. Loss of DNMT1o disrupts imprinted X chromosome inactivation and accentuates placental defects in females. PLoS Genet. 9, e1003873 (2013).
Smith, Z. D. et al. A unique regulatory phase of DNA methylation in the early mammalian embryo. Nature 484, 339–344 (2012).
Wang, L. et al. Programming and inheritance of parental DNA methylomes in mammals. Cell 157, 979–991 (2014).
Guo, H. et al. The DNA methylation landscape of human early embryos. Nature 511, 606–610 (2014).
Smith, Z. D. et al. DNA methylation dynamics of the human preimplantation embryo. Nature 511, 611–615 (2014).
Gao, F. et al. De novo DNA methylation during monkey pre-implantation embryogenesis. Cell Res. 27, 526–539 (2017).
Nagy, A., Gertsenstein, M., Vintersten, K. & Behringer, R. Separating postimplantation germ layers. CSH Protoc. http://dx.doi.org/10.1101/pdb.prot4368 (2006).
Beddington, R. S. P. Isolation, culture and manipulation of post-implantation mouse embryos. In Mammalian Development: A Practical Approach (ed. Monk, M.) 43–69 (IRL Press, Oxford, UK, 1987).
Kwon, G. S., Viotti, M. & Hadjantonakis, A. K. The endoderm of the mouse embryo arises by dynamic widespread intercalation of embryonic and extraembryonic lineages. Dev. Cell 15, 509–520 (2008).
Peng, X. et al. TELP, a sensitive and versatile library construction method for next-generation sequencing. Nucleic Acids Res. 43, e35 (2015).
Hon, G. C. et al. Epigenetic memory at embryonic enhancers identified in DNA methylation maps from adult mouse tissues. Nat. Genet. 45, 1198–1206 (2013).
Habibi, E. et al. Whole-genome bisulfite sequencing of two distinct interconvertible DNA methylomes of mouse embryonic stem cells. Cell Stem Cell 13, 360–369 (2013).
Hu, Y. G. et al. Regulation of DNA methylation activity through Dnmt3L promoter methylation by Dnmt3 enzymes in embryonic development. Hum. Mol. Genet. 17, 2654–2664 (2008).
He, Y. & Ecker, J. R. Non-CG methylation in the human genome. Annu. Rev. Genomics Hum. Genet. 16, 55–77 (2015).
Pastor, W. A. et al. Naive human pluripotent cells feature a methylation landscape devoid of blastocyst or germline memory. Cell Stem Cell 18, 323–329 (2016).
Xie, W. et al. Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell 153, 1134–1148 (2013).
Jeong, M. et al. Large conserved domains of low DNA methylation maintained by Dnmt3a. Nat. Genet. 46, 17–23 (2014).
Auclair, G., Guibert, S., Bender, A. & Weber, M. Ontogeny of CpG island methylation and specificity of DNMT3 methyltransferases during embryonic development in the mouse. Genome Biol. 15, 545 (2014).
Schroeder, D. I. et al. Early developmental and evolutionary origins of gene body DNA methylation patterns in mammalian placentas. PLoS Genet. 11, e1005442 (2015).
Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
Du, Z. et al. Allelic reprogramming of 3D chromatin architecture during early mammalian development. Nature 547, 232–235 (2017).
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
Gu, T. P. et al. The role of Tet3 DNA dioxygenase in epigenetic reprogramming by oocytes. Nature 477, 606–610 (2011).
Peat, J. R. et al. Genome-wide bisulfite sequencing in zygotes identifies demethylation targets and maps the contribution of TET3 oxidation. Cell Reports 9, 1990–2000 (2014).
Amouroux, R. et al. De novo DNA methylation drives 5hmC accumulation in mouse zygotes. Nat. Cell Biol. 18, 225–233 (2016).
Stadler, M. B. et al. DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature 480, 490–495 (2011).
Burger, L., Gaidatzis, D., Schübeler, D. & Stadler, M. B. Identification of active regulatory regions from DNA methylation data. Nucleic Acids Res. 41, e155 (2013).
Vierstra, J. et al. Mouse regulatory DNA landscapes reveal global principles of cis-regulatory evolution. Science 346, 1007–1012 (2014).
McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).
Schöler, H. R., Dressler, G. R., Balling, R., Rohdewohld, H. & Gruss, P. Oct-4: a germline-specific transcription factor mapping to the mouse t-complex. EMBO J. 9, 2185–2195 (1990).
DeVeale, B. et al. Oct4 is required ~E7.5 for proliferation in the primitive streak. PLoS Genet. 9, e1003957 (2013).
Iwafuchi-Doi, M. et al. Transcriptional regulatory networks in epiblast cells and during anterior neural plate development as modeled in epiblast stem cells. Development 139, 3926–3937 (2012).
Ang, S. L. et al. The formation and maintenance of the definitive endoderm lineage in the mouse: involvement of HNF3/forkhead proteins. Development 119, 1301–1315 (1993).
Bossard, P. & Zaret, K. S. GATA transcription factors as potentiators of gut endoderm differentiation. Development 125, 4909–4917 (1998).
Kuo, C. T. et al. GATA4 transcription factor is required for ventral morphogenesis and heart tube formation. Genes Dev. 11, 1048–1060 (1997).
Shen, Y. et al. A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116–120 (2012).
Libbus, B. L. & Hsu, Y. C. Sequential development and tissue organization in whole mouse embryos cultured from blastocyst to early somite stage. Anat. Rec. 197, 317–329 (1980).
Morris, S. A. et al. Dynamics of anterior-posterior axis formation in the developing mouse embryo. Nat. Commun. 3, 673 (2012).
Kalkan, T. & Smith, A. Mapping the route from naive pluripotency to lineage specification. Philos. Trans. R. Soc. Lond. B Biol. Sci. 369, 20130540 (2014).
Baubec, T. et al. Genomic profiling of DNA methyltransferases reveals a role for DNMT3B in genic methylation. Nature 520, 243–247 (2015).
Berman, B. P. et al. Regions of focal DNA hypermethylation and long-range hypomethylation in colorectal cancer coincide with nuclear lamina-associated domains. Nat. Genet. 44, 40–46 (2011).
Lister, R. et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315–322 (2009).
Smith, Z. D. et al. Epigenetic restriction of extraembryonic lineages mirrors the somatic transition to cancer. Nature 549, 543–547 (2017).
Zylicz, J. J. et al. Chromatin dynamics and the role of G9a in gene regulation and enhancer silencing during early mouse development. eLife 4, e09571 (2015).
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Solter, D. & Knowles, B. B. Immunosurgery of mouse blastocyst. Proc. Natl. Acad. Sci. USA 72, 5099–5102 (1975).
Ohnishi, Y. et al. Cell-to-cell expression variability followed by signal reinforcement progressively segregates early mouse lineages. Nat. Cell Biol. 16, 27–37 (2014).
Harrison, S. M., Dunwoodie, S. L., Arkell, R. M., Lehrach, H. & Beddington, R. S. Isolation of novel tissue-specific genes from cDNA libraries representing the individual tissue constituents of the gastrulating mouse embryo. Development 121, 2479–2489 (1995).
Libbus, B. L. & Hsu, Y. C. Changes in S-phase associated with differentiation of mouse embryos in culture from blastocyst to early somite stage. Anat. Embryol. (Berl.) 159, 235–244 (1980).
Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).
Guo, W. et al. BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data. BMC Genomics 14, 774 (2013).
Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012).
Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259 (2015).
Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003 (2012).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Akdemir, K. C. & Chin, L. HiCPlotter integrates genomic data with interaction matrices. Genome Biol. 16, 198 (2015).
de Hoon, M. J., Imoto, S., Nolan, J. & Miyano, S. Open source clustering software. Bioinformatics 20, 1453–1454 (2004).
Crooks, G. E., Hon, G., Chandonia, J. M. & Brenner, S. E. WebLogo: a sequence logo generator. Genome Res. 14, 1188–1190 (2004).
Huang, W., Sherman, B. T. & Lempicki, R. A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37, 1–13 (2009).
Anders, S., Pyl, P. T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).
Dixon, J. R. et al. Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331–336 (2015).
Naumova, N. et al. Organization of the mitotic chromosome. Science 342, 948–953 (2013).
Bell, R. E. et al. Enhancer methylation dynamics contribute to cancer plasticity and patient mortality. Genome Res. 26, 601–611 (2016).
We are grateful to members of the Xie laboratory for helpful comments during preparation of the manuscript. We thank J. Na for critical reading of the manuscript. This work was supported by the National Key R&D Program of China (2016YFC0900301 to W. Xie; 2017YFC1001401 to L.L.), the National Basic Research Program of China (2015CB856201 to W. Xie), the National Natural Science Foundation of China (31422031 to W. Xie), the THU-PKU Center for Life Sciences (W. Xie), Beijing Advanced Innovation Center for Structural Biology (W. Xie), and the Biomedical Research Council of A*STAR (Agency for Science, Technology and Research), Singapore (F.X.). W. Xie is a Howard Hughes Medical Institute (HHMI) International Research Scholar. J.W. was funded by grants from the NIH (R01GM095942 and R21HD087722) and the Empire State Stem Cell Fund through the New York State Department of Health (NYSTEM) (C028103 and C028121), and is a recipient of an Irma T. Hirschl and Weill-Caulier Trusts Career Scientist Award.
Integrated Supplementary Information
Supplementary Figure 1 Transcriptome profiling for early lineages during peri- and postimplantation development.
a Schematic showing tissue dissection of early embryos from E3.5 to E7.5 (see Methods). b The correlation of gene expression levels across the genome between biological replicates for RNA-seq samples generated in this study. c The expression of various lineage marker genes is shown for dissected tissues at each developmental stage as determined by RNA-seq. Error bars denote the standard deviation of FPKM values from biological replicates. d Heatmap showing the expression levels of various marker genes in each tissue isolated from E3.5 to E7.5 embryos based on RNA-seq. Because expression levels vary widely among genes, the expression of each gene was normalized by setting the FPKM in the lineage with the highest expression to 10. e Hierarchical clustering analysis of global gene expression levels (with replicates) based on the RNA-seq data for tissues isolated from early embryos
a Schematic of STEM-seq procedure. Genomic DNA (or lysed cells) is first subjected to bisulfite conversion, followed by sequencing library preparation using TELP, a highly sensitive single-strand DNA amplification and library preparation method. Briefly, the purified converted DNA is tailed by poly C, followed by extension with a biotinylated poly-G-containing primer. The extension product is ligated to an adaptor followed by PCR amplification for sequencing library preparation (see Methods for details). b A UCSC genome browser snapshot shows comparison of mESC methylomes determined by STEM-seq and MethylC-seq using various amounts of DNA or cells in two mESC lines (TT2 and R1) near the Hoxa gene cluster. CpG islands (CGIs) in this region are also shown. c Scatterplots show the comparison of mESC methylation levels between those determined by STEM-seq and MethylC-seq (2-kb bins), or between replicates of STEM-seq data across the whole genome. Pearson’s correlation coefficients are also shown. d Comparison of average CG methylation levels in mESCs at different genomic elements between those determined by STEM-seq using 10 ng or 100 ng DNA and MethylC-seq. e A similar plot as d for STEM-seq using 500 mESCs with two replicates. Please note that different mESC lines were unintentionally used in d (TT2) and e (R1)
a Scatterplots comparing biological replicates (2kb bin, across the whole genome) for lineage methylomes. Dashed lines indicate mCG difference = 0.2. b The plot showing the percentages of CG sites covered by various numbers of STEM-seq reads for each lineage. c The percentages of CG sites (≥5x) across various types of genomic elements for methylome datasets in this study. d The promoter methylation and gene expression levels for Hnf4a (VE marker) and Oct4, Tdgf1 (epiblast markers). e The promoter methylation and gene expression levels for E3.5 ICM and E3.5 TE specific expressed genes. f The dynamics of average DNA methylation levels across different classes of genomic elements for lineages from E3.5 to E7.5. g Barcharts showing the expression levels of Dnmts in early embryos from E3.5 to E7.5. h The promoter methylation and expression levels for Dnmt3l in development are shown
a Barcharts showing the percentages of reads that were assigned to the maternal or the paternal genome in each tissue. Only reads that contain SNPs were counted. b Heatmaps showing allelic DNA methylation levels at imprinting control regions. Only regions that are covered by sufficient SNPs were included for analysis. Gray (marked by asterisks) indicates stages with no or insufficient allelic reads
Supplementary Figure 5 Dynamic DNA methylation at promoters and DNA methylation valleys during lineage specification.
a Heatmap showing the promoter methylation (left) and expression (right) levels for genes that are differentially methylated between E6.5 VE and E6.5 Epi but are silenced in both lineages. b The GO analysis result for all genes located in DMVs identified in E6.5 Epi. c Barcharts showing the percentages of DMVs identified in early embryo (combining five lineages) that are marked by H3K27me3 (using data from a panel of somatic cells). d Barcharts showing the enrichment (logratio of observed/expected) of E5.5 Epi hypermethylated regions (vs. E6.5 Epi) in various classes of genomic elements. A set of random regions with equal lengths of individual hypermethylated regions were used as controls. e The boxplot showing the methylation levels in all CGIs and the non-CGI regions in DMVs in E5.5 and E6.5 embryos. f Barcharts showing the percentages of hypermethylated CGIs in E6.5 VE that fall into DMVs identified in early embryos (combining five lineages). The percentage of all CGIs that are located in DMVs (background) is also shown. g The barplot shows the expression of all Tet genes in early development
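The observed/expected enrichment used in panel d can be sketched as below. This is a simplified illustration under stated assumptions: intervals are half-open `(start, end)` pairs on one chromosome, the logarithm is taken as base 2 (the legend does not state the base), and the function names and coordinates are hypothetical. The control set is given the same lengths as the test regions, as the legend describes.

```python
import math


def overlaps(region, element):
    """True if two half-open (start, end) intervals overlap."""
    return region[0] < element[1] and element[0] < region[1]


def count_overlapping(regions, elements):
    """Number of regions that overlap at least one element."""
    return sum(any(overlaps(r, e) for e in elements) for r in regions)


def log_enrichment(observed, expected):
    """log2 of observed over expected overlap counts (base 2 is an assumption)."""
    return math.log2(observed / expected)


# Hypothetical genomic elements of one class (for example, CGIs).
elements = [(100, 200), (500, 600)]
# Hypothetical hypermethylated regions, and length-matched control regions.
hyper = [(150, 250), (550, 650), (580, 590), (900, 950)]
control = [(0, 100), (190, 290), (700, 710), (900, 950)]

observed = count_overlapping(hyper, elements)
expected = count_overlapping(control, elements)
score = log_enrichment(observed, expected)
```

In practice the expected count would be averaged over many random region sets rather than one, and a pseudocount would be needed when a control set has no overlaps.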
Supplementary Figure 6 Hi-C analysis for early lineages during peri- and postimplantation development.
a The Directionality Index (DI) tracks for E3.5 ICM, E6.5 epiblast, E6.5 VE and E7.5 ectoderm are shown. b The scatterplots comparing the DI values between E6.5 epiblast and other lineages, or between E6.5 epiblast replicates. The correlation coefficients (Spearman) are also shown. c The P(s) curves (chromatin contact frequency vs. genomic distance) for mESC and early lineages are shown
Supplementary Figure 7 Both de novo methylation and demethylation are correlated with chromatin compartments.
a Barcharts showing the average methylation levels gained from E4.0 ICM to E5.5 Epi or E5.5 VE (left) in active gene bodies (+), inactive gene bodies (-), intergenic regions (i), and the whole compartment (w), for either compartment A or compartment B (left). b A chromosome-wide view of CH methylation levels (1Mb bin) is shown for E5.5 Epi and E5.5 VE (top). Chromatin compartments in E3.5 ICM, E6.5 Epi, E6.5 VE and gene-dense regions are also shown (bottom). Arrows indicate hypomethylated regions that overlap with compartment B. c Scatterplots comparing CG and CH methylation levels (1Mb bin) between epiblast and VE at E5.5 and E6.5. The Pearson correlation coefficients are indicated. d The average allelic methylation levels near active (green) or silenced (black) genes for E3.5 ICM are shown, before (top) or after (bottom) TAD-based normalization (subtracting TAD background methylation levels for each gene). The background methylation level for each TAD was calculated by averaging DNA methylation levels across the TAD (excluding gene bodies and regulatory elements such as promoters and putative enhancers). e Barcharts showing the average methylation differences between wild type and Tet3 knockout zygotes in active gene bodies, inactive gene bodies, intergenic regions and the whole compartment for either compartment A or B. f A chromosome-wide view of DNA methylation levels (1Mb bin) is shown for mESC (serum and 2i). Chromatin compartments and gene-dense regions are also shown
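The TAD-based normalization in panel d can be sketched as follows. This is a minimal illustration of the subtraction the legend describes, not the authors' pipeline: the bin representation, the function names, and the methylation values are hypothetical.

```python
def tad_background(bins):
    """Mean methylation over the bins of one TAD.

    `bins` is a list of (methylation_level, excluded) pairs, where
    `excluded` marks bins in gene bodies or regulatory elements
    (promoters, putative enhancers) that are left out of the average.
    """
    kept = [level for level, excluded in bins if not excluded]
    return sum(kept) / len(kept)


def normalize_gene(gene_level, background):
    """Subtract the TAD background from a gene's methylation level."""
    return gene_level - background


# Hypothetical TAD: two intergenic bins and one excluded gene-body bin.
tad_bins = [(0.25, False), (0.5, False), (0.9, True)]
background = tad_background(tad_bins)
normalized = normalize_gene(0.5, background)
```

Subtracting the local background separates gene-level methylation differences from the megabase-scale differences between chromatin domains.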
a Heatmap showing the pairwise overlap of UMRs or LMRs among individual lineages. Tissues with global hypomethylation were excluded from the analysis. The percentages of UMRs that overlap with annotated TSSs are also shown. b Venn diagram showing the overlap of identified LMRs/UMRs in E6.5 Epi with DHS sites in mESC identified by ENCODE. c Heatmap showing the overlap of tissue-specific LMRs (tsLMRs) and putative enhancers previously defined in various tissues using histone modification signatures. The enrichment was calculated as logratio of observed overlap divided by expected overlap using a random set of regions with equal lengths of individual tsLMRs. d Heatmap showing the average methylation levels of early embryo-specific UMRs/LMRs (left) and associated gene expression (right) between early embryonic tissues (average of E6.5 Epi, Ect, PS, Mes, End) and somatic tissues (average of 11 tissues). Representative genes associated with UMRs/LMRs are shown on the right. e The snapshot showing methylation levels near Dnmt3b in oocyte, early embryos, somatic tissues, and mESCs (left). The expression for Dnmt3b in each cell type is also shown as heatmap (right). The shade indicates early embryo specific UMRs/LMRs. The DNaseI hypersensitive sites in mESCs are also shown
a The embryos were collected at E4.0 and were cultured in vitro for 4 days. Epiblast-like and VE-like tissues were dissected from in vitro cultured (IVC) embryos. b Barcharts showing the expression of marker genes for epiblast, VE/endoderm, ectoderm, PS, and mesoderm for lineages isolated from in vivo (red) and in vitro cultured (IVC) embryos (blue). As some germ layer markers are also expressed at earlier stages including epiblast (in vivo), only those that are exclusively expressed during gastrulation were examined. c Hierarchical clustering analysis of RNA-seq data for tissues isolated from in vivo embryos and IVC embryos. d Barcharts showing the expression levels of Dnmts in tissues isolated from IVC embryos and in vivo embryos
a Chromosome-wide view of CG methylation for tissues isolated from in vivo E6.5 embryos (red) and IVC embryos (blue, replicate 1). The second replicate of IVC embryos showed similar patterns (data not shown). b The average methylation levels near active (green, FPKM≥10) or silenced (black, FPKM≤1) genes for IVC epiblast and VE (replicate 1). Similar observations were made for replicate 2 (data not shown). c The enrichment (logratio of observed/expected) of various types of genomic elements for regions hypermethylated in IVC epiblast compared to E5.5 or E6.5 epiblast in the genome. d A model of allele and lineage-specific DNA methylation reprogramming in mouse early embryos. After fertilization, the maternal allele inherits gene body DNA methylation pattern from oocyte to blastocyst. Gene body methylation then occurs in postimplantation embryos on both alleles during de novo methylation. Such pattern is retained in extraembryonic tissues but is gradually diminished in embryonic tissues. On the other hand, the paternal allele in preimplantation embryos undergoes mega-base chromatin compartment A-specific demethylation. During de novo methylation, VE preferentially gains DNA methylation in compartment A while epiblast shows even DNA methylation in both compartments. The differences between epiblast and VE are likely in part contributed by the differential expression of Dnmts
Supplementary Figures 1–10.
Lists of lineage-specific genes from E3.5 to E7.5.
Differentially methylated genes.
Lists of all UMRs and LMRs.
Early-embryo-specific UMRs and LMRs.