Letter | Published:

Single-cell DNA methylome sequencing of human preimplantation embryos

Abstract

DNA methylation is a crucial layer of epigenetic regulation during mammalian embryonic development1,2,3. Although the DNA methylome of early human embryos has been analyzed4,5,6, some of the key features have not been addressed thus far. Here we performed single-cell DNA methylome sequencing for human preimplantation embryos and found that tens of thousands of genomic loci exhibited de novo DNA methylation. This finding indicates that genome-wide DNA methylation reprogramming during preimplantation development is a dynamic balance between strong global demethylation and drastic focused remethylation. Furthermore, demethylation of the paternal genome is much faster and more thorough than that of the maternal genome. From the two-cell to the postimplantation stage, methylation of the paternal genome is consistently lower than that of the maternal genome. We also show that the genetic lineage of early blastomeres can be traced by DNA methylation analysis. Our work paves the way for deciphering the secrets of DNA methylation reprogramming in early human embryos.

Main

Global reprogramming of DNA methylation occurs during human preimplantation development. Our group and others have analyzed DNA methylation dynamics in human preimplantation embryos1,2,3,7,8. However, several key issues have not previously been addressed. For example, a large percentage of early human embryos are abnormal owing to aneuploidy9,10. Thus, all previous studies using pools of embryos cannot exclude effects from aneuploid embryos mixed in the samples. Also, parental-specific DNA methylation has not been systematically addressed thus far1,11. Moreover, the heterogeneity of DNA methylation among individual cells within an embryo has not been studied12,13,14. Here we performed post-bisulfite adaptor tagging (PBAT) DNA methylome analysis of early human embryos at single-cell and single-base resolution to resolve these issues12,15.

We performed single-cell PBAT DNA methylome sequencing analysis of 480 individual single cells from 50 human oocytes (including 42 mature oocytes and 8 germinal vesicle (GV) oocytes), 23 single sperm cells, and 62 preimplantation embryos (Supplementary Fig. 1 and Supplementary Table 1). We generated 6.5 Tb of sequencing data. For each individual cell, we sequenced 8.4 Gb on average and covered 10.8 million CpG sites (≥1×) (Supplementary Tables 2 and 3). We found that 3 oocytes were aneuploid, and 33 embryos had at least one aneuploid blastomere (Supplementary Figs. 2 and 3). Furthermore, cells with copy number gain of chromosomes tended to be hypomethylated, whereas those with copy number loss tended to be hypermethylated, as compared to euploid cells (Supplementary Fig. 4). For the following analysis, we excluded all of these aneuploid cells and exclusively focused on the euploid gametes and blastomeres (Supplementary Figs. 2 and 3).

Detailed analysis identified three waves of global demethylation in preimplantation embryos. The first wave occurred during the first 10 to 12 h after fertilization. The median DNA methylation level of the paternal genome decreased from 82.0% in the sperm to 52.9% in the early male pronucleus (PN). At the same time, the median DNA methylation level of the maternal genome decreased from 54.5% in the mature oocyte to 50.7% in the early female PN (Fig. 1). Demethylated genomic regions during this wave were strongly enriched for enhancer and gene body regions, indicating that the earliest demethylated genomic regions mainly correspond to these important functional genomic elements (Figs. 2 and 3a). The second wave of global demethylation occurred from the late zygote to the two-cell stage, with the methylation level decreasing drastically from 49.9% to 40.4%. The third wave occurred from the eight-cell to the morula stage, with the methylation level decreasing from 47.0% to 35.1% (Fig. 1a). These two waves of demethylation occurred in regions drastically enriched for introns and short interspersed nuclear elements (SINEs), especially the evolutionarily younger subfamily of Alu elements (Figs. 2 and 3b,c). This finding indicates that demethylation occurs in a stepwise and wave-like manner in euploid preimplantation embryos.

Fig. 1: De novo methylation patterns in early human embryos.

a, Box plots depicting the DNA methylation dynamics across different developmental stages. There were two significant de novo methylation waves during early human embryonic development: one from the early male pronuclear to the mid-pronuclear stage and one from the four-cell to the eight-cell stage. All DNA methylation levels for the preimplantation stages were calculated using single-cell samples, whereas those for the postimplantation stages were calculated from bulk samples. Early PN, 10–11 hours post-fertilization (h.p.f.); mid-PN, 22–23 h.p.f.; late PN, 25 h.p.f. Each box plot represents the median, 25% and 75% quantiles; whiskers indicate 1.5 times the IQR (interquartile range). MII, metaphase II. b, Representative locus of drastic de novo methylation from the early to mid-pronuclear stage in sperm. c, Representative locus of drastic de novo methylation from the four- to eight-cell stage. d, Histogram showing the numbers of tiles with increasing (magenta) and decreasing (cyan) methylation across consecutive developmental stages.

Fig. 2: DNA methylation dynamics on different annotated genomic elements.

Promoters were further classified as high-density CpG promoters (HCP), intermediate-density CpG promoters (ICP) or low-density CpG promoters (LCP) according to their CpG densities22. Box plots represent the median, 25% and 75% quantiles; whiskers indicate 1.5 times the IQR.

Fig. 3: DNA methylation dynamics of de novo–methylated regions and enrichment analysis of demethylation regions.

a–c, Enrichment analysis for demethylated tiles in the indicated genomic regions in the transitions from sperm to the early male pronuclear stage (a), zygotes to the two-cell stage (b), and the eight-cell stage to the morula (c) (hypergeometric test). d,e, Heat maps showing dynamic changes in de novo–methylated regions in the early to mid-pronuclear stage in sperm (d) and the four- to eight-cell stage (e). White indicates missing (undetected) values.

More notably, we found drastic de novo DNA methylation during human preimplantation development (sites that at early developmental stages had <25% DNA methylation and at later stages had >75% DNA methylation; Methods). In particular, we found that there were two strong waves of de novo DNA methylation, one of which occurred from the early male pronuclear to the mid-pronuclear stage and the other of which occurred from the four-cell to the eight-cell stage (Fig. 1). During these two waves of methylation, 19,861 and 53,437 tiles (≥300 bp) were de novo methylated, respectively (Figs. 1d and 3d,e, and Supplementary Fig. 5a,b). This finding indicates that drastic de novo methylation occurs during preimplantation development in embryos, and the global DNA methylation reprogramming is in fact a dynamic balance between strong genome-wide demethylation and focused remethylation16. The de novo–methylated CpG sites were strongly enriched for major families of repeat elements, such as SINEs, long interspersed nuclear elements (LINEs), and long terminal repeats (LTRs) (Supplementary Fig. 5c). Furthermore, for both waves, the de novo–methylated CpG sites were dramatically enriched for evolutionarily younger subfamilies, such as Alu and LINE-1 (L1) retroelements (Supplementary Fig. 5c)17. This finding indicates that the de novo methylation favors potentially active repeat elements, likely repressing their transcriptional activity to avoid mobilization and genome instability. Interestingly, these de novo–methylated genomic regions were, in general, demethylated again in the following developmental stages, highlighting the transient and dynamic nature of the methylation changes (Fig. 3d,e).

Subsequently, we assessed differences in demethylation dynamics between the paternal and maternal genomes. We sequenced the genome of each sperm donor and used this information to determine the parental origin of tens of thousands of heterozygous CpG sites in the DNA methylome datasets for individual blastomeres or embryos. We found that the demethylation rate of the paternal genome was much faster than that of the maternal genome at the zygote stage. After the two-cell stage, residual methylation in the paternal genome (average of 15.2% at the two-cell stage, n = 6) was continuously lower than that in the maternal genome (average of 23.0% at the two-cell stage, n = 6) throughout preimplantation development (Fig. 4). This finding indicates that, although the methylation level of the paternal genome in a sperm is originally much higher than that of the maternal genome, the residual methylation memory from the paternal genome after global demethylation is considerably lower than that from the maternal genome from the two-cell stage throughout preimplantation development. This phenomenon was not due to focused parental-specific methylation of known imprinting control regions (ICRs), given that the preferential hypermethylation of the maternal genome still held when we excluded all known ICRs. More notably, the DNA methylation level of the maternal genome (78.6%, n = 3) was weakly yet consistently higher than that of the paternal genome (76.0%, n = 3) in the postimplantation embryo proper after global remethylation of the genome during implantation (Fig. 4a)18. This feature was also noted in the extra-embryonic villus, where the methylation levels of the maternal and paternal genomes were 62.0% (n = 2) and 58.5% (n = 2), respectively (Fig. 4a). 
This finding indicates that the pattern of preferential hypermethylation of the maternal genome is maintained for a wide variety of genomic loci during both preimplantation and postimplantation development and in both embryonic and extra-embryonic lineages (Fig. 4b,c, Supplementary Fig. 6, and Supplementary Table 4). To explore the potential effects of parental-specific methylation on allele-specific gene expression (ASE), we performed single-cell RNA-seq and single-cell DNA methylome sequencing for inner cell mass (ICM) and trophectoderm (TE) cells obtained from blastocysts. Although we observed higher DNA methylation at gene promoter regions in the maternal genome than in the paternal genome in both ICM and TE cells, very few genes with parental-specific methylation events at their promoter regions showed ASE (Supplementary Fig. 7), which indicates that the majority of DNA methylation differences between the parental genomes do not contribute to ASE at the blastocyst stage.

Fig. 4: Parental methylome dynamics during early human embryogenesis.

a, CpG DNA methylation dynamics of the traceable paternal (red) and maternal (blue) genomes. After the two-cell stage, we traced the parental origins of the genomes using available and reliable heterozygous SNPs (in total, 4,556,209 traceable CpG sites were used in this analysis; Methods). Data are shown as means ± s.e.m. b, Six loci with asymmetric methylation between the maternal and paternal genomes from the two-cell to the postimplantation stage. c, Asymmetric DNA methylation of the maternal and paternal genomes at a known maternal imprinting region (PEG10).

Next, we analyzed the differentially methylated regions (DMRs) between oocytes and sperm. We identified 165,914 DNA regions corresponding to paternal DMRs (300 to 166,800 bp; median of 2,100 bp) and 20,984 regions that were maternal DMRs (300 to 25,500 bp; median of 900 bp) (Fig. 5a,b and Supplementary Fig. 8a,b). The total lengths of the sperm- and oocyte-specific DMRs were 539 Mb and 25 Mb, respectively, comprising up to 17.1% and 0.80% of the total human genome. The longest paternal DMR was 166.8 kb and was located in the genic region of CDH23, whereas the longest maternal DMR was 25.5 kb and was located in the intron of TTC34 (Fig. 5c and Supplementary Fig. 8c). The ICRs for all 31 known imprinted genes were included among the DMRs that we identified, verifying the accuracy of our results (Fig. 5d and Supplementary Figs. 8d and 9)19. We then analyzed the distribution of these DMRs in the genome and found that the paternal DMRs were strongly enriched on somatic-cell-specific enhancers and one of the major families of transposable elements, SINEs—especially the evolutionarily younger subfamily of Alu elements (Supplementary Fig. 8e). On the other hand, the maternal DMRs were clearly enriched on CpG islands (CGIs), gene promoters, and SINEs, especially Alu elements (Supplementary Fig. 8f), which is consistent with our previous results1. This finding indicates that both parental genomes tend to silence evolutionarily younger and more active SINEs by DNA methylation to complement individual genomic regions.

Fig. 5: Characterization of gamete-specific differentially methylated regions.

a,b, Heat maps showing the DNA methylation dynamics of 20,984 metaphase II oocyte–specific DMRs (a) and 165,914 sperm-specific DMRs (b) across different developmental stages. c,d, Graphic representations of the longest metaphase II oocyte–specific DMR (c) and the H19 paternal imprinting region (d).

Inspired by the observation of passive dilution and asymmetric distribution of 5-hydroxymethylcytosine (5hmC) in preimplantation blastomeres20, we analyzed the potential asymmetric inheritance of DNA methylation. Passive dilution of methylation during DNA replication leads to passive demethylation of the genome, which happened in the first few rounds of cell division after fertilization and may make the DNA methylation levels of the same chromosome vary between daughter cells, as deduced in Supplementary Fig. 10. If so, we could trace cell lineage by comparing chromosome-level DNA methylation patterns among all individual blastomeres from the same embryo. We microinjected FITC-coupled dextran into one blastomere each of two-cell mouse embryos; only the two daughter cells from the injected blastomere were fluorescent at the four-cell stage, as expected (Fig. 6a). We then dissociated each four-cell embryo and performed single-cell DNA methylome sequencing for all four blastomeres. We found that the whole-chromosome DNA methylation patterns for the two fluorescent cells in the same four-cell embryo, which were derived from the same cell at the two-cell stage, showed clear negative correlation in each four-cell embryo we analyzed. This indicates that the two blastomeres at the four-cell stage with negatively correlated DNA methylation on the scale of individual chromosomes are from the same mother cell at the two-cell stage (Fig. 6b,c). In this way, single-cell DNA methylome data could be used to reconstruct the genetic lineage of all the blastomeres in a four-cell embryo. We found that negative correlation was evident in all three of the four-cell human embryos we analyzed (Fig. 6b,d). These results validate our speculation that single-cell DNA methylome information in early mammalian embryos can be used to reconstruct the genetic lineage of blastomeres at the four-cell stage.

Fig. 6: Lineage tracing of four-cell embryos by DNA methylation analysis.

a, Indication of the genetic lineage for blastomeres at the four-cell stage with fluorescence. One blastomere of a two-cell-stage mouse embryo was microinjected with FITC-coupled dextran; fluorescence indicates the two daughter cells from the injected blastomere at the four-cell stage. Scale bar, 100 μm. b, Distribution of the pairwise Pearson correlation coefficients for whole-chromosome-level DNA methylation in individual blastomeres from the same four-cell embryo, showing a clear negative correlation for DNA methylation patterns between sibling cells in both human (red) and mouse (blue) embryos. Gray dashed lines show the two peaks for the correlation coefficients. Chromosome-level DNA methylation values for four single cells from the same four-cell embryo were transformed to z scores, and Pearson correlations were then calculated. c, Heat map showing the correlation coefficients in a mouse four-cell embryo; immunostaining showed that four cell-1 and four cell-2 were derived from the same cell at the two-cell stage. Unsupervised hierarchical clustering also grouped these two cells together. d, Heat map showing the correlation coefficients in a human four-cell embryo. Unsupervised hierarchical clustering indicated that four cell-2 and four cell-4 were from the same cell at the two-cell stage.
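The correlation analysis described in this legend can be sketched in a few lines of Python. This is only an illustration under stated assumptions: z scores are taken per chromosome across the four cells, the input is a hypothetical matrix of chromosome-level methylation values (rows are chromosomes, columns are cells), and the rule for proposing sibling pairs (choosing the pairing with the most negative summed correlation) is an assumption made here for illustration, not the clustering procedure used in the study.

```python
from itertools import combinations
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Z-score each chromosome (row) across the cells (columns)."""
    out = []
    for row in matrix:
        mu, sd = mean(row), pstdev(row)
        out.append([(v - mu) / sd if sd else 0.0 for v in row])
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = (sxx * syy) ** 0.5
    return sxy / denom if denom else 0.0


def pairwise_correlations(matrix):
    """Return {(i, j): r} for every pair of cells, using z-scored values."""
    cells = list(zip(*zscore_rows(matrix)))  # one vector per cell
    return {
        (i, j): pearson(cells[i], cells[j])
        for i, j in combinations(range(len(cells)), 2)
    }


def predicted_sister_pairing(corr):
    """For four cells, pick the split into two pairs whose summed
    correlation is most negative (illustrative rule, see lead-in)."""
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    return min(pairings, key=lambda p: corr[p[0]] + corr[p[1]])
```

With real data, chromosomes with few covered CpG sites would presumably need to be excluded before computing the correlations; that filtering is not shown.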

We found that, after the two-cell stage, the residual methylation on the maternal genome was consistently and drastically higher than that on the paternal genome. More notably, this global maternal hypermethylation pattern was still consistently maintained after implantation, especially in the extra-embryonic villus (Fig. 4 and Supplementary Fig. 6)21. This finding suggests that the maternal genome may contribute more DNA methylation memory to both the embryo proper and the placenta throughout early embryonic development than the paternal genome. Thus, the DNA methylome is asymmetrically distributed between maternal and paternal alleles, which may have consequences for transcription and other features of parental-specific alleles.

Additionally, we found that there was a distinct set of de novo–methylated genomic loci during preimplantation development, especially from the early to mid-pronuclear stage and from the four-cell to the eight-cell stage (Figs. 1a–c, 2, and 3d,e, and Supplementary Fig. 5a,b). In fact, the global DNA methylation level of eight-cell embryos was dramatically increased in comparison to that of four-cell embryos, indicating that de novo methylation prevailed over demethylation during this developmental period. More notably, these de novo–methylated genomic regions were strongly enriched for Alu elements, which represent an evolutionarily more active subfamily of SINEs (Supplementary Fig. 5c).

In summary, the DNA methylome of early human euploid embryos was surveyed at single-cell and single-base resolution. We identified several key features of the methylome for euploid embryos, including drastic de novo methylation and asymmetric methylation of the parental genomes. We also were able to trace the genetic lineage of blastomeres at the four-cell stage by DNA methylation analysis. Altogether, these findings offer a roadmap for elucidating the contribution of these features to cell fate determination and genome integrity during early development.

URLs

Annotations of genomic regions were downloaded from the UCSC Genome Browser at http://genome.ucsc.edu/cgi-bin/hgCustom. Bis-SNP, http://people.csail.mit.edu/dnaase/bissnp2011/; Trim Galore, http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/; Bismark (version 0.7.6), http://www.bioinformatics.babraham.ac.uk/projects/bismark/; BWA, http://bio-bwa.sourceforge.net/; GATK, https://software.broadinstitute.org/gatk/.

Methods

Volunteer recruitment and ethics approval of this study

All of the human gametes and early embryos at different developmental stages were obtained from the donors voluntarily at the Center for Reproductive Medicine of Peking University Third Hospital with signed informed consent and the approval of the Ethics Committee (license no. 2012SZ015). We confirm that our study is compliant with the “Guidance of the Ministry of Science and Technology (MOST) for the Review and Approval of Human Genetic Resources.” All embryo collection, thawing and culture procedures were performed using standard clinical protocols as previously published23,24.


Collection of early human embryos

Human gametes

The human semen sample used in this study was obtained from one healthy man with normal semen parameters. The swim-up sperm were further collected after several rounds of washing with HTF medium (Lifeglobal). Human oocytes, including GV oocytes and mature oocytes at metaphase II, were obtained from healthy women without habitual drug use history or familial disease history according to published protocols25,26. GV oocytes and metaphase II oocytes were carefully discriminated under the microscope and collected separately. The surrounding cumulus cells and other somatic cells attached to the oocytes were removed after treatment with hyaluronidase (Sigma), and the polar bodies from the metaphase II oocytes were further biopsied using laser-assisted micromanipulators. After these steps, all of the gametes were extensively washed with PBS buffer before single-cell selection, and the remaining sperm cells were further pelleted after centrifugation for bulk genomic DNA extraction.

Human preimplantation embryos

All human preimplantation embryos used in this study exhibited normal morphology with appropriate developmental speed27. To trace the parental origins at different developmental stages, zygotes, two-cell-, four-cell-, eight-cell- (or eight- to ten-cell-), morula-, and blastocyst-stage embryos were obtained at appropriate time points after the metaphase II oocytes (from different donors) were fertilized with the same batch of sperm, unless otherwise specified. All of these embryos were immersed in an acid solution, pipetted to remove the zona pellucida, and then washed several times with PBS before transfer into cell lysis buffer. Male and female pronuclei were precisely isolated from the individual zygotes at different time points according to the time after ICSI using laser-assisted biopsy, and male and female pronuclei were further discriminated on the basis of a previously reported method1. ICM and TE cells of blastocysts were physically separated by cutting using a glass needle with the assistance of a syringe needle under the stereoscope as we previously described28. Next, the ICM and TE portions were dissociated into single-cell suspension with Accutase and were randomly picked for DNA methylome sequencing separately.

Human postimplantation embryos

The postimplantation embryos used in this study were collected from aborted fetuses, and the gestational ages of these embryos were assessed on the basis of data from the last menstrual period together with ultrasound results29. Embryos were first washed several times with PBS to remove any possible maternal somatic contaminants and blood and were further dissected under a stereomicroscope to isolate the heart and villus tissues.


Genomic DNA extraction from bulk samples

The sperm pellet, peripheral blood from the couples who donated the postimplantation embryos, and the isolated tissues from these embryos were subjected to extraction of the genomic DNA using the Genomic DNA Clean and Concentrator kit (VisTech) according to the manufacturer's instructions.


Whole-genome bisulfite sequencing

Single-cell whole-genome bisulfite sequencing was performed according to a previously published protocol12. Briefly, a single cell or small numbers of cells were seeded into lysis buffer by mouth pipette; DNA was released after proteinase treatment at 50 °C and then subjected to bisulfite conversion. After column-based purification, DNA was complemented with the biotinylated random primer Bio-P5-N9 (Supplementary Table 5) and 50 units of Klenow polymerase (3′ to 5′ exo−, New England BioLabs). This random priming was repeated five times in total. Second strands were synthesized using another random primer, P7-N9, and final libraries were generated after 8 to 12 cycles of PCR amplification with the Illumina universal PCR primer and Illumina indexed primer.


Sequencing library construction for transcriptome analysis

Transcriptomes of single cells or low numbers of cells were amplified using modified Smart-seq2. We reverse transcribed RNA molecules using an oligo(dT) primer anchored with 8-bp unique molecular identifiers (UMIs). Amplified cDNA was fragmented into ~300-bp fragments using a Covaris S2 sonicator, and sequencing libraries were constructed using a Kapa Hyper Prep kit (Kapa Biosystems).


Sequencing read quality control and alignment

All sequencing data in this study were generated on the Illumina HiSeq 2500 or HiSeq 4000 platform (sequenced by Novogene).

All bisulfite sequencing reads were first trimmed to remove adaptors and low-quality bases using trim_galore (version 0.3.3) (parameters: --quality 20 --phred33 --length 50 --paired). Next, reads that passed quality control were mapped to the human reference genome (hg19) using Bismark (version 0.7.6)30 with paired-end alignment mode (parameters: --non-directional --fastq). To recover additional sequencing reads, unmapped reads were realigned to the same reference genome in single-end alignment mode. Only reads with a unique mapping location in the genome were retained for further analysis. We also incorporated published raw whole-genome bisulfite sequencing data from human sperm cells into our analysis pipeline8.

All genomic DNA sequencing reads were also stripped of their adaptor sequences using the trim_galore tool (version 0.3.3) and then mapped to the hg19 reference genome using BWA31 with long-reads ‘Mem’ mode. RNA-seq data were mapped to the hg19 reference genome using TopHat32 with known gene annotation.


DNA methylation level estimation

After alignment, reads were further deduplicated using the ‘samtools rmdup’ command33. For each cytosine site (or guanine corresponding to a cytosine on the opposite strand) in the reference genome sequence, the DNA methylation level was determined by the ratio of the number of reads supporting C (methylated) to that of total reads (methylated and unmethylated). Notably, in this study, the DNA methylation level of any given sample or region refers to the DNA methylation level of the CpG sites of this sample or region, unless otherwise specified. In the single-cell DNA methylome analysis, CpG sites with greater than 90% DNA methylation were considered methylated, whereas CpG sites with less than 10% DNA methylation were considered unmethylated12,14. The read coverage threshold used to call the DNA methylation level for any cytosine was 1× for single-cell samples and 3× for bulk samples.
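As a minimal sketch of the per-site calculation described above, the following Python functions compute a methylation level from read counts and apply the 90% and 10% single-cell cutoffs. The function names and the choice to return None for uncovered or intermediate sites are assumptions made for illustration only.

```python
from typing import Optional


def methylation_level(c_reads: int, t_reads: int, min_cov: int = 1) -> Optional[float]:
    """Reads supporting C (methylated) divided by all C + T reads.

    min_cov is the read coverage threshold (1 for single-cell samples,
    3 for bulk samples in the text). Returns None below the threshold.
    """
    total = c_reads + t_reads
    if total == 0 or total < min_cov:
        return None
    return c_reads / total


def call_single_cell_state(level: Optional[float]) -> Optional[str]:
    """Apply the single-cell cutoffs: >90% methylated, <10% unmethylated."""
    if level is None:
        return None
    if level > 0.9:
        return "methylated"
    if level < 0.1:
        return "unmethylated"
    return None  # intermediate values are left uncalled in this sketch
```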

A tile-based method was then applied to bin consecutive genomic windows with a fixed length to facilitate comparison across samples. To determine the length of this window, we estimated the DNA methylation level variance and number of CpG sites covered in windows with different lengths, and we finally chose 300 bp as the window length. The DNA methylation level of any given 300-bp tile was estimated as the ratio of the total number of reported cytosines to the total number of reported cytosines and thymines, and was calculated only if more than three CpG sites were covered in the tile.
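The tile-level estimate can be illustrated with a short Python sketch. It assumes a hypothetical input of (chromosome, 0-based position, C reads, T reads) tuples, one per covered CpG site, and non-overlapping tiles starting at position 0 of each chromosome; the minimum of four covered CpG sites reflects "more than three" as stated in this paragraph and is exposed as a parameter.

```python
from collections import defaultdict

TILE_SIZE = 300  # bp, the window length chosen in the text


def tile_methylation(sites, tile_size=TILE_SIZE, min_cpgs=4):
    """Pool C and T read counts per tile and return {(chrom, tile_start): level}.

    sites: iterable of (chrom, pos, c_reads, t_reads), pos assumed 0-based.
    Tiles with fewer than min_cpgs covered CpG sites are omitted.
    """
    acc = defaultdict(lambda: [0, 0, 0])  # [C reads, T reads, covered CpGs]
    for chrom, pos, c_reads, t_reads in sites:
        if c_reads + t_reads == 0:
            continue  # uncovered site
        tally = acc[(chrom, (pos // tile_size) * tile_size)]
        tally[0] += c_reads
        tally[1] += t_reads
        tally[2] += 1
    return {
        key: c / (c + t)
        for key, (c, t, n) in acc.items()
        if n >= min_cpgs
    }
```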

We also estimated the DNA methylation level of different genomic regions that were annotated in the UCSC genome table and other databases, including CGIs, exons, introns, intergenic regions, intragenic regions, and repetitive regions1. Promoters were further classified as high-density CpG promoters (HCPs), intermediate-density CpG promoters (ICPs), and low-density CpG promoters (LCPs) on the basis of the corresponding CpG density as previously defined22,34. The annotation of human enhancers was obtained from ref. 1. All annotations of known human imprinted regions were obtained from ref. 18. The DNA methylation level of these genomic regions was calculated on the basis of the average methylation level of all covered CpG sites within these regions.


SNP calling and parental genome tracing

To enhance the accuracy of SNP calling using bisulfite sequencing data, we used two independent methods based on the binomial test and Bayesian inference. Only the SNPs inferred by both methods were considered to be confidently called SNPs and used in further analyses. Additionally, we also sequenced the genomic DNA of the peripheral blood from the couples who donated the miscarried postimplantation embryos for our study and used these whole-genome sequencing datasets to validate the SNPs called from the bisulfite sequencing data of the fetal embryos and further facilitate the tracing of parental origins.

Binomial test inference of SNPs using bisulfite sequencing data

After alignment of bisulfite sequencing reads, we first counted the number of reads that supported each nucleotide (A, T, G, C) at all covered genomic sites. Subsequently, we used a binomial test to determine whether the read count that supported each nucleotide was observed by chance, assuming that the sequencing error rate equals 0.01. Genotypes were called only if the read coverage depth for a given position was no less than 8× and the P value was less than 0.05 (refs. 35,36).

Additionally, while counting the number of reads that supported each nucleotide, we discarded the indistinguishable reads due to bisulfite conversion. Specifically, for reads mapped to the plus strand of the reference genome sequence, reads supporting T were removed, whereas reads that mapped to the minus strand supporting A were removed.
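One way to picture this procedure is the following Python sketch. It is an interpretation rather than the pipeline itself: it assumes a one-sided binomial test (the probability of seeing at least k supporting reads if every such read were a sequencing error), assumes the 8× depth threshold applies to the reads that remain after the strand-specific discarding step, and uses hypothetical function names.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def informative_counts(reads):
    """Count bases per nucleotide, dropping reads that bisulfite conversion
    makes ambiguous: T on the plus strand and A on the minus strand.

    reads: iterable of (strand, base) with strand '+' or '-'.
    """
    counts = {base: 0 for base in "ACGT"}
    for strand, base in reads:
        if (strand == "+" and base == "T") or (strand == "-" and base == "A"):
            continue
        if base in counts:
            counts[base] += 1
    return counts


def supported_alleles(counts, error_rate=0.01, alpha=0.05, min_depth=8):
    """Return the sorted nucleotides whose support is unlikely to be error
    alone, or None if the informative depth is below min_depth."""
    depth = sum(counts.values())
    if depth < min_depth:
        return None
    return sorted(
        base
        for base, k in counts.items()
        if k > 0 and binom_upper_tail(k, depth, error_rate) < alpha
    )
```

Under these assumptions, a single supporting read out of ten is not significant (the tail probability is about 0.096), whereas two or more are, so isolated mismatches are not called as alleles.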

Bayesian inference using the Bis-SNP package

Mapped data (BAM files) were first merged together for samples obtained from the same embryo. To ensure accuracy, we realigned all mapped reads using the Bis-SNP tool with the ‘BisulfiteIndelRealigner’ parsing function to enable insertion and deletion detection37. The SNPs were then called following the standard pipeline of Bis-SNP after base-quality recalibration (parameters: -T BisulfiteGenotyper -stand_call_conf 20 -stand_emit_conf 0 -mmq 30 -mbq 5).

SNP calling using whole-genome sequencing data

After mapping with the BWA tool, GATK was applied to call SNPs (parameters: -T HaplotypeCaller -stand_emit_conf 10 -stand_call_conf 30)38. Raw SNPs were further filtered using ‘VariantFiltration’ in GATK (parameters: QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < –12.5 || ReadPosRankSum < –8.0). After these filtering steps, only homozygous SNPs with sufficient read depths that were annotated in the dbSNP database (v135) were retained for subsequent analysis.
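The filter expression above is a logical OR of failure conditions: a variant is flagged if any one of them is true. The Python predicate below restates that logic for a hypothetical dictionary of per-variant annotations. Treating a missing annotation as not failing is an assumption of this sketch (the rank-sum annotations are often absent at homozygous sites), and the function is not part of GATK.

```python
# Each entry pairs an annotation name with the condition under which a
# variant fails, mirroring the expression quoted in the text.
HARD_FILTERS = [
    ("QD", lambda v: v < 2.0),
    ("FS", lambda v: v > 60.0),
    ("MQ", lambda v: v < 40.0),
    ("MQRankSum", lambda v: v < -12.5),
    ("ReadPosRankSum", lambda v: v < -8.0),
]


def fails_hard_filter(annotations):
    """Return True if any present annotation meets its failure condition.

    annotations: dict mapping annotation name to a numeric value.
    Missing annotations are skipped (assumption of this sketch).
    """
    return any(
        name in annotations and fails(annotations[name])
        for name, fails in HARD_FILTERS
    )
```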

Parental genome tracing

We sequenced the genomic DNA of the bulk sperm sample obtained from one healthy man and used the same batch of sperm for ICSI to generate the preimplantation embryos. Only SNPs that were called as heterozygous (such as G/T) in the preimplantation embryos and as homozygous (such as T/T) in the sperm sample were retained to trace parental origins (in this case, the G allele in embryos was deduced to be maternally inherited).
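The deduction rule in this paragraph can be written as a small Python function. This is a sketch with hypothetical names; genotypes are represented as pairs of bases, and sites that do not fit the rule return None.

```python
def assign_parental_alleles(embryo_genotype, sperm_genotype):
    """Deduce (paternal_allele, maternal_allele) at one SNP site.

    A site is informative only if the embryo is heterozygous, the sperm
    donor is homozygous, and the donor's allele is one of the embryo's two
    alleles. The embryo's other allele is then deduced to be maternal.
    Returns None for uninformative or inconsistent sites.
    """
    embryo = set(embryo_genotype)
    sperm = set(sperm_genotype)
    if len(embryo) != 2 or len(sperm) != 1:
        return None
    (paternal,) = sperm
    if paternal not in embryo:
        return None  # genotypes are inconsistent with paternity at this site
    (maternal,) = embryo - sperm
    return paternal, maternal
```

For the example in the text, an embryo genotype of G/T with a sperm genotype of T/T yields T as the paternal allele and G as the maternal allele.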

For the postimplantation embryos, the same filtering threshold was applied. We sequenced genomic DNA derived from the peripheral blood of the couples who donated the miscarried embryos for this study. The homozygous SNPs called from the male and female peripheral blood DNA sequencing data were used to deduce paternal and maternal origin, respectively.


Parental-specific methylation

For each determined heterozygous SNP site for which there was knowledge of parental allele origin, we assigned the mapped read pairs covering this site to paternally and maternally inherited groups. Thus, we could estimate the DNA methylation level of allele-linked cytosine sites after read assignment. Subsequently, the regions for which one allele exhibited a greater than 75% DNA methylation level while the other allele exhibited less than 25% DNA methylation were defined as allele-specific methylation regions (multiple t test with adjusted P value less than 0.05).
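A simplified Python sketch of the read assignment and the 75%/25% criterion follows. It assumes each read pair has already been reduced to the base observed at the SNP and a list of CpG methylation calls, ignores the complication that bisulfite conversion can obscure some SNP alleles, and omits the statistical test; all names are hypothetical.

```python
def parental_levels(read_pairs, paternal_allele, maternal_allele):
    """Pool CpG calls from read pairs by the parental allele they carry.

    read_pairs: iterable of (snp_base, cpg_calls), where cpg_calls is a
    list of booleans (True = methylated). Returns methylation levels for
    'paternal' and 'maternal', or None where no CpG calls were assigned.
    """
    tallies = {"paternal": [0, 0], "maternal": [0, 0]}  # [methylated, total]
    for snp_base, cpg_calls in read_pairs:
        if snp_base == paternal_allele:
            key = "paternal"
        elif snp_base == maternal_allele:
            key = "maternal"
        else:
            continue  # base matches neither allele; likely an error
        tallies[key][0] += sum(cpg_calls)
        tallies[key][1] += len(cpg_calls)
    return {k: (m / n if n else None) for k, (m, n) in tallies.items()}


def is_allele_specific(levels, high=0.75, low=0.25):
    """True if one allele exceeds `high` while the other is below `low`."""
    p, m = levels["paternal"], levels["maternal"]
    if p is None or m is None:
        return False
    return (p > high and m < low) or (m > high and p < low)
```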


Allele-specific gene expression

Mapped reads were first assigned to the paternally or maternally inherited group by each heterozygous SNP site. Genes with at least ten read assignments were further analyzed. The maternal (or paternal) expression level of each gene was estimated by the ratio of maternally (or paternally) inherited reads to the total number of reads assigned to each gene17. Genes that had a difference in the expression levels of their maternal and paternal alleles greater than 80% (or less than 20%) were defined as genes with ASE (Fisher's test, adjusted P value less than 0.05).
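The allelic ratio and cutoffs can be sketched as follows. This illustration assumes the ratio is the maternal fraction of assigned reads, takes ten assigned reads as the minimum per gene, and leaves out Fisher's test and the multiple-testing adjustment; the function names are hypothetical.

```python
def maternal_fraction(maternal_reads, paternal_reads, min_reads=10):
    """Maternal reads divided by all reads assigned to the gene.

    Returns None when fewer than min_reads reads were assigned.
    """
    total = maternal_reads + paternal_reads
    if total < min_reads:
        return None
    return maternal_reads / total


def is_ase_candidate(fraction, high=0.8, low=0.2):
    """True if the maternal fraction is above `high` or below `low`."""
    return fraction is not None and (fraction > high or fraction < low)
```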


Differentially methylated regions

Initially, we defined the methylation levels of the CpG sites in a given stage by averaging the methylation levels of each CpG site in all biological replicates at the same stage. We then systematically compared the DNA methylation levels of 300-bp tiles covered by each of two consecutive developmental stages only if more than three CpG sites were covered within each tile. An FDR-adjusted multiple t test was applied for this comparison. The tiles with a DNA methylation level of less than 25% in the first stage and greater than 75% in the later stage (P value less than 0.05) were classified as increasing tiles, whereas tiles with a DNA methylation level of greater than 75% in the first stage and less than 25% in the later stage were classified as decreasing tiles. The remaining tiles were defined as stable.

When we called DMRs, a more stringent cutoff was applied: only the tiles with DNA methylation level greater than 80% in one stage and less than 20% in the other stage with P value less than 0.05 (multiple t test, FDR less than 0.05) were kept for the subsequent analysis. We then extended these tiles in both the forward and backward directions and stitched them together until the tiles had fewer than three CpG sites covered or had a change in the opposite direction. These extended and merged tiles were finally defined as DMRs.

The hypergeometric enrichment analysis of DMRs was performed using the R function ‘phyper’. In detail, tiles (300 bp in length) with at least three CpG sites covered by both stages in a comparison were used as the background to find the DMRs. All these background tiles, including those defined as DMRs, were annotated to the known genomic regions downloaded from the UCSC Genome Browser. For each genomic region, we counted the number of tiles and DMRs that were annotated to them, requiring at least 1-bp overlap. We then passed these counts to the R function ‘phyper’ to calculate the P values.
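The tile classification and the enrichment calculation can be sketched in Python as below. The function names, the default of three CpG sites as the coverage requirement, and the choice of the upper tail P(X ≥ k) are assumptions for illustration; the text specifies only that the R function ‘phyper’ was used, and the t test is omitted here. With these assumptions, the tail probability matches `phyper(k - 1, K, N - K, n, lower.tail = FALSE)` in R.

```python
from math import comb


def classify_tile(first_level, later_level, n_cpgs,
                  low=0.25, high=0.75, min_cpgs=3):
    """Classify a 300-bp tile between two consecutive stages by level cutoffs."""
    if n_cpgs < min_cpgs:
        return "uncovered"  # too few covered CpG sites to compare
    if first_level < low and later_level > high:
        return "increasing"
    if first_level > high and later_level < low:
        return "decreasing"
    return "stable"


def hypergeom_upper_tail(k, K, N, n):
    """P(X >= k) for X ~ Hypergeometric.

    N: background tiles, K: background tiles in the genomic region,
    n: DMR tiles, k: DMR tiles in the genomic region.
    """
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i)
                    for i in range(max(k, 0), upper + 1))
    return numerator / comb(N, n)


# Hypothetical counts: 10 background tiles, 5 in the region,
# 5 DMR tiles, all 5 of which fall in the region.
p_value = hypergeom_upper_tail(5, 5, 10, 5)
```

Exact integer arithmetic with `math.comb` keeps the sketch dependency-free; for genome-scale counts a log-space implementation or a statistics library would be the more practical choice.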


Statistics

The details of the statistical methods used are described in each analysis subsection. We used a customized binomial test and Bayesian module (Bis-SNP package) for SNP calling from bisulfite sequencing data, and GATK was used to call SNP sites using whole-genome sequencing data. Student's t test was applied to compare the methylation difference between parental genomes, heterozygous alleles and developmental stages, all requiring adjusted P values less than 0.05. Fisher’s test was applied to analyze ASE. Hypergeometric enrichment analysis was performed by using the R function ‘phyper’.


Life Sciences Reporting Summary

Further information on experimental design is available in the Life Sciences Reporting Summary.


Data availability

All sequencing data have been deposited in the Gene Expression Omnibus (GEO) database under accession GSE81233.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Bird, A. DNA methylation patterns and epigenetic memory. Genes Dev. 16, 6–21 (2002).
2. Smith, Z. D. & Meissner, A. DNA methylation: roles in mammalian development. Nat. Rev. Genet. 14, 204–220 (2013).
3. Hackett, J. A. & Surani, M. A. DNA methylation dynamics during the mammalian life cycle. Phil. Trans. R. Soc. Lond. B 368, 20110328 (2013).
4. Guo, H. et al. The DNA methylation landscape of human early embryos. Nature 511, 606–610 (2014).
5. Smith, Z. D. et al. DNA methylation dynamics of the human preimplantation embryo. Nature 511, 611–615 (2014).
6. Okae, H. et al. Genome-wide analysis of DNA methylation dynamics during early human development. PLoS Genet. 10, e1004868 (2014).
7. Fulka, H., Mrazek, M., Tepla, O. & Fulka, J. Jr. DNA methylation pattern in human zygotes and developing embryos. Reproduction 128, 703–708 (2004).
8. Molaro, A. et al. Sperm methylation profiles reveal features of epigenetic inheritance and evolution in primates. Cell 146, 1029–1041 (2011).
9. Vanneste, E. et al. Chromosome instability is common in human cleavage-stage embryos. Nat. Med. 15, 577–583 (2009).
10. Ambartsumyan, G. & Clark, A. T. Aneuploidy and early human embryo development. Hum. Mol. Genet. 17(R1), R10–R15 (2008).
11. Fang, F. et al. Genomic landscape of human allele-specific DNA methylation. Proc. Natl. Acad. Sci. USA 109, 7332–7337 (2012).
12. Smallwood, S. A. et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat. Methods 11, 817–820 (2014).
13. Farlik, M. et al. Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics. Cell Rep. 10, 1386–1397 (2015).
14. Guo, H. et al. Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing. Genome Res. 23, 2126–2135 (2013).
15. Miura, F., Enomoto, Y., Dairiki, R. & Ito, T. Amplification-free whole-genome bisulfite sequencing by post-bisulfite adaptor tagging. Nucleic Acids Res. 40, e136 (2012).
16. Gao, F. et al. De novo DNA methylation during monkey pre-implantation embryogenesis. Cell Res. 27, 526–539 (2017).
17. Batzer, M. A. & Deininger, P. L. Alu repeats and human genomic diversity. Nat. Rev. Genet. 3, 370–379 (2002).
18. Hamada, H. et al. Allele-specific methylome and transcriptome analysis reveals widespread imprinting in the human placenta. Am. J. Hum. Genet. 99, 1045–1058 (2016).
19. Court, F. et al. Genome-wide parent-of-origin DNA methylation analysis reveals the intricacies of human imprinting and suggests a germline methylation-independent mechanism of establishment. Genome Res. 24, 554–569 (2014).
20. Mooijman, D., Dey, S. S., Boisset, J.-C., Crosetto, N. & van Oudenaarden, A. Single-cell 5hmC sequencing reveals chromosome-wide cell-to-cell variability and enables lineage reconstruction. Nat. Biotechnol. 34, 852–856 (2016).
21. Hanna, C. W. et al. Pervasive polymorphic imprinted methylation in the human placenta. Genome Res. 26, 756–767 (2016).
22. Hardarson, T. et al. A morphological and chromosomal study of blastocysts developing from morphologically suboptimal human pre-embryos compared with control blastocysts. Hum. Reprod. 18, 399–407 (2003).
23. Sathananthan, A. H. & Osianlis, T. Human embryo culture and assessment for the derivation of embryonic stem cells (ESC). Methods Mol. Biol. 584, 1–20 (2010).
24. Chian, R.-C., Lim, J.-H. & Tan, S.-L. State of the art in in-vitro oocyte maturation. Curr. Opin. Obstet. Gynecol. 16, 211–219 (2004).
25. Li, R., Qiao, J., Wang, L., Zhen, X. & Lu, Y. Serum progesterone concentration on day of HCG administration and IVF outcome. Reprod. Biomed. Online 16, 627–631 (2008).
26. Baltaci, V. et al. Relationship between embryo quality and aneuploidies. Reprod. Biomed. Online 12, 77–82 (2006).
27. Yan, L. et al. Single-cell RNA-Seq profiling of human preimplantation embryos and embryonic stem cells. Nat. Struct. Mol. Biol. 20, 1131–1139 (2013).
28. Li, R. et al. Retain singleton or twins? Multifetal pregnancy reduction strategies in triplet pregnancies with monochorionic twins. Eur. J. Obstet. Gynecol. Reprod. Biol. 167, 146–148 (2013).
29. Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
30. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
31. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009).
32. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
33. Borgel, J. et al. Targets and dynamics of promoter DNA methylation during early mouse development. Nat. Genet. 42, 1093–1100 (2010).
34. Xie, W. et al. Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell 153, 1134–1148 (2013).
35. Orozco, L. D. et al. Epigenome-wide association of liver methylation patterns and complex metabolic traits in mice. Cell Metab. 21, 905–917 (2015).
36. Guo, W. et al. BS-Seeker2: a versatile aligning pipeline for bisulfite sequencing data. BMC Genomics 14, 774 (2013).
37. Liu, Y., Siegmund, K. D., Laird, P. W. & Berman, B. P. Bis-SNP: combined DNA methylation and SNP calling for Bisulfite-seq data. Genome Biol. 13, R61 (2012).
38. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).


Acknowledgements

We thank W. Guo for his insightful discussion. L.Y., J.Q., and F.T. were supported by grants from the National Natural Science Foundation of China (81561138005, 31230047, 31522034, 31571544, 81521002, and 31625018) and the National Basic Research Program of China (2014CB943200 and 2017YFA0102702). J.Q. and F.T. were also supported by a grant from the Beijing Municipal Science and Technology Commission (D151100002415000). L.Y. was supported by a grant from the National High-Technology Research and Development Program (2015AA020407). The work was supported by the Beijing Advanced Innovation Center for Genomics at Peking University.

Author information

F.T., J.Q., and L.Y. conceived the project. H.G., Y.R., Y.H., R.L., Y.L., X.F., Y.G., X.W., Y.W., P.L., J.Y., X.R., P.Y., Y.Y., Z.Y. and L.W. performed the experiments. P.Z., J.D., B.H. and H.G. conducted the bioinformatic analyses. F.T., J.Q., L.Y., H.G., P.Z., Y.R. and Y.H. wrote the manuscript with help from all of the authors.

Competing interests

The authors declare no competing financial interests.

Correspondence to Liying Yan or Jie Qiao or Fuchou Tang.

Integrated supplementary information

Supplementary Figure 1 Isolation of human early embryos for single-cell DNA methylome analysis.

a, Morphologies of oocytes and human early embryos at different developmental stages. Scale bar, 100 μm. b–f, The polar body and nuclear region of MII oocytes were biopsied by laser-assisted micromanipulation. Scale bar, 100 μm. b, Bright-field image of an MII oocyte. c, Staining with Hoechst 33342 dye to confirm the absence of genomic contaminants and reveal the nuclear region (arrow) and the first polar body (triangle) of the MII oocyte. d, Bright-field image of the aspiration of a polar body of an MII oocyte (triangle). e, Bright-field image of aspiration of the nuclear region of an MII oocyte (arrow). f, Hoechst 33342 staining showing aspiration of the nuclear region of an MII oocyte (arrow). g, Heat map showing the pairwise correlations of euploid single cells from different developmental stages. Unsupervised hierarchical clustering indicated that there were at least three major clusters, representing the sperm and male pronuclei (cluster I), GV and MII oocytes and female pronuclei (cluster II), and the cells from cleavage-stage embryos (cluster III), corresponding to the hypermethylation, intermediate methylation, and hypomethylation features of the different cell types.

Supplementary Figure 2 General quality control and sequencing statistics of the single-cell DNA methylome sequencing samples.

a, The number of CpG sites with at least one-, three-, and fivefold coverage across the single-cell samples. The “Mixed” label represents blastocyst cells that we did not separate manually into the ICM and TE (but directly dissociated the whole blastocyst into single-cell suspension and randomly picked single cells from them). The box plot represents the median, 25% and 75% quantiles; whiskers indicate 1.5 times the IQR (interquartile range). b, The CpG coverage (1×) at each developmental stage and the CpG coverage of single-cell and bulk samples at the same stages were merged together. c, Box plot showing the percentage of the digitized CpG methylation output in the single-cell DNA methylome sequencing samples; greater than 90% of CpG sites at each developmental stage were either fully methylated or unmethylated. The box plot represents the median, 25% and 75% quantiles; whiskers indicate 1.5 times the IQR. d, The general quality controls of all samples across different developmental stages; aneuploid samples and samples with low sequencing quality were excluded from subsequent analyses. Low-quality samples were determined by performing pairwise correlation analysis of all single cells at the same developmental stage. e, Pairwise comparison at different developmental stages showing the heterogeneity within and between embryos; eight-cell stage embryos and morulae were excluded from this analysis owing to the inadequate number of samples; n.s., not significant (Student's t test). The box plot represents the median, 25% and 75% quantiles; whiskers indicate 1.5 times the IQR.

Supplementary Figure 3 The copy number variation landscapes of human early embryos deduced from the DNA methylome dataset.

a, The normal state (light blue), gain (red), and loss (cyan) of autosomes at different developmental stages. The “Mixed” label represents blastocyst cells that we did not separate manually into the ICM and TE (but directly dissociated the whole blastocyst into single-cell suspension and randomly picked single cells from them). b, Several representative examples of euploid and aneuploid single-cell samples; chromosomes with an aberrant copy number are highlighted in the purple boxes. c,d, Histograms of the statistics regarding the number of embryos (c) or single cells (d) with CNVs on each chromosome.

Supplementary Figure 4 Effects of chromosome copy number variation on DNA methylation.

a, Density distribution of DNA methylation for ICM samples grouped by chromosome copy number. b, Density distribution of DNA methylation for TE samples grouped by chromosome copy number. c, Box plots show the DNA methylation distribution of 35 individual TE cells isolated from the same embryo and grouped by chromosome copy number; P values were calculated by t test. The box plot represents the median, 25% and 75% quantiles; whiskers indicate 1.5 times the IQR.

Supplementary Figure 5 Main features of de novo DNA methylation in human early embryos.

a, The representative regions that were de novo methylated from the earlier stage to the later stage. White open circles represent unmethylated CpG dinucleotides, whereas black filled circles represent methylated CpG dinucleotides. b, Histogram showing the numbers of increasing (magenta) and decreasing (cyan) CpG sites across consecutive stages. c, Enrichment analysis of de novo–methylated tiles on different genomic elements from the early male to mid-pronuclear stage and the four- to eight-cell stage (hypergeometric test).

Supplementary Figure 6 Select representative loci showing the hypermethylated maternal genome and hypomethylated paternal genome at different developmental stages.

White open circles represent unmethylated CpG dinucleotides, whereas black filled circles represent methylated CpG dinucleotides. Heterozygous SNPs were used to trace the parental genome.

Supplementary Figure 7 Parental-specific methylation and allele-specific expression in blastocyst-stage embryos.

Scatterplot showing ASE (x axis) and parental-specific methylation at the promoter regions of the corresponding genes (y axis), both presented as maternal scores minus paternal scores. The histogram shows the frequency distribution of methylation (M – P), indicating hypermethylation of the maternal genome. Genes showing both parental-specific methylation at promoters and ASE are labeled in blue. Genes labeled more than once were detected in different embryos (n = 4). ICM samples are displayed in a, and TE samples are displayed in b.

Supplementary Figure 8 The main features of the gamete-specific methylation regions (DMRs).

a, Length distributions of sperm-specific (red) and MII oocyte–specific (blue) DMRs. b, Gene ontology analysis of the genes with promoters located in the gamete-specific DMRs. c,d, Two representative regions showing the longest sperm-specific DMRs (c) and one representative maternal imprinting region, PEG3 (d) (red, CpG sites with methylation greater than 50%; blue, CpG sites with methylation level less than 50%). e,f, Enrichment analysis of the sperm-specific (e) and MII oocyte–specific (f) DMRs on different genomic elements (hypergeometric test).

Supplementary Figure 9 DNA methylation of all 31 known imprinting regions in preimplantation euploid cells and postimplantation tissues.

Heat map showing the DNA methylation of all 31 known imprinting regions in both preimplantation euploid cells and postimplantation tissues. Colors indicate methylation levels from low (cyan) to high (red). White indicates missing (undetected) values. The “Mixed” label represents blastocyst cells that we did not separate manually into the ICM and TE (but directly dissociated the whole blastocyst into single-cell suspension and randomly picked single cells from them).

Supplementary Figure 10 Schematic of the chromosome-level DNA methylation discrepancies from the zygotes to four-cell embryos.

Three distribution types of DNA methylation for one genomic region in each cell from the zygote to four-cell embryos, indicating either “high–low” or “intermediate–intermediate” DNA methylation patterns for the two sibling cells in a four-cell embryo derived from one mother cell at the two-cell stage.

Supplementary information

Supplementary Text and Figures.

Life Sciences Reporting Summary

Supplementary Table 1. General statistics of all samples analyzed in this study

Supplementary Table 2. Sequencing statistics of all samples used in this study

Supplementary Table 3. CpG coverage information for all bisulfite sequencing samples with 1×, 3× and 5× read depth

Supplementary Table 4. Parental specifically methylated regions determined at the blastocyst stage

Supplementary Table 5. Oligonucleotide sequences used in this study
