Comparison of sequencing-based methods to profile DNA methylation and identification of monoallelic epigenetic modifications

Journal name: Nature Biotechnology
Volume: 28
Pages: 1097–1105
DOI: 10.1038/nbt.1682

Abstract

Analysis of DNA methylation patterns relies increasingly on sequencing-based profiling methods. The four most frequently used sequencing-based technologies are the bisulfite-based methods MethylC-seq and reduced representation bisulfite sequencing (RRBS), and the enrichment-based techniques methylated DNA immunoprecipitation sequencing (MeDIP-seq) and methylated DNA binding domain sequencing (MBD-seq). We applied all four methods to biological replicates of human embryonic stem cells to assess their genome-wide CpG coverage, resolution, cost, concordance and the influence of CpG density and genomic context. The methylation levels assessed by the two bisulfite methods were concordant (their difference did not exceed a given threshold) for 82% of CpGs and 99% of the non-CpG cytosines. Using binary methylation calls, the two enrichment methods were 99% concordant and regions assessed by all four methods were 97% concordant. We combined MeDIP-seq with methylation-sensitive restriction enzyme sequencing (MRE-seq) for comprehensive methylome coverage at lower cost. This, along with RNA-seq and ChIP-seq of the ES cells, enabled us to detect regions with allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression.
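
The concordance criterion used for the bisulfite methods (the difference in methylation level between two methods does not exceed a threshold) can be illustrated with a minimal sketch. It is not the authors' pipeline. The function names and the input layout (a dictionary mapping each CpG position to methylated and unmethylated read counts) are hypothetical, and the default threshold of 0.25 and minimum coverage of five reads are taken from the Figure 2 caption.

```python
def methylated_proportion(meth, unmeth):
    """Methylated reads / (methylated reads + unmethylated reads)."""
    return meth / (meth + unmeth)


def concordance(counts_a, counts_b, max_diff=0.25, min_reads=5):
    """Fraction of shared CpGs whose methylated proportions differ by <= max_diff.

    counts_a, counts_b: dict mapping CpG position -> (meth_reads, unmeth_reads)
    (a hypothetical input layout chosen for this illustration).
    Only CpGs covered by at least min_reads reads in BOTH methods are compared.
    Returns None when no CpG passes the coverage filter.
    """
    shared = [
        pos
        for pos in counts_a.keys() & counts_b.keys()
        if sum(counts_a[pos]) >= min_reads and sum(counts_b[pos]) >= min_reads
    ]
    if not shared:
        return None
    n_concordant = sum(
        1
        for pos in shared
        if abs(
            methylated_proportion(*counts_a[pos])
            - methylated_proportion(*counts_b[pos])
        )
        <= max_diff
    )
    return n_concordant / len(shared)


# Toy counts for two methods; position 400 is dropped (only 3 reads in method A).
method_a = {100: (9, 1), 200: (1, 9), 300: (5, 5), 400: (2, 1)}
method_b = {100: (8, 2), 200: (6, 4), 300: (6, 4), 400: (10, 0)}
print(concordance(method_a, method_b))  # two of three compared CpGs agree
```

Filtering on coverage before comparing matters because a proportion estimated from only a few reads is too noisy for a fixed difference threshold to be meaningful.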

At a glance

Figures

  1. CpG coverage by each method.

    (a,b) The percentage of CpGs covered genome-wide (a) or in CpG islands (b) is plotted as a function of read-coverage threshold. (c) The percentage of genome-wide CpGs (28,163,863) covered by multiple, single or no methods is shown.

  2. Comparison of bisulfite-based methods.

    (a) Calls of highly/partially/weakly methylated (0.80–0.20 or 0.75–0.25 cutoff) or highly/weakly methylated (0.20 cutoff) were made for CpGs covered at several minimum read depths by MethylC-seq and by RRBS (both on replicate no. 3). The number and percent of genome-wide CpGs covered and the percent of concordant calls are shown for each minimum read depth and methylation call cutoff. (b) Differences (MethylC-seq - RRBS) in methylated proportions (methylated reads/(methylated reads + unmethylated reads)) for CpGs with a minimum coverage of five reads by both methods. Percentages of concordant and discordant methylation were determined at cutoffs of ±0.1 (green dashed lines) and ±0.25 (red dashed lines). (c,d) CpG density in a 400-bp window (c) and genomic context of concordant and discordant CpGs at the 0.25 cutoff (d).

  3. Comparison of methylated DNA enrichment methods.

    (a) Calls of highly/weakly methylated were made by averaging methylation scores for CpGs covered at varying minimum read depths by MeDIP-seq or MBD-seq in 1,000- and 200-bp windows. The number of windows, percent of genome-wide CpGs covered and the percent of concordant calls are shown for each minimum read depth and window size. (b,c) For the 1,000-bp windows with a minimum read depth of 5, the CpG density (b) and genomic context (c) of the concordant and discordant windows are shown. The inset in b shows a close-up of the concordance/discordance of CpG densities consistent with CpG islands. (d) For the 1,000-bp windows with a minimum read depth of 5 (444,590 windows), MethylC-seq methylation proportions for CpGs and non-CpG cytosines covered at a minimum read depth of 5 were summed and the windows were binned by the sum. For each of these bins, the number of windows called highly methylated by MeDIP-seq or MBD-seq is shown on the left y axis and the percent of total windows with calls of highly methylated is shown on the right y axis. Windows with a MethylC-seq methylation proportion sum >15, representing 83% of all windows, were called highly methylated by MeDIP-seq and MBD-seq in 99.9% of cases. The windows with a methylation proportion sum of 1–15, representing 17% of all windows, were called highly methylated by MeDIP-seq and MBD-seq in at least 99.1% of cases.

  4. Comparison of all methods.

    (a) The table shows the percentage of 1,000-bp windows with concordant and discordant MethylC-seq (replicate no. 3), RRBS (replicate no. 3), MeDIP-seq (replicate no. 2) and MBD-seq (replicate no. 2) calls at minimum read depths of 5 and 10. Methods making the same call are grouped together in parentheses. Calls were made for MethylC-seq and RRBS by averaging the methylation proportion of CpGs within the window that were covered at the minimum read depth and applying a highly/weakly methylated cutoff of 0.2. Calls were made for MeDIP-seq and MBD-seq by averaging the methylation score of CpGs within the window that were covered at the minimum read depth. (b) Genome browser view of the 100-kb CpG-rich Protocadherin alpha cluster (PCDHA), exemplifying the significant concordance in methylation status seen on a genome-wide level. For MethylC-seq and RRBS, the y axis displays methylation scores of individual CpGs. Scores range between −500 (unmethylated) and 500 (methylated), and the zero line is equivalent to 50% methylated. Negative scores are displayed as green bars and positive scores are displayed as orange bars. For MeDIP-seq (1), MeDIP-seq (2) and MBD-seq, the y axis indicates extended read density. Browsable genome-wide views of these data sets are available at http://www.genboree.org/ and http://genome.ucsc.edu/.

  5. Integrative method increases methylome coverage and enables identification of a DMR.

    (a) MRE-seq involves parallel digests with methylation-sensitive restriction enzymes (HpaII, AciI and Hin6I), selection of cut fragments of ~50–300 bp, pooling the digests, library construction and sequencing. For every 600-bp window along chromosome 21, MeDIP-seq scores were plotted against MRE-seq scores. The plot depicts the inverse relationship between MRE-seq and MeDIP-seq signals. (b) Coverage of CpGs in the human genome by MeDIP-seq alone (red), MRE-seq alone (green), both (yellow) or neither method (no fill). Sequences from replicate nos. 1 and 2 were used in these calculations. (c) UCSC Genome Browser view of ZNF331 in H1 ESC, showing overlap of MeDIP-seq, MRE-seq and H3K4me3 (from ChIP-seq) signals at bisulfite region 1 and only MeDIP-seq signal at bisulfite region 2. (d) Clonal bisulfite sequencing results for specified regions in ESC from replicate no. 1. A filled circle represents a methylated CpG and an open circle indicates an unmethylated CpG.

  6. Allelic DNA methylation, histone methylation and gene expression in ESCs.

    (a) Venn diagram summarizing the number of loci exhibiting monoallelic DNA methylation, histone methylation or monoallelic expression and their overlap. The top 1,000 loci (average size of 2.9 kb and encompassing a CpG island) with potential allelic DNA methylation were further evaluated using the following assays: MRE-seq and MeDIP-seq for allelic DNA methylation within the loci; MethylC-seq and expression data for monoallelic expression of genes associated (±50 kb) with the loci; and MethylC-seq and histone modifications H3K4me3 and H3K9me3 for monoallelic histone methylation within 1 kb of the loci. (b,c) Validation of known and novel DMRs identified from MeDIP-seq and MRE-seq. DMRs are presented in a UCSC Genome Browser window with MeDIP-seq and MRE-seq signals in human H1 ESC, along with bisulfite sequencing results. The results from the biological replicates (nos. 1 and 2) were very similar. (b) Imprinted gene GRB10 including a known DMR (Bisulfite region 1) and an upstream unmethylated CpG island (Bisulfite region 2). (c) Novel DMR upstream of POTEB, which exhibits allele-specific DNA methylation. An open circle indicates an unmethylated CpG site. A filled circle represents a methylated CpG site. An 'x' indicates absence of a CpG site due to a heterozygous SNP, which destroyed the 28th CpG. All clones without the CpG were unmethylated, whereas all the clones containing the CpG were methylated. Furthermore, the alleles could be distinguished in the sequence reads from MeDIP-seq (G allele, 9 of 9 reads) and MRE-seq (A allele, 30 of 30 reads).
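
The allele-specific read counts reported for the POTEB DMR (9 of 9 MeDIP-seq reads on the G allele, 30 of 30 MRE-seq reads on the A allele) show how reads spanning a heterozygous SNP can separate the two alleles. One common way to quantify such skew is an exact two-sided binomial test against an equal 50:50 allelic split. The sketch below is only an illustration under that assumption: the caption does not state which test, if any, the authors applied, and the function name is hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null success probability p.
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so ties are not lost to rounding.
    total = sum(pr for pr in probs if pr <= observed * (1 + 1e-9))
    return min(1.0, total)


# Read counts taken from the figure caption.
p_medip = binomial_two_sided_p(9, 9)    # MeDIP-seq: 9 of 9 reads on G
p_mre = binomial_two_sided_p(30, 30)    # MRE-seq: 30 of 30 reads on A
print(p_medip, p_mre)
```

Under the null hypothesis of equal allelic representation, 9 of 9 reads gives p ≈ 0.0039, and 30 of 30 reads gives a far smaller value. That the two assays are skewed toward opposite alleles is what fits one methylated and one unmethylated allele.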

Author information

Affiliations

  1. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.

    • R Alan Harris,
    • Cristian Coarfa,
    • Robert A Waterland &
    • Aleksandar Milosavljevic
  2. Center for Genome Sciences and Systems Biology, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, USA.

    • Ting Wang,
    • Xin Zhou,
    • Kevin J Forsberg &
    • Junchen Gu
  3. Brain Tumor Research Center, Department of Neurosurgery, Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California, USA.

    • Raman P Nagarajan,
    • Chibo Hong,
    • Sara L Downey,
    • Brett E Johnson,
    • Shaun D Fouse,
    • Adam Olshen &
    • Joseph F Costello
  4. Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada.

    • Allen Delaney,
    • Yongjun Zhao,
    • Marco A Marra &
    • Martin Hirst
  5. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, California, USA.

    • Tracy Ballinger &
    • David Haussler
  6. Department of Pharmacology and the Genome Center, University of California-Davis, Davis, California, USA.

    • Lorigail Echipare,
    • Henriette O'Geen &
    • Peggy J Farnham
  7. Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, California, USA.

    • Ryan Lister,
    • Mattia Pelizzola &
    • Joseph R Ecker
  8. Division of Biostatistics, Dan L. Duncan Cancer Center, Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas, USA.

    • Yuanxin Xi &
    • Wei Li
  9. Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.

    • Charles B Epstein,
    • Bradley E Bernstein,
    • Hongcang Gu,
    • Christoph Bock,
    • Andreas Gnirke &
    • Alexander Meissner
  10. Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA.

    • Bradley E Bernstein
  11. Center for Cancer Research, Massachusetts General Hospital, Boston, Massachusetts, USA.

    • Bradley E Bernstein
  12. Ludwig Institute for Cancer Research, University of California San Diego, La Jolla, California, USA.

    • R David Hawkins &
    • Bing Ren
  13. Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, California, USA.

    • Bing Ren
  14. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA.

    • Wen-Yu Chung &
    • Michael Q Zhang
  15. Department of Molecular and Cell Biology, Center for Systems Biology, University of Texas at Dallas, Dallas, Texas, USA.

    • Wen-Yu Chung &
    • Michael Q Zhang
  16. Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, Massachusetts, USA.

    • Christoph Bock &
    • Alexander Meissner
  17. Harvard Stem Cell Institute, Cambridge, Massachusetts, USA.

    • Christoph Bock &
    • Alexander Meissner
  18. Max Planck Institute for Informatics, Saarbrücken, Germany.

    • Christoph Bock
  19. USDA/ARS Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, Texas, USA.

    • Robert A Waterland

Contributions

J.F.C., R.A.H., T.W., M.H., M.A.M. and A. Milosavljevic conceived and designed the experiments. R.P.N., C.H., S.L.D., B.E.J., S.D.F., Y.Z. and M.H. performed the MeDIP, MRE and bisulfite sequencing experiments. R.A.W. and X.Z. designed and performed pyrosequencing and data analyses. H.G., C.B., A.G. and A. Meissner performed and analyzed RRBS. L.E., H.O., P.J.F., B.E.B., C.B.E., R.D.H. and B.R. performed and analyzed ChIP-seq experiments. R.L., M.P. and J.R.E. analyzed MethylC-seq data and performed Bowtie aligner testing. R.A.H., T.W., K.J.F., J.G., C.C., M.H., X.Z., A.D. and A.O. performed data analysis. T.W., T.B. and D.H. developed MeDIP and methyl-sensitive restriction enzyme scoring algorithms and performed coverage analyses including repetitive sequence analyses. Y.X., W.-Y.C., R.L., M.Q.Z. and W.L. compared bisulfite sequence aligners. J.F.C., R.A.H., M.H., T.W., R.P.N. and R.A.W. wrote the manuscript.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

PDF files

  1. Supplementary Text and Figures (4 MB)

    Supplementary Tables 2, 4, 5 and 8 and Supplementary Figs. 1–18

Excel files

  1. Supplementary Table 1 (36 KB)

    Primer designs for bisulfite pyrosequencing. See Excel spreadsheet Supplementary_Table_1.xls.

  2. Supplementary Table 3 (120 KB)

    Bisulfite data for Supplementary Figure 12.

  3. Supplementary Table 6 (224 KB)

    Genome-wide catalogue of CpG island regions exhibiting overlapping MeDIP-seq (methylated) signals and MRE-seq (unmethylated) signals.

  4. Supplementary Table 7 (252 KB)

    Validation of known and putative DMRs by bisulfite, PCR, cloning and sequencing.

  5. Supplementary Table 9 (412 KB)

    Details of the comparison of genomic variation between pairs of assays to determine allele-specific epigenetic states.
