DNA methylome analysis using short bisulfite sequencing data

Abstract

Bisulfite conversion of genomic DNA combined with next-generation sequencing (BS-seq) is widely used to measure the methylation state of a whole genome, the methylome, at single-base resolution. However, analysis of BS-seq data still poses a considerable challenge. Here we summarize the challenges of BS-seq mapping as they apply to both base and color-space data. We also explore the effect of sequencing errors and contaminants on inferred methylation levels and recommend the most appropriate way to analyze this type of data.

Figure 1: Effect of bisulfite treatment of DNA.
Figure 2: Performance and accuracy of unbiased base-space and color-space BS-seq alignment tools.
Figure 3: Recommended workflow for the primary analysis of BS-seq data.
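
As a rough illustration of the conversion behaviour named in the caption of Figure 1, the sketch below simulates bisulfite treatment in silico and then estimates a per-cytosine methylation level from the converted reads. It relies on the standard chemistry (unmethylated cytosines are deaminated and read as T after amplification, while methylated cytosines are protected and still read as C). The function names `bisulfite_convert` and `methylation_level`, the toy reference sequence, and the simplifications (top strand only, full-length reads already aligned, no sequencing errors or incomplete conversion) are assumptions made for this example and are not taken from the article.

```python
from typing import List, Set


def bisulfite_convert(seq: str, methylated: Set[int]) -> str:
    """Simulate the read-out of one top-strand molecule after bisulfite
    treatment and amplification (hypothetical helper).

    Unmethylated C is read as T; C at a position listed in `methylated`
    is protected and stays C. All other bases are unchanged.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def methylation_level(ref: str, reads: List[str], pos: int) -> float:
    """Fraction of informative reads (C or T) that show C at a reference
    cytosine. Assumes every read spans the whole reference, ungapped."""
    if ref[pos] != "C":
        raise ValueError("position is not a cytosine in the reference")
    calls = [read[pos] for read in reads if read[pos] in "CT"]
    if not calls:
        raise ValueError("no informative reads at this position")
    return calls.count("C") / len(calls)


# Toy reference with cytosines at indices 1, 4 and 5 (assumed data).
reference = "ACGTCCGA"

# Three molecules from the same locus with different methylation patterns.
reads = [
    bisulfite_convert(reference, {1, 5}),
    bisulfite_convert(reference, {5}),
    bisulfite_convert(reference, set()),
]

levels = {pos: methylation_level(reference, reads, pos) for pos in (1, 4, 5)}
```

Because each converted read differs from the reference wherever an unmethylated C was present, such reads cannot be matched to the reference directly, which is one reason BS-seq mapping is harder than ordinary short-read alignment.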

Acknowledgements

This work was funded by the Biotechnology and Biological Sciences Research Council, UK. A.F. and B.K. received infrastructure support from the Deutsche Forschungsgemeinschaft Excellence Cluster 'Inflammation at Interfaces'.

Author information

Corresponding authors

Correspondence to Andre Franke or Simon R Andrews.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–3 and Supplementary Table 1 (PDF 507 kb)

About this article

Cite this article

Krueger, F., Kreck, B., Franke, A. et al. DNA methylome analysis using short bisulfite sequencing data. Nat Methods 9, 145–151 (2012). https://doi.org/10.1038/nmeth.1828
