  • Brief Communication

Comprehensive analysis of DNA methylation data with RnBeads

Abstract

RnBeads is a software tool for large-scale analysis and interpretation of DNA methylation data, providing a user-friendly analysis workflow that yields detailed hypertext reports (http://rnbeads.mpi-inf.mpg.de/). Supported assays include whole-genome bisulfite sequencing, reduced representation bisulfite sequencing, Infinium microarrays and any other protocol that produces high-resolution DNA methylation data. Notable applications of RnBeads include the analysis of epigenome-wide association studies and epigenetic biomarker discovery in cancer cohorts.

Figure 1: RnBeads workflow for analyzing large-scale DNA methylation data.
Figure 2: Analysis of DNA methylation during adult stem cell differentiation.

Acknowledgements

We thank D. Brocks, H. Hernandez-Vargas, A. Houseman, E. Schneider, A. Schönegger and all users of RnBeads for their extensive testing and feedback. We also thank G. Friedrich, J. Büch and the Information Services and Technology team at the Max Planck Institute for technical support. This work is funded in part by the European Union's Seventh Framework Programme (FP7/2007-2013) grant agreement no. 282510 (BLUEPRINT) and grant agreement no. 267038 (NOTOX), as well as by the German Science Ministry grant no. 01KU1216A (DEEP).

Author information

Contributions

Y.A., F.M. and P.L. developed and maintain RnBeads; J.W., T.L. and C.B. supervised the project; all authors contributed to the writing of the manuscript.

Corresponding authors

Correspondence to Fabian Müller or Christoph Bock.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Analysis of DNA methylation in a cancer cohort based on Infinium 450K data.

RnBeads was used to rediscover a clinically distinct subgroup of glioblastoma patients characterized by increased DNA methylation levels (termed G-CIMP+) and to predict the G-CIMP status of 124 patients using Infinium 450K data obtained from the TCGA project (http://cancergenome.nih.gov).

(a) Detection of genetic duplicates among the patient samples (columns) using a clustered heatmap of intensity values for the genotyping probes that are present on the Infinium microarray (rows). The inset shows that two samples exhibit a high level of genetic identity, and they are indeed derived from tumors of the same patient.

(b) Quality control plot summarizing the outcome of the data filtering. The bar plots on the top left show that the majority of CpG sites (top) and samples (bottom) are of good quality and can be retained. The relatively straight line in the quantile-quantile plot indicates that the probe filtering does not have a major impact on the distribution of DNA methylation in the dataset.

(c) Identification of a small but clearly distinguished cluster of G-CIMP+ glioblastoma samples with elevated DNA methylation levels, especially in CpG-rich genomic regions (light blue in the leftmost column). In the heatmap, blue denotes high levels of DNA methylation, red indicates low levels and grey represents intermediate levels. For visualization purposes, only the 100 gene promoters (rows) with the highest inter-sample variation in DNA methylation are shown across the samples (columns), but the hierarchical clustering is based on the full set of promoters.

(d) Global assessment of the similarity between the DNA methylation profiles, plotting all glioblastoma samples according to their second and third principal components. The samples exhibit strong separation according to the G-CIMP status (denoted by point shape) and IDH1 mutation status (denoted by point color).

(e) Analysis of significant associations between all user-provided sample annotations. Significant p-values (<0.05) are highlighted in the left triangle, and the corresponding statistical tests are annotated in the right triangle (orange: Pearson correlation followed by permutation-based estimation of the p-value; green: Fisher’s exact test; blue: Wilcoxon rank sum test; violet: Kruskal-Wallis one-way analysis of variance).

(f) Genome-scale comparison between the DNA methylation levels of G-CIMP positive (y-axis) and G-CIMP negative (x-axis) tumor samples, focusing on CpG islands (left scatterplot) and on 5-kilobase tiling regions with a CpG content in the bottom quartile (right scatterplot), respectively. Genomic regions that are differentially methylated with an FDR below 0.05 are presented as red points. All other regions are displayed in blue, and color brightness denotes point density.
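As a rough illustration of two steps named in panels (c) and (f), the sketch below picks the rows with the highest variance across samples for display and adjusts a list of p-values so that an FDR threshold of 0.05 can be applied. It is not RnBeads code: the function names and toy values are hypothetical, and the Benjamini-Hochberg procedure is assumed here as one common way to control the FDR, since the legend does not state which procedure was used.

```python
from statistics import pvariance


def top_variable_rows(matrix, n):
    """Return the indices of the n rows with the highest variance.

    Each row holds the methylation levels of one region (e.g. a promoter)
    across all samples. This only selects rows for display; any clustering
    would still be computed on the full matrix, as described in panel (c).
    """
    variances = [pvariance(row) for row in matrix]
    order = sorted(range(len(matrix)), key=lambda i: variances[i], reverse=True)
    return order[:n]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used as the FDR procedure (hypothetical choice).
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy methylation matrix: 3 regions (rows) x 3 samples (columns).
    levels = [
        [0.1, 0.1, 0.1],
        [0.0, 1.0, 0.5],
        [0.4, 0.6, 0.5],
    ]
    print(top_variable_rows(levels, 2))

    # Toy p-values, one per region; regions with adjusted p < 0.05
    # would be drawn as red points in a plot like panel (f).
    raw = [0.01, 0.04, 0.03, 0.005]
    adj = benjamini_hochberg(raw)
    print([i for i, q in enumerate(adj) if q < 0.05])
```

In practice a vectorized implementation on a full promoter matrix would be preferable; the pure-Python version is kept only to make each step explicit.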

Supplementary Figure 2 RnBeads-based Methylome Resource of reference epigenome data sets.

Screenshot of the Methylome Resource (http://rnbeads.mpi-inf.mpg.de/methylomes.php), which makes large DNA methylation datasets more readily available for follow-up research. First, it provides detailed analysis reports for publicly available methylome datasets that can be explored interactively. Second, it lets RnBeads users download all data and configurations needed to re-run all or part of the DNA methylation analyses in their local or cloud-based computing environment. These re-runnable analysis configurations make it straightforward for RnBeads users to analyze their own DNA methylation data in the context of publicly available reference epigenome maps.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1 and 2, Supplementary Table 2 and Supplementary Note (PDF 886 kb)

Supplementary Table 1

Comparison between software tools for DNA methylation analysis (XLSX 37 kb)

Cite this article

Assenov, Y., Müller, F., Lutsik, P. et al. Comprehensive analysis of DNA methylation data with RnBeads. Nat Methods 11, 1138–1140 (2014). https://doi.org/10.1038/nmeth.3115

  • DOI: https://doi.org/10.1038/nmeth.3115
