Comparison and imputation-aided integration of five commercial platforms for targeted DNA methylome analysis

Tanić, Miljana; Moghul, Ismail; Rodney, Simon; Dhami, Pawan; Vaikkinen, Heli; Ambrose, John; Barrett, James; Feber, Andrew; Beck, Stephan

doi:10.1038/s41587-022-01336-9

Article
Published: 02 June 2022

Nature Biotechnology volume 40, pages 1478–1487 (2022)

Abstract

Targeted bisulfite sequencing (TBS) has become the method of choice for the cost-effective, targeted analysis of the human methylome at base-pair resolution. In this study, we benchmarked five commercially available TBS platforms—three hybridization capture-based (Agilent, Roche and Illumina) and two reduced-representation-based (Diagenode and NuGen)—across 11 samples. Two samples were also compared with whole-genome DNA methylation sequencing with the Illumina and Oxford Nanopore platforms. We assessed workflow complexity, on/off-target performance, coverage, accuracy and reproducibility. Although all platforms produced robust and reproducible data, major differences in the number and identity of the CpG sites covered make it difficult to compare datasets generated on different platforms. To overcome this limitation, we applied imputation and show that it improves interoperability from an average of 10.35% (0.8 million) to 97% (7.6 million) common CpG sites. Our study provides guidance on which TBS platform to use for different methylome features and offers an imputation-based harmonization solution that allows comparative, integrative analysis.
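The interoperability figure quoted above can be illustrated with a minimal sketch. It assumes that interoperability means the share of all CpG sites covered by at least one platform that is covered by every platform; this definition, the function name `interoperability` and the toy site sets are illustrative assumptions and are not taken from the study's code.

```python
def interoperability(platform_sites):
    """Return (n_common, n_union, percent_common) for per-platform CpG site sets.

    Assumption: interoperability = sites shared by all platforms, expressed
    as a percentage of sites covered by at least one platform.
    """
    sets = [set(sites) for sites in platform_sites.values()]
    if not sets:
        return 0, 0, 0.0
    common = set.intersection(*sets)
    union = set.union(*sets)
    pct = 100.0 * len(common) / len(union) if union else 0.0
    return len(common), len(union), pct


# Toy (chromosome, position) CpG sites; hypothetical values for illustration.
toy = {
    "platform_A": {("chr1", 100), ("chr1", 150), ("chr1", 300)},
    "platform_B": {("chr1", 100), ("chr1", 300), ("chr2", 50)},
    "platform_C": {("chr1", 100), ("chr1", 150), ("chr1", 300), ("chr2", 75)},
}

n_common, n_union, pct = interoperability(toy)
print(n_common, n_union, round(pct, 1))  # prints: 2 5 40.0
```

Imputation raises this percentage by adding predicted values for sites that a platform did not measure, which enlarges the intersection without changing the union.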

**Fig. 1: Technology and design comparison of TBS platforms.**

**Fig. 2: Sequencing performance by the platform.**

**Fig. 3: Platform similarity and feature annotation.**

**Fig. 4: Platform reproducibility and concordance of DNA methylation calls.**

**Fig. 5: Differential methylation calls by the platform and imputation.**

Data availability

The datasets generated and analyzed in the current study, including all raw targeted bisulfite sequencing, WGBS of Ref.gDNA and Nanopore sequencing data, have been deposited in the European Nucleotide Archive repository under accession number PRJEB46506 and are freely available. Raw WGBS sequencing data for the Coriell-NA12878 WGBS_EC sample generated by the ENCODE Project Consortium²⁶ were downloaded from the ENCODE Project (experiment: ENCSR890UQO, library: ENCLB898WPW) (https://www.encodeproject.org/experiments/ENCSR890UQO/), and CpG count files for the WGBS_IL sample were downloaded from Illumina BaseSpace Hub (https://basespace.illumina.com/datacentral) under sample name WGBS_P3 from the HiSeq 4000: TruSeq DNA Methylation (NA12878, 2 × 76) dataset.

Code availability

The code used for annotation, differential methylation analysis, plotting and imputation is available in the GitHub repository at https://github.com/ucl-medical-genomics/EpiCapture.

References

Schubeler, D. Function and information content of DNA methylation. Nature 517, 321–326 (2015).
Laird, P. W. Principles and challenges of genomewide DNA methylation analysis. Nat. Rev. Genet. 11, 191–203 (2010).
Stirzaker, C., Taberlay, P. C., Statham, A. L. & Clark, S. J. Mining cancer methylomes: prospects and challenges. Trends Genet. 30, 75–84 (2014).
Gu, H. et al. Genome-scale DNA methylation mapping of clinical samples at single-nucleotide resolution. Nat. Methods 7, 133–136 (2010).
Guo, S. et al. Identification of methylation haplotype blocks aids in deconvolution of heterogeneous tissue samples and tumor tissue-of-origin mapping from plasma DNA. Nat. Genet. 49, 635–642 (2017).
Gnirke, A. et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat. Biotechnol. 27, 182–189 (2009).
Meissner, A. et al. Reduced representation bisulfite sequencing for comparative high-resolution DNA methylation analysis. Nucleic Acids Res. 33, 5868–5877 (2005).
Kacmarczyk, T. J. et al. ‘Same difference’: comprehensive evaluation of four DNA methylation measurement platforms. Epigenetics Chromatin 11, 21 (2018).
Warnecke, P. M. et al. Detection and measurement of PCR bias in quantitative methylation analysis of bisulphite-treated DNA. Nucleic Acids Res. 25, 4422–4426 (1997).
Wojdacz, T. K., Borgbo, T. & Hansen, L. L. Primer design versus PCR bias in methylation independent PCR amplifications. Epigenetics 4, 231–234 (2009).
Ebbert, M. T. et al. Evaluating the necessity of PCR duplicate removal from next-generation sequencing data and a comparison of approaches. BMC Bioinformatics 17, 239 (2016).
Kivioja, T. et al. Counting absolute numbers of molecules using unique molecular identifiers. Nat. Methods 9, 72–74 (2011).
Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).
Rhee, I. et al. DNMT1 and DNMT3b cooperate to silence genes in human cancer cells. Nature 416, 552–556 (2002).
Simpson, J. T. et al. Detecting DNA cytosine methylation using nanopore sequencing. Nat. Methods 14, 407–410 (2017).
Gershman, A. et al. Epigenetic patterns in a complete human genome. Science 376, eabj5089 (2022).
Angermueller, C., Lee, H. J., Reik, W. & Stegle, O. DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning. Genome Biol. 18, 67 (2017).
Zou, L. S. et al. BoostMe accurately predicts DNA methylation values in whole-genome bisulfite sequencing of multiple human tissues. BMC Genomics 19, 390 (2018).
Lister, R. et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315–322 (2009).
Liu, M. C. et al. Sensitive and specific multi-cancer detection and localization using methylation signatures in cell-free DNA. Ann. Oncol. 31, 745–759 (2020).
Landau, D. A. et al. Locally disordered methylation forms the basis of intratumor methylome variation in chronic lymphocytic leukemia. Cancer Cell 26, 813–825 (2014).
Li, S. et al. Distinct evolution and dynamics of epigenetic and genetic heterogeneity in acute myeloid leukemia. Nat. Med. 22, 792–799 (2016).
Rosenthal, R. et al. Neoantigen-directed immune escape in lung cancer evolution. Nature 567, 479–485 (2019).
Olova, N. et al. Comparison of whole-genome bisulfite sequencing library preparation strategies identifies sources of biases affecting DNA methylation data. Genome Biol. 19, 33 (2018).
Li, Q. et al. Post-conversion targeted capture of modified cytosines in mammalian and plant genomes. Nucleic Acids Res. 43, e81 (2015).
Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for bisulfite-seq applications. Bioinformatics 27, 1571–1572 (2011).
Ewels, P., Magnusson, M., Lundin, S. & Kaller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048 (2016).
Quinlan, A. R. BEDTools: the Swiss-army tool for genome feature analysis. Curr. Protoc. Bioinformatics 47, 11.12.1–11.12.34 (2014).
Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).
Zhu, L. J. et al. ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data. BMC Bioinformatics 11, 237 (2010).
Akalin, A. et al. methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. Genome Biol. 13, R87 (2012).
Wang, H. Q., Tuominen, L. K. & Tsai, C. J. SLIM: a sliding linear model for estimating the proportion of true null hypotheses in datasets with dependence structures. Bioinformatics 27, 225–231 (2011).
Cavalcante, R. G. & Sartor, M. A. annotatr: genomic regions in context. Bioinformatics 33, 2381–2383 (2017).
Lawrence, M. et al. Software for computing and annotating genomic ranges. PLoS Comput. Biol. 9, e1003118 (2013).
Morgan, M. & Shepherd, L. AnnotationHub: client to access AnnotationHub resources. R package version 3.2.0. https://bioconductor.org/packages/release/bioc/html/AnnotationHub.html (2022).
Lawrence, M. HelloRanges: introduce *Ranges to bedtools users. R package version 1.20.0. https://bioconductor.org/packages/release/bioc/html/HelloRanges.html (2022).
Khan, A. & Mathelier, A. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets. BMC Bioinformatics 18, 287 (2017).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

Acknowledgements

M.T. received funding from the European Union’s Seventh Framework Programme (Marie Skłodowska-Curie Actions FP7/2007-2013/WHRI-ACADEMY-608765); the Danish Council for Strategic Research (1309-00006B); the Ministry of Education, Science and Technological Development of Serbia (2011-2019/III-41026 and 451-03-68/2020-14/200043); and the Science Fund of the Republic of Serbia (PROMIS/2020/6060876). I.M. is supported by the Biotechnology and Biological Sciences Research Council (grant no. BB/M009513/1). S.B. has received funding from the Wellcome Trust (218274/Z/19/Z) and a Royal Society Wolfson Research Merit Award (WM100023). A.F. received support from the UCL/UCLH Biomedical Research Centre, the Medical Research Council (MR/M025411/1), Prostate Cancer UK (MA_TR15_009) and the Biotechnology and Biological Sciences Research Council (BB/R009295/1). S.R. received funding from Orchid. We further acknowledge support from D. Turner and B. Sipos (Oxford Nanopore Technologies) for the generation of the Nanopore sequencing data and from the CRUK–UCL Centre-funded Genomics and Genome Engineering and Bioinformatics Translational Technology Platforms.

Author information

Authors and Affiliations

University College London, UCL Cancer Institute, London, UK
Miljana Tanić, Ismail Moghul, Simon Rodney, Pawan Dhami, James Barrett, Andrew Feber & Stephan Beck
Institute for Oncology and Radiology of Serbia, Experimental Oncology Department, Belgrade, Serbia
Miljana Tanić
NIHR Biomedical Research Centre, Guy’s and St. Thomas’ NHS Foundation Trust, Great Maze Pond, London, UK
Pawan Dhami & Heli Vaikkinen
University College London, Genomics and Genome Engineering Translational Technology Platform, London, UK
Heli Vaikkinen
University College London, Bill Lyons Informatics Centre, London, UK
John Ambrose
University College London, Division of Surgery and Interventional Science, London, UK
Andrew Feber
Royal Marsden Hospital, Molecular Pathology, London, UK
Andrew Feber

Contributions

M.T., A.F. and S.B. conceived and designed the study. M.T. and S.R. performed the hybridization capture and RRBS experiments. P.D. and H.V. sequenced the libraries. M.T. and J.B. processed raw sequencing data. M.T. performed analysis of TBS data. I.M. analyzed WGBS and Nanopore data and performed imputation analysis. M.T., A.F. and S.B. interpreted the results. M.T., A.F. and S.B. wrote the manuscript. All authors read and approved the manuscript.

Corresponding authors

Correspondence to Miljana Tanić or Stephan Beck.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks Miguel Branco, Alexander Dobrovic and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Sequencing data processing quality metrics produced by MultiQC.

a Bismark alignment rates for uniquely aligned, ambiguously aligned and unaligned reads for each sample by platform; b Percentage of reads aligning to the top or bottom DNA strand for each sample by platform; c Global methylation levels of CpG dinucleotides for each sample by platform; d Global cytosine methylation level in the CHG context for each sample by platform, used as an estimate of sodium bisulfite under-conversion rates; e Global cytosine methylation level in the CHH context for each sample by platform, used as an estimate of sodium bisulfite under-conversion rates; f,g M-bias plots showing the average percentage methylation and coverage across read length for each sample. Each line represents a sample. Methylation bias for the forward sequencing read by platform (f); methylation bias for the reverse sequencing read by platform (g).

Extended Data Fig. 2 Target depth of coverage.

The fraction of targets covered at a given sequencing depth for each sample by platform: Agilent (a), Illumina (b) and Roche (c). Each sample is represented by a line.
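A curve of this kind can be sketched as follows, assuming each target is summarized by a single depth value (for example, its mean read depth) and that "covered" means a depth at or above the threshold. The function name `fraction_covered`, the thresholds and the depth values are hypothetical and chosen only for illustration.

```python
def fraction_covered(target_depths, thresholds):
    """Map each depth threshold to the fraction of targets at or above it."""
    n = len(target_depths)
    if n == 0:
        return {t: 0.0 for t in thresholds}
    return {t: sum(d >= t for d in target_depths) / n for t in thresholds}


# Hypothetical per-target depths for one sample.
depths = [0, 5, 12, 30, 48, 100]
curve = fraction_covered(depths, [1, 10, 30])
for threshold, fraction in curve.items():
    print(f">= {threshold}x: {fraction:.2f}")
```

Plotting these fractions against the thresholds, one line per sample, gives a cumulative coverage curve like the one the caption describes.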

Extended Data Fig. 3 Intra-platform concordance.

Scatterplots showing pairwise Pearson correlation coefficients for Coriell NA12878 data from WGBS EC vs. WGBS IL (a), Nanopore vs. WGBS IL (b) and Nanopore vs. WGBS EC (c).
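A pairwise comparison of this kind can be sketched as a Pearson correlation computed over the CpG sites that both datasets cover. The sketch below assumes each dataset is a mapping from a CpG identifier to a methylation fraction between 0 and 1; the function name `pearson_on_shared` and the toy values are hypothetical, and any coverage filtering applied in the study is not modeled.

```python
from math import sqrt


def pearson_on_shared(a, b):
    """Return (r, n_shared): Pearson r over CpG sites present in both mappings."""
    shared = sorted(a.keys() & b.keys())
    if len(shared) < 2:
        raise ValueError("need at least two shared CpG sites")
    x = [a[k] for k in shared]
    y = [b[k] for k in shared]
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    vx = sum((xi - mx) ** 2 for xi in x)
    vy = sum((yi - my) ** 2 for yi in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant values")
    return cov / sqrt(vx * vy), len(shared)


# Hypothetical methylation fractions keyed by (chromosome, position).
dataset_1 = {("chr1", 100): 0.0, ("chr1", 200): 0.5, ("chr1", 300): 1.0, ("chr1", 400): 0.2}
dataset_2 = {("chr1", 100): 0.1, ("chr1", 200): 0.35, ("chr1", 300): 0.6, ("chr2", 50): 0.9}

r, n_shared = pearson_on_shared(dataset_1, dataset_2)
print(f"r = {r:.3f} over {n_shared} shared CpG sites")
```

Restricting the calculation to shared sites matters here because the platforms differ in which CpG sites they cover.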

Extended Data Fig. 4 Platform interoperability.

Interoperability between platforms for Coriell NA12878 (left) and Ref.gDNA (right) before imputation (first row), after imputation without a distance threshold (second row), after imputation with a 1,000-bp distance threshold (third row) and after imputation with a 25-bp distance threshold (fourth row). Venn diagrams show CpGs overlapping between the platforms.
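One way to read the distance threshold is as a filter that keeps an imputed CpG only if a measured CpG lies within the given number of base pairs on the same chromosome. The sketch below implements that reading; it is an assumption made for illustration, not the procedure confirmed by the caption, and the function name `filter_by_distance` and the positions are hypothetical.

```python
from bisect import bisect_left


def filter_by_distance(imputed_positions, measured_positions, max_distance=None):
    """Keep imputed positions whose nearest measured CpG is within max_distance bp.

    Positions are assumed to lie on one chromosome. max_distance=None applies
    no threshold and keeps every imputed position.
    """
    if max_distance is None:
        return list(imputed_positions)
    measured = sorted(measured_positions)
    kept = []
    for pos in imputed_positions:
        i = bisect_left(measured, pos)
        # The nearest measured site is the one just before or just after pos.
        neighbors = measured[max(i - 1, 0):i + 1]
        if neighbors and min(abs(pos - m) for m in neighbors) <= max_distance:
            kept.append(pos)
    return kept


measured = [1000, 5000]            # hypothetical measured CpG positions
imputed = [1010, 1500, 4990, 7000]  # hypothetical imputed CpG positions

print(filter_by_distance(imputed, measured))        # no threshold
print(filter_by_distance(imputed, measured, 1000))  # 1,000-bp threshold
print(filter_by_distance(imputed, measured, 25))    # 25-bp threshold
```

Under this reading, a tighter threshold trades the number of shared CpG sites for closer support from measured data, which is consistent with the row order of the panels.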

Supplementary information

Supplementary Information

Supplementary Figs. 1–10

Reporting Summary

Supplementary Tables 1–5

About this article

Cite this article

Tanić, M., Moghul, I., Rodney, S. et al. Comparison and imputation-aided integration of five commercial platforms for targeted DNA methylome analysis. Nat Biotechnol 40, 1478–1487 (2022). https://doi.org/10.1038/s41587-022-01336-9

Received: 06 September 2021
Accepted: 28 April 2022
Published: 02 June 2022
Issue Date: October 2022
DOI: https://doi.org/10.1038/s41587-022-01336-9
