Bisulfite sequencing has been the gold standard for mapping DNA modifications including 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) for decades [1–4]. However, this harsh chemical treatment degrades the majority of the DNA and generates sequencing libraries with low complexity [2,5,6]. Here, we present a bisulfite-free and base-level-resolution sequencing method, TET-assisted pyridine borane sequencing (TAPS), for detection of 5mC and 5hmC. TAPS combines ten-eleven translocation (TET) oxidation of 5mC and 5hmC to 5-carboxylcytosine (5caC) with pyridine borane reduction of 5caC to dihydrouracil (DHU). Subsequent PCR converts DHU to thymine, enabling a C-to-T transition of 5mC and 5hmC. TAPS detects modifications directly with high sensitivity and specificity, without affecting unmodified cytosines. This method is nondestructive, preserving DNA fragments over 10 kilobases long. We applied TAPS to the whole-genome mapping of 5mC and 5hmC in mouse embryonic stem cells and show that, compared with bisulfite sequencing, TAPS results in higher mapping rates, more even coverage and lower sequencing costs, thus enabling higher quality, more comprehensive and cheaper methylome analyses.
The software used to process TAPS data can be downloaded from https://bitbucket.org/bsblabludwig/astair.
All sequencing data are available through GEO Series accession code GSE112520. All relevant additional data have been published with the manuscript, either as part of the main text or in the Supplementary Information.
Lister, R. et al. Global epigenomic reconfiguration during mammalian brain development. Science 341, 1237905 (2013).
Raiber, E.-A., Hardisty, R., van Delft, P. & Balasubramanian, S. Mapping and elucidating the function of modified bases in DNA. Nat. Rev. Chem. 1, 0069 (2017).
Yu, M. et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell 149, 1368–1380 (2012).
Booth, M. J. et al. Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science 336, 934–937 (2012).
Tanaka, K. & Okamoto, A. Degradation of DNA by bisulfite treatment. Bioorg. Med. Chem. Lett. 17, 1912–1915 (2007).
Olova, N. et al. Comparison of whole-genome bisulfite sequencing library preparation strategies identifies sources of biases affecting DNA methylation data. Genome Biol. 19, 33 (2018).
Li, E. & Zhang, Y. DNA methylation in mammals. Cold Spring Harb. Perspect. Biol. 6, a019133 (2014).
Tahiliani, M. et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324, 930–935 (2009).
Ito, S. et al. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333, 1300–1303 (2011).
Gal-Yam, E. N., Saito, Y., Egger, G. & Jones, P. A. Cancer epigenetics: modifications, screening, and therapy. Annu. Rev. Med. 59, 267–280 (2008).
Vasanthakumar, A. & Godley, L. A. 5-hydroxymethylcytosine in cancer: significance in diagnosis and therapy. Cancer Genet. 208, 167–177 (2015).
Chan, K. C. et al. Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing. Proc. Natl Acad. Sci. USA 110, 18761–18768 (2013).
Song, C. X. et al. 5-Hydroxymethylcytosine signatures in cell-free DNA provide information about tumor types and stages. Cell Res. 27, 1231–1242 (2017).
Xia, B. et al. Bisulfite-free, base-resolution analysis of 5-formylcytosine at the genome scale. Nat. Methods 12, 1047–1050 (2015).
Zhu, C. et al. Single-cell 5-formylcytosine landscapes of mammalian early embryos and ESCs at single-base resolution. Cell Stem Cell 20, 720–731 e725 (2017).
Lu, X., Zhao, B. S. & He, C. TET family proteins: oxidation activity, interacting molecules, and functions in diseases. Chem. Rev. 115, 2225–2239 (2015).
He, Y. F. et al. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 333, 1303–1307 (2011).
Sato, S., Sakamoto, T., Miyazawa, E. & Kikugawa, Y. One-pot reductive amination of aldehydes and ketones with alpha-picoline-borane in methanol, in water, and in neat conditions. Tetrahedron 60, 7899–7906 (2004).
Liu, J. & Doetsch, P. W. Escherichia coli RNA and DNA polymerase bypass of dihydrouracil: mutagenic potential via transcription and replication. Nucleic Acids Res. 26, 1707–1712 (1998).
Song, C. X. et al. Genome-wide profiling of 5-formylcytosine reveals its roles in epigenetic priming. Cell 153, 678–691 (2013).
Lu, X. et al. Chemical modification-assisted bisulfite sequencing (CAB-Seq) for 5-carboxylcytosine detection in DNA. J. Am. Chem. Soc. 135, 9315–9317 (2013).
Pais, J. E. et al. Biochemical characterization of a Naegleria TET-like oxygenase and its application in single molecule sequencing of 5-methylcytosine. Proc. Natl Acad. Sci. USA 112, 4316–4321 (2015).
Incarnato, D., Krepelova, A. & Neri, F. High-throughput single nucleotide variant discovery in E14 mouse embryonic stem cells provides a new reference genome assembly. Genomics 104, 121–127 (2014).
Holmes, E. E. et al. Performance evaluation of kits for bisulfite-conversion of DNA from tissues, cell lines, FFPE tissues, aspirates, lavages, effusions, plasma, serum, and urine. PLoS ONE. 9, e93933 (2014).
Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Illumina. Illumina Whole-genome Bisulfite Sequencing on the HiSeq 3000/HiSeq 4000 Systems https://www.illumina.com/content/dam/illumina-marketing/documents/products/appnotes/hiseq3000-hiseq4000-wgbs-application-note-770-2015-052.pdf (Illumina, 2016).
Wen, L. et al. Whole-genome analysis of 5-hydroxymethylcytosine and 5-methylcytosine at base resolution in the human brain. Genome Biol. 15, R49 (2014).
Flusberg, B. A. et al. Direct detection of DNA methylation during single-molecule, real-time sequencing. Nat. Methods 7, 461–465 (2010).
Jain, M. et al. Nanopore sequencing and assembly of a human genome with ultra-long reads. Nat. Biotechnol. 36, 338–345 (2018).
Song, C. X. et al. Selective chemical labeling reveals the genome-wide distribution of 5-hydroxymethylcytosine. Nat. Biotechnol. 29, 68–72 (2011).
Schutsky, E. K. et al. Nondestructive, base-resolution sequencing of 5-hydroxymethylcytosine using a DNA deaminase. Nat. Biotechnol. 36, 1083–1090 (2018).
Smallwood, S. A. et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat. Methods 11, 817–820 (2014).
Luo, C. et al. Single-cell methylomes identify neuronal subtypes and regulatory elements in mammalian cortex. Science 357, 600–604 (2017).
Wu, H., Wu, X. & Zhang, Y. Base-resolution profiling of active DNA demethylation using MAB-seq and caMAB-seq. Nat. Protoc. 11, 1081–1100 (2016).
Illumina. Whole-genome Bisulfite Sequencing for Methylation Analysis https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/samplepreps_legacy/WGBS_for_Methylation_Analysis_Guide_15021861_B.pdf (Illumina, 2015).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Huang, W., Li, L., Myers, J. R. & Marth, G. T. ART: a next-generation sequencing read simulator. Bioinformatics 28, 593–594 (2012).
We would like to acknowledge P. Spingardi, G. Berridge and B. Kessler for helping with the HPLC–MS/MS; P. Brennan and G.F. Ruda for helping with the NMR; T. Brown and A. H. El-Sagheer for the DNA synthesis; F. Howe, S. Kriaucionis and C. Goding for critical reading of this manuscript. This work was supported by the Ludwig Institute for Cancer Research. The C.-X.S. laboratory is also supported by Cancer Research UK (grant nos. C63763/A26394 and C63763/A27122), NIHR Oxford Biomedical Research Centre and Conrad N. Hilton Foundation. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health. F.Y., L.C. and Y.B. are supported by China Scholarship Council.
A patent application has been filed by Ludwig Institute for Cancer Research Ltd for the technology disclosed in this publication.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Integrated supplementary information
Supplementary Figure 1 Borane-containing compounds screening and proposed mechanism for the borane reaction of 5caC.
(a) Borane-containing compounds screened for conversion of 5caC to DHU in an 11mer oligo, with conversion rate estimated by MALDI. 2-picoline borane (pic-borane), borane pyridine, tert-butylamine borane, and ammonia borane could completely convert 5caC to DHU while ethylenediamine borane and dimethylamine borane only gave around 30% conversion rate. No detectable products were measured (n.d.) with morpholine borane, 4-methylmorpholine borane, trimethylamine borane, and cyclohexylamine borane. Other reducing agents such as sodium borohydride and sodium tri(acetoxy)borohydride decomposed rapidly in acidic media and led to incomplete conversion. Sodium cyanoborohydride was not used due to potential for hydrogen cyanide formation under acidic conditions. Pic-borane and pyridine borane were chosen because of the complete conversion, low toxicity and high stability. (b) Proposed mechanism for the borane reaction of 5caC to DHU.
Proposed mechanism for the borane reaction of 5fC to DHU.
Supplementary Figure 3 MALDI characterization of 5fC and 5caC containing model DNA oligos treated by pic-borane with or without the blocking of 5fC and 5caC.
5fC was blocked with O-ethylhydroxylamine, forming an oxime that resists pic-borane conversion, whereas 5caC was blocked with ethylamine via EDC conjugation, forming an amide that resists pic-borane conversion. All experiments were performed once. Calculated MS is shown in black and observed MS is shown in red.
Supplementary Figure 4 MALDI characterization of 5mC and 5hmC containing model DNA oligos treated by KRuO4 and pic-borane with or without blocking of 5hmC.
5hmC could be blocked by βGT with glucose and converted to 5gmC. 5mC, 5hmC and 5gmC could not be converted by pic-borane. 5hmC could be oxidized by KRuO4 to 5fC, and then converted to DHU by pic-borane. All experiments were performed once. Calculated MS is shown in black and observed MS is shown in red.
(a) Illustration of the restriction enzyme digestion assay used to confirm the sequence change caused by TAPS. (b) TaqαI digestion tests confirming the C-to-T transition caused by TAPS. A PCR-amplified 222 bp model DNA with a TaqαI restriction site in the middle can be cleaved, whereas the amplified product of 5mC-TAPS stayed intact, suggesting loss of the restriction site and hence a C-to-T transition. TAPS did not cause a C-to-T transition at unmethylated cytosine, since C-TAPS was cleaved in the same way as the original untreated C. Experiment was performed once.
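The logic of the digestion assay can be illustrated with a minimal sketch. It assumes the TaqαI recognition sequence TCGA and looks only at the top strand; the toy amplicon, the position of the methylated cytosine and the function names are hypothetical and are not taken from the paper.

```python
TAQI_SITE = "TCGA"  # TaqαI recognition sequence (top strand only in this sketch)

def taps_read(seq, modified_positions):
    """Sequence as read after TAPS and PCR: each modified C is read as T."""
    return "".join(
        "T" if (base == "C" and i in modified_positions) else base
        for i, base in enumerate(seq)
    )

def is_cleavable(seq):
    """True if the recognition site is still present in the amplicon."""
    return TAQI_SITE in seq

# Hypothetical toy amplicon with one TCGA site; the C of the site is at index 5.
amplicon = "GGATTCGAACCT"

methylated = taps_read(amplicon, {5})     # 5mC-TAPS: site becomes TTGA
unmethylated = taps_read(amplicon, set())  # C-TAPS: sequence unchanged

print(is_cleavable(amplicon))      # untreated control
print(is_cleavable(methylated))    # site lost after C-to-T transition
print(is_cleavable(unmethylated))  # site retained
```

Under these assumptions, an intact band after digestion corresponds to `is_cleavable` returning `False`, which is the readout interpreted above as a C-to-T transition.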
Supplementary Figure 6 Complete C-to-T transition induced after TAPS, TAPSβ and CAPS as indicated by Sanger sequencing.
Model DNA containing single methylated and single hydroxymethylated CpG sites was prepared as described in Supplementary Note 3. TAPS conversion was done following the NgTET1 Oxidation and Pyridine borane reduction protocols described in the Methods. TAPSβ conversion was done following the 5hmC blocking, NgTET1 Oxidation and Pyridine borane reduction protocols. CAPS conversion was done following the 5hmC oxidation and Pyridine borane reduction protocols. After conversion, 1 ng of converted DNA sample was PCR amplified by Taq DNA Polymerase and processed for Sanger sequencing. TAPS converted both 5mC and 5hmC to T. TAPSβ selectively converted 5mC whereas CAPS selectively converted 5hmC. None of the three methods caused conversion on unmodified cytosine and other bases.
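The selectivity of the three methods described in this caption can be summarized as a small lookup. This is an illustrative sketch only: the single-letter codes `m` (5mC) and `h` (5hmC), the toy template and the function name are assumptions made for the example.

```python
# Which modified bases each method reads as T, as stated in the caption.
CONVERTED = {
    "TAPS": {"m", "h"},   # converts both 5mC and 5hmC
    "TAPSbeta": {"m"},    # 5hmC is blocked, so only 5mC converts
    "CAPS": {"h"},        # only 5hmC is oxidized and converted
}

def read_out(template, method):
    """Return the base calls expected after conversion and PCR."""
    converted = CONVERTED[method]
    calls = []
    for base in template:
        if base in converted:
            calls.append("T")    # converted modification is read as T
        elif base in ("m", "h"):
            calls.append("C")    # unconverted modification is read as C
        else:
            calls.append(base)   # unmodified bases are unchanged
    return "".join(calls)

# Hypothetical template: one 5mC (m) and one 5hmC (h), plus an unmodified C.
template = "AmGTChGA"
print(read_out(template, "TAPS"))      # ATGTCTGA
print(read_out(template, "TAPSbeta"))  # ATGTCCGA
print(read_out(template, "CAPS"))      # ACGTCTGA
```

Comparing the three read-outs position by position distinguishes 5mC (T in TAPS and TAPSβ) from 5hmC (T in TAPS and CAPS).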
Supplementary Figure 7 TAPS is compatible with various DNA and RNA polymerases and induces complete C-to-T transition shown by Sanger sequencing.
The model DNA containing methylated CpG sites for the polymerase test and primer sequences is described in Supplementary Note 3. After TAPS treatment, 5mC was converted to DHU. KAPA HiFi Uracil plus polymerase, Taq polymerase, and Vent exo- polymerase read DHU as T and therefore induce complete C-to-T transition after PCR. Alternatively, primer extension was done with a biotin-labelled primer and isothermal polymerases including Klenow fragment, Bst DNA polymerase, and phi29 DNA polymerase. The newly synthesized DNA strand was separated by Dynabeads® MyOne Streptavidin C1 and then amplified by PCR with Taq polymerase and processed for Sanger sequencing. T7 RNA polymerase could efficiently bypass DHU and insert adenine opposite the DHU site, which is shown by RT-PCR and Sanger sequencing. Other commercial polymerases including KAPA HiFi polymerase, NEB Q5 polymerase, and Phusion polymerase were also tested but failed to amplify DHU containing DNA efficiently.
A model DNA containing one DHU/U/T/C modification was synthesized with the corresponding DNA oligos as described in Supplementary Note 3. Standard curves for each model DNA with DHU/U/T/C modification were plotted based on qPCR reactions with 1:10 serial dilutions of the model DNA input (from 0.1 pg to 1 ng). Every qPCR experiment was run in triplicate (n = 3 technical replicates). The slope of the regression between the log concentration (ng) values and the average Ct values was calculated with the SLOPE function in Excel. PCR efficiency was calculated using the following equation: Efficiency (%) = (10^(-1/Slope) - 1) × 100%. The amplification factor was calculated using the following equation: Amplification factor = 10^(-1/Slope). The PCR efficiencies for the model DNAs containing DHU, T or C were almost the same, which demonstrates that DHU can be read through as a regular base and does not cause PCR bias.
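The standard-curve calculation can be reproduced outside Excel with a short sketch. It assumes an ordinary least-squares slope of mean Ct against log10 input; the Ct values in the usage example are hypothetical, not measured data.

```python
import math

def regression_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def pcr_efficiency(inputs_ng, mean_ct):
    """Return (slope, amplification factor, efficiency in %)."""
    log_conc = [math.log10(c) for c in inputs_ng]
    slope = regression_slope(log_conc, mean_ct)
    amplification = 10 ** (-1 / slope)
    efficiency = (amplification - 1) * 100
    return slope, amplification, efficiency

# Hypothetical 1:10 dilution series from 0.1 pg to 1 ng, with made-up mean Ct values.
inputs_ng = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]
mean_ct = [28.3, 25.0, 21.7, 18.3, 15.0]
slope, amplification, efficiency = pcr_efficiency(inputs_ng, mean_ct)
print(f"slope={slope:.3f} amplification={amplification:.3f} efficiency={efficiency:.1f}%")
```

A slope near -3.32 corresponds to an amplification factor of 2, that is, perfect doubling per cycle and 100% efficiency.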
(a) Agarose gel images of the TaqαI-digestion assay confirming complete 5mC to T conversion in all samples regardless of DNA fragment length. 194 bp model sequence from the lambda genome was PCR amplified after TAPS and digested with TaqαI enzyme. The PCR product amplified from unconverted sample could be cleaved, whereas products amplified on TAPS treated samples stayed intact, suggesting loss of restriction site and hence complete C-to-T transition. Experiment was performed once. (b) The C-to-T conversion percentage was estimated by gel band quantification as 100% for all DNA fragment lengths tested.
The combination of mTet1 and pyridine borane achieved the highest conversion rate of methylated C (96.5%, calculated with fully CpG methylated Lambda DNA) and the lowest conversion rate of unmodified C (0.23%, calculated with 2 kb unmodified spike-in), compared to other conditions with NgTET1 or pic-borane. Shown above are the conversion rates ± SE of all tested cytosine sites (N of 2 kb unmodified positions = 1041, N of covered bacteriophage lambda CpG positions used: mTet1 pyridine borane 6226, mTet1 pic-borane 5871, NgTET1 pyridine borane 5768, NgTET1 pic-borane 6226).
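One way such a conversion rate ± SE could be computed is sketched below. The caption does not state how the SE was derived, so this assumes a per-site rate (converted reads over total reads) followed by the standard error of the mean across sites; the read counts are hypothetical.

```python
import math
import statistics

def site_rate(converted_reads, total_reads):
    """Fraction of reads at one cytosine site that were read as T."""
    return converted_reads / total_reads

def mean_and_se(rates):
    """Mean conversion rate across sites and its standard error (assumed definition)."""
    mean = statistics.fmean(rates)
    se = statistics.stdev(rates) / math.sqrt(len(rates))
    return mean, se

# Hypothetical (converted, total) read counts at four cytosine sites.
counts = [(9, 10), (10, 10), (19, 20), (19, 20)]
rates = [site_rate(c, t) for c, t in counts]
mean, se = mean_and_se(rates)
print(f"conversion rate = {mean:.3f} +/- {se:.3f}")
```

Pooling all reads before dividing would weight high-coverage sites more heavily; the per-site average used here is a design choice of this sketch, not a statement about the original analysis.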
Supplementary Figure 11 TAPS resulted in more even coverage and fewer uncovered positions than WGBS.
Comparison of coverage depth across (a) all bases (N = 2 725 765 481) and (b) CpG sites (N = 43 445 914, based on mm9 genome, which includes potential genetic variants in E14 genome) between WGBS and TAPS, computed on both strands. For ‘TAPS (down-sampled)’, random reads out of all mapped TAPS reads were selected so that the median coverage matched the median coverage of WGBS. Positions with coverage above 50× are shown in the last bin.
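The down-sampling and the capped last bin can be sketched as follows. The sketch assumes coverage scales linearly with the number of reads kept, so the fraction to keep is the ratio of medians; the function names and the toy coverage values are hypothetical.

```python
import random
import statistics
from collections import Counter

def downsample_fraction(target_coverage, source_coverage):
    """Fraction of source reads to keep so that the medians match (capped at 1)."""
    ratio = statistics.median(target_coverage) / statistics.median(source_coverage)
    return min(1.0, ratio)

def downsample_reads(reads, fraction, seed=0):
    """Randomly select the requested fraction of reads without replacement."""
    rng = random.Random(seed)
    return rng.sample(reads, round(len(reads) * fraction))

def capped_histogram(coverage, cap=50):
    """Count positions per depth; every depth above `cap` goes in bin cap + 1."""
    return Counter(min(depth, cap + 1) for depth in coverage)

# Hypothetical per-position coverage for the two libraries.
wgbs_cov = [8, 10, 12, 9, 11]
taps_cov = [18, 20, 22, 19, 40]
fraction = downsample_fraction(wgbs_cov, taps_cov)
kept = downsample_reads(list(range(1000)), fraction)
print(fraction, len(kept))
print(capped_histogram([0, 3, 50, 51, 200]))
```

In practice the coverage after down-sampling would be recomputed from the selected reads to confirm that the medians match.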
Average modification levels in CpG islands (binned into 20 windows) and 4 kb flanking regions (binned into 50 equally sized windows). Bins with coverage below 3 reads were ignored.
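A minimal sketch of this kind of binned profile is given below. It assumes that "coverage" refers to the summed read count within a bin and that each region (island or flank) is divided into equal-width bins; the coordinates, counts and function names are hypothetical.

```python
def bin_index(pos, start, end, n_bins):
    """Map a position in [start, end) to one of n_bins equal-width bins."""
    return min(n_bins - 1, (pos - start) * n_bins // (end - start))

def profile(sites, start, end, n_bins, min_cov=3):
    """Average modification level per bin.

    sites: iterable of (position, modified_reads, total_reads).
    Bins with fewer than min_cov total reads are reported as None.
    """
    modified = [0] * n_bins
    total = [0] * n_bins
    for pos, mod_reads, all_reads in sites:
        if start <= pos < end:
            b = bin_index(pos, start, end, n_bins)
            modified[b] += mod_reads
            total[b] += all_reads
    return [m / t if t >= min_cov else None for m, t in zip(modified, total)]

# Hypothetical CpG island spanning 1000-1200, split into 20 bins of 10 bp.
island_sites = [(1005, 4, 5), (1015, 1, 2), (1105, 0, 6)]
print(profile(island_sites, 1000, 1200, n_bins=20))

# Hypothetical 4 kb upstream flank, split into 50 bins of 80 bp.
flank_sites = [(-2950, 7, 10)]
print(profile(flank_sites, -3000, 1000, n_bins=50))
```

Averaging the per-region profiles across all islands, while skipping `None` entries, would then give a genome-wide profile of the kind described.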
All CpG sites were binned according to their coverage, and the mean (blue) and the median (orange) modification values are shown in each bin for WGBS (a) and TAPS (b). The CpG sites covered by more than 100 reads are shown in the last bin. The lines represent a linear fit through the data points.
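The coverage binning and linear fit can be sketched in a few lines. This assumes one bin per integer coverage value, with everything above 100 reads pooled in a final bin, and an ordinary least-squares fit; the input values are hypothetical.

```python
import statistics
from collections import defaultdict

def bin_by_coverage(sites, cap=100):
    """Group sites by coverage; sites above `cap` reads share bin cap + 1.

    sites: iterable of (coverage, modification_level).
    Returns {coverage_bin: (mean_level, median_level)}.
    """
    bins = defaultdict(list)
    for coverage, level in sites:
        bins[min(coverage, cap + 1)].append(level)
    return {
        c: (statistics.fmean(levels), statistics.median(levels))
        for c, levels in sorted(bins.items())
    }

def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys on xs."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical (coverage, modification level) pairs.
sites = [(5, 0.8), (5, 0.6), (6, 0.9), (150, 1.0), (300, 0.5)]
summary = bin_by_coverage(sites)
coverages = list(summary)
means = [summary[c][0] for c in coverages]
print(summary)
print(linear_fit(coverages, means))
```

A slope close to zero in such a fit would indicate that the measured modification level does not depend on coverage.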
Average modification levels in 100 kb windows along mouse chromosomes, weighted by the coverage of CpG, and smoothed using a Gaussian weighted moving average filter with window size 10.
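The windowed averaging and smoothing can be sketched as follows. The caption does not specify the Gaussian width or how window edges were handled, so the standard deviation (one fifth of the window size) and the truncated, renormalized edges are assumptions of this sketch, as are the function names and toy values.

```python
import math

def window_means(sites, window=100_000):
    """Coverage-weighted mean modification level per genomic window.

    sites: iterable of (position, modification_level, coverage).
    Returns {window_index: weighted_mean}.
    """
    weighted_sum = {}
    weight = {}
    for pos, level, coverage in sites:
        w = pos // window
        weighted_sum[w] = weighted_sum.get(w, 0.0) + level * coverage
        weight[w] = weight.get(w, 0) + coverage
    return {w: weighted_sum[w] / weight[w] for w in sorted(weight) if weight[w] > 0}

def gaussian_smooth(values, window=10, sigma=None):
    """Gaussian-weighted moving average; sigma defaults to window / 5 (assumed)."""
    sigma = sigma or window / 5
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        weights = [math.exp(-0.5 * ((j - i) / sigma) ** 2) for j in range(lo, hi)]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values[lo:hi])) / total)
    return smoothed

# Hypothetical CpG sites on one chromosome: (position, level, coverage).
sites = [(10, 1.0, 1), (20, 0.0, 3), (150_000, 0.5, 2)]
per_window = window_means(sites)
print(per_window)
print(gaussian_smooth(list(per_window.values())))
```

Weighting by coverage means that well-covered CpG sites contribute more to each window average than sparsely covered ones.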
Supplementary Figure 15 Low-input gDNA and cell-free DNA TAPS libraries prepared with dsDNA KAPA HyperPrep library preparation kit.
Sequencing libraries were successfully constructed with as little as 1 ng of (a) mESC gDNA and (b) cell-free DNA with the KAPA HyperPrep kit. Experiment was performed once. Note that cell-free DNA has a sharp length distribution around 160 bp (nucleosome size) due to plasma nuclease digestion. After library construction, it becomes ~300 bp, which is the sharp band in (b).
About this article
Cite this article
Liu, Y., Siejka-Zielińska, P., Velikova, G. et al. Bisulfite-free direct detection of 5-methylcytosine and 5-hydroxymethylcytosine at base resolution. Nat Biotechnol 37, 424–429 (2019). https://doi.org/10.1038/s41587-019-0041-2