5-methylcytosine (5mC) is the most important DNA modification in mammalian genomes. The ideal method for 5mC localization would be both nondestructive of DNA and direct, without requiring inference based on detection of unmodified cytosines. Here we present direct methylation sequencing (DM-Seq), a bisulfite-free method for profiling 5mC at single-base resolution using nanogram quantities of DNA. DM-Seq employs two key DNA-modifying enzymes: a neomorphic DNA methyltransferase and a DNA deaminase capable of precise discrimination between cytosine modification states. Coupling these activities with deaminase-resistant adapters enables accurate detection of only 5mC via a C-to-T transition in sequencing. By comparison, we uncover a PCR-related underdetection bias with the hybrid enzymatic-chemical TET-assisted pyridine borane sequencing approach. Importantly, we show that DM-Seq, unlike bisulfite sequencing, unmasks prognostically important CpGs in a clinical tumor sample by not confounding 5mC with 5-hydroxymethylcytosine. DM-Seq thus offers an all-enzymatic, nondestructive, faithful and direct method for the reading of 5mC alone.
Sequencing data supporting the findings of this study are available in the NCBI Gene Expression Omnibus (GEO, GSE225975). The plasmid encoding MBP-t-M.MpeI-N374K-His has been made available from Addgene (197985). Relevant DNA sequences are provided in Supplementary Information. Source data are provided with this paper.
We thank Y. Lan, W. Zhou, T. Christopher and the Penn Center for Personalized Diagnostics for useful discussions and reagents. We also thank K. Islam for providing but-2-ynyl-SAM. This work was supported by the National Institutes of Health through grant no. R01-HG010646 (to R.M.K. and H.W.). E.K.S., K.N.B. and J.E.D. were NSF Graduate Research Fellows.
The University of Pennsylvania has patents pending for CxMTase enzymes, DNA deaminase-resistant adapters and the DM-Seq pipeline. R.M.K. has served as a scientific advisory board member for Cambridge Epigenetix (CEGX). W.S.G. is an employee of CEGX and T.W. was supported by a fellowship from CEGX. A.D. and N.D. are employees of Integrated DNA Technologies, Inc. The other authors declare no competing interests.
Peer review information
Nature Chemical Biology thanks Abdulkadir Abakir and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
a) Methods differ in their use of protection or modification steps to alter C, 5mC or 5hmC. They differ in deamination steps with chemical or enzymatic reagents. In each method, C, 5mC, or 5hmC are detected based on the pattern of C-to-T changes in sequencing, resulting in different possible bases that can be confounded with 5mC. b) Shown are the anticipated sequencing results for C, 5mC and 5hmC in CpG versus CpH contexts.
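The readouts that panel b) describes can be sketched as a small lookup, shown here for BS-Seq and DM-Seq only. This is an illustrative sketch, not the authors' code. The function names are hypothetical, and the rules are assumptions drawn from the abstract and from standard bisulfite behaviour: bisulfite deaminates only unmodified C; DM-Seq reads 5mC as T, leaves 5hmC reading as C, and protects unmodified C only in CpG contexts.

```python
STATES = ("C", "5mC", "5hmC")

def expected_read(method, state, context):
    """Return the base ("C" or "T") expected in sequencing.

    method:  "BS-Seq" or "DM-Seq"
    state:   "C", "5mC" or "5hmC"
    context: "CpG" or "CpH"
    The rules below are assumptions for illustration (see lead-in).
    """
    if state not in STATES or context not in ("CpG", "CpH"):
        raise ValueError("unknown state or context")
    if method == "BS-Seq":
        # Assumed: bisulfite deaminates unmodified C; 5mC and 5hmC resist.
        return "T" if state == "C" else "C"
    if method == "DM-Seq":
        if state == "5mC":
            return "T"  # 5mC is detected directly as a C-to-T transition
        if state == "5hmC":
            return "C"  # assumed protected, so not confounded with 5mC
        # Assumed: unmodified C is protected only in CpG contexts.
        return "C" if context == "CpG" else "T"
    raise ValueError("unknown method")

def confounded_with_5mc(method, context):
    """List the other states that give the same read as 5mC."""
    target = expected_read(method, "5mC", context)
    return [s for s in STATES
            if s != "5mC" and expected_read(method, s, context) == target]
```

Under these assumptions, `confounded_with_5mc("BS-Seq", "CpG")` returns `["5hmC"]` and `confounded_with_5mc("DM-Seq", "CpG")` returns an empty list, which mirrors the abstract's point that DM-Seq does not confound 5mC with 5hmC.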
a) A fluorophore-labelled top-strand is duplexed to a complementary bottom strand containing a methylated cytosine. The methylated cytosine is represented with a black oval. The substrate is reacted with either WT M.MpeI + SAM, eM.MpeI + bSAM, or M.MpeI N374K + CxSAM. The half-purple/half-orange oval represents a modified cytosine that can be 5mC, 5bC, or 5cxmC after the action of the MTase variant and SAM analog. The substrate is then deaminated with A3A before duplexing a complement strand. The restriction enzyme TaqαI only cleaves DNA if C is protected from A3A deamination. b) ESI-MS validating generation of 5bC and 5cxmC substrates before A3A reaction. No unmodified C substrate was detected.
A fluorophore-labelled top-strand is duplexed to a complementary strand containing either an unmodified or methylated cytosine (represented with a black oval). The duplex is incubated with M.MpeI N374K and either no SAM, SAM, or CxSAM. The half-black/half-purple oval represents the modified cytosine on the labelled top-strand resulting after the action of the M.MpeI N374K and the SAM analog. Excess of unmodified bottom strand exchanges away the modification on the bottom strand. HpaII cleavage interrogates the modification status of the top strand.
Extended Data Fig. 4 Structurally-informed identification of both 5cxmC and 5pyC as new protected cytosines useful for A3A dependent sequencing.
a) Generation of homogeneously modified PCR substrates containing unnatural cytosines. A DNA template is amplified with a C-depleted forward primer (red) and G-depleted reverse primer (green) as well as dA/G/TTP and a modified dCTP (blue). DNA is then A3A deaminated before amplification. Amplicons are interrogated with the TaqαI restriction enzyme or by Next-Generation Sequencing quantifying all C sites. b) Active site of human A3A (PDB: 5SWW) showing gating tyrosine (orange) which abuts the C5-C6 face of the target cytosine (yellow) and is anticipated to limit the size of the 5-position substituent (dashed yellow line). A cartoon representation is also shown above. c) Summary of cytosine analogs and deamination by A3A. Left: WT MTases, TETs, and βGT make naturally occurring modified cytosines which have different reactivities towards A3A. 5caC and 5ghmC are used in the existing methods, EM-Seq and ACE-Seq, to protect from A3A deamination. Right: 5cxmC and 5pyC are identified as novel, protected A3A substrates, both employed in DM-Seq. Despite their shared utility, 5cxmC and 5pyC differ in their bond types at the 5-position of cytosine, which are determined by their contrasting modes of biochemical and chemical synthesis, respectively.
Extended Data Fig. 5 5pyC adapters improve DNA carboxymethylation efficiency through the synthesis of a 5mC copy strand.
a) Structure of 5pyC adapters. ESI-MS characterizing 5pyC adapters. Expected m/z of the two strands: 10,444.2 and 9,936.6. The phosphorothioate linkage (*) substitutes a sulfur in place of a phosphate in the backbone of the oligonucleotide to minimize nuclease degradation. b) 5pyC adapter ligation experiment. Template DNA was ligated to 5mC- or 5pyC-containing Y-shaped adapters. The template DNA was detected by amplification with internal primers (red) or successful ligation was detected by amplification with Illumina indexing primers (blue). Experiment was performed once. c) Schematic of copy strand synthesis. A copy strand is made by incubation of a copy primer, polymerase, and dA/G/TTPs with 5mdCTP.
a) Experimental scheme. Sheared lambda gDNA is ligated to 5mC- or 5pyC-containing adapters. A copy strand with 5mCs is synthesized before reaction with the CxMTase and either no SAM, SAM, or CxSAM, with the product of this reaction represented by the oval with mixed colors. Subsequent BS or A3A deamination shows efficiency of DNA modification. Data are presented as mean values +/− SD (n = 2 independent experiments). b) Next-Generation Sequencing quantifying efficiency of CxMTase with methylation or carboxymethylation after copy strand synthesis.
The M.CviPI-methylated lambda phage gDNA shows near-complete modification at GpCpG sequence contexts given the known GpC preference for M.CviPI. The enzyme has known off-target and heterogeneous activity at CpCpG sites. The dashed line shows the readout if BS-Seq signal inversely correlates with DM-Seq as anticipated. a) Correlation of BS-Seq to DM-Seq in an M.CviPI-modified substrate. b) Comparison of BS-Seq and DM-Seq modification status by 5’ sequence context. The box shows the lower quartile, median, and upper quartile. Minimum and maximum values are shown by the whiskers. Data in a–b) correspond to the experiment shown in Fig. 3a–b. c) Comparison of BS-Seq, TAPS, TAPS-β, and DM-Seq modification status by 5’ sequence context. The box shows the lower quartile, median, and upper quartile. Minimum and maximum values are shown by the whiskers. d) Percent modification in CpG contexts of BS-Seq, TAPS, TAPS-β, and DM-Seq of 3 different methylated lambda phages. Data in c–d) correspond to the experiment shown in Fig. 3c–d.
Extended Data Fig. 8 Comparison of deamination methods shows that TAPS bias is dependent on both TET and borane-mediated deamination.
a) Workflow for comparing deamination methods. A mixture of unmodified pUC19 DNA, 100% CpG methylated lambda phage, and T4-hmC phage (where all C bases are 5hmC) was glucosylated. Samples were then subjected to either two rounds of TET treatment or no TET treatment. DNA was ligated to the appropriate adapter and subjected to one of four conditions: no deamination, BS, pyridine borane, or A3A. The pyridine borane workflow is equivalent to TAPS-β. The bases deaminated by each method (detected as T by sequencing) are noted, with structures of the deamination products at right, including the non-aromatic DHU. b) Percent reads C as determined by the methylated lambda phage spike-in. The sample with TET and bisulfite indicates efficient conversion of 5mC to 5fC/5caC by TET. c) The proportion of reads mapping to each spike-in is shown. Borane deamination shows decreased reads mapping to the methylated lambda phage, with depletion dependent upon TET oxidation. d) qPCR detection of amplifiable DNA library after each deamination method. Shown are the p-values from a paired two-tailed t-test (n = 4 for each deamination condition, 3e-4 between BS and borane, 2e-5 between BS and A3A). Data are presented as mean values +/− SD. e) Mean library size ± standard deviation for each deamination method. A representative BioAnalyzer trace is shown for each deamination method.
a) Binned CpG analysis using non-overlapping 1 kb bins with at least 20 CpGs covered. Correlation between TAPS and BS-Seq in mESCs in existing datasets (GSE112520). b) Histogram showing that ~2-fold as many 1 kb bins have greater modification detected by BS-Seq than by TAPS. Percent Deviation = (TAPS % T – BS-Seq % C) / (BS-Seq % C). c) Percent deviation of TAPS vs BS-Seq as a function of % modification of CpGs in a given 1 kb bin. The box shows the lower quartile, median, and upper quartile. Minimum and maximum values are shown by the whiskers. d) Imprinting control regions (ICRs) show underdetection of 5mC by TAPS relative to BS-Seq. At bottom is the heatmap representation of individual ICRs. The percent modification outside of ICRs (64.2% vs 60.1%) represents the genome-wide average for each method using 1 kb bins, compared with the average at ICRs alone (41.9% vs 31.6%). e) Plot of the CpG density in individual ICRs versus the percent by which TAPS underestimates the level of 5mC relative to BS-Seq. 28 of 29 ICRs show lower modification density by TAPS than by BS-Seq. The one exception is shown in red. The associated correlation coefficient tracks the % underestimate as a function of CpG density.
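The binning and the Percent Deviation formula in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline. The function names and the `(chrom, pos, modified_reads, total_reads)` input layout are hypothetical. The sketch also assumes that the per-bin level is computed from pooled read counts and that a bin must pass the 20-CpG filter in both datasets; the caption specifies neither.

```python
from collections import defaultdict

BIN_SIZE = 1000  # non-overlapping 1 kb bins
MIN_CPGS = 20    # a bin needs at least this many covered CpGs

def bin_percent_modified(cpgs):
    """Map (chrom, bin_index) -> percent modified for bins passing the filter.

    cpgs: iterable of (chrom, pos, modified_reads, total_reads) tuples
    (hypothetical layout). Reads are pooled per bin, which is an assumption.
    """
    sums = defaultdict(lambda: [0, 0, 0])  # [n_cpgs, modified, total]
    for chrom, pos, modified, total in cpgs:
        if total == 0:
            continue  # CpG not covered
        entry = sums[(chrom, pos // BIN_SIZE)]
        entry[0] += 1
        entry[1] += modified
        entry[2] += total
    return {key: 100.0 * m / t
            for key, (n, m, t) in sums.items() if n >= MIN_CPGS}

def percent_deviation(taps_bins, bs_bins):
    """(TAPS % T - BS-Seq % C) / (BS-Seq % C) for bins present in both.

    Returned as a ratio, exactly as the formula is written; multiply by
    100 to express it as a percentage.
    """
    deviations = {}
    for key in taps_bins.keys() & bs_bins.keys():
        bs = bs_bins[key]
        if bs > 0:  # the ratio is undefined for unmodified bins
            deviations[key] = (taps_bins[key] - bs) / bs
    return deviations
```

A negative value means TAPS reports less modification than BS-Seq in that bin, which is the direction of the underdetection described in panels b) to e).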
DM-Seq and BS-Seq data from gDNA derived from a patient glioblastoma. a) Final library yield. b) Average size of library fragments (adapters included) determined using a Bioanalyzer. c) Unique CpGs covered by BS-Seq and DM-Seq. The extrapolated BS-Seq bar accounts for the case in which the sequencer is loaded with the same volume of each library rather than with a normalized amount of DNA. d) High 5hmCpG sites, previously identified by oxBS-Seq of 30 tumors. The Venn diagram shows the portion of these CpG sites that were covered by BS-Seq or DM-Seq with this glioblastoma sample. The metrics for the sites sequenced by either BS-Seq or DM-Seq alone are similar to those at the sites that were sequenced by both methods. The analysis in Fig. 4e focuses on the 1,485 shared CpG sites sequenced by both methods.
About this article
Cite this article
Wang, T., Fowler, J.M., Liu, L. et al. Direct enzymatic sequencing of 5-methylcytosine at single-base resolution. Nat Chem Biol 19, 1004–1012 (2023). https://doi.org/10.1038/s41589-023-01318-1