Here we present APOBEC-coupled epigenetic sequencing (ACE-seq), a bisulfite-free method for localizing 5-hydroxymethylcytosine (5hmC) at single-base resolution with low DNA input. The method builds on the observation that AID/APOBEC family DNA deaminase enzymes can potently discriminate between cytosine modification states and exploits the non-destructive nature of enzymatic, rather than chemical, deamination. ACE-seq yielded high-confidence 5hmC profiles with at least 1,000-fold less DNA input than conventional methods. Applying ACE-seq to generate a base-resolution map of 5hmC in tissue-derived cortical excitatory neurons, we found that 5hmC was almost entirely confined to CG dinucleotides. The whole-genome map permitted cytosine, 5-methylcytosine (5mC) and 5hmC to be parsed and revealed genomic features that diverged from global patterns, including enhancers and imprinting control regions with high and low 5hmC/5mC ratios, respectively. Enzymatic deamination overcomes many challenges posed by bisulfite-based methods, thus expanding the scope of epigenome profiling to include scarce samples and opening new lines of inquiry regarding the role of cytosine modifications in genome biology.
Schübeler, D. Function and information content of DNA methylation. Nature 517, 321–326 (2015).
Tahiliani, M. et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324, 930–935 (2009).
Ito, S. et al. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333, 1300–1303 (2011).
He, Y.F. et al. Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 333, 1303–1307 (2011).
Pfaffeneder, T. et al. The discovery of 5-formylcytosine in embryonic stem cell DNA. Angew. Chem. Int. Edn Engl. 50, 7008–7012 (2011).
Kohli, R.M. & Zhang, Y. TET enzymes, TDG and the dynamics of DNA demethylation. Nature 502, 472–479 (2013).
Wagner, M. et al. Age-dependent levels of 5-methyl-, 5-hydroxymethyl-, and 5-formylcytosine in human and mouse brain tissues. Angew. Chem. Int. Edn Engl. 54, 12511–12514 (2015).
Bachman, M. et al. 5-Formylcytosine can be a stable DNA modification in mammals. Nat. Chem. Biol. 11, 555–557 (2015).
Wu, H. & Zhang, Y. Charting oxidized methylcytosines at base resolution. Nat. Struct. Mol. Biol. 22, 656–661 (2015).
Huang, Y. et al. The behavior of 5-hydroxymethylcytosine in bisulfite sequencing. PLoS One 5, e8888 (2010).
Yu, M. et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell 149, 1368–1380 (2012).
Booth, M.J. et al. Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science 336, 934–937 (2012).
Tanaka, K. & Okamoto, A. Degradation of DNA by bisulfite treatment. Bioorg. Med. Chem. Lett. 17, 1912–1915 (2007).
Luo, C. et al. Single-cell methylomes identify neuronal subtypes and regulatory elements in mammalian cortex. Science 357, 600–604 (2017).
Clark, S.J., Lee, H.J., Smallwood, S.A., Kelsey, G. & Reik, W. Single-cell epigenomics: powerful new methods for understanding gene regulation and cell identity. Genome Biol. 17, 72 (2016).
Booth, M.J., Raiber, E.A. & Balasubramanian, S. Chemical methods for decoding cytosine modifications in DNA. Chem. Rev. 115, 2240–2254 (2015).
Zahid, O.K., Zhao, B.S., He, C. & Hall, A.R. Quantifying mammalian genomic DNA hydroxymethylcytosine content using solid-state nanopores. Sci. Rep. 6, 29565 (2016).
Chavez, L. et al. Simultaneous sequencing of oxidized methylcytosines produced by TET/JBP dioxygenases in Coprinopsis cinerea. Proc. Natl. Acad. Sci. USA 111, E5149–E5158 (2014).
Sun, Z. et al. High-resolution enzymatic mapping of genomic 5-hydroxymethylcytosine in mouse embryonic stem cells. Cell Rep. 3, 567–576 (2013).
Sun, Z. et al. A sensitive approach to map genome-wide 5-hydroxymethylcytosine and 5-formylcytosine at single-base resolution. Mol. Cell 57, 750–761 (2015).
Mellén, M., Ayata, P. & Heintz, N. 5-hydroxymethylcytosine accumulation in postmitotic neurons results in functional demethylation of expressed genes. Proc. Natl. Acad. Sci. USA 114, E7812–E7821 (2017).
Gross, J.A. et al. Characterizing 5-hydroxymethylcytosine in human prefrontal cortex at single base resolution. BMC Genomics 16, 672 (2015).
Lister, R. et al. Global epigenomic reconfiguration during mammalian brain development. Science 341, 1237905 (2013).
Siriwardena, S.U., Chen, K. & Bhagwat, A.S. Functions and malfunctions of mammalian DNA-cytosine deaminases. Chem. Rev. 116, 12688–12710 (2016).
Nabel, C.S. et al. AID/APOBEC deaminases disfavor modified cytosines implicated in DNA demethylation. Nat. Chem. Biol. 8, 751–758 (2012).
Wijesinghe, P. & Bhagwat, A.S. Efficient deamination of 5-methylcytosines in DNA by human APOBEC3A, but not by AID or APOBEC3G. Nucleic Acids Res. 40, 9206–9217 (2012).
Carpenter, M.A. et al. Methylcytosine and normal cytosine deamination by the foreign DNA restriction enzyme APOBEC3A. J. Biol. Chem. 287, 34801–34808 (2012).
Schutsky, E.K., Nabel, C.S., Davis, A.K.F., DeNizio, J.E. & Kohli, R.M. APOBEC3A efficiently deaminates methylated, but not TET-oxidized, cytosine bases in DNA. Nucleic Acids Res. 45, 7655–7665 (2017).
Bryson, A.L. et al. Covalent modification of bacteriophage T4 DNA inhibits CRISPR-Cas9. MBio 6, e00648–e15 (2015).
Mo, A. et al. Epigenomic signatures of neuronal diversity in the mammalian brain. Neuron 86, 1369–1384 (2015).
Warnecke, P.M. et al. Identification and resolution of artifacts in bisulfite sequencing. Methods 27, 101–107 (2002).
Johnson, B.S. et al. Biotin tagging of MeCP2 in mice reveals contextual insights into the Rett syndrome transcriptome. Nat. Med. 23, 1203–1214 (2017).
Hu, L. et al. Crystal structure of TET2-DNA complex: insight into TET-mediated 5mC oxidation. Cell 155, 1545–1555 (2013).
Xie, W. et al. Base-resolution analyses of sequence and parent-of-origin dependent DNA methylation in the mouse genome. Cell 148, 816–831 (2012).
Yin, Y. et al. Impact of cytosine methylation on DNA binding specificities of human transcription factors. Science 356, eaaj2239 (2017).
Buenrostro, J.D., Giresi, P.G., Zaba, L.C., Chang, H.Y. & Greenleaf, W.J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
Eid, J. et al. Real-time DNA sequencing from single polymerase molecules. Science 323, 133–138 (2009).
Gabel, H.W. et al. Disruption of DNA-methylation-dependent long gene repression in Rett syndrome. Nature 522, 89–93 (2015).
Wang, H. et al. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell 153, 910–918 (2013).
Lu, F., Liu, Y., Jiang, L., Yamaguchi, S. & Zhang, Y. Role of Tet proteins in enhancer activity and telomere elongation. Genes Dev. 28, 2103–2119 (2014).
Wu, H., Wu, X., Shen, L. & Zhang, Y. Single-base resolution analysis of active DNA demethylation using methylase-assisted bisulfite sequencing. Nat. Biotechnol. 32, 1231–1240 (2014).
Krueger, F. & Andrews, S.R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
Bachman, M. et al. 5-Hydroxymethylcytosine is a predominantly stable DNA modification. Nat. Chem. 6, 1049–1055 (2014).
Qu, J., Zhou, M., Song, Q., Hong, E.E. & Smith, A.D. MLML: consistent simultaneous estimates of DNA methylation and hydroxymethylation. Bioinformatics 29, 2645–2646 (2013).
Thorvaldsdóttir, H., Robinson, J.T. & Mesirov, J.P. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform. 14, 178–192 (2013).
Shen, L. et al. Genome-wide analysis reveals TET- and TDG-dependent 5-methylcytosine oxidation dynamics. Cell 153, 692–706 (2013).
We are grateful to Z. Zhou, M. Fasolino, A. Bryson and J.M. SanMiguel for discussion and reagents. This work was supported by the US National Institutes of Health through R21-HG009545 (to R.M.K.) and by the Penn Epigenetics Institute. Additional support included R00-HG007982 (to H.W.), DP2-HL142044 (to H.W.) and R01-GM118501 (to R.M.K.). E.K.S. and J.E.D. are NSF Graduate Research Fellows. J.E.D. and E.B.F. were supported by NIH training grant T32-GM07229, and M.Y.L. by F30-CA196097.
Aspects of the ACE-seq protocol have been non-exclusively licensed.
Integrated supplementary information
Supplementary Figure 1 Chemical and/or enzymatic deamination across various sequencing methods to localize modified cytosines.
(a) Upon treatment with bisulfite under acidic conditions and at elevated temperatures, unmodified cytosine becomes sulfonated at the 6-position, which facilitates its hydrolytic deamination. Moving to alkaline conditions promotes desulfonation to yield uracil. In contrast, APOBEC3A catalyzes enzymatic hydrolytic deamination at physiological temperature and pH. (b) Workflows of various deamination-based sequencing methods to localize modified cytosines. Traditional bisulfite sequencing (BS-Seq) localizes 5mC and 5hmC together after deamination of unmodified C. TET-assisted bisulfite sequencing (TAB-Seq) changes reactivities to specifically localize 5hmC by first protecting 5hmC with a glucose moiety and then oxidizing 5mC with TET to form 5-formyl or 5-carboxylcytosine, which will be read as T following bisulfite conversion and sequencing. Oxidative bisulfite sequencing (oxBS-Seq) directly localizes 5mC by oxidizing 5hmC with potassium perruthenate before bisulfite treatment. When used in conjunction with standard bisulfite sequencing, 5hmC can be indirectly localized through subtraction of the oxBS signal from that of BS-Seq. Finally, APOBEC-Coupled Epigenetic Sequencing (ACE-Seq) does not rely on bisulfite, and instead utilizes enzymatic deamination of C and 5mC by APOBEC3A after protecting 5hmC with a glucose moiety. The sequencing readout is therefore comparable to that of TAB-Seq, without the need for the destructive bisulfite treatment.
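The read-outs described in panel (b) can be summarized as a small lookup table. The following is a minimal Python sketch, not code from the article: the names `READOUT`, `protected_bases` and `hmc_by_subtraction` are hypothetical, and the table encodes only the idealized behavior stated in the legend (complete conversion, no artifacts).

```python
# Idealized read-out (base seen after sequencing) for each cytosine state
# under each workflow in the legend. "T" means the base was deaminated and
# reads as T; "C" means it was resistant or protected and reads as C.
READOUT = {
    "BS-Seq":   {"C": "T", "5mC": "C", "5hmC": "C"},  # 5mC and 5hmC read together
    "TAB-Seq":  {"C": "T", "5mC": "T", "5hmC": "C"},  # 5hmC glucosylated, 5mC oxidized by TET
    "oxBS-Seq": {"C": "T", "5mC": "C", "5hmC": "T"},  # 5hmC oxidized before bisulfite
    "ACE-Seq":  {"C": "T", "5mC": "T", "5hmC": "C"},  # 5hmC glucosylated, C and 5mC deaminated by A3A
}


def protected_bases(method):
    """Return the cytosine states that still read as C under `method`."""
    return sorted(base for base, read in READOUT[method].items() if read == "C")


def hmc_by_subtraction(bs_fraction, oxbs_fraction):
    """Indirect 5hmC estimate at one site: BS-Seq signal minus oxBS-Seq signal.

    Both inputs are fractions of reads that read as C. Negative differences
    (sampling noise) are clipped to zero; that clipping is an assumption here.
    """
    return max(0.0, bs_fraction - oxbs_fraction)


if __name__ == "__main__":
    for method in READOUT:
        print(method, "reads as C:", protected_bases(method))
```

In this table, ACE-Seq and TAB-Seq share the same row pattern, which mirrors the legend's statement that the two read-outs are comparable.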
(a-c) Validation of T4 mutant phenotypes by (a) restriction digest and (b,c) liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS). (a) AluI cleaves only unmodified cytosine-containing DNA, while MspJI selectively cleaves 5hmC-containing DNA. Exogenous treatment of T4-hmC with a glucosyltransferase also causes a mass shift to generate a band the same size as T4-ghmC. The gels are adapted from Bryson et al., mBio, 2015 (PMID: 26081634) with permission from mBio; specifically, only panel A of the original figure was used, and the labels were altered slightly to align with nomenclature used in this article. (b) Raw MS/MS traces of C, 5mC, 5hmC, and 5ghmC in T4-C, T4-hmC, and T4-ghmC phage stocks. α and β denote peaks representing α- and β-anomers of 5ghmC. The MS/MS transitions are provided in the methods. Note that the peak area does not directly correlate with abundance given different ionization efficiencies. Independent experiments were performed in triplicate; representative traces shown. (c) Levels of C and 5ghmC in each stock are quantified relative to the T4-C and T4-ghmC reference genomes. Mean values from triplicate experiments are listed above each bar; error bars report the standard deviation. The presence of 5ghmC in T4-C is likely from rare reversion of the phage during propagation. T4 phage can package some of its host E. coli gDNA (strain DH10B). The presence of C in T4-hmC and T4-ghmC correlates with the expected amount based on the frequency of sequencing reads that map to the E. coli DH10B genome (bottom table). Notably, the presence of small amounts of E. coli gDNA does not impact any of the ACE-Seq quality metrics, as these are derived from reads that map to the T4-hmC genome alone (as in Fig. 2d,f). (d,e) Validation of CG methylation of λ phage genomic DNA by (d) restriction digest and (e) LC-MS/MS.
(d) 75 ng of λ phage DNA that was either untreated or treated with M.SssI was incubated in the presence of no restriction enzyme (-), MspI, or HpaII, with appropriate buffers. While MspI cleaves both unmethylated and methylated CG-containing sites, HpaII is blocked from cleavage by CG methylation. Experiments were performed several times with similar results; representative image shown. (e) Raw MS/MS traces of C, 5mC, 5hmC, and 5ghmC in the methylated λ phage stock. Experiment was performed in triplicate (n = 3); representative traces shown.
Supplementary Figure 3 ACE-Seq optimization to promote full deamination of a locus in the T4-C phage genome.
(a) Highlighter plot showing deamination events (red) and non-deamination events (cyan) compared to the non-deaminated master sequence. Clones from experiments without DMSO and with 10% DMSO are shown. Strong hairpin formation was predicted using secondary structure prediction software from position 363-378, likely contributing to lack of deamination in this area without DMSO. (b) Highlighter plot showing deamination events (red) and non-deamination events (cyan) compared to the non-deaminated master sequence. Clones from experiments performed at 25 °C, 37 °C, and under slow ramping conditions (4 °C to 50 °C over 2 hours) are shown.
15 ng of a 1:1 pooled mixture of methylated λ phage gDNA and T4-hmC DNA was treated with or without βGT and then with or without A3A (4 samples, each in triplicate). These reactions were then digested to nucleosides and analyzed via LC-MS/MS. Levels of (a) C, (b) 5mC, and (c) 5hmC were quantified using standard curves generated from purified nucleoside controls, and the amount of each base was graphed relative to the untreated sample (-A3A, -βGT). Individual data points from triplicate experiments are overlaid on the bar graph; mean percentages are listed above each bar, and error bars represent standard deviations from the mean. (d) Representative raw traces of MS/MS counts for 5hmC and 5ghmC. Upon βGT treatment, 5hmC signal decreased significantly, accompanied by the conversion to 5ghmC. Mean percentages of 5hmC remaining from triplicate experiments (n = 3) are listed above the remaining peaks in the +βGT samples. (e) Chart outlining predicted/observed products for both spike-in controls under each condition tested.
ACE-Seq or BS-Seq samples of unsheared mESC DNA analyzed qualitatively in Fig. 3a were used to seed qPCR reactions. 0.5 μL of the treated samples were combined with 500 nM each of the forward and reverse primers (to amplify either (a) the 200-bp amplicon or (b) the 1-kb amplicon) and amplified using the KAPA SYBR Fast Rox low qPCR Mastermix kit (KAPA Biosystems). For the 200-bp amplicon, a two-step PCR protocol was used in which the samples were initially denatured at 95 °C for 3 minutes, and then cycled between 95 °C (15 seconds) and 63 °C (20 seconds) for a total of 35 cycles. For the 1-kb amplicon, a two-step PCR protocol was used in which the samples were initially denatured at 95 °C for 5 minutes, and then cycled between 95 °C (30 seconds) and 66 °C (90 seconds) for a total of 41 cycles. Resulting qPCR products were run on 1% agarose gels and stained with SybrSafe to confirm specific amplification. The gray dashed line in (b) represents the mean CT value from the no-template control due to primer dimer amplification. Notably, the signals from the no-template and bisulfite samples for the 1-kb amplicon were exclusively from primer dimer amplification and were not specific to the desired 1-kb product. Mean values from triplicate experiments (n = 3) are shown, with error bars representing standard deviation from the mean. The data from 1 μg input samples are the same as those presented in Fig. 3b. (c) Shown is the uncropped gel from Fig. 3b.
(a) Browser snapshot showing base-resolution 5hmCG (blue) maps near the Neurod6 gene (chr6:55,614,961-55,646,769; mm9). ACE-Seq raw signals of CGs on both strands were combined and only CG dyads sequenced to depth ≥ 2 are shown. Gray tracks denote sequencing coverage for each CG dyad. Experiment was performed once each for 2 ng and 20 ng input of each gDNA (n = 2). (b) Correlation density plot between ACE-Seq experiments with 2 ng or 20 ng of DNA as input. Mean ACE-Seq raw signals were calculated for tiled 10-kb bins across the mouse genome. Correlation analysis was performed with 10-kb bins spanning the genome (n = 238,401 bins).
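The binned comparison in panel (b) can be illustrated with a short Python sketch. This is an illustration under stated assumptions rather than the authors' pipeline: the legend does not name the correlation statistic, so Pearson's r is assumed, the input is simplified to `(position, signal)` pairs on a single chromosome, and the function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def bin_means(sites, bin_size=10_000):
    """Mean raw signal per tiled bin.

    `sites` is an iterable of (position, signal) pairs on one chromosome
    (a simplification; a genome-wide version would key on chromosome too).
    """
    sums = defaultdict(lambda: [0.0, 0])  # bin index -> [signal total, site count]
    for pos, signal in sites:
        acc = sums[pos // bin_size]
        acc[0] += signal
        acc[1] += 1
    return {b: total / n for b, (total, n) in sums.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def binned_correlation(sites_a, sites_b, bin_size=10_000):
    """Correlate two experiments over the bins that both of them cover."""
    a = bin_means(sites_a, bin_size)
    b = bin_means(sites_b, bin_size)
    shared = sorted(a.keys() & b.keys())
    return pearson([a[k] for k in shared], [b[k] for k in shared])
```

Restricting the comparison to bins covered in both experiments is a design choice of this sketch; how uncovered bins were handled in the article is not stated in the legend.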
Supplementary Figure 7 Relationship of DNA modifications and chromatin states at representative loci.
Browser snapshot showing base-resolution CG (green), 5mCG (red), 5hmCG (blue) maps, as well as RNA-Seq, ATAC-Seq, and ChIP-Seq of major histone modifications near the (a) transcriptionally-active Neurod6 (chr6:55,667,934-55,690,102; mm10) gene and (b) the inactive Gad2 gene (chr2:22,607,043-22,660,967; mm10). Only CGs sequenced to depth ≥ 2 are shown. Gray tracks denote sequencing coverage for each position in each base-resolution map. ACE-Seq traces represent merged data sets from single experiments at 2 ng and 20 ng of input DNA (n = 2).
Supplementary Figure 8 Base-resolution analysis of 5hmCGs and 5mCGs at imprinted regions in mouse cortical excitatory neurons.
(a) The average 5hmC (left) or 5mC (right) levels are shown within two groups of imprinted regions (hmC-high: blue; hmC-low: green) and their flanking regions. (b) Snapshot of base-resolution C (green), 5mC (red), and 5hmC (blue) maps near the Kcnq1ot1 gene locus (chr7:143,290,000-143,300,000; genome build: mm10). Only CGs sequenced to depth ≥ 2 are shown. Gray tracks denote sequencing coverage for each position. ACE-Seq traces represent merged data sets from single experiments at 2 ng and 20 ng of input DNA (n = 2). (c) Heat-map representation of normalized RNA-Seq, H3K4me3 (ChIP-Seq), H3K27me3 (ChIP-Seq), 5hmC (ACE-Seq), 5mC (derived from BS-Seq and ACE-Seq), and 5hmC/5mC ratios within 30 imprinted regions. Imprinted regions were ranked by their 5hmC levels in cortical excitatory neurons.
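The legend describes 5mC as derived from BS-Seq and ACE-Seq. One simple way to do this, assumed here for illustration and not confirmed as the article's procedure, is direct subtraction: BS-Seq reports 5mC plus 5hmC, ACE-Seq reports 5hmC alone, and the remainder is unmodified C. The function names and the pseudocount in the ratio are hypothetical.

```python
def parse_modifications(bs_signal, ace_signal):
    """Split one CG site (or region) into C, 5mC and 5hmC fractions.

    bs_signal:  fraction reading as C in BS-Seq  (5mC + 5hmC)
    ace_signal: fraction reading as C in ACE-Seq (5hmC only)
    Inputs are clipped to [0, 1], and a negative difference is clipped to
    zero; both choices are assumptions of this sketch.
    """
    hmc = min(max(ace_signal, 0.0), 1.0)
    modified = min(max(bs_signal, 0.0), 1.0)
    mc = max(modified - hmc, 0.0)
    c = 1.0 - mc - hmc
    return {"C": c, "5mC": mc, "5hmC": hmc}


def hmc_to_mc_ratio(levels, pseudocount=0.01):
    """5hmC/5mC ratio with a small pseudocount to avoid division by zero."""
    return (levels["5hmC"] + pseudocount) / (levels["5mC"] + pseudocount)


if __name__ == "__main__":
    site = parse_modifications(bs_signal=0.8, ace_signal=0.3)
    print(site, hmc_to_mc_ratio(site))
```

Because the three fractions sum to one, each site or bin can be placed directly on a ternary plot of C, 5mC and 5hmC. Subtraction of noisy estimates can give inconsistent values at low coverage, which is why a statistical estimator may be preferable in practice.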
Ternary plots show the levels of C, 5mC, and 5hmC within 1-kb bins overlapping with representative genomic elements.
(a) The distribution of abundances of called 5hmCG (blue) and 5mCG (red) at individual sites in cortical excitatory neurons. The dashed line denotes the signal level of 0.6, the cutoff used to select high-level 5hmCG sites. (b) The fraction of high-level 5hmCG (purple) and all 5hmCG (yellow) sites within various genomic elements (relative to total number of CG sites in each group).
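The per-element fractions in panel (b) can be sketched as follows. This is a minimal Python illustration with hypothetical names and input layout: each site is given as `(element, signal, called)`, where `called` is assumed to come from an upstream statistical test, and the 0.6 cutoff from panel (a) is assumed to be inclusive.

```python
from collections import defaultdict

HIGH_CUTOFF = 0.6  # signal level used in the legend to select high-level 5hmCG sites


def fraction_by_element(sites, cutoff=HIGH_CUTOFF):
    """Fraction of called and of high-level 5hmCG sites per genomic element.

    `sites` is an iterable of (element, signal, called) tuples, one per CG
    site. Both fractions are relative to all CG sites in that element.
    Treating the cutoff as inclusive (>=) is an assumption of this sketch.
    """
    counts = defaultdict(lambda: {"total": 0, "called": 0, "high": 0})
    for element, signal, called in sites:
        c = counts[element]
        c["total"] += 1
        if called:
            c["called"] += 1
            if signal >= cutoff:
                c["high"] += 1
    return {
        element: {
            "all_5hmCG": c["called"] / c["total"],
            "high_5hmCG": c["high"] / c["total"],
        }
        for element, c in counts.items()
    }
```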
(a) Comparison of chemical and enzymatic deamination. Chemical deamination is efficient on C, while 5mC is largely resistant to the reaction. Also, 5hmC is converted to a CMS adduct, while 5fC is inefficiently deaminated. Efforts to drive this reaction start to increase aberrant 5mC deamination (see Wu et al., Nature Prot, 2016, 11:1081-1100). Enzymatic deamination is efficient on C and 5mC and discriminates against all ox-mCs (see Schutsky et al., NAR, 2017, 45:7655-7665). (b) Schemes for detection of 5mC. Potential schemes for localizing 5mC and all ox-mCs involve TET-mediated oxidation with or without coupled protection with βGT, which would lead to conversion of C bases only. As another alternative, after bisulfite treatment, A3A can be used to deaminate 5mC but not CMS to differentiate 5hmC and 5mC, although such a method would carry forward the limitations of bisulfite.
About this article
Cite this article
Schutsky, E., DeNizio, J., Hu, P. et al. Nondestructive, base-resolution sequencing of 5-hydroxymethylcytosine using a DNA deaminase. Nat Biotechnol 36, 1083–1090 (2018). https://doi.org/10.1038/nbt.4204