Simultaneous single-cell analysis of 5mC and 5hmC with SIMPLE-seq

Abstract

Dynamic 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) modifications to DNA regulate gene expression in a cell-type-specific manner and are associated with various biological processes, but the two modalities have not yet been measured simultaneously from the same genome at the single-cell level. Here we present SIMPLE-seq, a scalable, base resolution method for joint analysis of 5mC and 5hmC from thousands of single cells. Based on orthogonal labeling and recording of ‘C-to-T’ mutational signals from 5mC and 5hmC sites, SIMPLE-seq detects these two modifications from the same molecules in single cells and enables unbiased DNA methylation dynamics analysis of heterogeneous biological samples. We applied this method to mouse embryonic stem cells, human peripheral blood mononuclear cells and mouse brain to give joint epigenome maps at single-cell and single-molecule resolution. Integrated analysis of these two cytosine modifications reveals distinct epigenetic patterns associated with divergent regulatory programs in different cell types as well as cell states.
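The 'C-to-T' readout described above can be illustrated with a minimal sketch. It assumes gapless, forward-strand alignments and a hypothetical helper named `c_to_t_rates`; it does not model how a read is assigned to 5mC or 5hmC, and it is not the authors' pipeline (their scripts are in the repository cited under Code availability).

```python
from collections import defaultdict


def c_to_t_rates(reference, reads):
    """Return {position: fraction of covering reads showing T} for each reference C.

    `reads` is a list of (start, sequence) pairs. Alignments are assumed to be
    gapless and on the forward strand (a simplifying assumption for this sketch).
    """
    converted = defaultdict(int)  # reads showing T at a reference C
    covered = defaultdict(int)    # reads showing C or T at a reference C
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(reference) or reference[pos] != "C":
                continue
            if base in ("C", "T"):
                covered[pos] += 1
                if base == "T":
                    converted[pos] += 1
    return {pos: converted[pos] / covered[pos] for pos in sorted(covered)}


# Toy data (hypothetical): reference cytosines sit at positions 1 and 4.
reference = "ACGTCGA"
reads = [(0, "ATGTCGA"), (0, "ATGTCGA"), (2, "GTCGA"), (0, "ACGTTGA")]
rates = c_to_t_rates(reference, reads)
```

In this toy input, two of the three reads covering position 1 and one of the four reads covering position 4 carry a T, so the per-site rates are 2/3 and 1/4.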

Fig. 1: Base resolution, joint analysis of 5mC and 5hmC in single cells.
Fig. 2: Analysis of 5mC and 5hmC from the same molecules revealed multiple 5hmC types associated with active chromatin.
Fig. 3: Cell-type-specific 5mC and 5hmC landscapes in human PBMCs.
Fig. 4: Single-cell joint analysis of 5mC and 5hmC from the mouse brain.

Data availability

Raw sequencing and processed data generated in this study are available from the Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) with accession number GSE197740 (ref. 113). Other external datasets were downloaded from the GEO with the following accession numbers: WGBS and TAPS of mESC (ref. 55; GSE112520), RNA sequencing of mESC (2i and serum) (ref. 66; GSE23943), TAB-seq of mESC (ref. 46; GSE36173), Joint-snhmC-seq of mouse brain (ref. 82; GSE236798), Paired-Tag of mouse brain (ref. 83; GSE152020); ArrayExpress with the following accession numbers: scRNA-seq of mESC (ref. 114; E-MTAB-2600); 10x Genomics website: scRNA-seq of PBMCs (https://www.10xgenomics.com); ENCODE with the following accession numbers: DNase-seq of E14 mESC (ENCSR000CMW), H3K4me1 ChIP-seq of E14 mESC (ENCSR000CGN), H3K27ac ChIP-seq of E14 mESC (ENCSR000CGQ), H3K4me1 ChIP-seq of immune cells (ENCSR777RWW (CD4+ T cells), ENCSR631BPS (CD8+ T cells), ENCSR214VUB (B cells), ENCSR963TKB (NK cells) and ENCSR400VWA (monocytes)), H3K4me3 ChIP-seq of immune cells (ENCSR263WLD (CD4+ T cells), ENCSR231FDF (CD8+ T cells), ENCSR269OVV (B cells), ENCSR570AUC (NK cells) and ENCSR796FCS (monocytes)), H3K27ac ChIP-seq of immune cells (ENCSR546SDM (CD4+ T cells), ENCSR835OJV (CD8+ T cells), ENCSR191ZQT (B cells), ENCSR391EQV (NK cells) and ENCSR012PII (monocytes)), H3K27me3 ChIP-seq of immune cells (ENCSR043SBG (CD4+ T cells), ENCSR797GOJ (CD8+ T cells), ENCSR522EGW (B cells), ENCSR939JZW (NK cells) and ENCSR080XUB (monocytes)), H3K9me3 ChIP-seq of immune cells (ENCSR453GNY (CD4+ T cells), ENCSR905SHH (CD8+ T cells), ENCSR295PSK (B cells), ENCSR021FSY (NK cells) and ENCSR236JVK (monocytes)), H3K36me3 ChIP-seq of immune cells (ENCSR828WZG (CD4+ T cells), ENCSR694CDP (CD8+ T cells), ENCSR789RGI (B cells), ENCSR519SOC (NK cells) and ENCSR244XWL (monocytes)) and ChromHMM states of NK cells (ENCSR972ZND).

Code availability

Custom scripts used for analyzing SIMPLE-seq datasets are available from GitHub (https://github.com/cxzhu/SIMPLE-seq) (ref. 115).

References

  1. Kelsey, G., Stegle, O. & Reik, W. Single-cell epigenomics: recording the past and predicting the future. Science 358, 69–75 (2017).

  2. Zhu, H., Wang, G. & Qian, J. Transcription factors as readers and effectors of DNA methylation. Nat. Rev. Genet. 17, 551–565 (2016).

  3. Bhutani, N., Burns, D. M. & Blau, H. M. DNA demethylation dynamics. Cell 146, 866–872 (2011).

  4. Wu, X. & Zhang, Y. TET-mediated active DNA demethylation: mechanism, function and beyond. Nat. Rev. Genet. 18, 517–534 (2017).

  5. Luo, C., Hajkova, P. & Ecker, J. R. Dynamic DNA methylation: in the right place at the right time. Science 361, 1336–1340 (2018).

  6. Smith, Z. D. & Meissner, A. DNA methylation: roles in mammalian development. Nat. Rev. Genet. 14, 204–220 (2013).

  7. Horvath, S. & Raj, K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat. Rev. Genet. 19, 371–384 (2018).

  8. Parry, A., Rulands, S. & Reik, W. Active turnover of DNA methylation during cell fate decisions. Nat. Rev. Genet. 22, 59–66 (2021).

  9. Crawford, G. E. et al. Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res. 16, 123–131 (2006).

  10. Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).

  11. Johnson, D. S., Mortazavi, A., Myers, R. M. & Wold, B. Genome-wide mapping of in vivo protein–DNA interactions. Science 316, 1497–1502 (2007).

  12. Lister, R. et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315–322 (2009).

  13. Consortium, E. P. et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).

  14. Tang, F. et al. mRNA-seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).

  15. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).

  16. Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).

  17. Nagano, T. et al. Single-cell Hi-C reveals cell-to-cell variability in chromosome structure. Nature 502, 59–64 (2013).

  18. Jin, W. et al. Genome-wide detection of DNase I hypersensitive sites in single cells and FFPE tissue samples. Nature 528, 142–146 (2015).

  19. Cusanovich, D. A. et al. Multiplex single cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348, 910–914 (2015).

  20. Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).

  21. Preissl, S. et al. Single-nucleus analysis of accessible chromatin in developing mouse forebrain reveals cell-type-specific transcriptional regulation. Nat. Neurosci. 21, 432–439 (2018).

  22. Rotem, A. et al. Single-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat. Biotechnol. 33, 1165–1172 (2015).

  23. Harada, A. et al. A chromatin integration labelling method enables epigenomic profiling with lower input. Nat. Cell Biol. 21, 287–296 (2019).

  24. Kaya-Okur, H. S. et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat. Commun. 10, 1930 (2019).

  25. Carter, B. et al. Mapping histone modifications in low cell number and single cells using antibody-guided chromatin tagmentation (ACT-seq). Nat. Commun. 10, 3747 (2019).

  26. Ku, W. L. et al. Single-cell chromatin immunocleavage sequencing (scChIC-seq) to profile histone modification. Nat. Methods 16, 323–325 (2019).

  27. Wang, Q. et al. CoBATCH for high-throughput single-cell epigenomic profiling. Mol. Cell 76, 206–216 (2019).

  28. Ai, S. et al. Profiling chromatin states using single-cell itChIP-seq. Nat. Cell Biol. 21, 1164–1172 (2019).

  29. Grosselin, K. et al. High-throughput single-cell ChIP-seq identifies heterogeneity of chromatin states in breast cancer. Nat. Genet. 51, 1060–1066 (2019).

  30. Guo, H. et al. Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing. Genome Res. 23, 2126–2135 (2013).

  31. Mooijman, D., Dey, S. S., Boisset, J. C., Crosetto, N. & van Oudenaarden, A. Single-cell 5hmC sequencing reveals chromosome-wide cell-to-cell variability and enables lineage reconstruction. Nat. Biotechnol. 34, 852–856 (2016).

  32. Zhu, C. et al. Single-cell 5-formylcytosine landscapes of mammalian early embryos and ESCs at single-base resolution. Cell Stem Cell 20, 720–731 (2017).

  33. Wu, X., Inoue, A., Suzuki, T. & Zhang, Y. Simultaneous mapping of active DNA demethylation and sister chromatid exchange in single cells. Genes Dev. 31, 511–523 (2017).

  34. Luo, C. et al. Single-cell methylomes identify neuronal subtypes and regulatory elements in mammalian cortex. Science 357, 600–604 (2017).

  35. Lake, B. B. et al. Integrative single-cell analysis of transcriptional and epigenetic states in the human adult brain. Nat. Biotechnol. 36, 70–80 (2018).

  36. Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385 (2018).

  37. Chen, S., Lake, B. B. & Zhang, K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat. Biotechnol. 37, 1452–1457 (2019).

  38. Zhu, C. et al. An ultra high-throughput method for single-cell joint analysis of open chromatin and transcriptome. Nat. Struct. Mol. Biol. 26, 1063–1070 (2019).

  39. Luo, C. et al. Single nucleus multi-omics identifies human cortical cell regulatory genome diversity. Cell Genom. 2, 100107 (2022).

  40. Xie, Y. et al. Droplet-based single-cell joint profiling of histone modifications and transcriptomes. Nat. Struct. Mol. Biol. 30, 1428–1433 (2023).

  41. Smallwood, S. A. et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat. Methods 11, 817–820 (2014).

  42. Farlik, M. et al. Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics. Cell Rep. 10, 1386–1397 (2015).

  43. Mulqueen, R. M. et al. Highly scalable generation of DNA methylation profiles in single cells. Nat. Biotechnol. 36, 428–431 (2018).

  44. Nichols, R. V. et al. High-throughput robust single-cell DNA methylation profiling with sciMETv2. Nat. Commun. 13, 7627 (2022).

  45. Wangsanuwat, C., Chialastri, A., Aldeguer, J. F., Rivron, N. C. & Dey, S. S. A probabilistic framework for cellular lineage reconstruction using integrated single-cell 5-hydroxymethylcytosine and genomic DNA sequencing. Cell Rep. Methods 1, 100060 (2021).

  46. Yu, M. et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell 149, 1368–1380 (2012).

  47. Zeng, H. et al. Bisulfite-free, nanoscale analysis of 5-hydroxymethylcytosine at single base resolution. J. Am. Chem. Soc. 140, 13190–13194 (2018).

  48. Schutsky, E. K. et al. Nondestructive, base-resolution sequencing of 5-hydroxymethylcytosine using a DNA deaminase. Nat. Biotechnol. https://doi.org/10.1038/nbt.4204 (2018).

  49. Sun, Z. et al. High-resolution enzymatic mapping of genomic 5-hydroxymethylcytosine in mouse embryonic stem cells. Cell Rep. 3, 567–576 (2013).

  50. Liu, Y. et al. Subtraction-free and bisulfite-free specific sequencing of 5-methylcytosine and its oxidized derivatives at base resolution. Nat. Commun. 12, 618 (2021).

  51. Cohen-Karni, D. et al. The MspJI family of modification-dependent restriction endonucleases for epigenetic studies. Proc. Natl Acad. Sci. USA 108, 11040–11045 (2011).

  52. Vaisvila, R. et al. Enzymatic methyl sequencing detects DNA methylation at single-base resolution from picograms of DNA. Genome Res. 31, 1280–1289 (2021).

  53. Sen, M. et al. Strand-specific single-cell methylomics reveals distinct modes of DNA demethylation dynamics during early mammalian development. Nat. Commun. 12, 1286 (2021).

  54. Fullgrabe, J. et al. Simultaneous sequencing of genetic and epigenetic bases in DNA. Nat. Biotechnol. 41, 1457–1464 (2023).

  55. Liu, Y. B. et al. Bisulfite-free direct detection of 5-methylcytosine and 5-hydroxymethylcytosine at base resolution. Nat. Biotechnol. 37, 424–429 (2019).

  56. Kriaucionis, S. & Heintz, N. The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science 324, 929–930 (2009).

  57. Booth, M. J. et al. Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science 336, 934–937 (2012).

  58. Xia, B. et al. Bisulfite-free, base-resolution analysis of 5-formylcytosine at the genome scale. Nat. Methods 12, 1047–1050 (2015).

  59. Ito, S. et al. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333, 1300–1303 (2011).

  60. Rosenberg, A. B. et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science 360, 176–182 (2018).

  61. Daley, T. & Smith, A. D. Predicting the molecular complexity of sequencing libraries. Nat. Methods 10, 325–327 (2013).

  62. Sim, Y. J. et al. 2i maintains a naive ground state in ESCs through two distinct epigenetic mechanisms. Stem Cell Rep. 8, 1312–1328 (2017).

  63. Hashimoto, H. et al. Recognition and potential mechanisms for replication and erasure of cytosine hydroxymethylation. Nucleic Acids Res. 40, 4841–4849 (2012).

  64. Yildirim, O. et al. Mbd3/NURD complex regulates expression of 5-hydroxymethylcytosine marked genes in embryonic stem cells. Cell 147, 1498–1510 (2011).

  65. Schep, A. N., Wu, B., Buenrostro, J. D. & Greenleaf, W. J. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods 14, 975–978 (2017).

  66. Marks, H. et al. The transcriptional and epigenomic foundations of ground state pluripotency. Cell 149, 590–604 (2012).

  67. Kobayashi, T. et al. The cyclic gene Hes1 contributes to diverse differentiation responses of embryonic stem cells. Genes Dev. 23, 1870–1875 (2009).

  68. Bylund, M., Andersson, E., Novitch, B. G. & Muhr, J. Vertebrate neurogenesis is counteracted by Sox1–3 activity. Nat. Neurosci. 6, 1162–1168 (2003).

  69. Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676 (2006).

  70. Habibi, E. et al. Whole-genome bisulfite sequencing of two distinct interconvertible DNA methylomes of mouse embryonic stem cells. Cell Stem Cell 13, 360–369 (2013).

  71. Tahiliani, M. et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324, 930–935 (2009).

  72. McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).

  73. Spruijt, C. G. et al. Dynamic readers for 5-(hydroxy)methylcytosine and its oxidized derivatives. Cell 152, 1146–1159 (2013).

  74. Blondel, V. D., Guillaume, J. L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. P10008 (2008).

  75. Nish, S. A. et al. CD4+ T cell effector commitment coupled to self-renewal by asymmetric cell divisions. J. Exp. Med. 214, 39–47 (2017).

  76. Wang, K., Wei, G. & Liu, D. CD19: a biomarker for B cell development, lymphoma diagnosis and therapy. Exp. Hematol. Oncol. 1, 36 (2012).

  77. Egwuagu, C. E. STAT3 in CD4+ T helper cell differentiation and inflammatory diseases. Cytokine 47, 149–156 (2009).

  78. Tsukumo, S. et al. Bach2 maintains T cells in a naive state by suppressing effector memory-related genes. Proc. Natl Acad. Sci. USA 110, 10735–10740 (2013).

  79. Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).

  80. Hellman, A. & Chess, A. Gene body-specific methylation on the active X chromosome. Science 315, 1141–1143 (2007).

  81. Stroud, H., Feng, S., Morey Kinney, S., Pradhan, S. & Jacobsen, S. E. 5-Hydroxymethylcytosine is associated with enhancers and gene bodies in human embryonic stem cells. Genome Biol. 12, R54 (2011).

  82. Fabyanic, E. B. et al. Joint single-cell profiling resolves 5mC and 5hmC and reveals their distinct gene regulatory effects. Nat. Biotechnol. https://doi.org/10.1038/s41587-022-01652-0 (2023).

  83. Zhu, C. et al. Joint profiling of histone modifications and transcriptome in single cells from mouse brain. Nat. Methods 18, 283–292 (2021).

  84. Theriault, F. M., Roy, P. & Stifani, S. AML1/Runx1 is important for the development of hindbrain cholinergic branchiovisceral motor neurons and selected cranial sensory neurons. Proc. Natl Acad. Sci. USA 101, 10343–10348 (2004).

  85. Matuzelski, E. et al. Transcriptional regulation of Nfix by NFIB drives astrocytic maturation within the developing spinal cord. Dev. Biol. 432, 286–297 (2017).

  86. Viswanathan, R. et al. DARESOME enables concurrent profiling of multiple DNA modifications with restriction enzymes in single cells and cell-free DNA. Sci. Adv. 9, eadi0197 (2023).

  87. Chialastri, A., Sarkar, S., Schauer, E. E., Lamba, S. & Dey, S. S. Combinatorial quantification of 5mC and 5hmC at individual CpG dyads and the transcriptome in single cells reveals modulators of DNA methylation maintenance fidelity. Preprint at bioRxiv https://doi.org/10.1101/2023.05.06.539708 (2023).

  88. Shahjalal, H. M., Abdal Dayem, A., Lim, K. M., Jeon, T. I. & Cho, S. G. Generation of pancreatic β cells for treatment of diabetes: advances and challenges. Stem Cell Res. Ther. 9, 355 (2018).

  89. Lareau, C. A. et al. Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility. Nat. Biotechnol. 37, 916–924 (2019).

  90. Datlinger, P. et al. Ultra-high-throughput single-cell RNA sequencing and perturbation screening with combinatorial fluidic indexing. Nat. Methods 18, 635–642 (2021).

  91. Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).

  92. Li, G. et al. Joint profiling of DNA methylation and chromatin architecture in single cells. Nat. Methods 16, 991–993 (2019).

  93. Lee, D. S. et al. Simultaneous profiling of 3D genome structure and DNA methylation in single human cells. Nat. Methods 16, 999–1006 (2019).

  94. Mimitou, E. P. et al. Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat. Biotechnol. 39, 1246–1258 (2021).

  95. Chung, H. et al. Joint single-cell measurements of nuclear proteins and RNA in vivo. Nat. Methods 18, 1204–1212 (2021).

  96. Zhang, B. et al. Characterizing cellular heterogeneity in chromatin state with scCUT&Tag-pro. Nat. Biotechnol. 40, 1220–1230 (2022).

  97. Chen, A. F. et al. NEAT-seq: simultaneous profiling of intra-nuclear proteins, chromatin accessibility and gene expression in single cells. Nat. Methods 19, 547–553 (2022).

  98. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).

  99. Qiu, P. Embracing the dropouts in single-cell RNA-seq analysis. Nat. Commun. 11, 1169 (2020).

  100. Xu, H. et al. Modular oxidation of cytosine modifications and their application in direct and quantitative sequencing of 5-hydroxymethylcytosine. J. Am. Chem. Soc. 145, 7095–7100 (2023).

  101. Schmitt, M. W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl Acad. Sci. USA 109, 14508–14513 (2012).

  102. Chan, K. C. et al. Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing. Proc. Natl Acad. Sci. USA 110, 18761–18768 (2013).

  103. Song, C. X. et al. 5-Hydroxymethylcytosine signatures in cell-free DNA provide information about tumor types and stages. Cell Res. 27, 1231–1242 (2017).

  104. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).

  105. Krueger, F. Trim Galore. https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/ (2019).

  106. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  107. McInnes, L., Healy, J., Saul, N. & Großberger, L. UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3, 861 (2018).

  108. Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).

  109. Shannon, C. E. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948).

  110. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).

  111. Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).

  112. Sherman, B. T. et al. DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). Nucleic Acids Res. 50, W216–W221 (2022).

  113. Bai, D., Zhu, C. & Yi, C. Single-cell joint analysis of 5-methylcytosine and 5-hydroxymethylcytosine. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE197740 (2023).

  114. Kolodziejczyk, A. A. et al. Single cell RNA-sequencing of pluripotent states unlocks modular transcriptional variation. Cell Stem Cell 17, 471–485 (2015).

  115. Bai, D., Zhu, C. & Yi, C. Custom scripts and pipeline for SIMPLE-seq data analysis. https://github.com/cxzhu/SIMPLE-seq (2023).

Acknowledgements

We thank Y. Zhuang, J. Song, H. Zeng, C.-G. Ji, H.-W. Meng and Z.-R. Xu for technical assistance; P. Du and F.-C. Tang (Peking University) for providing mESCs; G.-L. Xu (Chinese Academy of Sciences) for providing mTET1CD plasmid; J.-Y. Xiao (Peking University) for protein purification assistance; and Z. Xi (Nankai University) for discussion. We thank the National Center for Protein Sciences at Peking University for technical help. We carried out data analysis on the High-Performance Computing Platform at the School of Life Sciences, Peking University. This study is supported by the Ministry of Science and Technology of China (no. 2023YFC3402200, no. 2019YFA0110900 and no. 2019YFA0802201 to C.Y.), the Beijing Natural Science Foundation (no. Z220013 to C.Y.) and the National Natural Science Foundation of China (no. 91953201 and no. 21825701 to C.Y.).

Author information

Contributions

D.B., C.Z. and C.Y. conceived and designed the study and wrote the paper. D.B. developed and optimized the SIMPLE-seq protocol and generated the data. C.Z. performed pilot labeling experiments, with help from D.B., and data analysis, with help from D.B. and X.Z. All authors discussed the results and edited the paper. C.Y. supervised the study.

Corresponding authors

Correspondence to Chenxu Zhu or Chengqi Yi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Sequential chemical labeling enables simultaneous detection of 5mC and 5hmC bases on the same molecules.

a, Overview of TAPS and hmC-CATCH. b, Sanger sequencing results showing the ‘C-to-T’ conversion signal from the model oligonucleotide sequence (T5MH) before treatment, after 5hmC labeling, and after both 5hmC and 5mC labeling. c, Schematics of chemical labeling for 5hmC and 5mC. d, qPCR result of lambda DNA (25,086–25,308 bp) before and after potassium ruthenate (K2RuO4) treatment; n = 3 (Treated), n = 3 (Control). Data are presented as 6.020 ± 0.006 (Treated) and 5.957 ± 0.026 (Control). e, Agarose gel images of fragmented lambda dsDNA treated with the 5hmC-labeling reaction only (left panel) and with sequential 5hmC and 5mC labeling (right panel). The experiment was performed once. f, Barplot showing the distribution of C-to-T mutation rates for unmethylated CH sites and methylated CG sites. g, C-to-T mutation signals on both strands of T1M spike-in model DNA; symmetrically methylated CG sites are indicated in red on the sequences below. h, Genome browser view showing the sequenced reads aligned to spike-in model DNA (upper: T1M with 5mCG, with positive and negative strands displayed separately; bottom left: T2H with a single 5hmCG site; bottom right: T3MH with both a 5mC site and a 5hmCG site at known positions). C-to-T is colored in red for positive strands, and G-to-A is colored in green for negative strands. Conversion rates are estimated from all the modified cytosines of the spike-in model DNA. i, Nuclei clumps and the resolved single-nuclei suspension after brief sonication, viewed under a bright-field microscope. The experiment was performed once. Scale bars, 100 μm. j, Enrichment analysis of genome coverage by SIMPLE-seq and DNase-seq on DNase I hypersensitive sites (DHS). k, A table showing tagmentation efficiency under different Tn5 reaction conditions, including Tn5 with hyperactive mutations, working concentrations and reaction buffers; the right side shows the fragment analysis result under the optimal condition.
l–n, Standard curves for mixed oligo DNA with 5mC and C (l), 5hmC and C (m), and 5mC and 5hmC (n), plotted based on gradient mixing ratios (0:10, 2:8, 4:6, 6:4, 8:2 and 10:0). o, C-to-T mutation rate estimated from 10-kb non-overlapping bins across the whole genome. For both boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median and whiskers extending to a maximum of 2× the interquartile range (IQR). For 5mC, minima = 1.12%, maxima = 1.47%; for 5hmC, minima = 0.75%, maxima = 2.43%. 5mC, n = 156,755; 5hmC, n = 159,962. p,q, Scatter plots showing the number of 5hmC reads mapped to the human and mouse genomes (p) and the fraction of 5mC and 5hmC reads mapped to the human and mouse genomes (q) in each cell from the species-mixing experiment, with twin axes. r, Stacked barplot showing the fraction of reads mapped to the reference genome that are assigned to 5mC, assigned to 5hmC or cannot be assigned. s–v, Comparisons between SIMPLE-seq and published single-cell and bulk 5mC and 5hmC sequencing methods: fraction of mappable reads (s), the number of 5mCG sites detected and total sequenced reads in each study (t), the number of 5hmCG sites detected and total sequenced reads in each study (u), and a dot plot showing the average number of covered CGs and sequenced reads for each cell in each study (v). w, Line plots showing the number of uniquely mapped reads or CG dinucleotides at different sequenced read depths per cell in each study.
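As an illustration of the boxplot convention used in panel o, the following sketch computes hinges, the median and whisker ends. It assumes that the whiskers reach the most extreme data points lying within 2× IQR of the hinges, which is one reading of the legend; the function name `boxplot_stats`, the quartile method and the toy values are hypothetical and not taken from the study.

```python
import statistics


def boxplot_stats(values, whisker=2.0):
    """Hinges (25th/75th percentiles), median and whisker ends for one boxplot.

    Whiskers are taken as the most extreme observations that lie within
    `whisker` x IQR of the hinges (an assumed interpretation of the legend).
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


# Toy values (hypothetical): the point 30 falls beyond the upper fence.
stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 30])
```

With these toy values the hinges are 3 and 7, the IQR is 4, and the upper fence is 15, so the upper whisker stops at 8 and the value 30 is left outside it.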

Extended Data Fig. 2 Joint analysis of 5mC and 5hmC from single cells.

a, Barplot showing the enrichment of 5mCG and 5hmCG sites detected by SIMPLE-seq, WGBS and TAB-seq over different genomic regions. b, Boxplots showing the 5mC or 5hmC modification levels on different genomic regions from bisulfite sequencing, TAB-seq and SIMPLE-seq. For all boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median and whiskers extending at most 2× IQR. The minima/maxima/numbers of elements of all boxplots: 60.76%/74.60%/157,324 (Alu,(1)), 58.52%/68.61%/137,869 (Alu,(2)), 3.71%/10.71%/37,810 (Alu,(3)), 0.00%/10.91%/8,332 (Alu,(4)), 14.71%/67.84%/5,828 (CGI,(1)), 12.36%/54.72%/3,662 (CGI,(2)), 1.96%/5.87%/1,059 (CGI,(3)), 1.45%/5.24%/304 (CGI,(4)), 6.02%/72.62%/45,014 (Intron,(1)), 8.48%/79.35%/30,294 (Intron,(2)), 4.19%/5.01%/7,333 (Intron,(3)), 3.41%/4.85%/1,669 (Intron,(4)), 70.19%/75.69%/124,777 (L1,(1)), 64.68%/94.66%/103,247 (L1,(2)), 6.89%/10.32%/22,350 (L1,(3)), 4.90%/8.53%/5,414 (L1,(4)), 67.40%/75.88%/26,295 (L2,(1)), 68.26%/87.71%/18,819 (L2,(2)), 1.86%/7.30%/4,342 (L2,(3)), 2.20%/6.31%/875 (L2,(4)), 64.16%/73.22%/909 (LCP,(1)), 57.34%/79.78%/739 (LCP,(2)), 2.88%/10.32%/171 (LCP,(3)), 2.56%/7.71%/62 (LCP,(4)), 70.15%/75.10%/194,933 (LINE,(1)), 65.93%/81.40%/177,226 (LINE,(2)), 3.79%/7.49%/34,376 (LINE,(3)), 3.99%/5.61%/9,412 (LINE,(4)), 68.59%/75.74%/189,543 (LTR,(1)), 60.01%/72.15%/170,132 (LTR,(2)), 2.85%/7.75%/32,147 (LTR,(3)), 3.24%/5.59%/10,073 (LTR,(4)), 68.60%/74.58%/47,771 (MIR,(1)), 66.88%/85.86%/30,355 (MIR,(2)), 2.88%/9.75%/7,707 (MIR,(3)), 0.00%/7.62%/1,520 (MIR,(4)), 62.68%/74.58%/325,867 (SINE,(1)), 58.25%/73.48%/302,881 (SINE,(2)), 4.93%/9.37%/56,712 (SINE,(3)), 0.00%/9.52%/19,715 (SINE,(4)), 62.82%/79.57%/4,651 (H3K9me3,(1)), 48.15%/94.31%/3,514 (H3K9me3,(2)), 3.79%/12.09%/862 (H3K9me3,(3)), 0.00%/11.97%/226 (H3K9me3,(4)), 14.28%/69.99%/21,711 (DNase,(1)), 12.70%/70.84%/15,340 (DNase,(2)), 0.00%/10.11%/3,961 (DNase,(3)), 0.00%/6.10%/973 (DNase,(4)).
c, Venn diagram showing the 5mCG sites overlap between SIMPLE-seq and TAPS. P-value, two-sided Fisher’s exact test. d, Venn diagram showing the 5hmCG sites overlap between SIMPLE-seq and TAB-seq. P-value, two-sided Fisher’s exact test. e, Stacked barplot showing the fraction of called 5hmC sites that overlapped with a called 5mC site (grey, 5hmC-shared) and of 5hmC sites that did not overlap with a called 5mC site (blue, 5hmC-only). f, Boxplot showing the 5hmC modification levels of 5mC-5hmC shared sites and the 5hmC-only sites. For both boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median and whiskers extending at most 2× IQR. For 5mC-5hmC shared sites, minima = 0.01, maxima = 1.00, number of sites n = 622,323, and for 5hmC-only sites, minima = 0.01, maxima = 1.00, number of sites n = 435,143. P-value, two-sided Fisher’s exact test. g, Barplot showing the enrichment of 5mC-5hmC shared sites and the 5hmC-only sites over different genomic regions. h, Barplot showing the relative enrichment of 5hmC-only sites over 5mC-5hmC shared sites on different genomic regions. i-k, UMAP embedding showing cells based on their (i) 5mCG, (j) 5mCHG and (k) 5hmCHG levels (in 100-kb non-overlapping bins). Each dot represents a single cell and is colored according to its original identity. l, Assignment of 2i mES cells and serum mES cells into two distinct clusters grouped by unsupervised clustering. m, Silhouette plot to evaluate the degree of separation of the clusters based on 5mC or 5hmC. n, Line plots showing the cumulative coverage of 10-kb non-overlapping bins with different depths for 5mC (green) and 5hmC (blue) from different numbers of single cells. The shaded area shows the error range from 5 randomly sampled cell sets. o, Smoothed line plots showing the 5hmCG levels around genic regions of genes with different expression levels (using the smooth.spline function with parameter df = 30).
p-q, Line plots showing the relationships between promoter 5mCG and 5hmCG modification levels with gene expression levels in (p) 2i mES cells, (q) serum mES cells.
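
The site-overlap tests in panels c and d can be illustrated with a small two-sided Fisher's exact test on a 2×2 contingency table. This is a sketch under stated assumptions: the helper names `overlap_table` and `fisher_exact_two_sided` are hypothetical, the background universe (`n_universe`, for example all CG sites covered by both assays) is an assumption because the legend does not state it, and the exact enumeration below is only practical for small counts.

```python
from math import comb


def overlap_table(sites_a, sites_b, n_universe):
    """Build the 2x2 table (both, only A, only B, neither) for two site sets.

    `n_universe` is the assumed number of candidate sites that either
    method could have called; it is not given in the legend.
    """
    a, b = set(sites_a), set(sites_b)
    both = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    neither = n_universe - len(a | b)
    return both, only_a, only_b, neither


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of every table with the same margins.
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Sum tables at least as extreme (as unlikely) as the observed one.
    p = sum(v for v in probs.values() if v <= p_obs * (1 + 1e-7))
    return min(1.0, p)
```

For genome-scale site counts a library routine such as `scipy.stats.fisher_exact` would be the practical choice; the pure-Python version is shown only to make the calculation explicit.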

Extended Data Fig. 3 Analysis of 5mC and 5hmC from the same molecules revealed multiple 5hmC types.

a, Heat maps showing the 5mCHG and 5hmCHG TF motif enrichments along with TF expression level during mESC 2i state to serum state transition. b, Dotplots showing promoter methylation levels for genes associated with cell proliferation during mESC 2i to serum state transition. c, Cytosine modification entropies in single cells. Each dot represents a single cell and is colored and ordered according to its pseudotime score. Grey dots are background entropy levels estimated by shuffling the cell barcodes of called modification sites. d, Violin plots showing the fraction of 5mC and 5hmC reads with paired modality in single cells. For both violin plots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median and whiskers extending at most 2× IQR. For 5mC modality, minima = 1.94%, maxima = 9.12%; for 5hmC modality, minima = 6.93%, maxima = 20.75%; n = 300 randomly sampled cells. e, Barplot showing the distributions of 5hmCG sites with different fractions of cells with detected 5mCG-associated 5hmCG sites. False positive detection rates (FDR) are based on the fraction of averaged falsely detected (Type 2) sites from shuffled groups among the total detected sites (FDR = 0.0467 was selected). f, UMAP embedding showing 5hmC levels of different types in single cells. Each dot represents a single cell and is colored according to the average 5hmC level. g, Top enriched GREAT GO terms for different types of 5hmCG sites. P-value, one-sided Fisher’s exact test. h, Histograms and heatmaps showing the relationship between two types of 5hmCG with mESC DNase-seq signals from ENCODE (ENCSR000CMW). i, Fraction of Type 1 and Type 2 5hmCGs detected in low-entropy (0–75%) or high-entropy (75–100%) cells. j, Barplot showing the enrichment of 5hmCG sites detected in low-entropy or high-entropy cells over different genomic regions.

Extended Data Fig. 4 SIMPLE-seq generates cell-type-specific 5mC and 5hmC profiles from human PBMC.

a, Heatmap showing the gene expression and promoter methylation levels of marker genes for the five major cell types. b and c, UMAP embedding showing the single-cell clustering based on (b) 5mCHG and (c) 5hmCHG levels (in 100-kb non-overlapping bins) from PBMC. Each dot represents a single cell and is colored according to its annotation based on 5mCG. d, Silhouette plot showing the degree of separation of the PBMC clusters based on 5mC, 5hmC or joint 5mCG-5hmCG. e, UMAP embedding showing the single-cell clustering based on joint 5mCG-5hmCG levels (in 100-kb non-overlapping bins) from PBMC. Each dot represents a single cell and is colored according to its annotation based on 5mCG. f and g, UMAP embedding showing the single-cell clustering based on (f) 5mCG levels of H3K4me1 regions and (g) 5hmCG levels of H3K4me1 regions from PBMC. Each dot represents a single cell and is colored according to its annotation based on 5mCG. h, Violin plots showing the modification levels of 5mCG, 5hmCG, 5mCHG and 5hmCHG in different cell types. i, Heatmap showing the numbers of pairwise differentially methylated (5mCG) regions across cell types (in 5-kb non-overlapping bins). j, Heatmap showing the methylation levels of 5mCG DMRs of a representative group (CD4+ T cells and Monocytes). k-m, The enrichment analysis of (k) known motifs, (l) top enriched de novo motifs, P-value, one-sided Fisher’s exact test, and (m) top enriched GREAT GO terms, P-value, one-sided Fisher’s exact test.
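
The silhouette plots in panel d summarize how well clusters separate. A minimal sketch of the standard silhouette coefficient follows, assuming each cell is represented by a numeric feature vector (for example, modification levels in 100-kb bins) and that Euclidean distance is used; the legend does not state the distance metric, and `silhouette_scores` is a hypothetical helper name.

```python
from math import dist  # Euclidean distance between two points


def silhouette_scores(points, labels):
    """Return one silhouette coefficient per point.

    s = (b - a) / max(a, b), where a is the mean distance to the point's
    own cluster and b is the smallest mean distance to any other cluster.
    """
    scores = []
    for i, (p, own_label) in enumerate(zip(points, labels)):
        distances = {}
        for j, (q, other_label) in enumerate(zip(points, labels)):
            if i != j:
                distances.setdefault(other_label, []).append(dist(p, q))
        own = distances.pop(own_label, [])
        if not own or not distances:
            scores.append(0.0)  # convention for singletons or a single cluster
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in distances.values())
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores
```

Scores near 1 indicate cells that sit well inside their assigned cluster, scores near 0 indicate cells on a boundary, and negative scores suggest a cell is closer to another cluster than to its own.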

Extended Data Fig. 5 Comparison of conserved and differential cytosine states among immune cells.

a, Violin plots showing the genomic coverages of the 10 cytosine states. b, Heatmap showing the enrichment of different cytosine state regions around TSS and TES sites. c-e, Scatter plots showing the fraction of genome regions overlapped with peaks of different histone marks in conserved and differential (c) E5, (d) E7 and (e) E9 state regions. P-value, two-sided t-test. f, Top enriched GREAT GO terms for conserved and differential regions of E5, E7 and E9 in representative cell types. P-value, two-sided t-test.

Extended Data Fig. 6 SIMPLE-seq generates cell-type-specific 5mC and 5hmC profiles from mouse brain.

a, Violin plots showing the fraction of reads mapped to the reference mouse genome for 5mC and 5hmC in SIMPLE-seq (this study) and Joint-snhmC-seq (GSE236798) datasets. b, Violin plots showing the numbers of unique reads per cell for 5mC and 5hmC in SIMPLE-seq (this study) and Joint-snhmC-seq (GSE236798) datasets. For fair comparison, the Joint-snhmC-seq dataset was downsampled to the same per-cell depth as in SIMPLE-seq. Data are presented as 57,563 ± 4,260 (SIMPLE-seq, 5mC), 35,283 ± 552 (Joint-snhmC-seq, 5mC), 39,374 ± 4,553 (SIMPLE-seq, 5hmC), 20,538 ± 565 (Joint-snhmC-seq, 5hmC). Cell number n = 4,767 (SIMPLE-seq, 5mC and 5hmC), n = 552 (Joint-snhmC-seq, 5mC and 5hmC). c and d, UMAP embedding showing the single-cell clustering based on (c) 5mCG and (d) 5hmCG levels (in 100-kb non-overlapping bins) from mouse brain cells. Each dot represents a single cell and is colored according to its annotation based on joint 5mCG-5hmCG clustering. e, Silhouette plot showing the degree of separation of the mouse brain cell clusters based on 5mC, 5hmC or joint 5mCG-5hmCG. f, Dot plots showing the gene body 5hmCG levels of representative marker genes in the detected cell types. g, The distribution of H3K4me1 and H3K27me3 read densities around the 5mCG sites or 5hmCG sites from EXC1. h, Line plots showing the relationships between gene body 5mCG and 5hmCG levels with gene expression levels in EXC2 and ASC1 cell types. i, Heatmap showing the 5mCG modification levels of cell-type-specific 5mCG across the 11 cell types. j, Top enriched de novo motifs (left) and top enriched GO terms (right) for each cell type. P-value, one-sided Fisher’s exact test.
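
The depth matching mentioned for panel b can be sketched as per-cell random subsampling of reads without replacement. The exact procedure used in the study is not described in the legend, so the function name `downsample_cells`, the dictionary input layout and the fixed seed are all assumptions made for illustration.

```python
import random


def downsample_cells(reads_per_cell, target_depth, seed=0):
    """Randomly keep at most `target_depth` reads in each cell.

    `reads_per_cell` maps a cell barcode to a collection of read identifiers
    (a hypothetical layout). Cells already at or below the target depth are
    returned unchanged.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    downsampled = {}
    for cell, reads in reads_per_cell.items():
        reads = list(reads)
        if len(reads) <= target_depth:
            downsampled[cell] = reads
        else:
            # Sampling without replacement keeps each read at most once.
            downsampled[cell] = rng.sample(reads, target_depth)
    return downsampled
```

Comparing unique reads per cell after such subsampling removes sequencing depth as a confounder, which appears to be the intent of the fair-comparison statement in the legend.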

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bai, D., Zhang, X., Xiang, H. et al. Simultaneous single-cell analysis of 5mC and 5hmC with SIMPLE-seq. Nat Biotechnol (2024). https://doi.org/10.1038/s41587-024-02148-9

  • DOI: https://doi.org/10.1038/s41587-024-02148-9
