We report scM&T-seq, a method for parallel single-cell genome-wide methylome and transcriptome sequencing that allows for the discovery of associations between transcriptional and epigenetic variation. Profiling of 61 mouse embryonic stem cells confirmed known links between DNA methylation and transcription. Notably, the method revealed previously unrecognized associations between heterogeneously methylated distal regulatory elements and transcription of key pluripotency genes.
Shapiro, E., Biezuner, T. & Linnarsson, S. Nat. Rev. Genet. 14, 618–630 (2013).
Guo, H. et al. Genome Res. 23, 2126–2135 (2013).
Smallwood, S.A. et al. Nat. Methods 11, 817–820 (2014).
Farlik, M. et al. Cell Rep. 10, 1386–1397 (2015).
Levsky, J.M., Shenoy, S.M., Pezo, R.C. & Singer, R.H. Science 297, 836–840 (2002).
Yan, L. et al. Nat. Struct. Mol. Biol. 20, 1131–1139 (2013).
Macaulay, I.C. et al. Nat. Methods 12, 519–522 (2015).
Dey, S.S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Nat. Biotechnol. 33, 285–289 (2015).
Schübeler, D. Nature 517, 321–326 (2015).
Jones, P.A. Nat. Rev. Genet. 13, 484–492 (2012).
Singer, Z.S. et al. Mol. Cell 55, 319–331 (2014).
Kalmar, T. et al. PLoS Biol. 7, e1000149 (2009).
Chambers, I. et al. Nature 450, 1230–1234 (2007).
Singh, A.M., Hamazaki, T., Hankowski, K.E. & Terada, N. Stem Cells 25, 2534–2542 (2007).
Torres-Padilla, M.E. & Chambers, I. Development 141, 2173–2181 (2014).
Ficz, G. et al. Cell Stem Cell 13, 351–359 (2013).
Klein, A.M. et al. Cell 161, 1187–1201 (2015).
Kolodziejczyk, A.A. et al. Cell Stem Cell 17, 471–485 (2015).
Habibi, E. et al. Cell Stem Cell 13, 360–369 (2013).
Stadler, M.B. et al. Nature 480, 490–495 (2011).
Lee, H.J., Hore, T.A. & Reik, W. Cell Stem Cell 14, 710–719 (2014).
Papp, B. & Plath, K. EMBO J. 31, 4255–4257 (2012).
Whyte, W.A. et al. Cell 153, 307–319 (2013).
Krueger, F. & Andrews, S.R. Bioinformatics 27, 1571–1572 (2011).
Wu, T.D. & Nacu, S. Bioinformatics 26, 873–881 (2010).
Love, M.I., Huber, W. & Anders, S. Genome Biol. 15, 550 (2014).
Trapnell, C. et al. Nat. Biotechnol. 28, 511–515 (2010).
Bourgon, R., Gentleman, R. & Huber, W. Proc. Natl. Acad. Sci. USA 107, 9546–9551 (2010).
We thank A. Kolodziejczyk and S.A. Teichmann for providing a list of 86 ESC pluripotency and differentiation genes (Kolodziejczyk et al. 2015). We thank W. Haerty for his supervision and valuable advice to T.X.H. We thank the Wellcome Trust Sanger Institute sequencing pipeline team for assistance with Illumina sequencing. We thank the members of the Sanger–European Bioinformatics Institute (EBI) Single-Cell Genomics Centre for general advice. W.R. is supported by the UK Biotechnology and Biological Sciences Research Council (BBSRC), the Wellcome Trust and the EU. G.K. is supported by the BBSRC, the UK Medical Research Council (MRC) and the EU. C.P.P. is supported by the Wellcome Trust and the MRC. T.V. is supported by the Wellcome Trust and KU Leuven (SymBioSys, PFV/10/016). H.J.L. is supported by EU Network of Excellence EpiGeneSys. O.S. is supported by the European Molecular Biology Laboratory (EMBL), the Wellcome Trust and the EU.
W.R. is a consultant and shareholder of Cambridge Epigenetix.
Integrated supplementary information
Single cells are collected and lysed before poly-A RNA is captured on magnetic beads and physically separated from DNA. Amplified cDNA is generated from mRNA on beads whilst DNA is bisulfite converted and Illumina sequencing libraries are prepared from both components in parallel.
Supplementary Figure 2 Quality metrics of scRNA-seq data obtained from mouse ESCs profiled using scM&T-seq.
(a,b) Number of genes detected (y-axis) as a function of the expression cutoff (x-axis). In each cell, between 4,000 and 8,000 genes were expressed at TPM > 1 (dashed line at x = 1). High-quality cells generally have about 5,000 genes detectable at a cutoff of TPM > 1, indicating a high level of quality among the 61 serum ESCs (or the 14 2i ESCs). (c,d) Distribution of Pearson correlation coefficients calculated pairwise across the 61 serum ESCs (or the 14 2i ESCs). The observed correlation coefficients were mostly between 0.7 and 0.99, indicating a high degree of technical consistency in the measured transcriptomes and attesting to the high quality of the scRNA-seq data.
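The two metrics described in this legend can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy matrix, and the choice to correlate log2(TPM + 1) values are assumptions, since the legend does not state which transformation was used.

```python
import numpy as np

def genes_detected(tpm, cutoffs):
    """Count, for each cutoff and each cell (column), the genes with TPM > cutoff."""
    tpm = np.asarray(tpm, dtype=float)
    # Result shape: (number of cutoffs, number of cells)
    return np.array([(tpm > c).sum(axis=0) for c in cutoffs])

def pairwise_pearson(tpm):
    """Pearson r between every pair of cells (assumed here: on log2(TPM + 1))."""
    log_tpm = np.log2(np.asarray(tpm, dtype=float) + 1.0)
    corr = np.corrcoef(log_tpm, rowvar=False)  # cells are columns
    upper = np.triu_indices_from(corr, k=1)    # each pair once, diagonal excluded
    return corr[upper]

# Toy matrix with invented values: 4 genes (rows) x 3 cells (columns).
toy = np.array([[0.0, 2.0, 5.0],
                [1.5, 0.5, 3.0],
                [10.0, 8.0, 0.0],
                [0.2, 1.2, 1.0]])
counts = genes_detected(toy, cutoffs=[0.1, 1.0, 5.0])
pair_r = pairwise_pearson(toy)
```

Plotting each column of `counts` against the cutoffs gives one detection curve per cell, and a histogram of `pair_r` gives the distribution of pairwise correlations.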
Supplementary Figure 3 Quality metrics of single-cell methylomes in serum ESCs profiled using alternative protocols.
Shown are quality metrics for 20 serum ESCs profiled using scM&T-seq, compared with 20 serum ESCs profiled using scBS-seq (Smallwood et al. 2014). (a) Read mapping efficiency. (b) Read duplication rate. (c) Genome-wide CpG and CHH methylation rate per cell. (d) Analysis of representation bias for different genomic contexts. (e) FASTQC report of adapter content from one representative single-cell bisulfite library (Read 1 of cell B06). A large proportion of sequenced fragments are concatemers of the primer used in first-strand synthesis, which substantially limits the alignment rates of these libraries. It may be possible to improve mapping efficiencies by reducing oligo concentrations or reaction times, but this is likely to result in reduced genomic coverage. Source data
Shown is the percentage of genomic contexts of different classes (y-axis) that are covered for an increasing number of minimum cells (x-axis), considering both scBS-seq (Smallwood et al. 2014, green) and scM&T-seq (blue). Note that the total number of serum cells is 20 for scBS-seq and 61 for scM&T-seq. Source data
Shown is the percentage of genome-wide 10kb, 5kb, and 1kb windows covered (y-axis) by an increasing minimum number of cells (x-axis), for scBS-seq (Smallwood et al. 2014, green) and scM&T-seq (blue). Note that the total number of serum cells is 20 for scBS-seq and 61 for scM&T-seq. Source data
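The coverage curves in these legends can be computed along the following lines. This is a sketch under stated assumptions: the boolean input matrix, the function name, and the definition of "covered" (for example, at least one CpG read in the window) are hypothetical, because the legends do not define the coverage criterion.

```python
import numpy as np

def percent_covered(covered):
    """covered: boolean array (windows x cells); True if a window is covered in a cell.

    Returns an array whose entry k-1 is the percentage of windows
    covered in at least k cells, for k = 1 .. number of cells.
    """
    covered = np.asarray(covered, dtype=bool)
    n_cells = covered.shape[1]
    cells_per_window = covered.sum(axis=1)
    return np.array([100.0 * np.mean(cells_per_window >= k)
                     for k in range(1, n_cells + 1)])

# Toy example with invented values: 4 windows x 3 cells.
toy = np.array([[1, 1, 1],
                [1, 0, 0],
                [0, 0, 0],
                [1, 1, 0]], dtype=bool)
curve = percent_covered(toy)
```

Because the two protocols were applied to different numbers of cells (20 and 61), curves computed this way are only directly comparable over the shared range of the x-axis.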
Supplementary Figure 6 Hierarchical clustering of DNA-methylation profiles generated by scM&T-seq and scBS-seq.
Shown is a joint hierarchical clustering of 61 serum and 16 2i cells profiled using scM&T-seq and 20 serum and 12 2i ESCs profiled using scBS-seq (Smallwood et al. 2014), together with the corresponding synthetic bulk samples and an independent bulk BS-seq sample from serum ESCs (Ficz et al. 2013). The clustering analysis was performed on gene-body methylation of the 500 genes with the largest epigenome heterogeneity. Source data
Supplementary Figure 7 Correlation between single-cell methylomes and the methylome of a bulk cell population.
Shown is a scatter plot relating bulk gene-body methylation (Ficz et al. 2013; x-axis) to synthetic bulk estimates of gene-body methylation derived using either scBS-seq (Smallwood et al. 2014, green) or scM&T-seq (blue) (y-axis). Synthetic bulk methylation profiles are derived from averages of the single-cell methylation profiles. The true bulk methylation profile is concordant with both single-cell profiles; the scM&T-seq synthetic bulk correlates slightly better (R = 0.77) than the scBS-seq synthetic bulk (R = 0.69). Source data
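A synthetic bulk profile of the kind described here can be sketched as a per-gene average over cells. This is an illustrative assumption of how the averaging might be done: the function names are hypothetical, uncovered genes are represented as NaN, and the toy values are invented.

```python
import numpy as np

def synthetic_bulk(rates):
    """Average per-gene methylation rates across cells, skipping uncovered (NaN) entries.

    rates: array (genes x cells) of gene-body methylation rates in [0, 1].
    """
    return np.nanmean(np.asarray(rates, dtype=float), axis=1)

def pearson_r(x, y):
    """Pearson correlation using only positions where both inputs are finite."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ok = np.isfinite(x) & np.isfinite(y)
    return float(np.corrcoef(x[ok], y[ok])[0, 1])

# Toy example: 3 genes x 3 cells, NaN marks a gene not covered in that cell.
single_cell = np.array([[0.8, 1.0, np.nan],
                        [0.2, np.nan, 0.4],
                        [0.5, 0.5, 0.5]])
pseudo_bulk = synthetic_bulk(single_cell)  # per-gene means over covered cells
true_bulk = np.array([0.85, 0.25, 0.55])   # stand-in for a bulk BS-seq profile
r = pearson_r(true_bulk, pseudo_bulk)
```

An alternative design would pool methylated and unmethylated read counts across cells before computing rates; the legend describes averages of single-cell profiles, so the sketch follows that wording.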
Supplementary Figure 8 Principal-component analysis of gene-body methylation and gene expression in serum-grown ESCs.
Shown are projections onto the first two principal components (left) alongside the percentage of variance explained by individual components (right) for both gene expression levels (a) and gene-body methylation (b). Cells are color-coded based on clustering obtained using gene expression values, showing that the methylation principal components partially recapitulate the structure in the expression data. Source data
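The projections and variance-explained values in such a figure can be obtained with a standard PCA. The sketch below uses a singular value decomposition and assumes a complete cells-by-genes matrix; single-cell methylation data usually contain missing values that would need to be handled first, and the legend does not say how that was done. The function name and random input are illustrative only.

```python
import numpy as np

def pca(data, n_components=2):
    """PCA via SVD.

    data: array (cells x features) without missing values.
    Returns (scores, explained): projections of each cell onto the leading
    components, and the fraction of variance explained by every component.
    """
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)                     # center each feature
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # singular values come sorted
    scores = u[:, :n_components] * s[:n_components]
    return scores, explained

# Random stand-in data: 10 cells x 5 genes.
rng = np.random.default_rng(0)
scores, explained = pca(rng.normal(size=(10, 5)))
```

Running the same function on the expression matrix and on the methylation matrix, then coloring cells by their expression-based cluster, reproduces the layout the legend describes.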
Supplementary Figure 9 Scatter-plot matrix of principal components from methylation and gene expression profiles.
Shown are scatter plots between individual principal components of gene expression levels (y-axis) and corresponding gene body methylation (x-axis), using 61 serum cells profiled using scM&T-seq. Cells are color coded as in Supplementary Fig. 8. There is a strong correlation between the second principal component of DNA methylation and the corresponding component from gene expression, suggesting shared axes of variation between transcriptome and methylome profiles. Source data
Supplementary Figure 10 Clustering analysis of transcriptome and methylation data from 61 serum ESCs.
Shown are heatmaps for the gene body methylation (left) and gene expression profiles (right) using the 300 most heterogeneous genes (based on gene expression). The order of genes was taken from an individual clustering analysis based on gene methylation whereas cells were clustered separately either using DNA methylation or expression data, showing unlinked clusters (colored clusters). The bar plots in the center show the heterogeneity in DNA methylation (left) and gene expression (right). Source data
Shown are the absolute (a) and relative (b) reductions in the number of significant methylation-expression associations for different genomic contexts, as well as the root mean squared error of Pearson's correlation coefficient (c), when comparing the full dataset with bootstrapped samples in the methylation-RNA correlation analysis. Bootstrap samples were obtained from independent draws of 60%, 70%, or 80% of the total set of cells. As expected, a reduction in the number of analyzed cells resulted in reduced power to detect significant associations (a, b). Overall, only a relatively small number of linkages were affected and the concordance to the full dataset remained high (c). Source data
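The subsampling analysis in this legend can be sketched as follows. Several details are assumptions rather than the authors' procedure: cells are drawn without replacement, the number of draws and the seed are arbitrary, the inputs are complete genes-by-cells matrices, and the function names are hypothetical.

```python
import numpy as np

def gene_correlations(meth, expr):
    """Row-wise Pearson r between methylation and expression (genes x cells)."""
    meth = np.asarray(meth, dtype=float)
    expr = np.asarray(expr, dtype=float)
    m = meth - meth.mean(axis=1, keepdims=True)
    e = expr - expr.mean(axis=1, keepdims=True)
    denom = np.sqrt((m**2).sum(axis=1) * (e**2).sum(axis=1))
    return (m * e).sum(axis=1) / denom

def subsample_rmse(meth, expr, fraction, n_draws=20, seed=0):
    """Mean RMSE between full-data correlations and correlations on cell subsets."""
    meth = np.asarray(meth, dtype=float)
    expr = np.asarray(expr, dtype=float)
    rng = np.random.default_rng(seed)
    n_cells = meth.shape[1]
    k = int(round(fraction * n_cells))
    full = gene_correlations(meth, expr)
    errors = []
    for _ in range(n_draws):
        idx = rng.choice(n_cells, size=k, replace=False)  # assumed: no replacement
        sub = gene_correlations(meth[:, idx], expr[:, idx])
        errors.append(np.sqrt(np.mean((sub - full) ** 2)))
    return float(np.mean(errors))

# Random stand-in data: 30 genes x 50 cells.
rng = np.random.default_rng(1)
meth = rng.random((30, 50))
expr = rng.random((30, 50))
rmse_by_fraction = {f: subsample_rmse(meth, expr, f) for f in (0.6, 0.7, 0.8)}
```

Counting how many associations stay significant in each draw, as in panels a and b, would additionally require a p-value for each correlation and a multiple-testing correction.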
Supplementary Figure 13 Volcano plots for association tests between DNA-methylation profiles in alternative genomic contexts and gene expression levels.
For each context, shown is the correlation coefficient (Pearson r, x-axis) versus the adjusted p-value (Benjamini-Hochberg adjustment; y-axis). The blue horizontal line corresponds to the 10% FDR significance level. Each dot corresponds to a gene, and its size reflects the adjusted p-value of the association test. Genes colored in red correspond to known pluripotency genes (Supplementary Table 5). The vertical orange line denotes the average correlation coefficient across all genes for a given annotation. Source data
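The Benjamini-Hochberg adjustment named in this legend can be written in a few lines. This is a generic implementation of the standard procedure, not code from the authors' software; the function name and example p-values are illustrative.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Invented p-values for four hypothetical gene-level tests.
raw = [0.01, 0.04, 0.03, 0.5]
adj = benjamini_hochberg(raw)
significant = adj <= 0.10   # 10% FDR threshold, as in the legend
```

In a volcano plot of this kind, `adj` would typically be shown on a negative log10 scale, so the 10% FDR line sits at a height of 1.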
Supplementary Figure 14 Comparison of results of cell-specific correlation analysis with known covariates (mean CpG methylation rate).
Supplementary Figure 15 Comparison of cell-specific correlation analysis with known covariates (CpG coverage).
For alternative genomic contexts, shown are scatter plots between cell-specific methylation-expression correlation coefficients and the (technical) CpG coverage in the corresponding cell. The lack of associations suggests that technical factors do not drive the heterogeneity in the coupling between methylation and expression between cells. Source data
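The check described in this legend can be sketched in two steps: compute one methylation-expression correlation per cell across genes, then relate those coefficients to a per-cell technical covariate. The function name, the complete genes-by-cells inputs, and the toy coverage values are assumptions made for illustration.

```python
import numpy as np

def cell_specific_r(meth, expr):
    """Column-wise Pearson r across genes: one coefficient per cell.

    meth, expr: arrays (genes x cells) without missing values.
    """
    meth = np.asarray(meth, dtype=float)
    expr = np.asarray(expr, dtype=float)
    m = meth - meth.mean(axis=0)
    e = expr - expr.mean(axis=0)
    return (m * e).sum(axis=0) / np.sqrt((m**2).sum(axis=0) * (e**2).sum(axis=0))

# Toy example with invented values: 3 genes x 3 cells.
meth = np.array([[0.0, 0.0, 0.0],
                 [0.5, 0.5, 0.5],
                 [1.0, 1.0, 1.0]])
expr = np.array([[3.0, 1.0, 1.0],
                 [2.0, 2.0, 3.0],
                 [1.0, 3.0, 2.0]])
per_cell_r = cell_specific_r(meth, expr)

# Hypothetical per-cell CpG coverage; a weak association with per_cell_r
# would argue against coverage driving the cell-to-cell differences.
coverage = np.array([100.0, 200.0, 300.0])
covariate_r = float(np.corrcoef(per_cell_r, coverage)[0, 1])
```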
Supplementary Figures 1–15 and Supplementary Table 3 (PDF 2789 kb)
scRNA-seq and scBS-seq quality metrics. (XLSX 119 kb)
Genomic contexts considered for the methylation–gene expression association analyses. (XLSX 9 kb)
Gene-level results of the association tests between DNA-methylation variation in alternative genomic contexts and gene expression variation. (XLSX 21480 kb)
List of 86 literature-derived pluripotency genes. (XLS 33 kb)
Summary statistics obtained for the cell-specific association analysis correlating the methylome and the transcriptome in individual cells. (XLSX 51 kb)
scMT-seq software (ZIP 11 kb)
About this article
Cite this article
Angermueller, C., Clark, S., Lee, H. et al. Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13, 229–232 (2016). https://doi.org/10.1038/nmeth.3728