Single-cell RNA sequencing has revealed extensive transcriptional cell state diversity in cancer, often observed independently of genetic heterogeneity, raising the central question of how malignant cell states are encoded epigenetically. To address this, here we performed multiomics single-cell profiling (integrating DNA methylation, transcriptome and genotype within the same cells) of diffuse gliomas, tumors characterized by defined transcriptional cell state diversity. Direct comparison of the epigenetic profiles of distinct cell states revealed key switches for state transitions recapitulating neurodevelopmental trajectories and highlighted dysregulated epigenetic mechanisms underlying gliomagenesis. We further developed a quantitative framework to directly measure cell state heritability and transition dynamics based on high-resolution lineage trees in human samples. We demonstrated heritability of malignant cell states, with key differences in hierarchical and plastic cell state architectures in IDH-mutant glioma versus IDH-wild-type glioblastoma, respectively. This work provides a framework anchoring transcriptional cancer cell states in their epigenetic encoding, inheritance and transition dynamics.
Processed data generated for this study are available through the NCBI Gene Expression Omnibus (GEO) under accession number GSE151506. Raw data access can be requested through the Data Use Oversight System (DUOS) Dataset Catalog with dataset ID DUOS-000133 as well as the European Genome–phenome Archive (EGA) with dataset ID EGAS00001005472. The data can be visualized and interrogated through the Broad Institute’s Single-Cell Portal at https://singlecell.broadinstitute.org/single_cell/study/SCP936. scATAC-seq data are available at the EGA repository under EGAS00001002185, EGAS00001001900 and EGAS00001003845 and at NCBI GEO under accession number GSE138794. TCGA data (DNA methylation, gene expression and clinical profiles) are available from the TCGA database (https://cancergenome.nih.gov/). ChIP–seq data are available at NCBI GEO under accession number GSE46016.
Tirosh, I. et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196 (2016).
Nam, A. S. et al. Somatic mutations and cell identity linked by genotyping of transcriptomes. Nature 571, 355–360 (2019).
Puram, S. V. et al. Single-cell transcriptomic analysis of primary and metastatic tumor ecosystems in head and neck cancer. Cell 171, 1611–1624 (2017).
Hata, A. N. et al. Tumor cells can follow distinct evolutionary paths to become resistant to epidermal growth factor receptor inhibition. Nat. Med. 22, 262–269 (2016).
Shaffer, S. M. et al. Rare cell variability and drug-induced reprogramming as a mode of cancer drug resistance. Nature 546, 431–435 (2017).
Shaffer, S. M. et al. Memory sequencing reveals heritable single-cell gene expression programs associated with distinct cellular behaviors. Cell 182, 947–959 (2020).
Tirosh, I. et al. Single-cell RNA-seq supports a developmental hierarchy in human oligodendroglioma. Nature 539, 309–313 (2016).
Frieda, K. L. et al. Synthetic recording and in situ readout of lineage information in single cells. Nature 541, 107–111 (2017).
Spanjaard, B. et al. Simultaneous lineage tracing and cell-type identification using CRISPR–Cas9-induced genetic scars. Nat. Biotechnol. 36, 469–473 (2018).
Raj, B. et al. Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain. Nat. Biotechnol. 36, 442–450 (2018).
McKenna, A. et al. Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science 353, aaf7907 (2016).
Alemany, A., Florescu, M., Baron, C. S., Peterson-Maduro, J. & van Oudenaarden, A. Whole-organism clone tracing using single-cell sequencing. Nature 556, 108–112 (2018).
Lathia, J. D., Mack, S. C., Mulkearns-Hubert, E. E., Valentim, C. L. L. & Rich, J. N. Cancer stem cells in glioblastoma. Genes Dev. 29, 1203–1217 (2015).
Gimple, R. C., Bhargava, S., Dixit, D. & Rich, J. N. Glioblastoma stem cells: lessons from the tumor hierarchy in a lethal cancer. Genes Dev. 33, 591–609 (2019).
Suvà, M. L. et al. Reconstructing and reprogramming the tumor-propagating potential of glioblastoma stem-like cells. Cell 157, 580–594 (2014).
Suvà, M. L. & Tirosh, I. The glioma stem cell model in the era of single-cell genomics. Cancer Cell 37, 630–636 (2020).
Bao, S. et al. Glioma stem cells promote radioresistance by preferential activation of the DNA damage response. Nature 444, 756–760 (2006).
Liau, B. B. et al. Adaptive chromatin remodeling drives glioblastoma stem cell plasticity and drug tolerance. Cell Stem Cell 20, 233–246 (2017).
Chen, J. et al. A restricted cell population propagates glioblastoma growth after chemotherapy. Nature 488, 522–526 (2012).
Filbin, M. G. et al. Developmental and oncogenic programs in H3K27M gliomas dissected by single-cell RNA-seq. Science 360, 331–335 (2018).
Neftel, C. et al. An integrative model of cellular states, plasticity, and genetics for glioblastoma. Cell 178, 835–849 (2019).
Patel, A. P. et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 344, 1396–1401 (2014).
Venteicher, A. S. et al. Decoupling genetics, lineages, and microenvironment in IDH-mutant gliomas by single-cell RNA-seq. Science 355, eaai8478 (2017).
Garofano, L. et al. Pathway-based classification of glioblastoma uncovers a mitochondrial subtype with therapeutic vulnerabilities. Nat. Cancer 2, 141–156 (2021).
Richards, L. M. et al. Gradient of developmental and injury response transcriptional states defines functional vulnerabilities underpinning glioblastoma heterogeneity. Nat. Cancer 2, 157–173 (2021).
Castellan, M. et al. Single-cell analyses reveal YAP/TAZ as regulators of stemness and cell plasticity in glioblastoma. Nat. Cancer 2, 174–188 (2021).
Latil, M. et al. Cell-type-specific chromatin states differentially prime squamous cell carcinoma tumor-initiating cells for epithelial to mesenchymal transition. Cell Stem Cell 20, 191–204 (2017).
Flavahan, W. A., Gaskell, E. & Bernstein, B. E. Epigenetic plasticity and the hallmarks of cancer. Science 357, eaal2380 (2017).
Meir, Z., Mukamel, Z., Chomsky, E., Lifshitz, A. & Tanay, A. Single-cell analysis of clonal maintenance of transcriptional and epigenetic states in cancer cells. Nat. Genet. 52, 709–718 (2020).
Guilhamon, P. et al. Single-cell chromatin accessibility profiling of glioblastoma identifies an invasive cancer stem cell population associated with lower survival. eLife 10, e64090 (2021).
La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).
Fine, H. A. Malignant gliomas: simplifying the complexity. Cancer Discov. 9, 1650–1652 (2019).
Gaiti, F. et al. Epigenetic evolution and lineage histories of chronic lymphocytic leukaemia. Nature 569, 576–580 (2019).
Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).
Morton, A. R. et al. Functional enhancers shape extrachromosomal oncogene amplifications. Cell 179, 1330–1341 (2019).
Sun, W. et al. The association between copy number aberration, DNA methylation and gene expression in tumor samples. Nucleic Acids Res. 46, 3009–3018 (2018).
O’Hagan, H. M., Mohammad, H. P. & Baylin, S. B. Double strand breaks can initiate gene silencing and SIRT1-dependent onset of DNA methylation in an exogenous promoter CpG island. PLoS Genet. 4, e1000155 (2008).
Davoli, T. et al. Cumulative haploinsufficiency and triplosensitivity drive aneuploidy patterns and shape the cancer genome. Cell 155, 948–962 (2013).
Ceccarelli, M. et al. Molecular profiling reveals biologically discrete subsets and pathways of progression in diffuse glioma. Cell 164, 550–563 (2016).
McLendon, R. et al. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061–1068 (2008).
Brennan, C. W. et al. The somatic genomic landscape of glioblastoma. Cell 155, 462–477 (2013).
Capper, D. et al. DNA methylation-based classification of central nervous system tumours. Nature 555, 469–474 (2018).
Pine, A. R. et al. Tumor microenvironment is critical for the maintenance of cellular states found in primary glioblastomas. Cancer Discov. https://doi.org/10.1158/2159-8290.CD-20-0057 (2020).
Wang, Q. et al. Tumor evolution of glioma-intrinsic gene expression subtypes associates with immunological changes in the microenvironment. Cancer Cell 32, 42–56 (2017).
Verhaak, R. G. W. et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 17, 98–110 (2010).
Ben-Porath, I. et al. An embryonic stem cell-like gene expression signature in poorly differentiated aggressive human tumors. Nat. Genet. 40, 499–507 (2008).
Rheinbay, E. et al. An aberrant transcription factor network essential for Wnt signaling and stem cell maintenance in glioblastoma. Cell Rep. 3, 1567–1579 (2013).
Suvà, M.-L. et al. EZH2 is essential for glioblastoma cancer stem cell maintenance. Cancer Res. 69, 9211–9218 (2009).
Natsume, A. et al. Chromatin regulator PRC2 is a key regulator of epigenetic plasticity in glioblastoma. Cancer Res. 73, 4559–4570 (2013).
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
O’Connor, T., Grant, C. E., Bodén, M. & Bailey, T. L. T-Gene: improved target gene prediction. Bioinformatics https://doi.org/10.1093/bioinformatics/btaa227 (2020).
Reddington, J. P., Sproul, D. & Meehan, R. R. DNA methylation reprogramming in cancer: Does it act by re-configuring the binding landscape of Polycomb repressive complexes? Bioessays 36, 134–140 (2014).
Douillet, D. et al. Uncoupling histone H3K4 trimethylation from developmental gene expression via an equilibrium of COMPASS, Polycomb and DNA methylation. Nat. Genet. https://doi.org/10.1038/s41588-020-0618-1 (2020).
Bintu, L. et al. Dynamics of epigenetic regulation at the single-cell level. Science 351, 720–724 (2016).
Wang, L. et al. The phenotypes of proliferating glioblastoma cells reside on a single axis of variation. Cancer Discov. 9, 1708–1719 (2019).
Hoffmann, A., Sportelli, V., Ziller, M. & Spengler, D. Switch-like roles for Polycomb proteins from neurodevelopment to neurodegeneration. Epigenomes 1, 21 (2017).
Xu, W. et al. Oncometabolite 2-hydroxyglutarate is a competitive inhibitor of α-ketoglutarate-dependent dioxygenases. Cancer Cell 19, 17–30 (2011).
Turcan, S. et al. IDH1 mutation is sufficient to establish the glioma hypermethylator phenotype. Nature 483, 479–483 (2012).
Lu, F., Liu, Y., Jiang, L., Yamaguchi, S. & Zhang, Y. Role of Tet proteins in enhancer activity and telomere elongation. Genes Dev. https://doi.org/10.1101/gad.248005.114 (2014).
Hon, G. C. et al. 5mC oxidation by Tet2 modulates enhancer activity and timing of transcriptome reprogramming during differentiation. Mol. Cell 56, 286–297 (2014).
Ginno, P. A. et al. A genome-scale map of DNA methylation turnover identifies site-specific dependencies of DNMT and TET activity. Nat. Commun. 11, 2680 (2020).
Creyghton, M. P. et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl Acad. Sci. USA 107, 21931–21936 (2010).
Landau, D. A. et al. Locally disordered methylation forms the basis of intratumor methylome variation in chronic lymphocytic leukemia. Cancer Cell 26, 813–825 (2014).
Shipony, Z. et al. Dynamic and static maintenance of epigenetic memory in pluripotent and somatic cells. Nature 513, 115–119 (2014).
Landan, G. et al. Epigenetic polymorphism and the stochastic formation of differentially methylated regions in normal and cancerous tissues. Nat. Genet. 44, 1207–1214 (2012).
Pan, H. et al. Epigenomic evolution in diffuse large B-cell lymphomas. Nat. Commun. 6, 6921 (2015).
Jones, P. A. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat. Rev. Genet. 13, 484–492 (2012).
Turcan, S. et al. Mutant-IDH1-dependent chromatin state reprogramming, reversibility, and persistence. Nat. Genet. 50, 62–72 (2018).
Núñez, F. J. et al. IDH1-R132H acts as a tumor suppressor in glioma via epigenetic up-regulation of the DNA damage response. Sci. Transl. Med. 11, eaaq1427 (2019).
Flavahan, W. A. et al. Insulator dysfunction and oncogene activation in IDH mutant gliomas. Nature 529, 110–114 (2016).
Brocks, D. et al. Intratumor DNA methylation heterogeneity reflects clonal evolution in aggressive prostate cancer. Cell Rep. 8, 798–806 (2014).
Roerink, S. F. et al. Intra-tumour diversification in colorectal cancer at the single-cell level. Nature 556, 457–462 (2018).
Shibata, D. Mutation and epigenetic molecular clocks in cancer. Carcinogenesis 32, 123–128 (2011).
Moran, P. A. P. Notes on continuous stochastic phenomena. Biometrika 37, 17–23 (1950).
Maddison, W. P., Midford, P. E. & Otto, S. P. Estimating a binary character’s effect on speciation and extinction. Syst. Biol. 56, 701–710 (2007).
Stadler, T. & Bonhoeffer, S. Uncovering epidemiological dynamics in heterogeneous host populations using phylogenetic methods. Philos. Trans. R. Soc. B Biol. Sci. 368, 20120198 (2013).
Boyle, A. P. et al. High-resolution mapping and characterization of open chromatin across the genome. Cell 132, 311–322 (2008).
Bell, R. E. et al. Enhancer methylation dynamics contribute to cancer plasticity and patient mortality. Genome Res. 26, 601–611 (2016).
Ziller, M. J. et al. Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481 (2013).
Pastore, A. et al. Corrupted coordination of epigenetic modifications leads to diverging chromatin states and transcriptional heterogeneity in CLL. Nat. Commun. 10, 1874 (2019).
Irizarry, R. A. et al. The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat. Genet. 41, 178–186 (2009).
Polak, P. et al. A mutational signature reveals alterations underlying deficient homologous recombination repair in breast cancer. Nat. Genet. 49, 1476–1486 (2017).
Izzo, F. et al. DNA methylation disruption reshapes the hematopoietic differentiation landscape. Nat. Genet. 52, 378–387 (2020).
Challen, G. A. et al. Dnmt3a is essential for hematopoietic stem cell differentiation. Nat. Genet. 44, 23–31 (2011).
Klughammer, J. et al. The DNA methylation landscape of glioblastoma disease progression shows extensive heterogeneity in time and space. Nat. Med. 24, 1611–1624 (2018).
Boyer, L. A. et al. Polycomb complexes repress developmental regulators in murine embryonic stem cells. Nature 441, 349–353 (2006).
Margueron, R. & Reinberg, D. The Polycomb complex PRC2 and its mark in life. Nature 469, 343–349 (2011).
Bernstein, B. E. et al. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125, 315–326 (2006).
Boulard, M., Edwards, J. R. & Bestor, T. H. FBXL10 protects Polycomb-bound genes from hypermethylation. Nat. Genet. 47, 479–485 (2015).
Meissner, A. et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature 454, 766–770 (2008).
Domcke, S. et al. A human cell atlas of fetal chromatin accessibility. Science 370, eaba7612 (2020).
Mohn, F. et al. Lineage-specific Polycomb targets and de novo DNA methylation define restriction and potential of neuronal progenitors. Mol. Cell 30, 755–766 (2008).
Suvà, M. L., Riggi, N. & Bernstein, B. E. Epigenetic reprogramming in cancer. Science 339, 1567–1570 (2013).
Alcantara Llaguno, S. R. & Parada, L. F. Cell of origin of glioma: biological and clinical implications. Br. J. Cancer 115, 1445–1450 (2016).
Chaffer, C. L. et al. Normal and neoplastic nonstem cells can spontaneously convert to a stem-like state. Proc. Natl Acad. Sci. USA 108, 7950–7955 (2011).
Morris, V. et al. Single-cell analysis reveals mechanisms of plasticity of leukemia initiating cells. Preprint at bioRxiv https://doi.org/10.1101/2020.04.29.066332 (2020).
Lieberman, E., Hauert, C. & Nowak, M. A. Evolutionary dynamics on graphs. Nature 433, 312–316 (2005).
Lappalainen, T. & Greally, J. M. Associating cellular epigenetic models with human phenotypes. Nat. Rev. Genet. 18, 441–451 (2017).
Angermueller, C., Lee, H. J., Reik, W. & Stegle, O. DeepCpG: accurate prediction of single-cell DNA methylation states using deep learning. Genome Biol. 18, 67 (2017).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
Van den Berge, K. et al. Observation weights unlock bulk RNA-seq tools for zero inflation and single-cell applications. Genome Biol. 19, 24 (2018).
Risso, D., Perraudeau, F., Gribkova, S., Dudoit, S. & Vert, J.-P. A general and flexible method for signal extraction from single-cell RNA-seq data. Nat. Commun. 9, 284 (2018).
Van den Berge, K., Soneson, C., Robinson, M. D. & Clement, L. stageR: a general stage-wise method for controlling the gene-level false discovery rate in differential expression and differential transcript usage. Genome Biol. 18, 151 (2017).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Seshan, V. E. & Olshen, A. B. DNAcopy: a package for analyzing DNA copy data (v1.60.0). R package. (2021).
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).
Robinson, D. F. & Foulds, L. R. Comparison of phylogenetic trees. Math. Biosci. 53, 131–147 (1981).
Gittleman, J. L. & Kot, M. Adaptation: statistics and a null model for estimating phylogenetic effects. Syst. Biol. 39, 227–241 (1990).
Wartenberg, D. Multivariate spatial correlation: a method for exploratory geographical analysis. Geographical Anal. 17, 263–283 (1985).
Czaplewski, R. L. Expected Value and Variance of Moran’s Bivariate Spatial Autocorrelation Statistic for a Permutation Test (US Department of Agriculture, Forest Service, Rocky Mountain Forest and Range Experiment Station, 1993).
FitzJohn, R. G. Diversitree: comparative phylogenetic analyses of diversification in R. Methods Ecol. Evol. 3, 1084–1092 (2012).
Revell, L. J. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3, 217–223 (2012).
Xiang, Y., Gubian, S., Suomela, B. & Hoeng, J. Generalized simulated annealing for global optimization: the GenSA package. R Journal 5, 13 (2013).
Bolker, B. Maximum likelihood estimation and analysis with the bbmle package (v188.8.131.52). R package. (2021).
Gaiti, F., Silverbush, D., Schiffman, J. & Kluegel, L. Single-cell multi-omics profiling of human gliomas. Zenodo https://doi.org/10.5281/zenodo.4776456 (2021).
We thank members of the Landau and Suvà laboratories for constructive discussions, the Epigenomics Core Facility at Weill Cornell Medical College for technical help and E. Rheinbay at the Massachusetts General Hospital Cancer Center for whole-exome sequencing data processing. This project and R.C. have received funding from the European Union’s Horizon 2020 research and innovation program under Marie Skłodowska-Curie grant agreement no. 750345. F.G. was supported by NIH K99/R00 Pathway to Independence Award (NCI K99CA248955). D.S. was supported by EMBO long-term fellowship ALTF (570-2017) and by the Schmidt Family Foundation. J.K. was supported by an HFSP long-term fellowship (LT000452/2019-L). A.R. was supported by funds from the Howard Hughes Medical Institute, the Klarman Cell Observatory, the STARR Cancer Consortium, NCI grant 1U24CA180922, NCI grant R33CA202820, Koch Institute support (core) grant P30CA14051 from the NCI, the Ludwig Center and the Broad Institute. L.N.G.C. was supported by NIH award K12CA090354. This work was supported by grants to M.L.S. from the Mark Foundation (Emerging Leader Award), the Sontag Foundation (Distinguished Scientist Award), the MGH Research Scholars, and NCI R37CA245523 and NCI R01CA258763 (to M.L.S. and D.A.L.). D.A.L. was supported by the Burroughs Wellcome Fund Career Award for Medical Scientists, the Pershing Square Sohn Prize for Young Investigators in Cancer Research, the NIH Director’s New Innovator Award (DP2-CA239065), the Sontag Foundation (Distinguished Scientist Award, SFI 203261-01), the William Rhodes and Louise Tilzer-Rhodes Center for Glioblastoma at NewYork-Presbyterian Hospital (NYPH 203205-01) and NHGRI RM1HG011014-01. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
M.L.S. is an equity holder, scientific cofounder and advisory board member of Immunitas Therapeutics. A.R. is a founder and equity holder of Celsius Therapeutics, is an equity holder in Immunitas Therapeutics and, until 31 July 2020, was a scientific advisory board member of Syros Pharmaceuticals, Neogene Therapeutics, Asimov and ThermoFisher Scientific. Since 1 August 2020, A.R. has been an employee of Genentech. Since 19 October 2020, O.R.-R. has been an employee of Genentech. D.A.L. is an equity holder, scientific cofounder and advisory board member of C2i Genomics and a scientific advisory board member for Mission Bio. The authors declare that these activities are not related to the research reported in this publication and have not influenced the conclusions in this manuscript. The remaining authors declare no competing interests.
Peer review information Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended Data Fig. 1 Multi-omics single-cell sequencing of GBM reveals intra-tumoral DNAme heterogeneity.
a, CNA inference based on coverage depth imbalance in the scDNAme data in windows of 20 Mb (sliding window of 5 Mb). Rows correspond to cells, clustered by overall CNA pattern. b, Proportion of single cells belonging to GBM cellular states (left) and two-dimensional representation of GBM cellular states (middle) or cycling cells based on the relative expression of gene-sets associated with G1.S and G2.M (right) for each GBM patient sample (including MGH105 biological replicates and MGH121 technical replicates). Each quadrant corresponds to one cellular state and the exact position of malignant cells (dots) reflects their relative scores for pairs of gene modules previously defined in scRNAseq data (ref. 21). Light grey dots in the background represent all GBM samples (n = 844 malignant-only cells that passed quality control based on scRNAseq). c, Two-dimensional representation of single cells assigned to previously described LGm classes (ref. 39), visualized as triangle plots (where each vertex corresponds to one LGm class) across all 7 GBM samples (n = 867 cells [malignant and non-malignant] that passed quality control based on scDNAme, top), and the two samples harboring the highest number of cells: MGH105 (n = 339, middle) and MGH121 (n = 275, bottom). RNA differentiation score (defined as the difference in gene module scores between AC-/MES-like and NPC-/OPC-like cells) is overlaid. d, Proportion of GBM cells (n = 867 cells [malignant and non-malignant]) with high or low DNAme (defined as above or below the median of mean DNAme across windows of 1,000 bp around 450K array probes from TCGA glioma samples used in the analysis, respectively; Methods) assigned to previously described LGm classes (ref. 39). P value was determined by two-sided Fisher’s exact test (d).
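The high/low DNAme split and the two-sided Fisher’s exact test described for panel d can be illustrated with a minimal Python sketch. This is not the authors’ code: the function names, the toy values, the tie rule (values equal to the median are labelled low) and the reduction to a 2×2 table (one LGm class versus all others) are assumptions made purely for illustration.

```python
from math import comb
from statistics import median


def label_high_low(cell_dname, reference_dname):
    """Label each cell 'high' or 'low' relative to the median of a reference set.

    Assumption: values equal to the median are labelled 'low'.
    """
    threshold = median(reference_dname)
    return ["high" if value > threshold else "low" for value in cell_dname]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Toy data (hypothetical): per-cell mean DNAme and a class-membership flag.
reference = [0.40, 0.50, 0.60]  # stand-in for the bulk reference values
cells = [0.70, 0.65, 0.80, 0.30, 0.20, 0.45]
in_class = [True, True, True, False, False, False]

labels = label_high_low(cells, reference)
a = sum(lab == "high" and k for lab, k in zip(labels, in_class))      # high, in class
b = sum(lab == "high" and not k for lab, k in zip(labels, in_class))  # high, other
c = sum(lab == "low" and k for lab, k in zip(labels, in_class))       # low, in class
d = sum(lab == "low" and not k for lab, k in zip(labels, in_class))   # low, other
p = fisher_exact_two_sided(a, b, c, d)
```

In practice a tested library routine would normally be preferred; the explicit hypergeometric sum is shown only to make the definition of the two-sided P value visible.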
Extended Data Fig. 2 Multi-omics single-cell sequencing of IDH-MUT reveals intra-tumoral DNAme heterogeneity.
a, CNA inference based on coverage depth imbalance in the scDNAme data in windows of 20 Mb. Rows correspond to cells, clustered by overall CNA pattern. b, Proportion of single cells belonging to IDH-MUT cellular states (left) and developmental hierarchy representation of IDH-MUT cellular states (middle) or cycling cells based on the relative expression of gene-sets associated with G1.S and G2.M (right) for each IDH-MUT patient sample. Lineage and stemness scores define the exact position of malignant cells (dots) as computed from scRNAseq data. Light grey dots in the background represent all IDH-MUT samples (n = 739 malignant-only cells that passed quality control based on scRNAseq). c, UMAP of all single cells that passed quality control based on scRNAseq (GBM n = 937, IDH-MUT n = 809) or scDNAme (GBM n = 867, IDH-MUT n = 718). Each patient sample is indicated. See also Fig. 1b. d, Two-dimensional representation of single cells assigned to previously described LGm classes (ref. 39), visualized as triangle plots (where each vertex corresponds to one LGm class) across all 7 IDH-MUT samples (n = 718 cells [malignant and non-malignant] that passed quality control based on scDNAme, left), and three representative samples: MGH107 (n = 76), MGH135 (n = 96), and MGH208 (n = 177). DNAme value is overlaid. e, Same as (d) for the 7 GBM and 7 IDH-MUT samples (GBM, n = 867 cells [malignant and non-malignant]; IDH-MUT, n = 718 cells [malignant and non-malignant] that passed quality control based on scDNAme). Number of reads per cell (left), number of CpGs per cell (middle), and CpG conversion rate per cell (right) are overlaid. f, Proportion of IDH-MUT cellular states or cycling cells (n = 718 cells [malignant and non-malignant]) assigned to previously described LGm classes (ref. 39). P values were determined by two-sided Fisher’s exact test (f).
Extended Data Fig. 3 High-resolution copy number alteration mapping enabled by single-cell multi-omics.
a, UMAP of single cells that passed quality control based on scRNAseq (GBM n = 937, IDH-MUT n = 809). b, CNA inference based on bulk WES for GBM samples MGH105A/B/C, MGH122, and MGH124. EGFR locus is highlighted. c, CNA inference by scDNAme (red line) and scRNAseq (grey line) performance in correctly classifying chr. loss vs. neutral, as assessed by the AUC of ROC curve at different genomic window resolutions. ROC curve at 20 Mb resolution is shown (inset). 95% confidence intervals were generated using bootstrapping. d, CNA inferred by scDNAme (left) and scRNAseq (right) at a 50 Mb region centered at EGFR locus. Mean CNA profile per sample is shown in black. Red lines represent CNA segments identified by circular binary segmentation (CBS) analysis. e, EGFR expression as assessed by scRNAseq for each GBM patient sample (n = 844 malignant-only cells that passed quality control based on scRNAseq). f, Same as (d) for CNA inference by scDNAme at a 2 Mb region centered at EGFR locus. Individual cell CNA profiles are shown in grey. g, UMAP of single cells as defined in (a). Clonal chr. 7 gain (left) and chr. 10 loss (middle), as inferred by scDNAme, along with sub-clonal loss of chr. 6 (right), are indicated. h, Percentage of CpG methylation change at copy number gain, loss, and neutral chromosomal regions when comparing DNAme level of individual malignant cells to baseline for GBM (n = 7) and IDH-MUT (n = 3) samples. i, Same as (h) across all GBM and IDH-MUT samples for different thresholds adopted to define copy number gain vs. loss genomic window resolutions. P values were determined by two-sided Mann–Whitney U-test (d-f, h-i), comparing the EGFR expression median values across samples (e). Boxplots represent the median, bottom and upper quartiles, whiskers correspond to 1.5 times the interquartile range.
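The AUC comparison with bootstrapped 95% confidence intervals in panel c can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors’ pipeline: the scores are hypothetical (higher meaning stronger evidence of chromosome loss), the AUC is computed by pairwise comparison, and the interval is a percentile bootstrap that resamples loss and neutral windows separately.

```python
import random


def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def bootstrap_auc_ci(pos_scores, neg_scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    Assumption: positives and negatives are resampled separately.
    """
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_boot):
        pos = rng.choices(pos_scores, k=len(pos_scores))
        neg = rng.choices(neg_scores, k=len(neg_scores))
        aucs.append(roc_auc(pos, neg))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical per-window scores, e.g. the negated normalized coverage.
loss_windows = [0.9, 0.8, 0.7, 0.6]
neutral_windows = [0.1, 0.2, 0.65, 0.3]

auc = roc_auc(loss_windows, neutral_windows)
ci_low, ci_high = bootstrap_auc_ci(loss_windows, neutral_windows)
```

Repeating the calculation at each genomic window resolution would give one AUC and interval per resolution, which is the kind of curve the panel describes.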
Extended Data Fig. 4 GBM stem-like states exhibit PRC2 target hypomethylation compared with more differentiated-like cell states.
a, Differentially methylated TSS (±1Kb) between stem-like (NPC-like, n = 175 vs. OPC-like, n = 51; left) and differentiated GBM cellular states (MES-like, n = 201; AC-like, n = 168; right). b, Differential gene expression between AC-like (n = 205) and MES-like cells (n = 232). Genes with an absolute log2(fold-change) > 1 and BH-FDR < 0.05 were defined as differentially expressed (DE). DE genes belonging to immune response pathways are highlighted. c, Q-Q plot comparing the observed -log10P values of all genes used in the differential methylation analysis between GBM cellular states (Fig. 2c) to expected -log10P values. d, Distribution of mean promoter DNAme values in stem-like and differentiated cells for representative differentially methylated PRC2 target genes (Fig. 2c). e, Normalized enrichment scores for gene sets (MSigDB C2) enriched at hypomethylated promoters in NPC/OPC-like (turquoise) or MES-/AC-like (orange) cells (Fig. 2c; n = 15,218 genes). f, Enrichment score plot for SUZ12 targets (ref. 46) gene set enriched at hypomethylated promoters in NPC-/OPC-like cells (Fig. 2c; n = 15,218 genes). g, Same as (a) for a representative GBM sample (MGH105; NPC-/OPC-like, n = 50 cells; MES-/AC-like, n = 138 cells). Genes belonging to PRC2 targets (ref. 46) are labelled. h, Mean CpG methylation at promoters of PRC2 targets (ref. 46) between cell states for each of the 7 GBM samples. Difference in median promoter DNAme at PRC2 targets (ref. 46) between cell states is indicated. i, Median promoter DNAme at PRC2 targets (ref. 46) of MES-/AC-like and NPC-/OPC-like cells for each of the 7 GBM samples. j, Mean CpG methylation at ChIP-seq maps (ref. 50) of EZH2 and SUZ12 between GBM cell states (n = 706 cells). P values were determined by generalized linear model (a, c, g), weighted F-test (b), permutation test (f) and two-sided Mann-Whitney U test (i, j). Boxplots represent the median, bottom and upper quartiles, whiskers correspond to 1.5 times the interquartile range.
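The DE criterion stated for panel b (absolute log2 fold-change > 1 and BH-FDR < 0.05) can be made concrete with a short Python sketch. The gene names and values below are invented, and the helper names are hypothetical; only the two thresholds come from the caption, and the raw P values are taken as given rather than derived from the weighted F-test.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_de(genes, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Select genes passing both cutoffs.

    genes: dict mapping gene name -> (log2 fold-change, raw P value).
    """
    names = list(genes)
    fdr = benjamini_hochberg([genes[name][1] for name in names])
    return [
        name
        for name, q in zip(names, fdr)
        if abs(genes[name][0]) > lfc_cutoff and q < fdr_cutoff
    ]


# Hypothetical input: (log2 fold-change, raw P value) per gene.
toy = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.8, 0.004),
    "GENE_C": (0.4, 0.002),   # significant but effect size too small
    "GENE_D": (1.5, 0.60),    # large effect but not significant
}
de_genes = call_de(toy)
```

Applying both filters together is what keeps GENE_C and GENE_D out of the toy result: each passes one criterion but not the other.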
Extended Data Fig. 5 Validation of PRC2 hypomethylation in GBM stem-like states using histone marks, single-cell ATACseq and TCGA bulk data.
a, Proportion of chromatin states at randomly sampled promoters (1,000 random samplings) and hypomethylated promoters in GBM stem-like (top) vs. AC/MES-like (bottom) cells. b, Proportion of ChIP-seq peaks (ref. 47) at hypomethylated promoters in GBM stem-like vs. AC/MES-like cells. c, Heatmap of emission parameters for a HMM 18-state model derived from GBM ChIP-seq maps (ref. 47). Chromatin states of interest are highlighted in red. d, Proportion of chromatin states (see (c)) at hypomethylated promoters in GBM stem-like and AC/MES-like cells (Fig. 2c), all genes used in differential methylation promoter analysis (n = 15,218 genes), and randomly sampled promoters. e, Fold-change (log2) of chromatin states (see (c)) between hypomethylated promoters in GBM stem-like vs. AC/MES-like cells. Chromatin states of interest are highlighted in red. f, Differential gene expression between NPC/OPC-like (n = 270) and AC-/MES-like cells (n = 437). PRC2 target (ref. 46) genes are highlighted. g, EZH2 expression (scRNAseq) between NPC-/OPC-like and MES-/AC-like cells across GBM samples. h, Gene expression activity derived from scATAC-seq open chromatin for GBM cellular states, cell cycle-related genes, and PRC2 targets (ref. 46) at distinct NPC-/OPC-like and AC-/MES-like clusters identified based on scATACseq GBM data (ref. 55). i, UMAP of scATACseq GBM data (ref. 55; sample SF11956) overlaid with density plot of peaks frequency (top) and chromatin accessibility of housekeeping genes (ref. 1) (bottom). j, Spearman’s rank-order correlation between mean DNAme at promoters of PRC2 targets (ref. 46) and RNA differentiation score and bulk sample purity for 67 TCGA GBM samples (refs. 40, 41). k, Mean gene expression of hypomethylated PRC2 targets in stem-like cells (n = 60; Fig. 2c) and randomly selected non-PRC2 targets (n = 60) in TCGA GBM samples (refs. 40, 41) enriched for NPC-/OPC-like vs. AC-/MES-like signature. P values were determined by permutation test (a), two-sided Fisher’s exact test (b), weighted F-test (f) and two-sided Mann-Whitney U test (g, k). Boxplots represent the median, bottom and upper quartiles, whiskers correspond to 1.5 times the interquartile range.
a, Two-dimensional representation of single cells assigned to previously described LGm classes39, visualized as triangle plots (where each vertex corresponds to one LGm class) across 7 GBM samples (n = 867 cells [malignant and non-malignant] that passed quality control based on scDNAme). Mean DNAme at promoters of PRC2 targets46 (top), mean DNAme at promoters of housekeeping genes1, and number of tiles per cell (bottom) are overlaid for each triangle plot. b, Comparison between mean genome-wide DNAme (defined as the mean DNAme across windows of 1,000 bp around 450K array probes, Methods) and mean DNAme at promoters (TSS ± 1Kb) of PRC2 targets46 for the 478 TCGA GBM samples that were classified as LGm4-6 by Ceccarelli et al.39. LGm class assignment for each sample is shown. c, Left: mean genome-wide DNAme for TCGA GBM samples (n = 478) previously classified as either LGm4, LGm5, or LGm6 by Ceccarelli et al.39. Right: mean DNAme at promoters (TSS ± 1Kb) of PRC2 targets46 for TCGA GBM samples (n = 478) previously classified as either LGm4, LGm5, or LGm6 by Ceccarelli et al.39. P values were determined by two-sided Mann-Whitney U test (c). Boxplots represent the median, bottom and upper quartiles, whiskers correspond to 1.5 times the interquartile range.
Extended Data Fig. 7 Comparison of DNA methylation and chromatin state patterns between transcriptional cell states in IDH-MUT.
a, Q-Q plot comparing the observed -log10P values of genes used in the differential methylation analysis of promoters (n = 14,808 genes) between undiff/stem-like and AC-/OC-like IDH-MUT cellular states (defined in (b)) to expected -log10P values. b, Differentially methylated promoters between undiff/stem-like (n = 251) and AC-/OC-like (n = 133) cells with matched scRNAseq and scDNAme data across IDH-MUT samples. Promoters with absolute mean DNAme difference > 5% and P values < 0.05 were defined as differentially methylated (red). c, Enrichment score plots (n = 14,808 genes, as in (b)) for PRC2 and SUZ12 targets46 between stem-like/undifferentiated cells and AC-/OC-like cells in IDH-MUT samples. d-f, Same as (a-c), for single-cell DNA methylomes obtained performing double digestion with HaeIII+MspI on cells from two IDH-MUT samples (MGH201 and MGH208). g, Mean (±s.e.m.) CpG methylation at ChIP-seq maps50 of EZH2 and SUZ12 between undiff/stem-like and AC-/OC-like cells in each IDH-MUT sample. h, Proportion of chromatin states at hypomethylated promoters in IDH-MUT AC-/OC-like cells (defined in (b)), randomly sampled promoters (1,000 random samplings), and hypomethylated promoters in IDH-MUT undiff/stem-like (defined in (b)). i, Proportion of chromatin states at randomly sampled promoters (1,000 random samplings) and hypomethylated promoters in IDH-MUT undiff/stem-like (top) vs. AC-/OC-like cells (bottom). j, Proportion of ChIP-seq peaks47 at hypomethylated promoters in IDH-MUT undiff/stem-like vs. AC-/OC-like cells. k, Proportion of each of the chromatin states (defined in Extended Data Fig. 5c) at hypomethylated promoters in IDH-MUT undiff/stem-like (defined in (b)), hypomethylated promoters in IDH-MUT AC-/OC-like cells (defined in (b)), all genes used in differential methylation promoter analysis (n = 14,808 genes), and randomly sampled promoters, respectively. 
l, Fold-change (log2) of chromatin states between hypomethylated promoters in IDH-MUT undiff/stem-like vs. AC-/OC-like cells. P values were determined by generalized linear model (a-b, d-e), Fisher’s combined probability test (g), permutation test (c, f, h-i), two-sided Fisher’s exact test (j).
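Panel g reports P values from Fisher’s combined probability test. As a minimal sketch of that general technique (not the authors’ code), the following combines independent P values, for example one per sample, into a single P value. The function name `fisher_combined_p` is hypothetical, and independence of the input P values is an assumption of the method.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null (k = number of P values).
    """
    if not p_values:
        raise ValueError("need at least one P value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has the closed form
    #   exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

For instance, `fisher_combined_p([0.05, 0.05])` is smaller than either input, reflecting consistent evidence across the two tests.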
Extended Data Fig. 8 IDH-MUT cells exhibit preferential enhancer hypermethylation, decoupling of the promoter methylation-expression relationship and disruption of CTCF insulation.
a, Number of aligned reads and unique CpGs for MspI (n=476) and HaeIII+MspI digested IDH-MUT cells (n=242; MGH201 and MGH208). b, Mean CpG methylation at FANTOM5 enhancers vs. H3K27ac ChIP-seq peaks47,70 between GBM (n=765) and IDH-MUT (n=670) cells. c, Mean CpG methylation at TSS (±1Kb) vs. FANTOM5 enhancers between GBM (n=765) and IDH-MUT (n=670) cells (G-CIMP-low [MGH107, MGH135, MGH45, MGH64]; G-CIMP-high [MGH142, MGH201, MGH208]). d, Mean (±s.e.m.) CpG methylation at FANTOM5 enhancers for stem-like/undifferentiated and AC-/OC-like IDH-MUT cells. e, Epimutation rate across non-malignant (n=148), GBM (n=765) and IDH-MUT (n=670) cells. f, Proportion of cells with gene expression (read count >0) and above-threshold DNAme at 500-base-pair regions upstream (left) or downstream (right) of TSS. Data are mean (±s.e.m.) across all genes (expression seen in > 5 cells, DNAme >5 CpGs per region) for non-malignant cells (n=148), GBM (n=765) and IDH-MUT (n=670) cells. ‘*’ P-value < 0.05. g, Left: Distribution of Spearman’s rho of expression and promoter DNAme correlation (n=1,523 genes expressed >5 cells, DNAme >5 CpGs per promoter); GBM (n=765) and IDH-MUT (n=670) cells. Right: Median values of Spearman’s rho of expression and promoter DNAme correlation. h, Percentage of gene pairs across CTCF sites70 being co-expressed (both RNA read count >0); GBM (n=765) and IDH-MUT (n=670) cells. Scrambled represents randomly permuted cell labels for the expression values. Inset: Increase in percentage of gene pairs across CTCF sites70 being co-expressed when comparing matched vs. scrambled groups. Error bars represent 95% CIs. i, Gene expression correlation (Spearman’s rho) of gene pairs across CTCF sites70 per tile of mean CpG methylation at CTCF binding sites (low-to-high); IDH-MUT (n=670) cells. P values are two-sided Mann-Whitney U test (a-c, e-f, h-i), Fisher’s combined probability test (d), two-sided Kolmogorov–Smirnov test (g).
Boxplots represent the median, bottom and upper quartiles, whiskers correspond to 1.5 times the interquartile range.
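Panels g and i rely on Spearman’s rho, for example between a gene’s expression and its promoter DNAme across cells. The sketch below shows the standard definition, the Pearson correlation of average ranks. It is an illustration rather than the analysis code used for the figure; the helper names (`average_ranks`, `pearson`, `spearman_rho`) and the toy values in the usage note are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))
```

With toy per-cell values such as promoter DNAme `[0.1, 0.4, 0.8, 0.9]` and expression `[9, 5, 2, 0]`, `spearman_rho` returns a strongly negative value, the pattern expected when promoter methylation is coupled to silencing.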
Extended Data Fig. 9 High-resolution DNAme-based lineage trees coupled with leaf annotation of cellular states.
a, Representative (random cell subsampling) DNAme-based lineage tree for each GBM patient sample (including MGH105 biological replicates and MGH121 technical replicates), with projection of GBM cellular states. b, Representative (random cell subsampling) DNAme-based lineage tree for each IDH-MUT patient sample (including MGH142 and MGH208 technical replicates), with projection of IDH-MUT cellular states. Throughout the figure, scale represents DNAme changes per site.
Extended Data Fig. 10 Cell state transition dynamics inference from lineage tree architectures revealed higher cellular plasticity in GBM compared to a more stable differentiation hierarchy in IDH-MUT.
a, Top: GBM DNAme-based lineage tree (MGH105) with RPL5 c.621 C>G genotyping. Bottom: GBM gene module scores. b, IDH-MUT DNAme-based lineage tree (MGH107) with IDH-MUT gene module scores. c, Normalized Robinson-Foulds distance between GBM tree replicates (from same sample; full dataset or removing CpGs from DMRs (Fig. 2c) or PRC2 targets46) reconstructed by maximum-likelihood (ML) vs. maximum parsimony. d, Transcriptional distances as function of lineage distance between unique cell pairs for MGH115, MGH122 and MGH107. e, As (d), for DNAme-based lineage tree of MGH115 and MGH122 (n=47 and 46 cells, respectively) reconstructed removing CpGs from DMRs (Fig. 2c) or PRC2 targets46. f, Pairwise gene expression correlation (Pearson’s) and cross-correlation (heritability). Grey points=all gene pair relationships; red points=gene pair relationships within selected gene module (top: stem-like; bottom: cell cycle). g, Phylogenetic association of cell states on GBM (n=7 patients; n=10 samples with MGH105A-D) and IDH-MUT (n=7 patients). Barplots=weighted mean±s.e.m. Moran’s I permutation-based one-sided P values (10^6 permutations) across replicates. Dashed line: P=0.025. h, As (g), comparing DNAme-based lineage tree reconstruction of MGH115 and MGH122, using replicates from same sample with full dataset or removing CpGs from DMRs (Fig. 2c) or PRC2 targets46. Barplots=mean±s.e.m. i, Heat maps of pairwise cell state phylogenetic associations. Close phylogenetic associations are shown in warmer colors. j, ML estimate (median±MAD across tree replicates; samples as in (g)) rates of cell state growth and transition. k, Mathematical model of glioma evolutionary dynamics. l, ML estimate (mean±s.e.m. across tree replicates of MGH115 and MGH122) rates of cell state self-renewal and transition, using replicates from same sample (full dataset or removing CpGs from DMRs analysis (Fig. 2c) or PRC2 targets46).
m, Weighted median±weighted MAD rates of dedifferentiation compared to stem-like cell self-renewal across lineage tree replicates (sample as in (g)). P values: two-sided Mann-Whitney U test (d-e, j, l-m). Boxplots: median, bottom and upper quartiles, whiskers: 1.5 times the interquartile range.
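Panels g and h quantify phylogenetic association of cell states with Moran’s I and a one-sided permutation P value. The sketch below illustrates that general procedure for a per-cell state score and a matrix of pairwise weights between tree leaves. It is a sketch under stated assumptions, not the study’s implementation: inverse lineage distance is assumed as the weighting (the legend does not specify one), and the function names are hypothetical.

```python
import random


def inverse_distance_weights(dist):
    """Assumed weighting: w_ij = 1 / d_ij for distinct leaves, else 0."""
    n = len(dist)
    return [[0.0 if i == j or dist[i][j] <= 0 else 1.0 / dist[i][j]
             for j in range(n)] for i in range(n)]


def morans_i(values, weights):
    """Moran's I = (N / W) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    denom = sum(d * d for d in z)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    w_sum = sum(weights[i][j] for i, j in pairs)
    if denom == 0 or w_sum == 0:
        raise ValueError("Moran's I is undefined for constant values or zero weights")
    num = sum(weights[i][j] * z[i] * z[j] for i, j in pairs)
    return (n / w_sum) * num / denom


def permutation_p(values, weights, n_perm=1000, seed=0):
    """One-sided P value: fraction of label shuffles with I >= observed."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute state labels across tree leaves
        if morans_i(shuffled, weights) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A positive Moran’s I with a small permutation P value indicates that closely related cells share a state more often than expected by chance, which is the sense in which a state is heritable along the tree.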
About this article
Cite this article
Chaligne, R., Gaiti, F., Silverbush, D. et al. Epigenetic encoding, heritability and plasticity of glioma transcriptional cell states. Nat Genet 53, 1469–1479 (2021). https://doi.org/10.1038/s41588-021-00927-7