Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia

Abstract

Identifying the causes of human diseases requires deconvolution of abnormal molecular phenotypes spanning DNA accessibility, gene expression and protein abundance1,2,3. We present a single-cell framework that integrates highly multiplexed protein quantification, transcriptome profiling and analysis of chromatin accessibility. Using this approach, we establish a normal epigenetic baseline for healthy blood development, which we then use to deconvolve aberrant molecular features within blood from patients with mixed-phenotype acute leukemia4,5. Despite widespread epigenetic heterogeneity within the patient cohort, we observe common malignant signatures across patients as well as patient-specific regulatory features that are shared across phenotypic compartments of individual patients. Integrative analysis of transcriptomic and chromatin-accessibility maps identified 91,601 putative peak-to-gene linkages and transcription factors that regulate leukemia-specific genes, such as RUNX1-linked regulatory elements proximal to the marker gene CD69. These results demonstrate how integrative, multiomic analysis of single cells within the framework of normal development can reveal both distinct and shared molecular mechanisms of disease from patient samples.

Fig. 1: Multiomic epigenetic and phenotypic analysis of human hematopoiesis.
Fig. 2: Multiomic projection of MPALs into hematopoiesis identifies normal and leukemic programs.
Fig. 3: Integrative scATAC-seq and scRNA-seq analyses nominate putative TFs that regulate leukemic programs.

Data availability

Sequencing data are deposited in the Gene Expression Omnibus (GEO) with the accession code GSE139369. There are no restrictions on data availability or use.

Code availability

Code used in this study can be found on Github at https://github.com/GreenleafLab/MPAL-Single-Cell-2019.

References

1. Hoadley, K. A. et al. Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell 173, 291–304 (2018).
2. Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. Science 362, eaav1898 (2018).
3. Polak, P. et al. Cell-of-origin chromatin organization shapes the mutational landscape of cancer. Nature 518, 360–364 (2015).
4. Weinberg, O. K. & Arber, D. A. Mixed-phenotype acute leukemia: historical overview and a new definition. Leukemia 24, 1844–1851 (2010).
5. Arber, D. A. et al. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood 127, 2391–2405 (2016).
6. Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
7. Satpathy, A. T. et al. Massively parallel single-cell chromatin landscapes of human immune cell development and intratumoral T cell exhaustion. Nat. Biotechnol. 37, 925–936 (2019).
8. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
9. Cusanovich, D. A. et al. The cis-regulatory dynamics of embryonic development at single-cell resolution. Nature 555, 538–542 (2018).
10. Cusanovich, D. A. et al. A single-cell atlas of in vivo mammalian chromatin accessibility. Cell 174, 1309–1324 (2018).
11. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
12. McInnes, L., Healy, J. & Melville, J. UMAP: Uniform manifold approximation and projection for dimension reduction. Preprint at arXiv https://arxiv.org/abs/1802.03426 (2018).
13. Janeway, C. J., Travers, P., Walport, M. & Shlomchik, M. J. Immunobiology 5th edn (Garland Science, 2001).
14. Pliner, H. A. et al. Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data. Mol. Cell 71, 858–871 (2018).
15. Schep, A. N., Wu, B., Buenrostro, J. D. & Greenleaf, W. J. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods 14, 975–978 (2017).
16. Alexander, T. B. et al. The genetic basis and cell of origin of mixed phenotype acute leukaemia. Nature 562, 373–379 (2018).
17. Takahashi, K. et al. Integrative genomic analysis of adult mixed phenotype acute leukemia delineates lineage associated molecular subtypes. Nat. Commun. 9, 2670 (2018).
18. Corces, M. R. et al. Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193–1203 (2016).
19. van Galen, P. et al. Single-cell RNA-seq reveals AML hierarchies relevant to disease progression and immunity. Cell 176, 1265–1281 (2019).
20. Satpathy, A. T. et al. Transcript-indexed ATAC-seq for precision immune profiling. Nat. Med. 24, 580–590 (2018).
21. Mezger, A. et al. High-throughput chromatin accessibility profiling at single-cell resolution. Nat. Commun. 9, 3647 (2018).
22. Buenrostro, J. D. et al. Integrated single-cell analysis maps the continuous regulatory landscape of human hematopoietic differentiation. Cell 173, 1535–1548 (2018).
23. Li, B. et al. Census of immune cells. HCA https://data.humancellatlas.org/explore/projects/cc95ff89-2e68-4a08-a234-480eca21ce79 (2018).
24. Mitchell, K. et al. IL1RAP potentiates multiple oncogenic signaling pathways in AML. J. Exp. Med. 215, 1709–1727 (2018).
25. Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).
26. Lim, S. & Kaldis, P. Cdks, cyclins and CKIs: roles beyond cell cycle regulation. Development 140, 3079–3093 (2013).
27. Wolach, O. & Stone, R. M. How I treat mixed-phenotype acute leukemia. Blood 125, 2477–2485 (2015).
28. Zheng, C. et al. What is the optimal treatment for biphenotypic acute leukemia? Haematologica 94, 1778–1780 (2009).
29. Osato, M. et al. Biallelic and heterozygous point mutations in the runt domain of the AML1/PEBP2αB gene associated with myeloblastic leukemias. Blood 93, 1817–1824 (1999).
30. Haferlach, T. et al. Landscape of genetic lesions in 944 patients with myelodysplastic syndromes. Leukemia 28, 241–247 (2014).
31. Zhang, J. et al. The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature 481, 157–163 (2012).
32. Della Gatta, G. et al. Reverse engineering of TLX oncogenic transcriptional networks identifies RUNX1 as tumor suppressor in T-ALL. Nat. Med. 18, 436–440 (2012).
33. Wang, X. et al. Breast tumors educate the proteome of stromal tissue in an individualized but coordinated manner. Sci. Signal. 10, eaam8065 (2017).
34. Sanda, T. et al. Core transcriptional regulatory circuit controlled by the TAL1 complex in human T cell acute lymphoblastic leukemia. Cancer Cell 22, 209–221 (2012).
35. Ben-Ami, O. et al. Addiction of t(8;21) and inv(16) acute myeloid leukemia to native RUNX1. Cell Rep. 4, 1131–1143 (2013).
36. Wilkinson, A. C. et al. RUNX1 is a key target in t(4;11) leukemias that contributes to gene activation through an AF4–MLL complex interaction. Cell Rep. 3, 116–127 (2013).
37. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
38. Welch, J. D. et al. Single-cell multi-omic integration compares and contrasts features of brain cell identity. Cell 177, 1873–1887 (2019).
39. Mumbach, M. R. et al. Enhancer connectome in primary human cells identifies target genes of disease-associated DNA elements. Nat. Genet. 49, 1602–1612 (2017).
40. Martín, P. et al. CD69 association with Jak3/Stat5 proteins regulates Th17 cell differentiation. Mol. Cell. Biol. 30, 4877–4889 (2010).
41. Shiow, L. R. et al. CD69 acts downstream of interferon-α/β to inhibit S1P1 and lymphocyte egress from lymphoid organs. Nature 440, 540–544 (2006).
42. Egawa, T., Tillman, R. E., Naoe, Y., Taniuchi, I. & Littman, D. R. The role of the Runx transcription factors in thymocyte differentiation and in homeostasis of naive T cells. J. Exp. Med. 204, 1945–1957 (2007).
43. Laguna, T. et al. New insights on the transcriptional regulation of CD69 gene through a potent enhancer located in the conserved non-coding sequence 2. Mol. Immunol. 66, 171–179 (2015).
44. Simeonov, D. R. et al. Discovery of stimulation-responsive immune enhancers with CRISPR activation. Nature 549, 111–115 (2017).
45. Feld, C. et al. Combined cistrome and transcriptome analysis of SKI in AML cells identifies SKI as a co-repressor for RUNX1. Nucleic Acids Res. 46, 3412–3428 (2018).
46. Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N. Engl. J. Med. 368, 2059–2074 (2013).
47. Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
48. Soneson, C. & Robinson, M. D. Bias, robustness and scalability in single-cell differential expression analysis. Nat. Methods 15, 255–261 (2018).

Acknowledgements

We thank A. Satpathy and other members of the Chang and Greenleaf laboratories for helpful discussions. We thank the following people at 10x Genomics: D. Jhutty, J. Lau, J. Lee, L. Montesclaros, K. Pfeiffer, J. Terry, J. Wang, Y. Yin and S. Ziraldo for help with sample preparation and library generation of scATAC-seq and feature barcoding libraries. We acknowledge the Stanford Hematology Division Tissue Bank for providing samples for this study. This study was supported by the Swedish Research Council (grant 2015–06403, to A.M.). M.R.C. is supported by grant K99AG059918 (NIA) and the American Society of Hematology Scholars award. Further support came from National Institutes of Health grants P50-HG007735 and UM1-HG009442 (to H.Y.C. and W.J.G.), UM1-HG009436 and U19-AI057266 (to W.J.G.), and R35-CA209919 (to H.Y.C.), as well as from Ludwig Cancer Research (to R.M. and H.Y.C.) and grants from the Chan-Zuckerberg Initiative and the Rita Allen Foundation. H.Y.C. is an Investigator of the Howard Hughes Medical Institute. W.J.G. is a Chan–Zuckerberg Investigator. S.K. was supported by The Stanford Genome Training Program (NIH/NHGRI). B.P. was supported by the JIMB/NIST training program.

Author information

L.M.M. and S.K. conceived the project and designed the experiments. L.M.M., M.L., E.G. and R.M. curated patient samples. S.K. led data production and performed the experiments together with A.S.K., A.M. and L.M.M. G.X.Y.Z. provided healthy bone marrow and peripheral blood CITE-seq data. S.K. analyzed the scADT-seq data with contribution from B.P. M.R.C. performed data analysis. J.M.G. conceived the analytical workflows and performed the data analysis for scATAC-seq and scRNA-seq supervised by H.Y.C. and W.J.G. J.M.G., S.K., L.M.M. and W.J.G. wrote the manuscript with input from all authors.

Correspondence to Sandy Klemm or Lisa M. McGinnis or William J. Greenleaf.

Ethics declarations

Competing interests

R.M. is a founder of, is an equity holder in, and serves on the board of directors of Forty Seven. H.Y.C. has affiliations with Accent Therapeutics (founder and scientific advisory board (SAB) member), 10x Genomics (SAB member), Boundless Bio (cofounder, SAB), Arsenal Biosciences (SAB) and Spring Discovery (SAB member). W.J.G. has affiliations with 10x Genomics (consultant), Guardant Health (consultant) and Protillion Biosciences (co-founder and consultant).

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Quality control of CITE-seq data for hematopoiesis samples.

(a) Number of cells passing filter for each experimental replicate (number of informative genes > 400 and number of unique molecular identifiers (UMI) > 1000). (b) Number of aligned reads on average per cell passing filter for each experimental sample. (c) Violin and box-whisker plot of the number of informative genes detected per single cell passing filter per experimental sample (n = 2,424 – 7,544). (d) Violin and box-whisker plot of the number of unique molecular identifier (UMI) transcripts per cell passing filter per experimental sample (n = 2,424 – 7,544). (e) Aggregated scRNA-seq (n = 20,287) one-to-one reproducibility plots for biological replicates (left) and across sample types (right) colored by the density. The correlation (r) represents the Pearson correlation across all genes. (f) scRNA-seq experimental sample labels overlay on UMAP of hematopoiesis (n = 35,582). (g) scADT-seq UMAP of BMMC and PBMC samples (n = 4) across 14 antibodies. scADT overlay of experimental sample labels, CD19, CD3, CD56, CD4, CD8A, CD14, CD16, CD45RA, CD45RO, TIGIT and PD-1, respectively. Color represents experimental labels or scADT-seq values after CLR transformation. (h) scADT-seq UMAP of BMMC and PBMC samples (n = 4) across 14 antibodies colored by scRNA-seq clusters with biological classification. Box-whisker plot; lower whisker is the lowest value greater than the 25% quantile minus 1.5 times the interquartile range (IQR), the lower hinge is the 25% quantile, the middle is the median, the upper hinge is the 75% quantile and the upper whisker is the largest value less than the 75% quantile plus 1.5 times the IQR.
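
The cell-level filter in panel (a) can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the `CellQC` record and the function names are hypothetical, and only the two thresholds (informative genes > 400, UMIs > 1000) come from the legend.

```python
from dataclasses import dataclass

# Thresholds stated in the legend; everything else here is illustrative.
MIN_GENES = 400
MIN_UMIS = 1000


@dataclass
class CellQC:
    """Hypothetical per-cell QC summary (not a structure from the study)."""
    barcode: str
    n_genes: int  # informative genes detected in the cell
    n_umis: int   # unique molecular identifiers counted in the cell


def passes_filter(cell, min_genes=MIN_GENES, min_umis=MIN_UMIS):
    """A cell passes only if it strictly exceeds both thresholds."""
    return cell.n_genes > min_genes and cell.n_umis > min_umis


def filter_cells(cells):
    """Return the cells that pass the QC filter, preserving input order."""
    return [cell for cell in cells if passes_filter(cell)]
```

Both comparisons are strict, matching the ">" signs in the legend, so a cell with exactly 400 informative genes would be excluded.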

Supplementary Figure 2 Quality control of scATAC-seq data for hematopoiesis samples.

(a) scATAC-seq cell filtering plot of 4 representative scATAC-seq hematopoietic samples. The x-axis is the number of unique accessible fragments and the y-axis is the enrichment of Tn5 insertions at transcription start sites, representing the robust signal to background for each single cell. (b) Aggregated scATAC-seq fragment size distributions across individual experiments demonstrating sub-, mono- and multi-nucleosome-spanning ATAC-seq fragments. (c) Number of cells passing filter for each experimental replicate (unique nuclear fragments > 1000 and TSS enrichment > 8). (d) Violin and box-whisker plot of the number of total aligned fragments for each single cell passing filter per experimental sample (n = 836 – 12,394). (e) Violin and box-whisker plot of the number of unique aligned nuclear fragments for each single cell passing filter per experimental sample (n = 836 – 12,394). (f) Violin and box-whisker plot of the fraction of the total number of Tn5 insertions (reads) that are within the healthy hematopoietic union peak set (n = 452,004) for each single cell passing filter. (g) Violin and box-whisker plot of the normalized transcription start site (TSS) enrichment for each single cell passing filter per experimental sample. (h) Aggregated scATAC-seq (n = 452,004) one-to-one reproducibility plots for biological replicates colored by the density. The correlation (r) represents the Pearson correlation across all genes. (i) scATAC-seq experimental sample labels overlay on UMAP of hematopoiesis (n = 35,038). Box-whisker plot; lower whisker is the lowest value greater than the 25% quantile minus 1.5 times the interquartile range (IQR), the lower hinge is the 25% quantile, the middle is the median, the upper hinge is the 75% quantile and the upper whisker is the largest value less than the 75% quantile plus 1.5 times the IQR.
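
The box-whisker convention described in these legends can be made concrete with a short sketch. It assumes linearly interpolated quantiles (Python's `statistics.quantiles` with `method="inclusive"`); the plotting software used in the study may interpolate differently, and the function name is illustrative.

```python
import statistics


def box_whisker_stats(values):
    """Hinges, median and whiskers following the convention in the legend.

    Whiskers are the most extreme observations that still lie within
    1.5 * IQR of the hinges. Quantile interpolation is an assumption.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "lower_whisker": min(v for v in data if v >= lower_fence),
        "lower_hinge": q1,
        "median": median,
        "upper_hinge": q3,
        "upper_whisker": max(v for v in data if v <= upper_fence),
    }
```

Observations beyond the whiskers (for example a single very deep cell) are left out of the whisker range and would typically be drawn as individual points.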

Supplementary Figure 3 Validation of key marker genes for both scRNA-seq and scATAC-seq for hematopoiesis.

(a-h) Multi-omic tracks; (Top) average track of all clusters displayed, (Middle) binarized 100 random scATAC-seq tracks for each locus at 100-bp resolution and (Right) violin and box-whisker plot of the scRNA-seq log2 normalized expression for each cluster. Box-whisker plot; lower whisker is the lowest value greater than the 25% quantile minus 1.5 times the interquartile range (IQR), the lower hinge is the 25% quantile, the middle is the median, the upper hinge is the 75% quantile and the upper whisker is the largest value less than the 75% quantile plus 1.5 times the IQR. (a) Multi-omic track of GATA1 (specific in these clusters for Erythroid) for erythroid development from HSC progenitor cells (n = 111 - 1,653). (b) Multi-omic track of GATA2 (specific in these clusters for Basophil) for erythroid development from HSC progenitor cells (n = 111 - 1,653). (c) Multi-omic track of ELANE (specific in these clusters for GMP/Neutrophil) for neutrophil development from HSC progenitor cells (n = 1,050 – 2,260). (d) Multi-omic track of IRF8 (specific in these clusters for pDC) across pDC development from HSC progenitor cells (n = 544 – 2,260). (e) Multi-omic track of SDC1 (specific in these clusters for Plasma cells) across B cell development and plasma cells (n = 62 – 2,260). (f) Multi-omic track of CD1C (specific in these clusters for cDC) across cDC development from HSC progenitor cells (n = 325 – 2,260). (g) Multi-omic track of SELL (specific in these clusters for Naive T cells vs memory, and CD8 central memory vs CD8 effector memory) across NK and T cells (n = 796 – 3,539). (h) Multi-omic track of GZMB (specific in these clusters for NK cells) across NK and T cells (n = 796 – 3,539).

Supplementary Figure 4 Diagnostic flow cytometry plots for MPALs 1-5R.

(a-f) Diagnostic flow cytometry plots from the 5 different MPAL cases (MPAL1-5R) gated on blasts area (highlighted in red) and lymphocytes (highlighted in black) from CD45 and side scatter area (SSC-A). (a) MPAL 1 shows classic bilineal phenotype with both T-lymphoblasts (cCD3-positive and CD7-positive) and myeloid blasts (MPO-positive and CD33-positive). (b) MPAL 2 demonstrates a more complex phenotype with both biphenotypic (single population expressing lymphoid marker CD7 and myeloid marker CD33) and bilineal T-Myeloid patterns (subpopulation expressing monocytic markers CD64, CD33, and CD14). (c) MPAL 3 demonstrates a classic biphenotypic case with coexpression of both T-lineage markers (cCD3-positive) and myeloid markers (MPO-positive). (d) MPAL4 demonstrates a classic bilineal B/M phenotype expressing B-lineage markers (CD79a and CD19-positive) and myeloid markers (MPO-positive and CD33-positive). (e) MPAL5 demonstrates a more complicated phenotype with a subpopulation of blasts expressing T-lineage markers (cCD3-positive and CD7-positive) and a subpopulation expressing myeloid marker MPO. (f) MPAL5R post-treatment relapse of MPAL5. Flow cytometry reveals expansion of the T-lymphoblastic subpopulation (cCD3-positive, TdT-positive population) following chemotherapy. (g) High-confidence mutations detected in 5 MPAL cases by whole exome sequencing. Missense mutations are shown in blue, frameshift deletions are shown in yellow, stop-gain mutations are shown in purple, frameshift insertions are shown in orange, and nonframeshift deletions are shown in dark gray.

Supplementary Figure 5 Quality control of scRNA-seq and scATAC-seq data for MPAL samples.

(a) Number of cells passing filter for each experimental replicate (number of informative genes > 400 and number of unique molecular identifiers (UMI) > 1000). (b) Number of aligned reads on average per cell passing filter for each experimental sample. (c) Violin and box-whisker plot of the number of informative genes detected per single cell passing filter per experimental sample. (d) Violin and box-whisker plot of the number of unique molecular identifier (UMI) transcripts per cell passing filter per experimental sample. (e) Aggregated scRNA-seq (n = 20,287) one-to-one reproducibility plots for technical replicates colored by the density. The correlation (r) represents the Pearson correlation across all genes. (f) scATAC-seq cell filtering plot of 6 representative scATAC-seq MPAL samples. The x-axis is the number of unique accessible fragments and the y-axis is the enrichment of Tn5 insertions at transcription start sites, representing the robust signal to background for each single cell. (g) Aggregated scATAC-seq fragment size distributions across individual experiments demonstrating sub-, mono- and multi-nucleosome-spanning ATAC-seq fragments. (h) Number of cells passing filter for each experimental replicate (unique nuclear fragments > 1000 and TSS enrichment > 8). (i) Violin and box-whisker plot of the number of total aligned fragments for each single cell passing filter per experimental sample. (j) Violin and box-whisker plot of the number of unique aligned nuclear fragments for each single cell passing filter per experimental sample. (k) Violin and box-whisker plot of the fraction of the total number of Tn5 insertions (reads) that are within the MPAL union peak set (n = 346,274) for each single cell passing filter. (l) Violin and box-whisker plot of the normalized transcription start site (TSS) enrichment for each single cell passing filter per experimental sample. (m) Aggregated scATAC-seq (n = 346,274) one-to-one reproducibility plots for technical replicates colored by the density. The correlation (r) represents the Pearson correlation across all genes. Box-whisker plot; lower whisker is the lowest value greater than the 25% quantile minus 1.5 times the interquartile range (IQR), the lower hinge is the 25% quantile, the middle is the median, the upper hinge is the 75% quantile and the upper whisker is the largest value less than the 75% quantile plus 1.5 times the IQR.

Supplementary Figure 6 Evaluation of LSI projection workflow for previously published bulk and single-cell hematopoietic data sets across different platforms.

(a) Overview of LSI projection workflow. Briefly, when computing the TF-IDF transform for the original hematopoietic manifold, we stored the feature sums (document frequency); we use this information to compute the TF-IDF transform for the new matrix. We then apply the hematopoietic singular value decomposition loadings to the new TF-IDF matrix to project the new matrix into a common subspace. To visualize the hematopoietic subspace, we constructed a UMAP projection using uwot in R. We then used this learned manifold to project the new matrix subspace into the original UMAP projection. (b) LSI projection of downsampled previously published bulk sorted hematopoietic data sets18,20. (Left) RNA-seq downsampled bulk projections for 49 samples (n = 250 downsampled cells). (Right) ATAC-seq downsampled bulk projections for 90 samples (n = 250 downsampled cells). (c) LSI projection of downsampled previously published single-cell hematopoietic data sets labeled by previous classifications20,21,22. (Left) scRNA-seq projections of healthy bone marrow cells from a previous study (different platform and different aligned genome) colored by previous classifications. (Right) scATAC-seq projections for healthy bone marrow and peripheral blood samples (2 different platforms across 3 studies), colored by ground truth isolated populations. (d) Projection of hematopoietic scRNA-seq into the Human Cell Atlas (HCA) Census of Immune Cells. (Left) Number of cells for each of 8 bone marrow donors. (Middle) UMAP projection of LSI iterative clustering of HCA bone marrow scRNA-seq. (Right) LSI projection of our scRNA-seq hematopoietic single cells into HCA bone marrow UMAP colored by cluster definitions.
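
The projection step in panel (a) can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the TF-IDF weighting shown (term frequency times log(1 + N_ref / feature sum)) is one common variant, the function names are hypothetical, and the reference loadings and singular values are assumed to have been computed beforehand. The point it illustrates is that the new matrix reuses the reference feature sums and loadings instead of recomputing them.

```python
import math


def tfidf_with_reference(counts, ref_feature_sums, n_ref_cells):
    """TF-IDF of a features x cells matrix using stored reference statistics.

    Assumed weighting: TF = count / per-cell total,
    IDF = log(1 + n_ref_cells / reference feature sum).
    Assumes every cell total and every reference feature sum is non-zero.
    """
    n_features, n_cells = len(counts), len(counts[0])
    cell_totals = [
        sum(counts[f][c] for f in range(n_features)) for c in range(n_cells)
    ]
    idf = [math.log(1 + n_ref_cells / s) for s in ref_feature_sums]
    return [
        [counts[f][c] / cell_totals[c] * idf[f] for c in range(n_cells)]
        for f in range(n_features)
    ]


def project_into_lsi(tfidf, loadings, singular_values):
    """Project new cells into the reference subspace.

    If the reference matrix factorizes as A = U D V^T, the coordinates of
    new cells are V_new = A_new^T U D^-1. `loadings` is U (features x k)
    and `singular_values` holds the diagonal of D.
    """
    n_features, n_cells = len(tfidf), len(tfidf[0])
    k = len(singular_values)
    return [
        [
            sum(tfidf[f][c] * loadings[f][j] for f in range(n_features))
            / singular_values[j]
            for j in range(k)
        ]
        for c in range(n_cells)
    ]
```

The returned rows are per-cell coordinates in the reference subspace; a previously fitted UMAP model would then map them into the reference embedding, which is outside the scope of this sketch.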

Supplementary Figure 7 LSI projection of previously published healthy and AML scRNA-seq identifies malignant programs across AML subpopulations.

(a) (Left) Schematic of LSI projection. (Right) Initial projection of all AML malignant single cells colored by previous classifications19 (n = 31,890). (b) Re-classification of scRNA-seq AML single cells based on closest normal cells in healthy hematopoiesis (see Methods). Broader re-classification increases the number of cells per category for improved power in differential analyses. LSI projection for each individual AML sample onto scRNA-seq healthy hematopoiesis colored by re-classifications (the sample ID and number of cells are denoted; n = 143 – 3,358). (c) K-means differential scRNA-seq heatmap (k = 10), colored by log2 fold change, comparing each AML sample's subpopulations (classifications) with their closest normal bone marrow cells from the same study19.

Supplementary Figure 8 Classification of MPALs by projection onto the hematopoietic hierarchy.

(a) MPAL single cell classification workflow for scRNA and scATAC-seq. First, cells were clustered with reference hematopoietic cells (1) and classified based on healthy hematopoietic clusters (2-3). Cells were then LSI projected into the reference hematopoietic manifold (4) and then classified based on the nearest reference cell hematopoietic compartment (5). MPAL 5 replicate 1 is shown as an example for scRNA (Top) and scATAC-seq (Bottom). (b) Proportion of estimated blast cells for each MPAL with Flow Cytometry, Morphology, scRNA and scATAC-seq (Range is from 0 to 1). (c) (Left) Projected MPALs colored by hematopoietic compartments as described in a. (Right) scADT-seq overlay of CD7, CD33, CD14, CD4 and CD19 on MPAL single cells LSI projected onto hematopoiesis.
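
Step (5) of the workflow, assigning each projected cell the compartment of its nearest reference cell, can be sketched as a nearest-neighbor lookup. The legend does not state the distance metric or the space in which neighbors are found, so Euclidean distance in the projected coordinates is an assumption here, as are the function name and the optional majority vote over `k` neighbors.

```python
import math
from collections import Counter


def classify_by_nearest_reference(query, reference_coords, reference_labels, k=1):
    """Label a projected cell by its nearest reference cell(s).

    query: coordinates of one projected cell.
    reference_coords / reference_labels: coordinates and compartment labels
    of the reference cells, in matching order.
    k: number of neighbors to vote over (k=1 is the plain nearest cell).
    """
    # Rank reference cells by Euclidean distance to the query cell.
    order = sorted(
        range(len(reference_coords)),
        key=lambda i: math.dist(query, reference_coords[i]),
    )
    votes = Counter(reference_labels[i] for i in order[:k])
    # On ties, Counter keeps first-seen order, so the closer neighbor wins.
    return votes.most_common(1)[0][0]
```

A brute-force scan like this is fine for illustration; for tens of thousands of reference cells an indexed neighbor search would be the practical choice.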

Supplementary Figure 9 Visualization of top differential genes and accessible peak regions.

(a) Top conserved differential genes across the MPAL hematopoietic compartments. (b) Top conserved differential transcription factors across the MPAL hematopoietic compartments. (c) KEGG pathway enrichment of conserved differential genes (scRNA) in k-means clusters 2, 3, 4 and 10 (Figure 2c, n = 2,117 genes). Color represents the significance (hypergeometric test adjusted p-value with the Benjamini-Hochberg correction) of the KEGG pathway generated using clusterProfiler (Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012)). (d-e) Multi-omic differential tracks (Left) scATAC tracks showing MPAL disease subpopulations (red) and closest normal cells (gray). (Right) Violin plot of the log2 normalized expression for MPAL disease subpopulations (red) and closest normal cells (gray); black line represents the mean and asterisks denote significance (LFC > 0.5 and FDR < 0.01 from Figure 2c). (d) Multi-omic differential track of CDK11A, up-regulated in MPALs 1, 2, 5 and 5R (n = 89 - 500). (e) Multi-omic differential track of CDKN2A, up-regulated in MPALs 1, 2, 3, 4, and 5 (n = 89 - 500).
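
The significance measure in panel (c), a hypergeometric test followed by Benjamini-Hochberg adjustment, can be written out directly. This is a generic sketch of the two textbook procedures, not a reimplementation of clusterProfiler; the function and parameter names are illustrative.

```python
from math import comb


def hypergeom_pvalue(k, n_drawn, n_pathway, n_universe):
    """One-sided enrichment p-value P(X >= k).

    k: genes in the query list that belong to the pathway.
    n_drawn: size of the query gene list.
    n_pathway: pathway genes in the background universe.
    n_universe: total genes in the background universe.
    """
    upper = min(n_drawn, n_pathway)
    total = comb(n_universe, n_drawn)
    tail = sum(
        comb(n_pathway, i) * comb(n_universe - n_pathway, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each pathway gets one hypergeometric p-value, and the adjustment is then applied across all tested pathways together.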

Supplementary Figure 10 LSI projection of bulk leukemia RNA-seq onto hematopoietic hierarchy.

(a) Schematic of LSI projection of downsampled bulk leukemia RNA-seq onto healthy hematopoiesis. (b) Representative downsampled LSI projections (n = 250) for B-ALLs, non-ETP T-ALLs, ETP T-ALLs, AMLs, T/M MPALs and B/M MPALs from previous studies16. (c) LSI UMAP of differentially up-regulated gene expression profiles across bulk leukemias16 (n = 321) and MPAL subpopulations assayed in this study (n = 17), colored by cytogenetics. (d) Binary heatmap of variable malignant genes across leukemia classifications. Each cell in the heatmap is colored by whether the gene was identified as malignant for the leukemic sample.

Supplementary Figure 11 Seurat canonical correlation analysis alignment of scRNA and scATAC-seq hematopoietic and MPAL samples.

(a) UMAP of CCA alignment of scATAC-seq using Cicero gene activity scores and scRNA-seq for (Left) bone marrow (nATAC = 12,602; nRNA = 16,510), (Middle) CD34+ enriched bone marrow (nATAC = 8,176; nRNA = 10,160), (Right) peripheral blood (nATAC = 14,804; nRNA = 8,368). (b) UMAP of CCA alignment of scATAC-seq using Cicero gene activity scores and scRNA-seq for MPAL samples 1-5R (nATAC = 4,127 – 8,255; nRNA = 835 – 5,885).

Supplementary Figure 12 Evaluation of scRNA and scATAC-seq alignment and peak-to-gene linkage across hematopoiesis and MPAL samples.

(a) Spearman rank correlation between scATAC-seq Cicero gene activity scores and scRNA-seq for each mapped cell across all biological experiments (n = 4,127 – 16,510). (b) Pearson correlation of CCA scRNA and scATAC-seq nearest-neighbors. The cutoff (R > 0.45) for high quality nearest neighbor mappings is shown (n = 4,127 – 16,510). (c) (Left) UMAP of scATAC-seq hematopoiesis colored by scATAC-seq clusters (n = 35,038). (Right) UMAP of scATAC-seq hematopoiesis colored by filtered mapped scRNA-seq clusters (n = 34,507). (d) Confusion matrix of initial clusters for mapped scRNA-seq to scATAC-seq clusters for hematopoiesis (Figure 1b-c). Above shows the assigned biological classifications for each cluster across both scRNA and scATAC-seq. (e) (Left) Distribution of peak-to-gene distances. (Left-Middle) Distribution of number of peaks mapped per gene (median = 6). (Right-Middle) Distribution of number of genes mapped per peak (median = 1). (Right) Distribution of number of genes skipped for peak-to-gene links (median = 2). (f) MetaV4C plots of K27ac HiChIP in Naive T and HCASMC cells for top 500 biased T/NK (broad classification) peak-to-gene links that are identified only in healthy hematopoiesis. The line represents the average signal and the shading represents the range of the signal times the square root of 2 between biological replicates (n = 2). (g) Peak-to-gene links (n = 91,601) enrichment in GTEx eQTLs over a permuted background distance-matched set (permutations = 250) for the union set of peak-to-gene links. The mean enrichment is shown, and the error bars indicate 1 standard deviation. Box-whisker plot; lower whisker is the lowest value greater than the 25% quantile minus 1.5 times the interquartile range (IQR), the lower hinge is the 25% quantile, the middle is the median, the upper hinge is the 75% quantile and the upper whisker is the largest value less than the 75% quantile plus 1.5 times the IQR.
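
The quality filter in panel (b) can be sketched as a Pearson correlation followed by a threshold. Only the cutoff (R > 0.45) comes from the legend. The sketch assumes each scATAC-seq and scRNA-seq cell is represented by a numeric vector in a shared space and that nearest-neighbor pairs have already been found; the function names and data layout are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        # Correlation is undefined for a constant vector; treating it as 0
        # (so the pair is filtered out) is a choice made for this sketch.
        return 0.0
    return cov / (sd_x * sd_y)


def high_quality_mappings(atac_profiles, rna_profiles, pairs, cutoff=0.45):
    """Keep nearest-neighbor pairs whose correlation exceeds the cutoff.

    atac_profiles / rna_profiles: dicts mapping cell IDs to vectors.
    pairs: iterable of (atac_id, rna_id) nearest-neighbor pairs.
    Returns (atac_id, rna_id, r) tuples for the retained pairs.
    """
    kept = []
    for atac_id, rna_id in pairs:
        r = pearson(atac_profiles[atac_id], rna_profiles[rna_id])
        if r > cutoff:
            kept.append((atac_id, rna_id, r))
    return kept
```

The same `pearson` helper applies to any pair of matched vectors, for example peak accessibility against gene expression across matched cell aggregates.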

Supplementary Figure 13 Peak-to-gene links nominate putative regulatory regions that nominate key leukemic genes.

(a-d) Multi-omic differential track; (Middle) Aggregated scATAC tracks showing MPAL disease subpopulations (red) and closest normal cells (gray). (Right) Distribution of log2 normalized expression of gene of interest for MPAL disease subpopulations (red) and closest normal cells (gray); black line represents the mean and asterisks denote significance (LFC > 0.5 and FDR < 0.01 from Figure 2c). Violin plot represents the smoothed density of the distribution of the log2 normalized expression and the black line represents the mean log2 normalized expression. (Bottom) Peak-to-gene links for gene of interest colored by Pearson correlation of the peak accessibility and gene expression (see Methods). (a) Multi-omic differential track for IL1RAP (n = 89 - 500). (b) Multi-omic differential track for CD96 (n = 89 - 500). (c) Multi-omic differential track for FLT3 (n = 89 - 500). (d) Multi-omic differential track for MCL1 (n = 89 - 500).

Supplementary Figure 14 Analysis workflows for processing of scRNA-seq and scATAC-seq data.

(Top) scRNA-seq analysis workflow. Briefly, cells are aligned using 10x Cell Ranger, quality filtered, and clustered using a feature optimization approach (see Methods). (Bottom) scATAC-seq analysis workflow. Briefly, cells are aligned using 10x Cell Ranger ATAC, quality filtered and clustered in large windows genome-wide; peaks are then called on clusters, a counts matrix is created and cells are clustered using a feature optimization approach (see Methods).

Supplementary information

Supplementary Materials

Supplementary Figs. 1–14

Reporting Summary

Supplementary Table 1

Healthy donor information: sex and age range. Information from patients with MPAL: WHO diagnosis, age, sex, blast percentage, white blood cell count, cytogenetics, previous treatment. Hematopoiesis cluster biological classification labels are also included.

Supplementary Table 2

Antibodies used in flow cytometry of MPALs.

Supplementary Table 3

CITE-seq antibody list and barcodes. Antibody information for hematopoietic and MPAL samples, as well as barcodes used for sequencing ADT libraries.

Supplementary Table 4

Differential analyses for MPAL and AMLs. Differential genes and peaks for MPAL differential RNA-seq k-means, MPAL differential ATAC-seq k-means, AML differential RNA-seq k-means and an MPAL versus AML comparison.

Supplementary Table 5

Motif enrichment and linkage to target genes. MPAL differential ATAC-seq k-means enrichment for CIS-BP motifs shown in Fig. 3a. All motifs, significant peak-to-gene links and RUNX1-target genes are shown.

About this article

Cite this article

Granja, J.M., Klemm, S., McGinnis, L.M. et al. Single-cell multiomic analysis identifies regulatory programs in mixed-phenotype acute leukemia. Nat Biotechnol 37, 1458–1465 (2019) doi:10.1038/s41587-019-0332-7
