Abstract
While several tools have been developed to map axes of variation among individual cells, no analogous approaches exist for identifying axes of variation among multicellular biospecimens profiled at single-cell resolution. For this purpose, we developed ‘phenotypic earth mover’s distance’ (PhEMD). PhEMD is a general method for embedding a ‘manifold of manifolds’, in which each datapoint in the higher-level manifold (of biospecimens) represents a collection of points that span a lower-level manifold (of cells). We apply PhEMD to a newly generated drug-screen dataset and demonstrate that PhEMD uncovers axes of cell subpopulational variation among a large set of perturbation conditions. Moreover, we show that PhEMD can be used to infer the phenotypes of biospecimens not directly profiled. Applied to clinical datasets, PhEMD generates a map of the patient-state space that highlights sources of patient-to-patient variation. PhEMD is scalable, compatible with leading batch-effect correction techniques and generalizable to multiple experimental designs.
Data availability
The mass cytometry data that support the findings of this study are available at https://community.cytobank.org/cytobank/projects/1296. Source data for Figs. 3–6 are provided with the paper. Any additional data supporting the findings of this study are available from the corresponding author upon request.
Code availability
PhEMD takes as input a list of \(N\) matrices representing \(N\) single-cell specimens. An R implementation of PhEMD is publicly available as a Bioconductor package (package name: ‘phemd’) and can alternatively be downloaded from https://github.com/wschen/phemd. The cell-state space for all analyses presented in this manuscript was modeled using the PHATE method (ref. 8). Alternative approaches are viable, however, and the R package supports PHATE, Monocle2 (ref. 41) and Louvain community detection (as implemented in the Seurat software package; ref. 16) for this purpose.
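To make the input format concrete, the following Python sketch takes a list of matrices (one per specimen, cells by markers), summarizes each specimen as relative frequencies over cell states, and computes pairwise earth mover's distances between specimens. It is an illustration only and not the phemd package's interface: the function names, the toy centroids and the toy data are hypothetical, and it assumes the cell states are already defined and lie along a single unbranched trajectory with unit spacing, in which case the earth mover's distance reduces to a sum of absolute differences between cumulative frequencies. The published analyses instead model the cell-state space with PHATE.

```python
def state_frequencies(specimen, centroids):
    """Fraction of a specimen's cells assigned to each cell state.

    specimen  -- one matrix, given as a list of rows (cells x markers)
    centroids -- one marker vector per cell state (hypothetical, assumed given)
    """
    counts = [0] * len(centroids)
    for cell in specimen:
        # Assign the cell to its nearest centroid (squared Euclidean distance).
        dists = [sum((x - c) ** 2 for x, c in zip(cell, centroid))
                 for centroid in centroids]
        counts[dists.index(min(dists))] += 1
    return [count / len(specimen) for count in counts]


def emd_ordered_states(p, q, spacing=1.0):
    """Earth mover's distance between two frequency vectors.

    Assumes the states are ordered along one line with equal spacing,
    so the mass carried across each boundary is the running difference
    of the two cumulative distributions.
    """
    carried = 0.0
    total = 0.0
    for p_i, q_i in zip(p, q):
        carried += p_i - q_i
        # After the last state the carried mass is zero for normalized inputs.
        total += abs(carried) * spacing
    return total


def specimen_distance_matrix(specimens, centroids):
    """Pairwise distances between all specimens in the input list."""
    freqs = [state_frequencies(s, centroids) for s in specimens]
    return [[emd_ordered_states(a, b) for b in freqs] for a in freqs]


# Toy input: three specimens, two markers, three ordered cell states.
centroids = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
specimens = [
    [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [0.1, 0.1]],  # all cells in state 0
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [2.1, 1.9]],  # spread over all states
    [[2.0, 2.0], [1.9, 2.1], [2.0, 1.9], [2.1, 2.0]],  # all cells in state 2
]
distances = specimen_distance_matrix(specimens, centroids)
for row in distances:
    print(row)
```

A pairwise distance matrix of this kind is what a downstream embedding step would take as input. For a branched cell-state space, the ground distance between states would have to come from the fitted trajectory model rather than from the unit spacing assumed here.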
References
Bodenmiller, B. et al. Multiplexed mass cytometry profiling of cellular states perturbed by small-molecule regulators. Nat. Biotechnol. 30, 858–867 (2012).
Tirosh, I. et al. Dissecting the multicellular ecosystem of metastatic melanoma by single-cell RNA-seq. Science 352, 189–196 (2016).
Chevrier, S. et al. An immune atlas of clear cell renal cell carcinoma. Cell 169, 736–749.e18 (2017).
Lavin, Y. et al. Innate immune landscape in early lung adenocarcinoma by paired single-cell analyses. Cell 169, 750–765.e17 (2017).
Ribas, A. et al. PD-1 blockade expands intratumoral memory T cells. Cancer Immunol. Res. 4, 194–203 (2016).
Behbehani, G. K. et al. Mass cytometric functional profiling of acute myeloid leukemia defines cell-cycle and immunophenotypic properties that correlate with known responses to therapy. Cancer Discov. 5, 988–1003 (2015).
Gasperini, M. et al. A genome-wide framework for mapping gene regulation via cellular genetic screens. Cell 176, 377–390.e19 (2019).
Moon, K. R. et al. Visualizing transitions and structure for high-dimensional data exploration. Nat. Biotechnol. 37, 1482–1492 (2019).
Angerer, P. et al. destiny: diffusion maps for large-scale single-cell data in R. Bioinformatics 32, 1241–1243 (2016).
Kalluri, R. & Weinberg, R. A. The basics of epithelial-mesenchymal transition. J. Clin. Invest. 119, 1420–1428 (2009).
Rubner, Y., Tomasi, C. & Guibas, L. J. The earth mover’s distance as a metric for image retrieval. Int. J. Comput. Vis. 40, 99–121 (2000).
Coifman, R. R. & Lafon, S. Diffusion maps. Appl. Comput. Harmon. Anal. 21, 5–30 (2006).
Zappia, L., Phipson, B. & Oshlack, A. Splatter: simulation of single-cell RNA sequencing data. Genome Biol. 18, 174 (2017).
Alpert, A., Moore, L. S., Dubovik, T. & Shen-Orr, S. S. Alignment of single-cell trajectories to compare cellular expression dynamics. Nat. Methods 15, 267–270 (2018).
Liu, Q. et al. Quantitative assessment of cell population diversity in single-cell landscapes. PLoS Biol. 16, e2006687 (2018).
Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
Mani, S. A. et al. The epithelial-mesenchymal transition generates cells with properties of stem cells. Cell 133, 704–715 (2008).
Zhu, H. et al. The role of the hyaluronan receptor CD44 in mesenchymal stem cell migration in the extracellular matrix. Stem Cells 24, 928–935 (2006).
Ramos, T. L. et al. MSC surface markers (CD44, CD73, and CD90) can identify human MSC-derived extracellular vesicles by conventional flow cytometry. Cell Commun. Signal. 14, 2 (2016).
Ivaska, J., Pallari, H.-M., Nevo, J. & Eriksson, J. E. Novel functions of vimentin in cell adhesion, migration, and signaling. Exp. Cell Res. 313, 2050–2062 (2007).
Li, W. et al. Unraveling the roles of CD44/CD24 and ALDH1 as cancer stem cell markers in tumorigenesis and metastasis. Sci. Rep. 7, 13856 (2017).
Ma, F. et al. Enriched CD44(+)/CD24(-) population drives the aggressive phenotypes presented in triple-negative breast cancer (TNBC). Cancer Lett. 353, 153–159 (2014).
Ricardo, S. et al. Breast cancer stem cell markers CD44, CD24 and ALDH1: expression distribution within intrinsic molecular subtype. J. Clin. Pathol. 64, 937–946 (2011).
Yu, M. et al. Circulating breast tumor cells exhibit dynamic changes in epithelial and mesenchymal composition. Science 339, 580–584 (2013).
Nieto, M., Huang, R.-J., Jackson, R. & Thiery, J. EMT: 2016. Cell 166, 21–45 (2016).
Jolly, M. K. et al. Implications of the hybrid epithelial/mesenchymal phenotype in metastasis. Front. Oncol. 5, 155 (2015).
Elkabets, M. et al. mTORC1 inhibition is required for sensitivity to PI3K p110α inhibitors in PIK3CA-mutant breast cancer. Sci. Transl. Med. 5, 196ra99 (2013).
Salhov, M., Bermanis, A., Wolf, G. & Averbuch, A. Approximately-isometric diffusion maps. Appl. Comput. Harmon. Anal. 38, 399–419 (2015).
Klaeger, S. et al. The target landscape of clinical kinase drugs. Science 358, eaan4368 (2017).
Bengio, Y. et al. Out-of-sample extensions for LLE, Isomap, MDS, Eigenmaps, and spectral clustering. In Proc. 16th International Conference on Neural Information Processing Systems, NIPS 2003, 177–184 (MIT Press, 2003).
Fowlkes, C., Belongie, S., Chung, F. & Malik, J. Spectral grouping using the Nyström method. IEEE Trans. Pattern Anal. Mach. Intell. 26, 214–225 (2004).
Williams, C.K.I. & Seeger, M. in Advances in Neural Information Processing Systems Vol. 13 (eds Leen, T. K. et al.) 682–688 (MIT Press, 2001).
Bendall, S. C. et al. Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development. Cell 157, 714–725 (2014).
Moon, K. R. et al. Manifold learning-based methods for analyzing single-cell RNA-sequencing data. Curr. Opin. Syst. Biol. 7, 36–46 (2018).
Damond, N. et al. A map of human type 1 diabetes progression by imaging mass cytometry. Cell Metab. 29, 755–768.e5 (2019).
Hammers, H. J. et al. Safety and efficacy of nivolumab in combination with ipilimumab in metastatic renal cell carcinoma: the CheckMate 016 study. J. Clin. Oncol. 35, 3851–3858 (2017).
Motzer, R. J. et al. Nivolumab plus ipilimumab versus sunitinib in advanced renal-cell carcinoma. New Engl. J. Med. 378, 1277–1290 (2018).
Levine, J. et al. Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell 162, 184–197 (2015).
Setty, M. et al. Wishbone identifies bifurcating developmental trajectories from single-cell data. Nat. Biotechnol. 34, 637–645 (2016).
Haghverdi, L., Büttner, M., Wolf, F. A., Buettner, F. & Theis, F. J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).
Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).
Sachs, K. et al. Causal protein-signaling networks derived from multiparameter single-cell data. Science 308, 523–529 (2005).
Krishnaswamy, S. et al. Conditional density-based analysis of T cell signaling in single-cell data. Science 346, 1250689 (2014).
Liu, L. L. et al. Critical role of CD2 co-stimulation in adaptive natural killer cell responses revealed in NKG2C-deficient humans. Cell Rep. 15, 1088–1099 (2016).
Wang, F. & Guibas, L. in Computer Vision—ECCV 2012 Vol. 7572 (eds Fitzgibbon, A. et al.) 442–455 (Springer, 2012).
Zhao, Q., Yang, Z. & Tao, H. Differential earth mover’s distance with its applications to visual tracking. IEEE Trans. Pattern Anal. Mach. Intell. 32, 274–287 (2010).
Typke, R., Wiering, F. & Veltkamp, R. C. Transportation distances and human perception of melodic similarity. Musicae Scientiae 11, 153–181 (2007).
Orlova, D. Y. et al. Earth mover’s distance (EMD): a true metric for comparing biomarker expression levels in cell populations. PLoS ONE 11, e0151859 (2016).
Courty, N., Flamary, R. & Ducoffe, M. Learning Wasserstein embeddings. Preprint at https://arxiv.org/pdf/1710.07457.pdf (2017).
Waldmeier, L., Meyer-Schaller, N., Diepenbruck, M. & Christofori, G. Py2T murine breast cancer cells, a versatile model of TGFβ-induced EMT in vitro and in vivo. PLoS ONE 7, e48651 (2012).
Zunder, E. R. et al. Palladium-based mass tag cell barcoding with a doublet-filtering scheme and single-cell deconvolution algorithm. Nat. Protoc. 10, 316–333 (2015).
Zivanovic, N., Jacobs, A. & Bodenmiller, B. in High-Dimensional Single Cell Analysis Vol. 377 (eds Fienberg, H. G. & Nolan, G. P.) 95–109 (Springer, 2013).
Ornatsky, O. et al. Highly multiparametric analysis by mass cytometry. J. Immunol. Methods 361, 1–20 (2010).
Finck, R. et al. Normalization of mass cytometry data with bead standards. Cytometry Part A 83A, 483–494 (2013).
Levina, E. & Bickel, P.J. in Advances in Neural Information Processing Systems Vol. 17 (eds Saul, L. K. et al.) 777–784 (MIT Press, 2005).
Hino, H. Ider: intrinsic dimension estimation with R. R J. 9, 329–341 (2017).
van Dijk, D. et al. Recovering gene interactions from single-cell data using data diffusion. Cell 174, 716–729.e27 (2018).
Acknowledgements
We thank the Krishnaswamy and Bodenmiller laboratories for fruitful discussions. This study was supported in part by the Chan–Zuckerberg Initiative Seed Networks for the Human Cell Atlas (S.K.), a Swiss National Science Foundation (SNSF) R’Equip grant (B.B.), a SNSF Assistant Professorship grant no. PP00P3-144874 (B.B.), the SystemsX Transfer Project ‘Friends and Foes’ (B.B.), the SystemsX grants Metastasix and PhosphoNEtX (B.B.), the European Research Council (ERC) under the European Union’s Seventh Framework Program (no. FP/2007-2013)/ERC Grant Agreement no. 336921 (B.B.), the CRUK IMAXT Grand Challenge (B.B.) and the following National Institutes of Health (NIH) grants: nos. R01GM135929 (S.K. and G.W.), UC4 DK108132 (B.B.) and NIH–NIDDK T35DK104689 (W.S.C.).
Author information
Contributions
W.S.C., N.Z., G.W., B.B. and S.K. conceived the study. W.S.C. and S.K. developed the PhEMD algorithm. W.S.C. wrote the software implementation. W.S.C. and D.v.D. performed all computational analyses. N.Z. performed all single-cell profiling experiments and data quality assessments. W.S.C., N.Z., B.B. and S.K. interpreted the results and drafted the manuscript.
Ethics declarations
Competing interests
S.K. is on the scientific advisory board of AI Therapeutics.
Additional information
Peer review information Rita Strack was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–8 and Notes 1–7.
Supplementary Table 1
List of inhibitors included in EMT drug-screen experiment
Supplementary Table 2
List of antibodies included in EMT drug-screen experiment
Supplementary Table 3
Clusters of inhibitors with similar effects in multiple-batch EMT drug-screen experiment
Supplementary Table 4
Cell yield of each experimental condition in EMT drug-screen experiment
Supplementary Table 5
Subgroups of inhibitors with similar effects in single-batch EMT drug-screen experiment
Supplementary Table 6
Subgroups of biospecimens with similar single-cell profiles in melanoma scRNA-seq experiment
Supplementary Table 7
Subgroups of biospecimens with similar single-cell profiles in ccRCC mass cytometry experiment
About this article
Cite this article
Chen, W.S., Zivanovic, N., van Dijk, D. et al. Uncovering axes of variation among single-cell cancer specimens. Nat Methods 17, 302–310 (2020). https://doi.org/10.1038/s41592-019-0689-z
DOI: https://doi.org/10.1038/s41592-019-0689-z
This article is cited by
- Next-Generation Morphometry for pathomics-data mining in histopathology. Nature Communications (2023)
- Unraveling non-genetic heterogeneity in cancer with dynamical models and computational tools. Nature Computational Science (2023)
- Control of cell state transitions. Nature (2022)
- Global absence and targeting of protective immune states in severe COVID-19. Nature (2021)
- Context specificity of the EMT transcriptional response. Nature Communications (2020)