Abstract

Advances in single-cell technologies have enabled high-resolution dissection of tissue composition. Several tools for dimensionality reduction are available to analyze the large number of parameters generated in single-cell studies. Recently, a nonlinear dimensionality-reduction technique, uniform manifold approximation and projection (UMAP), was developed for the analysis of any type of high-dimensional data. Here we apply it to biological data, using three well-characterized mass cytometry and single-cell RNA sequencing datasets. Comparing the performance of UMAP with five other tools, we find that UMAP provides the fastest run times, highest reproducibility and the most meaningful organization of cell clusters. The work highlights the use of UMAP for improved visualization and interpretation of single-cell data.

Acknowledgements

We thank members of the Singapore Immunology Network and notably members of the E.W.N. laboratory. We thank S. Li, Y. Simoni, M. Chng, Y. Cheng, J.W. Lim and M. Fehlings for their insightful feedback. This study was funded by A*STAR/SIgN core funding and A*STAR/SIgN immunomonitoring platform funding.

Author information

Affiliations

  1. Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore.

    • Etienne Becht
    • Charles-Antoine Dutertre
    • Immanuel W H Kwok
    • Lai Guan Ng
    • Florent Ginhoux
    • Evan W Newell
  2. Tutte Institute for Mathematics and Computing, Ottawa, Ontario, Canada.

    • Leland McInnes
    • John Healy
  3. Fred Hutchinson Cancer Research Center, Vaccine and Infectious Disease Division, Seattle, Washington, USA.

    • Evan W Newell

Contributions

E.B., L.M., J.H., C.-A.D., I.W.H.K. and E.W.N. analyzed data. L.G.N., F.G. and E.W.N. helped supervise the project. L.M. and J.H. developed UMAP. All authors participated in writing and revising the manuscript.

Competing interests

E.W.N. is a board director and shareholder of immunoSCAPE Pte. Ltd., which is an immune profiling service provider.

Corresponding author

Correspondence to Evan W Newell.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–7

  2. Life Sciences Reporting Summary

Excel files

  1. Supplementary Table 1

    Description of the datasets

  2. Supplementary Table 2

    Algorithms benchmarked

About this article

DOI

https://doi.org/10.1038/nbt.4314