Revealing the vectors of cellular identity with single-cell genomics

Abstract

Single-cell genomics has now made it possible to create a comprehensive atlas of human cells. At the same time, it has reopened definitions of a cell's identity and of the ways in which identity is regulated by the cell's molecular circuitry. Emerging computational analysis methods, especially in single-cell RNA sequencing (scRNA-seq), have already begun to reveal, in a data-driven way, the diverse simultaneous facets of a cell's identity, from discrete cell types to continuous dynamic transitions and spatial locations. These developments will eventually allow a cell to be represented as a superposition of 'basis vectors', each determining a different (but possibly dependent) aspect of cellular organization and function. However, computational methods must also overcome considerable challenges—from handling technical noise and data scale to forming new abstractions of biology. As the scale of single-cell experiments continues to increase, new computational approaches will be essential for constructing and characterizing a reference map of cell identities.

Figure 1: Diverse factors combine to create a cell's unique identity, and computational methods reveal them.
Figure 2: Biological and technical factors combine to determine the measured genomic profiles of single cells; computational methods remove technical effects and tease apart facets of the biological variation.
Figure 3: Technical confounders of single-cell RNA-seq and computational methods to handle them.


  197. 197

    Shipony, Z. et al. Dynamic and static maintenance of epigenetic memory in pluripotent and somatic cells. Nature 513, 115–119 (2014).

  198. 198

    Schwartzman, O. & Tanay, A. Single-cell epigenomics: techniques and emerging applications. Nat. Rev. Genet. 16, 716–726 (2015).

  199. 199

    Cadwell, C.R. et al. Electrophysiological, transcriptomic and morphologic profiling of single neurons using Patch-seq. Nat. Biotechnol. 34, 199–203 (2016).

  200. 200

    Xin, R.S. et al. Shark: SQL and rich analytics at scale. in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data 13–24 (ACM, 2013).

  201. 201

    LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  202. 202

    Navin, N.E. The first five years of single-cell cancer genomics and beyond. Genome Res. 25, 1499–1507 (2015).

  203. 203

    Gawad, C., Koh, W. & Quake, S.R. Dissecting the clonal origins of childhood acute lymphoblastic leukemia by single-cell genomics. Proc. Natl. Acad. Sci. USA 111, 17947–17952 (2014).

  204. 204

    Potter, N.E. et al. Single-cell mutational profiling and clonal phylogeny in cancer. Genome Res. 23, 2115–2125 (2013).

  205. 205

    Meyer, M. et al. Single cell-derived clonal analysis of human glioblastoma links functional and genomic heterogeneity. Proc. Natl. Acad. Sci. USA 112, 851–856 (2015).

  206. 206

    Biesecker, L.G. & Spinner, N.B. A genomic view of mosaicism and human disease. Nat. Rev. Genet. 14, 307–320 (2013).

  207. 207

    Cai, X. et al. Single-cell, genome-wide sequencing identifies clonal somatic copy-number variation in the human brain. Cell Reports 8, 1280–1289 (2014).

  208. 208

    Evrony, G.D. et al. Single-neuron sequencing analysis of L1 retrotransposition and somatic mutation in the human brain. Cell 151, 483–496 (2012).

  209. 209

    McConnell, M.J. et al. Mosaic copy number variation in human neurons. Science 342, 632–637 (2013).

  210. 210

    Gole, J. et al. Massively parallel polymerase cloning and genome sequencing of single cells using nanoliter microwells. Nat. Biotechnol. 31, 1126–1132 (2013).

  211. 211

    Knouse, K.A., Wu, J., Whittaker, C.A. & Amon, A. Single cell sequencing reveals low levels of aneuploidy across mammalian tissues. Proc. Natl. Acad. Sci. USA 111, 13409–13414 (2014).

  212. 212

    Zhang, C.-Z. et al. Calibrating genomic and allelic coverage bias in single-cell sequencing. Nat. Commun. 6, 6822 (2015).

  213. 213

    Kim, K.I. & Simon, R. Using single cell sequencing data to model the evolutionary history of a tumor. BMC Bioinformatics 15, 27 (2014).

  214. 214

    Suzuki, A. et al. Single-cell analysis of lung adenocarcinoma cell lines reveals diverse expression patterns of individual cells invoked by a molecular target drug treatment. Genome Biol. 16, 66 (2015).

  215. 215

    Weirather, J.L. et al. Characterization of fusion genes and the significantly expressed fusion isoforms in breast cancer by hybrid sequencing. Nucleic Acids Res. 43, e116–e116 (2015).

  216. 216

    Afik, S. et al. Targeted reconstruction of T cell receptor sequence from single cell RNA-sequencing links CDR3 length to T cell differentiation state. bioRxiv (2016).

  217. 217

    Stubbington, M.J.T. et al. T cell fate and clonality inference from single-cell transcriptomes. Nat. Methods 13, 329–332 (2016).

  218. 218

    Tirosh, I. et al. Single-cell RNA-seq supports a developmental hierarchy in IDH-mutant oligodendroglioma. Nature (in the press) (2016).

  219. 219

    Dey, S.S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Integrated genome and transcriptome sequencing of the same cell. Nat. Biotechnol. 33, 285–289 (2015).

  220. 220

    Macaulay, I.C. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat. Methods 12, 519–522 (2015).

  221. 221

    Angermueller, C. et al. Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat. Methods 13, 229–232 (2016).

  222. 222

    Frei, A.P. et al. Highly multiplexed simultaneous detection of RNAs and proteins in single cells. Nat. Methods 13, 269–275 (2016).

  223. 223

    Albayrak, C. et al. Digital quantification of proteins and mRNA in single mammalian cells. Mol. Cell 61, 914–924 (2016).

  224. 224

    Darmanis, S. et al. Simultaneous multiplexed measurement of RNA and proteins in single cells. Cell Reports 14, 380–389 (2016).

  225. 225

    Albert, F.W. & Kruglyak, L. The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16, 197–212 (2015).

  226. 226

    Risso, D. et al. Power gain: how normalization affects reproducibility and biological insight of RNA-seq studies in neuroscience [v1; not peer reviewed]. F1000Research ISCB Comm J. 4, 411 (2015).

Acknowledgements

We thank E. Lander, A.K. Shalek, R.B. Fletcher, O. Ram, and D. Stafford for helpful discussions, and L. Gaffney and A. Hupalowska for artwork. A.W. and N.Y. were supported in part by the BRAIN Initiative grant U01 MH105979 from the US National Institute of Mental Health. A.R. is an Investigator of the Howard Hughes Medical Institute and was supported by the Klarman Cell Observatory at the Broad Institute, NIH grant P50 HG006193, Koch Institute Support (core) grant P30-CA14051 from the National Cancer Institute, NIH BRAIN grant 1U01MH105960-01, NCI grant 1U24CA180922, and NIAID grant 1U24AI118672-01.

Author information

Correspondence to Aviv Regev.

Ethics declarations

Competing interests

A.R. is a member of the Scientific Advisory Board for Thermo Fisher Scientific and Syros Pharmaceuticals and a consultant for Driver Group. A.W. and N.Y. declare no competing financial interests.

Cite this article

Wagner, A., Regev, A. & Yosef, N. Revealing the vectors of cellular identity with single-cell genomics. Nat Biotechnol 34, 1145–1160 (2016) doi:10.1038/nbt.3711
