Article

Single-cell topological RNA-seq analysis reveals insights into cellular differentiation and development

Nature Biotechnology volume 35, pages 551–560 (2017)

Abstract

Transcriptional programs control cellular lineage commitment and differentiation during development. Understanding of cell fate has been advanced by studying single-cell RNA-sequencing (RNA-seq) but is limited by the assumptions of current analytic methods regarding the structure of data. We present single-cell topological data analysis (scTDA), an algorithm for topology-based computational analyses to study temporal, unbiased transcriptional regulation. Unlike other methods, scTDA is a nonlinear, model-independent, unsupervised statistical framework that can characterize transient cellular states. We applied scTDA to the analysis of murine embryonic stem cell (mESC) differentiation in vitro in response to inducers of motor neuron differentiation. scTDA resolved asynchrony and continuity in cellular identity over time and identified four transient states (pluripotent, precursor, progenitor, and fully differentiated cells) based on changes in stage-dependent combinations of transcription factors, RNA-binding proteins, and long noncoding RNAs (lncRNAs). scTDA can be applied to study asynchronous cellular responses to either developmental cues or environmental perturbations.

Accessions

Primary accessions

ArrayExpress

Gene Expression Omnibus

References

  1. Neuronal specification in the spinal cord: inductive signals and transcriptional codes. Nat. Rev. Genet. 1, 20–29 (2000).

  2. Directed differentiation of embryonic stem cells into motor neurons. Cell 110, 385–397 (2002).

  3. Modeling ALS with motor neurons derived from human induced pluripotent stem cells. Nat. Neurosci. 19, 542–553 (2016).

  4. Intricate interplay between astrocytes and motor neurons in ALS. Proc. Natl. Acad. Sci. USA 110, E756–E765 (2013).

  5. Engineering the embryoid body microenvironment to direct embryonic stem cell differentiation. Biotechnol. Prog. 25, 43–51 (2009).

  6. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).

  7. Wishbone identifies bifurcating developmental trajectories from single-cell data. Nat. Biotechnol. 34, 637–645 (2016).

  8. SLICER: inferring branched, nonlinear cellular trajectories from single cell RNA-seq data. Genome Biol. 17, 106 (2016).

  9. destiny: diffusion maps for large-scale single-cell data in R. Bioinformatics 32, 1241–1243 (2016).

  10. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014).

  11. Bifurcation analysis of single-cell gene expression data reveals epigenetic landscape. Proc. Natl. Acad. Sci. USA 111, E5643–E5650 (2014).

  12. Topology of viral evolution. Proc. Natl. Acad. Sci. USA 110, 18566–18571 (2013).

  13. Inference of Ancestral Recombination Graphs through Topological Data Analysis. PLoS Comput. Biol. 12, e1005071 (2016).

  14. Topological data analysis generates high-resolution, genome-wide maps of human recombination. Cell Syst. 3, 83–94 (2016).

  15. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. Proc. Natl. Acad. Sci. USA 108, 7265–7270 (2011).

  16. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Sci. Transl. Med. 7, 311ra174 (2015).

  17. Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development. Cell 157, 714–725 (2014).

  18. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).

  19. in SPBG 91–100 (Citeseer, 2007).

  20. CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012).

  21. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 16, 278 (2015).

  22. The contribution of cell cycle to heterogeneity in single-cell RNA-seq data. Nat. Biotechnol. 34, 591–593 (2016).

  23. Large-scale gene function analysis with the PANTHER classification system. Nat. Protoc. 8, 1551–1566 (2013).

  24. Gene expression regulation by retinoic acid. J. Lipid Res. 43, 1773–1808 (2002).

  25. Retinoic acid signalling during development. Development 139, 843–858 (2012).

  26. Temporal colinearity in expression of anterior Hox genes in developing chick embryos. Dev. Dyn. 207, 270–280 (1996).

  27. Long intergenic non-coding RNA HOTAIRM1 regulates cell cycle progression during myeloid maturation in NB4 human promyelocytic leukemia cells. RNA Biol. 11, 777–787 (2014).

  28. RNA-Seq of human neurons derived from iPS cells reveals candidate long non-coding RNAs involved in neurogenesis and neuropsychiatric disorders. PLoS One 6, e23356 (2011).

  29. The regulation of Hox gene expression during animal development. Development 140, 3951–3963 (2013).

  30. Long noncoding RNAs in mouse embryonic stem cell pluripotency and differentiation. Genome Res. 18, 1433–1445 (2008).

  31. neurogenins, a novel family of atonal-related bHLH transcription factors, are putative mammalian neuronal determination genes that reveal progenitor cell heterogeneity in the developing CNS and PNS. Mol. Cell. Neurosci. 8, 221–241 (1996).

  32. RNA protein interaction in neurons. Annu. Rev. Neurosci. 36, 243–270 (2013).

  33. Essential roles for the splicing regulator nSR100/SRRM4 during nervous system development. Genes Dev. 29, 746–759 (2015).

  34. Regulation of vertebrate nervous system alternative splicing and development by an SR-related protein. Cell 138, 898–910 (2009).

  35. Reconstructing lineage hierarchies of the distal lung epithelium using single-cell RNA-seq. Nature 509, 371–375 (2014).

  36. Single-cell RNA-Seq reveals lineage and X chromosome dynamics in human preimplantation embryos. Cell 165, 1012–1026 (2016).

  37. Sequential transcriptional waves direct the differentiation of newborn neurons in the mouse neocortex. Science 351, 1443–1446 (2016).

  38. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009).

  39. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).

  40. Computational and analytical challenges in single-cell transcriptomics. Nat. Rev. Genet. 16, 133–145 (2015).

  41. Validation of noise models for single-cell transcriptomics. Nat. Methods 11, 637–640 (2014).

  42. Bayesian approach to single-cell differential expression analysis. Nat. Methods 11, 740–742 (2014).

  43. Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature 510, 363–369 (2014).

  44. Accounting for technical noise in single-cell RNA-seq experiments. Nat. Methods 10, 1093–1095 (2013).

  45. Topological persistence and simplification. Discrete Comput. Geom. 28, 511–533 (2002).

  46. Computing persistent homology. Discrete Comput. Geom. 33, 249–274 (2005).

  47. QuickGO: a web-based tool for Gene Ontology searching. Bioinformatics 25, 3045–3046 (2009).

  48. UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 43, D204–D212 (2015).

  49. NONCODE 2016: an informative and valuable data source of long non-coding RNAs. Nucleic Acids Res. 44, D203–D208 (2016).


Acknowledgements

We thank T. Jessell, N. Francis, and H. Phatnani for critical reading of the manuscript. A.H.R. and T.M. thank the New York Genome Center and D. Goldstein for sequencing support, S. Morton for providing Engrailed antibody, and P. Sims for experimental discussions. P.G.C. and R.R. thank A.J. Levine, G. Carlsson, F. Abate, I. Filip, S. Zairis, U. Rubin, and P. van Nieuwenhuizen for useful comments and discussions, O.T. Elliott for technical support with the online database, and Ayasdi Inc. for technical support. The work of P.G.C. and R.R. is supported by the NIH grants U54-CA193313-01 and R01GM117591. The work of A.H.R., E.K.K., T.J.R. and T.M. is supported by ALS Therapy Alliance grant ATA-2013-F-056 and NIH grant NS088992.

Author information

Author notes

    Abbas H Rizvi & Pablo G Camara

    These authors contributed equally to this work.

Affiliations

  1. Department of Biochemistry and Molecular Biophysics, Columbia University Medical Center, New York, New York, USA.

    Abbas H Rizvi, Elena K Kandror, Thomas J Roberts & Tom Maniatis

  2. The Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA.

    Abbas H Rizvi, Elena K Kandror, Thomas J Roberts, Ira Schieren & Tom Maniatis

  3. Department of Systems Biology, Columbia University Medical Center, New York, New York, USA.

    Pablo G Camara & Raul Rabadan

  4. Department of Biomedical Informatics, Columbia University Medical Center, New York, New York, USA.

    Pablo G Camara, Thomas J Roberts & Raul Rabadan

  5. Howard Hughes Medical Institute, Columbia University Medical Center, New York, New York, USA.

    Ira Schieren

Authors

  1. Abbas H Rizvi

  2. Pablo G Camara

  3. Elena K Kandror

  4. Thomas J Roberts

  5. Ira Schieren

  6. Tom Maniatis

  7. Raul Rabadan

Contributions

P.G.C. and R.R. developed the topology-based computational approach (scTDA) and applied it to single cell RNA sequencing data. A.H.R., E.K.K., and T.M. designed all experiments. A.H.R., E.K.K., and T.J.R. conducted experiments. I.S. conducted all flow cytometry. A.H.R., P.G.C., E.K.K., T.M., and R.R. analyzed the data and wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Tom Maniatis or Raul Rabadan.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–20, Supplementary Table 1, and Supplementary Notes 1 and 2

Excel files

  1. Supplementary Tables 2–4

    All genes; Ontology; lncRNAs.

  2. Supplementary Table 5

    All genes characterization of the expression profile in the topological representation of 80 embryonic (E18.5) mouse lung epithelial cells (ref. 35).

  3. Supplementary Table 6

    Characterization of the expression profile in the topological representation of 1,529 individual cells from 88 human preimplantation embryos.

  4. Supplementary Table 7

    All genes characterization of the expression profile in the topological representation of 272 newborn neurons from the mouse neocortex.

  5. Supplementary Table 8

    Barcoded reverse transcription primers utilized in motor neuron differentiation experiment 2.

Text files

  1. Supplementary Code

    Python code for single-cell topological data analysis (scTDA). Also available at https://github.com/RabadanLab/scTDA.

About this article

DOI

https://doi.org/10.1038/nbt.3854
