  • Article
PhyloVelo enhances transcriptomic velocity field mapping using monotonically expressed genes

Abstract

Single-cell RNA sequencing (scRNA-seq) is a powerful approach for studying cellular differentiation, but accurately tracking cell fate transitions can be challenging, especially in disease conditions. Here we introduce PhyloVelo, a computational framework that estimates the velocity of transcriptomic dynamics by using monotonically expressed genes (MEGs) or genes with expression patterns that either increase or decrease, but do not cycle, through phylogenetic time. Through integration of scRNA-seq data with lineage information, PhyloVelo identifies MEGs and reconstructs a transcriptomic velocity field. We validate PhyloVelo using simulated data and Caenorhabditis elegans ground truth data, successfully recovering linear, bifurcated and convergent differentiations. Applying PhyloVelo to seven lineage-traced scRNA-seq datasets, generated using CRISPR–Cas9 editing, lentiviral barcoding or immune repertoire profiling, demonstrates its high accuracy and robustness in inferring complex lineage trajectories while outperforming RNA velocity. Additionally, we discovered that MEGs across tissues and organisms share similar functions in translation and ribosome biogenesis.
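As a rough illustration of what "monotonically expressed through phylogenetic time" means, the sketch below screens toy genes by the Spearman rank correlation between expression and each cell's phylogenetic distance from the root. This is a simplified stand-in and not the PhyloVelo model; the function names, the toy data and the 0.8 threshold are all assumptions made for illustration.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0

def screen_monotonic_genes(expression, depth, threshold=0.8):
    """Keep genes whose |rho| with phylogenetic depth reaches the threshold.

    expression: dict mapping gene name -> per-cell expression values
    depth: per-cell phylogenetic distance from the root (hypothetical input)
    threshold: illustrative cutoff, not a value taken from the paper
    """
    hits = {}
    for gene, values in expression.items():
        rho = spearman(values, depth)
        if abs(rho) >= threshold:
            hits[gene] = rho
    return hits

# Toy data: one increasing, one decreasing and one cycling gene.
depth = [1, 2, 3, 4, 5, 6, 7, 8]
expression = {
    "gene_up": [0, 1, 1, 2, 3, 5, 8, 9],
    "gene_down": [9, 7, 6, 6, 4, 3, 1, 0],
    "gene_cycle": [1, 5, 1, 5, 1, 5, 1, 5],
}
hits = screen_monotonic_genes(expression, depth)
```

In this toy example the increasing and decreasing genes pass the screen, while the cycling gene does not, mirroring the distinction the abstract draws between MEGs and cycling genes.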

Fig. 1: Schematic of the PhyloVelo framework.
Fig. 2: PhyloVelo recovers complex cell lineages in simulations.
Fig. 3: PhyloVelo reconstructs the embryonic differentiation trajectories of C. elegans.
Fig. 4: PhyloVelo reconstructs the cellular trajectory of mouse erythroid maturation.
Fig. 5: PhyloVelo identifies a dedifferentiation trajectory in lung tumor evolution.
Fig. 6: PhyloVelo inference with clonal lineage-tracing data and MEGs are enriched in ribosome-mediated processes.

Data availability

All data analyzed in this article are publicly available through online sources. The annotated data, lineage trees, results and Python implementation are available at https://phylovelo.readthedocs.io/. The raw data for the C. elegans dataset (ref. 14) can be accessed with GSE126954, and the lineage tree can be accessed at http://dulab.genetics.ac.cn/TF-atlas/Cell.html. The CRISPR lineage-tracing datasets from the mouse embryos (ref. 32) can be accessed with GSE117542. The scRNA-seq data of mouse brain development (ref. 48) can be accessed with PRJNA637987. The time-course scRNA-seq data of whole mouse embryos (E6.5–E8.5; ref. 19) can be accessed with E-MTAB-6967. The dataset of mouse primary lung tumors (ref. 51) can be accessed with PRJNA803321 and from Zenodo (https://zenodo.org/record/5847462#.Yt4-PewRXUI). The dataset of mouse pancreatic cancer cell line KPCY (ref. 62) can be accessed with GSE173958 and from Mendeley (https://doi.org/10.17632/t98pjcd7t6.1). The dataset of human lung cancer cell line A549 (ref. 63) can be accessed with GSE161363. The dataset of human kidney cell line HEK293T (ref. 64) can be accessed with PRJNA757179. The LARRY lentiviral barcoding dataset of hematopoiesis (ref. 37) can be accessed with GSE140802. The single-cell TCR and RNA sequencing data of T cells in BCC (ref. 57) can be accessed with GSE123813.
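For convenience, the accessions listed above can be collected in one place. The sketch below is only an illustration: the dictionary labels are paraphrased from this section, the `landing_page` helper is hypothetical, and the URL patterns are assumptions about the public landing pages of GEO, BioProject and ArrayExpress rather than anything stated in the article.

```python
# Accessions as listed in the Data availability statement.
ACCESSIONS = {
    "C. elegans embryogenesis": "GSE126954",
    "Mouse embryo CRISPR lineage tracing": "GSE117542",
    "Mouse brain development": "PRJNA637987",
    "Whole mouse embryos E6.5-E8.5": "E-MTAB-6967",
    "Mouse primary lung tumors": "PRJNA803321",
    "Mouse pancreatic cancer KPCY": "GSE173958",
    "Human lung cancer A549": "GSE161363",
    "Human kidney HEK293T": "PRJNA757179",
    "LARRY hematopoiesis": "GSE140802",
    "BCC T cells (TCR and RNA)": "GSE123813",
}

def landing_page(accession):
    """Map an accession to a repository page (URL patterns are assumed)."""
    if accession.startswith("GSE"):
        return f"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={accession}"
    if accession.startswith("PRJNA"):
        return f"https://www.ncbi.nlm.nih.gov/bioproject/{accession}"
    if accession.startswith("E-MTAB"):
        return f"https://www.ebi.ac.uk/biostudies/arrayexpress/studies/{accession}"
    raise ValueError(f"Unrecognized accession prefix: {accession}")

pages = {name: landing_page(acc) for name, acc in ACCESSIONS.items()}
```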

Code availability

PhyloVelo (ref. 86) is freely available as a Python package at https://github.com/kunwang34/PhyloVelo. Detailed workflows to reproduce the figures and results in this paper are provided as Jupyter notebooks in the repository. The annotated data, lineage trees, results and Python implementation are available at https://phylovelo.readthedocs.io/.

References

  1. Salipante, S. J. & Horwitz, M. S. Phylogenetic fate mapping. Proc. Natl Acad. Sci. USA 103, 5448–5453 (2006).

  2. Sulston, J. E., Schierenberg, E., White, J. G. & Thomson, J. N. The embryonic cell lineage of the nematode Caenorhabditis elegans. Dev. Biol. 100, 64–119 (1983).

  3. Stadler, T., Pybus, O. G. & Stumpf, M. P. Phylodynamics for cell biologists. Science 371, eaah6266 (2021).

  4. Baron, C. S. & van Oudenaarden, A. Unravelling cellular relationships during development and regeneration using genetic lineage tracing. Nat. Rev. Mol. Cell Biol. 20, 753–765 (2019).

  5. Moris, N., Pina, C. & Arias, A. M. Transition states and cell fate decisions in epigenetic landscapes. Nat. Rev. Genet. 17, 693–703 (2016).

  6. Bendall, S. C. et al. Single-cell trajectory detection uncovers progression and regulatory coordination in human B cell development. Cell 157, 714–725 (2014).

  7. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014).

  8. Haghverdi, L., Buttner, M., Wolf, F. A., Buettner, F. & Theis, F. J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).

  9. Saelens, W., Cannoodt, R., Todorov, H. & Saeys, Y. A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 37, 547–554 (2019).

  10. Gulati, G. S. et al. Single-cell transcriptional diversity is a hallmark of developmental potential. Science 367, 405–411 (2020).

  11. Weinreb, C., Wolock, S., Tusi, B. K., Socolovsky, M. & Klein, A. M. Fundamental limits on dynamic inference from single-cell snapshots. Proc. Natl Acad. Sci. USA 115, E2467–E2476 (2018).

  12. Tritschler, S. et al. Concepts and limitations for learning developmental trajectories from single cell genomics. Development 146, dev170506 (2019).

  13. Wagner, D. E. & Klein, A. M. Lineage tracing meets single-cell omics: opportunities and challenges. Nat. Rev. Genet. 21, 410–427 (2020).

  14. Packer, J. S. et al. A lineage-resolved molecular atlas of C. elegans embryogenesis at single-cell resolution. Science 365, eaax1971 (2019).

  15. Mulas, C., Chaigne, A., Smith, A. & Chalut, K. J. Cell state transitions: definitions and challenges. Development 148, dev199950 (2021).

  16. Gerber, T. et al. Single-cell analysis uncovers convergence of cell identities during axolotl limb regeneration. Science 362, eaaq0681 (2018).

  17. Liu, X. et al. Single-cell RNA-seq of the developing cardiac outflow tract reveals convergent development of the vascular smooth muscle cells. Cell Rep. 28, 1346–1361 (2019).

  18. Nowotschin, S. et al. The emergent landscape of the mouse gut endoderm at single-cell resolution. Nature 569, 361–367 (2019).

  19. Pijuan-Sala, B. et al. A single-cell molecular map of mouse gastrulation and early organogenesis. Nature 566, 490–495 (2019).

  20. Lange, M. et al. CellRank for directed single-cell fate mapping. Nat. Methods 19, 159–170 (2022).

  21. Gupta, P. B., Pastushenko, I., Skibinski, A., Blanpain, C. & Kuperwasser, C. Phenotypic plasticity: driver of cancer initiation, progression, and therapy resistance. Cell Stem Cell 24, 65–78 (2019).

  22. La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).

  23. Bergen, V., Lange, M., Peidli, S., Wolf, F. A. & Theis, F. J. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol. 38, 1408–1414 (2020).

  24. Barile, M. et al. Coordinated changes in gene expression kinetics underlie both mouse and human erythroid maturation. Genome Biol. 22, 197 (2021).

  25. Bergen, V., Soldatov, R. A., Kharchenko, P. V. & Theis, F. J. RNA velocity—current challenges and future perspectives. Mol. Syst. Biol. 17, e10282 (2021).

  26. McKenna, A. et al. Whole-organism lineage tracing by combinatorial and cumulative genome editing. Science 353, aaf7907 (2016).

  27. Frieda, K. L. et al. Synthetic recording and in situ readout of lineage information in single cells. Nature 541, 107–111 (2017).

  28. Kalhor, R. et al. Developmental barcoding of whole mouse via homing CRISPR. Science 361, eaat9804 (2018).

  29. VanHorn, S. & Morris, S. A. Next-generation lineage tracing and fate mapping to interrogate development. Dev. Cell 56, 7–21 (2021).

  30. Alemany, A., Florescu, M., Baron, C. S., Peterson-Maduro, J. & van Oudenaarden, A. Whole-organism clone tracing using single-cell sequencing. Nature 556, 108–112 (2018).

  31. Raj, B. et al. Simultaneous single-cell profiling of lineages and cell types in the vertebrate brain. Nat. Biotechnol. 36, 442–450 (2018).

  32. Chan, M. M. et al. Molecular recording of mammalian embryogenesis. Nature 570, 77–82 (2019).

  33. Kester, L. & van Oudenaarden, A. Single-cell transcriptomics meets lineage tracing. Cell Stem Cell 23, 166–179 (2018).

  34. Spanjaard, B. et al. Simultaneous lineage tracing and cell-type identification using CRISPR–Cas9-induced genetic scars. Nat. Biotechnol. 36, 469–473 (2018).

  35. Bowling, S. et al. An engineered CRISPR–Cas9 mouse line for simultaneous readout of lineage histories and gene expression profiles in single cells. Cell 181, 1693–1694 (2020).

  36. Wang, S. W., Herriges, M. J., Hurley, K., Kotton, D. N. & Klein, A. M. CoSpar identifies early cell fate biases from single-cell transcriptomic and lineage information. Nat. Biotechnol. 40, 1066–1074 (2022).

  37. Weinreb, C., Rodriguez-Fraticelli, A., Camargo, F. D. & Klein, A. M. Lineage tracing on transcriptional landscapes links state to fate during differentiation. Science 367, eaaw3381 (2020).

  38. Forrow, A. & Schiebinger, G. LineageOT is a unified framework for lineage tracing and trajectory inference. Nat. Commun. 12, 4940 (2021).

  39. Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).

  40. Butler, M. A. & King, A. A. Phylogenetic comparative analysis: a modeling approach for adaptive evolution. Am. Nat. 164, 683–695 (2004).

  41. Papadopoulos, N., Gonzalo, P. R. & Soding, J. PROSSTT: probabilistic simulation of single-cell RNA-seq data for complex differentiation processes. Bioinformatics 35, 3517–3519 (2019).

  42. Liu, K. et al. Mapping single-cell-resolution cell phylogeny reveals cell population dynamics during organ development. Nat. Methods 18, 1506–1514 (2021).

  43. Wagner, D. E. et al. Single-cell mapping of gene expression landscapes and lineage in the zebrafish embryo. Science 360, 981–987 (2018).

  44. Salipante, S. J., Kas, A., McMonagle, E. & Horwitz, M. S. Phylogenetic analysis of developmental and postnatal mouse cell lineages. Evol. Dev. 12, 84–94 (2010).

  45. Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477 (2018).

  46. Wolf, F. A. et al. PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol. 20, 59 (2019).

  47. Salvador-Martinez, I., Grillo, M., Averof, M. & Telford, M. J. Is it possible to reconstruct an accurate cell lineage using CRISPR recorders? eLife 8, e40292 (2019).

  48. La Manno, G. et al. Molecular architecture of the developing mouse brain. Nature 596, 92–96 (2021).

  49. Baron, M. H., Isern, J. & Fraser, S. T. The embryonic origins of erythropoiesis in mammals. Blood 119, 4828–4837 (2012).

  50. Qiu, X. et al. Mapping transcriptomic vector fields of single cells. Cell 185, 690–711 (2022).

  51. Yang, D. et al. Lineage tracing reveals the phylodynamics, plasticity, and paths of tumor evolution. Cell 185, 1905–1923 e1925 (2022).

  52. Marjanovic, N. D. et al. Emergence of a high-plasticity cell state during lung cancer evolution. Cancer Cell 38, 229–246 (2020).

  53. LaFave, L. M. et al. Epigenomic state transitions characterize tumor progression in mouse lung adenocarcinoma. Cancer Cell 38, 212–228 (2020).

  54. Kim, N. et al. Single-cell RNA sequencing demonstrates the molecular and cellular reprogramming of metastatic lung adenocarcinoma. Nat. Commun. 11, 2285 (2020).

  55. Biddy, B. A. et al. Single-cell mapping of lineage and identity in direct reprogramming. Nature 564, 219–224 (2018).

  56. Penter, L., Gohil, S. H. & Wu, C. J. Natural barcodes for longitudinal single cell tracking of leukemic and immune cell dynamics. Front. Immunol. 12, 788891 (2022).

  57. Yost, K. E. et al. Clonal replacement of tumor-specific T cells following PD-1 blockade. Nat. Med. 25, 1251–1259 (2019).

  58. Gu, Y., Blaauw, D. & Welch, J. D. Bayesian inference of RNA velocity from multi-lineage single-cell data. Preprint at bioRxiv https://doi.org/10.1101/2022.07.08.499381 (2022).

  59. Cui, H., Maan, H., Taylor, M. D. & Wang, B. DeepVelo: deep learning extends RNA velocity to multi-lineage systems with cell-specific kinetics. Preprint at bioRxiv https://doi.org/10.1101/2022.04.03.486877 (2022).

  60. Li, S. et al. A relay velocity model infers cell-dependent RNA velocity. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-01728-5 (2023).

  61. Gao, M., Qiao, C. & Huang, Y. UniTVelo: temporally unified RNA velocity reinforces single-cell trajectory inference. Nat. Commun. 13, 6586 (2022).

  62. Simeonov, K. P. et al. Single-cell lineage tracing of metastatic cancer reveals selection of hybrid EMT states. Cancer Cell 39, 1150–1162 e1159 (2021).

  63. Quinn, J. J. et al. Single-cell lineages reveal the rates, routes, and drivers of metastasis in cancer xenografts. Science 371, eabc1944 (2021).

  64. Choi, J. et al. A time-resolved, multi-symbol molecular recorder via sequential genome editing. Nature 608, 98–107 (2022).

  65. Qiu, Q. et al. Massively parallel and time-resolved RNA sequencing in single cells with scNT-seq. Nat. Methods 17, 991–1001 (2020).

  66. Athanasiadis, E. I. et al. Single-cell RNA-sequencing uncovers transcriptional states and fate decisions in haematopoiesis. Nat. Commun. 8, 2045 (2017).

  67. Fei, L. et al. Systematic identification of cell-fate regulatory programs using a single-cell atlas of mouse development. Nat. Genet. 54, 1051–1061 (2022).

  68. Shi, J., Teschendorff, A. E., Chen, W., Chen, L. & Li, T. Quantifying Waddington’s epigenetic landscape: a comparison of single-cell potency measures. Brief. Bioinform. 21, 248–261 (2018).

  69. Teschendorff, A. E. & Feinberg, A. P. Statistical mechanics meets single-cell biology. Nat. Rev. Genet. 22, 459–476 (2021).

  70. Singh, R., Wu, A. P., Mudide, A. & Berger, B. Unraveling causal gene regulation from the RNA velocity graph using Velorama. Preprint at bioRxiv https://doi.org/10.1101/2022.10.18.512766 (2022).

  71. Hughes, N. W. et al. Machine-learning-optimized Cas12a barcoding enables the recovery of single-cell lineages and transcriptional profiles. Mol. Cell 82, 3103–3118 (2022).

  72. Gong, W. et al. Benchmarked approaches for reconstruction of in vitro cell lineages and in silico models of C. elegans and M. musculus developmental trees. Cell Syst. 12, 810–826 (2021).

  73. Espinosa-Medina, I., Garcia-Marques, J., Cepko, C. & Lee, T. High-throughput dense reconstruction of cell lineages. Open Biol. 9, 190229 (2019).

  74. Jindal, K. et al. Multiomic single-cell lineage tracing to dissect fate-specific gene regulatory programs. Preprint at bioRxiv https://doi.org/10.1101/2022.10.23.512790 (2022).

  75. Gillespie, D. T. Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81, 2340–2361 (1977).

  76. Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).

  77. Paradis, E., Claude, J. & Strimmer, K. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20, 289–290 (2004).

  78. Chen, W. et al. UMI-count modeling and differential expression analysis for single-cell RNA sequencing. Genome Biol. 19, 70 (2018).

  79. Jia, C. Kinetic foundation of the zero-inflated negative binomial model for single-cell RNA sequencing data. SIAM J. Appl. Math. 80, 1336–1355 (2020).

  80. Prim, R. C. Shortest connection networks and some generalizations. Bell System Technical Journal 36, 1389–1401 (1957).

  81. Cock, P. J. et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25, 1422–1423 (2009).

  82. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).

  83. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).

  84. Glanville, J. et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).

  85. Wu, T. et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation (Camb). 2, 100141 (2021).

  86. Wang, K. et al. PhyloVelo, Phylogeny-based transcriptomic velocity of single cells. GitHub https://github.com/kunwang34/PhyloVelo

Acknowledgements

We thank Y. Huang, J. Wang, J. Xu, L. Ma, W. Chen and members of the Hu laboratory for constructive discussions. This work was supported by the National Key R&D Program of China (2021YFA1302500 to Z.H.), the National Natural Science Foundation of China (11971405 to D.Z. and 32270693 to Z.H.), the Guangdong Basic and Applied Basic Research Foundation (2021B1515020042 to Z.H.), Fundamental Research Funds for the Central Universities (20720230023 to D.Z.) and the China Postdoctoral Science Foundation (2021M693303 to Z.L. and 2022M723301 to X.W.).

Author information

Contributions

Z.H. and K.W. conceived the concept of phylogenetic velocity. Z.H., K.W. and D.Z. designed the study. K.W. developed the mathematical framework and implemented the software. K.W., Z.H., L.H., Z.L., X.W., X.Z. and Z.Z. analyzed the data. W.Z. and Z.Z. provided constructive suggestions on the model. K.W., Z.H., D.Z., C.C. and X.H. interpreted results. Z.H. and K.W. wrote the manuscript, with contributions from all co-authors. Z.H. and D.Z. supervised the study.

Corresponding authors

Correspondence to Da Zhou or Zheng Hu.

Ethics declarations

Competing interests

C.C. is an advisor to and stockholder in Grail, Ravel and DeepCell and an advisor to Genentech, Bristol Myers Squibb, 3T Biosciences and NanoString. All other authors declare no competing interests.

Peer review

Peer review information

Nature Biotechnology thanks Junyue Cao and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Quantitative metrics for evaluating the performance of PhyloVelo using simulation data.

Two quantitative metrics with varied cell numbers (a), non-linear MEGs (b), different dimensionality reduction methods (c), varied data sparsity (d) and varied numbers of MEGs (e). All benchmarks are simulated 50 times independently. Bar, median; box, 25th to 75th interquartile range (IQR); vertical line, data within 1.5 times the IQR.
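The box-plot convention in this caption (median bar, box spanning the 25th to 75th percentiles, whiskers covering data within 1.5 times the IQR) can be sketched as follows. The quartile interpolation rule is an assumption, since plotting libraries differ on it, and `box_summary` is a hypothetical helper rather than part of PhyloVelo.

```python
from statistics import quantiles

def box_summary(values):
    """Summarize values the way the caption describes a box plot.

    Quartiles use the 'inclusive' interpolation rule (an assumed convention).
    Whiskers extend to the most extreme data points that still lie within
    1.5 * IQR of the box edges.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "median": q2,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }

# Toy benchmark scores with one outlier (50) beyond the upper fence.
summary = box_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
```

Here the value 50 lies beyond the upper fence, so the upper whisker stops at 9.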

Extended Data Fig. 2 PhyloVelo velocity fields in three additional lineages of C. elegans.

(a-c) Hypodermis, body wall muscle (BWM) and pharynx lineage cells, respectively. Colors are labeled by the estimated embryo time (minutes). (d-f) PhyloVelo velocity fields of the three lineages, respectively, each consisting of 2,000 randomly sampled cells from multiple embryos and reconstructed using the MEGs identified from 298 AB lineage cells. Colors are labeled by the PhyloVelo pseudotime. (g-i) The correlation between PhyloVelo pseudotime and embryo time for the cells in the three lineages. The Spearman correlation coefficients and P values are shown here.

Extended Data Fig. 3 High concordance of MEGs identified from 4 mouse embryos (E8.0/8.5) in Chan et al.

(a) Venn diagram showing the overlap of MEGs identified from four mouse embryos in the dataset of Chan et al. The P value is from a one-sided SuperExactTest multi-set intersection test. (b-g) The correlation of phylogenetic velocities \(\boldsymbol{v}\) for the overlapping MEGs between any two embryos. The Pearson correlation coefficients and P values are shown here.

Extended Data Fig. 4 The global differentiation trajectories of whole mouse embryos and brain tissues predicted by LT-MEGs.

(a) PhyloVelo velocity fields of mouse embryos (E6.5-8.5) mapped by 104 LT-MEGs with the temporal scRNA-seq dataset from Pijuan-Sala et al. (b-c) UMAP plot colored by PhyloVelo pseudotime (b) or sample capture time (c). (d) PhyloVelo velocity fields of mouse brain (E7-18) mapped by LT-MEGs with the temporal scRNA-seq dataset from La Manno et al. (e-f) tSNE plot colored by PhyloVelo pseudotime (e) or sample capture time (f). UMAP or tSNE coordinates were taken from the original studies.

Extended Data Fig. 5 PhyloVelo velocity fields and quantitative state transitions of mouse erythroid development for four embryos from Chan et al.

(a-d) PhyloVelo velocity fields. (e-h) The transition rate (backward) between any two cell types. (i-l) Cell-type transition graph (backward) visualized based on the cell-type transition rates. PhyloVelo velocity fields were used as the input of Dynamo.

Extended Data Fig. 6 PhyloVelo reconstructs the cellular trajectory of lung cancer evolution in 3435_NT_T1.

(a) Single-cell phylogenetic tree of primary lung tumor 3435_NT_T1 (n = 1,109 cells) from KP (Kras^LSL-G12D/+;Trp53^fl/fl) mouse model. The single-cell RNA data, cell type annotations and lineage tree were obtained from Yang et al. (b) RNA velocity fields (scVelo - dynamical mode). (c) PhyloVelo velocity fields. (d) The fitness signatures of single cells as defined by Yang et al. (e) The correlation between scVelo latent time and fitness signatures. The Spearman correlation coefficients and P values are shown here. (f) The correlation between PhyloVelo pseudotime and fitness signatures. The Spearman correlation coefficients and P values are shown here. (g) CytoTRACE score of individual cells. (h) The correlation between scVelo latent time and CytoTRACE scores. The Spearman correlation coefficients and P values are shown here. (i) The correlation between PhyloVelo pseudotime and CytoTRACE scores. (j) The correlation of phylogenetic velocities for the overlapping MEGs between KP primary tumor 3435_NT_T1 and 3726_NT_T1. The Pearson correlation coefficient and P value are shown here.

Extended Data Fig. 7 Comparison of PhyloVelo with scVelo, VeloVAE, DeepVelo, cellDancer and UniTVelo on mouse erythroid data.

scVelo - RNA velocity fields (a), latent time (b) and the fractions of different cell types along latent time (c). VeloVAE - RNA velocity fields (d), latent time (e) and the fractions of different cell types along latent time (f). DeepVelo - RNA velocity fields (g), latent time (h) and the fractions of different cell types along latent time (i). cellDancer - RNA velocity fields (j), pseudotime (k) and the fractions of different cell types along pseudotime (l). UniTVelo - RNA velocity fields (m), latent time (n) and the fractions of different cell types along latent time (o). PhyloVelo - velocity fields (p), pseudotime (q) and the fractions of different cell types along pseudotime (r). PhyloVelo velocity fields are in backward directions.

Extended Data Fig. 8 The dynamic EMT trajectory in metastatic progression of pancreatic cancer KPCY cells.

(a) Phylogenetic tree of 601 non-repetitive terminal cells in tumor subclone M1.1 from Simeonov et al. Cell colors are labeled by EMT pseudotime as defined in the original study. (b) The total UMI count (normalized) of MEGs changing with the phylogenetic distance from the root. (c) Heatmap of MEG expressions (z-score normalized) with EMT pseudotime. (d) RNA velocity fields (scVelo - dynamical mode). Cell colors are labeled by EMT pseudotime. (e) scVelo latent time. (f) The correlation between scVelo latent time and EMT pseudotime. (g) PhyloVelo velocity fields. Cell colors are labeled by EMT pseudotime. (h) PhyloVelo pseudotime. (i) The correlation between PhyloVelo pseudotime and EMT pseudotime. The Spearman correlation coefficients and P values are shown here.

Extended Data Fig. 9 Continuous state transitions inferred by PhyloVelo after regressing out cell-cycle effect.

(a–d) PhyloVelo velocity fields after regressing out cell-cycle dynamics in KPCY, A549 lg1, A549 lg2 and HEK293T, respectively. (e–h) The correlation of PhyloVelo pseudotime between the original analysis and the analysis after regressing out the cell-cycle effect in KPCY, A549 lg1, A549 lg2 and HEK293T, respectively. The Pearson correlation coefficients and P values are shown here.

Extended Data Fig. 10 Overlap of MEGs across organisms and tissue/cell types and the permutation analysis of MEG identification.

(a) The overlap of MEGs identified in different datasets as stratified by mouse vs human. (b) The overlap of MEGs identified in different datasets as stratified by normal vs tumor cells. P values are by one-sided hypergeometric test. (c) The q values of MEGs in standard and permutation analysis. Permutation analysis was performed by randomly shuffling the phylogenetic distances of the cells, followed by the PhyloVelo inference procedure. The number of detected MEGs using standard and permutation analysis: n = 1,724 and n = 941 genes in Embryo E8/E8.5; n = 681 and n = 445 genes in KP lung tumor; n = 424 and n = 141 genes in KPCY; n = 629 and n = 50 genes in A549; n = 243 and n = 90 genes in HEK293T; n = 419 and n = 112 genes in in vitro hematopoiesis; n = 368 and n = 270 genes in CD8+ T cells. Bar, median; box, 25th to 75th interquartile range (IQR); vertical line, data within 1.5 times the IQR. (d) The GO enrichment of pseudo-MEGs across the seven lineage tracing datasets.
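The permutation analysis in panel (c) shuffles the cells' phylogenetic distances and then reruns the inference. A minimal sketch of that idea is given below; it substitutes a simple Spearman-correlation screen for the full PhyloVelo inference procedure, and the helper names, toy data, threshold and number of permutations are all illustrative assumptions.

```python
import random
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5 if sxx and syy else 0.0

def count_hits(expression, depth, threshold):
    """Number of genes whose |rho| with depth reaches the threshold."""
    return sum(
        1 for values in expression.values()
        if abs(spearman(values, depth)) >= threshold
    )

def permutation_counts(expression, depth, threshold=0.8, n_perm=200, seed=0):
    """Hit counts after shuffling depths across cells (the null scenario)."""
    rng = random.Random(seed)
    shuffled = list(depth)
    counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between cells and tree depth
        counts.append(count_hits(expression, shuffled, threshold))
    return counts

# Toy data: two monotonic genes and one cycling gene across eight cells.
depth = [1, 2, 3, 4, 5, 6, 7, 8]
expression = {
    "gene_up": [0, 1, 1, 2, 3, 5, 8, 9],
    "gene_down": [9, 7, 6, 6, 4, 3, 1, 0],
    "gene_cycle": [1, 5, 1, 5, 1, 5, 1, 5],
}
observed = count_hits(expression, depth, threshold=0.8)
null_counts = permutation_counts(expression, depth)
```

Comparing `observed` with the distribution in `null_counts` plays the same role as comparing the standard and permutation gene counts reported in the caption.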

Supplementary information

Supplementary Information

Supplementary Figs. 1–24 and Supplementary Note.

Reporting Summary

Supplementary Table 1

The MEGs and their phylogenetic velocity estimates in C. elegans, five CRISPR lineage-tracing datasets and two clonal lineage-tracing datasets.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, K., Hou, L., Wang, X. et al. PhyloVelo enhances transcriptomic velocity field mapping using monotonically expressed genes. Nat Biotechnol (2023). https://doi.org/10.1038/s41587-023-01887-5

  • DOI: https://doi.org/10.1038/s41587-023-01887-5
