Systematic identification of cell-fate regulatory programs using a single-cell atlas of mouse development

An Author Correction to this article was published on 03 August 2022

This article has been updated

Abstract

Waddington’s epigenetic landscape is a metaphor frequently used to illustrate cell differentiation. Recent advances in single-cell genomics are altering our understanding of the Waddington landscape, yet the molecular mechanisms of cell-fate decisions remain poorly understood. We constructed a cell landscape of mouse lineage differentiation during development at the single-cell level and described both lineage-common and lineage-specific regulatory programs during cell-type maturation. We also found lineage-common regulatory programs that are broadly active during the development of invertebrates and vertebrates. In particular, we identified Xbp1 as an evolutionarily conserved regulator of cell-fate determinations across different species. We demonstrated that Xbp1 transcriptional regulation is important for the stabilization of the gene-regulatory networks for a wide range of mouse cell types. Our results offer genetic and molecular insights into cellular gene-regulatory programs and will serve as a basis for further advancing the understanding of cell-fate decisions.

Fig. 1: Single-cell transcriptional atlas of mouse differentiation.
Fig. 2: Cellular heterogeneity in mouse tissues.
Fig. 3: Analysis of regulatory programs in MCDA.
Fig. 4: Global characteristics of cell differentiation across species.
Fig. 5: Inference of gene regulation during cell-fate decisions across species.
Fig. 6: ScRNA-seq and high-resolution MS revealed gene and protein changes in Xbp1−/− embryos.

Data availability

The data generated in the present study can be downloaded from NCBI’s Gene Expression Omnibus under accession nos. GSE176063 and GSE178217. The raw and processed files of MCDA are at accession no. GSE176063. The raw and processed files of WT and Xbp1 KO embryos are at accession no. GSE178217. Processed count matrices and cell annotations, including the development stage, tissue of origin, lineage information and cell-type annotations, are provided on the figshare website (https://figshare.com/s/340e8e7f349559f61ef6). We have provided separate datasets for each tissue and the merged datasets for the MCDA. We have also provided an interactive website (http://bis.zju.edu.cn/MCA) to enable public access to the data. The proteomics data are available in the Proteomics Identifications Database (PRIDE) under accession no. PXD032847. The following publicly available datasets were used in the study: Mus_musculus.GRCm38.88 genome, Mus_musculus_10090 database, AnimalTFDB 3.0 database, STRING database (v.1.1), eggNOG database (v.5.0), Ensembl v.96; the S. mediterranea dataset generated by Plass et al.12 (accession no. GSE103633); the C. elegans dataset generated by Packer et al.15 (accession no. GSE126954); the C. intestinalis dataset generated by Cao et al.16 (accession no. GSE131155); the H. vulgaris dataset generated by Siebert et al.13 (accession no. GSE121617); the D. rerio dataset generated by Li et al.42 (accession no. GSE178151); the H. sapiens dataset generated by Han et al.14 (accession no. GSE134355); and part of the M. musculus dataset (E14.5 and adult) generated by Han et al.11 (accession nos. GSE108097 and GSE134355). The mouse scATAC-seq datasets were generated by Cusanovich et al.63 (accession no. GSE111586, https://atlas.gs.washington.edu/mouse-atac/data) and Di Bella et al.64 (accession no. GSE153164), and the human scATAC-seq dataset by Domcke et al.65 (descartes.brotmanbaty.org).
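
As a convenience, the two GEO accessions for data generated in this study can be turned into landing-page links. The following is a minimal sketch, not part of the authors' code: the helper name `geo_url` and the dictionary labels are our own, and it assumes the standard GEO query URL pattern.

```python
import re

# Standard GEO accession landing-page prefix (assumed stable).
GEO_QUERY_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc="

# Accessions quoted in the Data availability statement; the labels are ours.
DATASETS = {
    "MCDA": "GSE176063",
    "WT and Xbp1 KO embryos": "GSE178217",
}


def geo_url(accession: str) -> str:
    """Return the GEO landing-page URL for a GSE series accession."""
    if not re.fullmatch(r"GSE\d+", accession):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return GEO_QUERY_URL + accession


if __name__ == "__main__":
    for label, accession in DATASETS.items():
        print(f"{label}: {geo_url(accession)}")
```

The accession check only guards against typos; it does not confirm that a record exists.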

Code availability

Detailed code is available at GitHub (https://github.com/ggjlab/MCDA) and Zenodo (https://zenodo.org/record/6548256#.Yn92F-hBw2w)95.

Change history

References

  1. Mathis, L. & Nicolas, J.-F. Cellular patterning of the vertebrate embryo. Trends Genet. 18, 627–635 (2002).

  2. Heinäniemi, M. et al. Gene-pair expression signatures reveal lineage control. Nat. Methods 10, 577–583 (2013).

  3. Moris, N., Pina, C. & Arias, A. M. Transition states and cell fate decisions in epigenetic landscapes. Nat. Rev. Genet. 17, 693–703 (2016).

  4. Waddington, C. H. The Strategy of the Genes (Routledge, 2014).

  5. Teschendorff, A. E. & Feinberg, A. P. Statistical mechanics meets single-cell biology. Nat. Rev. Genet. 22, 459–476 (2021).

  6. Wagner, D. E. & Klein, A. M. Lineage tracing meets single-cell omics: opportunities and challenges. Nat. Rev. Genet. 21, 410–427 (2020).

  7. Huang, S., Eichler, G., Bar-Yam, Y. & Ingber, D. E. Cell fates as high-dimensional attractor states of a complex gene regulatory network. Phys. Rev. Lett. 94, 128701 (2005).

  8. Orkin, S. H. & Zon, L. I. Hematopoiesis: an evolving paradigm for stem cell biology. Cell 132, 631–644 (2008).

  9. Davis, R. L., Weintraub, H. & Lassar, A. B. Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell 51, 987–1000 (1987).

  10. Niwa, H. et al. Interaction between Oct3/4 and Cdx2 determines trophectoderm differentiation. Cell 123, 917–929 (2005).

  11. Han, X. et al. Mapping the Mouse Cell Atlas by Microwell-Seq. Cell 172, 1091–1107.e17 (2018).

  12. Plass, M. et al. Cell type atlas and lineage tree of a whole complex animal by single-cell transcriptomics. Science 360, eaaq1723 (2018).

  13. Siebert, S. et al. Stem cell differentiation trajectories in Hydra resolved at single-cell resolution. Science 365, eaav9314 (2019).

  14. Han, X. et al. Construction of a human cell landscape at single-cell level. Nature 581, 303–309 (2020).

  15. Packer, J. S. et al. A lineage-resolved molecular atlas of C. elegans embryogenesis at single-cell resolution. Science 365, eaax1971 (2019).

  16. Cao, C. et al. Comprehensive single-cell transcriptome lineages of a proto-vertebrate. Nature 571, 349–354 (2019).

  17. Mittnenzweig, M. et al. A single-embryo, single-cell time-resolved model for mouse gastrulation. Cell https://doi.org/10.1016/j.cell.2021.04.004 (2021).

  18. Qiu, C. et al. Systematic reconstruction of cellular trajectories across mouse embryogenesis. Nat. Genet. 54, 328–341 (2022).

  19. Ferre, P., Decaux, J.-F., Issad, T. & Girard, J. Changes in energy metabolism during the suckling and weaning period in the newborn. Reprod. Nutr. Dev. 26, 619–631 (1986).

  20. Stewart, B. J. et al. Spatiotemporal immune zonation of the human kidney. Science 365, 1461–1466 (2019).

  21. Chen, L. et al. Renal-tubule epithelial cell nomenclature for single-cell RNA-sequencing studies. J. Am. Soc. Nephrol. 30, 1358–1364 (2019).

  22. Zheng, B. et al. Prospective identification of myogenic endothelial cells in human skeletal muscle. Nat. Biotechnol. 25, 1025–1034 (2007).

  23. Chaudhari, P., Tian, L., Deshmukh, A. & Jang, Y.-Y. Expression kinetics of hepatic progenitor markers in cellular models of human liver development recapitulating hepatocyte and biliary cell fate commitment. Exp. Biol. Med. 241, 1653–1662 (2016).

  24. Willnow, D. et al. Quantitative lineage analysis identifies a hepato-pancreato-biliary progenitor niche. Nature 597, 87–91 (2021).

  25. Banerji, C. R. S. et al. Cellular network entropy as the energy potential in Waddington’s differentiation landscape. Sci. Rep. 3, 3039 (2013).

  26. Grün, D. et al. De novo prediction of stem cell identity using single-cell transcriptome data. Cell Stem Cell 19, 266–277 (2016).

  27. Guo, M. SLICE: determining cell differentiation and lineage based on single cell entropy. Nucleic Acids Res. 45, 14 (2017).

  28. Teschendorff, A. E. & Enver, T. Single-cell entropy for accurate estimation of differentiation potency from a cell’s transcriptome. Nat. Commun. 8, 15599 (2017).

  29. Teschendorff, A. E., Maity, A. K., Hu, X., Weiyan, C. & Lechner, M. Ultra-fast scalable estimation of single-cell differentiation potency from scRNA-Seq data. Bioinformatics 37, 1528–1534 (2021).

  30. Kauffman, S. Homeostasis and differentiation in random genetic control networks. Nature 224, 177–178 (1969).

  31. Lee, T. I. & Young, R. A. Transcriptional regulation and its misregulation in disease. Cell 152, 1237–1251 (2013).

  32. Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).

  33. Holland, C. H. et al. Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data. Genome Biol. 21, 36 (2020).

  34. Parker, H. J. Mammalian embryo: Hox genes. eLS 1–15 (2020).

  35. Cassandri, M. et al. Zinc-finger proteins in health and disease. Cell Death Discov. 3, 17071 (2017).

  36. Eisenberg, E. & Levanon, E. Y. Human housekeeping genes, revisited. Trends Genet. 29, 569–574 (2013).

  37. Jochum, W., Passegué, E. & Wagner, E. F. AP-1 in mouse development and tumorigenesis. Oncogene 20, 2401–2412 (2001).

  38. Velazquez, F. N., Caputto, B. L. & Boussin, F. D. c-Fos importance for brain development. Aging 7, 1028 (2015).

  39. Liu, J. et al. The oncogene c-Jun impedes somatic cell reprogramming. Nat. Cell Biol. 17, 856–867 (2015).

  40. Madrigal, P. & Alasoo, K. AP-1 takes centre stage in enhancer chromatin dynamics. Trends Cell Biol. 28, 509–511 (2018).

  41. Hinman, V. & Cary, G. The evolution of gene regulation. eLife 6, e27291 (2017).

  42. Li, J. et al. Inferring predictive genetic models and regulatory elements by deep learning of cross-species single-cell gene expression landscapes. Preprint at https://www.researchsquare.com/article/rs-1544073/v1 (2022).

  43. Wolf, F. A. et al. PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol. 20, 59 (2019).

  44. Briggs, J. A. et al. The dynamics of gene expression in vertebrate embryogenesis at single-cell resolution. Science 360, eaar5780 (2018).

  45. Chakraborty, C. & Agoramoorthy, G. Stem cells in the light of evolution. Indian J. Med. Res. 135, 813 (2012).

  46. Saeed, R. & Deane, C. M. Protein protein interactions, evolutionary rate, abundance and age. BMC Bioinform. 7, 128 (2006).

  47. Boon, K. et al. N‐myc enhances the expression of a large set of genes functioning in ribosome biogenesis and protein synthesis. EMBO J. 20, 1383–1393 (2001).

  48. Van Riggelen, J., Yetil, A. & Felsher, D. W. MYC as a regulator of ribosome biogenesis and protein synthesis. Nat. Rev. Cancer 10, 301–309 (2010).

  49. Shi, J., Teschendorff, A. E., Chen, W., Chen, L. & Li, T. Quantifying Waddington’s epigenetic landscape: a comparison of single-cell potency measures. Brief. Bioinform. https://doi.org/10.1093/bib/bby093 (2018).

  50. Farley-Barnes, K. I. et al. Diverse regulators of human ribosome biogenesis discovered by changes in nucleolar number. Cell Rep. 22, 1923–1934 (2018).

  51. Macaulay, I. C. et al. Single-cell RNA-sequencing reveals a continuous spectrum of differentiation in hematopoietic cells. Cell Rep. 14, 966–977 (2016).

  52. Athanasiadis, E. I. et al. Single-cell RNA-sequencing uncovers transcriptional states and fate decisions in haematopoiesis. Nat. Commun. 8, 2045 (2017).

  53. Krausgruber, T. et al. Structural cells are key regulators of organ-specific immune responses. Nature 583, 296–302 (2020).

  54. Tarashansky, A. J. et al. Mapping single-cell atlases throughout Metazoa unravels cell type evolution. eLife 10, e66747 (2021).

  55. Fischer, S., Crow, M., Harris, B. D. & Gillis, J. Scaling up reproducible research for single-cell transcriptomics using MetaNeighbor. Nat. Protoc. 16, 4031–4067 (2021).

  56. Van de Sande, B. et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat. Protoc. 15, 2247–2276 (2020).

  57. Wang, J. et al. Tracing cell-type evolution by cross-species comparison of cell atlases. Cell Rep. 34, 108803 (2021).

  58. Lee, A.-H., Chu, G. C., Iwakoshi, N. N. & Glimcher, L. H. XBP-1 is required for biogenesis of cellular secretory machinery of exocrine glands. EMBO J. 24, 4368–4380 (2005).

  59. Todd, D. J. et al. XBP1 governs late events in plasma cell differentiation and is not required for antigen-specific memory B cell development. J. Exp. Med. 206, 2151–2159 (2009).

  60. Pramanik, J. et al. Genome-wide analyses reveal the IRE1a-XBP1 pathway promotes T helper cell differentiation by resolving secretory stress and accelerating proliferation. Genome Med. 10, 76 (2018).

  61. Masaki, T., Yoshida, M. & Noguchi, S. Targeted disruption of CRE-binding factor TREB5 gene leads to cellular necrosis in cardiac myocytes at the embryonic stage. Biochem. Biophys. Res. Commun. 261, 350–356 (1999).

  62. Reimold, A. M. et al. An essential role in liver development for transcription factor XBP-1. Genes Dev. 14, 152–157 (2000).

  63. Cusanovich, D. A. et al. A single-cell atlas of in vivo mammalian chromatin accessibility. Cell 174, 1309–1324.e18 (2018).

  64. Di Bella, D. J. et al. Molecular logic of cellular diversification in the mouse cerebral cortex. Nature 595, 554–559 (2021).

  65. Domcke, S. et al. A human cell atlas of fetal chromatin accessibility. Science 370, eaba7612 (2020).

  66. Viswanathan, S. R. & Daley, G. Q. Lin28: a microRNA regulator with a macro role. Cell 140, 445–449 (2010).

  67. Yang, C.-S., Chang, K.-Y., Dang, J. & Rana, T. M. Polycomb group protein Pcgf6 acts as a master regulator to maintain embryonic stem cell identity. Sci. Rep. 6, 26899 (2016).

  68. Pasini, D. et al. JARID2 regulates binding of the Polycomb repressive complex 2 to target genes in ES cells. Nature 464, 306–310 (2010).

  69. Yoshida, H., Matsui, T., Yamamoto, A., Okada, T. & Mori, K. XBP1 mRNA Is Induced by ATF6 and spliced by IRE1 in response to ER stress to produce a highly active transcription factor. Cell 107, 881–891 (2001).

  70. Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882.e21 (2016).

  71. Acosta-Alvear, D. et al. XBP1 controls diverse cell type- and condition-specific transcriptional regulatory networks. Mol. Cell 27, 53–66 (2007).

  72. Grün, D. Revealing dynamics of gene expression variability in cell state space. Nat. Methods 17, 45–49 (2020).

  73. Pijuan-Sala, B. et al. A single-cell molecular map of mouse gastrulation and early organogenesis. Nature https://doi.org/10.1038/s41586-019-0933-9 (2019).

  74. Xie, Y. et al. An episomal vector-based CRISPR/Cas9 system for highly efficient gene knockout in human pluripotent stem cells. Sci. Rep. 7, 2320 (2017).

  75. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  76. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).

  77. Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).

  78. Park, J.-E. et al. A cell atlas of human thymic development defines T cell repertoire formation. Science 367, eaay3224 (2020).

  79. Zhou, Q. et al. A mouse tissue transcription factor atlas. Nat. Commun. 8, 15089 (2017).

  80. Lambert, L. J., Muzumdar, M. D., Rideout III, W. M. & Jacks, T. Basic mouse methods for clinician researchers: harnessing the mouse for biomedical research. in Basic Science Methods for Clinical Researchers 291–312 (Elsevier, 2017).

  81. Teschendorff, A. E. & Wang, N. Improved detection of tumor suppressor events in single-cell RNA-Seq data. NPJ Genom. Med. 5, 43 (2020).

  82. Saito, R. et al. A travel guide to Cytoscape plugins. Nat. Methods 9, 1069–1076 (2012).

  83. Rozanski, A. et al. PlanMine 3.0—improvements to a mineable resource of flatworm biology and biodiversity. Nucleic Acids Res. 47, D812–D820 (2019).

  84. Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).

  85. Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).

  86. Hu, H. et al. AnimalTFDB 3.0: a comprehensive resource for annotation and prediction of animal transcription factors. Nucleic Acids Res. 47, D33–D38 (2019).

  87. Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics: J. Integr. Biol. 16, 284–287 (2012).

  88. Supek, F., Bošnjak, M., Škunca, N. & Šmuc, T. REVIGO summarizes and visualizes long lists of gene ontology terms. PLoS ONE 6, e21800 (2011).

  89. von Mering, C. et al. STRING: a database of predicted functional associations between proteins. Nucleic Acids Res. 31, 258–261 (2003).

  90. Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).

  91. Song, Y. et al. Screening of potential biomarkers for gastric cancer with diagnostic value using label-free global proteome analysis. Genom. Proteom. Bioinform. 18, 679–695 (2020).

  92. Schep, A. N., Wu, B., Buenrostro, J. D. & Greenleaf, W. J. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods 14, 975–978 (2017).

  93. Argemí, J. et al. X-box binding protein 1 regulates unfolded protein, acute-phase, and DNA damage responses during regeneration of mouse liver. Gastroenterology 152, 1203–1216.e15 (2017).

  94. Khetchoumian, K. et al. Pituitary cell translation and secretory capacities are enhanced cell autonomously by the transcription factor Creb3l2. Nat. Commun. 10, 3960 (2019).

  95. Fei, L. ggjlab/MCDA: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.6423564 (2022).

Acknowledgements

G.G. is a participant of the Human Cell Atlas Project. We thank J. Wu, H. Wu, J. Liu, D. Wang, S. Xiong, Y. Huang and P. Xu for support on the project. We thank: G-BIO (Hangzhou), BGI (Shenzhen) and CNGB (Shenzhen) for supporting this project; Vazyme (Nanjing) for supplying the customized enzymes in the study; the Core Facilities of Zhejiang University School of Medicine for technical support; the Center of Cryo-Electron Microscopy at Zhejiang University for computational support; PTM Biolab (Hangzhou) for performing the LC–MS proteomic analysis; and Gempharmatech (Nanjing) for producing the KO mice. Funding was provided by: National Natural Science Foundation of China (grant nos. 31930028 to G.G., 31871473 to X.H., 31922049 to X.H., 91842301 to G.G., 32001068 to H.C. and 32000461 to J.W.), the Funds for Creative Research Groups of China (grant no. T2121004 to G.G.), National Key Research and Development Program (grant nos. 2018YFA0107804 to G.G., 2018YFA0107801 to X.H. and 2018YFA0800503 to G.G.); and Fundamental Research Funds for the Central Universities (to G.G.).

Author information

Contributions

G.G. conceived the project. X.H., H.C., X.F., Z.Z., R.W. and L.F. performed the experiments. L.F., L.M., W.E., H.S., J.W., X.W., C.Y. and Y.M. performed the single-cell data processing, clustering analyses, gene-regulated analyses and cell-type annotation. G.G., L.F., H.C., L.M., W.E. and X.F. wrote the paper. M.J., D.J. and T.Z. performed the sequencing experiments. L.F., L.M. and W.E. preserved and made available the data, code and materials on publication. G.G., X.H., H.C. and J.W. acquired the funds.

Corresponding authors

Correspondence to Xiaoping Han or Guoji Guo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Malte Spielmann and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Construction of the MCDA.

a, Hierarchical trees showing the relationship between 95 cell types in MCDA, colored by lineage. b, t-SNE visualization of 520,801 single cells from seven developmental stages of mice, colored by lineage. Panels a and b share the same lineage color legend. c, t-SNE visualization of 520,801 single cells from different developmental stages of mice, colored by tissue. d, Heatmaps showing the number of differentially expressed genes (DEGs) in each developmental stage across the ten tissues of mice. DEGs between two stages of cells were identified using a Wilcoxon rank sum test. e, Summary of the GO enrichment analysis performed on the DEGs in each developmental stage. f, Visualization of the top 10 principal components of PCA in MCDA. Colors represent tissues, as in c. g, Lollipop chart displaying the gene expression variance explained by residuals (that is, biological and technical noise) or experimental factors such as tissue, stage, gender, and their respective combinations. Items such as “tissue and gender” denote variance explained by the interaction of the two factors rather than by their union. h, UMAP visualization of 57,118 single cells in the kidneys at 7 different time points, colored by stage. i, Summary of the GO enrichment analysis performed on the DEGs in the kidneys across different stages. Red marks the GO terms related to renal physiological functions.

Extended Data Fig. 2 t-SNE maps for examples of analyzed tissues in MCDA.

t-SNE maps for single-cell data from brain at P0 (a, n = 9,265 cells), P10 (b, n = 6,100 cells), P21 (c, n = 4,433 cells) stages, heart at P0 (d, n = 3,948 cells), P10 (e, n = 5,383 cells), P21 (f, n = 4,054 cells) stages, intestine at P0 (g, n = 9,101 cells), P10 (h, n = 17,909 cells), P21 (i, n = 9,365 cells) stages, kidney at P0 (j, n = 13,155 cells), P10 (k, n = 12,129 cells), P21 (l, n = 5,700 cells) stages, liver at P0 (m, n = 9,980 cells), P10 (n, n = 9,259 cells), P21 (o, n = 5,867 cells) stages, lung at P0 (p, n = 5,906 cells), P10 (q, n = 11,314 cells), P21 (r, n = 6,391 cells) stages, and pancreas at P0 (s, n = 5,639 cells), P10 (t, n = 11,007 cells) stages.

Extended Data Fig. 3 t-SNE maps for examples of analyzed tissues in MCDA.

t-SNE maps for single-cell data from pancreas at the P21 stage (a, n = 4,858 cells), stomach at P0 (b, n = 4,073 cells), P10 (c, n = 22,599 cells), P21 (d, n = 9,945 cells) stages, testes at P0 (e, n = 9,034 cells), P10 (f, n = 15,808 cells), P21 (g, n = 9,095 cells) stages, uterus at P0 (h, n = 4,561 cells), P10 (i, n = 4,841 cells), P21 (j, n = 9,077 cells) stages, and embryo at E10.5 (k, n = 26,551 cells) and E12.5 (l, n = 72,792 cells) stages.

Extended Data Fig. 4 Examples of novel cell populations.

a, Feature plots in the t-SNE map of P10 lung (n = 11,314 cells). Cells are colored according to the expression of the indicated marker gene or the co-expression of two genes. The red boxes magnify the cells that co-express both genes. b, Immunofluorescence assay for the club cell marker gene Scgb1a1 (green) and goblet cell marker gene Tff2 (yellow) in P10 lung. The red boxes indicate the locations of co-expression. The experiment was replicated three times with similar results. Scale bar, 20 μm. c, d, Left: feature plots of Afp in the t-SNE map of P0 pancreas (c, n = 5,639 cells) and P10 pancreas (d, n = 11,007 cells). Cells are colored according to the expression of Afp. Right: immunofluorescence assay for the hepatocyte marker gene Afp (green) in P0 (c) and P10 (d) pancreas. The experiment was replicated three times with similar results. Scale bar, 20 μm. e, Heatmap showing the differentially expressed genes between liver hepatocytes and pancreas hepatocyte-like cells at the P0 stage. A two-sided Wilcoxon rank-sum test was performed to identify differentially expressed genes and p-values were adjusted using Bonferroni correction (adjusted p-value < 0.05, fold change ≥ 2). f, Heatmap showing the differentially expressed genes between liver hepatocytes and pancreas hepatocyte-like cells at the P10 stage. A two-sided Wilcoxon rank-sum test was performed to identify differentially expressed genes and p-values were adjusted using Bonferroni correction (adjusted p-value < 0.05, fold change > 2).
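
The differential-expression criterion described in e and f (two-sided Wilcoxon rank-sum test, Bonferroni adjustment, fold-change cut-off) can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the toy expression values and the normal approximation without tie correction are all our assumptions, chosen to keep the example dependency-free.

```python
from statistics import NormalDist, mean


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    pooled = sorted((value, group) for group, vals in enumerate((x, y)) for value in vals)
    rank_sum_x = 0.0
    i, n = 0, len(pooled)
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j shared by tied values
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


def call_degs(expr_a, expr_b, alpha=0.05, min_fold=2.0, pseudo=1e-9):
    """Return genes passing Bonferroni-adjusted p < alpha and |fold change| >= min_fold."""
    n_tests = len(expr_a)
    hits = []
    for gene in expr_a:
        a, b = expr_a[gene], expr_b[gene]
        p_adj = min(1.0, rank_sum_p(a, b) * n_tests)  # Bonferroni adjustment
        fold = (mean(a) + pseudo) / (mean(b) + pseudo)
        if p_adj < alpha and (fold >= min_fold or fold <= 1 / min_fold):
            hits.append(gene)
    return hits


# Toy values only; not taken from the study.
group_a = {"Afp": [10, 12, 11, 13, 9, 14, 10, 12], "Actb": [5, 6, 5, 7, 6, 5, 6, 7]}
group_b = {"Afp": [1, 2, 1, 0, 2, 1, 3, 1], "Actb": [6, 5, 7, 5, 6, 6, 5, 7]}

if __name__ == "__main__":
    print(call_degs(group_a, group_b))
```

In practice an exact or tie-corrected test from a statistics library would be preferable; the approximation here only serves to show how the three thresholds combine.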

Extended Data Fig. 5 Entropy estimations of the MCDA using.

a, Entropy measurement of cells in MCDA using the SLICE method. The color represents the stage. P-values are from a two-sided Wilcoxon rank sum test comparing entropies of two different development stages (n = 60,065 cells, ns: not significant, p-value > 0.05, * p-value ≤ 0.05, ** p-value ≤ 0.01, *** p-value ≤ 0.001, **** p-value ≤ 0.0001). The exact p-values are provided in the Source Data. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR. The same statistical analysis was performed for Extended Data Fig. 5a–d. b, Entropy measurement of each lineage in MCDA using the SLICE method. The color represents the stage (epithelial: n = 13,642 cells, neuron: n = 3,638 cells, immune: n = 15,719 cells, muscle: n = 2,592 cells, stromal: n = 8,541 cells, endothelial: n = 4,528 cells, other: n = 2,626 cells, erythroid: n = cells, proliferating: n = 3,442 cells, secretory: n = 2,892 cells, germline: n = 5,480 cells). c, Entropy measurement of cells in MCDA using the StemID method (n = 60,065 cells). The color represents the stage. d, Entropy measurement of each lineage in MCDA using the StemID method. The color represents the stage (epithelial: n = 13,642 cells, neuron: n = 3,638 cells, immune: n = 15,719 cells, muscle: n = 2,592 cells, stromal: n = 8,541 cells, endothelial: n = 4,528 cells, other: n = 2,626 cells, erythroid: n = cells, proliferating: n = 3,442 cells, secretory: n = 2,892 cells, germline: n = 5,480 cells). e, Boxplots displaying the sensitivity, specificity, FPR (False Positive Rate), and PRAUC (Precision-Recall Area Under Curve) of two methods with different inputs to detect tissue-specific TFs in MCDA (n = 9 tissues per box). The methods shown are VIPER-DOROTHEA run on pseudo cells (pseudo_VIPER-DOROTHEA) or single cells (single_VIPER-DOROTHEA), and SCENIC run on pseudo cells (pseudo_SCENIC) or single cells (single_SCENIC). The union of the two methods with single cells (single_union (ABC)) is collection ABC, and the intersection of the two methods with single cells (single_intersection (A)) is collection A. Box plots: center line, median; boxes, first and third quartiles of the distribution; points, tissues in MCDA. The results indicate that SCENIC with single-cell datasets performs better in specificity and PRAUC than VIPER-DOROTHEA. The union of the two methods achieves over 75% sensitivity in identifying regulatory programs, while the intersection of the two methods achieves the highest specificity. f, Heatmap of aggregated module activities of TFs clustered by fuzzy c-means showing variation by stage and lineage from VIPER-DOROTHEA. g, Boxplot showing the module activity scores in module 14 (n = 56 TFs) and module 15 (n = 36 TFs) per lineage per stage from SCENIC. Red lines mark the zero line. Colors from blue to yellow represent the 7 development stages from E10.5 to adult stage. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR. h, Venn diagrams of the numbers of overlapping genes between housekeeping TFs and commonly upregulated TFs (TFs in module 14, collection ABC) in MCDA. i, Heatmap showing commonly upregulated TFs (TFs in module 14, collection ABC) with regard to expression levels in MCDA. The color displays the Spearman correlation between aggregated TF expression levels in tissue-lineage against development stages (labeled as 1 to 7 to represent E14.5 to adult). Red blocks indicate TFs that display upregulated expression patterns in the specific lineages of tissues.
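
The trade-off reported in e, where the union of two callers is more sensitive and their intersection more specific, follows directly from set arithmetic. A minimal sketch with invented TF labels (the names `TF1` to `TF10`, the two hit lists and the function `confusion_rates` are hypothetical and not taken from the study):

```python
def confusion_rates(predicted, truth, universe):
    """Sensitivity, specificity and FPR of a predicted set against a reference set."""
    tp = len(predicted & truth)
    fp = len(predicted - truth)
    fn = len(truth - predicted)
    tn = len(universe - predicted - truth)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
    }


# Hypothetical example: 10 candidate TFs, 4 of them truly tissue-specific.
universe = {f"TF{i}" for i in range(1, 11)}
truth = {"TF1", "TF2", "TF3", "TF4"}
method_one_hits = {"TF1", "TF2", "TF5"}  # stand-in for one caller's output
method_two_hits = {"TF2", "TF3", "TF6"}  # stand-in for the other caller's output

union_rates = confusion_rates(method_one_hits | method_two_hits, truth, universe)
intersection_rates = confusion_rates(method_one_hits & method_two_hits, truth, universe)

if __name__ == "__main__":
    print("union:", union_rates)
    print("intersection:", intersection_rates)
```

Taking the union can only add calls, so sensitivity never decreases, while taking the intersection can only remove calls, so specificity never decreases.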

Extended Data Fig. 6 Analysis of the developmental branch across species.

a, Circos plot showing the subphyla, species, tissues/lineages, and time points of the single-cell datasets used in the cross-species analysis. b–d, Radial network plots showing the inferred relationships among cell types of invertebrates (b, H. vulgaris; c, C. elegans; d, S. mediterranea). Dots represent cell types, colored by lineage. e, Sankey plot showing the inferred relationships among cell types in fetal and adult human lungs.

Extended Data Fig. 7 Cross-species analysis of commonly upregulated and downregulated genes.

a, b, Entropy measurement of each lineage in H. sapiens (a) and D. rerio (b) using the CCAT method (H. sapiens: immune, n = 26,976 cells, stromal, n = 11,278 cells, muscle, n = 5,450 cells, epithelial, n = 20,347 cells, erythroid, n = 1,897 cells, neuron, n = 4,659 cells, endothelial, n = 7,475 cells, proliferating, n = 3,421 cells, secretory, n = 3,708 cells; D. rerio: epithelial, n = 36,243 cells, stromal, n = 8,801 cells, erythroid, n = 693 cells, others, n = 3,454 cells, muscle, n = 4,140 cells, neuron, n = 10,363 cells, immune, n = 10,104 cells). The color represents the stage. P-values were from a two-sided Wilcoxon rank sum test comparing entropies of two different development stages. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR. c, d, Venn plots showing the downregulated (c) and upregulated (d) genes in 7 species (homologous genes of humans, p-adj < 0.1). e, Bar plot showing the numbers of conserved upregulated and conserved downregulated genes per species, which were homologous genes of humans. f, Box plots showing the log10 number of protein–protein interactions (PPIs) of commonly upregulated genes (at least 3 species, n = 59), commonly downregulated genes (at least 3 species, n = 524), other conserved genes (at least 3 species and homologous to human genes, n = 12,543), and other genes (n = 17,839). P-values were from a two-sided Wilcoxon rank sum test comparing log10 PPI numbers of two different gene types. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR. g, Bar plot showing the gene composition of conserved upregulated genes (at least 3 species, n = 59 genes), conserved downregulated genes (at least 3 species, n = 524 genes), other conserved genes (in at least 3 species and homologous to human genes, n = 12,543 genes), and other genes (n = 17,839 genes). Gene categories were colored by mean values of log10 PPI number (blue: fewer PPIs, red: more PPIs). h, i, Bubble plots showing the GO terms of commonly downregulated (h) and upregulated (i) genes. The bubble color indicates the proportion of the selected GO term in the EBI GOA database for human. A higher value implies a more general term; a lower value implies a more specific one. The bubble size indicates the frequency of the GO term in the underlying GOA database. A hypergeometric test was performed to identify significant GO terms, and Benjamini–Hochberg correction was used to adjust p-values. j, Heatmap showing the cell type frequencies of commonly upregulated genes in 7 species.
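The GO enrichment procedure named in this legend (a hypergeometric test followed by Benjamini–Hochberg adjustment) can be sketched in a few lines. This is an illustrative, standard-library-only version under stated assumptions: the function names `hypergeom_sf` and `benjamini_hochberg` and all counts and p-values are hypothetical, and the authors' actual software and parameters are not given in this section.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K annotated genes, n drawn genes)."""
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 10 background genes, 4 carry a GO term,
# and 2 of the 3 genes in the list of interest carry it.
p_enrich = hypergeom_sf(k=2, N=10, K=4, n=3)
print(round(p_enrich, 4))

# Hypothetical raw p-values for four GO terms.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The one-sided upper tail is used here because enrichment asks whether the term appears more often than expected by chance; a term is then called significant if its adjusted p-value falls below the chosen cutoff.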

Extended Data Fig. 8 Lineage-specific regulators among different species.

a, Network plot showing the reliable and biologically plausible matches of lineages from 7 species using MetaNeighbor and SAMap (sm: S. mediterranea, ce: C. elegans, hy: H. vulgaris, ci: C. intestinalis, ze: D. rerio, mo: M. musculus, hu: H. sapiens; the abbreviations are the same in Extended Data Fig. 8). b, UMAP showing the combined projection of seven species based on pseudo-bulk cells, colored by species. c, UMAP showing the combined projection based on pseudo-bulk cells, colored by meta-lineages. d–j, Heatmaps showing the sequence similarities (log values) of development-related lineage-specific TFs within the meta-lineage across species: stromal (d), endothelial (e), muscle (f), stem/germline (g), neural (h), immune (i), and epithelial (j).

Extended Data Fig. 9 scRNA-seq revealed the changes in Xbp1-/- embryos.

a, Western blot for the knockout experiment. The molecular weight markers are labeled. The experiment was replicated three times with similar results. b, An IGV view of mapped reads in the Xbp1 gene in the sequencing data of the WT and KO embryos. The left panel shows the entire Xbp1 gene. The right panel shows the region marked in red, which covers exon 1 and exon 2 of Xbp1. The exon 2 region shows no read coverage, which indicates that exon 2 (97 bp) has been completely disrupted in KO embryos. The blue lines link the different parts of reads that, by definition, map on several exons. The left and right genome browser tracks share the same y axis. c, Xbp1-/- embryos at E12.5. The arrows indicate dead embryos. d, Scatter plot showing the cell composition proportions of differential cell types between KO and WT embryos at E12.5 (WT: n = 4, KO: n = 5, FDR < 0.01). e, f, Entropy measurement of each cluster in Fig. 6b using the StemID (e, n = 93,246 cells) and SLICE (f, n = 93,246 cells) methods. The two panels share the same x-axis labels. P-values are from a two-sided Wilcoxon rank sum test comparing entropies of two different groups from each cluster (ns: not significant, p-value > 0.05, * p-value ≤ 0.05, ** p-value ≤ 0.01, *** p-value ≤ 0.001, **** p-value ≤ 0.0001). The exact p-values are displayed in the Source Data. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR.
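The significance annotation used in this legend maps a p-value to a star label through fixed cutoffs. A minimal sketch of that mapping follows; the function name `significance_label` is a hypothetical helper, and only the cutoffs themselves are taken from the legend.

```python
def significance_label(p_value):
    """Map a p-value to the star notation given in the legend."""
    if p_value <= 0.0001:
        return "****"
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return "ns"  # not significant, p-value > 0.05


# Hypothetical p-values, one per cluster comparison.
for p in (0.2, 0.05, 0.003, 0.00002):
    print(p, significance_label(p))
```

Checking the strictest cutoff first keeps each branch a single comparison, and the boundaries are inclusive, matching the "≤" signs in the legend.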

Source data

Extended Data Fig. 10 High-resolution MS revealed the protein changes in Xbp1-/- embryos.

a, Heatmap illustrating the proteins that were differentially expressed between Xbp1−/− embryos and wild-type embryos (the colors represent the z-scores of the protein expression). A two-sided t-test was performed to compare protein levels of KO embryos with those of WT embryos (p-value ≤ 0.05, fold change ≥ 1.5). b, Volcano plot showing the differentially expressed proteins in Xbp1−/− embryos and WT embryos. The lines mark thresholds for log values of the p-value and fold change. Dots with text annotations are canonical Xbp1 targets related to the unfolded protein response (UPR). The yellow and blue dots are genes significantly upregulated in KO embryos and WT embryos, respectively. c, mESCs and Xbp1−/− mESCs grown in mESC medium for 3 days, showing no visible differences in cell morphology. The experiment was replicated three times with similar results. Scale bar, 50 μm. d, qPCR analysis of Nanog, Oct4, and Sox2 expression in mESCs and Xbp1−/− mESCs showing no significant differences (normalized by the expression level of Gapdh, n = 3 per box). A two-sided Wilcoxon rank sum test was performed to compare gene expression levels of wild-type and knockout mESCs (p-value ≥ 0.05: not significant, mean ± s.d.).
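The thresholds quoted in this legend (p-value ≤ 0.05, fold change ≥ 1.5) amount to a simple rule for sorting proteins on a volcano plot. The sketch below is illustrative only. It assumes that fold change is expressed as KO over WT and that the cutoff is applied symmetrically on the log2 scale; neither assumption is stated in the legend, and the function name `classify_protein` and the example values are hypothetical.

```python
import math


def classify_protein(fold_change, p_value, fc_cutoff=1.5, p_cutoff=0.05):
    """Classify a protein from its KO/WT fold change (assumed ratio) and p-value."""
    if p_value > p_cutoff:
        return "not significant"
    log_fc = math.log2(fold_change)
    threshold = math.log2(fc_cutoff)
    if log_fc >= threshold:
        return "up in KO"
    if log_fc <= -threshold:  # assumption: symmetric cutoff, i.e. up in WT
        return "up in WT"
    return "not significant"


# Hypothetical (fold change, p-value) pairs.
examples = {"protein_a": (2.0, 0.01), "protein_b": (0.5, 0.01),
            "protein_c": (1.2, 0.01), "protein_d": (3.0, 0.20)}
for name, (fc, p) in examples.items():
    print(name, classify_protein(fc, p))
```

Working on the log2 scale makes the two directions symmetric, so a 1.5-fold increase and a 1.5-fold decrease sit at equal distances from zero on the x axis.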

Source data

Supplementary information

Reporting Summary

Supplementary Tables

Supplementary Tables 1–7 and 9–23.

Supplementary Data

Supplementary Table 8 DEGs in each tissue of each stage. Wilcoxon’s rank-sum test (two-sided) was performed to identify DEGs and p-value adjustment was performed using Bonferroni’s correction.

Source data

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 9

Statistical source data.

Source Data Extended Data Fig. 10

Unprocessed western blots.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fei, L., Chen, H., Ma, L. et al. Systematic identification of cell-fate regulatory programs using a single-cell atlas of mouse development. Nat Genet 54, 1051–1061 (2022). https://doi.org/10.1038/s41588-022-01118-8

  • DOI: https://doi.org/10.1038/s41588-022-01118-8
