Abstract
Waddington’s epigenetic landscape is a metaphor frequently used to illustrate cell differentiation. Recent advances in single-cell genomics are altering our understanding of the Waddington landscape, yet the molecular mechanisms of cell-fate decisions remain poorly understood. We constructed a cell landscape of mouse lineage differentiation during development at the single-cell level and described both lineage-common and lineage-specific regulatory programs during cell-type maturation. We also found lineage-common regulatory programs that are broadly active during the development of invertebrates and vertebrates. In particular, we identified Xbp1 as an evolutionarily conserved regulator of cell-fate determinations across different species. We demonstrated that Xbp1 transcriptional regulation is important for the stabilization of the gene-regulatory networks for a wide range of mouse cell types. Our results offer genetic and molecular insights into cellular gene-regulatory programs and will serve as a basis for further advancing the understanding of cell-fate decisions.
Data availability
The data generated in the present study can be downloaded from the NCBI’s Gene Expression Omnibus under accession nos. GSE176063 and GSE178217. The raw and processed files of MCDA are at accession no. GSE176063. The raw and processed files of WT and Xbp1 KO embryos are at accession no. GSE178217. Processed count matrices and cell annotations are provided on the figshare website (https://figshare.com/s/340e8e7f349559f61ef6), including the development stage, tissue of origin, lineage information and cell-type annotations. We have provided separate datasets for each tissue and the merged datasets for the MCDA. We have also provided an interactive website (http://bis.zju.edu.cn/MCA) to enable public access to the data. The proteomics data are available in the Proteomics Identifications Database (PRIDE) under accession no. PXD032847. The following publicly available datasets were used in the study: Mus_musculus.GRCm38.88 genome, Mus_musculus_10090 database, AnimalTFDB 3.0 database, STRING database (v.1.1), eggNOG database (v.5.0), Ensembl v.96; the S. mediterranea dataset generated by Plass et al.12 (accession no. GSE103633); the C. elegans dataset generated by Packer et al.15 (accession no. GSE126954); the C. intestinalis dataset generated by Cao et al.16 (accession no. GSE131155); the H. vulgaris dataset generated by Siebert et al.13 (accession no. GSE121617); the D. rerio dataset generated by Li et al.42 (accession no. GSE178151); the H. sapiens dataset generated by Han et al.14 (accession no. GSE134355); and part of the M. musculus dataset (E14.5 and adult) generated by Han et al.11 (accession nos. GSE108097 and GSE134355). The mouse scATAC-seq datasets were generated by Cusanovich et al.63 (accession no. GSE111586, https://atlas.gs.washington.edu/mouse-atac/data) and Di Bella et al.64 (accession no. GSE153164), and the human scATAC-seq dataset by Domcke et al.65 (descartes.brotmanbaty.org).
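As a convenience for readers retrieving the two datasets generated in this study, the following minimal Python sketch builds the GEO landing-page address for each accession. The accession numbers are taken from the statement above; the helper name geo_url and the dictionary layout are illustrative assumptions, and the URL prefix is the standard GEO accession-viewer pattern rather than anything specified by the authors.

```python
# Accession numbers copied from the data availability statement.
GEO_ACCESSIONS = {
    "GSE176063": "MCDA raw and processed files",
    "GSE178217": "WT and Xbp1 KO embryo raw and processed files",
}

# Standard GEO accession-viewer URL prefix (not specific to this study).
GEO_BASE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc="


def geo_url(accession: str) -> str:
    """Return the GEO landing-page URL for a GSE series accession.

    Hypothetical helper: it only formats a URL and performs no download.
    """
    if not accession.startswith("GSE") or not accession[3:].isdigit():
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return GEO_BASE + accession


if __name__ == "__main__":
    for acc, description in GEO_ACCESSIONS.items():
        print(f"{acc}\t{description}\t{geo_url(acc)}")
```

The proteomics accession (PXD032847) is hosted by PRIDE, not GEO, so the helper deliberately rejects it.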
Code availability
Detailed code is available at GitHub (https://github.com/ggjlab/MCDA) and Zenodo (https://zenodo.org/record/6548256#.Yn92F-hBw2w)95.
Change history
03 August 2022
A Correction to this paper has been published: https://doi.org/10.1038/s41588-022-01176-y
References
Mathis, L. & Nicolas, J.-F. Cellular patterning of the vertebrate embryo. Trends Genet. 18, 627–635 (2002).
Heinäniemi, M. et al. Gene-pair expression signatures reveal lineage control. Nat. Methods 10, 577–583 (2013).
Moris, N., Pina, C. & Arias, A. M. Transition states and cell fate decisions in epigenetic landscapes. Nat. Rev. Genet. 17, 693–703 (2016).
Waddington, C. H. The Strategy of the Genes (Routledge, 2014).
Teschendorff, A. E. & Feinberg, A. P. Statistical mechanics meets single-cell biology. Nat. Rev. Genet. 22, 459–476 (2021).
Wagner, D. E. & Klein, A. M. Lineage tracing meets single-cell omics: opportunities and challenges. Nat. Rev. Genet. 21, 410–427 (2020).
Huang, S., Eichler, G., Bar-Yam, Y. & Ingber, D. E. Cell fates as high-dimensional attractor states of a complex gene regulatory network. Phys. Rev. Lett. 94, 128701 (2005).
Orkin, S. H. & Zon, L. I. Hematopoiesis: an evolving paradigm for stem cell biology. Cell 132, 631–644 (2008).
Davis, R. L., Weintraub, H. & Lassar, A. B. Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell 51, 987–1000 (1987).
Niwa, H. et al. Interaction between Oct3/4 and Cdx2 determines trophectoderm differentiation. Cell 123, 917–929 (2005).
Han, X. et al. Mapping the Mouse Cell Atlas by Microwell-Seq. Cell 172, 1091–1107.e17 (2018).
Plass, M. et al. Cell type atlas and lineage tree of a whole complex animal by single-cell transcriptomics. Science 360, eaaq1723 (2018).
Siebert, S. et al. Stem cell differentiation trajectories in Hydra resolved at single-cell resolution. Science 365, eaav9314 (2019).
Han, X. et al. Construction of a human cell landscape at single-cell level. Nature 581, 303–309 (2020).
Packer, J. S. et al. A lineage-resolved molecular atlas of C. elegans embryogenesis at single-cell resolution. Science 365, eaax1971 (2019).
Cao, C. et al. Comprehensive single-cell transcriptome lineages of a proto-vertebrate. Nature 571, 349–354 (2019).
Mittnenzweig, M. et al. A single-embryo, single-cell time-resolved model for mouse gastrulation. Cell https://doi.org/10.1016/j.cell.2021.04.004 (2021).
Qiu, C. et al. Systematic reconstruction of cellular trajectories across mouse embryogenesis. Nat. Genet. 54, 328–341 (2022).
Ferre, P., Decaux, J.-F., Issad, T. & Girard, J. Changes in energy metabolism during the suckling and weaning period in the newborn. Reprod. Nutr. Dev. 26, 619–631 (1986).
Stewart, B. J. et al. Spatiotemporal immune zonation of the human kidney. Science 365, 1461–1466 (2019).
Chen, L. et al. Renal-tubule epithelial cell nomenclature for single-cell RNA-sequencing studies. J. Am. Soc. Nephrol. 30, 1358–1364 (2019).
Zheng, B. et al. Prospective identification of myogenic endothelial cells in human skeletal muscle. Nat. Biotechnol. 25, 1025–1034 (2007).
Chaudhari, P., Tian, L., Deshmukh, A. & Jang, Y.-Y. Expression kinetics of hepatic progenitor markers in cellular models of human liver development recapitulating hepatocyte and biliary cell fate commitment. Exp. Biol. Med. 241, 1653–1662 (2016).
Willnow, D. et al. Quantitative lineage analysis identifies a hepato-pancreato-biliary progenitor niche. Nature 597, 87–91 (2021).
Banerji, C. R. S. et al. Cellular network entropy as the energy potential in Waddington’s differentiation landscape. Sci. Rep. 3, 3039 (2013).
Grün, D. et al. De novo prediction of stem cell identity using single-cell transcriptome data. Cell Stem Cell 19, 266–277 (2016).
Guo, M. SLICE: determining cell differentiation and lineage based on single cell entropy. Nucleic Acids Res. 45, 14 (2017).
Teschendorff, A. E. & Enver, T. Single-cell entropy for accurate estimation of differentiation potency from a cell’s transcriptome. Nat. Commun. 8, 15599 (2017).
Teschendorff, A. E., Maity, A. K., Hu, X., Weiyan, C. & Lechner, M. Ultra-fast scalable estimation of single-cell differentiation potency from scRNA-Seq data. Bioinformatics 37, 1528–1534 (2021).
Kauffman, S. Homeostasis and differentiation in random genetic control networks. Nature 224, 177–178 (1969).
Lee, T. I. & Young, R. A. Transcriptional regulation and its misregulation in disease. Cell 152, 1237–1251 (2013).
Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).
Holland, C. H. et al. Robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data. Genome Biol. 21, 36 (2020).
Parker, H. J. Mammalian embryo: Hox genes. eLS 1–15 (2020).
Cassandri, M. et al. Zinc-finger proteins in health and disease. Cell Death Discov. 3, 17071 (2017).
Eisenberg, E. & Levanon, E. Y. Human housekeeping genes, revisited. Trends Genet. 29, 569–574 (2013).
Jochum, W., Passegué, E. & Wagner, E. F. AP-1 in mouse development and tumorigenesis. Oncogene 20, 2401–2412 (2001).
Velazquez, F. N., Caputto, B. L. & Boussin, F. D. c-Fos importance for brain development. Aging 7, 1028 (2015).
Liu, J. et al. The oncogene c-Jun impedes somatic cell reprogramming. Nat. Cell Biol. 17, 856–867 (2015).
Madrigal, P. & Alasoo, K. AP-1 takes centre stage in enhancer chromatin dynamics. Trends Cell Biol. 28, 509–511 (2018).
Hinman, V. & Cary, G. The evolution of gene regulation. eLife 6, e27291 (2017).
Li, J. et al. Inferring predictive genetic models and regulatory elements by deep learning of cross-species single-cell gene expression landscapes. Preprint at https://www.researchsquare.com/article/rs-1544073/v1 (2022).
Wolf, F. A. et al. PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol. 20, 59 (2019).
Briggs, J. A. et al. The dynamics of gene expression in vertebrate embryogenesis at single-cell resolution. Science 360, eaar5780 (2018).
Chakraborty, C. & Agoramoorthy, G. Stem cells in the light of evolution. Indian J. Med. Res. 135, 813 (2012).
Saeed, R. & Deane, C. M. Protein protein interactions, evolutionary rate, abundance and age. BMC Bioinform. 7, 128 (2006).
Boon, K. et al. N‐myc enhances the expression of a large set of genes functioning in ribosome biogenesis and protein synthesis. EMBO J. 20, 1383–1393 (2001).
Van Riggelen, J., Yetil, A. & Felsher, D. W. MYC as a regulator of ribosome biogenesis and protein synthesis. Nat. Rev. Cancer 10, 301–309 (2010).
Shi, J., Teschendorff, A. E., Chen, W., Chen, L. & Li, T. Quantifying Waddington’s epigenetic landscape: a comparison of single-cell potency measures. Brief. Bioinform. https://doi.org/10.1093/bib/bby093 (2018).
Farley-Barnes, K. I. et al. Diverse regulators of human ribosome biogenesis discovered by changes in nucleolar number. Cell Rep. 22, 1923–1934 (2018).
Macaulay, I. C. et al. Single-cell RNA-sequencing reveals a continuous spectrum of differentiation in hematopoietic cells. Cell Rep. 14, 966–977 (2016).
Athanasiadis, E. I. et al. Single-cell RNA-sequencing uncovers transcriptional states and fate decisions in haematopoiesis. Nat. Commun. 8, 2045 (2017).
Krausgruber, T. et al. Structural cells are key regulators of organ-specific immune responses. Nature 583, 296–302 (2020).
Tarashansky, A. J. et al. Mapping single-cell atlases throughout Metazoa unravels cell type evolution. eLife 10, e66747 (2021).
Fischer, S., Crow, M., Harris, B. D. & Gillis, J. Scaling up reproducible research for single-cell transcriptomics using MetaNeighbor. Nat. Protoc. 16, 4031–4067 (2021).
Van de Sande, B. et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat. Protoc. 15, 2247–2276 (2020).
Wang, J. et al. Tracing cell-type evolution by cross-species comparison of cell atlases. Cell Rep. 34, 108803 (2021).
Lee, A.-H., Chu, G. C., Iwakoshi, N. N. & Glimcher, L. H. XBP-1 is required for biogenesis of cellular secretory machinery of exocrine glands. EMBO J. 24, 4368–4380 (2005).
Todd, D. J. et al. XBP1 governs late events in plasma cell differentiation and is not required for antigen-specific memory B cell development. J. Exp. Med. 206, 2151–2159 (2009).
Pramanik, J. et al. Genome-wide analyses reveal the IRE1a-XBP1 pathway promotes T helper cell differentiation by resolving secretory stress and accelerating proliferation. Genome Med. 10, 76 (2018).
Masaki, T., Yoshida, M. & Noguchi, S. Targeted disruption of CRE-binding factor TREB5 gene leads to cellular necrosis in cardiac myocytes at the embryonic stage. Biochem. Biophys. Res. Commun. 261, 350–356 (1999).
Reimold, A. M. et al. An essential role in liver development for transcription factor XBP-1. Genes Dev. 14, 152–157 (2000).
Cusanovich, D. A. et al. A single-cell atlas of in vivo mammalian chromatin accessibility. Cell 174, 1309–1324.e18 (2018).
Di Bella, D. J. et al. Molecular logic of cellular diversification in the mouse cerebral cortex. Nature 595, 554–559 (2021).
Domcke, S. et al. A human cell atlas of fetal chromatin accessibility. Science 370, eaba7612 (2020).
Viswanathan, S. R. & Daley, G. Q. Lin28: a microRNA regulator with a macro role. Cell 140, 445–449 (2010).
Yang, C.-S., Chang, K.-Y., Dang, J. & Rana, T. M. Polycomb group protein Pcgf6 acts as a master regulator to maintain embryonic stem cell identity. Sci. Rep. 6, 26899 (2016).
Pasini, D. et al. JARID2 regulates binding of the Polycomb repressive complex 2 to target genes in ES cells. Nature 464, 306–310 (2010).
Yoshida, H., Matsui, T., Yamamoto, A., Okada, T. & Mori, K. XBP1 mRNA Is Induced by ATF6 and spliced by IRE1 in response to ER stress to produce a highly active transcription factor. Cell 107, 881–891 (2001).
Adamson, B. et al. A multiplexed single-cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell 167, 1867–1882.e21 (2016).
Acosta-Alvear, D. et al. XBP1 controls diverse cell type- and condition-specific transcriptional regulatory networks. Mol. Cell 27, 53–66 (2007).
Grün, D. Revealing dynamics of gene expression variability in cell state space. Nat. Methods 17, 45–49 (2020).
Pijuan-Sala, B. et al. A single-cell molecular map of mouse gastrulation and early organogenesis. Nature https://doi.org/10.1038/s41586-019-0933-9 (2019).
Xie, Y. et al. An episomal vector-based CRISPR/Cas9 system for highly efficient gene knockout in human pluripotent stem cells. Sci. Rep. 7, 2320 (2017).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).
Park, J.-E. et al. A cell atlas of human thymic development defines T cell repertoire formation. Science 367, eaay3224 (2020).
Zhou, Q. et al. A mouse tissue transcription factor atlas. Nat. Commun. 8, 15089 (2017).
Lambert, L. J., Muzumdar, M. D., Rideout III, W. M. & Jacks, T. Basic mouse methods for clinician researchers: harnessing the mouse for biomedical research. in Basic Science Methods for Clinical Researchers 291–312 (Elsevier, 2017).
Teschendorff, A. E. & Wang, N. Improved detection of tumor suppressor events in single-cell RNA-Seq data. NPJ Genom. Med. 5, 43 (2020).
Saito, R. et al. A travel guide to Cytoscape plugins. Nat. Methods 9, 1069–1076 (2012).
Rozanski, A. et al. PlanMine 3.0—improvements to a mineable resource of flatworm biology and biodiversity. Nucleic Acids Res. 47, D812–D820 (2019).
Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).
Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).
Hu, H. et al. AnimalTFDB 3.0: a comprehensive resource for annotation and prediction of animal transcription factors. Nucleic Acids Res. 47, D33–D38 (2019).
Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics: J. Integr. Biol. 16, 284–287 (2012).
Supek, F., Bošnjak, M., Škunca, N. & Šmuc, T. REVIGO summarizes and visualizes long lists of gene ontology terms. PLoS ONE 6, e21800 (2011).
von Mering, C. et al. STRING: a database of predicted functional associations between proteins. Nucleic Acids Res. 31, 258–261 (2003).
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49, W293–W296 (2021).
Song, Y. et al. Screening of potential biomarkers for gastric cancer with diagnostic value using label-free global proteome analysis. Genom. Proteom. Bioinform. 18, 679–695 (2020).
Schep, A. N., Wu, B., Buenrostro, J. D. & Greenleaf, W. J. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods 14, 975–978 (2017).
Argemí, J. et al. X-box binding protein 1 regulates unfolded protein, acute-phase, and DNA damage responses during regeneration of mouse liver. Gastroenterology 152, 1203–1216.e15 (2017).
Khetchoumian, K. et al. Pituitary cell translation and secretory capacities are enhanced cell autonomously by the transcription factor Creb3l2. Nat. Commun. 10, 3960 (2019).
Fei, L. ggjlab/MCDA: v1.0.0. Zenodo https://doi.org/10.5281/zenodo.6423564 (2022).
Acknowledgements
G.G. is a participant of the Human Cell Atlas Project. We thank J. Wu, H. Wu, J. Liu, D. Wang, S. Xiong, Y. Huang and P. Xu for support on the project. We thank: G-BIO (Hangzhou), BGI (Shenzhen) and CNGB (Shenzhen) for supporting this project; Vazyme (Nanjing) for supplying the customized enzymes in the study; the Core Facilities of Zhejiang University School of Medicine for technical support; the Center of Cryo-Electron Microscopy at Zhejiang University for computational support; PTM Biolab (Hangzhou) for performing the LC–MS proteomic analysis; and Gempharmatech (Nanjing) for producing the KO mice. Funding was provided by: National Natural Science Foundation of China (grant nos. 31930028 to G.G., 31871473 to X.H., 31922049 to X.H., 91842301 to G.G., 32001068 to H.C. and 32000461 to J.W.), the Funds for Creative Research Groups of China (grant no. T2121004 to G.G.), National Key Research and Development Program (grant nos. 2018YFA0107804 to G.G., 2018YFA0107801 to X.H. and 2018YFA0800503 to G.G.); and Fundamental Research Funds for the Central Universities (to G.G.).
Author information
Contributions
G.G. conceived the project. X.H., H.C., X.F., Z.Z., R.W. and L.F. performed the experiments. L.F., L.M., W.E., H.S., J.W., X.W., C.Y. and Y.M. performed the single-cell data processing, clustering analyses, gene-regulated analyses and cell-type annotation. G.G., L.F., H.C., L.M., W.E. and X.F. wrote the paper. M.J., D.J. and T.Z. performed the sequencing experiments. L.F., L.M. and W.E. preserved and made available the data, code and materials on publication. G.G., X.H., H.C. and J.W. acquired the funds.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Malte Spielmann and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Construction of the MCDA.
a, Hierarchical trees showing the relationship between 95 cell types in MCDA, colored by lineage. b, t-SNE visualization of 520,801 single cells from seven developmental stages of mice, colored by lineage. Panels a and b share the same lineage color legend. c, t-SNE visualization of 520,801 single cells from different developmental stages of mice, colored by tissue. d, Heatmaps showing the number of differentially expressed genes (DEGs) in each developmental stage across the ten tissues of mice. DEGs between two stages of cells were identified using a Wilcoxon rank sum test. e, Summary of the GO enrichment analysis performed on the DEGs in each developmental stage. f, Visualization of the top 10 principal components of PCA in MCDA. Colors represent tissues, as in c. g, Lollipop chart displaying the gene expression variance explained by residuals (that is, biological and technical noise) or experimental factors such as tissue, stage, gender, and their respective combinations. Items like “tissue and gender” are variances explained by interactions of two factors instead of the union of two factors. h, UMAP visualization of 57,118 single cells in the kidneys at 7 different time points, colored by stage. i, Summary of the GO enrichment analysis performed on the DEGs in the kidneys across different stages. Red marks the GO terms related to renal physiological functions.
Extended Data Fig. 2 t-SNE maps for examples of analyzed tissues in MCDA.
t-SNE maps for single-cell data from brain at P0 (a, n = 9,265 cells), P10 (b, n = 6,100 cells), P21 (c, n = 4,433 cells) stages, heart at P0 (d, n = 3,948 cells), P10 (e, n = 5,383 cells), P21 (f, n = 4,054 cells) stages, intestine at P0 (g, n = 9,101 cells), P10 (h, n = 17,909 cells), P21 (i, n = 9,365 cells) stages, kidney at P0 (j, n = 13,155 cells), P10 (k, n = 12,129 cells), P21 (l, n = 5,700 cells) stages, liver at P0 (m, n = 9,980 cells), P10 (n, n = 9,259 cells), P21 (o, n = 5,867 cells) stages, lung at P0 (p, n = 5,906 cells), P10 (q, n = 11,314 cells), P21 (r, n = 6,391 cells) stages, and pancreas at P0 (s, n = 5,639 cells), P10 (t, n = 11,007 cells) stages.
Extended Data Fig. 3 t-SNE maps for examples of analyzed tissues in MCDA.
t-SNE maps for single-cell data from pancreas at the P21 stage (a, n = 4,858 cells), stomach at P0 (b, n = 4,073 cells), P10 (c, n = 22,599 cells), P21 (d, n = 9,945 cells) stages, testes at P0 (e, n = 9,034 cells), P10 (f, n = 15,808 cells), P21 (g, n = 9,095 cells) stages, uterus at P0 (h, n = 4,561 cells), P10 (i, n = 4,841 cells), P21 (j, n = 9,077 cells) stages, and embryo at E10.5 (k, n = 26,551 cells) and E12.5 (l, n = 72,792 cells) stages.
Extended Data Fig. 4 Examples of novel cell populations.
a, Feature plots in the t-SNE map of P10 lung (n = 11,314 cells). Cells are colored according to the expression of the indicated marker genes or two genes. The red boxes magnify the co-expressed cell types in the tissues. b, Immunofluorescence assay for the club cell marker gene Scgb1a1 (green) and goblet cell marker gene Tff2 (yellow) in P10 lung. The red boxes indicate the co-expressed locations. The experiment was replicated three times with similar results. Scale bar, 20 μm. c, d, Left: feature plots of Afp in the t-SNE map of P0 pancreas (c, n = 5,639 cells) and P10 pancreas (d, n = 11,007 cells). Cells are colored according to the expression of Afp. Right: immunofluorescence assay for the hepatocyte marker gene Afp (green) in P0 (c) pancreas and P10 (d) pancreas. The experiment was replicated three times with similar results. Scale bar, 20 μm. e, Heatmap showing the differentially expressed genes between liver hepatocytes and pancreas hepatocyte-like cells at the P0 stage. A Wilcoxon rank-sum test (two-sided) was performed to identify differentially expressed genes and p-value adjustment was performed using Bonferroni correction (adjusted p-value < 0.05, fold change ≥ 2). f, Heatmap showing the differentially expressed genes between liver hepatocytes and pancreas hepatocyte-like cells at the P10 stage. A Wilcoxon rank-sum test (two-sided) was performed to identify differentially expressed genes and p-value adjustment was performed using Bonferroni correction (adjusted p-value < 0.05, fold change > 2).
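The selection rule stated for panels e and f (Bonferroni-adjusted p-value below 0.05 plus a fold-change cutoff of 2) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the gene labels and the input numbers are hypothetical, and the p-values are assumed to come from a two-sided Wilcoxon rank-sum test computed elsewhere.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def select_degs(genes, p_values, fold_changes, alpha=0.05, min_fc=2.0):
    """Keep genes with adjusted p-value < alpha and fold change >= min_fc.

    The >= comparison follows the panel e wording; panel f states a strict
    inequality, which would need `fc > min_fc` instead.
    """
    adjusted = bonferroni(p_values)
    return [
        gene
        for gene, q, fc in zip(genes, adjusted, fold_changes)
        if q < alpha and fc >= min_fc
    ]


if __name__ == "__main__":
    # Hypothetical toy input, not data from the study.
    genes = ["geneA", "geneB", "geneC"]
    p_values = [0.001, 0.02, 0.0001]
    fold_changes = [3.0, 4.0, 1.5]
    print(select_degs(genes, p_values, fold_changes))
```

In this toy input only geneA passes both filters: geneB fails after adjustment (0.02 × 3 = 0.06) and geneC fails the fold-change cutoff.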
Extended Data Fig. 5 Entropy estimations of the MCDA using.
a, Entropy measurement of cells in MCDA using the SLICE method. The color represents the stage. P-values are from a two-sided Wilcoxon rank sum test comparing entropies of two different development stages (n = 60,065 cells, ns: not significant, p-value > 0.05, * p-value ≤ 0.05, ** p-value ≤ 0.01, *** p-value ≤ 0.001, **** p-value ≤ 0.0001). The exact p-values are displayed in the Source Data. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR. The same statistical analysis was performed for Extended Data Fig. 5a-d. b, Entropy measurement of each lineage in MCDA using the SLICE method. The color represents the stage (epithelial: n = 13,642 cells, neuron: n = 3,638 cells, immune: n = 15,719 cells, muscle: n = 2,592 cells, stromal: n = 8,541 cells, endothelial: n = 4,528 cells, other: n = 2,626 cells, erythroid: n = cells, proliferating: n = 3,442 cells, secretory: n = 2,892 cells, germline: n = 5,480 cells). c, Entropy measurement of cells in MCDA using the StemID method (n = 60,065 cells). The color represents the stage. d, Entropy measurement of each lineage in MCDA using the StemID method. The color represents the stage (epithelial: n = 13,642 cells, neuron: n = 3,638 cells, immune: n = 15,719 cells, muscle: n = 2,592 cells, stromal: n = 8,541 cells, endothelial: n = 4,528 cells, other: n = 2,626 cells, erythroid: n = cells, proliferating: n = 3,442 cells, secretory: n = 2,892 cells, germline: n = 5,480 cells). e, Boxplots displaying the sensitivity, specificity, FPR (False Positive Rate), and PRAUC (Precision-Recall Area Under Curve) of two methods with different inputs to detect tissue-specific TFs in MCDA (n = 9 tissues per box). Methods represented are running VIPER-DOROTHEA with pseudo cells (pseudo_VIPER-DOROTHEA) or single cells (single_VIPER-DOROTHEA), and running SCENIC with pseudo cells (pseudo_SCENIC) or single cells (single_SCENIC).
The union of the two methods with single cells (single_union (ABC)) is the union of collection ABC, and the intersection of the two methods with single cells (single_intersection (A)) is collection A. Box plots: center line, median; boxes, first and third quartiles of the distribution; points, tissues in MCDA. The results indicate that SCENIC with single-cell datasets performs better in specificity and PRAUC than VIPER-DOROTHEA. The union of the two methods achieves over 75% sensitivity in identifying regulatory programs, while the intersection of the two methods achieves the highest specificity. f, Heatmap of aggregated module activities of TFs clustered by fuzzy c-means showing variation by stage and lineage from VIPER-DOROTHEA. g, Boxplot showing the module activity scores in module 14 (n = 56 TFs) and module 15 (n = 36 TFs) per lineage per stage in SCENIC. Red lines mark the zero line. Colors from blue to yellow represent the 7 development stages from E10.5 to adult stage. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR. h, Venn diagrams of the numbers of overlapping genes between housekeeping TFs and commonly upregulated TFs (TFs in module 14, collection ABC) in MCDA. i, Heatmap showing commonly upregulated TFs (TFs in module 14, collection ABC) with regard to expression levels in MCDA. The color displays the Spearman correlation between aggregated TF expression levels in tissue-lineage against development stages (labeled as 1 to 7 to represent E14.5 to adult). Red blocks indicate TFs that display upregulated expression patterns in the specific lineages of tissues.
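Two conventions recur in these legends: the mapping from p-values to significance labels, and whiskers that extend to the most extreme data points within 1.5 × IQR of the quartiles. The following Python sketch shows one way to express both. It is an illustration only; the function names are hypothetical, and the quartile method ("inclusive" linear interpolation) is an assumption, since the legends do not say how quartiles were computed.

```python
import statistics


def significance_label(p):
    """Map a p-value to the labels defined in the legend (ns, *, **, ***, ****)."""
    if p > 0.05:
        return "ns"
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    return "*"


def whisker_ends(values):
    """Return the lowest and highest data points within 1.5 x IQR of the quartiles.

    Quartiles use statistics.quantiles with method="inclusive" (an assumption).
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return min(inside), max(inside)


if __name__ == "__main__":
    # Hypothetical toy values, not data from the study.
    print(significance_label(0.03))
    print(whisker_ends([1, 2, 3, 4, 5, 100]))
```

With the toy values, the point 100 lies beyond the upper fence and would be drawn as an outlier, so the whiskers end at 1 and 5.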
Extended Data Fig. 6 Analysis of the developmental branch across species.
a, Circos plot showing the subphyla, species, tissues/lineages, and time points of the single-cell dataset used in the cross-species analysis. b–d, Radial network plots showing the inferred relationships among cell types of invertebrates (b, H. vulgaris; c, C. elegans; d, S. mediterranea). Dots represent cell types, colored by lineage. e, Sankey plot showing the inferred relationships among cell types in fetal and adult human lungs.
Extended Data Fig. 7 Cross-species analysis of commonly upregulated and downregulated genes.
a, b, Entropy measurement of each lineage in H. sapiens (a) and D. rerio (b) using the CCAT method (H. sapiens: immune, n = 26,976 cells, stromal, n = 11,278 cells, muscle, n = 5,450 cells, epithelial, n = 20,347 cells, erythroid, n = 1,897 cells, neuron, n = 4,659 cells, endothelial, n = 7,475 cells, proliferating, n = 3,421 cells, secretory, n = 3,708 cells; D. rerio: epithelial, n = 36,243 cells, stromal, n = 8,801 cells, erythroid, n = 693 cells, others, n = 3,454 cells, muscle, n = 4,140 cells, neuron, n = 10,363 cells, immune, n = 10,104 cells). The color represents the stage. P-values are from a two-sided Wilcoxon rank sum test comparing entropies of two different development stages. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR. c, d, Venn plots showing the downregulated (c) and upregulated (d) genes in 7 species (homologous genes of humans, p-adj < 0.1). e, Bar plot showing the numbers of conserved upregulated and conserved downregulated genes per species, which were homologous genes of humans. f, Boxplots showing the log10 number of protein–protein interactions (PPIs) of commonly upregulated genes (at least 3 species, n = 59), commonly downregulated genes (at least 3 species, n = 524), other conserved genes (at least 3 species and homologous to human genes, n = 12,543), and other genes (n = 17,839). P-values are from a two-sided Wilcoxon rank sum test comparing log10 PPI numbers of two different gene types. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR. g, Bar plot showing the gene composition of conserved upregulated genes (at least 3 species, n = 59 genes), conserved downregulated genes (at least 3 species, n = 524 genes), other conserved genes (in at least 3 species and homologous to human genes, n = 12,543 genes), and other genes (n = 17,839 genes).
Gene categories are colored by mean values of log10 PPI number (blue: fewer PPIs, red: more PPIs). h, i, Bubble plots showing the GO terms of commonly downregulated (h) and upregulated (i) genes. The bubble color indicates the proportion of the selected GO term in the EBI GOA database for human. A higher value implies a more general term; a lower value implies a more specific one. The bubble size indicates the frequency of the GO term in the underlying GOA database. A hypergeometric test was performed to identify significant GO terms and the Benjamini–Hochberg correction was used to adjust p-values. j, Heatmap showing the cell type frequencies of commonly upregulated genes in 7 species.
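For readers unfamiliar with the Benjamini–Hochberg correction applied to the GO-term p-values in panels h and i, a minimal pure-Python sketch of the standard step-up procedure follows. It is not the authors' implementation: the function name and the input p-values are hypothetical, and the hypergeometric test that would produce the raw p-values is not shown.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum is taken from the largest rank downwards so that adjusted values
    never decrease as the raw p-values increase.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for four GO terms, not data from the study.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw p = {p:.3f}  adjusted p = {q:.3f}")
```

Unlike the Bonferroni adjustment, which multiplies every p-value by the number of tests, this procedure controls the false discovery rate and is less conservative when many terms are tested.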
Extended Data Fig. 8 Lineage-specific regulators among different species.
a, Network plot showing the reliable and biologically plausible matches of lineages from 7 species using MetaNeighbor and SAMap (sm: S. mediterranea, ce: C. elegans, hy: H. vulgaris, ci: C. intestinalis, ze: D. rerio, mo: M. musculus, hu: H. sapiens; the abbreviations are the same throughout Extended Data Fig. 8). b, UMAP showing the combination projection of seven species based on pseudo-bulk cells, colored by species. c, UMAP showing the combination projection based on pseudo-bulk cells, colored by meta-lineages. d–j, Heatmaps showing the sequence similarities (log values) of development-related lineage-specific TFs within the meta-lineage across species: stromal (d), endothelial (e), muscle (f), stem/germline (g), neural (h), immune (i), and epithelial (j).
Extended Data Fig. 9 scRNA-seq revealed the changes in Xbp1-/- embryos.
a, Western blot for the knockout experiment. The molecular weight markers are labeled. The experiment was replicated three times with similar results. b, An IGV view of mapped reads in the Xbp1 gene in the sequencing data of the WT and KO embryos. The left panel shows the entire Xbp1 gene. The right panel shows the region marked in red, which covers exon 1 and exon 2 of Xbp1. The exon 2 region shows no read coverage, which indicates that exon 2 (97 bp) has been completely disrupted in KO embryos. The blue lines link the different parts of reads that, by definition, map on several exons. The left and right genome browser tracks share the same y axis. c, Xbp1-/- embryos at E12.5. The arrows indicate dead embryos. d, Scatter plot showing the cell composition proportions of differential cell types between KO and WT embryos at E12.5 (WT: n = 4, KO: n = 5, FDR < 0.01). e–f, Entropy measurement of each cluster in Fig. 6b using the StemID (e, n = 93,246 cells) and SLICE (f, n = 93,246 cells) methods. The two panels share the same x-axis labels. P-values are from a two-sided Wilcoxon rank sum test comparing entropies of two different groups from each cluster (ns: not significant, p-value > 0.05, * p-value ≤ 0.05, ** p-value ≤ 0.01, *** p-value ≤ 0.001, **** p-value ≤ 0.0001). The exact p-values are displayed in the Source Data. Box plots: center line, median; boxes, first and third quartiles of the distribution; whiskers, highest and lowest data points within 1.5 × IQR.
Extended Data Fig. 10 High-resolution MS revealed the protein changes in Xbp1-/- embryos.
a, Heatmap illustrating the proteins that were differentially expressed between Xbp1−/− embryos and wild-type embryos (the colors represent the z-scores of protein expression). A two-sided t-test was performed to compare protein levels in KO embryos with those in WT embryos (p-value ≤ 0.05, fold change ≥ 1.5). b, Volcano plot showing the differentially expressed proteins in Xbp1−/− embryos and WT embryos. The lines mark the thresholds for the log values of the p-value and fold change. The labeled dots are canonical Xbp1 targets related to the unfolded protein response (UPR). The yellow and blue dots mark genes significantly upregulated in KO embryos and WT embryos, respectively. c, mESCs and Xbp1−/− mESCs grown in mESC medium for 3 days, showing no visible differences in cell morphology. The experiment was replicated three times with similar results. Scale bar, 50 μm. d, qPCR analysis of Nanog, Oct4 and Sox2 expression in mESCs and Xbp1−/− mESCs, showing no significant differences (normalized to the expression level of Gapdh, n = 3 per box). A two-sided Wilcoxon rank-sum test was performed to compare gene expression levels between wild-type and knockout mESCs (p-value ≥ 0.05: not significant, mean ± s.d.).
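The thresholds stated in panels a and b (p-value ≤ 0.05, fold change ≥ 1.5) can be illustrated with a short sketch. Several details here are assumptions rather than facts from the legend: the fold change is taken as the KO/WT ratio of mean abundances, the cutoff is applied symmetrically (≥ 1.5 for KO-upregulated, ≤ 1/1.5 for WT-upregulated), and the volcano axes use log2 and −log10. The function names are hypothetical.

```python
import math

P_CUTOFF = 0.05   # from the legend
FC_CUTOFF = 1.5   # from the legend; symmetric use is an assumption


def classify_protein(mean_ko: float, mean_wt: float, p_value: float) -> str:
    """Label a protein as 'up in KO', 'up in WT' or 'not significant'."""
    if mean_ko <= 0 or mean_wt <= 0:
        raise ValueError("mean abundances must be positive")
    ratio = mean_ko / mean_wt  # assumed definition of fold change
    if p_value > P_CUTOFF:
        return "not significant"
    if ratio >= FC_CUTOFF:
        return "up in KO"
    if ratio <= 1.0 / FC_CUTOFF:
        return "up in WT"
    return "not significant"


def volcano_point(mean_ko: float, mean_wt: float, p_value: float) -> tuple[float, float]:
    """Return (log2 fold change, -log10 p-value); the log bases are assumptions."""
    return math.log2(mean_ko / mean_wt), -math.log10(p_value)
```

Under these assumptions, a protein with mean abundance 4.0 in KO and 2.0 in WT at p = 0.01 would be classified as `up in KO` and plotted near (1, 2).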
Supplementary information
Supplementary Tables
Supplementary Tables 1–7 and 9–23.
Supplementary Data
Supplementary Table 8: DEGs in each tissue at each stage. A two-sided Wilcoxon rank-sum test was performed to identify DEGs, and p-values were adjusted using Bonferroni's correction.
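As a reminder of what Bonferroni's correction does, the sketch below multiplies each raw p-value by the number of tests and caps the result at 1. It is a generic illustration of the standard formula, not the authors' pipeline; the function name `bonferroni_adjust` is hypothetical.

```python
def bonferroni_adjust(p_values: list[float]) -> list[float]:
    """Return Bonferroni-adjusted p-values: min(1, p * m) for m tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With three tests, raw p-values of 0.01, 0.04 and 0.5 become approximately 0.03, 0.12 and 1.0, so only the first would remain below a 0.05 cutoff.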
Source data
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Fig. 5
Statistical source data.
Source Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 9
Statistical source data.
Source Data Extended Data Fig. 10
Unprocessed western blots.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Fei, L., Chen, H., Ma, L. et al. Systematic identification of cell-fate regulatory programs using a single-cell atlas of mouse development. Nat Genet 54, 1051–1061 (2022). https://doi.org/10.1038/s41588-022-01118-8