Deep learning of cross-species single-cell landscapes identifies conserved regulatory programs underlying cell types

Abstract

Despite extensive efforts to generate and analyze reference genomes, genetic models to predict gene regulation and cell fate decisions are lacking for most species. Here, we generated whole-body single-cell transcriptomic landscapes of zebrafish, Drosophila and earthworm. We then integrated cell landscapes from eight representative metazoan species to study gene regulation across evolution. Using these uniformly constructed cross-species landscapes, we developed a deep-learning-based strategy, Nvwa, to predict gene expression and identify regulatory sequences at the single-cell level. We systematically compared cell-type-specific transcription factors to reveal conserved genetic regulation in vertebrates and invertebrates. Our work provides a valuable resource and offers a new strategy for studying regulatory grammar in diverse biological systems.

Fig. 1: Zebrafish and Drosophila cell landscapes constructed using Microwell-seq.
Fig. 2: Cross-species analysis of eight species.
Fig. 3: Comparison of regulatory TFs within and across species using scRNA-seq data.
Fig. 4: Application and interpretation of the deep-learning model framework.
Fig. 5: Interpretation of the deep-learning model framework.

Data availability

Raw data files for the RNA sequencing analysis reported in this paper have been deposited in the NCBI Gene Expression Omnibus under accession number GSE178151. Digital expression matrices are available at https://figshare.com/s/ecc05b1051fb5678fd3e. Nvwa data can be accessed at http://bis.zju.edu.cn/nvwa/.

Code availability

The source code for reproducing our analysis and running and training the Nvwa models is available at GitHub (https://github.com/JiaqiLiZju/Nvwa/) and Zenodo (https://zenodo.org/record/6806748) (JiaqiLiZju/Nvwa: release v.1.0, 2022).

References

  1. Han, X. et al. Construction of a human cell landscape at single-cell level. Nature 581, 303–309 (2020).

  2. Tabula Sapiens, C. et al. The Tabula Sapiens: a multiple-organ, single-cell transcriptomic atlas of humans. Science 376, eabl4896 (2022).

  3. Han, X. et al. Mapping the Mouse Cell Atlas by Microwell-Seq. Cell 173, 1307 (2018).

  4. Tabula Muris, C. et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature 562, 367–372 (2018).

  5. Jiang, M. M. et al. Characterization of the zebrafish cell landscape at single-cell resolution. Front. Cell Dev. Biol. 9, 743421 (2021).

  6. Cao, J. Y. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667 (2017).

  7. Cao, C. et al. Comprehensive single-cell transcriptome lineages of a proto-vertebrate. Nature 571, 349–354 (2019).

  8. Fincher, C. T., Wurtzel, O., de Hoog, T., Kravarik, K. M. & Reddien, P. W. Cell type transcriptome atlas for the planarian Schmidtea mediterranea. Science 360, eaaq1736 (2018).

  9. Sebe-Pedros, A. et al. Cnidarian cell type diversity and regulation revealed by whole-organism single-cell RNA-seq. Cell 173, 1520–1534.e20 (2018).

  10. Li, H. et al. Fly Cell Atlas: A single-nucleus transcriptomic atlas of the adult fruit fly. Science 375, eabk2432 (2022).

  11. Arendt, D. et al. The origin and evolution of cell types. Nat. Rev. Genet. 17, 744–757 (2016).

  12. Wang, J. et al. Tracing cell-type evolution by cross-species comparison of cell atlases. Cell Rep. 34, 108803 (2021).

  13. Agarwal, V. & Shendure, J. Predicting mRNA abundance directly from genomic sequence using deep convolutional neural networks. Cell Rep. 31, 107663 (2020).

  14. Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).

  15. Kelley, D. R., Snoek, J. & Rinn, J. L. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. 26, 990–999 (2016).

  16. Avsec, Z. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).

  17. Maslova, A. et al. Deep learning of immune cell differentiation. Proc. Natl Acad. Sci. USA 117, 25655–25666 (2020).

  18. Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).

  19. Buchon, N., Silverman, N. & Cherry, S. Immunity in Drosophila melanogaster–from microbial recognition to whole-organism physiology. Nat. Rev. Immunol. 14, 796–810 (2014).

  20. Krausgruber, T. et al. Structural cells are key regulators of organ-specific immune responses. Nature 583, 296–302 (2020).

  21. van Dijk, D. et al. Recovering gene interactions from single-cell data using data diffusion. Cell 174, 716 (2018).

  22. Tarashansky, A. J. et al. Mapping single-cell atlases throughout Metazoa unravels cell type evolution. eLife 10, e66747 (2021).

  23. Rui, L., Schmitz, R., Ceribelli, M. & Staudt, L. M. Malignant pirates of the immune system. Nat. Immunol. 12, 933–940 (2011).

  24. Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).

  25. Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).

  26. de Souza, N. The ENCODE project. Nat. Methods 9, 1046 (2012).

  27. Srivastava, A. K. & Schlessinger, D. Structure and organization of ribosomal DNA. Biochimie 73, 631–638 (1991).

  28. Suzuki, H., Moriwaki, K. & Sakurai, S. Sequences and evolutionary analysis of mouse 5S rDNAs. Mol. Biol. Evol. 11, 704–710 (1994).

  29. Zentner, G. E., Balow, S. A. & Scacheri, P. C. Genomic characterization of the mouse ribosomal DNA locus. G3 4, 243–254 (2014).

  30. Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).

  31. Gupta, S., Stamatoyannopoulos, J. A., Bailey, T. L. & Noble, W. S. Quantifying similarity between motifs. Genome Biol. 8, R24 (2007).

  32. Cusanovich, D. A. et al. A single-cell atlas of in vivo mammalian chromatin accessibility. Cell 174, 1309–1324.e18 (2018).

  33. Hannenhalli, S. & Kaestner, K. H. The evolution of Fox genes and their role in development and disease. Nat. Rev. Genet. 10, 233–240 (2009).

  34. Shafer, M. E. R. Cross-species analysis of single-cell transcriptomic data. Front. Cell Dev. Biol. 7, 175 (2019).

  35. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).

  36. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  37. Zheng, G. X. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).

  38. McGinnis, C. S., Murrow, L. M. & Gartner, Z. J. DoubletFinder: doublet detection in single-cell RNA sequencing data using artificial nearest neighbors. Cell Syst. 8, 329–337.e4 (2019).

  39. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).

  40. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902.e21 (2019).

  41. Shao, Y. et al. Genome and single-cell RNA-sequencing of the earthworm Eisenia andrei identifies cellular mechanisms underlying regeneration. Nat. Commun. 11, 2656 (2020).

  42. Rozanski, A. et al. PlanMine 3.0–improvements to a mineable resource of flatworm biology and biodiversity. Nucleic Acids Res. 47, D812–D820 (2019).

  43. Satou, Y., Kawashima, T., Shoguchi, E., Nakayama, A. & Satoh, N. An integrated database of the ascidian, Ciona intestinalis: towards functional genomics. Zool. Sci. 22, 837–843 (2005).

  44. Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).

  45. Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).

  46. Crow, M., Paul, A., Ballouz, S., Huang, Z. J. & Gillis, J. Characterizing the replicability of cell types defined by single cell RNA-sequencing data using MetaNeighbor. Nat. Commun. 9, 884 (2018).

  47. Fischer, S., Crow, M., Harris, B. D. & Gillis, J. Scaling up reproducible research for single-cell transcriptomics using MetaNeighbor. Nat. Protoc. 16, 4031–4067 (2021).

  48. Hu, H. et al. AnimalTFDB 3.0: a comprehensive resource for annotation and prediction of animal transcription factors. Nucleic Acids Res. 47, D33–D38 (2019).

  49. dos Santos, G. et al. FlyBase: introduction of the Drosophila melanogaster Release 6 reference genome assembly and large-scale migration of genome annotations. Nucleic Acids Res. 43, D690–D697 (2015).

  50. Dubaj Price, M. & Hurd, D. D. WormBase: a model organism database. Med. Ref. Serv. Q. 38, 70–80 (2019).

  51. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  52. Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).

  53. Fornes, O. et al. JASPAR 2020: update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 48, D87–D92 (2020).

  54. Harris, T. W. et al. WormBase: a modern model organism information resource. Nucleic Acids Res. 48, D762–D767 (2020).

  55. Zhao, H. et al. CrossMap: a versatile tool for coordinate conversion between genome assemblies. Bioinformatics 30, 1006–1007 (2014).

  56. Ramirez, F., Dundar, F., Diehl, S., Gruning, B. A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).

  57. Yu, G., Wang, L. G. & He, Q. Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics 31, 2382–2383 (2015).

  58. Ramirez, F. et al. High-resolution TADs reveal DNA sequences underlying genome organization in flies. Nat. Commun. 9, 189 (2018).

  59. Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. Preprint at https://doi.org/10.48550/arXiv.1312.6034 (2014).

  60. Weirauch, M. T. et al. Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158, 1431–1443 (2014).


Acknowledgements

G.G. is a participant in the Human Cell Atlas Project. We thank M. Chen, Y. Zhou, F. Gu, D. Wang, P. Xu, C. Li, K. Li and H. Wu for support on the project. We thank M20 (Hangzhou), G-BIO (Hangzhou), BGI (Shenzhen) and CNGB (Shenzhen) for supporting the sequencing experiments; and Vazyme (Nanjing) for supplying the customized enzymes used in the study. We also thank the core facility platform of Zhejiang University School of Medicine and the Center of Cryo-Electron Microscopy at Zhejiang University for computational resources, and the core facilities of Zhejiang University Medical Center and the Liangzhu Laboratory for technical support. This work was supported by National Natural Science Foundation of China grants 31930028 to G.G., 31922049 to X.H., 91842301 to G.G., 32000461 to J.W. and 62088101 to J.C.; National Key Research and Development Program grants 2018YFA0800503 to G.G., 2018YFA0107804 to G.G. and 2018YFA0107801 to X.H.; Fundamental Research Funds for the Central Universities (G.G.); and Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare.

Author information

Contributions

G.G., X.H. and J. Wang conceived the study. G.G. and X.H. supervised the study. Jiaqi Li designed the model. X.H., R.W., M.J., X.H., H.C., Xinru Wang, Xueyi Wang, Y.L., D.J. and T.Z. performed all the experiments. Jiaqi Li, J. Wang, P.Z., Y.M., Z.S., L.F., L.M., W.E., Y.F., H.W., D.L., H.W., Jingyu Li, Q.G. and C.Y. performed all computational analyses. H.L., J.M. and K.L. helped with the cell-type annotation. J. Wu, S.H. and J.C. guided model design and parameter optimization. G.G., J. Wang, Jiaqi Li and P.Z. wrote the initial draft of the manuscript. All authors participated in discussion of results and manuscript editing.

Corresponding authors

Correspondence to Jingjing Wang, Xiaoping Han or Guoji Guo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Zebrafish cell landscape was constructed using Microwell-seq.

(a-b) t-SNE visualization of 635,228 single cells from whole bodies across three stages of zebrafish, colored by stage (a) and cell lineage (b). (c) Heatmap showing the scaled average expression levels of zebrafish cell type-specific marker genes (left), and relative gene expression of representative cell type-specific markers for each cell type overlaid on t-SNE plots (right). (d) Heatmap showing the correspondence between zebrafish cell landscape in this study (row) and tissue-specific zebrafish dataset from Jiang et al., 2021 (column). Blue refers to a mean AUROC greater than 0.9. (e-f) t-SNE showing 24 subclusters from zebrafish C1 (neuron) (e), and violin plot showing the scale normalized expression levels of representative markers for each subcluster (f).

Extended Data Fig. 2 Drosophila cell landscape was constructed using Microwell-seq.

(a-b) t-SNE visualization of 276,706 single cells from whole bodies across two stages of Drosophila, colored by stage (a) and cell lineage (b). (c) Heatmap showing the scaled average expression levels of Drosophila cell type-specific marker genes (left), and relative gene expression of representative cell type-specific markers for each cluster overlaid on t-SNE plots (right). (d) Heatmap showing the correspondence between the Drosophila cell landscape in this study (row) and the tissue-specific fly cell atlas constructed by Li et al., 2021 (column). Blue refers to a mean AUROC greater than 0.9. (e-f) t-SNE showing 21 subclusters from Drosophila C1 (central nerve cell) (e), and violin plot showing the scale normalized expression levels of representative markers for each subcluster (f).

Extended Data Fig. 3 Earthworm cell landscape was constructed using Microwell-seq.

(a) t-SNE visualization of 95,020 single cells from whole bodies of earthworm, colored by cell type (left) and cell lineage (right). (b) Heatmap showing the scaled average expression levels of earthworm cell type-specific marker genes (left), and relative gene expression of representative cell type-specific markers for each cluster overlaid on t-SNE plots (right). (c) The hierarchical clustering tree (right) showing the similarity among the 62 earthworm cell types, and the histogram plot (left) showing the subtypes of each cell type. The similarity refers to the AUROC score from MetaNeighbor analysis. (d-e) t-SNE showing 11 subclusters from earthworm C24 (neuron) (d), and violin plot showing the scale normalized expression levels of representative markers for each subcluster (e).

Extended Data Fig. 4 Characteristics of immune-related structural cells in zebrafish, Drosophila, and earthworm.

(a) Ridge plots showing the scale normalized expression levels of known zebrafish epithelial markers and immune-related markers in zebrafish epithelial cells. (b) Ridge plots showing the scale normalized expression levels of known Drosophila MT markers and immune-related markers in Drosophila MT cells (C34 and C69). (c) Ridge plots showing the scale normalized expression levels of two earthworm immune-related markers, evm.TU.Chr04.2499 and evm.TU.ctg2984.2, in earthworm coelomocytes (earthworm C0, C5, C6, C16, C19, C30, C32, C35, and C37). (d-e) t-SNE visualization of 11 subtypes of MT cells from Drosophila C34 and C69 (d), and violin plot showing the scale normalized expression levels of representative markers for each subtype (e).

Extended Data Fig. 5 Cross-species analysis among eight species.

(a) Sankey diagrams showing homologous cell-type pairs between human and mouse obtained from SAMap analyses based on different datasets. (b) Venn diagram showing the number of overlapping homologous cell-type pairs obtained from SAMap analyses based on different datasets. (c-e) Summary of homologous cell types among human, mouse, zebrafish, Ciona, Drosophila, earthworm, C. elegans, and planarian, including the number of homologous cell-type pairs (c), alignment score (d), and number of enriched gene pairs (e). Boxplots represent the median, 25th percentile, and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. N of cell types: human: 434, mouse: 432, zebrafish: 378, Ciona: 291, Drosophila: 288, earthworm: 110, C. elegans: 321, and planarians: 173.
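The boxplot convention used in these captions (median, 25th and 75th percentiles, whiskers at 1.5 times the interquartile range) can be illustrated with a minimal Python sketch. The function name `boxplot_stats`, the linear-interpolation quantile method, and the choice to end each whisker at the most extreme observation inside the fence are assumptions for illustration; the captions do not specify these details.

```python
from statistics import quantiles


def boxplot_stats(values, whisker=1.5):
    """Summarize values the way a 1.5 x IQR box plot would (illustrative)."""
    data = sorted(values)
    # Quartiles by linear interpolation between order statistics
    # (assumption: the source does not state the quantile method).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }
```

Calling `boxplot_stats` on a list of per-cell-type scores returns the five numbers a plotting library would draw, plus any points beyond the whiskers.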

Extended Data Fig. 6 Screening conserved regulators underlying cell lineages.

(a) The bar chart showing the percentage of cell lineage-specific TFs in humans, mice, zebrafish, Ciona, Drosophila, earthworm, C. elegans, and planarians across different conservation levels based on homologous genes (left) obtained from SAMap and 1-to-1 orthologous genes (right). The group number refers to the conservation level. TFs were divided into eight conservation levels (Level 1–8) based on the conservation of homologous TFs among the eight species. Level 1 means that a TF has no homologous genes in other species, and Level 8 means that a TF has homologous genes in all seven other species. (b) The bar chart showing the conserved neuron-related TFs between human and other species. The blue bar refers to the number of homologous TF pairs between human and other species, the yellow bar refers to the number of human neuron-related TFs involved in homologous gene pairs, and the grey bar refers to the number of other species' neuron-related TFs involved in homologous gene pairs. (c) Conserved neuron-related regulators across eight species showing a comprehensive homologous link from lower to higher organisms. (d) Sankey diagram showing homologous relationships among vertebrates’ immune-related TFs. (e) Sankey diagram showing homologous relationships of immune-related TFs between Drosophila and human.
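The Level 1–8 grouping described in panel (a) amounts to counting, for each TF, how many of the other seven species carry a homolog. The following Python sketch shows that counting rule only; the `SPECIES` labels and the function name `conservation_level` are hypothetical, and how homologs are actually called (SAMap gene pairs or 1-to-1 orthologs) is outside the sketch.

```python
# Species labels are illustrative names for the eight species in the caption.
SPECIES = (
    "human", "mouse", "zebrafish", "Ciona",
    "Drosophila", "earthworm", "C. elegans", "planarian",
)


def conservation_level(own_species, species_with_homolog):
    """Return 1 + the number of other species in which the TF has a homolog.

    Level 1: no homolog in any other species.
    Level 8: homologs in all seven other species.
    """
    if own_species not in SPECIES:
        raise ValueError(f"unknown species: {own_species}")
    # Ignore labels outside the eight species and the TF's own species.
    others = set(species_with_homolog) & set(SPECIES)
    others.discard(own_species)
    return 1 + len(others)
```

For example, a human TF with homologs found only in mouse and zebrafish would fall in Level 3 under this rule.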

Extended Data Fig. 7 The performance of model training for eight species.

(a) The AUROC values of major cell lineages for humans, mice, zebrafish, Ciona, Drosophila, earthworm, C. elegans, and planarians. (b) Boxplots of AUPR values for eight species. Boxplots represent the median, 25th percentile, and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. N of cells: humans: 134,557, mice: 179,344, zebrafish: 241,233, Ciona: 12,489, Drosophila: 77,337, earthworm: 29,609, C. elegans: 30,515, and planarians: 50,562.
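For readers unfamiliar with the AUROC metric reported here, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. It assumes binary labels (for instance, whether a gene is detected in a cell, which is an assumption rather than something stated in this caption) and is a quadratic-time illustration, not the evaluation code used in the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (illustrative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive-negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation; for large datasets a rank-based implementation from an established library would be the practical choice.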

Extended Data Fig. 8 Benchmark of prediction performance.

(a) The AUROC values of Nvwa, Basset, DeepSEA, Beluga, Basenji, SVM, random labels and random features on human (n = 134,557) and Drosophila (n = 77,337) specific datasets. (b) The AUROC of multiple genome training for zebrafish (n = 241,233) and C. elegans (n = 30,515). Boxplots represent the median, 25th percentile, and 75th percentile, and whiskers correspond to 1.5 times the interquartile range. (c) The heatmaps showing the correlation between observed and Nvwa-predicted cell type-specific transcription for eight species.

Extended Data Fig. 9 Overview of sequence patterns recognized by Nvwa.

(a) Mean saliency scores showing the transcriptional start site and information-rich regions recognized by Nvwa. (b) Comparison of the first-layer convolution filters derived from feature map-based approaches and gradient-based TF-MoDISco on the Drosophila-specific model. (c) Examples of known TFBS compared with the PWMs of Nvwa first-layer filters in humans, mice, zebrafish, Ciona, Drosophila, C. elegans, and planarians.

Extended Data Fig. 10 Overview of sequence motifs and their influence recognized by Nvwa.

(a) Volcano plot of Nvwa first-layer filters for humans, mice, zebrafish, Ciona, Drosophila, earthworm, C. elegans, and planarians. The x-axis represents the information content (IC) of a filter and the y-axis represents its overall influence; filters with high influence are tagged as up, and filters with low influence are tagged as down. Redundant filters are shown as triangles and non-redundant filters as dots; the size of each element represents its reproducibility across independent cross-validation runs. (b) Barplot of the Nvwa and single-cell ATAC cell type-specific motifs for mouse. Neuronal cells (C12, n=29 for Nvwa and C5, n=169 for sci-ATAC data) and endothelial cells (C50, n=31 for Nvwa and C22, n=136 for sci-ATAC data) are shown. Hit indicates the same motifs, and NotHit indicates different motifs, as identified by TomTom. The x-axis indicates the percentage. (c) Barplot of the Nvwa and single-cell ATAC cell type-specific transcription factors for Drosophila. Neuronal cells (C1, n=293, CB, n=639 and OL, n=484 for Nvwa and Flybrain (GSE163697) data respectively) are shown. Hit indicates the same TFs, and NotHit indicates different TFs, as annotated by TomTom. The x-axis indicates the percentage.
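The information content (IC) on the x-axis of panel (a) is commonly computed from a position probability matrix as the relative entropy against a background nucleotide distribution, summed over positions. The sketch below uses that standard definition with a uniform background of 0.25 per base; whether the study used exactly this formula or background is an assumption, and the function name `information_content` is hypothetical.

```python
import math


def information_content(pwm, background=0.25):
    """Total information content in bits of a position probability matrix.

    pwm: list of columns, each a sequence of four probabilities (A, C, G, T)
    that sum to 1. A uniform background is assumed for illustration.
    """
    total = 0.0
    for column in pwm:
        for p in column:
            # Terms with p == 0 contribute nothing (limit of p * log p is 0).
            if p > 0:
                total += p * math.log2(p / background)
    return total
```

Under this definition a fully specified position contributes 2 bits and a uniform position contributes 0 bits, so a filter's IC grows with both its length and its specificity.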

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, J., Wang, J., Zhang, P. et al. Deep learning of cross-species single-cell landscapes identifies conserved regulatory programs underlying cell types. Nat Genet 54, 1711–1720 (2022). https://doi.org/10.1038/s41588-022-01197-7

  • DOI: https://doi.org/10.1038/s41588-022-01197-7
