The neocortex contains a multitude of cell types that are segregated into layers and functionally distinct areas. To investigate the diversity of cell types across the mouse neocortex, here we analysed 23,822 cells from two areas at distant poles of the mouse neocortex: the primary visual cortex and the anterior lateral motor cortex. We define 133 transcriptomic cell types by deep, single-cell RNA sequencing. Nearly all types of GABA (γ-aminobutyric acid)-containing neurons are shared across both areas, whereas most types of glutamatergic neurons are found in only one of the two areas. By combining single-cell RNA sequencing and retrograde labelling, we match transcriptomic types of glutamatergic neurons to their long-range projection specificity. Our study establishes a combined transcriptomic and projectional taxonomy of cortical cell types from functionally distinct areas of the adult mouse cortex.

Data availability

Single-cell transcriptomic data are available at the NCBI Gene Expression Omnibus (GEO) under accession GSE115746. Summary of all transcriptomic types and markers is available in Supplementary Table 9. Full metadata for all samples are available in Supplementary Table 10. Newly generated mouse lines have been deposited to the Jackson Laboratory: Vipr2-IRES2-cre (JAX stock number 031332), Slc17a8-IRES2-cre (JAX stock number 028534), Penk-IRES2-cre-neo (JAX stock number 025112).

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


  1. Fuster, J. The Prefrontal Cortex 5th edn (Academic Press, Cambridge, MA, 2015).
  2. Mountcastle, V. B. Perceptual Neuroscience: The Cerebral Cortex (Harvard Univ. Press, Cambridge, MA, 1998).
  3. DeFelipe, J. The evolution of the brain, the human nature of cortical circuits, and intellectual creativity. Front. Neuroanat. 5, 29 (2011).
  4. Glasser, M. F. et al. A multi-modal parcellation of human cerebral cortex. Nature 536, 171–178 (2016).
  5. Kolb, B. & Tees, R. C. The Cerebral Cortex of the Rat (MIT Press, Cambridge, MA, 1990).
  6. Ng, L. et al. An anatomic gene expression atlas of the adult mouse brain. Nat. Neurosci. 12, 356–362 (2009).
  7. Cardin, J. A., Kumbhani, R. D., Contreras, D. & Palmer, L. A. Cellular mechanisms of temporal sensitivity in visual cortex neurons. J. Neurosci. 30, 3652–3662 (2010).
  8. Durand, S. et al. A Comparison of visual response properties in the lateral geniculate nucleus and primary visual cortex of awake and anesthetized mice. J. Neurosci. 36, 12144–12156 (2016).
  9. Liu, H., Agam, Y., Madsen, J. R. & Kreiman, G. Timing, timing, timing: fast decoding of object information from intracranial field potentials in human visual cortex. Neuron 62, 281–290 (2009).
  10. Chen, T. W., Li, N., Daie, K. & Svoboda, K. A map of anticipatory activity in mouse motor cortex. Neuron 94, 866–879.e4 (2017).
  11. Guo, Z. V. et al. Maintenance of persistent activity in a frontal thalamocortical loop. Nature 545, 181–186 (2017).
  12. Guo, Z. V. et al. Flow of cortical activity underlying a tactile decision in mice. Neuron 81, 179–194 (2014).
  13. Svoboda, K. & Li, N. Neural mechanisms of movement planning: motor cortex and beyond. Curr. Opin. Neurobiol. 49, 33–41 (2018).
  14. Zeng, H. & Sanes, J. R. Neuronal cell-type classification: challenges, opportunities and the path forward. Nat. Rev. Neurosci. 18, 530–546 (2017).
  15. Molyneaux, B. J., Arlotta, P., Menezes, J. R. & Macklis, J. D. Neuronal subtype specification in the cerebral cortex. Nat. Rev. Neurosci. 8, 427–437 (2007).
  16. Rudy, B., Fishell, G., Lee, S. & Hjerling-Leffler, J. Three groups of interneurons account for nearly 100% of neocortical GABAergic neurons. Dev. Neurobiol. 71, 45–61 (2011).
  17. Jiang, X. et al. Principles of connectivity among morphologically defined cell types in adult neocortex. Science 350, aac9462 (2015).
  18. Markram, H. et al. Reconstruction and simulation of neocortical microcircuitry. Cell 163, 456–492 (2015).
  19. Zeisel, A. et al. Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347, 1138–1142 (2015).
  20. Tasic, B. et al. Adult mouse cortical cell taxonomy revealed by single cell transcriptomics. Nat. Neurosci. 19, 335–346 (2016).
  21. Economo, M. N. et al. Distinct descending motor cortex pathways and their roles in movement. Nature https://doi.org/10.1038/s41586-018-0642-9 (2018).
  22. Frazer, S. et al. Transcriptomic and anatomic parcellation of 5-HT3AR expressing cortical interneuron subtypes revealed by single-cell RNA sequencing. Nat. Commun. 8, 14219 (2017).
  23. Abellan, A., Menuet, A., Dehay, C., Medina, L. & Rétaux, S. Differential expression of LIM-homeodomain factors in Cajal–Retzius cells of primates, rodents, and birds. Cereb. Cortex 20, 1788–1798 (2010).
  24. Kirischuk, S., Luhmann, H. J. & Kilb, W. Cajal–Retzius cells: update on structural and functional properties of these mystic neurons that bridged the 20th century. Neuroscience 275, 33–46 (2014).
  25. Lein, E. S. et al. Genome-wide atlas of gene expression in the adult mouse brain. Nature 445, 168–176 (2007).
  26. Sorensen, S. A. et al. Correlated gene expression and target specificity demonstrate excitatory projection neuron diversity. Cereb. Cortex 25, 433–449 (2015).
  27. Harris, K. D. & Shepherd, G. M. The neocortical circuit: themes and variations. Nat. Neurosci. 18, 170–181 (2015).
  28. Oh, S. W. et al. A mesoscale connectome of the mouse brain. Nature 508, 207–214 (2014).
  29. Li, N., Chen, T. W., Guo, Z. V., Gerfen, C. R. & Svoboda, K. A motor cortex circuit for motor planning and movement. Nature 519, 51–56 (2015).
  30. Wang, Q. et al. Organization of the connections between claustrum and cortex in the mouse. J. Comp. Neurol. 525, 1317–1346 (2017).
  31. Zeng, H. et al. Large-scale cellular-resolution gene profiling in human neocortex reveals species-specific molecular signatures. Cell 149, 483–496 (2012).
  32. Ayoub, A. E. & Kostovic, I. New horizons for the subplate zone and its pioneering neurons. Cereb. Cortex 19, 1705–1707 (2009).
  33. Hoerder-Suabedissen, A. et al. Subset of cortical layer 6b neurons selectively innervates higher order thalamic nuclei in mice. Cereb. Cortex 28, 1882–1897 (2018).
  34. Kim, E. J., Juavinett, A. L., Kyubwa, E. M., Jacobs, M. W. & Callaway, E. M. Three types of cortical layer 5 neurons that differ in brain-wide connectivity and function. Neuron 88, 1253–1267 (2015).
  35. He, M. et al. Strategies and tools for combinatorial targeting of GABAergic neurons in mouse cerebral cortex. Neuron 92, 555 (2016).
  36. Paul, A. et al. Transcriptional architecture of synaptic communication delineates GABAergic neuron identity. Cell 171, 522–539.e20 (2017).
  37. Hilscher, M. M., Leão, R. N., Edwards, S. J., Leão, K. E. & Kullander, K. Chrna2-Martinotti Cells synchronize layer 5 type a pyramidal cells via rebound excitation. PLoS Biol. 15, e2001392 (2017).
  38. Cadwell, C. R. et al. Electrophysiological, transcriptomic and morphologic profiling of single neurons using Patch-seq. Nat. Biotechnol. 34, 199–203 (2016).
  39. Tasic, B., Levi, B. P. & Menon, V. in Decoding Neural Circuit Structure and Function: Cellular Dissection Using Genetic Model Organisms (eds A. Çelik & M. F. Wernet) 437–468 (Springer International Publishing, New York, 2017).
  40. Hrvatin, S. et al. Single-cell analysis of experience-dependent transcriptomic states in the mouse visual cortex. Nat. Neurosci. 21, 120–129 (2018).
  41. Gao, P. et al. Deterministic progenitor behavior and unitary production of neurons in the neocortex. Cell 159, 775–788 (2014).
  42. O’Leary, D. D., Chou, S. J. & Sahara, S. Area patterning of the mammalian cortex. Neuron 56, 252–269 (2007).
  43. Rakic, P. Specification of cerebral cortical areas. Science 241, 170–176 (1988).
  44. Vue, T. Y. et al. Thalamic control of neocortical area formation in mice. J. Neurosci. 33, 8442–8453 (2013).
  45. Chou, S. J. et al. Geniculocortical input drives genetic distinctions between primary and higher-order visual areas. Science 340, 1239–1242 (2013).
  46. Yoshida, M., Assimacopoulos, S., Jones, K. R. & Grove, E. A. Massive loss of Cajal-Retzius cells does not disrupt neocortical layer order. Development 133, 537–545 (2006).
  47. Pedraza, M., Hoerder-Suabedissen, A., Albert-Maestro, M. A., Molnár, Z. & De Carlos, J. A. Extracortical origin of some murine subplate cell populations. Proc. Natl Acad. Sci. USA 111, 8613–8618 (2014).
  48. Lein, E., Borm, L. E. & Linnarsson, S. The promise of spatial transcriptomics for neuroscience in the era of molecular cell typing. Science 358, 64–69 (2017).
  49. Cong, L. et al. Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823 (2013).
  50. George, S. H. et al. Developmental and adult phenotyping directly from mutant embryonic stem cells. Proc. Natl Acad. Sci. USA 104, 4455–4460 (2007).
  51. Raymond, C. S. & Soriano, P. High-efficiency FLP and PhiC31 site-specific recombination in mammalian cells. PLoS One 2, e162 (2007).
  52. Tervo, D. G. et al. A designer AAV variant permits efficient retrograde access to projection neurons. Neuron 92, 372–382 (2016).
  53. Chatterjee, S. et al. Nontoxic, double-deletion-mutant rabies viral vectors for retrograde targeting of projection neurons. Nat. Neurosci. 21, 638–646 (2018).
  54. Hnasko, T. S. et al. Cre recombinase-mediated restoration of nigrostriatal dopamine in dopamine-deficient mice reverses hypophagia and bradykinesia. Proc. Natl Acad. Sci. USA 103, 8858–8863 (2006).
  55. Paxinos, G. & Franklin, K. B. J. Mouse Brain In Stereotaxic Coordinates 3rd edn (Academic Press, Cambridge, MA, 2008).
  56. Sugino, K. et al. Molecular taxonomy of major neuronal classes in the adult mouse forebrain. Nat. Neurosci. 9, 99–107 (2006).
  57. Hempel, C. M., Sugino, K. & Nelson, S. B. A manual method for the purification of fluorescently labeled neurons from the mammalian brain. Nat. Protoc. 2, 2924–2929 (2007).
  58. Ting, J. T., Daigle, T. L., Chen, Q. & Feng, G. Acute brain slice methods for adult and aging animals: application of targeted patch clamp analysis and optogenetics. Methods Mol. Biol. 1183, 221–242 (2014).
  59. Ramsköld, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol. 30, 777–782 (2012).
  60. Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
  61. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  62. Lawrence, M. et al. Software for computing and annotating genomic ranges. PLOS Comput. Biol. 9, e1003118 (2013).
  63. Yao, Z. et al. A single-cell roadmap of lineage bifurcation in human ESC models of embryonic brain development. Cell Stem Cell 20, 120–134 (2017).
  64. Shekhar, K. et al. Comprehensive classification of retinal bipolar neurons by single-cell transcriptomics. Cell 166, 1308–1323 (2016).
  65. Fortunato, S. & Barthélemy, M. Resolution limit in community detection. Proc. Natl Acad. Sci. USA 104, 36–41 (2007).
  66. Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
  67. Lamprecht, M. R., Sabatini, D. M. & Carpenter, A. E. CellProfiler: free, versatile software for automated biological image analysis. Biotechniques 42, 71–75 (2007).
  68. R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, 2018).
  69. Galili, T. dendextend: an R package for visualizing, adjusting and comparing trees of hierarchical clustering. Bioinformatics 31, 3718–3720 (2015).
  70. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, New York, 2009).
  71. Law, C. W., Alhamdoosh, M., Su, S., Smyth, G. K. & Ritchie, M. E. RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR. F1000 Res. 5, 1408 (2016).
  72. Liaw, A. & Weiner, M. Classification and regression by randomForest. R News 2, 18–22 (2002).
  73. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
  74. Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).
  75. Linderman, G. C., Rachh, M., Hoskins, J. G., Steinerberger, S. & Kluger, Y. Efficient algorithms for t-distributed stochastic neighborhood embedding. Preprint at https://arXiv.org/abs/1712.09005 (2017).
  76. Hevner, R. F., Neogi, T., Englund, C., Daza, R. A. & Fink, A. Cajal-Retzius cells in the mouse: transcription factors, neurotransmitters, and birthdays suggest a pallial origin. Brain Res. Dev. Brain Res. 141, 39–53 (2003).
  77. Cahoy, J. D. et al. A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. J. Neurosci. 28, 264–278 (2008).
  78. Marques, S. et al. Oligodendrocyte heterogeneity in the mouse juvenile and adult central nervous system. Science 352, 1326–1329 (2016).
  79. Zhang, Y. et al. An RNA-sequencing transcriptome and splicing database of glia, neurons, and vascular cells of the cerebral cortex. J. Neurosci. 34, 11929–11947 (2014).
  80. Kopatz, J. et al. Siglec-h on activated microglia for recognition and engulfment of glioma cells. Glia 61, 1122–1133 (2013).
  81. Bennett, M. L. et al. New tools for studying microglia in the mouse and human CNS. Proc. Natl Acad. Sci. USA 113, E1738–E1746 (2016).
  82. Armulik, A., Genové, G. & Betsholtz, C. Pericytes: developmental, physiological, and pathological perspectives, problems, and promises. Dev. Cell 21, 193–215 (2011).
  83. Bondjers, C. et al. Microarray analysis of blood microvessels from PDGF-B and PDGF-Rβ mutant mice identifies novel markers for brain pericytes. FASEB J. 20, 1703–1705 (2006).
  84. Campbell, J. N. et al. A molecular census of arcuate hypothalamus and median eminence cell types. Nat. Neurosci. 20, 484–496 (2017).
  85. Groh, A. et al. Cell-type specific properties of pyramidal neurons in neocortex underlying a layout that is modifiable depending on the cortical area. Cereb. Cortex 20, 826–836 (2010).
  86. Harris, J. A. et al. Anatomical characterization of Cre driver mice for neural circuit mapping and manipulation. Front. Neural Circuits 8, 76 (2014).
  87. Taniguchi, H., Lu, J. & Huang, Z. J. The spatial and temporal origin of chandelier cells in mouse neocortex. Science 339, 70–74 (2013).


We thank M. Chillon Rodrigues for providing CAV2-Cre, A. Karpova for providing rAAV2-retro, A. Williford for technical assistance, and the Transgenic Colony Management and Animal Care teams for animal husbandry. This work was funded by the Allen Institute for Brain Science, and by US National Institutes of Health grants R01EY023173 and U01MH105982 to H.Z. We thank the Allen Institute founder, P. G. Allen, for his vision, encouragement and support.

Reviewer information

Nature thanks P. Carninci, C. Chau Hon and the anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

Author notes

  1. These authors contributed equally: Zizhen Yao, Lucas T. Graybuck, Kimberly A. Smith


  1. Allen Institute for Brain Science, Seattle, WA, USA

    Bosiljka Tasic, Zizhen Yao, Lucas T. Graybuck, Kimberly A. Smith, Thuc Nghi Nguyen, Darren Bertagnolli, Jeff Goldy, Emma Garren, Osnat Penn, Trygve Bakken, Vilas Menon, Jeremy Miller, Olivia Fong, Karla E. Hirokawa, Kanan Lathia, Christine Rimorin, Michael Tieu, Rachael Larsen, Tamara Casper, Eliza Barkan, Matthew Kroll, Sheana Parry, Nadiya V. Shapovalova, Daniel Hirschstein, Julie Pendergraft, Tae Kyung Kim, Aaron Szafer, Nick Dee, Peter Groblewski, Ali Cetin, Julie A. Harris, Boaz P. Levi, Susan M. Sunkin, Linda Madisen, Tanya L. Daigle, Amy Bernard, John Phillips, Ed Lein, Michael Hawrylycz, Allan R. Jones, Christof Koch & Hongkui Zeng

  2. Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA

    Michael N. Economo, Sarada Viswanathan, Vilas Menon, Loren Looger & Karel Svoboda

  3. Massachusetts Institute of Technology, Cambridge, MA, USA

    Heather A. Sullivan & Ian Wickersham


H.Z. and K.S. conceptualized, and H.Z. and B.T. designed and supervised the study. K.S. defined ALM coordinates based on loss-of-function experiments. K.A.S. managed the scRNA-seq pipeline. A.B. and J. Phillips managed pipeline establishment. D.B., J.G., K.L., C.R., M.T. and T.K.K. performed scRNA-seq. Z.Y., L.T.G. and B.T. analysed the data with contributions from O.F., O.P., T.B., V.M., J.M., A.S. and M.H. I.W., H.A.S. and A.C. provided viral vectors. J.A.H., T.N.N., K.E.H. and P.G. conducted viral tracing experiments. B.P.L., N.D., T.C., S.P., E.B., M.K., N.V.S. and D.H. performed single-cell isolation. T.N.N. and E.G. performed RNA ISH with RNAscope. L.M. and T.L.D. generated transgenic mice. J. Pendergraft provided genotyping. R.L. provided mouse colony management. K.S., M.N.E., S.V. and L.L. provided manually collected cells from ALM. S.M.S. provided program management support. H.Z. and E.L. led the Cell Types Program at the Allen Institute. C.K. and A.R.J. provided funding, institutional support and management. L.T.G., Z.Y., T.N.N. and B.T. prepared the figures. B.T. and H.Z. wrote the manuscript with contributions from C.K., K.S., L.T.G., T.N.N. and Z.Y., and in consultation with all authors.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Bosiljka Tasic.

Extended data figures and tables

  1. Extended Data Fig. 1 Overview of sample collection.

    a, One of the two brain regions, ALM or VISp, was dissected from an adult mouse (P53–P59 (n = 339), P51 (n = 1), and P63–P91 (n = 12), Supplementary Table 1). b, Example microdissection images from the most heavily sampled mouse genotype, Snap25-IRES2-cre/wt;Ai14/wt, from both cortical regions. For many samples, microdissection was used to isolate layer-enriched portions of the cortex. ALM lacks L4. The dissection images are representative of n = 21 processed Snap25-IRES2-cre/wt;Ai14/wt brains. c, Microdissected layers were processed separately for single-cell suspension. Each sample was digested with pronase, then triturated with pipettes of decreasing tip diameter (600, 300 and 150 μm). Individual cells were sorted into 8-well strip PCR tubes by FACS. d, SMART-Seq v4 was used to reverse-transcribe and amplify full-length cDNAs from each cell. cDNAs were then tagmented by Nextera XT, PCR-amplified, and sequenced on Illumina HiSeq2500. e, Common gates used for all FACS sorts: (1) Morphology gate excludes events with high side scatter and low forward scatter, which are largely cellular debris, and (2) SC-FSC and SC-SSC gates exclude samples with high forward scatter width and high side scatter width, respectively, to exclude cell doublets and multiplets. f, Example gating for live tdTomato+ or tdTomato− cells. Cells sorted using the tdTom+ gate express the tdTomato reporter and have low DAPI fluorescence. This plot was generated from cell suspension isolated from a Snap25-IRES2-cre/wt;Ai14/wt animal, which expresses tdTomato in all neurons. The tdTom− gate in this genotype was the main source of non-neuronal cells, which have low DAPI fluorescence and low tdTomato expression. Gating hierarchy and sorting statistics are shown above the FACS scatter plot. This gating strategy is representative of n = 21 processed Snap25-IRES2-cre/wt;Ai14/wt brains. g, To sort eGFP+ cells, we used the same debris and doublet gating described in e, then collected cells with high eGFP and low DAPI fluorescence (eGFP+ gate). This plot was generated from cell suspension isolated from VISp of a Ctgf-2A-dgcre/wt;Snap25-LSL-F2A-GFP/wt animal, which expresses eGFP in L6b neurons. Gating hierarchy and sorting statistics are shown above the FACS scatter plot. This gating strategy is representative of n = 4 processed Ctgf-2A-dgcre/wt;Snap25-LSL-F2A-GFP/wt brains. h, Genotype and layer sampling frequencies for all cluster-assigned cells (n = 23,822). PAN Cre-lines were used to broadly sample neurons in the cortex. Non-PAN lines were included to (1) enrich for cells that displayed poor survival in the isolation process (for example, Rbp4-cre_KL100 for L5 types and Pvalb-IRES-cre for Pvalb types); (2) enrich for rare cell types (for example, Ctgf-2A-dgcre for L6b); and (3) transcriptomically characterize cell types labelled by these lines. Bar plots show the number of cells sampled from each genotype and region (A, ALM; V, VISp). Bars are coloured according to the number of samples from microdissected layers or combinations of layers. i, Transgenic driver composition with respect to cell types for all cluster-assigned cells (n = 23,822). The stacked bar plot shows the proportion of cells in each cluster that were collected from each Cre line. Black bars represent cells collected from retrograde tracing experiments. These cells were labelled by a fluorophore-expressing virus or by a Cre-expressing virus together with a Cre-reporter transgenic line. Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).

  2. Extended Data Fig. 2 scRNA-seq pipeline and analysis workflow.

    a, Workflow diagram outlines the path from individual experimental animals to quality control-qualified scRNA-seq data. At multiple points throughout sample processing, cell and sample metadata were recorded in a laboratory information management system (LIMS, labelled as L), which informs quality control processes. Samples must pass quality control benchmarks to continue through sample processing. b, Clustering procedure. Cells were first divided into broad cell classes based on known marker gene expression, then were segregated into clusters 100 times using a bootstrapped procedure that sampled 80% of cells each time. Within each iteration, cells were split by selection of high variance genes followed by PCA or WGCNA dimensionality reduction. Principal components and WGCNA eigengenes were then used to cluster samples by hierarchical clustering or graph-based Jaccard–Louvain clustering algorithm, depending on the number of cells (clustering module, red box). Clusters were checked for over-splitting or termination criteria (merging module, purple box). Each cluster was used as input for a further round of splitting until termination criteria were met. After 100 rounds of clustering, the frequencies with which samples were clustered together were used as a similarity measure to hierarchically cluster the samples. The resulting hierarchical clustering tree was then dynamically cut, and the resulting clusters were checked for over-splitting. Finally, cells were subjected to validation by a centroid classifier. After 100 rounds of validation, cells that were mapped to the same cluster in more than 90 out of 100 trials were assigned ‘core’ cell identity (n = 21,195), and cells with lower scores were assigned ‘intermediate’ cell identity (n = 2,627). Most intermediate cells were mapped to only two clusters (2,492 out of 2,627; 94.9%). c, The number of cells at each step in our analysis pipeline. The identification of doublets and low-quality clusters is described in the Methods. Some high-quality cells (n = 452) from the retro-seq dataset were not used for projection analysis because their stereotaxic injections were judged unsatisfactory after brain sectioning for one of three reasons: (1) the incorrect target was injected, (2) the injection was too close to the collection site, or (3) strong injection tract labelling was detected (Methods). These cells were kept in transcriptomic clusters, but were not used to inform the specificity of glutamatergic projections. Only cells from the annotated retro-seq dataset (n = 2,204) were used for connectivity analyses in Fig. 3 and Extended Data Fig. 10a, b.
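
The resampling logic described in b can be illustrated with a minimal Python sketch. It is not the study's code: `cluster_once` is a hypothetical stand-in (a simple threshold on a one-dimensional value) for the full split-and-merge procedure, and all function and variable names are assumptions made for illustration. The sketch subsamples 80% of cells 100 times, records how often each pair of cells falls in the same cluster among the rounds in which both were sampled, and labels a cell 'core' when more than 90 of 100 validation trials agree on its cluster.

```python
import random
from collections import Counter
from itertools import combinations

def cluster_once(cells, values):
    """Hypothetical stand-in for one clustering run: threshold a 1-D value."""
    return {c: int(values[c] > 0.5) for c in cells}

def coclustering_frequency(values, n_rounds=100, fraction=0.8, seed=0):
    """For each cell pair, the fraction of rounds in which the two cells
    co-clustered, among the rounds in which both cells were sampled."""
    rng = random.Random(seed)
    n = len(values)
    together, sampled = {}, {}
    for _ in range(n_rounds):
        subset = sorted(rng.sample(range(n), int(fraction * n)))
        labels = cluster_once(subset, values)
        for i, j in combinations(subset, 2):
            sampled[(i, j)] = sampled.get((i, j), 0) + 1
            if labels[i] == labels[j]:
                together[(i, j)] = together.get((i, j), 0) + 1
    return {pair: together.get(pair, 0) / count for pair, count in sampled.items()}

def cell_identity(assignments, threshold=90):
    """'core' if the most frequent cluster was assigned in more than
    `threshold` trials, otherwise 'intermediate'."""
    label, count = Counter(assignments).most_common(1)[0]
    return ("core" if count > threshold else "intermediate"), label

# Toy data: ten cells with one expression-like value each.
values = [0.1, 0.2, 0.15, 0.9, 0.8, 0.95, 0.85, 0.05, 0.3, 0.7]
freq = coclustering_frequency(values)
```

In such a scheme, one minus the co-clustering frequency could serve as the pairwise distance passed to a hierarchical clustering routine.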

  3. Extended Data Fig. 3 Co-clustering frequency matrix, confusion scores and intermediate cells.

    a, The co-clustering frequency matrix (centre) for up to 100 cells per cluster selected at random (n = 10,820). Some cell types, for example certain Pvalb types (middle of enlarged panel), display pronounced co-clustering. t-SNE was used to visualize the similarity of gene expression patterns in two dimensions for all cluster-assigned cells (n = 23,822). Individual cells in t-SNE plots were coloured by: cell class (GABAergic, red; glutamatergic, blue; glia, grey; endothelial cells, brown), animal donor sex (female, pink; male, purple), dissected brain region (ALM, black; VISp, grey), confusion score (low, blue; high, red), and the number of genes detected (low, blue; high, red). b, Pairwise correlation, differential gene expression and co-clustering for all 133 clusters using all cluster-assigned cells (n = 23,822). c, Confusion scores for all cluster-assigned cells (n = 23,822) segregated by clusters. For each cell, the confusion score is defined as the ratio of the probabilities for that cell to be clustered with the cells from its second best cluster and with the cells from the final cluster (also the best cluster except for rare exceptions). Thus, confusion score is a measure of the confidence of cell type assignment: the lower the value, the less frequently a cell was grouped with cells from a different cluster. Each blue dot is a confusion score for a single cell; median values are shown as red dots; whiskers are twenty-fifth and seventy-fifth percentiles. d, Fraction of cluster-assigned cells (n = 23,822) annotated as core (coloured) or intermediate (black) for each cluster. In total, 21,195 cells (88.97%) were assigned core, whereas 2,627 (11.03%) were assigned intermediate identity. e, f, We performed 100 rounds of bootstrapped clustering to determine the confidence of our hierarchical clustering structure (Methods). The final dendrogram generated by this method (e), with branches coloured by their bootstrapped confidence: light grey (low confidence), maroon (moderate confidence), and black (high confidence). For figures, we used the dendrogram in f, in which we collapsed branches with confidence lower than 0.4.
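
The confusion score defined in c can be written out as a short Python sketch. The function name, the dictionary layout and the toy probabilities are assumptions for illustration only; the sketch takes the second-highest co-clustering probability as the 'second best cluster' and assumes the probability for the final cluster is non-zero.

```python
def confusion_score(cocluster_prob, final_cluster):
    """Ratio of a cell's co-clustering probability with its second best
    cluster to its co-clustering probability with its final cluster.

    cocluster_prob: dict mapping cluster name -> probability that this cell
    is grouped with cells of that cluster (hypothetical input format).
    """
    ranked = sorted(cocluster_prob.values(), reverse=True)
    second_best = ranked[1] if len(ranked) > 1 else 0.0
    return second_best / cocluster_prob[final_cluster]

# Toy example: a cell grouped with 'Pvalb A' in 80% of runs and 'Pvalb B' in 20%.
score = confusion_score({"Pvalb A": 0.8, "Pvalb B": 0.2, "Sst": 0.0}, "Pvalb A")
```

A score near 0 indicates a cell that was almost never grouped with a different cluster, whereas a score near 1 indicates a cell that sits between two clusters.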

  4. Extended Data Fig. 4 Sequencing depth and gene detection for quality-control-qualified cells segregated by cluster.

    a, Sequencing depth for all cluster-assigned cells (n = 23,822), grouped by cell type. Cells were sequenced to a median depth of 2.54 million reads (min = 0.103 M; max = 13.84 M). Median values in millions of reads are adjacent to the cell type labels. Median values are shown as red dots; whiskers are twenty-fifth and seventy-fifth percentiles. b, The number of detected genes (reads detected in exons >0) varies by cell type. Gene detection is shown for each cluster-assigned cell (n = 23,822). Median values are shown adjacent to the cell type labels. Median number of genes detected across all cells is 9,462 per cell (min = 1,445; max = 15,338). Median values are shown as red dots; whiskers are twenty-fifth and seventy-fifth percentiles. Samples with fewer than 1,000 detected genes were excluded at a prior quality control step (Extended Data Fig. 2). c, Comparison of gene detection between our previous study20 and this study for all cluster-assigned cells (left) and cells grouped according to major classes (right). d, Comparison of gene detection between cell classes for core and intermediate cells. The higher gene detection for intermediate non-neuronal cells may be due to contamination with other non-neuronal cells. e, Comparison of gene detection within cell classes between VISp and ALM. In c–e, medians are shown as dots and values are listed below each distribution; whiskers are twenty-fifth and seventy-fifth percentiles. Sample size (number of cells) for each analysed group is listed between panels a and b, and below the graphs for panels c–e.
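
As a rough illustration of the gene detection metric in b, the following Python sketch counts a gene as detected when its exon read count is greater than zero, drops cells below a minimum number of detected genes, and reports the median across the remaining cells. The data layout, function names and toy counts are assumptions; the toy example uses a threshold of 2 genes in place of the 1,000 used in the study.

```python
from statistics import median

def genes_detected(exon_counts):
    """Number of genes with at least one read in exons."""
    return sum(1 for count in exon_counts if count > 0)

def qc_filter(cells, min_genes=1000):
    """Keep cells with at least `min_genes` detected genes."""
    return {name: counts for name, counts in cells.items()
            if genes_detected(counts) >= min_genes}

# Toy data: exon read counts for four genes in three cells.
cells = {
    "cell_1": [0, 3, 1, 0],
    "cell_2": [5, 2, 7, 1],
    "cell_3": [0, 0, 0, 4],
}
kept = qc_filter(cells, min_genes=2)
median_detected = median(genes_detected(c) for c in kept.values())
```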

  5. Extended Data Fig. 5 Markers used for cell type assignment.

    a, Marker panel of 88 genes for cell classes. For each cluster, 25% trimmed mean expression values are shown (n = 23,822 cluster-assigned cells; 133 clusters). Maximum expression values (counts per million reads, CPM) are shown to the right of the heat map. Pan-neuronal markers (for example, Snap25) were used to assign neuronal type identity. Glutamatergic (for example, Slc17a7 and Slc17a6) and pan-GABAergic (for example, Gad1, Gad2 and Slc32a1) markers were used to assign glutamatergic and GABAergic identity, respectively. Known non-neuronal markers were used to assign non-neuronal identities. b, Marker panel for glutamatergic cell types. For each cluster, 25% trimmed mean expression values are shown (n = 11,905 cells; 56 clusters). Layer-specific markers were used to assign layer identity (for example, Cux2, Rorb, Deptor and Foxp2). To assign final names to types, subclass and/or layer-markers were combined with unique or other specific markers, many of which are novel. Once the identity was assigned, previously unknown genetic bases of phenotypes could be discovered. For example, for the Cajal–Retzius cell type CR–Lhx5, which has been shown by immunohistochemistry to contain glutamate but not GABA76, we show that its glutamatergic phenotype stems from the expression of mRNA encoding VGLUT2 (Slc17a6), and not from the other glutamate transporters (Slc17a7 and Slc17a8). Note that some markers that appear non-specific provide preferential labelling of specific types when used to make random-insertion transgenic BAC lines. For example, Efr3a-cre_NO10834 specifically labels near-projecting types, although Efr3a mRNA is ubiquitously detected in all neurons. However, its expression is about fivefold higher in near-projecting types; this may contribute to preferential labelling of the near-projecting types by this BAC transgenic Cre line. c, Marker panel for GABAergic cell types. For each cluster, 25% trimmed mean expression values are shown (n = 10,534 cells; 61 clusters). GABAergic subclasses were assigned based on the expression of Lamp5, Serpinf1, Sncg, Vip, Sst and Pvalb. Final names were assigned based on unique combinations of markers.

  6. Extended Data Fig. 6 Comparison of cell types to those defined previously20.

    We compared the similarity of clustering results for our current dataset and our previous dataset by nearest centroid classification (Methods). We mapped the 1,424 cells from the previous dataset20 to 133 clusters from the current study (a) and vice versa: 23,822 current cells to 49 previous clusters20. b, Some types were largely absent from our previous study. For example, the Meis2 type was not detected probably owing to rarity and white-matter confinement, whereas the L5 near-projecting types were missed owing to reliance on Rbp4-cre_KL100 to isolate L5 cells. We now find that, in contrast to pan-neuronal (Snap25-IRES-cre) or pan-glutamatergic Cre lines (Slc17a7-IRES-cre), Rbp4-cre_KL100 labels all L5 types except L5 near-projecting (Extended Data Fig. 8).

  7. Extended Data Fig. 7 Saturation of cell type detection.

    We tested the ability of our computational approach to segregate cell types based on our standard cluster separation criteria (Methods) upon decreasing the number of cells included in clustering. The total number of sampled cells and derived clusters in each trial is shown at the top of each column. Each row is a cluster from our full dataset analysis. Each coloured box represents the number of cells from each cluster (row) that were included in the specific downsampled dataset (column). Clusters that merge are shifted to the left of their respective columns and merging is indicated by arcs and/or removal of lines between adjacent clusters. Most types could be segregated from their related types with many fewer cells than are present in the complete dataset. Further cell sampling may reveal additional diversity, especially for rare cell types.

  8. Extended Data Fig. 8 Cell-type labelling by recombinase driver lines.

    Driver line names are listed on top (columns, n = 55) and cell types on the left (rows, n = 133). Coloured discs represent the numbers of cells detected for each type (cell numbers are proportional to disc surface area). This plot is based on 20,758 cells isolated from transgenic mice that were collected as tdT+ or GFP+ by FACS. Note that the relative proportions of cell types obtained in these experiments are likely to be affected by cell type-specific differences in survival during the isolation procedure and by sampling via layer-enriching dissections.

  9. Extended Data Fig. 9 Non-neuronal cell types.

    a, Non-neuronal cells (n = 1,383 cells) are divided into two major branches according to their developmental origin: neuroectoderm-derived branch, which contains astrocytes and oligodendrocytes (left), and non-neuroectoderm-derived, which includes immune cells (microglia, perivascular macrophages), blood vessel-associated cells (smooth muscle cells, pericytes and endothelial cells), and vascular leptomeningeal cells (VLMCs, right). All have been detected in both ALM and VISp, except VLMC–Osr1–Cd74, which may be rare (12 cells total) and may also be detected in ALM with further sampling. Violin plots represent distributions of individual marker gene expression in single cells within each cluster. Rows are genes, median values are black dots, and values within rows are normalized between 0 and the maximum expression value for each gene (right edge of each row) and displayed on a linear scale. We identify astrocytes based on expression of Aqp477. Oligodendrocyte lineage cells express Sox1078. Oligodendrocyte precursor cells are marked by expression of Pdgfra and absence of Col1a177,78, with dividing oligodendrocyte precursor cells expressing Ccnb1. Newly generated oligodendrocytes (Oligo–Rassf10) express Enpp6, whereas myelinating oligodendrocytes (Oligo–Serpinb1a/Synpr) express Opalin79. Two related types of immune cells coexpress Cd14 and Fcgr3, and can be identified as microglia by expression of Siglech80 and Tmem11981, and perivascular macrophages by expression of Mrc1, Lyve119 and Cd16381. We identify two related types of blood vessel-associated cells as pericytes and smooth muscle cells (SMCs) based on their expression of Cspg4 and Acta2 (reviewed previously82). We assign SMC identity to SMC–Acta2, which strongly expresses Acta2 (smooth muscle actin). We assign pericyte identity to the Peri–Kcnj8 cluster based on specific expression of pericyte markers Kcnj8 and Abcc983. 
We define additional markers uniquely expressed in this cell type (Atp13a5, Art3, Pla1a and Ace2) that may help solidify pericyte identity in future studies. We identify one type of endothelial cells (Endo–Slc38a5) based on expression of previously characterized endothelial markers, Tek, Pdgfb, Nos3, Eltd1 and Pecam1. We identify VLMC types based on their unique expression of Lum and Col1a178,84. We define four types based on differential gene expression. Markers examined in b and c are highlighted by arrows and colours, respectively. b, RNA ISH for some non-neuronal markers from the Allen Brain Atlas25. Images contain regions of interest from representative sections selected from individual whole-brain RNA ISH experiments. Spp1 mRNA is detected in the meninges and scattered in the cortex, corresponding to VLMCs, as well as select Pvalb and Sst types. Gja5 mRNA labels vessel-like structures in the grey matter, probably corresponding to the Endo–Slc38a5 cluster. Slc47a1 is specific to the VLMC–Osr1–Cd74 type, which appears to be restricted to pia. Dcn is expressed in three VLMC types, and its expression is seen in the pia and vessel-like structures in the cortex. The number of whole-brain experiments per gene available in the Allen Brain Atlas is as follows: Spp1: n = 4 brains (2 sagittal, 2 coronal); Gja5: n = 2 brains (2 sagittal); Slc47a1: n = 1 brain (sagittal); Dcn: n = 3 brains (2 sagittal, 1 coronal). c, Single-molecule RNA FISH by RNAscope for Osr1, Spp1 and Lum mRNAs shows labelling at the pial surface and surrounding vessel-like structures. Images are representative of two independent RNAscope experiments on n = 2 brains. On the basis of the coexpression of marker genes shown in a, VLMCs within the grey matter (expanded region 2) are probably the VLMC–Spp1–Col15a1 type, whereas VLMCs in pia between cortex and tectum (expanded region 3) are probably VLMC–Osr1–Mc5r. 
The surface VLMCs (expanded region 1) appear to express all three markers, which are usually not co-detected in single cells by RNA-seq (a). This finding could be explained if region 1 contains two or more types of spatially apposed VLMCs, for example, VLMC–Osr1–Cd74 (based on Slc47a1 expression shown in b), as well as one of the Lum+ types (VLMC–Osr1–Mc5r and/or VLMC–Spp1–Col15a1).

  10. Extended Data Fig. 10 Retro-seq and comparison of cell types across regions.

    a, b, Injection targets for retro-seq, represented by select Allen Reference Atlas images are displayed on the left. The plots on the right show injection targets in rows and cell types in columns for annotated retro-seq cells collected from the ALM (n = 1,152) (a) or VISp (n = 1,052) (b). Cell numbers are represented as discs, coloured according to detected cell types. Cell numbers from each target segregated by categories (based on broad type or virus injected) are shown to the right. We used three types of viral tracers expressing Cre: CAV2-Cre, rAAV2-retro-EF1a-Cre and RV∆GL-Cre, and injected them into a Cre-reporter line Ai14. For ALM experiments, we also injected rAAV2-retro-CAG-GFP or rAAV2-retro-CAG-tdT into wild-type mice. To ensure diverse coverage of projection neuron types, at least two virus types were used for most broad target regions (except for striatum and tectum for ALM, and tectum for VISp), as different viruses may display different tropisms. Cell types that were never isolated from the retrograde tracing experiments are shaded pink. Grey-hatched regions denote cells that may have been labelled unintentionally (but unavoidably) through the needle injection tract. For most subcortical injections into VISp-projection areas, the needle goes through the cortex, and some IT cells are labelled through the virus deposited along the needle tract. One exception is the injection into the superior colliculus for VISp experiments, in which we avoided cortical labelling by injecting at an angle through the cerebellum (Methods). Each injection target is labelled according to the centre of the corresponding injection site, however, neighbouring regions are often infected (Supplementary Table 5). 
Reference atlas abbreviations are as follows: ACA, anterior cingulate area; ALM-c, contralateral anterior lateral motor area; CP-c, contralateral caudoputamen; CTX, cortex; GRN, gigantocellular reticular nucleus; IRN, intermediate reticular nucleus; LD, lateral dorsal nucleus of the thalamus; LGd, dorsal lateral geniculate complex; LP, lateral posterior nucleus of the thalamus; MD, mediodorsal nucleus of the thalamus; MOp, primary motor area; MY, medulla; ORBl-c, contralateral orbital area, lateral part; P, pons; PARN, parvicellular reticular nucleus; PERI, perirhinal area; PF, parafascicular nucleus; PG, pontine grey; PRNc, pontine reticular nucleus dorsal part; RSP, retrosplenial area; SC, superior colliculus; SCs, superior colliculus sensory related area; SSp, primary somatosensory area; SSs, supplementary somatosensory area; STR, striatum; TEC, tectum; TH, thalamus; VISp-c, contralateral primary visual area; ZI, zona incerta. c, Mapping of glutamatergic cells from ALM onto VISp glutamatergic cell types (grey arrows) using a random forest classifier trained on VISp types, and vice versa (blue-grey arrows; Methods). The fraction of cells that mapped with high confidence onto clusters from the other region is represented by the weight of the arrows. The best matched types were used in Fig. 2c. For this comparison, the 4,519 ALM cells and 7,352 VISp cells from glutamatergic types excluding CR–Lhx5 were used. d, e, RNA ISH from the Allen Mouse Brain Atlas25 for select markers confirms areal gene expression specificity. Images contain regions of interest from representative sections selected from individual whole-brain RNA ISH experiments. 
The number of whole-brain experiments per gene available in the Allen Brain Atlas is as follows: Wnt7b: n = 3 brains (1 sagittal, 2 coronal); Postn: n = 2 brains (1 sagittal, 1 coronal); Rxfp2: n = 2 brains (1 sagittal, 1 coronal); Chrna6: n = 3 brains (1 sagittal, 2 coronal); Stac: n = 2 brains (1 sagittal, 1 coronal); Scnn1a: n = 2 brains (1 sagittal, 1 coronal). Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).

  11. Extended Data Fig. 11 Validation of glutamatergic marker gene expression and cell type location by RNA FISH and projection subclasses by anterograde tracing.

    a–c, Single-molecule RNA FISH with RNAscope was used to validate marker expression and cell type distribution. a, Example image shows fluorescent spots that correspond to Rspo1 (cyan), Hsd11b1 (green) and Scnn1a (red) mRNA molecules in a 10-µm coronal VISp section. Scale bars are in micrometres. b, Example of processed data from a; white square in a corresponds to green square in b. Data processing steps involved creation of maximum projection of a montage of confocal z-stacks, identifying nuclei, quantifying the number of fluorescent spots, assignment of spots to each nucleus by CellProfiler67 and horizontal compression of the data to emphasize layer enrichment of examined cells. Each dot in the panel represents a cell plotted according to the detected nucleus position. Each cell was coloured according to the quantified fluorescent labelling. The first three panels show cells shaded according to the quantified number of spots per cell: Rspo1 and Scnn1a mRNAs are enriched in L4 and Hsd11b1 at the L4–L5 border. The fourth panel shows location of cells co-labelled with two or more probes and confirms scRNA-seq data: coexpression of Rspo1 and Scnn1a is expected in the L4–IT–VISp–Rspo1 type and coexpression of Hsd11b1 and Scnn1a in the L5–IT–VISp–Hsd11b1–Endou type. c, Condensed plots for six individual representative RNAscope experiments (Exp) in VISp for select glutamatergic cell type markers. The number of times (n) each experiment was performed independently to produce similar results is listed below each experiment. Layers were delineated based on cell density. Data in a and b correspond to Exp1. d, A schematic of laminar distributions of VISp glutamatergic types according to experiments in c corroborates previous evidence85 showing that L5 IT and L5 pyramidal tract cells are not well separated into 5a and 5b sublayers in the visual cortex, compared to the primary somatosensory cortex. 
Note that in ALM, even subtypes of L5b with different projections are well segregated into upper and lower sublayers (see accompanying study)21. e, To confirm the projection patterns of several transcriptomic types and examine them in greater detail, we performed anterograde tracing by Cre-dependent adeno-associated virus (AAV) in select Cre lines. We have previously characterized cell type labelling by these Cre lines (Extended Data Fig. 8). In one case, we used a Cre line with a similar pattern of expression with viral reporter in adulthood (Cux2-IRES-cre instead of Cux2-IRES-creERT2)86. f–k, Each image is a projection generated from a series of images obtained by TissueCyte 1000 from a representative anterograde tracing experiment; additional experiments are available at http://connectivity.brain-map.org/. f, L5 and L2/3 IT types as labelled in Tlx3-cre_PL56, and Cux2-IRES-cre lines display extensive long-range projections that cover all layers with preference for upper layers. h, By contrast, L6 IT types, labelled by the newly generated Penk-IRES2-cre-neo line (Extended Data Fig. 8), project to many of the same areas as the L2/3 and L5 IT types, but their projections are confined to lower layers in all areas examined, including higher visual areas and contralateral VISp and ALM. i, As revealed by the retro-seq data (Extended Data Fig. 10b), we confirm that the L4–IT–VISp–Rspo1 type, which represents most cells labelled by Nr5a1-cre (Extended Data Fig. 8), projects to contralateral VISp (Fig. 3d). Notably, this projection is observed only when injection is performed in the most anterior portion of VISp (compare left and right panels in i). j, L6b types labelled by Ctgf-2A-dgcre have sparse projections to the anterior cingulate area. k, Consistent with the retro-seq data, the near-projecting types labelled by Slc17a8-IRES2-cre (Extended Data Fig. 8) do not have long-distance projections, but only local projections and sparse projections to nearby areas. 
Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).

  12. Extended Data Fig. 12 Mapping of previously published scRNA-seq samples36 and Patch-seq samples38 to our dataset.

    scRNA-seq data obtained from sorted cells or cell content extracted by patching were mapped to our transcriptomic types using a centroid classifier (Methods). a, River plot showing the mapping of single-cell transcriptomes described previously36 (n = 584 cells) to our types. b, Alternative representation of the results in a, with blue discs representing the number of single-cell transcriptomes published previously36 onto a dendrogram of GABAergic cell types in this study. Each blue disc area represents the total number of single-cell transcriptomes mapped to one of our cell types. c, Expression of Calb2, Vip and Cck in single cells from our Sncg and Vip subclasses (n = 3,225). Transgenic recombinase lines based on these genes were used to label CCK basket cells (CCKC) and interneuron-selective cells (ISC) described previously36. Boxes highlight our types to which the CCK basket cells and interneuron-selective cells described previously36 were mapped to. CCK basket cells and interneuron-selective cells as defined previously36 each correspond to several of our transcriptomics types. d, Patch-seq data for 58 cells described previously38 were mapped to our transcriptomic types. Some cells could not be mapped with high confidence to terminal leaves of our taxonomy, and were therefore mapped to an internal node (cluster labels on the right that start with ‘n’ for node, see f). e, Constellation diagram showing corresponding types described previously38 and Lamp5 cell types from this study. Correspondences with the neurogliaform cells (NGC, orange) and single-bouquet cells (SBC, blue) defined previously38 are shown by the colours applied to the left side of each disc. f, Alternative representation of the result in d, with blue discs representing the number of single cell transcriptomes described previously38 mapped onto a dendrogram of GABAergic cell types in this study. 
Each blue disc area represents the total number of single cell transcriptomes mapped to a type (terminal leaf) or node in our taxonomy.

  13. Extended Data Fig. 13 A new tool, Vipr2-IRES2-Cre, for access to select transcriptomically defined cell types.

    a, Expression of select marker genes in our transcriptomically defined cell types (colour bar on top) represented as violin plots for all cluster-assigned cells (n = 23,822 cells; 133 clusters). Median values are black dots. Each row is scaled to the maximum expression value shown to the right of the plot (in CPM), and displayed on a linear scale. Venn diagrams represent expected cell type labelling by genetic tools described below. A new transgenic line, Vipr2-IRES2-cre, was created to label Pvalb–Vipr2 and Meis2 types. Unlike the previously developed Nkx2.1-creERT2 line87, this line does not require tamoxifen induction for chandelier cell labelling (corresponding to Pvalb–Vipr2). b–f, Specificity of this recombinase line was tested by scRNA-seq and immunohistochemistry. b, scRNA-seq and clustering with other cells revealed cell types labelled by two mouse genotypes on the right. Types are labelled on the bottom in standard colours. Only cell types with at least one cell labelled are displayed. n = 329 cells from Vipr2-IRES2-cre/wt;Ai14/wt; n = 38 cells from Vipr2-IRES2-cre/wt;Slc32a1-T2A-FlpO/wt;Ai65/wt. c–f, Representative images of immunohistochemistry results; each image is representative of n = 2 experimental animals with stated genotypes. High-magnification images show tdT labelling in black, anti-PVALB in green and anti-tdTomato in red. Tissue sections (100 µm) were stained with anti-PVALB, anti-dsRed (labels tdTomato), and DAPI. Images are maximum intensity projections of confocal z-stacks. Scale bars are in micrometres. In Vipr2-IRES2-cre/wt;Ai14/wt mice (b, grey bars), apart from the expected labelling of chandelier cells (Pvalb–Vipr2), many non-neuronal cells are labelled (especially, Astro–Aqp4 and SMC–Aoc3 types, panel c). d, To improve labelling specificity, we created and examined Vipr2-IRES2-cre/wt;Slc32a1-T2A-FlpO/wt;Ai65/wt mice. 
As expected, labelling was more specific, now confined to chandelier cells, basket cells and Meis2 interneurons (dark bars in b and morphologically identified types in d). Notably, the chandelier cells within VISp did not express PVALB protein (d, panel 1). e, Genetic intersection of Vipr2 and Pvalb expression labelled cells with chandelier morphology corresponding to Pvalb–Vipr2 in ALM. f, However, Vipr2-IRES2-cre/wt;Pvalb-T2A-FlpO/wt;Ai65/wt did not label chandelier cells in VISp, but PVALB+ cells of basket morphology. This unexpected labelling may reflect historical expression of Vipr2 in a subset of other Pvalb cells or low adult Vipr2 expression that is not detected by scRNA-seq. In VISp-containing sections, some chandelier cells are labelled, but are observed outside of VISp (f, panel 3). RSP, retrosplenial cortex.

  14. Extended Data Fig. 14 Discreteness and continuity in cell type definition.

    a, Within the L4–IT–VISp–Rspo1 type, n = 1,442 cells were arranged according to graded expression of 35 genes (only 26 are shown). b, Mapping of n = 394 L4 and L5 IT cells from our previous study20 to cell types in this study. The three L4 clusters from our previous study20 map primarily to the L4–IT–VISp–Rspo1 type. c, Position of L4–IT–VISp–Rspo1 within the VISp glutamatergic constellation diagram. d, Computational split of the L4–IT–VISp–Rspo1 type into three parts along the continuum. For comparison, equivalent plots indicate better separation between this cluster and select L5 IT clusters. n = 1,404 cells for L4–IT–VISp–Rspo1, 215 for L5–IT–Hsd11b1–Endou, and 435 for L5–IT–VISp–Batf3. e, Comparison of gene expression differences among the parts of L4–IT–VISp–Rspo1, as well as this type with select L5 IT types. By this measure, similar differences are detected between the two ‘end’ parts of L4–IT–VISp–Rspo1, and between L4–IT–VISp–Rspo1 and L5–IT–Hsd11b1–Endou. f, Constellation diagrams for n = 2,880 Sst subclass cells reflect clusters defined at three different deScores corresponding to clustering stringency: low, standard and high, starting with the same number of genes (n = 30,862, Methods). For low stringency, newly split clusters are enclosed by dashed contours. For high stringency, arrows indicate dominant merges.

  15. Extended Data Fig. 15 Cell types with activity-dependent transcriptomic signatures.

    a, River plots representing the mapping of 14,205 VISp cells from this study to a previously published dataset40 using a centroid classifier (Methods). b, Violin plots show the distribution of log2-scaled and centred average expression of late-response genes (LRGs) and early-response genes (ERGs) in select cell types from this study that are in a subclass with at least one type that is significantly correlated with LRG or ERG expression. From the published dataset40, only the clusters with expression of at least four ERGs and LRGs were included. In total, values for n = 6,956 cells from our study are displayed in the violin plots. We performed a two-sided t-test to assess enrichment or depletion of expression of LRGs or ERGs, and defined significant values as P < 0.01 after correction for multiple hypotheses using the Holm method, and average fold change greater than 2. Significant P values for enrichment and depletion are displayed above and below each row, respectively. Complete statistics for all t-tests is included in Supplementary Table 11. c, Heat maps of log2-scaled and centred average gene expression for ERGs and LRGs in select cell types from our study and from the published dataset40.

  16. Extended Data Fig. 16 Areal gene expression differences in GABAergic types.

    a–c, t-SNE plots for Sst and Pvalb (n = 5,113 cells, generated using 1,244 differentially expressed genes), Lamp5, Serpinf1, Sncg and Vip (n = 5,365 cells, generated using 1,184 differentially expressed genes) and glutamatergic types (n = 11,905 cells, generated using 1,984 differentially expressed genes) showing cells labelled in cluster colours on the left and area-of-origin on the right. Glutamatergic cells show the most marked segregation by area of origin. Sst and Pvalb types show small but noticeable area-specific segregation, which is even less obvious for the Lamp5, Sncg and Vip types. d, Areas from the Sst and Pvalb t-SNE plots were enlarged to show partial segregation of cells within two Sst types by area of origin. e, Layer distribution and violin plots for marker genes Sst, Crh and Calb2 in cells from the same transcriptomic type divided by area. The number of cells in each type and region are shown below each column (n = 118 cells for ALM Sst–Calb2–Pdlim5; 120 for VISp Sst–Calb2–Pdlim5; 62 for ALM Sst–Esm1; and 69 for VISp Sst–Esm1). Violin plots are shown on a log10 scale, scaled to the maximum value for each gene (right of the plot in CPM); black dots are medians. In ALM, triple positive Sst+Crh+Calb2+ cells belong to Sst–Calb2–Pdlim5 type and are enriched in upper layers. In VISp, this same type does not express Crh, but a different type may account for Sst+Crh+Calb2+ cells: Sst–Esm1, and would be expected in lower layers. f, g, RNA FISH by RNAscope for Sst (green), Crh (red) and Calb2 (cyan) in ALM and VISp. Scale bars are in micrometres. In agreement with e, we find triple-positive cells by RNA FISH in ALM in upper layers, and in VISp in lower layers. Images are representative of a single experiment including multiple tissue sections from ALM and VISp. Brain diagrams were derived from the Allen Mouse Brain Reference Atlas (version 2 (2011); downloaded from https://brain-map.org/api/index.html).
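
The heat maps in Extended Data Fig. 5 report a 25% trimmed mean of expression per cluster. The following is a minimal sketch of that summary statistic, not the authors' code. It assumes that "25% trimmed" means discarding the lowest 25% and the highest 25% of values before averaging; the legend does not state this convention. The function name `trimmed_mean` and the CPM values are hypothetical.

```python
def trimmed_mean(values, trim_fraction=0.25):
    """Mean of `values` after dropping `trim_fraction` of them from EACH end.

    Assumption: symmetric trimming; the legend does not specify the convention.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    if len(values) == 0:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values cut per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical CPM values for one gene in one cluster, with one outlier cell.
cpm = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 500.0]
print(trimmed_mean(cpm))      # the outlier is trimmed away
print(sum(cpm) / len(cpm))    # the plain mean is dominated by the outlier
```

A trimmed mean of this kind keeps a few unusually high or low cells from dominating the per-cluster value shown in a heat map.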
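
The significance rule in Extended Data Fig. 15b combines a Holm-corrected P value below 0.01 with an average fold change greater than 2. The sketch below illustrates that rule under stated assumptions and is not the authors' pipeline: the raw P values are taken as given (the t-test itself is not shown), the fold change is supplied on a log2 scale so that "greater than 2" becomes an absolute log2 value above 1 in either direction (enrichment or depletion), and the function names and input numbers are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest P value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted values along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def significant(p_values, log2_fold_changes, alpha=0.01, min_abs_log2_fc=1.0):
    """Flag tests with Holm-adjusted P < alpha and |log2 fold change| > threshold.

    Assumption: fold change is symmetric, so depletion counts as well.
    """
    adjusted = holm_adjust(p_values)
    return [
        p < alpha and abs(fc) > min_abs_log2_fc
        for p, fc in zip(adjusted, log2_fold_changes)
    ]


# Hypothetical raw P values and log2 fold changes for four cell types.
raw_p = [0.001, 0.004, 0.03, 0.0005]
log2_fc = [1.5, 0.4, 2.0, -1.2]
print(holm_adjust(raw_p))
print(significant(raw_p, log2_fc))
```

In this toy input, the second test passes the P value threshold but fails the fold-change threshold, and the third fails the P value threshold after correction, which shows why both criteria are applied together.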

Supplementary information

  1. Supplementary Information

    This file contains the full legends for Supplementary Tables 1-11.

  2. Reporting Summary

  3. Supplementary Tables

    This zipped file contains Supplementary Tables 1-11 – see Supplementary Information document for descriptions.
