## Main

The neocortex is responsible for many aspects of cognitive function and is affected in numerous neurological and neuropsychiatric diseases. A prominent feature of the neocortex is its disproportionate expansion in surface area, volume and neuron number in large-brained mammals when compared to the expansion measured in subcortical structures1,2. Primate neocortex shows an increase in upper or supragranular layers6, whose glutamatergic (excitatory pyramidal) neurons make connections to other neocortical and telencephalic brain regions7.

This supragranular expansion, driven primarily by changes in gene regulation8, includes increased cellular diversity. In rodents the pyramidal neurons of the supragranular neocortex (called layer 2/3 or L2/3 owing to lack of distinguishing boundaries) are relatively homogeneous within L2/37,9,10 and between cortical regions7 (although there is variation in intracortical projection targets11,12). By contrast, primates show clear heterogeneity in layers 2 and 3 (L2 and L3) neurons in density, size, morphology, and electrophysiology as a function of cortical depth and projection target1,13,14,15,16,17,18,19,20. For example, h-channel (also called HCN channel) function shows marked variation by cortical depth, probably facilitating faithful transmission of signals for neurons with long apical dendrites16. Very large neurons in deeper L3 of non-human primates that express the non-phosphorylated form of heavy chain neurofilament protein and are immunoreactive to the antibody SMI-32 (SMI-32ir) preferentially send long-range corticocortical projections21.  Macaque L3 neuron size, dendritic arborization, input resistance and firing patterns vary substantially by cortical region11,22. Two main human L3 pyramidal neuron types have been described that differ in dendritic morphology (slim- versus profuse-tufted17),  and the basal dendritic arbor size of human neocortical pyramidal neurons increases from caudal to rostral neocortex23,24. Finally, the SMI-32ir neurons in deep L3 are preferentially vulnerable to early degeneration and are dramatically reduced in late-stage Alzheimer’s disease4,5.

Single-cell or single-nucleus RNA sequencing (RNA-seq) provides a powerful strategy to quantitatively define neuronal diversity and compare cells across species3,25,26,27,28,29,30, and demonstrates an increased diversity of supragranular glutamatergic intratelencephalic-projecting (IT) neurons in human compared to mouse3. Human L2 and L3 contain three abundant transcriptomic types (t-types), which map to the three t-types found in mouse L2/33, plus additional glutamatergic neuron types in deep L3 not found in mouse supragranular neocortex. We developed a robust Patch-seq31,32 platform for human neocortical tissues from 90 neurosurgical resections to directly characterize the physiological and morphological properties of supragranular neurons and test whether the increased transcriptomic diversity of human glutamatergic IT types is mirrored in other cellular properties.

## Results

### Greater neuronal diversity in human neocortex

We used histology to compare the density and size of neurons from the human middle temporal gyrus (MTG), the most accessible region from neurosurgery, with two neocortical areas in mouse: the extensively characterized primary visual cortex (VISp)25,26 and the temporal association area (TEa), which is often used as a comparator for MTG16,19,33.  Human L2 and L3 exhibited significant differences in the size and density of neurons as a function of cortical depth compared with L2/3 in both mouse regions (Fig. 1a). The density of human neurons was graded (Fig. 1b, left), with the highest density in L2, decreasing by half to a minimum in mid L3 (Fig. 1b, right).  By contrast, supragranular layers of both mouse regions showed sixfold higher neuronal density than in the human regions, with homogeneous distribution by depth.  Similarly, the average cross-sectional area of neuron somata doubled from L2 to deep L3 in human (with the largest exceeding 850 µm2) but was uniform in mouse L2/3 (Fig. 1c, left).  Furthermore, variation in deep L3 soma size was fourfold higher in human versus mouse (Fig. 1c, right), which was clearly visible in human histological sections with large and small neurons comingling (Fig. 1a).

In a previous RNA-seq study3, molecularly-defined glutamatergic t-types were also found to be more diverse, with five t-types in human supragranular MTG (with LTK, GLP2R, FREM3, CARM1P1 and COL22A1 types) versus three t-types each in mouse VISp and ALM.  As reported3, the predominant human t-type FREM3 showed an extended gradient that, with more lenient clustering criteria, could be divided into subtypes that varied by depth (Extended Data Fig. 1a), whereas other t-types in human and mouse showed tighter clustering (Fig. 1d–f). Within-type heterogeneity was high in FREM3, and to a lesser extent in the CARM1P1 type, with greater homogeneity seen for all other mouse and human t-types (Fig. 1g).  The distinctness between clusters, measured as the number of differentially expressed genes between pairs of types, was highest for the deep L3 CARM1P1 and COL22A1 types (Fig. 1h). While these species differences may be partially owing to areal variation, at minimum, they are consistent across two very different mouse brain areas.

### Patch-seq on human neurosurgical tissues

Previous studies16,18,19,34,35,36 have shown that neurosurgically excised human neocortical tissues can be used for slice patch clamp studies within about 12 h of resection (and later37), and that human MTG t-types are consistently identified by single-nucleus RNA sequencing (snRNA-seq) in post mortem and living resected tissue.  Building on this, we developed a robust technology platform for Patch-seq31,32,38 on acute slice preparations from human neurosurgically resected neocortical tissues (Extended Data Fig. 1, Supplementary Table 1) to characterize the electrophysiological, morphological and transcriptomic properties of living human L2-3 neurons using standardized stimuli, biocytin filling and RNA-seq analysis.

To determine whether resected tissues are inherently pathological, we quantified trends in neuronal properties for six histological markers of pathology (Methods). Surprisingly, on a scale of 0 (neurotypical) to 3 (most pathological), most cases had average scores below 1.0 (Extended Data Fig. 1c).  These markers had very low correlation with each other, with only Ki-67 (dividing cells) and Nissl (cellularity) being modestly correlated (Extended Data Fig. 1d). Among 18 features compared between low (0–1) and high (2–3) scored cases for GFAP and IBA1 (the only markers not skewed heavily towards 0, Extended Data Fig. 1c), only resting membrane potential (P = 0.041) and first-spike latency (P = 0.042) differed between IBA1 groups (Fig. 1i). However, no effects of pathology were significant when including neuron depth as a factor (P > 0.17 for all features), indicating that these differences can be explained by imbalanced sampling across cortical depth for the few cases marked as overtly pathological by IBA1.  We also did not observe obvious relationships between electrophysiological features and pathology, age or sex (Extended Data Fig. 2).

Human Patch-seq cells were mapped to the MTG reference classification3 and assigned t-types using the integration and label transfer classification workflow in Seurat (V3) (Methods)39,40.  After alignment, Patch-seq cells intermixed with dissociated cells and nuclei in a low-dimensional uniform manifold approximation and projection (UMAP) space (Fig. 1j), and cells assigned to different t-types were generally colocalized in distinct locations in this space, indicating good agreement between platforms.  T-types showed discrete sub-laminar distributions that were consistent between Patch-seq biocytin staining and cellular marker distributions by multiplex fluorescence in situ hybridization (mFISH) (Fig. 1k, l). These results indicate that Patch-seq data are consistent with reference transcriptomic classifications from dissociated nuclei, and that mapping is robust despite many potential sources of technical noise and uncontrolled variation in the human tissue samples.

A total of 385 neurons that passed transcriptomic data quality control mapped with higher confidence to the five supragranular human glutamatergic t-types than to any other neuron types.  Most neurons in the dataset preserved sufficient labelling to determine the relative depth of the soma with respect to the pia and the border between L2 and L3. A majority of neurons (n = 283) also produced sufficiently complete recordings to calculate electrophysiological features. The subset of neurons (n = 109) with sufficient biocytin fills and intact apical dendrites were imaged at high resolution, then subsequently manually reconstructed.

### Diversity of human excitatory t-types

In the aggregate, properties of human t-types sampled by Patch-seq were consistent with previous reports of slice physiology recordings from L2-3 pyramidal neurons16,17.  However, with the transcriptome as the basis for classification, t-types showed clear qualitative morphoelectric differences (Fig. 2, Supplementary Note). One of the most obvious differences between human t-types was dendrite size, which varied markedly across the large thickness of human supragranular neocortex for t-types found in different layers with apical dendrites that extend to L1 (Fig. 2a, Extended Data Fig. 4). T-types also varied in their relative apical versus basal lengths as well as both passive, single action potential and sustained firing properties (Fig. 2a, b, Supplementary Note), as visualized by UMAP projections of electrophysiological and morphological data (Fig. 2c, d). The COL22A1 type shows strong separation, whereas the other types vary more continuously with some overlap. In particular, the morphoelectric properties of the FREM3 type span the range of, and partially overlap with, LTK, GLP2R and CARM1P1 types. A sparse principal component analysis (SPCA) projection of electrophysiological features (Fig. 2c, right) shows small groups of features that determine the two axes of greatest variability. The first (x) axis) is dominated by features related to passive membrane properties, including membrane time constant, input resistance and rheobase. Variability along the second (y) axis) shows differences in action potential shape. Similarly, an SPCA projection of morphological features (Fig. 2d, right) shows that features related to the total size of the dendrites (length, volume and surface area) are most prominent (x-axis), whereas additional features of the apical dendrites (vertical extent and bias) further capture the distinctness of the COL22A1 type and the continuum within the FREM3 type (y-axis). Box plots of individual features from the SPCA space emphasize that most, but not all, pairs of t-types differ significantly (P < 0.05, false discovery rate (FDR)-corrected Mann–Whitney test).

Although comparisons of morphoelectric neuronal properties between human MTG and mouse neocortex are problematic given the difficulties in identifying homologous regions and the high variance of these properties across regions, it is nevertheless valuable to assess whether the homologous mouse t-types are similarly phenotypically differentiated from one another in any cortical region. We analysed L2/3 pyramidal neurons from mouse VISp using the same Patch-seq platform41. Included were 120 neurons with high-quality electrophysiology and transcriptome data mapping to the three mouse L2/3 glutamatergic t-types, and 60 neurons with data in all three modalities—these neurons had heterogeneous electrophysiological and morphological properties but were more like one another than the human t-types (Extended Data Figs. 5, 6, Supplementary Note).

### Variation of the FREM3 type by depth

Graded change in properties by depth is a prominent organizational principle for the FREM3 t-type3 (Fig. 1d), where a transcriptomic UMAP projection shows a strong relationship between soma depth and gene expression (Fig. 3a).  Similarly, multiple electrophysiological and morphological features (apical and basal dendrites) vary continuously with depth (Fig. 3b, c) in agreement with other studies16,17,19. FREM3 neurons span the full depth of L2-3 and send apical dendrites more than 1 mm to L1 (Fig. 1l, 3b). Although features such as apical dendrite height necessarily correlate with the distance from the soma to L1, many independent features were also strongly correlated, including the maximum length of basal dendrites (Fig. 3b) and soma radius (Extended Data Fig. 7), with 37 of 58 morphological features correlated overall (FDR < 0.05; Supplementary Table 2). Graded electrophysiological properties also exist (Fig. 3c), with 9 of 18 measured features significantly correlated with depth (FDR < 0.05; Supplementary Table 2). The strongest correlations were observed for an increase in sag and action potential upstroke/downstroke ratio with depth and a decrease in action potential latency at rheobase (Fig. 3c). We found that the expression of 790 genes was correlated with depth (FDR < 0.05). Gene ontology (GO) enrichment analysis on this gene set (Fig. 3d, Supplementary Table 2) predicts functional variation across graded neuronal phenotypes based on enrichment of genes associated with synaptic transmission, projection morphogenesis and cell migration (P = 3.14 × 1013, 7.54 × 1012, 5.9 × 10−9).

### Deep t-types are phenotypically distinct

Deep L3 consists of a highly diverse set of putative IT projection types, including CARM1P1, COL22A1, and deep FREM3 (defined by the neuronal density minimum in L2-3) (Fig. 1b). Compared with superficial types, deep neurons have larger apical dendrites, reflecting deeper soma locations, but a surprising lack of apical dendrite in L1 (Fig. 4a, bottom). Their axons also innervate L4 to a greater extent than superficial t-types (Extended Data Fig. 8). They differ electrophysiologically from superficial types with higher sag, action potential upstroke/downstroke ratio and initial instantaneous firing rate. This latter feature, reflecting a vigorous response to the onset of stimulation with a burst of action potentials followed by strong firing rate adaptation, may correspond to bursting phenotypes observed in L3 neurons in non-human primates22.

The increased transcriptomic distinctness of deep types (Fig. 1h) is also reflected in their electrophysiology and morphological features. While every feature that differentiates between superficial neuron types also differentiates between deep types, several features uniquely distinguish among deep types, including resting membrane potential, membrane time constant and action potential upstroke, as well as the width and complexity (for example, maximum branch order) of the basal dendrites (Extended Data Fig. 9). This distinctiveness is also seen in the performance of a logistic regression classifier trained to predict t-type from electrophysiological features (Fig. 4b): superficial and deep types group separately, with higher discriminability among the deep types and the FREM3 type forming an intermediate with similarity to both groups.

To understand transcriptional differences that may be predictive of phenotypic differences between CARM1P1, COL22A1 and deep FREM3 types, we used genesorteR42 with slightly relaxed parameters (quant = 0.7) to identify genes selective for one or two of these three t-types and found 219 such marker genes (Extended Data Fig. 10).  Differences in morphoelectric properties of the three deep L3 t-types were reflected in genes enriched for GO terms associated with neuronal connectivity, structure and synaptic signaling, including axon (P = 3.5 × 10−6; Bonferroni corrected), synapse (P = 5.3 × 10−5), calcium ion binding (P = 0.008) and extracellular matrix organization (P = 0.00002).  Numerous genes involved in neuronal structure and function show specific expression in one or more deep types (Fig 4d), including those involved in dendritic branching (COBLL143), neuron excitability (KCNK244), long-term potentiation (PHLDB245), and cannabinoid signaling (CNR1).

SMI-32 (encoded by the NEFH gene21)-immunolabeled neurons preferentially make long-range ipsilateral projections21 in L3 of the macaque temporal neocortex, and show selective vulnerability in Alzheimer’s disease4,5NEFH showed increased expression in deep FREM3 and CARM1P1 types relative to COL22A1 and more superficial t-types (Fig. 4d).  Similarly, combined SMI-32 immunoreactivity and mFISH for cellular markers showed that the large FREM3 and CARM1P1 neurons were SMI-32-immunoreactive, whereas COL22A1 neurons were not (Fig. 4e).  This finding creates a putative link between transcriptomically-defined cell types, long-range projection target specificity and vulnerable neuron populations in Alzheimer’s disease.

## Discussion

Human neocortical cellular diversity has been difficult to define quantitatively in part owing to underpowered analyses with low-throughput techniques because of limited tissue access, high variation across individuals and the absence of cell-type-selective tools.  snRNA-seq can be applied to any species including human, where it forms the basis of a quantitative hierarchical cellular taxonomy that mirrors many aspects of cellular cytoarchitecture, function and developmental origins.  Here we demonstrate using triple-modality Patch-seq analysis of human cortical neurosurgical resections that this transcriptomic classification is a Rosetta stone—that is, it is predictive of the morphological and electrophysiological diversity of supragranular neocortical neurons both for discrete and for continuous features.  Specifically, we find the five glutamatergic t-types are distinct from one another, while the prominent transcriptomic gradient in the FREM3 type is reflected in morphoelectric properties.  Consistent with other studies using tissues distal to sites of obvious pathology, we find little evidence for disease- or pathology-related effects on electrophysiology or cell classification. This stands in contrast to studies that have used human tissues from pathological foci to characterize disease-related phenomena46,47. Remarkably, the stereotypy reported here holds true across 90 tissue donors, despite many uncontrolled axes of variation, indicating that the cellular blueprint is highly robust across individuals even in the context of disease.

Comparative analyses of transcriptome data from mouse, marmoset monkey and human strongly predicts that the human supragranular glutamatergic t-types belong to the IT subclass3,28, and that the more superficially located LTKGLP2R and FREM3 types are homologous to the mouse supragranular IT types.  By contrast, CARM1P1 and COL22A1 types do not have homologous types in mouse supragranular neocortex; rather, they are transcriptomically most like infragranular mouse IT types. Notably, the COL22A1 type shares physiological (high input resistance and increased excitability) and morphological (simple, less branched dendrites including apical dendrites that often terminate before reaching L1) features with human L5 IT neurons48. This could reflect species differences in cell migration, or, more probably, an evolutionary co-option of a pre-existing IT transcriptional program and an extended developmental program.  In either scenario, the outcome is increased neuronal diversity in human deep L3 that may have important functional implications.  The enormous CARM1P1 and deep FREM3 neurons appear specialized for integration of local inputs.  They both have complex, extensive (>0.5 mm) basal dendrites that are likely to integrate information across multiple adjacent minicolumns; by contrast, their apical dendrites sparsely enter layer 1, where feedback information is received from other cortical areas.  Of note, the basal dendrites of these neurons undergo a substantial period of growth in early childhood when environmental factors have a crucial role in brain development49CARM1P1 and deep FREM3 neurons are SMI-32-immunoreactive and NEFH-expressing, unlike COL22A1 neurons, indicating that as in non-human primate they make long-range, predominantly ipsilateral projections compared with more locally projecting neurons.  Increased local integration of deep L3 neurons supports an emerging hypothesis that in primates, superficial and deep parts of supragranular neocortex comprise functionally independent information streams, with feedforward projections originating in deep L3 neurons, whereas superficial neurons receive feedback information50. Finally, most deep L3 cells exhibit burst firing, a feature that optimizes information transfer51. Therefore, the increased cellular diversity in deep L3 may enhance efficiency of feedforward signal processing connecting distant regions of the expanded primate neocortex.

Finally, this cell classification may have potential for understanding the cellular locus of disease. SMI-32-immunoreative L3 magnopyramidal neurons are depleted in Alzheimer’s disease progression4,5, indicating selective vulnerability of the largest long-range association neurons and consequent disruption of cortical networks.  Our results show that SMI-32 labelling maps onto the transcriptomic, morphological and physiological classification, labelling some large deep L3 types but not others.  This refined morpho-electro-transcriptomic cellular framework may serve as a new roadmap for future studies investigating selective neuron disease vulnerability and resistance.

## Methods

Detailed descriptions of all experimental data collection methods in the form of technical white papers can also be found under ‘Documentation’ at http://celltypes.brain-map.org.

### Human tissue acquisition

Surgical specimens were obtained from local hospitals (Harborview Medical Center, Swedish Medical Center and University of Washington Medical Center) in collaboration with local neurosurgeons. All patients (Supplementary Table 1) provided informed consent and experimental procedures were approved by hospital institute review boards before commencing the study. Tissue was placed in slicing artificial cerebral spinal fluid (ACSF) as soon as possible following resection. Slicing ACSF comprised52 (in mM): 92 N-methyl-d-glucamine chloride (NMDG-Cl), 2.5 KCl, 1.2 NaH2PO4, 30 NaHCO3, 20 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), 25 d-glucose, 2 thiourea, 5 sodium-l-ascorbate, 3 sodium pyruvate, 0.5 CaCl2.4H2O and 10 MgSO4.7H2O. Before use, the solution was equilibrated with 95% O2, 5% CO2 and the pH was adjusted to 7.3 by addition of 5N HCl solution.  Osmolality was verified to be between 295–305 mOsm kg−1.  Human surgical tissue specimens were immediately transported (15–35 min) from the hospital site to the laboratory for futher processing.

### Mouse breeding and husbandry

All procedures were carried out in accordance with the Institutional Animal Care and Use Committee at the Allen Institute for Brain Science. Animals (<5 mice per cage) were provided food and water ad libitum and were maintained on a regular 12-h light:dark cycle; rooms were kept at 21.1 °C and 45–70% humidity. Mice were maintained on the C57BL/6J background, and newly received or generated transgenic lines were backcrossed to C57BL/6J. Experimental animals were heterozygous for the recombinase transgenes and the reporter transgenes.

### Tissue processing

For mouse experiments, male and females were used between the ages of postnatal day (P)45 and P70 were anaesthetized with 5% isoflurane and intracardially perfused with 25 or 50 ml of 0–4 °C slicing ACSF. Human or mouse acute brain slices (350 μm) were prepared with a Compresstome VF-300 (Precisionary Instruments) or VT1200S (Leica Biosystems) vibrating microtome modified for block-face image acquisition (Mako G125B PoE camera with custom integrated software) before each section to aid in registration to the common reference atlas. Brains or tissue blocks were mounted for slicing with the optimal orientation for preserving intactness of apical dendrites of neocortical pyramidal neurons.

Slices were transferred to an oxygenated and warmed (34 °C) slicing ACSF for 10 min, then transferred to room temperature holding ACSF of the composition52 (in mM): 92 NaCl, 2.5 KCl, 1.2 NaH2PO4, 30 NaHCO3, 20 HEPES, 25 d-glucose, 2 thiourea, 5 sodium-l-ascorbate, 3 sodium pyruvate, 2 CaCl2.4H2O and 2 MgSO4.7H2O for the remainder of the day until transferred for patch clamp recordings. Before use, the solution was equilibrated with 95% O2, 5% CO2 and the pH was adjusted to 7.3 using NaOH. Osmolality was verified to be between 295–305 mOsm kg−1.

### Patch clamp recording

Slices were bathed in warm (32–34 °C) recording ACSF containing the following (in mM): 126 NaCl, 2.5 KCl, 1.25 NaH2PO4, 26 NaHCO3, 12.5 d-glucose, 2 CaCl2.4H2O and 2 MgSO4.7H2O (pH 7.3), continuously bubbled with 95% O2 and 5% CO2. The bath solution contained blockers of fast glutamatergic (1 mM kynurenic acid) and GABAergic synaptic transmission (0.1 mM picrotoxin). Thick-walled borosilicate glass (Warner Instruments, G150F-3) electrodes were manufactured (Narishige PC-10) with a resistance of 4–5 MΩ. Before recording, the electrodes were filled with ~1.0–1.5 µl of internal solution with biocytin (110 mM potassium gluconate, 10.0 mM HEPES, 0.2 mM ethylene glycol-bis (2-aminoethylether)-N,N,N′,N′-tetraacetic acid, 4 mM potassium chloride, 0.3 mM guanosine 5′-triphosphate sodium salt hydrate, 10 mM phosphocreatine disodium salt hydrate, 1 mM adenosine 5′-triphosphate magnesium salt, 20 µg ml−1 glycogen, 0.5 U µl−1 RNAse inhibitor (Takara, 2313A)  and 0.5% biocytin (Sigma B4261), pH 7.3). The pipette was mounted on a Multiclamp 700B amplifier headstage (Molecular Devices) fixed to a micromanipulator (PatchStar, Scientifica).

The composition of bath and internal solution as well as preparation methods were made to maximize the tissue quality, to align with solution compositions typically used in the field (to maximize the chance of comparison to previous studies), and modified to reduce RNAse activity and ensure maximal recovery of mRNA content.

Electrophysiology signals were recorded using an ITC-18 Data Acquisition Interface (HEKA). Commands were generated, signals were processed and amplifier metadata were acquired using MIES (https://github.com/AllenInstitute/MIES/), written in Igor Pro (Wavemetrics). Data were filtered (Bessel) at 10 kHz and digitized at 50 kHz. Data were reported uncorrected for the measured (Neher 1992) –14 mV liquid junction potential between the electrode and bath solutions.

Before data collection, all surfaces, equipment and materials were thoroughly cleaned in the following manner: a wipe down with DNA away (Thermo Scientific), RNAse Zap (Sigma-Aldrich) and finally with nuclease-free water.

For human slices, pyramidal shaped neurons in L2-3 were targeted. For mouse experiments, pyramidal neurons in L2/3 were targeted, either tdTomato pyramidal neurons when recording from a transgenic line that labels interneurons, or tdTomato+ neurons when recording from a line that labels different populations of L2/3 glutamatergic neurons, specifically Oxtr-T2A-Cre and Penk-IRES2-Cre-neo, each crossed to the Ai14 tsTomato reporter line.

After formation of a stable seal and break-in, the resting membrane potential of the neuron was recorded (typically within the first minute). A bias current was injected, either manually or automatically using algorithms within the MIES data acquisition package, for the remainder of the experiment to maintain that initial resting membrane potential. Bias currents remained stable for a minimum of 1 s before each stimulus current injection.

To be included in analysis, a cell needed to have a >1 GΩ seal recorded before break-in and an initial access resistance <20 MΩ and <15% of the input resistance (Rinput). To stay below this access resistance cut-off, cells with a low input resistance were successfully targeted with larger electrodes. For an individual sweep to be included, the following criteria were applied: (1) the bridge balance was <20 MΩ and <15% of the Rinput; (2) bias (leak) current 0 ± 100 pA; and (3) root mean square noise measurements in a short window (1.5 ms, to gauge high frequency noise) and longer window (500 ms, to measure patch instability) <0.07 mV and 0.5 mV, respectively.

Each cell was recorded using a standardized stimulus paradigm, including square pulses, ramps and noisy current injections, with the goal of extracting features that could be compared across cells, rather than tailoring each stimulus to the physiological input of that neuron.

Upon completion of electrophysiological examination, the pipette was centered on the soma or placed near the nucleus (if visible). A small amount of negative pressure was applied (~−30 mbar) to begin cytosol extraction and attract the nucleus to the tip of pipette. After approximately one minute, the soma had visibly shrunk and/or the nucleus was near the tip of the pipette. While maintaining the negative pressure, the pipette was slowly retracted in the x and z direction. Slow, continuous movement was maintained while monitoring pipette seal. Once the pipette seal reached >1 GΩ and the nucleus was visible on the tip of the pipette, the speed was increased to remove the pipette from the slice. The pipette containing internal solution, cytosol and nucleus was removed from pipette holder and contents were expelled into a PCR tube containing the lysis buffer (Takara, 634894).

### cDNA amplification and library construction

We performed all steps of RNA-processing and sequencing as described for mouse Patch-seq cells41. We used the SMART-Seq v4 Ultra Low Input RNA Kit for Sequencing (Takara, 634894) to reverse transcribe poly(A) RNA and amplify full-length cDNA according to the manufacturer’s instructions. We performed reverse transcription and cDNA amplification for 20 PCR cycles in 0.65 ml tubes, in sets of 88 tubes at a time. At least 1 control 8-strip was used per amplification set, which contained 4 wells without cells and 4 wells with 10 pg control RNA. Control RNA was either Universal Human RNA (UHR) (Takara 636538) or control RNA provided in the SMART- Seq v4 kit. All samples proceeded through Nextera XT DNA Library Preparation (Illumina FC-131-1096) using either Nextera XT Index Kit V2 Sets A-D(FC-131-2001,2002,2003,2004) or custom dual-indexes provided by Integrated DNA Technologies (IDT). Nextera XT DNA Library prep was performed according to manufacturer’s instructions except that the volumes of all reagents including cDNA input were decreased to 0.2× by volume.  Each sample was sequenced to approximately 1 million reads.

### RNA-seq data processing

Fifty-base-pair paired-end reads were aligned to GRCh38.p2 using a RefSeq annotation gff file retrieved from NCBI on 11 December 2015 for human and to GRCm38 (mm10) using a RefSeq annotation gff file retrieved from NCBI on 18 January 2016 for mouse (https://www.ncbi.nlm.nih.gov/genome/annotation_euk/all/). Sequence alignment was performed using STAR v2.5.353 in two pass Mode. PCR duplicates were masked and removed using STAR option bamRemoveDuplicates. Only uniquely aligned reads were used for gene quantification. Gene counts were computed using the R Genomic Alignments package summarizeOverlaps function using IntersectionNotEmpty mode for exonic and intronic regions separately54.  Expression levels were calculated as counts of exonic plus intronic reads.  For most analyses, log2(counts per million (CPM) + 1)-transformed values were used.

### Morphological reconstruction

#### Biocytin histology

A horseradish peroxidase (HRP) enzyme reaction using diaminobenzidine (DAB) as the chromogen was used to visualize the filled cells after electrophysiological recording, and 4,6-diamidino-2-phenylindole (DAPI) stain was used to identify cortical layers as described previously9.

#### Imaging of biocytin-labelled neurons

Mounted sections were imaged as described previously9. In brief, operators captured images on an upright AxioImager Z2 microscope (Zeiss, Germany) equipped with an Axiocam 506 monochrome camera and 0.63× Optivar lens. Two-dimensional tiled overview images were captured with a 20× objective lens (Zeiss Plan-NEOFLUAR 20×/0.5) in bright-field transmission and fluorescence channels. Tiled image stacks of individual cells were acquired at higher resolution in the transmission channel only for the purpose of automated and manual reconstruction. Light was transmitted using an oil-immersion condenser (1.4 NA). High-resolution stacks were captured with a 63× objective lens (Zeiss Plan-Apochromat 63×/1.4 Oil or Zeiss LD LCI Plan-Apochromat 63x/1.2 Imm Corr) at an interval of 0.28 µm (1.4 NA objective) or 0.44 µm (1.2 NA objective) along the z axis. Tiled images were stitched in ZEN software and exported as single-plane TIFF files.

#### Morphological reconstruction

Reconstructions of the dendrites and the full axon were generated for a subset of neurons with good quality transcriptomics, electrophysiology and biocytin fill. Reconstructions were generated based on a 3D image stack that was run through a Vaa3D-based image processing and reconstruction pipeline55. Images were used to generate an automated reconstruction of the neuron using TReMAP56. Alternatively, initial reconstructions were created manually using the reconstruction software PyKNOSSOS (Ariadne-service) or the citizen neuroscience game57 Mozak (Mosak.science). Automated or manually-initiated reconstructions were then extensively manually corrected and curated using a range of tools (for example, virtual finger and polyline) in the Mozak extension (Zoran Popovic, Center for Game Science, University of Washington) of Terafly tools58,59 in Vaa3D. Every attempt was made to generate a completely connected neuronal structure while remaining faithful to image data. If axonal processes could not be traced back to the main structure of the neuron, they were left unconnected.

Before morphological feature analysis, reconstructed neuronal morphologies were expanded in the dimension perpendicular to the cut surface to correct for shrinkage17,60 after tissue processing. The amount of shrinkage was calculated by comparing the distance of the soma to the cut surface during recording and after fixation and reconstruction. A tilt angle correction was also performed based on the estimated difference (via CCF registration) between the slicing angle and the direct pia-white matter direction at the cell’s location9.

### Slice immunohistochemistry

#### Immunohistochemistry

Tissue slices (350 µm-thick) designated for histological profiling were fixed for 2–4 days in 4% paraformaldehyde (PFA) in phosphate-buffered saline (PBS) at 4 °C and transferred to PBS, 0.1% sodium azide for storage at 4 °C. Slices were then cryoprotected in 30% sucrose, frozen and re-sectioned at 30 µm using a sliding microtome (Leica SM2000R). Sections were stored in PBS with azide at 4 °C in preparation for immunohistochemical and Nissl staining. Specific probes (vendor, dilution) used were: Neu-N (Millipore, MAB377, 1:2,000); SMI-32 (Biolegend, 801704, 1:2,000); GFAP (Millipore, MAB360, 1:1,500); parvalbumin (Swant, PV235, 1:2,000); IBA1 (Wako 019-19741, 1:1,000); Ki-67 (Dako M724001-2, 1:200). Full immunohistology protocol details available at http://help.brain-map.org/download/attachments/8323525/CellTypes_Morph_Overview.pdf?version=4&modificationDate=1528310097913&api=v2

#### Slide imaging

Colorimetric immunohistochemistry and other histologically-stained whole slides (that is, Nissl-stained preparations) for bright-field imaging were scanned using an Aperio ScanScope XT slide scanner (Leica Biosystems). The samples were illuminated using a 21DC Halogen Lamp (Techniquip). Bright-field images were acquired using ScanScope Console (v101.0.0.18) and controller (ve101.0.4.446) at 10× magnification (objective lens 20×/0.75 NA Plan Apo, 0.5× magnifier) resulting in a pixel size of 1.0 µm per pixel.

#### Pathology scoring

For every case, each set of images per histological marker were independently scored by three pathologists using a 4-point scale, where 0 is normal and 3 is overtly pathological. Well-established histological markers were used to evaluate cellularity (Nissl), neuronal density and layer orientation (NeuN), astrogliosis (GFAP), microglial activation state (IBA1), non-phosphorylated neurofilament-H (using antibody SMI-32), and cellular proliferation (Ki-67).

### mFISH

Fresh-frozen human postmortem brain tissues were sectioned at 14–16 μm onto Superfrost Plus glass slides (Fisher Scientific). Sections were dried for 20 min at −20 °C and then vacuum sealed and stored at −80 °C until use. The RNAscope multiplex fluorescent v1 kit was used per the manufacturer’s instructions for fresh-frozen tissue sections (ACD Bio), except that fixation was performed for 60 min in 4% paraformaldehyde in 1× PBS at 4 °C and protease treatment was shortened to 10 min. For combined RNAscope and immunohistochemistry, primary antibodies were applied to tissues after completion of mFISH staining. Primary mouse anti-neurofilament H (SMI-32, Biolegend, 801701) was applied to tissue sections at a dilution of 1:250. Secondary antibodies (1:250) were goat anti-mouse IgG (H+L) Alexa Fluor conjugates (594 or 647, ThermoFisher Scientific A-11005 or 21235). Sections were imaged using a 60× oil-immersion lens on a Nikon TiE fluorescence microscope equipped with NIS-Elements Advanced Research imaging software (version 4.20). For all RNAscope mFISH experiments, positive cells were called by manually counting RNA spots for each gene. Cells were called positive for a gene if they contained ≥3 RNA spots for that gene. Lipofuscin autofluorescence was distinguished from RNA spot signal based on the larger size of lipofuscin granules and broad fluorescence spectrum of lipofuscin. The following probe combinations were applied to label cell types of interest: (1) LTK: LTK (NM_002344.5) and LAMP5 (NM_012261.3); (2) GLP2R: GLP2R (NM_004246.2) and CUX2 (NM_015267.3); (3) FREM3: RORB (NM_006914.3) and FREM3 (NM_001168235.2); (4) CARM1P1: RORB and CARTPT (NM_004291.3); (5) COL22A1: RORB and COL22A1 (NM_152888.3); (6) Adamts2: Cbr3 (NM_173047.3), Neurod1 (NM_010894.2) and Cdh13 (NM_019707.4); (7) Rrad: Nr4a3 (NM_015743.3), Cux1 (NM_009986.4) and Cdh13; (8) Agmat: Pou3f2 (NM_008899.2), Igfbp7 (NM_001159518.1) and Coch (NM_001198835). Experiments were repeated on at least n = 2 donors per probe combination for both mouse and human.

### Quantification of human and mouse soma size

Images of NeuN+ stained sections from human MTG (1 section per donor for 5 donors) and mouse VISp (1 section per mouse for 3 mice) (described above) were imported into ImageJ for processing. Regions of interest (ROIs) were drawn around cell bodies and exported as .roi files for downstream processing. In both species, L4 is defined as a band of densely packed, small granular cells, and the upper bound of this band (which includes overlying large pyramidal cells) is treated as the border between L3 and L4.  The border between L1 and L2 is defined as the sharp boundary between the cell-sparse zone of L1 and the is a cell-dense zone of L2. In mouse, the border between L2 and L3 is indistinguishable and not defined. In human MTG, the boundary between L2 and L3 can be closely approximated as transition from densely packed small pyramidal cells to less densely packed larger pyramidal cells, which is largely consistent among donors.

Soma areas were defined as the number of pixels contained in each ROI, scaled by the number of pixels per µm. Cortical depth was defined for each cell as the position of that cell centroid relative to pia (absolute depth) or relative to the L1/2 and L3/4 boundaries (scaled depth) at that position in the tissue. The number of neurons per mm2 of L2-3 neocortex (absolute density) is the number of neurons per image scaled by the area of the image where cell counts were assessed.  For measuring surface density and cell area across L2-3 cortical depth, L2-3 was split into 20 evenly sized bins and the relevant measurements within each bin were calculated independently per section (one section per donor) and the average and standard deviation across sections were reported.  The first and last bins are omitted from plots as they display boundary effects.  Relative (scaled) neuron density scales to 1 for each donor and is defined as the fraction of total neuron count in each bin.  In human, a nadir of scaled density was identified at −0.575, which we define as a quantitative boundary between superficial and deep L3 in this manuscript.

### Analysis of data from dissociated cells and nuclei

Reference data used in this study include dissociated excitatory cells (mouse) or nuclei (human) collected from human MTG3 and mouse VISp26, and are all publicly accessible at the Allen Brain Map data portal (https://portal.brain-map.org/atlases-and-data/rnaseq). In human, cells from the five previously identified L2-3 glutamatergic types were retained, subsampling to match the laminar distribution of neurons included in the Patch-seq dataset as closely as possible, leaving a total of 2,948 neurons from LTK, GLP2R, FREM3, CARM1P1 and COL22A1 t-types.  In mouse, all neurons from the three L2/3 glutamatergic t-types (Adamts2, Rrad and Agmat) were retained.  Datasets were visualized as follows.  First, the top 2,000 most binary genes by beta score3 were selected. Beta score  is defined as the squared differences in proportions of cells or nuclei in each cluster that expressed a gene above 1, normalized by the sum of absolute differences plus a small constant (ε) to avoid division by zero. Scores ranged from 0 to 1, and a perfectly binary marker had a score equal to 1.  Second, the Seurat pipeline39,40 (more details below) was used to scale the data, reduce the dimensionality using principal component analysis (PCA) (30 PCs).  These PCs were then used to generate a UMAP61. Finally, data and metadata such as cluster, subcluster, layer and gene expression were then overlaid onto this UMAP space using different colored or shaded points.

Cluster heterogeneity is defined as average observed variance explained by the first PC compared with permuted data after accounting for differences in the number of cells per cell type.  To get this, we (1) randomly selected 80 cells from each cell type, (2) identified the 80 most variable genes using the FindVariableFeatures Seurat function with selection.method=”vst”, (3) performed PCA after removing outlier cells, (4) calculated the percent of variance explained by the first PC, (5) repeated steps 1–4 for 100 sets of data where the expression levels for each gene are shuffled across the 80 cells to break gene correlations but retain other gene statistics, and (6) identified the average and standard deviation of PC1 for observed versus permuted data.  Cluster discreteness is defined as the average number of differentially expressed genes between a given type and each of the remaining homologous t-types (LTKGLP2R and FREM3 t-types in human; RradAgmat and Adamts2 t-types in mouse).  In this case pairwise differential expression is defined using the de_score function in the scrattch.hicat R library26 after subsampling each cluster to 80 cells, and only the genes with higher expression in the relevant cluster are considered.  The getMarkers function from the genesorteR R library (https://github.com/mahmoudibrahim/genesorteR)42 was used to identify genes differentially expressed genes between deep FREM3 (f73 subtype; collected from L3 or L4 dissection), COL22A1 and CARM1P1 neurons, using all default parameters except quant = 0.7.  To validate the cell selection for deep L3 (since sublaminar dissection was not performed on the dissociated nuclei data), this analysis was repeated on Patch-seq neurons from these three types collected in deep L3 (scaled depth < −0.575). GO enrichment analysis was performed using ToppGene62 with default settings, and Bonferroni-corrected P-values are reported unless stated otherwise.

### Dataset curation

Patch-seq cells were included in this dataset if they met the following criteria.  All neurons: (1) had high-quality transcriptomic data, measured as the normalized summed expression (NMS, adapted from the single-cell quality control measures in ref. 63) of ‘on’-type marker genes greater than 0.4; and (2) retained a soma through biocytin processing and imaging such that an accurate laminar association could be made.  In addition, mouse neurons were: (1) located within VISp; (2) either tdTomato or tdTomato+ from a line known to label glutamatergic neurons (that is, tdTomato+ neurons from known inhibitory mouse lines were excluded); (3) mapped to L2/3 IT VISp Rrad, L2/3 IT VISp Agmat, or L2/3 IT VISp Adamts2 using Seurat mapping (as described below); and (4) mapped to L2/3 IT VISp Rrad, L2/3 IT VISp Agmat, L2/3 IT VISp Adamts2, or L4 IT VISp Rspo1 in a separate Seurat mapping analysis where only reads located within gene introns are considered for both datasets.  This final filter removes Patch-seq cells that jointly express markers for GABAergic and glutamatergic cells, probably representing L2/3 GABAergic neurons contaminated with adjacent glutamatergic cells.  We do not find examples of such cells in human, possibly owing to a much smaller sampling of GABAergic cells than in the mouse.

### Identifying t-types

Due to the differences in gene expression between Patch-seq and dissociated cells (see Extended Data Fig. 3b and a companion mouse study41), we used transcriptomes of dissociated human nuclei3 or cells26 as reference datasets for human and mouse, respectively, and mapped Patch-seq transcriptomes to the reference data to identify their cell types.  Prior to data transfer, we filtered genes potentially related to technical variables.  X and Y chromosomes were excluded to avoid nuclei mapping based on sex. Many mitochondrial genes have expression correlated with RNA-seq data quality in dissociated nuclei data3, so nuclear and mitochondrial genes downloaded from Human MitoCarta2.064 were also excluded.  We also find that Patch-seq cells often have high expression of non-neuronal marker genes, so any genes most highly expressed in a non-neuronal cell type are excluded.  Finally, any genes showing at least four-fold higher expression in dissociated nuclei versus Patch-seq cells in the included cell types (or vice versa) were excluded as potentially platform dependent.  In total 23,129 of 50,281 genes (46%) remained in human and a comparable fraction for mouse.  Variable genes for mapping were selected as described above for dissociated nuclei data visualization, by using the top 2,000 remaining genes by beta score as input into the procedure described below.

For both species, we mapped Patch-seq datasets to the relevant dissociated cells or nuclei reference using Seurat V3 (https://satijalab.org/seurat/)39,40 following the tutorial for integration and label transfer with default parameters for all functions, except when they differed from those used in the tutorial, and replacing variable gene selection with the genes described above. More specifically, we first defined a low (30)-dimensional PCA space of the dissociated cells or nuclei dataset and then project this onto the Patch-seq dataset.  We then found transfer anchors (cells that are mutual nearest neighbours between datasets) in this subspace.  Each anchor is weighted on the basis of the consistency of anchors in its local neighbourhood, and these anchors were then used as input to guide label transfer (or batch correction), as described previously65. We then scaled the data, reduced the dimensionality using PCA, and visualized the results with UMAP61. This process is done using the FindTransferAnchors and TransferData R functions, which provide both the best mapping cell type and a confidence score.  For mouse data, the three homologous types did not provide a heterogenous enough reference dataset, and therefore a larger set of glutamatergic and GABAergic cell types was used as reference.  Cell-type assignments for most cells were robust to choice of reference dataset and to changes in parameter settings.  Some cells with expression levels intermediate to two cell types changed calls between different runs; however, the cell-type-level results presented are robust to these small changes.

Multiple variations of three different mapping strategies were tested: (1) Seurat (as described in this Article), (2) a variation of the tree-mapping strategy previously described41 for analysis of GABAergic cell types in mouse Patch-seq and (3) a correlation-based strategy comparing expression of marker genes in Patch-seq versus cluster medians of the scRNA-seq data.  In all cases, 75–95% of cells mapped to the same cell type when comparing pairs of methods, consistent with similar analyses performed using dissociated FACS-sorted cells and therefore does not reflect a quality issue with Patch-seq data per se, but rather the fact that cell-type definitions are not totally discrete, and single cell measurements are not totally accurate (for example, owing to dropouts).  As such, many cells show expression patterns that are not unambiguously mapped between highly similar transcriptomic cell types, although how much of this is biological and how much technical is difficult to assess.  Despite the imperfect agreement of specific cells, the statistical results regarding morphology and electrophysiology for each cell type remain relatively unchanged regardless of the mapping method used.

Gene expression of Patch-seq cells was visualized by projection into the UMAP space calculated from dissociated nuclei using a combination of Seurat and the R implementation of the UMAP library (https://github.com/tkonopka/umap). More specifically, the Seurat data integration pipeline (functions FindIntegrationAnchors and IntegrateData) was used to calculated scaled data for both datasets and PCA was performed on this integrated space.  The first 30 PCs from both datasets, as well as the UMAP coordinates calculated for dissociated nuclei above were input into the UMAP pipeline and the ‘predict’ function was used to project the Patch-seq cells into UMAP coordinates.  As above, data and metadata were then overlaid on these UMAP coordinates.

Since dissociated nuclei were not collected using sublaminar dissections, ‘deep FREM3’ neurons were defined as FREM3 neurons dissected from L3 or L4 that were assigned to subtype f73 (Extended Data Fig. 1), which colocalizes with deep FREM3 Patch-seq neurons in UMAP space (Figs. 1, 3).  Furthermore, 77 of these 219 marker genes (including four genes shown in Fig. 4e) were also defined as marker genes by Patch-seq, where cortical depth was explicitly measured, suggesting the selection of deep FREM3 neurons in dissociated nuclei was reasonable.

### Assessing transcript contamination

To quantify the effect of contamination and gene dropout in the Patch-seq dataset, we compared median gene expression levels of homologous t-types between platforms (Fig. 3a).  Dissociated nuclei and Patch-seq cells from matched human t-types were highly correlated (R = 0.85, P ≈ 0).  Relatively few genes (177 genome-wide) showed enriched expression in dissociated nuclei relative to Patch-seq cells, suggesting that high quality transcriptomes collected in this dataset do not show the increased dropout rate reported in our previous study in mouse41.  This is likely to be because we compared our human Patch-seq cells to a reference of dissociated nuclei, rather than the reference based on dissociated cells in mouse.  By contrast, we identified 2,670 genes with at least fourfold enrichment in Patch-seq, including genes associated with extra-nuclear compartments such as the mitochondria (P < 10–12) and ribosome (P < 10−9), genes regulating cell death (P < 10–18), RNA-binding genes (P < 10−8) including immediate early genes, and markers for non-neuronal cells such as microglia (P < 10–20). Some of the top genes in these categories include COX3, FOS and IL1B, which all show >100-fold enrichment in Patch-seq cells.  These results indicated that Patch-seq cells probably contain some RNA collected from extranuclear compartments and from nearby contaminating cells (particularly microglia), and may show some activity-dependent transcription. However, these effects were minor compared to cell-type differences and we find overall high consistency and similar quality between Patch-seq cells and dissociated nuclei.

### Comparison of gene expression between species

Gene orthologues between mouse and human were pulled from the gene orthologues table on NIH (https://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_orthologs.gz) on 22 November 2019.  Only genes with unique orthologues between mouse and human were included in cross-species analyses.

### Electrophysiology feature analysis

Electrophysiological features were measured from responses elicited by short (3 ms) current pulses and long (1 s) current steps as previously described9. In brief, action potentials were detected by first identifying locations where the smoothed derivative of the membrane potential (dV/dt) exceeded 20 mV ms−1, then refining on the basis of several criteria including threshold-to-peak voltage, time differences and absolute peak height. For each action potential, threshold, height, width (at half-height), fast after-hyperpolarization (AHP) and interspike trough were calculated (trough and AHP were measured relative to threshold), along with maximal upstroke and downstroke rates dV/dt and the upstroke/downstroke ratio (that is, ratio of the peak upstroke to peak downstroke). Additional features from supratheshold sweeps included the rheobase and slope of the firing rate versus current curve (fI slope); the first spike latency and initial firing rate (inverse of first inter-spike interval), measured at rheobase; and the mean firing rate and spike frequency adaptation ratio (mean ratio of consecutive inter-spike intervals), measured at ~50 pA above rheobase. Subthreshold features included the resting membrane potential (RMP), input resistance and membrane time constant (tau) from response across or before hyperpolarizing long steps, and sag ratio from response at ~−100 pA. All feature calculation used the IPFX package (https://github.com/AllenInstitute/ipfx).

### Morphology feature analysis

Morphological features were calculated as previously described9. In brief, feature definitions were collected from prior studies10,66. Features were calculated using the version of neuron_morphology package (https://github.com/alleninstitute/neuron_morphology/tree/dev). Reconstructed neurons were aligned in the direction perpendicular to pia and white matter. Additional features, such as the laminar distribution of axon, were calculated from the aligned morphologies. Shrinkage correction was not performed (see above), features predominantly determined by differences in the z-dimension were not analysed to minimize technical artifacts due to z-compression of the slice after processing.

### Analysis of features by t-type and species

Combined datasets of electrophysiological and morphological features across homologous t-types from mouse and human were visualized by an analysis pipeline of data imputation and standardization, followed by projection to two dimensions using UMAP or SPCA (sklearn and umap python packages)67,68. Cells with more than 3 out of 18 electrophysiological features missing were dropped, the remaining missing features were imputed as the mean of 5 nearest neighbours (KNNImputer), and features were centred about the median and scaled by interquartile range (RobustScaler). The SPCA regularization parameter was adjusted to minimize non-zero features while preserving dataset structure. All features with coefficients over 0.05 were reported directly in the case of electrophysiology or summarized by feature categories for morphology.

For each feature, differentiation by t-type was assessed by running a one-way ANOVA for the feature by t-type, using the statsmodels package69. This analysis was repeated separately for the three mouse and human homologous t-types, as well as the three deep human t-types (with the subset of deep FREM3 cells only). Results were reported as fraction of variance explained ($$\eta$$2 or R2) and heteroscedasticity-robust F test P-value (HC3), corrected for FDR (Benjamini–Hochberg procedure) across all features for each data modality. Post-hoc Mann–Whitney U tests were run across pairs of t-types in each group (human and mouse homologous types and deep human types) for top-ranked features from ANOVA, and results FDR-corrected.

For classification of t-types, features were normalized using the standard scaler scalar in sklearn (StandardScalar), and the data were randomly assigned with stratification to training (70%) and testing sets (30%). The random forest classifier was trained using the sklearn package with 600 decision trees. The classification performance was estimated after averaging the results of the classifiers trained on 1,000 random data splits and compared against performance for data with shuffled t-type labels. Confusion matrices shown are for a single representative train–test split.

### Analysis of features by depth for FREM3 t-type

For each electrophysiology, morphology, and gene feature, the depth-related variability was assessed by a linear regression of the feature against relative L2-3 depth, using the statsmodels package69. Results were reported as fraction of variance explained (R2), Pearson correlation r, and heteroscedasticity-robust F test P-value (HC3), corrected for FDR (Benjamini–Hochberg procedure) across all features for each data modality. Owing to the large number of morphology and genes tested, results were summarized by calculating GO-term enrichment in ToppGene62 for the set of depth-correlated genes (FDR < 0.05), followed by subselection of representative GO terms using REViGO70. Groups of features were ranked by the group’s highest R2, and the features with highest correlation shown for the top groups.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.