USeqFISH for in situ profiling of endogenous and viral genes

To enable spatial transcriptomic approaches for high-throughput, high-resolution AAV tropism profiling, we developed USeqFISH (Fig. 1a,b). USeqFISH comprises a signal amplification step with combined RCA45 and HCR54 (RCAHCR; Fig. 1a, ‘Signal amplification with RCAHCR’) and a sequential labeling step with a two-step hairpin and initiator stripping method via toehold-mediated strand displacement (Fig. 1a, ‘Two-step stripping for sequential labeling’).

The new amplification strategy, RCAHCR, combines two conventional signal amplification methods (RCA and HCR) to achieve high sensitivity. Adapting the SNAIL probe of STARmap40, we designed four probes for each gene, with each probe consisting of a primer and a padlock. Each primer and padlock include 20 nucleotides (nt) complementary to consecutive sequences of the target. This paired design offers higher specificity, as the RCA amplification can happen only if both primer and padlock have hybridized on the target40,54. The padlock also has a 19-nt unique gene identifier (UGI) that is replicated via RCA and to which the initiator (an HCR initiator with sequence complementary to the UGI, UGI*) hybridizes. RCAHCR is, therefore, carried out by hybridizing probes to the target gene, generating DNA amplicons via RCA, hybridizing the initiators to the amplicons and, lastly, triggering spontaneous HCR assembly by adding hairpins.
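The paired primer and padlock design described above can be illustrated with a minimal sketch that splits a 40-nt target window into two consecutive 20-nt stretches and returns the complementary DNA arm for each. The function names are hypothetical, and the sketch deliberately leaves open which arm belongs to the primer and which to the padlock, because the text does not specify this. It also omits the UGI and any other padlock backbone sequence.

```python
# Hypothetical helpers for illustration only; not the authors' probe design code.
_COMPLEMENT = str.maketrans("ACGUT", "TGCAA")  # RNA or DNA base -> complementary DNA base


def reverse_complement_dna(seq: str) -> str:
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def paired_arms(target: str, start: int, arm_len: int = 20) -> tuple[str, str]:
    """Return the two DNA arms complementary to consecutive target stretches.

    The first element binds target[start:start + arm_len]; the second binds
    the stretch immediately after it. Assigning them to primer or padlock
    is left to the caller (an assumption of this sketch).
    """
    window = target[start:start + 2 * arm_len]
    if len(window) != 2 * arm_len:
        raise ValueError("target window is shorter than two arms")
    upstream, downstream = window[:arm_len], window[arm_len:]
    return reverse_complement_dna(upstream), reverse_complement_dna(downstream)


# Toy target (made up): 20 A followed by 20 C.
arms = paired_arms("A" * 20 + "C" * 20, start=0)
```

With four such pairs per gene, the footprint on the target is 4 × 40 nt = 160 nt, which matches the barcode length used later in the text.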

To assess the signal amplification performance of RCAHCR, we compared the intensity of RNA signals amplified with RCAHCR to those amplified with RCA-only or HCR-only in mouse brain tissue. To do so, we designed four probes against Gad1 for each condition. For RCA-only amplification, we used the same probes designed for RCAHCR but added fluorophore-conjugated UGI* after RCA (as in STARmap40). For HCR-only amplification, we used four probes designed as split–initiator pairs as suggested for HCR version 3 (ref. 54) (HCR v3, Fig. 1c). In all cases, we used the same fluorophore (Alexa Fluor 647) to minimize color variance. We found that RCAHCR yielded an 11.7-fold and a 6.2-fold increase in mean signal intensity compared to HCR-only and RCA-only, respectively (Fig. 1d). RCAHCR also showed a significantly higher signal-to-background ratio (SBR; 10.9 ± 0.01) compared to HCR-only (1.39 ± 0.67) or RCA-only (3.09 ± 2.21; mean ± s.e.m.; Fig. 1e). Additionally, we confirmed that, compared to increasing the reaction time of RCA to overnight, RCAHCR (2 hours of RCA and subsequent 1 hour for HCR) produces RNA spots with similar size yet significantly higher intensity (Extended Data Fig. 1a–d). We also measured the false-positive rate of RCAHCR as 1 per 8,000 µm² and 1 per 20,000 µm³ in cell culture and tissue, respectively (Extended Data Fig. 1e), which is much lower than that of HCR (1 per 3,000 µm³)44. Note that single-molecule FISH56 or other methods employing signal amplification (for example, RCA48, HCR54 or SABER57) usually recommend using at least 10–20 probes per target gene, and we were able to obtain a brighter RNA signal by adding more probes to HCR (‘13 probes’ in Fig. 1c). In contrast, RCAHCR uses only four probes, and, even with this small number, it generates much brighter and more sensitive RNA signals, which will be beneficial for targeting short endogenous sequences or designing short barcodes for viral genomes with a limited packaging capacity.

Indeed, this remarkably high sensitivity of RCAHCR enabled RNA detection even with a single probe (a pair of a primer and a padlock; Fig. 1f–h). To validate the capability of single-probe detection with RCAHCR amplification, we assessed the detection efficiency of 1-probe RCAHCR compared to single-molecule HCR (smHCR) by measuring four housekeeping genes (Gapdh, Eef2, Tfrc and Polr2a) with various expression levels in the same NIH3T3 cells (Fig. 1f and Extended Data Fig. 2a–d)42. We found that the detection efficiency of 1-probe RCAHCR is 42–87% of smHCR with high correlation (Fig. 1g,h). We also confirmed that 1-probe RCAHCR can visualize endogenous genes in intact tissue volume (Extended Data Fig. 2e). Altogether, these results indicate that the method requires a unique sequence as short as 40 nt for selective endogenous RNA detection in tissue.

One caveat of RCAHCR is the limited penetration of the enzymes required for RCA that hinders three-dimensional (3D) tissue labeling (by comparison, HCR is capable of labeling tissue with a thickness of ~500 μm58). To address this, we reasoned that hydrogel-based tissue clearing could help the enzymes penetrate deeper into the tissue. By optimizing our passive CLARITY technique (PACT)59,60 to better retain mRNA transcripts without substantial interference in probe hybridization (Supplementary Fig. 1), we enabled multi-color detection of three endogenous genes in 50-μm-thick tissue using RCAHCR (Fig. 1i).

We next established a quenching method to increase the number of endogenous and viral genes that can be detected (Fig. 1a, ‘Two-step sequential labeling’). Most previous studies used DNase I to degrade hairpin assemblies44,61; however, this enzyme often leaves residual activity that can affect the next round of HCR and can also compromise the DNA amplicons. Thus, we instead used a hairpin with a toehold sequence (10 nt) that can induce spontaneous disassembly through strand displacement41. With these toehold hairpins, the two-step method is performed as follows. For the first round, we add a pair of toehold and fluorophore-conjugated hairpins to create the hairpin assembly. To dismantle the hairpin assembly, we add a short displacement oligo (a sequence complementary to the toehold and half of the hairpin). Next, formamide (60–70%) is used to detach the initiators from the amplicon to prevent undesired hairpin assembly during the next round of HCR. Once the signal is quenched, the next round of HCR can be initiated by adding another set of initiators, followed by HCR labeling.

To examine whether our two-step stripping method could efficiently quench HCR signal, we transfected HEK293T cells with plasmids carrying a barcode (a unique 160-nt sequence orthogonal to the human transcriptome) and hybridized four probes complementary to this barcode. After RCA and the first round of HCR with the toehold hairpins, we compared cell intensity change over time in multiple conditions: ‘No stripping’ (no further steps), ‘SD only’ (displacement oligos only) and ‘SD-FA’ (both displacement oligos and subsequent formamide treatment) (Extended Data Fig. 3a,b). Treatment with both displacement oligos and formamide quickly and efficiently reduced cell intensity, suggesting near complete disassembly of hairpin structures.

We next asked whether our two-step method could prevent crosstalk between HCR rounds and preserve the DNA amplicon intact. If residual hairpins or initiators remained hybridized to amplicons, subsequent addition of hairpins could result in hairpin assembly at unwanted sequences, introducing substantial noise. Also, if the amplicons were damaged by the stripping method, further rounds of HCR would not be successful. To answer these questions, we attempted a second round of HCR on samples stripped by our method, using the same initiators and hairpins as for the first round (Extended Data Fig. 3c). When adding only hairpins without initiators, we were unable to trigger HCR assembly, indicating that residual hairpins and initiators are negligible after two-step stripping (‘−Initiator’ in Extended Data Fig. 3c,d). Using the same initiators and hairpins, we were able to recover HCR signals with R² = 0.889 ± 0.016 (mean ± s.e.m.; ‘+Initiator’ in Extended Data Fig. 3c,d). Two rounds with the same hairpins and initiators for the same gene (Gad1) in tissue also highly overlapped (Extended Data Fig. 3e). Finally, we successfully achieved eight rounds of sequential labeling with the same hairpins and initiators in the same cell culture, preserving a mean R² of >0.9 between rounds (Extended Data Fig. 3f,g). These results indicate that our two-step method facilitates sequential labeling with RCAHCR amplification, enabling the USeqFISH procedure.

We assessed the performance of USeqFISH in tissue. First, we observed no significant difference between the number of Gad1 spots per cell detected with USeqFISH (four probes) and HCR (ref. 54), indicating that USeqFISH has a similar detection efficiency to HCR (~70%44,54; Fig. 1j and Supplementary Fig. 1b). We also compared the expression level of 26 endogenous genes in mouse cortex (Supplementary Table 1), measured as the mean spot count per cell with USeqFISH and as the mean unique molecular identifier (UMI) count per cell with scRNA-seq32. We found a significant linear correlation (r = 0.627; P = 6 × 10⁻⁴; Fig. 1k), with a higher detection efficiency of USeqFISH, especially for lower-abundance genes. The robustness of USeqFISH over multiple rounds was examined by detecting the same gene (Gad2) in the same cells at round 2 and round 13; ~76% of the signal was preserved (Fig. 1l). Taken together, our results show that USeqFISH can detect ~50 RNAs (4 colors × 13 rounds) in intact tissue volumes by targeting ≤160 nt of each sequence without substantial loss (compared to HCR or scRNA-seq) yet with much brighter signal.

We next investigated whether USeqFISH could detect RNAs transcribed from viral genomes transfected into cultured cells (Fig. 2). To do so, we designed a plasmid encoding the AAV9 VP3 protein with an eGFP tag. We then cloned amino acid (AA) mutations into this plasmid to generate the VP3 of AAV-PHP.eB6 (a 2-AA substitution and a 7-AA insertion) and AAV-PHP.S6 (a 7-AA insertion) at AA588 of AAV9. After transfection of each plasmid into HEK293T cells, we applied USeqFISH with three different probe sets: four probes against the VP3 sequence shared across all three variants, one probe against part of the 7-AA insertion of AAV-PHP.eB and one probe against part of the 7-AA insertion of AAV-PHP.S (Fig. 2a). Note that, for the probes targeting the insertions, only the padlocks differed by 14 nt, and the same primer was shared. With these VP3 probes, we were able to label the transcripts of all three plasmids in the cells expressing eGFP. With the single probe for each insertion, only the cells transfected with the corresponding plasmid were labeled (Fig. 2b), indicating that USeqFISH can selectively detect a mutated region in the viral genome as short as 14 nt in vitro.

Fig. 2: High sensitivity of USeqFISH detects short mutations and barcodes in the AAV genome in vitro and in vivo. a, Three plasmids were designed to carry the VP3 of AAV9, AAV-PHP.eB (‘PHP.eB’) and AAV-PHP.S (‘PHP.S’) with eGFP. AAV-PHP.eB and AAV-PHP.S have distinct 9-AA and 7-AA mutations (bold letters) in the same location (AA588) of the AAV9 VP3 sequence. After transfecting into HEK293T cells, we detected the transcripts of each plasmid using the following probes (gray filled boxes indicate the padlock target sequence, and gray outlined boxes indicate the primer target sequence): four probes against the shared VP3 sequence, one probe against the insertion of AAV-PHP.eB and one probe against the insertion of AAV-PHP.S. For the probes against each insertion, we used the same primers for AAV-PHP.eB and AAV-PHP.S but distinct padlocks that differed by 14 nt. b, Detection of the VP3 transcripts with four probes for VP3, one probe for AAV-PHP.eB and one probe for AAV-PHP.S in HEK293T cells expressing the VP3 of AAV9, AAV-PHP.eB and AAV-PHP.S. c, For in vivo detection, we designed a viral genome carrying mNeonGreen and a barcode and systemically delivered it to adult mice using AAV-PHP.eB at a dose of 1 × 10¹¹ vg per mouse. At 3 weeks after injection, we used USeqFISH with probes against the barcode to detect viral transcripts in tissue. d, Detection of viral barcodes (‘Barcode (mRNA)’) in cells expressing mNeonGreen (green) in various mouse brain regions (cortex, striatum and thalamus).

We also tested whether USeqFISH could detect transcripts from the AAV genome in tissue after systemic delivery (Fig. 2c). We produced AAV-PHP.eB packaging pAAV-CAG-mNeonGreen (mNG)-WPRE-hGHpA with a barcode inserted between mNG and WPRE; this location facilitated barcode detection (Supplementary Fig. 2a). We intravenously (IV) administered the virus to adult mice at a dose of 1 × 10¹¹ vector genomes (vg) per animal and harvested the brains at 3 weeks after injection. Using USeqFISH with probes against the barcode, we were able to label the RNA transcripts of the viral genome in cells expressing mNG from various brain regions (Fig. 2d). These results support the applicability of USeqFISH for parallel detection of multiple systemically delivered AAVs in tissue by reading barcodes uniquely assigned to each viral genome.

In situ profiling of pooled systemic AAVs at various doses

To apply USeqFISH for in situ profiling of pooled AAVs, we further designed the viral cargo to include (1) a non-fluorescent, short coding sequence and (2) a shortened WPRE, W3SL62, making the essential components as compact as possible to leave room for the sequences to be tested (Extended Data Fig. 4a). For the coding sequence, we used the first part of split GFP (spGFP(1–10)63) owing to its short length (642 base pairs (bp)) and its ability to be labeled by GFP antibodies for purposes of viral injection and expression quality control (Extended Data Fig. 4b). Then, we cloned a uniquely designed barcode (160 nt in length for four probes) into the backbone between spGFP(1–10) and W3SL. We confirmed that the barcodes inserted into the viral genome can be selectively detected only with the complementary probes in vitro (Supplementary Fig. 2b).

Because the total dose for systemically administered AAVs is limited (~10¹² vg per adult mouse) mainly due to liver toxicity27, pooled administration requires each variant to be delivered at a lower dose. To determine the minimum dose that USeqFISH can detect, we designed a cocktail of the same AAV-PHP.eB carrying five unique barcodes delivered at different doses (10¹¹, 10¹⁰, 10⁹, 10⁸ and 10⁷ vg per variant per animal) to adult wild-type (WT) mice (n = 5) via IV administration. Three weeks after injection, we harvested the brains and applied USeqFISH to detect all barcodes expressed in the mouse cortex (Extended Data Fig. 4c). Considering cells expressing at least one viral RNA spot to be transduced, we measured the transduction rate of AAV-PHP.eB at each dose. The transduction rate at the dose of 10¹¹ vg was ~70%, similar to that observed by Chan et al.6, but dropped to ~4% at the dose of 10⁷ vg (mean; Extended Data Fig. 4d). In addition to the transduction rate, the distribution of viral transcript numbers per transduced cell showed that higher doses were associated with higher spot counts, suggesting that high-dose administration could lead to more cells being transduced as well as more AAV-delivered transgenes being expressed in each cell (Extended Data Fig. 4e). Collectively, these results suggest that the efficiency of both transduction and expression of the transgene packaged in systemic AAVs (here, AAV-PHP.eB) relies on the dose injected and underscore the importance of using a matched dose across experimental batches for accurate validation and characterization of AAV capsids. Our results also show that USeqFISH can detect AAV transcripts even at the minimum dose of 10⁷ vg; however, because <20% of cells were transduced and expressed only a few spots at doses of 10⁹ vg or lower, we conclude that, for quantitative analysis, a dose of ≥10¹⁰ vg for each variant would be required.
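The transduction criterion used above (a cell counts as transduced if it shows at least one viral RNA spot) can be sketched as follows. The cell-by-barcode spot-count matrix, the function name and the toy numbers are assumptions for illustration and are not taken from the authors' pipeline or data.

```python
import numpy as np


def transduction_rate(spot_counts, min_spots=1):
    """Fraction of cells with at least `min_spots` spots, per barcode.

    spot_counts: array of shape (n_cells, n_barcodes) with integer spot counts.
    """
    counts = np.asarray(spot_counts)
    # Boolean mask of transduced cells, averaged over cells for each barcode.
    return (counts >= min_spots).mean(axis=0)


# Toy data (made up): 5 cells x 2 barcodes, e.g. a high-dose and a low-dose barcode.
toy_counts = np.array([[3, 0], [1, 0], [0, 0], [7, 1], [2, 0]])
rates = transduction_rate(toy_counts)  # one rate per barcode column
```

Raising `min_spots` gives a stricter definition of a transduced cell, which may be worth checking at low doses where cells express only a few spots.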

In situ cell type tropism profiling of pooled systemic AAVs

To demonstrate the capability of USeqFISH for high-throughput, high-resolution profiling of AAVs, we designed a pool of six systemic AAVs (Fig. 3a). This pool includes previously identified capsids (AAV-PHP.eB6, AAV.CAP-B10 (ref. 8), AAV-PHP.N7, AAV-PHP.V1 (ref. 7) and AAV-PHP.B8 (ref. 7)) that show a range of efficiency and specificity of mouse CNS transduction across the blood–brain barrier (BBB). We also included a new capsid, AAV-PHP.AX, in the same pool to test if USeqFISH profiling can enable deep characterization of previously unexplored variants.

Fig. 3: In situ major cell type tropism profiling of barcoded systemic AAV capsid pools in mouse cortex. a, Experimental pipeline. We designed a pool of six AAV capsid variants carrying unique barcodes and administered them to adult wild-type mice through retro-orbital injection. At 4 weeks after injection, we harvested the brain tissue and used USeqFISH to profile viral gene expression along with endogenous cell type marker genes. The image dataset was then converted to a gene-by-cell expression matrix via our automated image processing pipeline, and we quantitatively analyzed the data by clustering endogenous genes to identify cell type clusters, followed by viral gene expression profiling in each cluster. b, Representative image of six variants and ten cell type markers in the same region of the mouse cortex. c, Transduction efficiency, measured by % transduced cells, of each variant in two mice (mean ± s.e.m.; n = 5 for mouse 1; n = 6 for mouse 2). d, Endogenous (top, cividis color map) and viral gene expression profiles (enrichment: middle, viridis; relative tropism bias: bottom, coolwarm) in the cell type clusters.

AAV-PHP.AX is a variant rationally designed to display a tropism-homing peptide (NTGSPYE, shown to target microglia55) by substituting AA452-458 of AAV-PHP.eB (Extended Data Fig. 5). In a previous study, we successfully derived an efficient neurotropic capsid, AAV.CAP-B10, by screening a 7-AA substitution library of the highly protruding AA455 loop in AAV-PHP.eB8. This result demonstrated that it was possible to ‘add’ specificity by introducing diverse mutations to the capsid and motivated us to engineer the same location of AAV-PHP.eB with previously identified homing peptides.

Using the same backbone as in the dose-dependency study, we individually produced the viruses to guarantee that each one packaged a unique barcode, and we IV administered the pool to adult WT mice (n = 2) at a dose of 5 × 10¹⁰ vg per variant (for a total dose of 3 × 10¹¹ vg per animal). We harvested the brains after 4 weeks of expression and applied USeqFISH with probe sets for the barcodes of each variant and for canonical cell type marker genes that we selected from previously published scRNA-seq studies43,64,65,66,67,68. By applying our computational image processing and data analysis pipeline, we obtained an expression matrix of endogenous genes and viral barcodes (Extended Data Fig. 6a). With cell type clusters identified from the endogenous gene expression matrix (Methods), we analyzed the expression of the viral barcodes in each cell type cluster (Extended Data Fig. 6b). First, to assess the transduction efficiency of all variants in the pool across cell types, we measured the enrichment (mean of log-transformed spot numbers) of each variant in each cell cluster. Second, to compare relative cell type tropism bias across variants, we measured z-scored spot numbers of each barcode log-normalized to the total barcode counts per cell, with the hypothesis that the ratio between variants’ transcript numbers would be conserved across all cell types if all variants have the same tropism.
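The two measures defined above can be sketched in a few lines. This is one possible reading under stated assumptions, not the authors' implementation (see Methods for that): the log transform is taken as log(1 + x), the normalization scale factor of 10,000 is arbitrary, z-scores are computed per barcode across cells, cells without any barcode spot are dropped from the bias calculation, and all function names are hypothetical.

```python
import numpy as np


def enrichment(counts, clusters):
    """Mean of log1p-transformed spot counts for each (cluster, variant)."""
    counts = np.asarray(counts, dtype=float)
    clusters = np.asarray(clusters)
    labels = np.unique(clusters)
    logc = np.log1p(counts)
    return labels, np.vstack([logc[clusters == k].mean(axis=0) for k in labels])


def relative_bias(counts, clusters, scale=1e4):
    """Cluster mean of z-scored, log-normalized barcode fractions."""
    counts = np.asarray(counts, dtype=float)
    clusters = np.asarray(clusters)
    totals = counts.sum(axis=1, keepdims=True)
    keep = totals[:, 0] > 0  # cells with no barcode carry no ratio information
    lognorm = np.log1p(counts[keep] / totals[keep] * scale)
    sd = lognorm.std(axis=0)
    # Guard against (near-)zero spread so constant columns give z = 0.
    z = (lognorm - lognorm.mean(axis=0)) / np.where(sd > 1e-12, sd, 1.0)
    kept = clusters[keep]
    labels = np.unique(kept)
    return labels, np.vstack([z[kept == k].mean(axis=0) for k in labels])


# Toy data (made up): two variants whose 2:1 ratio is identical in both clusters.
toy_counts = np.array([[2, 1], [4, 2], [6, 3], [8, 4]])
toy_clusters = np.array([0, 0, 1, 1])
_, enr = enrichment(toy_counts, toy_clusters)
_, bias = relative_bias(toy_counts, toy_clusters)
```

In this toy case enrichment is higher in cluster 1 for both variants, yet the bias is zero everywhere because the ratio between variants is conserved, which is the null expectation stated in the text.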

We first asked whether multiplexed analysis of these six AAVs with USeqFISH could recapitulate our previously reported characterization results from multiple studies with IHC6,7,8 or scRNA-seq32. To do so, we analyzed a total of 4,330 cells in the cortex of two mice using ten marker genes that represent cell types known to be preferred or excluded by each capsid, including neurons (Slc17a7 for pan-excitatory neurons; Gad1 for pan-inhibitory neurons; and Pvalb, Sst and Vip for major inhibitory neuronal subtypes), glia (Gja1 for astrocytes and Hexb for microglia) and vascular cells (Msr1 for pericytes, Cldn5 for endothelial cells and Acta2 for vascular smooth muscle cells (SMCs)) (Fig. 3b and Supplementary Fig. 3). Although a direct comparison to previous results would be inappropriate owing to the different doses used, the trend of overall transduction efficiency across the variants conformed to our expectations (Fig. 3c): AAV-PHP.eB and AAV.CAP-B10 showed similar transduction efficiency (~48%), much higher than that of the other variants8. The overall transduction efficiency of AAV-PHP.N was modest (~26%), and that of AAV-PHP.V1 and AAV-PHP.B8 was low (~14%) due to their expected tropism bias toward vascular cells and thalamus/cerebellum, respectively7. The new variant, AAV-PHP.AX, showed slightly lower transduction efficiency (~41%) than the most efficient ones (AAV-PHP.eB and AAV.CAP-B10) but much higher than the others.

Through USeqFISH profiling, we identified nine cell type clusters in which each cell type marker gene is highly expressed, and we assessed the distribution of each variant across these cell types (Fig. 3d). We found that the most efficient variants (AAV-PHP.eB, AAV.CAP-B10, AAV-PHP.N and AAV-PHP.AX) showed strong enrichment in neurons, particularly in Pvalb+ inhibitory cells (presumably due to the slight inhibitory bias of the CAG promoter69), and lower enrichment in Vip+ inhibitory cells, consistent with scRNA-seq results32. All six variants exhibited lower enrichment in non-neuronal cells than in neurons. In previous IHC studies, some variants showed higher transduction rates in non-neuronal cells; for example, AAV-PHP.V1 transduced astrocytes (S100+: ~60%) and endothelial cells (Glut1+: ~60%) more efficiently than neurons (NeuN+: 10%)7. This discrepancy could be due to the different markers and doses used in this study or to underestimation of non-neuronal cells by USeqFISH (Discussion). Nonetheless, USeqFISH showed higher enrichment of AAV-PHP.eB than AAV.CAP-B10 in astrocytes, supporting the relative neuronal preference of AAV.CAP-B10 (ref. 8).

Overall, our relative tropism analysis was consistent with our previous results. Again, AAV-PHP.eB showed relative tropism bias toward inhibitory neurons and astrocytes, whereas AAV.CAP-B10 showed bias toward neurons (both excitatory and inhibitory) and away from astrocytes8,32. AAV-PHP.N was relatively neuronal7, with lower transduction efficiency than AAV.CAP-B10. Although much less enriched than the other capsids, AAV-PHP.V1 showed relative tropism toward vascular cells (pericytes and vascular SMCs) as expected7. Collectively, these results show that USeqFISH can characterize the transduction and tropism of pooled AAVs with high throughput. In addition, the endogenous gene-based cell type clustering and transduction profiles of the six variants were strongly conserved between the two mice (Supplementary Fig. 4), supporting the reproducibility of USeqFISH-based AAV profiling.

For more in-depth characterization, we next investigated the enrichment and relative tropism of our variant pool across neuronal subtypes in mouse cortical layers using 30 cell type/layer-specific markers (a total of 8,475 cells were analyzed in a 1.14 mm × 1.69 mm area; Fig. 4a–d). Based on our selected gene markers, we classified cells into 26 clusters, including nine excitatory/gene-specific, 11 excitatory/layer-specific and six inhibitory subtypes (Fig. 4a,b and Supplementary Fig. 5). Using these cell type clusters, we identified a few interesting features of the variants (Fig. 4b). The overall enrichment pattern (AAV-PHP.eB and AAV.CAP-B10 highly enriched, followed by AAV-PHP.AX and AAV-PHP.N and then low-efficiency AAV-PHP.V1 and AAV-PHP.B8) was consistent with our previous results, as was the variants’ bias toward inhibitory neurons with a lower preference for Vip+ cells. Despite similar enrichment patterns, AAV-PHP.eB and AAV.CAP-B10 differed in their tropism, with AAV-PHP.eB being relatively biased toward L5 and inhibitory neurons and AAV.CAP-B10 showing relative bias toward L2/3 and L4. Interestingly, despite a lower transduction efficiency than the most efficient variants (AAV-PHP.eB and AAV.CAP-B10), AAV-PHP.N was relatively biased toward excitatory neurons. When the biases of these three variants were compared between excitatory and inhibitory neuronal clusters, AAV-PHP.eB was significantly biased toward inhibitory neuronal subtypes. Although AAV.CAP-B10 showed slight bias toward inhibitory subtypes, AAV-PHP.N was biased toward excitatory subtypes (Fig. 4c).

Fig. 4: Neuronal subtype tropism profiling of systemic AAVs in mouse cortical layers and other brain regions. a, Labeling of the cortex region by DAPI and the six AAV variants. For excitatory cell layers and inhibitory subtypes, the left and middle panels show the real RNA images of selected genes acquired from the experiment, and the right panels show the cell types inferred from clustering based on endogenous gene expression. b, Endogenous (top, cividis color map) and viral gene expression profiles (enrichment: middle, viridis; relative tropism bias: bottom, coolwarm) across the cell type clusters identified. Colored cell types are visualized in a. c, The relative tropism bias of AAV-PHP.eB, AAV.CAP-B10 and AAV-PHP.N across the group of excitatory neurons (total 11 clusters) and inhibitory neurons (total six clusters; mean ± s.e.m.; two-sided unpaired t-test). d, The cortical neuron coverage of efficient variants (AAV-PHP.eB, AAV.CAP-B10, AAV-PHP.N and AAV-PHP.AX) measured by the inverse variance of relative tropism bias (the coolwarm heat map in b) across all cell type clusters (F-test on variance). We omitted AAV-PHP.V1 and AAV-PHP.B8 from this analysis as their transduction efficiency is too low to be considered for overall neuronal transduction. e, Viral expression profiles across selected mouse brain regions (cortex, striatum, thalamus and cerebellum). We separated the endogenous gene expression matrix into the fields of view of each region and used this profile to identify the regional bias of the variants. f, Endogenous and viral gene expression profiles in cell type clusters identified in the striatum, thalamus and cerebellum. We selected ten genes for striatum, ten for thalamus and nine for cerebellum that have been shown to be enriched in each region and classified cells into the clusters represented by each gene. Based on their Ward distance dendrogram, we manually merged clusters into known subtypes. Unlike the striatum and cerebellum, which are composed of genetically distinct cell types, the thalamus has relatively gradual variation in gene expression across topographical nuclei; therefore, we separated cells into three putative groups (marked by dashed lines).

Another interesting result is that, compared to other variants, AAV-PHP.AX showed relatively unbiased and broad coverage (measured by the inverse of bias variance) across neuronal subtype clusters identified (Fig. 4d). Given its relatively robust transduction efficiency and enrichment in astrocytes (Fig. 3d), this variant could potentially serve as a high-efficiency universal vector that can be paired with cell-type-specific gene regulatory elements for targeted transduction. In fact, despite showing less enrichment in astrocytes than AAV-PHP.eB with a ubiquitous promoter (Fig. 3d and Extended Data Fig. 5c), with the astrocyte-specific glial fibrillary acidic protein (GFAP) promoter, AAV-PHP.AX more efficiently transduced astrocytes than AAV-PHP.eB delivering the same cargo (Extended Data Fig. 5d,e). These results suggest that AAV-PHP.AX has a higher capacity for tropism modulation by engineered cargos. Collectively, our results show that USeqFISH can provide transduction and tropism profiles of AAVs at the cell subtype level, in addition to facilitating inter-variant comparison in the same animal.
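The coverage measure mentioned above (the inverse of the variance of relative tropism bias across clusters) can be sketched as follows. The cluster-by-variant bias matrix is a made-up toy input, the function name is hypothetical, and the use of the sample variance (`ddof=1`) is an assumption; the text does not state which estimator was used.

```python
import numpy as np


def coverage(bias_by_cluster, ddof=1):
    """Inverse variance of relative tropism bias across clusters, per variant.

    bias_by_cluster: array of shape (n_clusters, n_variants).
    A flatter bias profile (smaller variance) gives a larger coverage value.
    """
    bias = np.asarray(bias_by_cluster, dtype=float)
    return 1.0 / bias.var(axis=0, ddof=ddof)


# Toy bias profiles (made up): column 0 swings widely, column 1 is flatter.
toy_bias = np.array([[-1.0, -0.5],
                     [0.0, 0.0],
                     [1.0, 0.5]])
cov = coverage(toy_bias)
```

Here the second variant has a quarter of the variance of the first and therefore four times the coverage, which is the sense in which a relatively unbiased capsid scores as having broad coverage.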

Broadening our coverage to other brain areas, we further sought to determine the cell type tropisms of our AAV pool in mouse striatum, thalamus and cerebellum (Fig. 4e,f). From sagittal sections, we collected 4–6 fields of view of each region (including the cortex as a control) and pooled all the data for quantitative analysis (6,929 cells were analyzed in total). We first examined the enrichment and relative tropism of each variant across the regions regardless of cell type to see whether our approach could recapitulate prior observations of overall and region-biased expression patterns (Fig. 4e). Our results show that the thalamus was the most favorable among the four regions for all six capsids, as expected. Note that, compared to other variants, AAV.CAP-B10 was highly biased to cortex yet largely away from cerebellum, consistent with our previous observation8. Although slightly more enriched in thalamus than other regions, we discerned no noticeable regional bias for AAV-PHP.V1 and AAV-PHP.B8 compared to other variants, presumably due to the lower dose used in our pool and the lower transduction efficiency of both variants.

Next, we profiled our AAV pool across major neuronal subtypes in each brain region. We selected ten cell type marker genes enriched in the striatum66, ten in the thalamus68 and nine in the cerebellum67 (Fig. 4f). We then identified cell type clusters represented by these individual marker genes and manually merged them into known groups based on Ward distances between clusters. Note that, whereas the striatum and the cerebellum are composed of genetically distinct cell types, thalamic cell profiles have been shown to be rather continuous along topographically organized nuclei68; therefore, we split the clusters into three major groups based on markers (Tnnt1 for primary, Necab1 for secondary and Calb2 for tertiary nuclei)68, and this putative separation is marked by dashed lines in Fig. 4f. In the striatum (total 2,010 cells), we found that all six variants showed similar enrichment and relative tropism. Most variants transduced both D1 and D2 medium spiny neurons (MSNs) as well as Gad2+ inhibitory cells (with a slight preference for Th+ cells). In the thalamus (total 1,481 cells) and cerebellum (total 1,428 cells), on the other hand, they transduced most region-specific cell types with slightly distinct preferences. In the thalamus, AAV-PHP.eB and AAV.CAP-B10 were highly biased toward Prkcd+ cells, whereas AAV-PHP.N and AAV-PHP.AX preferred Calb2+ cells (Fig. 4f and Extended Data Fig. 7a). In the cerebellum, AAV-PHP.eB showed a bias toward Purkinje cells in the Purkinje layer (PL) and the molecular layer (ML), whereas AAV-PHP.N and AAV-PHP.AX were biased toward the granular layer (GL) (Fig. 4f and Extended Data Fig. 7b). Despite low overall enrichment, AAV.CAP-B10 showed relative bias toward ML and Golgi cells, with a preference for Gdf10+ Bergmann cells in the PL compared to other variants. These results not only reveal new cell type tropisms of systemic AAVs for region-specific cell types that have not been readily accessible with IHC but also demonstrate the scalability of USeqFISH-based AAV profiling across diverse brain regions without loss of throughput or resolution.

In situ profiling of pooled regulatory cargos

In addition to the capsid profiling, we examined the capability of USeqFISH for in situ characterization of pooled regulatory cargos of systemic AAVs. Regulatory sequences inserted in the 5′ or 3′ untranslated region (UTR) of AAV genomes have been used widely to control the expression of transgenes in targeted cell types or organs. For example, miRNA TS has been shown to have potential use for cell-type-specific transgene expression and mitigation of AAV toxicity for clinical applications owing to its ability to suppress transgene expression in the cells or tissue where the respective miRNA is highly expressed26,70,71. However, although sequencing studies have provided large datasets of the differential expression of miRNAs across organs and cell types, a lack of systemic approaches to validate them with AAVs has allowed us to identify only a few thus far.

To test the capability of USeqFISH for high-throughput profiling of regulatory cargos, we designed a pool of 13 variants, each carrying a unique barcode and four tandem repeats of a miRNA TS (a sequence complementary to a microRNA) in the 3′ UTR of the AAV genome (12 variants with unique miRNA TS and one control without miRNA TS; Fig. 5a). We selected TSs of miRNAs that were shown to be abundant and differentially expressed across cell types, based on previous miRNA sequencing studies72,73,74. All cargos were packaged in AAV-PHP.eB, and pooled viruses were IV administered to mice at a dose of 1 × 10¹⁰ vg per variant (a total dose of 1.3 × 10¹¹ vg per animal). After 4 weeks of expression, we harvested the brains and proceeded with USeqFISH profiling with 24 neuronal subtype marker genes (a total of 9,289 cells were analyzed in a 1.68 mm × 1.41 mm area; Fig. 5b). After the cell type clustering based on endogenous gene expression, we measured the enrichment of each variant in each cell type identified and the log2 fold change of enrichment compared to the control (the variant with no miRNA TS, ‘No TS’).
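
The comparison against the control can be sketched as follows. This is a minimal illustration, not the analysis code used in this study. It assumes that enrichment has already been summarized as one positive number per variant and cell type (the exact enrichment metric is not specified in this section), and the function name `log2_fold_change` and all numerical values are hypothetical.

```python
import math


def log2_fold_change(enrichment, control="No TS"):
    """log2(variant / control) for every non-control variant and cell type.

    `enrichment` maps variant name -> {cell type -> positive enrichment}.
    Zero values would need a pseudocount, which is not handled here.
    """
    reference = enrichment[control]
    return {
        variant: {
            cell_type: math.log2(value / reference[cell_type])
            for cell_type, value in per_type.items()
        }
        for variant, per_type in enrichment.items()
        if variant != control
    }


# Hypothetical enrichment values for the control and one TS variant.
enrichment = {
    "No TS": {"Pvalb+": 8.0, "L2/3": 4.0},
    "204-5p": {"Pvalb+": 8.0, "L2/3": 1.0},
}
lfc = log2_fold_change(enrichment)

# Dose check from the text: 13 variants at 1e10 vg each.
total_dose_vg = 13 * 1e10
```

With these toy numbers, the hypothetical variant has a log2 fold change of 0 in Pvalb+ cells (no suppression) and of -2 in L2/3 cells (a fourfold reduction relative to ‘No TS’). The dose check reproduces the stated total of 1.3 × 10¹¹ vg per animal.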

Fig. 5: USeqFISH profiling of pooled microRNA target sites in the AAV genome across neuronal subtypes in mouse cortical layers. a, A pool of 13 variants (12 miRNA TSs in the 3′ UTR of the AAV genome and one control, ‘No TS’) was designed, packaged in AAV-PHP.eB and IV delivered to mice. We applied USeqFISH to the brain tissue harvested after 4 weeks of expression. b, Labeling of the cortex region by DAPI and the spatial location of cell types identified from clustering analysis based on endogenous gene expression. Two representative images of transgene expression (‘No TS’ and ‘433-3p’) are shown at the bottom. c, Endogenous (top, cividis color map) and viral gene expression profiles (enrichment: middle, viridis; log2 fold change: bottom, coolwarm) across the cell type clusters identified. We identified 16 total clusters, including two L2/3 (red), three L4 (orange), one L5 (green), two L5/6 (sky blue), one L6a (blue), two L6b (purple), four inhibitory and one hippocampal neuron. Colored cell types are visualized in b.

Our selection of cell type marker genes revealed 16 clusters, including two L2/3, three L4, one L5, two L5/6, one L6a, two L6b, four inhibitory and one hippocampal neuron (Fig. 5b,c). Note that, in this experiment, we intentionally selected a different set of genes for cell type markers from the previous experiments, demonstrating the robustness of USeqFISH for cell type clustering. We also confirmed that the expression patterns of genes we selected in this experiment were consistent with the in situ hybridization data from the Allen Brain Atlas (Extended Data Fig. 8a). The overall transgene expression was enriched the most in inhibitory neurons (highest in Pvalb+ cells) and biased toward the lower layers (L5/6) of the cortex, which is consistent with our previous AAV-PHP.eB transduction profile (Fig. 4). Among these identified cell type clusters, we revealed distinct expression profiles of AAV genomes (Fig. 5c and Extended Data Fig. 8b). We found strong inhibition of transgene expression under control of miR1a-1 and miR433-3p TSs, which will potentially be useful for targeting peripheral systems with minimal toxicity to the brain. We also found that the TSs of 128-3p and 221-3p inhibit transgene expression less in inhibitory neurons than in excitatory neurons, although the combination of these TSs has previously been shown to have higher specificity in targeting inhibitory neurons26. This discrepancy could be due to differences in administration routes (systemic versus direct), the number of tandem repeats (four versus ten) and the promoter used (CAG versus hSyn). In this study, we instead found that the TS of 204-5p is promising for increasing inhibitory neuronal specificity for systemic gene delivery, because it reduces transgene expression mostly in excitatory neurons with a minimal effect on inhibitory neurons. Another interesting observation is the overall higher expression of transgenes across cell types with the TS of miR126a-3p. We speculate that the TS of miR126a-3p might reduce the endogenous level of miR126a-3p, which could increase the permeability of the BBB75. Although more experimental evidence will be required to understand the mechanism of miRNA TS effects on AAV transgene regulation, our results suggest that USeqFISH profiling provides an efficient approach to investigate the regulatory effect of engineered cargos for systemic AAVs in tissue.

USeqFISH application to NHP brains

For successful translation of engineered AAVs into therapeutic tools, they must be evaluated and characterized in NHPs, which is a considerable challenge owing to limited resources (for example, animal models, antibodies and atlases) and much longer turnaround times than in rodents. Because high-throughput, high-resolution AAV profiling could address these challenges, we sought to apply USeqFISH to NHP tissue. We reasoned that, as in mice, USeqFISH would allow us to detect synthetic transgenes, such as fluorescent proteins (FPs), in the NHP brain resulting from successful viral gene delivery. Using AAV.CAP-Mac, a capsid variant that we recently developed for efficient transduction of the rhesus macaque CNS29, we delivered an FP-encoding transgene and confirmed that USeqFISH with probes targeting the FP sequence can detect viral transcripts in cells expressing FPs in the NHP brain (Extended Data Fig. 9a). Detecting the endogenous RNA of NHPs requires probes specifically designed and filtered against the genes of each species. To do so, we expanded our probe design pipeline to incorporate two representative NHP species, the marmoset (Callithrix jacchus) and the rhesus macaque (Macaca mulatta), and validated the ability of USeqFISH to detect endogenous mRNAs (for example, major inhibitory subtypes: Pvalb, Sst and Vip) and viral transcripts expressed in the intact brain tissue of both species (Fig. 6a,b). Applying post hoc IHC to the USeqFISH-labeled NHP tissue verified the suitability of our probe design for the NHP genes and the compatibility of USeqFISH with IHC (Extended Data Fig. 9b). These results demonstrate the applicability of USeqFISH to in situ detection of endogenous and AAV-delivered genes in NHP tissues and support potential translation of USeqFISH-based AAV profiling into NHPs.

Fig. 6: USeqFISH application to NHPs: in situ AAV detection and integrative analysis of cell morphology and transcriptional profiles. a,b, We applied USeqFISH to brain tissue slices of marmoset (a) and rhesus macaque (b) to which our viruses were administered (eight pooled variants for the marmoset and AAV.CAP-Mac for the rhesus macaque) with probes against three endogenous genes (yellow: Pvalb; green: Sst; magenta: Vip) and the coding sequence of each viral genome (human frataxin for the marmoset and mNeonGreen for the rhesus macaque (cyan); FPs were quenched by proteinase K (ProK) treatment). The representative images show that USeqFISH is applicable to these two NHP species with species-specific probes. c, Schematic of procedure of vector-assisted spectral tracing (VAST) and subsequent USeqFISH profiling of the rhesus macaque brain. We systemically delivered a cocktail of three AAV.CAP-Mac viruses packaging mNeonGreen, mTurquoise2 or mRuby2 to an infant rhesus macaque and recovered the brain. This brain exhibited a variety of colors, coming from stochastic expression of the three FPs, allowing us to trace single-cell morphologies. We additionally labeled seven endogenous genes (Pvalb, Sst, Vip, Lamp5, Slc17a7, Crym and Nr4a2) using USeqFISH in the same tissue to identify transcriptionally defined cell types and their morphology. d, Representative image of integration of VAST and USeqFISH with seven cell marker genes in the rhesus macaque brain and examples of two cells (yellow outlined box: i; red outlined box: ii) identifying both cell type and morphology.

We further explored the potential application of USeqFISH, in combination with viral tools, for multimodal in situ single-cell analysis of the NHP brain. As a proof of concept, we designed an experiment to integrate viral cell morphology labeling and USeqFISH-based transcriptional profiling in the same NHP tissue. We used AAV.CAP-Mac to efficiently deliver three cargos, each encoding an FP (mNeonGreen, mTurquoise2 or mRuby2), to the infant rhesus macaque via systemic administration (Fig. 6c). As a result, the brain expressed a variety of colors resulting from a stochastic mixture of the three FPs, allowing us to readily identify the morphology of single cells29. After imaging of FPs in the cortical area (area size: 1.14 mm × 1.14 mm × 100 µm), we treated the tissue with proteinase K to quench the FP signal and subsequently proceeded with two rounds of USeqFISH for seven cell type markers (Pvalb, Sst, Vip, Lamp5, Slc17a7, Crym and Nr4a2). Although coverage was sparse, this approach allowed us to trace the morphology of cells with transcriptional identities (Fig. 6d). These results demonstrate the versatility of USeqFISH and its compatibility with other single-cell labeling and barcoding methods, suggesting that it can be integrated with viral tools to explore the cellular and molecular architecture of tissue across species.