Main

A central principle of neuroscience is that the nervous system is a network of diverse types of neurons and supporting cells communicating with each other mainly through synaptic connections. This overall brain architecture is thought to be composed of four systems—motor, sensory, behavioural state and cognitive—with parallel, distributed and/or hierarchical sub-networks within each system and similarly complex, integrative interconnections between different systems1. Specific groups of neurons with diverse anatomical and physiological properties populate each node of these sub- and supra-networks, and form extraordinarily intricate connections with other neurons located near and far. Neuronal connectivity forms the structural foundation underlying neural function, and bridges genotypes and behavioural phenotypes2,3. Connectivity patterns also reflect the evolutionary conservation and divergence in brain organization and function across species, as well as both the commonality among individuals within a given species and the uniqueness of each individual brain.

Despite the fundamental importance of neuronal connectivity, our knowledge of it remains remarkably incomplete. C. elegans is the only species for which an essentially complete wiring diagram of its 302 neurons has been obtained through electron microscopy4. Histological tract tracing studies in a wide range of animal species have generated a rich body of knowledge that forms the foundation of our current understanding of brain architecture, such as the powerful idea of multi-hierarchical processing in sensory cortical systems5. However, much of these data are qualitative, incomplete, variable, scattered and difficult to retrieve. Thus, our knowledge of whole-brain connectivity is fragmented, without a cohesive and comprehensive understanding in any single vertebrate animal species (see for example the BAMS database for the rat brain6). With recent advances in both computing power and optical imaging techniques, it is now feasible to systematically map connectivity throughout the entire brain. A salient example of this is the ongoing effort in mapping connections in the Drosophila brain7,8.

The connectome9 refers to a comprehensive description of neuronal connections, for example, the wiring diagram of the entire brain. Given the enormous range of connectivity in the mammalian brain and the relative inaccessibility of the human brain, such descriptions can exist at multiple levels: macro-, meso- or microscale. At the macroscale, long-range, region-to-region connections can be inferred from imaging white-matter fibre tracts through diffusion tensor imaging (DTI) in the living brain10. However, this is far from cellular-level resolution, given the size of single volume elements (voxels >1 mm³). At the microscale, connectivity is described at the level of individual synapses, for example, through electron microscopic reconstruction at the nanometre scale4,11,12,13,14,15. At present, the enormous time and resources required for this approach make it best suited for relatively small volumes of tissue (<1 mm³). At the mesoscale, both long-range and local connections can be described using a sampling approach with various neuroanatomical tracers that enable whole-brain mapping in a reasonable time frame across many animals. In addition, cell-type-specific mesoscale projects have the potential to dramatically enhance our understanding of the brain’s organization and function because cell types are fundamental cellular units often conserved across species16,17.

Here we present a mesoscale connectome of the adult mouse brain, The Allen Mouse Brain Connectivity Atlas. Axonal projections from regions throughout the brain are mapped into a common 3D space using a standardized platform to generate a comprehensive and quantitative database of inter-areal and cell-type-specific projections. This Connectivity Atlas has all the desired features summarized in a mesoscale connectome position essay18: brain-wide coverage, validated and versatile experimental techniques, a single standardized data format, a quantifiable and integrated neuroinformatics resource and an open-access public online database.

Creating the Allen Mouse Brain Connectivity Atlas

A standardized data generation and processing platform was established (Fig. 1a, see Methods). Recombinant adeno-associated virus (AAV), serotype 1, expressing EGFP optimally was chosen as the anterograde tracer to map axonal projections19,20. We also confirmed that AAV was at least as efficient as, and more specific than, the classic anterograde tracer biotinylated dextran amine (BDA) (Extended Data Fig. 1), as described separately21.

Figure 1: Creation of the Connectivity Atlas.

a, The data generation and processing pipeline. QC, quality control. b, The two main steps of informatics data processing: registration of each image series to a 3D template (upper panels) and segmentation of fluorescent signal from background (lower panels). c, Distribution of injection sites across the brain. The volume of the injection was calculated and represented as a sphere. Locations of all these injection spheres are superimposed together (left panel). Mean injection volumes (± s.e.m.) across major brain subdivisions are shown (right panel, see Extended Data Fig. 3).

EGFP-labelled axonal projections were systematically imaged using the TissueCyte 1000 serial two-photon (STP) tomography system22, which couples high-speed two-photon microscopy with automated vibratome sectioning of an entire mouse brain. High xy resolution (0.35 µm) 2D images in the coronal plane were obtained at a z-sampling interval of 100 µm across the entire brain during a continuous 18.5 h scanning period, resulting in 140 serial sections (a 750 gigabyte (GB) data set) for each brain (Extended Data Fig. 2a and Supplementary Video 1). Owing to its block-face imaging nature, STP tomography essentially eliminates tissue distortions that occur in conventional section-based histological methods and provides a series of highly conformed, inherently pre-aligned images amenable to precise 3D mapping.

Image series were processed in an informatics pipeline with a series of modules (see Methods). The injection site location of each brain was manually drawn and annotated using the Allen Reference Atlas23 and other reference data sets when appropriate. Stringent quality control criteria were applied, discarding 25% of all scanned brains due to insufficient quality in labelling or imaging. Each image set was registered into a 3D Allen Reference Atlas model in two steps (Fig. 1b, upper panels). First, a registration template was created by averaging many image sets, and every image stack was aligned to this average template brain. This process was repeated for multiple rounds, first globally (affine registration) and then locally (deformable registration), each round generating a better average template and more precise alignment of individual brains. The final average template brain, averaged from 1,231 brains, shows remarkably clear anatomical features and boundaries. Second, the average template brain was aligned with the 3D reference model, again using local alignment (Supplementary Video 2).

We developed a signal detection approach and applied it to each section to segment GFP signals from background (Fig. 1b, lower panels). Signals within injection site polygons were computed separately from the rest of the brain. The segmented pixel counts were gridded into 100 × 100 × 100 µm³ voxels to create an isotropic 3D summary of the projection data. These voxels were used for data analysis, real-time data and correlative searches, and visualization of projection relationships in the Brain Explorer.
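
The gridding step can be illustrated with a minimal sketch. It assumes that segmented pixels are available as integer (column, row, section) indices and uses the 0.35 µm pixel size, 100 µm section spacing and 100 µm voxel edge quoted in the text. The function name and data layout are hypothetical and do not reproduce the production pipeline.

```python
from collections import Counter

XY_RES_UM = 0.35    # in-plane pixel size quoted in the text
Z_STEP_UM = 100.0   # spacing between serial sections quoted in the text
VOXEL_UM = 100.0    # isotropic voxel edge quoted in the text


def grid_segmented_pixels(pixels):
    """Count segmented pixels per 100 µm voxel.

    `pixels` is an iterable of (col, row, section) integer indices of pixels
    flagged as signal (hypothetical layout). Returns a Counter keyed by the
    voxel index (vx, vy, vz).
    """
    counts = Counter()
    for col, row, section in pixels:
        # Convert pixel indices to micrometres, then to voxel indices.
        vx = int(col * XY_RES_UM // VOXEL_UM)
        vy = int(row * XY_RES_UM // VOXEL_UM)
        vz = int(section * Z_STEP_UM // VOXEL_UM)
        counts[(vx, vy, vz)] += 1
    return counts
```

Because the section spacing equals the voxel edge, each section contributes to exactly one voxel layer in this sketch, and each voxel spans roughly 286 pixels along each in-plane axis.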

Meaningful informatics data quantification and comparison rely on the mapping precision of the raw data sets into the 3D reference framework. We investigated registration variability in two ways. First, we selected 10 widely distributed anatomical fiducial points to compare variability among 30 randomly selected brains (Extended Data Fig. 2b). We found a high degree of concordance among individual brains, with median variation < 49 µm in each dimension between each brain and the average template brain, which is comparable to the median inter-rater variation of < 39 µm. The median difference is < 71 µm between each brain and the Reference Atlas. Second, we compared manual and informatics annotations of the injection sites from all Phase I (see below) brains. The informatics-derived assignment of injection site structures had > 75% voxel-level concordance with manual expert annotation for almost all injection sites (Extended Data Fig. 2c). These analyses confirmed the relatively high fidelity of co-registration of raw image data with the Allen Reference Atlas. The remaining difference is mainly due to the imperfect alignment between the average template brain and the Nissl-section-based Reference Atlas (Supplementary Video 2).

Mapping axonal projections in the whole mouse brain

The connectivity mapping was carried out in two phases. In Phase I (regional projection mapping), axonal projections from 295 non-overlapping anatomical regions, defined from the Allen Reference Atlas ontology and tiling the entire brain space (Supplementary Table 1), were characterized in wild-type mice with a pan-neuronal AAV vector expressing EGFP under the human synapsin I promoter (AAV2/1.pSynI.EGFP.WPRE.bGH, Fig. 1a). In Phase II (Cre driver based projection mapping), axonal projections from genetically defined neuronal populations are characterized in Cre driver mouse lines with a Cre-dependent AAV (AAV2/1.pCAG.FLEX.EGFP.WPRE.bGH, Fig. 1a). We only report here on the completed Phase I study, which includes 469 image sets with injection sites covering nearly the entire brain (Fig. 1c, Extended Data Fig. 3 and Supplementary Video 3). Only 18 intended structures were completely missed due to redundancy or injection difficulty (Supplementary Table 1).

We examined multiple projection data sets in detail and found that they were complete in capturing all known projection target sites throughout the brain, sensitive in detecting thin axon fibres, and consistent in quality to allow qualitative and quantitative comparisons. As an example, 7 representative isocortical injections (Fig. 2) reveal distinct projection patterns in the striatum, thalamus, zona incerta, midbrain, pons and medulla. To compare the brain-wide spatial distribution of projections between cortical source regions, we placed each isocortical injection experiment into one of 9 broad functional groups: frontal, motor, anterior cingulate, somatosensory, auditory, retrosplenial, visual, ventral and associational areas (Extended Data Fig. 4). The average percentages of total projection signals into 12 major brain subdivisions showed disproportionately large projections within the isocortex, as well as distinct subcortical distributions.

Figure 2: Whole brain projection patterns from seven representative cortical regions.

One coronal section at the centre of each injection site is shown in the top row (see Supplementary Table 1 for the full name of each region). In the second row, 3D thumbnails of signal density projected onto a sagittal view of the brain reveal differences in brain-wide projection patterns. The bottom 6 rows show examples of EGFP-labelled axons in representative subcortical regions.

Brain-wide connectivity matrix

After segmentation and registration, we derived quantitative values from segmented signals in each of the 500,000 voxels contained within each brain. We constructed a brain-wide, inter-areal, weighted connectivity matrix using the entire Phase I experimental data set (Fig. 3, see Supplementary Table 2 for the underlying values). The Allen Reference Atlas contains 863 grey-matter structures at the highest level of the ontology tree (Supplementary Table 1). We focused our analyses on the chosen 295 structures, which are at a mid-ontology level corresponding best with the approximate size of the tracer infection areas (for example, isocortical areas are not subdivided by layers in this matrix), but our techniques may be used at deeper levels in future studies. The projection signal strength between each source and target was defined as the total volume of segmented pixels in the target (summed across all voxels within each target), normalized by the injection site volume (total segmented pixels within the manually drawn injection area).
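
The normalization described above can be written as a short sketch. The function name and argument layout are hypothetical; the only content taken from the text is the definition itself, that is, the summed segmented signal in the target divided by the segmented signal in the injection site.

```python
def projection_strength(target_voxel_counts, injection_pixel_count):
    """Normalized projection strength from one injection to one target.

    `target_voxel_counts` holds the segmented pixel count of every voxel
    assigned to the target region; `injection_pixel_count` is the total
    segmented pixel count inside the manually drawn injection area.
    """
    if injection_pixel_count <= 0:
        raise ValueError("injection site must contain segmented signal")
    return sum(target_voxel_counts) / injection_pixel_count
```

Dividing by the injection volume makes experiments with different injection sizes comparable, at the cost of assuming that projection signal scales with the number of infected neurons.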

Figure 3: Adult mouse brain connectivity matrix.

Each row shows the quantitative projection signals from one of the 469 injected brains to each of the 295 non-overlapping target regions (in columns) in the right (ipsilateral) and left (contralateral) hemispheres. Both source and target regions are displayed in ontological order. The colour map indicates log10-transformed projection strength (raw values in Supplementary Table 2). All values less than 10^−3.5 are shown as blue to minimize false positives due to minor tissue and segmentation artefacts and all values greater than 10^−0.5 are shown as red to reduce the dominant effect of projection signals in certain disproportionately large regions (for example, striatum).

The majority of the 469 Phase I image sets are single injections into spatially distinct regions, but a subset of these are repeated injections into the same regions. To assess the consistency of projection patterns across different animals and the reliability of using a single experiment to define connections from any particular region, we compared brain-wide connectivity strengths in 12 sets of duplicate injections (Extended Data Fig. 5). Each pair was highly correlated across a range of projection strengths. Differences between any two points were on average only a half order of magnitude (within one standard deviation). In primate cortex, single tracer injections were also found to reliably predict mean values obtained from repeated injections into the same source24.

The AAV tracer expresses cytoplasmic EGFP, which labels all processes of the infected neuron, including axons and synaptic terminals. Signals associated with the major fibre tracts of the brain, marked in the Allen Reference Atlas, were removed before the informatics quantification. However, there are also areas (for example, striatum) where axons pass through without making synapses. Although passing fibres can generally be distinguished from terminal zones by visual inspection of morphology in the 2D images (axons in terminal zones ramify and contain synaptic boutons, see Extended Data Fig. 6), it is difficult to confidently make this distinction algorithmically. We compared results of terminal labelling using Synaptophysin-EGFP-expressing AAV with the cytoplasmic EGFP AAV (Extended Data Fig. 6). Outside of major fibre tracts, there was high correspondence between synaptic EGFP and cytoplasmic EGFP signals in target regions. Nonetheless, it should be noted that the connectivity matrix contains passing fibre signals within grey matter, the nature of which should be manually examined in 2D section images.

This connectivity matrix (Fig. 3) has several striking features. First, connectivity strengths span a greater than 10^5-fold range across the brain (Extended Data Fig. 7), suggesting that quantitative descriptions of connectivity must be considered for understanding neural network properties24. Second, there are prevalent bilateral projections to corresponding ipsilateral and contralateral target sites, with ipsilateral projections generally stronger than contralateral ones (total normalized projection volumes from all experiments are 4.3:1 between ipsilateral and contralateral hemispheres). Third, of all possible connections, strong connections are found in only a small fraction. Whereas 63% of ipsilateral and 51% of contralateral targets have projection strength values above the minimal true positive level of 10^−4 (which has a potential false positive rate of 27%, Extended Data Fig. 7), only 21% of ipsilateral and 9% of contralateral targets have projection strength values above the intermediate level of 10^−2.

An inter-region connectivity model

Infected neurons in injection sites often span several brain areas. To better describe the mutual connection strengths between ontologically defined regions rather than injection sites, we constructed inter-region connectivity matrices via a computational model (see Supplementary Notes for a detailed description), using segmented projection volumes (Fig. 3) to define connection strengths. Two basic modelling assumptions were used. The first, regional homogeneity, assumes that projections between source X and target Y regions are homogeneously distributed, so that infection of a subarea of the region is representative of the entire region. This allows the value of W_{X,Y}, a regional connectivity measure, to be inferred from data that can at best only sample the source region. The second assumption, projection additivity, assumes that the projection densities of multiple source regions sum linearly to produce the projection density in a target region. This allows relative contributions of different sources to be determined for a target region, assuming at least partially independent injections.

The 469 experiments allowed us to compute the mutual connections among 213 regions. The best-fit model (Fig. 4a, see Supplementary Table 3 for the underlying values) results from a bounded optimization followed by a linear regression to determine connection coefficients, assigning statistical confidence (P values) to each connection in the matrix. Based on the bounded optimization, the number of non-zero entries provides an upper bound estimate for sparsity: 36% for the entire brain and 52% for cortico-cortical connections. Using confidence values for each non-zero connection, the lower bound on sparsity is 13% for the entire brain and 32% for cortico-cortical connections.
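
As an illustration of how the additivity assumption leads to a bounded fit, the following sketch estimates the weights from all sources to one target by non-negative least squares, solved with projected gradient descent. This is not the procedure of the Supplementary Notes: the non-negativity bound, the solver, the function name and the input layout (rows of `A` are experiments, columns are source regions, entries are the injected volume in each source) are all assumptions made for the example, and the subsequent regression that assigns P values is not covered.

```python
def nnls_projected_gradient(A, y, iterations=2000):
    """Minimize ||A w - y||^2 subject to w >= 0 (illustrative solver).

    A: list of rows, one per experiment; A[e][s] is the injection volume that
       experiment e placed in source region s (hypothetical layout).
    y: projection volume observed in one target region for each experiment.
    """
    n_exp, n_src = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe (if conservative) step size.
    frob_sq = sum(a * a for row in A for a in row)
    if frob_sq == 0:
        return [0.0] * n_src
    step = 1.0 / frob_sq
    w = [0.0] * n_src
    for _ in range(iterations):
        residual = [
            sum(A[e][s] * w[s] for s in range(n_src)) - y[e]
            for e in range(n_exp)
        ]
        grad = [
            sum(A[e][s] * residual[e] for e in range(n_exp))
            for s in range(n_src)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[s] - step * grad[s]) for s in range(n_src)]
    return w
```

In this toy formulation, an injection that spans two source regions contributes one equation that mixes both weights, which is why at least partially independent injections are needed to separate them.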

Figure 4: A computational model of inter-regional connection strengths.

a, The inter-region connectivity matrix, with connection strengths represented in colours and statistical confidence depicted as an overlaid opacity. Note that in this matrix, the sources (rows) are regions, whereas for the matrix of Fig. 3, the sources are injection sites. b, Both whole-brain and cortico-cortical connections can be fit by one-component lognormal distributions (red lines). However, the log distributions of whole-brain connection strengths are best fit by a two-component Gaussian mixture model (green lines). c, Node degree and clustering coefficient distributions for a binarized version of the linear model network, compared against Erdős-Rényi, Watts-Strogatz and Barabási-Albert networks with matched graph statistics. d, Comparison of the correlation coefficients of normalized connection density between areas, defined as the common source for projections to other regions (left) and as the common target of projections from other regions (right).

Connection strengths spanned a 10^5-fold range and were negatively correlated with the distance between connected regions (Supplementary Notes and Supplementary Table 4). Based on the Akaike information criterion (AIC), among hypothesized connection strength distributions (lognormal, normal, exponential, inverse Gaussian) the brain-wide data are best fit by a lognormal distribution (Fig. 4b, red lines). However, the log-transformed connection strengths failed to pass the Shapiro-Wilk test for normality (ipsilateral: P = 0.039; contralateral: P = 0.023), and among Gaussian mixture models, a two-component one provided the best fit (Fig. 4b, green lines). For cortico-cortical connections, both intra- and inter-hemispheric distributions are well fit by lognormals (ipsilateral: P = 0.23; contralateral: P = 0.21) individually (Fig. 4b), but they are different enough that when combined the distribution is no longer lognormal (P = 0.0019). This extends previously reported findings that cortico-cortical connections follow a lognormal distribution in the primate24,25 and mouse cortex26 to the entire mouse brain. These observations combined indicate that connections might be lognormally distributed within a region, yet vary systematically with statistics unique to the region.
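
The AIC comparison can be sketched for three of the four candidate distributions using closed-form maximum-likelihood fits. The inverse Gaussian, the Shapiro-Wilk test and the Gaussian mixture fits are omitted here, and all function names are hypothetical. Inputs are assumed to be strictly positive connection strengths that are not all identical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def loglik_normal(xs):
    """Maximized log-likelihood of a normal distribution (MLE mean, variance)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def loglik_lognormal(xs):
    """Normal log-likelihood of ln(x) plus the change-of-variable term."""
    logs = [math.log(x) for x in xs]
    return loglik_normal(logs) - sum(logs)


def loglik_exponential(xs):
    """Maximized log-likelihood of an exponential distribution (rate = 1/mean)."""
    n = len(xs)
    mean = sum(xs) / n
    return -n * (math.log(mean) + 1)


def rank_models(xs):
    """Return candidate names sorted from lowest (best) to highest AIC."""
    scores = {
        "lognormal": aic(loglik_lognormal(xs), 2),
        "normal": aic(loglik_normal(xs), 2),
        "exponential": aic(loglik_exponential(xs), 1),
    }
    return sorted(scores, key=scores.get)
```

A lower AIC indicates a better trade-off between fit and parameter count, so the first name returned by `rank_models` is the preferred candidate among those tested.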

Previous studies on connectivity considered global organizational principles from a graph-theory perspective26,27,28. We transformed our weighted, directed, connectivity matrix (Fig. 4a) to binary directed and binary undirected data sets. Network analyses (Fig. 4c, see also Supplementary Notes) reveal that the mouse brain has a higher mean clustering coefficient (which gives the ratio of existing over possible connections), 0.42, than expected by a random network29,30 with identical sparseness, 0.12. Random graphs with matched node degree distribution show a similar drop in clustering coefficient to 0.16. A ‘small-world’ network model31 approximates the clustering coefficient distribution after being fit to its mean; however, its node degree distribution poorly matches the data. Here, a better fit is achieved with a scale-free network32; however, neither model simultaneously fits both distributions.
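
The comparison between the observed clustering coefficient and that of a random network can be illustrated with a minimal sketch. It assumes a square weight matrix given as nested lists and a hypothetical binarization threshold; it does not reproduce the analysis in the Supplementary Notes. For an Erdős-Rényi graph the expected clustering coefficient is approximately the edge density, which is the reference value computed by `edge_density`.

```python
def binarize_undirected(weights, threshold=0.0):
    """Build an undirected neighbour list: i and j are linked when either
    directed weight exceeds `threshold` (hypothetical criterion)."""
    n = len(weights)
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and (weights[i][j] > threshold or weights[j][i] > threshold):
                neighbours[i].add(j)
                neighbours[j].add(i)
    return neighbours


def mean_clustering(neighbours):
    """Mean over nodes of (links among a node's neighbours) / (possible links).
    Nodes with fewer than two neighbours contribute zero."""
    total = 0.0
    for nbrs in neighbours:
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in neighbours[a])
        total += links / (k * (k - 1) / 2)
    return total / len(neighbours)


def edge_density(neighbours):
    """Fraction of possible undirected edges that are present."""
    n = len(neighbours)
    edges = sum(len(nbrs) for nbrs in neighbours) / 2
    return edges / (n * (n - 1) / 2)
```

A mean clustering coefficient well above the edge density, as reported above (0.42 versus 0.12), indicates that neighbours of a region are linked to each other far more often than chance would predict.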

Next, we analysed similarity in connection patterns between different regions. Similarity is characterized by two measures: correlations between outgoing projections originating in two areas and correlations between incoming projections ending in these two areas. Figure 4d depicts heat maps of correlation coefficients between the same regions of the linear model (Fig. 4a) depicted across the rows (that is, as a common source for other regions), and down the columns (that is, as a common target from other regions). The number of strong correlations is larger than expected by chance, suggesting a tendency of regions to organize into clusters to allow for strong indirect connectivity.
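
The two similarity measures can be sketched as Pearson correlations over rows and columns of a connectivity matrix `W`, where `W[i][j]` is the strength from source i to target j. The function names and nested-list layout are hypothetical, and any normalization applied in the actual analysis is not reproduced.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return float("nan")  # undefined for a constant profile
    return cov / (norm_u * norm_v)


def source_similarity(W, i, j):
    """Correlation of outgoing projections (rows i and j)."""
    return pearson(W[i], W[j])


def target_similarity(W, i, j):
    """Correlation of incoming projections (columns i and j)."""
    return pearson([row[i] for row in W], [row[j] for row in W])
```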

The cortico-striatal-thalamic network

Different cortical areas project to different domains of striatum and thalamus with some degree of topography33,34,35. We used 80 isocortical injection experiments to examine this. Spearman’s rank correlation coefficient of segmented projection volumes of all voxels across the entire brain was computed between every pair of experiments, and hierarchical clustering led to 21 distinct groups, each containing 1 to 10 injections (Extended Data Fig. 8a, b). Such grouping effectively divides the cortex into 21 predominantly non-overlapping spatial zones as shown in a flat-map cortex representation (Fig. 5a) defined by similar output projections. To effectively visualize different projection patterns in a common 3D space, voxel densities from 21 selected injections, one (centrally located) from each cluster, were overlaid to create ‘dotograms’ (Fig. 5b, c and Extended Data Fig. 8c), demonstrating that projections from different cortical regions divide up striatum and thalamus into distinct domains.
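
The grouping procedure can be sketched as follows, assuming each experiment is represented by a vector of voxel-wise projection volumes. The distance (1 minus Spearman's coefficient) and the average-linkage merging rule are assumptions made for this example, since the linkage criterion is not stated here, and all names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(u, v):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    ru, rv = average_ranks(u), average_ranks(v)
    n = len(ru)
    mu, mv = sum(ru) / n, sum(rv) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(ru, rv))
    su = math.sqrt(sum((a - mu) ** 2 for a in ru))
    sv = math.sqrt(sum((b - mv) ** 2 for b in rv))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def cluster_experiments(profiles, n_groups):
    """Average-linkage agglomerative clustering on 1 - Spearman distance."""
    n = len(profiles)
    dist = [[1.0 - spearman(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

A rank-based coefficient is a natural choice for projection volumes because they span several orders of magnitude, so a few very dense voxels would otherwise dominate a linear correlation.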

Figure 5: Topography of cortico-striatal and cortico-thalamic projections.

a, Cortical domains in the cortex flat-map. Each circle represents one of 80 cortical injection experiments, whose location is obtained via multidimensional scaling from 3D to allow visualization of all the sites in one 2D plane. The size of the circle is proportional to the injection volume. Clustered groups from Extended Data Fig. 8b are systematically colour-coded. The selected injections for b are marked with a black outline. b, For co-visualization, voxel densities from the 21 selected injections from a are overlaid as ‘dotograms’ at 8 coronal levels for ipsilateral hemisphere. For the dotogram, one circle, whose size is proportional to the projection strength, is drawn for each injection in each voxel; the circles are sorted so that the largest is at the back and the smallest at the front, and are partially offset as a spiral. c, Enlarged view of the dotogram from the area outlined by a white box in b. d, 3D tractography paths in both cortical hemispheres. e, A medial view of 3D tractography paths into the ipsilateral caudoputamen. Voxel starting points are represented as filled circles and injection site end points as open circles. f, A top-down view of 3D tractography paths into the ipsilateral thalamus.

Average inter-group distances (Extended Data Fig. 9a–d) were used to quantify the degree to which inter-group spatial relationships within the cortex are preserved in target domains. Distance matrices for both ipsilateral and contralateral cortical targets were highly correlated with the distance matrix of injection sites, as were ipsilateral striatal and thalamic distance matrices. Weaker correlations were observed in contralateral striatum and thalamus. The computed distance matrices show that the spatial relationship between injection sites is recapitulated in the projections to striatum and thalamus, with some transformation of scale and rotation.

This highly synchronized topography can be determined via virtual tractography. Real tractography (following single axons) cannot be done because of the discrete 100-µm sampling between sections. Instead, from every voxel we computed a path back to the injection site by finding the shortest density-weighted distance through the voxels. The 3D tractography paths were plotted for both cortical hemispheres (Fig. 5d and Extended Data Fig. 9e), ipsilateral striatum and thalamus (Fig. 5e, f). The tractography shows that the paths themselves also retain the same spatial organization. In particular for the thalamus, anterior groups pass through fibre tracts in the striatum, narrowing through the globus pallidus, before spreading throughout the thalamus (Extended Data Fig. 9f). Posterior groups (RSP, VIS) bypass the striatum but retain a strict topography following the medial/lateral axis (Fig. 5f).
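
The idea of a shortest density-weighted path can be illustrated with Dijkstra's algorithm on a voxel grid. The cost function used here, the reciprocal of the projection density in the voxel being entered, is an assumption chosen so that paths prefer dense voxels; the exact weighting used for the figures is not specified in this section. The function name, the 6-neighbour connectivity and the dictionary layout are likewise hypothetical.

```python
import heapq


def density_weighted_path(density, start, goal, eps=1e-6):
    """Lowest-cost path from `start` to `goal` through a voxel grid.

    density: dict mapping voxel index (x, y, z) to projection density.
    Entering a voxel costs 1 / (density + eps), an assumed weighting.
    Returns the list of voxels from start to goal, or None if unreachable.
    """
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    while heap:
        cost, voxel = heapq.heappop(heap)
        if voxel == goal:
            break
        if cost > best.get(voxel, float("inf")):
            continue  # stale heap entry
        for dx, dy, dz in steps:
            nxt = (voxel[0] + dx, voxel[1] + dy, voxel[2] + dz)
            if nxt not in density:
                continue
            new_cost = cost + 1.0 / (density[nxt] + eps)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                previous[nxt] = voxel
                heapq.heappush(heap, (new_cost, nxt))
    if goal != start and goal not in previous:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path
```

Calling the function once per target voxel, with the injection site as `goal`, would yield a bundle of paths comparable in spirit to those plotted in Fig. 5d-f, under the stated assumptions.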

Although the striatum is a cellularly homogeneous structure that can be subdivided into distinct domains selectively targeted by cortical and other inputs36 (Fig. 5e), the thalamus is highly heterogeneous, composed of up to 50 discrete nuclei37, receiving and relaying diverse sensory, motor, behavioural state and cognitive information in parallel pathways to and from the isocortex. We constructed a comprehensive wiring diagram between major, functionally distinct cortical regions and thalamic nuclei in the ipsilateral hemisphere (Fig. 6 and Extended Data Fig. 10), by combining the quantitative connectivity matrix (Fig. 3) with the linear model (Fig. 4a), manual proof-checking in the raw image sets, and cross-referencing published literature (83 publications, mostly from rat data, see Extended Data Fig. 10). This wiring diagram demonstrates specific point-to-point interconnections between corresponding clusters that divide the cortico-thalamic system into six functional pathways: visual, somatosensory, auditory, motor, limbic and prefrontal. We also observed cross-talk between these pathways, mediated by specific associational cortical areas and integrative thalamic nuclei.

Figure 6: A wiring diagram of connections between major cortical regions and thalamic nuclei.

Upper and lower panels show projections from cortex to thalamus or from thalamus to cortex, respectively (ipsilateral projections only). Colour coding of different cortical regions and their corresponding thalamic nuclei is similar to the flat-map cortex in Fig. 5a. Thickness of the arrows indicates projection strength, which is shown in three levels as in Extended Data Fig. 10 and corresponds roughly to the red, orange and yellow colours in the raw connectivity matrix (Fig. 3). LGv and PF do not have significant projections to cortex. The reticular nucleus of the thalamus (RT) (the dashed box) is placed in between cortex and thalamus to illustrate its special role as a relay nucleus which all cortico-thalamic and thalamo-cortical projections pass through and make collateral projections into. The asterisks indicate that cortico-thalamic and thalamo-cortical projections in the gustatory/visceral pathway are between GU/VISC cortical areas and VPMpc/VPLpc nuclei (instead of VPM/VPL). See Supplementary Table 1 for the full name of each region.

The specific observations from our data are mostly consistent (with a few additions) with extensive previous studies in rats and the smaller number of studies in mice (Extended Data Fig. 10) as well as with studies in other mammalian species37,38,39, providing a comprehensive and unifying view of mouse cortico-thalamic connections for the first time. Much work is still needed to obtain a full picture of connectivity in the cortico-thalamic system, including intra-cortical and intra-thalamic connections, their relationships with the interconnections between cortex and thalamus, and the exquisite cortical laminar specificity of the originating and terminating zones of many of these connections5,38,40.

Discussion

The standardized projection data set and the informatics framework built around it provide a brain-wide, detailed and quantitative connectivity map that is the most comprehensive, to date, in any vertebrate species. The high-throughput whole-brain mapping approach is remarkably consistent across animals, with an average correlation of 0.90 across 12 duplicate sets of mice (Extended Data Fig. 5). Informatics processing of the data set, such as co-registration and voxelization, helps with direct comparison between any image series, and systematic modelling and computational analyses of the entire network. Furthermore, the entire data set preserves the 3D spatial relationship of different domains, pathways and topography (Fig. 5). Thus, our connectivity atlas lays the groundwork for large-scale analyses of global neural networks, as well as networks within and between different neural systems.

As an initial analysis of this large-scale data set, we present an examination of both general principles of whole brain architecture and specific properties of cortical connections. We found that projections within the ipsilateral hemisphere and to the corresponding locations in the contralateral hemisphere are remarkably similar across the brain (Figs 3 and 4; Pearson’s r = 0.595), with the contralateral connection strengths significantly weaker than ipsilateral ones. The mouse brain shows defining features of both small-world and scale-free networks, that is, it clusters and has hubs; but neither of these models in isolation can fully explain it. Interestingly, the connection strengths at both cortico-cortical and whole-brain levels show lognormal distributions, that is, long-tailed distributions with small numbers of strong connections and large numbers of weak connections. In connections among isocortex, striatum and thalamus, clustering analysis and virtual tractography recapitulate anatomical parcellation and topography of functional domains and projection pathways (Fig. 5). The extensive reciprocal connections between isocortex and thalamus (Fig. 6) further illustrate general principles of network segregation and integration.

Our Connectivity Atlas represents a first systematic step towards the full understanding of the complex connectivity in the mammalian brain. Through the process, limitations of the current approach and opportunities for future improvement can be identified. On the technical side, any potential new connections identified in the Phase I data set (which does not yet have extensive redundancy in regional coverage) will need to be confirmed with more data. Also, we cannot exclude the possibility that the AAV tracer we chose to use (with the specific promoter and serotype) may not be completely unbiased in labelling all neuronal types. The connectivity matrix has been shown to contain false positive signals (Extended Data Fig. 7), mainly due to tissue and imaging artefacts and injection tract contaminations. The connectivity matrix based on cytoplasmic EGFP labelling does not distinguish passing fibres from terminal zones, and examination of raw images is needed to help with such distinction, using features such as ramification of axon fibres, and boutons or enlargements in axons. The Atlas could also be enhanced in the future with more systematic mapping using synaptic-terminal-specific viral tracers as shown (Extended Data Fig. 6). Regarding signal quantification, we chose to use projection volume (sum of segmented pixel counts) over projection fluorescence intensity (sum of segmented pixel intensity), because we found the former more reliable and less variable across different brains (even after normalization). However, the use of projection volume will probably underestimate the strength of dense projections. Thus, the true range of projection strengths may go beyond the 10^5-fold reported here. Finally, we observed that the alignment between the average template brain and our existing Reference Atlas model (which was drawn upon Nissl sections) is not perfect, which leads to a degree of registration imprecision that could affect the accuracy of the quantitative connectivity matrices. Our work shows the need to generate a new reference model based on a realistic 3D brain, such as the average template brain presented here. Our data set can also help this by adding connectivity information to improve anatomical delineations previously defined solely by cyto- and chemoarchitecture.

Beyond the above technical issues, identities of the postsynaptic neurons at the receiving end of the mapped connections are not labelled and therefore unknown. Microscale, synaptic-level details are missing, and electrical connections through gap junctions are not revealed. Moreover, our mesoscale connectome provides a static, structural connectivity map, which is necessary but insufficient for understanding function. Moving from here to functional connectivity and circuit dynamics in a living brain will require fundamentally different approaches41. One important aspect concerns the types of synapses present in each connectional path, as determined by their neurotransmitter contents and their physiological properties. Anatomical connection strength (for example, numbers of axon fibres and boutons) needs to be combined with physiological connection properties (for example, excitatory vs inhibitory types of synapses, fast vs slow neurotransmission, and the specific strength and plasticity of each synapse) to yield a true functional connection strength.

With the goal of bridging structural connectivity and circuit function41, we have taken a genetic approach, using AAV viral tracers that express EGFP in either a pan-neuronal or cell-type-specific manner. The same neural networks mapped here can be further investigated by similar viral vectors expressing tools for activity monitoring (for example, genetically encoded calcium indicators) and activity manipulation (for example, channelrhodopsins)16. Furthermore, our ongoing efforts of Cre-driver dependent tracing will allow more specific connectivity mapping from discrete areas and specific functional cell types. Such cell-type-specific connectivity mapping is perhaps the greatest advantage of the genetic tracing approach, allowing dissection of differential projection patterns from different neuronal types that are often intermingled in the same region. The genetic tracing approach can be further extended into identification of inter-connected pre- and postsynaptic cell types and individual cells, using approaches such as retrograde or anterograde trans-synaptic tracing42,43,44. Our approach can also be applied to animal models of human brain diseases and the connectivity data generated here can be instructive to human connectome studies, which will help to further our understanding of human brain connectivity and its involvement in brain disorders.

Methods Summary

C57BL/6J male mice at age P56 were injected with EGFP-expressing AAV using iontophoresis by the stereotaxic method. The brains were scanned using the STP tomography systems. Images were subject to data quality control and all the injection sites were manually annotated. All the image sets were co-registered into the 3D reference space. EGFP-positive signals were segmented from background, and binned at voxel levels for quantitative analyses. The raw connectivity data are served with various navigation tools on the web through the Allen Institute’s data portal. See the full Methods section for detailed descriptions.

Online Methods

All experimental procedures related to the use of mice were approved by the Institutional Animal Care and Use Committee of the Allen Institute for Brain Science, in accordance with NIH guidelines.

Outline of the Connectivity Atlas data generation and processing pipeline

A standardized data generation and processing platform was established (Fig. 1a). Viral tracers were validated and experimental conditions were established through pre-pipeline activities. C57BL/6J mice at age P56 were injected with viral tracers using iontophoresis by the stereotaxic method. Six STP tomography systems generated high-resolution images from up to six brains per day. Images were subjected to data quality control and all injection sites were annotated. A stack of qualified 2D images then underwent a series of informatics processes. The image data and informatics products, as well as all metadata, are stored, retrieved and maintained using our laboratory information management system (LIMS). The generated connectivity data are served with various navigation tools on the web through the Allen Institute’s data portal along with other supporting data sets.

Stereotaxic injections of AAV using iontophoresis

Recombinant adeno-associated virus (AAV) expressing EGFP was chosen as the anterograde tracer to map axonal projections because of several advantages over conventional neuroanatomical tracers19. First, AAV mediates robust fluorescent labelling of the soma and processes of infected neurons, which can be coupled with direct imaging methods for high-throughput production without additional histochemical staining steps. Second, compared to conventional tracers which often have mixed anterograde and retrograde transport, retrograde labelling with AAV (except for certain serotypes) is generally negligible. In this study, retrogradely infected cells were seen only rarely. Notable exceptions are found in specific circuits that might have certain types of strong presynaptic inputs, such as the entorhinal cortex projection to hippocampal subregions dentate gyrus and CA1, where retrogradely infected input cells can be brightly labelled. Finally, perhaps the greatest advantage of using AAV over conventional tracers is the flexible molecular strategies that can be used to introduce various transgenes and to label specific neuronal populations by combining cell-type-selective Cre driver mice with AAV vectors harbouring a Cre-dependent expression cassette.

To optimize the tracing approach for the large-scale atlas data generation, we tested various AAV constructs, serotypes and injection methods. We selected AAV vectors that express EGFP at the highest levels. We found that AAV serotype 1 produces the most robust and uniform neuronal tropism and that iontophoretic delivery of AAV gives rise to the most consistent and confined viral infection volume. Thus, the entire atlas data generation was standardized with the use of AAV serotype 1 and iontophoresis for stereotaxic injections20.

Stereotaxic coordinates were chosen for each target area based on The Mouse Brain in Stereotaxic Coordinates45. For the majority of target sites, the anterior/posterior (AP) coordinates are referenced from Bregma, the medial/lateral (ML) coordinates are distance from midline at Bregma, and the dorsal/ventral (DV) coordinates are measured from the pial surface of the brain. For several of the most caudal medullary nuclei (for example, gracile nucleus and spinal nucleus of the trigeminal, caudal part), the calamus (at the floor of the fourth ventricle) is used as a registration point instead of Bregma. For many cortical areas, injections were made at two depths to label neurons throughout all six cortical layers and/or at an angle to infect neurons along the same cortical column. For laterally located cortical areas (for example, orbital area, medial part; prelimbic area; agranular insular area), the injections were made at two adjacent ML coordinates for the same reason, since the pipette angle required for injection along the cortical column is nearly 90°, beyond our technical limit. The stereotaxic coordinates used for generating data are listed under the Documentation tab in the data portal.

Adult male C57BL/6J mice (stock no. 000664, The Jackson Laboratory, Bar Harbor, ME) were used for AAV tracer (AAV2/1.pSynI.EGFP.WPRE.bGH, Penn Vector Core, Philadelphia, PA) injections at postnatal day 56 ± 2 (P56 ± 2). Mice were anaesthetized with 5% isoflurane and placed into a stereotaxic frame (model no. 1900, David Kopf Instruments, Tujunga, CA). For all injections using Bregma as a registration point, an incision was made to expose the skull and Bregma and Lambda landmarks were visualized using a stereomicroscope. A hole overlying the targeted area was made by first thinning the skull using a fine drill burr until only a thin layer of bone remained. A microprobe and fine forceps were used to peel away this final layer of bone to reveal the brain surface. For targeting caudal nuclei in the medulla, ketamine-anaesthetized mice were placed in the stereotaxic frame with the nose pointed downward at a 45–60 degree angle. An incision was made in the skin at the base of the skull and muscles were bluntly dissected to reveal the posterior atlanto-occipital membrane overlying the surface of the medulla. A needle was used to puncture the membrane and the calamus was visualized.

All mice received one unilateral injection into a single target region in the right hemisphere. Glass pipettes had inner tip diameters of 10–20 µm. The majority of injections were done using iontophoresis with 3 µA at 7 s ‘on’ and 7 s ‘off’ cycles for 5 min total. These settings resulted in infection areas of approximately 400–1,000 µm in diameter, depending on target region. Reducing the current strength to 1 µA decreased the area of infected neurons, and was used when 3 µA currents produced infection areas larger than 700 µm. Mice quickly recovered after surgery and survived for 21 days before euthanasia. Injection sites ranged from 0.002 to 1.359 mm³ in volume, with an average size of 0.24 mm³ across all 469 data sets.
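The pulsed current settings above imply a fixed duty cycle, which the following minimal sketch works through. The function name `iontophoresis_summary` is hypothetical, and the delivered charge is a derived quantity not reported in the text. The sketch assumes the protocol begins with an 'on' phase and stops exactly at the 5 min mark.

```python
# Duty-cycle arithmetic for the iontophoresis settings quoted in the text.
CURRENT_UA = 3.0     # injection current in microamperes (from the text)
ON_S = 7.0           # seconds of current per cycle (from the text)
OFF_S = 7.0          # seconds of rest per cycle (from the text)
TOTAL_S = 5 * 60.0   # total protocol duration in seconds (from the text)


def iontophoresis_summary(current_ua, on_s, off_s, total_s):
    """Return (full_cycles, on_time_s, charge_uC) for a pulsed protocol.

    Assumption: the protocol starts with an 'on' phase and is cut off at
    total_s, so a trailing partial cycle contributes only the 'on' time
    that has elapsed by then.
    """
    period = on_s + off_s
    full_cycles = int(total_s // period)
    remainder = total_s - full_cycles * period
    on_time = full_cycles * on_s + min(remainder, on_s)
    charge_uc = current_ua * on_time  # microamperes x seconds = microcoulombs
    return full_cycles, on_time, charge_uc


print(iontophoresis_summary(CURRENT_UA, ON_S, OFF_S, TOTAL_S))
```

Under these assumptions the protocol completes 21 full cycles, with current flowing for 153 s of the 300 s, which corresponds to 459 µC at 3 µA and one third of that at 1 µA.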

Serial two-photon tomography

Mice were perfused with 4% paraformaldehyde (PFA). Brains were dissected and post-fixed in 4% PFA at room temperature for 3–6 h and then overnight at 4 °C. Brains were then rinsed briefly with PBS and stored in PBS with 0.1% sodium azide before proceeding to the next step. Agarose was used to embed the brain in a semisolid matrix for serial imaging. After removing residual moisture on the surface with a Kimwipe, the brain was placed in a 4.5% oxidized agarose solution made by stirring 10 mM NaIO₄ in agarose, transferred through phosphate buffer and embedded in a grid-lined embedding mould to standardize its placement in an aligned coordinate space. The agarose block was then left at room temperature for 20 min to allow solidification. Covalent interactions between brain tissue and agarose were promoted by placing the solidified block in 0.5% sodium borohydride in 0.5 M sodium borate buffer (pH 9.0) overnight at 4 °C. The agarose block was then mounted on a 1 × 3 glass slide using Loctite 404 glue and prepared immediately for serial imaging.

Image acquisition was accomplished through serial two-photon (STP) tomography22 using six TissueCyte 1000 systems (TissueVision, Cambridge, MA) coupled with Mai Tai HP DeepSee lasers (Spectra Physics, Santa Clara, CA). The mounted specimen was fixed through a magnet to the metal plate in the centre of the cutting bath filled with degassed, room-temperature PBS with 0.1% sodium azide. A new blade was used for each brain on the vibratome and aligned to be parallel to the leading edge of the specimen block. Brains were imaged from the caudal end. We optimized the imaging conditions for both high-throughput data acquisition and detection of single axon fibres throughout the brain with high resolution and maximal sensitivity. The specimen was illuminated with 925 nm wavelength light through a Zeiss ×20 water immersion objective (NA = 1.0), with 250 mW light power at the objective. The two-photon images for red, green and blue channels were taken at 75 µm below the cutting surface. This depth was found to be optimal, as it is deep enough to avoid any major groove on the cutting surface caused by vibratome sectioning but shallow enough to retain sufficient photons for high-contrast images. In order to scan a full tissue section, individual tile images were acquired, and the entire stage was moved between each tile. After an entire section was imaged, the x and y stages moved the specimen to the vibratome, which cut a 100-µm section, and returned the specimen to the objective for imaging of the next plane. The blade vibrated at 60 Hz and the stage moved towards the blade at 0.5 mm per s during cutting. Images from 140 sections were collected to cover the full extent of the mouse brain. It takes about 18.5 h to image a brain at an x,y resolution of 0.35 µm per pixel, amounting to 750 GB worth of images per brain. Upon completion of imaging, sections were retrieved from the cutting bath and stored in PBS with 0.1% sodium azide at 4 °C.
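The acquisition figures quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name `imaging_budget` is hypothetical, and the brains-per-day figure is an upper bound that ignores mounting, setup and maintenance time.

```python
# Back-of-envelope throughput check using only numbers stated in the text.
SECTIONS = 140          # sections imaged per brain
SECTION_UM = 100        # section thickness in micrometres
HOURS_PER_BRAIN = 18.5  # imaging time per brain
GB_PER_BRAIN = 750      # image data per brain
SYSTEMS = 6             # number of STP tomography systems


def imaging_budget(sections, section_um, hours, gigabytes, systems):
    """Derive per-section and per-day quantities from the per-brain totals."""
    return {
        "coverage_mm": sections * section_um / 1000,
        "minutes_per_section": hours * 60 / sections,
        "gb_per_section": gigabytes / sections,
        # Upper bound: assumes every system images continuously, no idle time.
        "max_brains_per_day": systems * 24 / hours,
    }


budget = imaging_budget(SECTIONS, SECTION_UM, HOURS_PER_BRAIN,
                        GB_PER_BRAIN, SYSTEMS)
for name, value in budget.items():
    print(f"{name}: {value:.2f}")
```

These values give 14 mm of tissue covered along the sectioning axis, roughly 7.9 min and 5.4 GB per section, and a ceiling of about 7.8 brains per day across six systems, consistent with the stated throughput of up to six brains per day.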

Image data processing

The informatics data pipeline (IDP)46 manages the processing and organization of the image and quantified data for analysis and display in the web application. The two key algorithms are signal detection and image registration.

The signal detection algorithm was applied to each image to segment positive fluorescent signals from background. Image intensity was first rescaled by a square-root transform to remove second-order effects, followed by histogram matching at the midpoint to a template profile. Median filtering and a large-kernel low-pass filter were then applied to remove noise. Signal detection on the processed image was based on a combination of adaptive edge/line detection and morphological processing. High-threshold edge information was combined with spatial distance-conditioned low-threshold edge results to form candidate signal object sets. The candidate objects were then filtered based on their morphological attributes, such as length and area, using connected component labelling. In addition, high-intensity pixels near the detected objects were included in the signal pixel set. In a post-segmentation step, detected objects near hyper-intense artefacts occurring in multiple channels were removed. It should be noted that passing fibres and terminals are not distinguished. The output is a full-resolution mask that classifies each 0.35 µm × 0.35 µm pixel as either signal or background. An isotropic 3D summary of each brain is constructed by dividing each image into 100 µm × 100 µm grid voxels. Total signal is computed for each voxel by summing the number of signal-positive pixels in that voxel.
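The following toy sketch illustrates the general shape of the steps described above; it is not the pipeline's actual algorithm. As a stand-in for the combination of high-threshold and distance-conditioned low-threshold edge results, it uses plain two-threshold region growing on pixel intensities, followed by a size filter on connected components and a grid summation. The function names `segment` and `grid_counts`, the thresholds and the 4-connectivity are all assumptions made for illustration. At full resolution, a 100 µm grid cell would span roughly 286 pixels of 0.35 µm on each side.

```python
import math
from collections import deque


def segment(image, high, low, min_size):
    """Toy two-threshold segmentation of a 2D intensity grid (list of rows).

    1. Rescale intensities with a square-root transform.
    2. Grow objects from pixels >= high through 4-connected pixels >= low.
    3. Discard objects with fewer than min_size pixels.
    Thresholds apply to the rescaled values. Returns a 0/1 mask.
    """
    rows, cols = len(image), len(image[0])
    scaled = [[math.sqrt(v) for v in row] for row in image]
    mask = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or scaled[r][c] < high:
                continue
            # Breadth-first growth of one candidate object from a strong seed.
            seen[r][c] = True
            queue, component = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and scaled[ny][nx] >= low):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(component) >= min_size:
                for y, x in component:
                    mask[y][x] = 1
    return mask


def grid_counts(mask, block):
    """Count signal pixels in non-overlapping block x block grid cells."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * math.ceil(cols / block) for _ in range(math.ceil(rows / block))]
    for r in range(rows):
        for c in range(cols):
            out[r // block][c // block] += mask[r][c]
    return out


image = [
    [100, 36, 0, 0],
    [25, 0, 0, 0],
    [0, 0, 0, 81],
    [0, 0, 0, 0],
]
mask = segment(image, high=8, low=4, min_size=2)
print(grid_counts(mask, block=2))  # [[3, 0], [0, 0]]
```

In this example the three-pixel object in the upper-left corner survives, while the isolated bright pixel is removed by the size filter, so only the first grid cell carries signal.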

The highly aligned nature from section to section throughout a single brain allowed us to simply stack the section images together to form a coherent reconstructed 3D volume. Each image stack was then registered to the 3D Allen Reference Atlas model. To avoid possible bias introduced by using a single specimen as template and to increase the convergence rate of the registration algorithm, a registration template was created by iteratively averaging 1,231 registered and resliced brain specimens. A global affine (linear) registration to the template was first performed using a combination of image moments and maximizing normalized mutual information between the red channel of the image stack and the template using a multi-resolution gradient descent optimization. A B-spline based deformable registration was then applied using a coarse-to-fine strategy through four resolution levels with decreasing smoothness constraints. In each generation, a new template was created using the previous generation results to further improve registration convergence. The template was then deformably registered to the 3D reference model by maximizing the mutual information of large structure annotation and the template intensity.
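The similarity measure named above, normalized mutual information, can be sketched from a joint intensity histogram. This is a minimal illustration, not the registration code itself: it uses one common definition, (H(A) + H(B)) / H(A, B), and the function names, bin count and intensity range are assumptions. The optimizer that adjusts the transform to maximize this score is not shown.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in nats) of a histogram given as raw counts."""
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def normalized_mutual_information(a, b, bins=4, vmax=256):
    """Return (H(A) + H(B)) / H(A, B) for two equal-length intensity lists.

    Intensities in [0, vmax) are quantized into `bins` equal-width bins.
    The score is 1.0 for statistically independent images and 2.0 when
    each image fully determines the other.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    width = vmax / bins
    qa = [min(int(v / width), bins - 1) for v in a]
    qb = [min(int(v / width), bins - 1) for v in b]
    total = len(a)
    h_a = entropy(Counter(qa).values(), total)
    h_b = entropy(Counter(qb).values(), total)
    h_ab = entropy(Counter(zip(qa, qb)).values(), total)
    if h_ab == 0:
        raise ValueError("score is undefined when both inputs are constant")
    return (h_a + h_b) / h_ab


template = [0, 64, 128, 192]
print(normalized_mutual_information(template, template))  # 2.0
```

A registration loop would evaluate this score between the resampled red channel and the template after each candidate transform and keep the transform with the higher score.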

Segmentation and registration results are combined to quantify signal for each 100 µm × 100 µm × 100 µm voxel in the reference space and for each structure in the ontology by combining voxels from the same structure in the 3D reference model. To generate the raw connectivity matrix (Fig. 3), the projection signal was quantified by summing the number of segmented pixels in every voxel, and scaling this value to a mm³ volume. The voxel values within each of the 469 injection sites (source) or each of the 295 target sites in either hemisphere were binned and summed based on which structure the voxel belongs to. The target structure values presented across the columns are normalized by each experiment’s injection volume to allow comparison between injections. Fluorescent signals within each injection area were excluded from projection signal calculation.
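One row of such a matrix could be computed along the following lines. This is a hedged sketch rather than the pipeline's code: the function name, the dictionary-based data layout and the structure labels are hypothetical. It also assumes that each segmented pixel stands for a 0.35 µm × 0.35 µm × 100 µm column of tissue when converting counts to mm³, and that the injection volume is the segmented signal volume inside the injection site.

```python
from collections import defaultdict

PIXEL_UM = 0.35      # in-plane pixel size (from the text)
SECTION_UM = 100.0   # section thickness (from the text)
# Assumed conversion: one pixel = 0.35 x 0.35 x 100 um^3; 1 um^3 = 1e-9 mm^3.
MM3_PER_PIXEL = PIXEL_UM * PIXEL_UM * SECTION_UM * 1e-9


def normalized_projection(voxel_counts, voxel_structure, injection_voxels):
    """Return {structure: projection volume / injection volume}.

    voxel_counts:     {voxel_id: number of signal-positive pixels}
    voxel_structure:  {voxel_id: structure label from the reference model}
    injection_voxels: set of voxel ids lying inside the injection site
    """
    injection_mm3 = sum(voxel_counts[v] for v in injection_voxels) * MM3_PER_PIXEL
    if injection_mm3 <= 0:
        raise ValueError("injection site contains no signal")
    totals = defaultdict(float)
    for voxel, count in voxel_counts.items():
        if voxel in injection_voxels:
            continue  # signal inside the injection area is excluded
        totals[voxel_structure[voxel]] += count * MM3_PER_PIXEL
    return {name: volume / injection_mm3 for name, volume in totals.items()}


# Hypothetical data: one injection voxel and three target voxels.
counts = {0: 4000, 1: 1000, 2: 3000, 3: 2000}
labels = {0: "source", 1: "target_a", 2: "target_a", 3: "target_b"}
row = normalized_projection(counts, labels, injection_voxels={0})
print({name: round(value, 3) for name, value in row.items()})
```

Because both numerator and denominator use the same pixel-to-volume factor, the normalized values are dimensionless ratios, which is what makes rows from injections of different sizes comparable.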

The informatics data processing supports key features in the web application, including an interactive projection summary graph for each specimen, an image synchronization feature to browse images from multiple injections, reference atlases and other data sets in a coordinated way, and on-the-fly search services to search for a specimen with specific projection profiles. Further details of the pipeline processing and web application features are described under the Documentation tab in the data portal.

Quality control, injection site annotation and polygon drawing

A rigorous manual quality control protocol was established which includes identification of the injection structure(s) according to the Allen Reference Atlas ontology, delineation of the injection site location and decisions on failing an experiment due to production issues affecting specimen and image quality. Severe artefacts such as missing tissue or sections, poor orientation, edge cutoff, tessellation and low signal strength lead to elimination of the entire image series. In some cases, the quality control process extended to identification and masking of areas of high intensity/high frequency artefacts and areas of signal dropout. This information is used in downstream search and analysis to reduce false positive and false negative returns.

For each passed image series, the anatomical location(s) of the injection site were annotated based on the Allen Reference Atlas23 and The Mouse Brain in Stereotaxic Coordinates45. If an injection has hit multiple structures, the structure containing the majority of the tracer is named as the primary injection structure, and any other structures containing tracer-infected neurons are considered secondary injection structures. For each passed injection, polygons were manually drawn over the cell bodies of infected neurons to define an electronic region of interest, both to help the end user locate the injection site and for further informatics processing. After data registration into the 3D reference space, injection sites were also annotated computationally. In most cases, results obtained from manual and informatics annotations are the same. The manually derived primary and secondary injection structures are provided as search entries for the Atlas, while the computationally derived sites are available on the projection summary page for each experiment.
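The primary/secondary rule described above can be expressed compactly once per-structure tracer volumes are available, as in the computational annotation. The sketch below is illustrative: the function name and structure labels are hypothetical, and it interprets "majority of the tracer" as the structure with the largest share.

```python
def classify_injection(structure_volumes):
    """Split injected structures into primary and secondary.

    structure_volumes: {structure label: tracer volume inside that structure}
    Returns (primary, secondaries), with secondaries sorted by decreasing
    volume. Structures with zero volume are ignored. Assumption: the
    primary structure is the one holding the largest share of the tracer.
    """
    hit = {name: vol for name, vol in structure_volumes.items() if vol > 0}
    if not hit:
        raise ValueError("no structure contains tracer")
    primary = max(hit, key=hit.get)
    secondaries = sorted((name for name in hit if name != primary),
                         key=hit.get, reverse=True)
    return primary, secondaries


# Hypothetical per-structure injection volumes in mm^3.
volumes = {"area_1": 0.18, "area_2": 0.05, "area_3": 0.0, "area_4": 0.01}
print(classify_injection(volumes))  # ('area_1', ['area_2', 'area_4'])
```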