A Predictive Structural Model of the Primate Connectome

Anatomical connectivity imposes strong constraints on brain function, but there is no general agreement about principles that govern its organization. Based on extensive quantitative data, we tested the power of three factors to predict connections of the primate cerebral cortex: architectonic similarity (structural model), spatial proximity (distance model) and thickness similarity (thickness model). Architectonic similarity showed the strongest and most consistent influence on connection features. This parameter was strongly associated with the presence or absence of inter-areal connections and when integrated with spatial distance, the factor allowed predicting the existence of projections with very high accuracy. Moreover, architectonic similarity was strongly related to the laminar pattern of projection origins, and the absolute number of cortical connections of an area. By contrast, cortical thickness similarity and distance were not systematically related to connection features. These findings suggest that cortical architecture provides a general organizing principle for connections in the primate brain, providing further support for the well-corroborated structural model.

Scientific RepoRts | 7:43176 | DOI: 10.1038/srep43176 as an indicator of overall cortical composition [29][30][31] . Cortical thickness covariations have been treated as a surrogate of anatomical connectivity (but see ref. 32). The inferred structural networks based on cortical thickness have been explored with respect to their topological properties, association with functional connectivity, and relationship to behavioral traits (e.g. refs 33-36; for a review see ref. 37). Given this strong interest in the possible significance of cortical thickness, we assessed this parameter as an anatomical covariate of structural connectivity ('thickness model').
We compared the predictive power of the three factors for connection data from a comprehensive connectivity data set (connectome). This data set provides extensive quantitative information on the existence and laminar origins of projections linking cortical areas in the macaque brain 38,39 . We investigated whether this connectome can be understood in terms of the underlying brain anatomy.

Results
We examined the association between the primate cortical connectome and these anatomical features of the primate cerebral cortex: neuron density (a quantitative measure of cortical architecture, Fig. 1); spatial proximity; and cortical thickness. We tested how well each of the three anatomical parameters was related to the existence and the laminar origins of projections between cortical areas, and could predict the presence or absence of projections. We found that the existence of projections is most closely related to the neuron density of cortical areas. We also showed that neuron density is the anatomical factor that best accounts for laminar projection patterns and is linked to topological properties of brain regions.
Relations among anatomical variables. To quantify relative structural similarity across the cortex, for all pairs of connected areas we computed the difference in neuron density or cortical thickness as measured on a log scale. That is, structural (dis-)similarities were expressed as log-ratios. Spatial proximity was quantified by Euclidean distance between areas. The anatomical variables associated with the corticocortical projections were not completely independent. We found a moderate correlation between the undirected neuron density ratio and the Euclidean distance of area pairs (r = 0.47, p < 0.001), whereas the correlation of Euclidean distance with the undirected thickness ratio was significant but of negligible size (r = 0.12, p < 0.001). In contrast, neuron density ratio and thickness ratio were strongly negatively correlated (r = − 0.76, p < 0.001), an association which results from a strong inverse correlation between the neuron density and thickness of brain areas (r = − 0.69, p < 0.001).
Then, to assess the distribution of absent and present projections across the three structural variables in more detail, we plotted the relative frequency of present projections across neuron density ratio and Euclidean distance in comparison to the absolute numbers of absent and present projections (Fig. 2). For all variables, present projections became relatively less frequent with increasing distance or structural dissimilarity of two potentially  Figure 1. Neuron densities in the macaque cortex depicted on the M132 parcellation 38 . Gray areas: no density data available. Abbreviations as in ref. 38. connected areas, as also shown by a rank correlation coefficient, ρ , of the relative frequencies (|log-ratio density |: ρ = − 1.00, p < 0.001; distance: ρ = − 0.98, p < 0.001; |log-ratio thickness |: ρ = − 0.93, p < 0.01).
Finally, to exploit the association of the structural variables with the existence of cortical connections, we used the parameters to classify projections as either absent or present. We predicted projection presence or absence based on all 7 possible combinations of the three parameters (each individual parameter, 3 pairwise combinations of the parameters, and a combination of all three parameters). See Methods for a detailed description of the classification and validation procedure.
The best classification among the 6 combinations of one or two parameters was obtained from the combination of the log-ratio of neuron density (i.e., density similarity) with Euclidean distance. This pairing was superior to all other combinations; its accuracy, precision and negative predictive value were not exceeded at comparable thresholds, and overall performance as quantified by the mean Youden-index J was worse for all other combinations (mean ± standard deviation: J(|log-ratio density | & distance) = 0.75 ± 0.04; J(distance & |log-ratio thickness |) = 0.51 ± 0.13; J(|log-ratio density | & |log-ratio thickness |) = 0.11 ± 0.03; J(|log-ratio density |) = 0.0 ± 0.0; J(distance) = 0.07 ± 0.03; J(|log-ratio thickness |): no predictions at thresholds above p(present) = 0.775; see Supplementary Figure S1 for the underlying distribution of true positive rate and false positive rate and Supplementary Figure S2 for a detailed depiction of the Youden-index J across all thresholds). Including all three anatomical variables as predictive variables did not improve classification accuracy or overall performance as assessed by the mean Youden-index (J(|log-ratio density | & distance & |log-ratio thickness |) = 0.76 ± 0.04 (Fig. 3C). A Kruskal-Wallis-test showed that the distributions of the Youden-index J were significantly different between the combinations of the parameters (H = 549.2, p < 0.001). Post hoc tests (Bonferroni-corrected for multiple comparisons) revealed that the distributions of the combination of the log-ratio of neuron density and Euclidean distance ('density, distance') and the combination of the log-ratio of neuron density, Euclidean distance and the log-ratio of thickness ('density, distance, thickness') were not significantly different from each other (p > 0.05), while the combination of the log-ratio of neuron density and Euclidean distance had a higher mean J than all other combinations (p < 0.01 for all pair-wise tests).
According to these results, we adopted the combination of the absolute log-ratio of neuron density and Euclidean distance as predictive variables for our probabilistic model. Figure 3A depicts the posterior probability for a projection to be present across the predictive variable space for this feature combination. Cross-validated classification performance across the evaluated thresholds is shown in the remainder of Fig. 3. As shown in Fig. 3B, classification accuracy quickly exceeded 80%, with a sizable fraction of the test set being classified. At higher thresholds, accuracy notably surpassed 90%, although this was accompanied by a decrease in the fraction of classified observations. As shown in Supplementary Figure S1, higher thresholds were associated with a consistent decrease in the rate of false positive predictions at an overall high rate of true positive predictions, resulting in a favorable Youden-index J (Fig. 3C).
Classification performance at all thresholds reliably exceeded chance performance as assessed by a permutation analysis. The permutation analysis revealed a classification performance from nonsensical labels that showed a relatively uniform accuracy of about 65% across tested thresholds. True positive rate and false positive rate equaled 1 across all thresholds, resulting in a Youden-index J = 0.0 ± 0.0 for all thresholds.
Using the posterior probabilities obtained by the trained classifier (Fig. 3A), we were able to make predictions about the status of projections between area pairs that were considered as unknown in the current data set 38 . We classified unknown projections at the threshold p(threshold) = 0.85, as indicated by the black lines in Fig. 3A. Projections predicted to be absent or present are listed in Supplementary Table S1.
Laminar patterns of projection neurons. We observed a strong correlation between the fraction of labeled neurons originating in supragranular layers (N SG %) and log-ratio density (r = 0.59, p < 0.001, Fig. 4A), as well as a moderate correlation between N SG % and log-ratio thickness (r = − 0.42, p < 0.001, Fig. 4B). Given the strong correlation between the neuron density ratio and cortical thickness ratio, we computed a partial correlation of N SG %, log-ratio density , and log-ratio thickness to assess the relative contribution of each variable. The partial correlation revealed that the correlation between thickness ratio and laminar patterns was mainly driven by the neuron density ratio, since the correlation did not reach significance when controlled for neuron density similarity (r = 0.06, p > 0.05). In contrast, the correlation between the neuron density ratio and laminar patterns was still significant when controlled for the cortical thickness ratio (r = 0.43, p < 0.001). Additionally, both N SG % (r = 0.09, p > 0.05, Fig. 4C) and |N SG %| (r = 0.003, p > 0.05, Fig. 4D) were independent of distance. Thus, the only anatomical factor that was systematically associated with laminar projection patterns was the architectonic similarity of linked areas.
Relation of cytoarchitecture with connection topology. We found that nodal network properties of cortical areas were related to the areas' cytoarchitecture. Specifically, areas belonging to the structural network core had lower neuron density than non-core areas (t (22) = 2.9, p < 0.01, r = 0.52, Fig. 5A). Note that the difference in density between non-core and core areas remained significant if the outlier in the non-core areas was removed. Given that a major defining feature of core areas is their high degree (i.e., the large total number of connections), we tested whether this observation was indicative of a general relationship between cytoarchitectonic differentiation and the connectivity of areas. This analysis revealed that neuron density was strongly correlated with areal degree of connectivity (r = − 0.60, p < 0.01, Fig. 5B). Note that this correlation remained significant if a rank-correlation was computed instead, removing differences in magnitude (ρ = − 0.47, p = 0.019). The correlation reached the significance threshold if the data point in the lower right of Fig. 5B was removed (r = − 0.41, p = 0.0509). Additionally, we tested whether the same relationships could be observed for cortical thickness. Here the results were inconsistent. While cortical thickness did not differ between core and non-core areas (t (27) = − 2.0, p > 0.05, r = 0.35), thickness was moderately correlated with the degree of connectivity of areas (r = 0.38, p < 0.05).
Furthermore, we compared the neuron density and cortical thickness of five structural network modules that are related to spatial and functional sub-divisions of the cortex (specifically, comprising frontal, temporal, somato-motor, parieto-motor and occipito-temporal regions). These modules or clusters are characterized by denser structural connectivity within than between the modules 40 . Module assignments were taken from Goulas and colleagues 41 , who delineated the modules for a sub-network of 29 × 29 cortical areas 38 using a spectral decomposition algorithm. We found that the network modules differed in their neuron density (H = 13.7, p < .01), but not in their cortical thickness (H = 7.2, p > 0.05). Post hoc tests, Bonferroni-corrected for multiple comparisons, revealed that the frontal module had a lower neuron density than the occipito-temporal module (t (13) = −3.8, p = 0.002, r = 0.73, α corr = 0.005); all other pairwise differences in neuron density between the modules were not significant after correcting for multiple comparisons.
The architectonic basis of the primate connectome. Architectonic differentiation defined by neuron density of areas was the structural factor that related most consistently and strongly to the investigated features of the primate connectome. Figure 6 summarizes this finding and displays all present projections that were included in the analyses. Areas are arranged according to their neuron density, and projections are color-coded according to the neuron density ratio, expressing the architectonic similarity of the connected areas (from green for the smallest ratios via blue to purple for the highest density ratios). Note the dominance of projections linking architectonically similar areas (green links). Also note that core areas (indicated in red), are clustered at the lower end of the neuron density scale, as are areas with a relatively large number of connections, marked by their larger node size.

Discussion
We assessed the extent to which distinct anatomical features can be used to predict the connectivity in the cerebral cortex of a non-human primate, using the most extensive quantitative data set of connections available for the macaque monkey 38,39 . Specifically, we considered the cytoarchitectonic differentiation of cortical areas, quantified by neuron density, the spatial proximity of areas, quantified by Euclidean distance, and cortical thickness extracted from structural MRI data. We found that the existence of projections between areas depends strongly on their architectonic similarity ( Fig. 2A). We capitalized on this association to predict the existence of projections based on the structural relationships of potentially connected areas. Integrating cytoarchitectonic similarity and spatial proximity in a predictive model made it possible to determine whether two areas would be connected with more than 90% accuracy (Fig. 3B). The model showed that a connection was most likely to exist between areas that are similar in their cytoarchitectonic differentiation and spatially close (Fig. 3A). Our classification procedure consistently performed above chance level, as assessed by a permutation analysis. We used this classification procedure to make predictions about the status of unsampled projections (Supplementary Table S1), which provides an opportunity to compare our model's performance with future experimental results, allowing further model validation. Classification from alternative feature combinations revealed that, when the three parameters were used as single predictors, cytoarchitectonic similarity yielded the highest maximum Youden-index J compared to Euclidean distance or thickness similarity on their own (Supplementary Figure S2B). This suggests that the performance of the predictive model hinged predominantly on cytoarchitectonic similarity and to a lesser extent on spatial proximity. While thickness similarity also correlated with the relative frequency of present projections, including this feature into our predictive model did not improve classification performance. Furthermore, even though the relative thickness of brain areas correlated strongly with the areas' relative neuron density, substituting density similarity for thickness similarity led to a considerable decrease in our model's predictive power. Importantly, the predictive model also revealed that, although the likelihood of a connection decreased across large differences in cytoarchitecture or long distances, this effect was mitigated if areas were spatially very close or respectively very similar in their cytoarchitecture. Thus, although connections were relatively less likely to exist between spatially remote areas, they did occur preferentially when distance was compensated for by similar cytoarchitectonic differentiation. Axonal wiring costs are a major constraint on structural connectivity 42 but are not strictly minimized in neural networks 6,43,44 . Our results highlight cytoarchitectonic differentiation as a key factor for predicting the occurrence of costly connections between spatially remote areas.
Additionally, we found that the laminar patterns of projection origins across the whole macaque cortex were very well accounted for by cytoarchitectonic similarity (Fig. 4A), consistent with previous reports 8,9,19,20 . In contrast, there was no systematic relationship between laminar patterns of projection origins and distance or cortical thickness when the correlation with cytoarchitectonic differentiation was accounted for.
Moreover, we found that cytoarchitecture was closely associated with some of the essential topological properties of cortical areas. Specifically, areas belonging to the structural network core had a lower neuron density than areas in the periphery (Fig. 5A). This finding complements the observation that there are differences in several aspects of regional cellular morphology (e.g., dendritic tree size) between core and periphery areas 45 . One of the main defining features of core areas is their exceptionally large number of connections 46,47 . Therefore, we assessed whether there exists a direct relationship between cytoarchitecture as expressed by neuron density and area degree (i.e., the number of connections of an area), without interposing the classification into core and periphery areas. This analysis revealed a strong general relationship between area degree and cytoarchitecture across the entire cortex. Thus, areas of lower density possessed a larger number of connections (Fig. 5B), consistent with previous findings 13 . In contrast, cortical thickness showed an inconsistent and weaker relationship to membership in the structural network core and area degree.
An explanation for the strong relationship between cytoarchitecture and topological network features of cortical areas is likely to be found in ontogeny. The development of the regional architectonic structure may be Figure 6. Primate cortical connectome based on neuron density gradients. Grey circles correspond to neuron density, increasing from center to periphery; cortical areas are positioned accordingly (cf. Fig. 1). Present projections between cortical areas are displayed color-coded according to absolute neuron density ratios of the connected areas from green (small ratios) via blue to purple (large ratios). Node sizes indicate the areas' degree (i.e., number of connections). Structural core areas, as classified by Ercsey-Ravasz and colleagues 25 , are filled in red. Abbreviations as in ref. 38. associated with the establishment of the connections of an area. One possible mechanism might draw on the relative timing of the emergence of areas, where areas that appear earlier might have the opportunity to connect more widely 14 . Indeed, a similar process has been suggested to explain the degree distribution of single neurons in Caenorhabditis elegans 48 . The systematic structural variation of the cortex, which is at the core of the structural model, has recently been shown in humans to originate in cortical development 49 . Barbas and García-Cabezas 49 also directly linked connectivity of the prefrontal cortex to its time of origin, thus providing strong support for the hypothesis that relative timing of area formation is a crucial determinant of cortical connectivity.
Additionally, we observed that network modules of areas differ in their cytoarchitecture. It has been suggested that network modules of the primate cortex result from a combination of spatial and topological properties 50 . Our findings suggest that cytoarchitecture may be another factor in the formation of structural modules, in line with our general conclusion that cortical architecture governs the formation of connections between brain areas. This hypothesis is supported by a previous report which demonstrated that topological features such as modular connectivity may arise from the growth of connectivity in developmental time windows 51 .
While thickness measures have the advantage of being accessible non-invasively using MRI in humans, their relation to other anatomical features and to structural connectivity remains unclear. Our findings suggest that, while cortical thickness may show similarities to neuron density in its variability across the cerebral cortex, it is an imperfect surrogate and does not capture the fundamental aspects of brain networks that can be delineated from cytoarchitectonic differentiation.
In conclusion, our findings suggest that several features of the primate cortical connectome can largely be accounted for by the underlying structural properties of the cerebral cortex. Specifically, the relative cytoarchitectonic differentiation of the cortex provides an essential scaffold for explaining the organization of structural brain networks. Does the structural model, explaining connections based on cytoarchitecture, apply across species? The present results in the macaque cortex very closely parallel previous findings for the cat 20 . In both species, cytoarchitectonic differentiation was closely associated with multiple aspects of the organization of cortical networks, and cytoarchitectonic similarity integrated with spatial proximity was highly predictive of the existence of projections between potentially connected brain areas. This close association of brain architecture with connectivity was observed for areas distributed across the entire cortical surface, and was not contingent on grouping the areas into functional or anatomical modules of any kind. Moreover, cytoarchitectonic similarity has been consistently shown to account for laminar patterns of projections across cortical regions in the macaque as well as cat 8,9,[18][19][20]52 . Furthermore, an inverse relationship between the cytoarchitectonic differentiation and the connection degree of areas was observed in both species. Thus, areas of weaker differentiation have more connections. Highly connected areas are often hubs or members of a functionally prominent rich-club, occupying a topologically special position within networks of anatomical connections (e.g. refs 47 and 66). Moreover, weakly differentiated areas likely differ from more strongly differentiated areas in their intrinsic circuitry and signal processing properties 53 . In combination, these findings indicate that the relative architectonic differentiation of cortical areas might shape the formation of corticocortical connections and thus impose constraints on structural as well as functional aspects of the connectome. There is, thus, excellent correspondence of findings across two mammalian species and across the entire cerebral cortex.
Furthermore, these finding were recently paralleled in the mouse cortex 54 . Analyses of comprehensive global cortico cortical connectivity thus closely mirror previous findings across a number of cortical systems and connection targets, including the contralateral hemisphere and the amygdala, in several species 8,9,14,18,19,52,[54][55][56][57][58] . This evidence suggests that the reported association between architectonic differentiation of cortical areas and features of the inter-areal brain network reflects general organizational principles underlying the formation and maintenance of connections in the mammalian cortex.

Conclusions
Cytoarchitecture, which encompasses characteristic differences of local cortical organization 17 , has previously been shown to account for laminar patterns of corticocortical connections 8,9,19,20 . Our results further underscore the significance of cytoarchitecture as a central factor that governs multiple aspects of the configuration of brain networks. This conclusion is based on three key observations about cortical connectivity: cortical cytoarchitecture is closely associated with the presence or absence of connections between cortices, the number of connections of a cortical area, as well as the laminar pattern of connections. By contrast, other factors, such as cortical thickness and distance, are not consistently related to connection features. The structural model was originally developed qualitatively, in the classic studies of Sanides and Pandya (e.g. ref. 59), and systematically extended into quantitative studies by Barbas and colleagues, particularly of prefrontal connectivity in the primate, but also of cat and mouse cortex 8,9,[18][19][20]52,[54][55][56][57]60 . Here, we explored this concept for a comprehensive connectivity and cytoarchitectonic data set of the macaque, adding further support for the structural model, which has been developed by experimental and theoretical neuroanatomists over several decades. The applicability of the structural model across different mammalian species and cortical systems suggests that it captures fundamental organizational principles underlying the global structural connectivity of the cerebral cortex. In humans, connections cannot be measured directly by tracing studies, but brain architecture can be studied post mortem. Therefore, these findings also have important implications for understanding the structural connectivity of the human brain.

Methods
We first introduce the analysed data of primate corticocortical connectivity and then present the structural parameters that were hypothesized to constrain connectivity. Subsequently, we describe measures and procedures used in the analyses. Connectivity data: Presence of projections. We used comprehensive data about corticocortical connectivity in the macaque brain obtained from systematic anatomical tracing experiments 38 . Briefly, the authors injected retrograde tracers into 29 cortical areas (parcellated according to their M132 atlas) and quantified labeled neurons found in all 91 areas of the M132 atlas that project to these injected sites. Within each area, labeled neurons ranged from a minimum of 1 neuron to a maximum of 262,279 neurons. Each of these is called a 'projection' to refer to a pathway from an area with labeled neurons to the injection site. The resulting data set contains information about the existence (i.e., either presence or absence) of 2610 projections within a 91 × 29 subgraph of the complete (91 × 91) connectivity matrix of the M132 atlas. For projections found to be present, projection strength is given as the fraction of labeled neurons, normalizing the number of projection neurons between two areas to the total number of labeled neurons for the respective injection, as done previously (e.g. refs 9 and 52).
Crucially, the data set includes a 29 × 29 subgraph of injected areas, which contains information about all possible connections among the injected areas. This edge-complete subgraph makes it possible to perform analyses without uncertainty related to possible connections that were not sampled. Due to the wide distribution of the injected areas across the cortex, the 29 × 29 subgraph is expected to have similar properties as the complete network which incorporates all 91 areas 25 . Ercsey-Ravasz and colleagues 25 used the edge-complete subgraph to identify areas belonging to a 'network core' with a high density of connections among areas. This network core is similar to the concept of a rich-club, as discussed in recent studies [61][62][63][64][65][66][67][68][69] . Ercsey-Ravasz and colleagues 25 identified 17 core areas in the 29 × 29 subgraph, assigning the remaining 12 areas to the network periphery. We computed the degree of each area in the subgraph as the sum of the number of afferent and efferent projections of the area.
Connectivity data: Laminar origin of projection neurons. In addition, we analyzed the laminar patterns of projection origins in 11 areas 39 , which Markov and colleagues extracted from the set of 29 injections described above 38 . Here, the fraction of labeled neurons originating in supragranular layers (N SG %) was provided for 625 projections originating in 11 of the 29 injected areas and targeting all 91 areas of the M132 atlas. Specifically, N SG % was computed as the number of supragranular labeled neurons divided by the sum of supragranular and infragranular labeled neurons 39 . To relate N SG % to the undirected measure of Euclidean distance, we also transformed it to an undirected measure of inequality in laminar patterns, |N SG %|, where |N SG %| = |N SG %-50|*2. Values of N SG % around 0 and 100% thus translated to larger values of |N SG %|, indicating a more pronounced inequality in the distribution of origins of projection neurons between infra-and supragranular layers. We based our analyses regarding N SG % on the subset of 429 projections comprising more than 20 neurons (neuron numbers for each projection are provided in ref. 38). Thus, we excluded very weak projections for which assessment of the distribution of projection neurons in cortical layers was not considered reliable (cf. ref. 18). Results did not change qualitatively if a less conservative threshold of 10 neurons was applied.
Structural model: Neuron density. The spectrum of architectonic differentiation ranges from areas of low overall neuron density, with few layers and lacking an inner granular layer (agranular), to dense areas with six distinct layers. The striate cortex, for example, has a much higher overall neuron density not only within the cortical visual system, but also among all other cerebral cortices 11,57,[70][71][72][73][74][75] . Intermediate to these two extremes are areas of lower neuron densities with a sparse inner granular layer (dysgranular), and areas with six layers but without the exceptional clarity of layers and sublayers or remarkable neuron density of striate cortex. We used an unbiased quantitative stereologic approach to study the cytoarchitecture of each area expressed by neuron density. We estimated neuron density from coronal sections of macaque cortex that were stained to mark neurons using either Nissl stain or immunohistochemical staining for neuronal nuclei-specific antibody (NeuN), which labels neurons but not glia, using a microscope-computer interface (StereoInvestigator, MicroBrightField Inc., Williston, VT). We verified that there is a close correspondence between measures derived from both staining methods in a sample of areas for which both measures were available (r = 0.99, p = 0.001), and accordingly transformed density measures from different staining methods to a common reference frame. The neuron density measurements used here have partly been published previously 14 . In total, neuron density measures were available for 48 areas (Fig. 1). Within the 29 × 29 subgraph, neuron densities were available for 14 of the 17 core areas and 10 of the 12 non-core areas. We quantified relative cytoarchitectonic differentiation across the cortex by neuron density. We computed the log-ratio of neuron density values for each pair of connected areas (which is equivalent to the difference of the logarithms of the area densities), where log-ratio density = ln (density source area /density target area ). The use of a logarithmic scale was indicated, since the most extreme value of the neuron density measures was more than three standard deviations above the mean of the considered neuron densities 76 . For analyses which required considering an undirected equivalent of the actual neuron density ratio, we used the absolute value of the log-ratio, |log-ratio density |. From the available neuron density measures we were able to determine the relative cytoarchitectonic profile for 1128 of the sampled projections, including 172 projections with an associated N SG %.
Distance model: Spatial proximity. We operationalized the spatial proximity of all 91 cortical areas by the Euclidean distance between their mass centers, obtained from the Scalable Brain Atlas (http://scalablebrainatlas.incf.org). This widely used interval measure of projection length represents a pragmatic estimate of the spatial proximity of pre-and postsynaptic neurons located in different brain areas (e.g. refs [77][78][79][80][81][82][83]. Information about the spatial proximity of areas was included for all 2610 sampled projections, also encompassing all 429 projections we analyzed with respect to |N SG %|. Detailed protocols of the procedures were approved by the Institutional Animal Care and Use Committee at Harvard Medical School and Boston University School of Medicine in accordance with NIH guidelines (DHEW Publication no. [NIH] 80-22, revised 1996, Office of Science and Health Reports, DRR/NIH, Bethesda, MD, USA). During MR data acquisition, the animal was anesthetized with propofol (loading dose, 2.5-5 mg/kg, i.v.; continuous rate infusion, 0.25-0.4 mg/kg min). MR data were acquired on a 3 Tesla Philips Achieva MRI scanner using a three-dimensional magnetization prepared rapid acquisition gradient-echo (3DMPRAGE) sequence with 0.6 mm isotropic voxels (130 slices, TR = 7.09 ms, TE = 3.16 ms, FOV = 155 × 155 mm 2 ). Cortical reconstruction and volumetric segmentation were performed using the Freesurfer image analysis suite (http://surfer. nmr.mgh.harvard.edu/). The resulting surface reconstruction was registered to the M132 atlas 38 using the Caret software 84 (http://www.nitrc.org/projects/caret/). Cortical thickness was then extracted for all 91 areas in both hemispheres using Freesurfer. Here, we report results for mean thickness values of the left and right hemisphere. Cortical thickness data (registered to a different atlas) extracted from these MR data have been used in a previous publication 85 .
The thickness measurements extracted from MR data were well correlated with microscopic measurements of histological sections 14 . Corresponding histological and MR measurements for 33 areas were available, resulting in r = 0.62, p < 0.001 for the left hemisphere, r = 0.48, p < 0.01 for the right hemisphere, and r = 0.56, p < 0.001 for mean thickness values of the left and right hemisphere.
To quantify relative thickness across the cortex in order to compare thickness in pairs of connected areas, we computed the log-ratio of thickness values for each pair of areas analogous to the log-ratio of neuron density, where log-ratio thickness = ln (thickness source area /thickness target area ). We transformed the log-ratio of cortical thickness to an undirected equivalent, |log-ratio thickness |, where appropriate. Relative thickness of areas was included for all 2610 sampled projections, also encompassing all 429 projections analyzed with respect to N SG %.

Relative projection frequencies.
To characterize the distribution of present and absent projections across the range of each anatomical variable, while accounting for differences in sampling, we computed relative frequencies of projections that were present. Specifically, we partitioned each anatomical variable into bins and normalized the number of present projections in each bin by the total number of studied projections (i.e., absent and present projections that fall into the respective bin). This procedure allowed us to obtain a measure of the relative frequency of present projections which is robust against disparities in sampling across a variable's range (e.g., when more projections were sampled across a spatial separation of 10-15 mm than across 50-55 mm, as can be seen from the absolute projection numbers). We verified that results were robust against changes in bin size.
Classification of projection existence. We combined the anatomical variables in a probabilistic predictive model for classifying the existence of projections. We built this model using a binary support vector machine (SVM) classifier (i.e., used for two-class learning), which received the anatomical variables associated with the projections as independent variables (features) and information about projection existence (i.e., projection status 'absent' or 'present') as the dependent variable (labels, comprising two classes). Euclidean distance, absolute log-ratio of neuron density and absolute log-ratio of cortical thickness were used as features in different combinations.
For training the SVM classifier, we used a linear kernel function, standardized the independent variables prior to classification and assumed uniform prior probabilities for the learned classes. Classification scores obtained from the trained classifier were transformed to the posterior probability that an observation was classified as 'present' , p(present). To assess performance of the classification procedure, we used five-fold cross-validation. We randomly partitioned all available observations into five folds of equal size. After training the SVM classifier on a training set comprising four folds, we used the resulting posterior probabilities to predict the status of the remaining fold (20% of available observations) that comprised the test set. We used two classification rules derived from a common threshold probability. (1) We assigned the status 'present' to all observations whose posterior probability exceeded the threshold probability, that is, observations with p(present) > p(threshold). (2) We assigned the status 'absent' to all observations with p(present) < 1 − p(threshold). The approach was applied to thresholds from p(threshold) = 0.50 to p(threshold) = 1.00, in increments of 0.025. By increasing the threshold probability, we therefore narrowed the windows in the feature space for which classification was possible. For thresholds of p(threshold) < = 0.50, the classification windows overlap. In particular, parts of the feature space corresponding to classification as 'present' overlap with parts corresponding to classification as 'absent' , and observations would therefore be classified twice. For this reason, we did not consider thresholds below p(threshold) = 0.50. For each threshold, we computed performance as described below and averaged results across the five cross-validation folds. To make performance assessment robust against variability in the partitioning of observations, we report performance measures averaged across 100 rounds of the five-fold cross-validation.
We assessed classification performance by computing prediction accuracy, the fraction of correct predictions relative to all predictions. Accuracy was also separately assessed for positive and negative predictions, yielding precision and negative predictive value as the fraction of correct positive or correct negative predictions relative to all positive or negative predictions, respectively. We also computed which fraction of observations in the test set was assigned a prediction at a given threshold. As further performance measures, we computed sensitivity (true positive rate) and specificity (true negative rate) at the evaluated thresholds. We also computed the false positive rate (1-specificity). To quantify performance based on sensitivity and specificity, we computed the Youden-index J as J = sensitivity + specificity − 1 86,87 . J is a measure of how well a binary classifier operates above chance level, with J = 0 indicating chance performance and J = 1 indicating perfect classification. Since J is defined at each threshold, to obtain a single summary measure we computed the mean of J across the more conservative thresholds p(present) = 0.85 to p(present) = 1.00 for all 100 cross-validation runs. Results did not change if the maximum J across all thresholds was considered instead (Supplementary Figure S2B). To assess statistical null performance of the classification procedure, we performed a permutation analysis. The analysis was equal to the classification procedure described above, with the exception of an additional step prior to the partitioning of observations into cross-validation folds. Here, for each round of cross-validation, the labels were randomly permuted. Thereby, the correspondence between features and true labels of observations was removed. In the permutation analysis, we used Euclidean distance and the absolute log-ratio of neuron density as features, based on the feature combination that led to the best results, and averaged performance measures across 1000 rounds of five-fold cross-validation.

Statistical tests.
To test groups of projections for equality in their associated anatomical variables, we computed two-tailed independent samples t-tests and report the t-statistic t, degrees of freedom df and the associated measure of effect size r, where r = (t 2 /(t 2 + df)) 1/2 . Results did not change if Welch's t-test was applied, which does not assume equal variances across groups. To test for equality of more than two groups of areas regarding their neuron density or cortical thickness, we computed the non-parametric Kruskal-Wallis test statistics (H). To assess relations between interval variables, we computed Pearson's correlation coefficient r. For ordinal variables, we computed Spearman's rank correlation coefficient ρ . All tests were pre-assigned a two-tailed significance level of α = 0.05. These analyses were performed using Matlab R2012b (The MathWorks, Inc., Natick, MA, United States).
Methodological considerations. Some comments need to be made on the anatomical variables used in the present analysis. First, we used overall neuron density of brain areas to capture the complex architectonic profile of different cortices in a single parameter. Other crucial features of cytoarchitecture include the number and distinctiveness of cortical layers and the relative width and granularity of layer 4. Additionally, features that cannot be observed in cytoarchitecture, for example myeloarchitectonic properties, contribute to a fuller characterization of cortical differentiation (see ref. 88). However, many of these aspects are difficult to quantify. Moreover, there exists no consistent framework for integrating these measures into a one-dimensional ranking of structural differentiation. In practice, estimates of the overall differentiation of brain areas rely on subjective expert categorizations, resulting in the assignment of areas to 'structural types' (cf. refs 9 and 20). By contrast, neuron density can be determined objectively using unbiased stereologic methods. In a comparison of multiple quantitative features of cortical architecture, neuron density turned out to be the most discriminating parameter for identifying cortical areas in the primate prefrontal cortex 14 . The features included in that analysis comprised cortical thickness, and density of different cell markers, including neurons, glia, and neurons labeled with calbindin, calretinin or parvalbumin, and their respective laminar distributions. Further, there is a close correspondence between neuron density measurements and expert ratings of cytoarchitectonic differentiation that comprehensively take into account multiple dimensions 14 . Thus, neuron density is a well established, characteristic measure for quantifying cytoarchitectonic differentiation of cortical areas. Second, we used measurements of cortical thickness obtained from structural MRI in one macaque monkey. The MRI measures provided coverage of all cortical areas, and agreed well with the corresponding microscopic thickness measurements from histological sections (cf. Methods section Thickness model: Cortical thickness). This finding is in line with similar agreements between histological and MRI-based thickness measures seen for cortical regions of the human brain 89 . Therefore, the thickness measurements were considered reliable, despite the small sample size. Reliability was further strengthened by averaging thickness values for corresponding regions of the left and right hemisphere.