Skip to main content

Thank you for visiting You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

Genetic influences on hub connectivity of the human connectome


Brain network hubs are both highly connected and highly inter-connected, forming a critical communication backbone for coherent neural dynamics. The mechanisms driving this organization are poorly understood. Using diffusion-weighted magnetic resonance imaging in twins, we identify a major role for genes, showing that they preferentially influence connectivity strength between network hubs of the human connectome. Using transcriptomic atlas data, we show that connected hubs demonstrate tight coupling of transcriptional activity related to metabolic and cytoarchitectonic similarity. Finally, comparing over thirteen generative models of network growth, we show that purely stochastic processes cannot explain the precise wiring patterns of hubs, and that model performance can be improved by incorporating genetic constraints. Our findings indicate that genes play a strong and preferential role in shaping the functionally valuable, metabolically costly connections between connectome hubs.


Nervous systems are intricately connected networks with complex wiring patterns that are neither completely random nor completely ordered1,2. Numerous studies, conducted in species as diverse as the nematode Caenorhabditis elegans, mouse, macaque, and human, and at scales ranging from the cellular to the macroscopic, have shown that this complex organization is, in part, attributable to a heterogeneous distribution of connectivity across neural elements, such that a large fraction of network connections is concentrated on a small subset of network nodes called hubs3,4,5,6,7. These hubs are more strongly interconnected with each other than expected by chance, forming a rich-club3,4,5,7 that is topologically positioned to integrate functionally diverse neural systems and to mediate a large proportion of inter-regional communication5,8.

In the human cortex, hubs are predominantly located in transmodal paralimbic and association areas6,9 and are among the most metabolically expensive elements of the connectome10, with rich-club connections between hubs accounting for a disproportionate fraction of axonal wiring costs3,4,5,7,11. Paralimbic and association  hubs of the human brain also show marked inter-individual variability in connectivity and function that relates to a diverse array of behaviors6,12,13,14. These brain regions are disproportionately expanded in individuals with larger brains15 and in human compared to nonhuman primates16. They also show greater topological centrality and evolutionary divergence in the human connectome when compared to chimpanzee17. These findings support the view that rapid expansion of multimodal association hubs, and the costly, valuable rich-club connections between them, underlies the enhanced cognitive capacity of humans compared to other species18.

What influences the way in which hub regions connect to each other? The rapid evolutionary expansion of network hubs in humans, coupled with evidence supporting the heritability of many different aspects of brain organization19, suggests an important role for genes. In the developing brain, neurons can innervate precise targets, even over long anatomical distances, by following genetically regulated molecular cues20,21. However, it is unknown whether genetic influences are preferentially exerted across specific classes of connections, such as the costly and functionally valuable links between network hubs. Preliminary evidence from human twin research suggests that certain properties of hub functional connectivity are strongly heritable22, and analyses of C. elegans, mouse, and human data suggest that hub connectivity is associated with a distinct transcriptional signature related to metabolic function7,11,23,24,25. Alternatively, some have suggested that the protracted maturation of hub regions16,26,27 may endow these areas with enhanced plasticity12, suggesting a prominent role for environmental influences. Moreover, recent computational models of whole-brain connectome wiring suggest that it is possible to grow networks with complex topological properties, including hubs, that mimic actual brains using simple, stochastic wiring rules based on geometric constraints28,29,30,31 or trade-offs between the wiring cost and functional value of a connection32,33,34. These findings imply that the emergence of network hubs may not require precise genetic control and may instead result from random processes shaped by generic physical and/or functional properties.

Here, we use a multifaceted strategy to test between these competing views and characterize genetic influences on hub connectivity of the human cortical connectome. Using a connectome-wide heritability analysis (Fig. 1A, B), we show that genetic influences on phenotypic variance in connectivity strength are not distributed homogeneously throughout the brain, but are instead preferentially concentrated on links between network hubs. Then, as previously demonstrated in C. elegans11 and mouse7, we show that connected pairs of hubs in the human brain exhibit tightly coupled gene expression related to the metabolic demand and cytoarchitectonic similarity of these areas (Fig. 1C). Finally, we use computational modeling to show that stochastic network wiring models can indeed generate networks with brain-like properties, but fail to capture the spatial distribution of hub regions and, by extension, the precise pattern of wiring that connects them. Moreover, adding genetic constraints to the models can improve their performance. Collectively, these findings demonstrate a direct link between molecular function and the large-scale network organization of the human connectome and highlight a prominent role for genes in shaping the costly and functionally valuable connections between network hubs.

Fig. 1: Workflows used to characterize genetic influences on hub connectivity.

A A schematic representation of the connectome showing different connection types in the brain. Given a distinction between hub nodes (red outline) and nonhub nodes (gray outline), we can delineate three classes of connections: rich links—connections between two hubs (red); feeder links—connections between a hub and a nonhub (yellow); and peripheral links—connections between two nonhubs (blue). B Connectome-wide heritability analysis. We use structural equation modeling to fit a classic ACTE biometric model to every connection within the brain, resulting in estimates of genetic and environmental influences for each link. C Analysis of transcriptional coupling. (I) Each of 3702 tissue samples in the Allen Human Brain Atlas (AHBA) is mapped to a given region in our brain parcellation. (II) Expression values are then subjected to quality control and processing pipeline40 to construct a region × gene matrix of expression values. (III) We estimate correlated gene expression (CGE) between each pair of brain regions as the Pearson correlation between region-specific gene-expression profiles. (IV) Inter-regional CGE is corrected for spatial autocorrelation of the expression data via regression of an exponential distance trend40. D Schematic representation of how values assigned to each edge are compared across connection types. We compare the mean of edge-level (pairwise) measures of heritability and CGE for rich, feeder, and peripheral links across all possible hub-defining thresholds (horizontal axis). As k increases, the definition of a hub becomes more stringent and identifies the actual hubs of the network. Thus, if a given effect is stronger for rich links, we expect the pairwise estimates to increase as a function of k, with the increase for rich links being particularly large relative to the feeder and peripheral links.


Using diffusion-weighted imaging (DWI) data for 972 subjects acquired through the Human Connectome Project (HCP)35 we generate a representative group-level connectivity matrix containing 12,924 unique connections between 360 brain regions defined by the HCPMMP1 atlas36. This network contains a set of highly connected regions, quantified using the measure of node degree (k), which represent network hubs and span sensorimotor, paracentral, mid-cingulate (k > 105), insula, posterior cingulate, lateral parietal, and dorsolateral prefrontal cortices (k > 145) (Fig. 2A). As shown previously5,6,8,10, the network exhibits rich-club organization, with hubs being more densely and strongly interconnected than expected by chance (Supplementary Fig. 1). Rich-club connections also have higher average wiring cost and communicability (Supplementary Fig. 1), indicating that they are among the most topologically central and costly elements of the connectome.

Fig. 2: Genetic influences on connectivity strength are preferentially concentrated on rich-club links.

A Anatomical locations of hubs defined at different levels of k. B The degree distribution of the representative group-level connectome. Mean genetic (C) and unique environmental (D) influences for rich (hub–hub), feeder (hub–nonhub), peripheral (nonhub–nonhub) connections as a function of the hub-defining threshold, k. The mean of the corresponding measure across all network links is shown as a dotted black line. Shaded area corresponds to the standard error of the mean, circles indicate a statistically significant increase of the measure in a given link type compared to the rest of the network (one-sided Welch’s t-test, uncorrected p < 0.05). E Regional assignments to canonical functional network modules38, represented using color. F The proportion of nodes with degree > k in each functional network module as a function of k. G Distributions of heritability estimates across edges within functionally defined networks38: VIS1—primary visual; VIS2—secondary visual; SM—somatomotor; CO—cingulo-opecular; DAN—dorsal attention; LAN—language; FPN—frontoparietal; AUD—auditory; DMN—default mode; PM—posterior multimodal; VM—ventral multimodal; OA—orbito-affective. Rich links within each module are represented as black dots, as defined for k > 105. Heritability distributions for edges within (H) and between (I) functional modules across rich, feeder, and peripheral link types for k > 105. Rich links show significantly higher heritability compared to both feeder and peripheral links, within and between functional modules (one-sided Welch’s t-test, all p < 1.9 × 10−12). For distributions presented in GI, white dots represent median values for each distribution. The interquartile range is represented with a dark gray box, whiskers are represented with a light gray line. Source data are provided as a Source Data file.

Genetic influences on brain connectivity are concentrated in the rich club

To investigate whether genes preferentially influence certain classes of connections in the human brain, we perform a connectome-wide heritability analysis of twin data acquired through the Human Connectome Project. For each of 234 monozygotic (MZ) twins and their 69 non-twin siblings as well as 120 dizygotic (DZ) twins and 48 of their non-twin siblings, we reconstruct macroscale cortical connectomes using DWI.

For each connection in the representative group connectome, we use the classic ACTE model to estimate the proportion of variance in connectivity strength that is attributable to additive genetic factors (narrow-sense heritability, denoted h2). Using the average fractional anisotropy (FA) of each connecting fiber bundle to quantify connectivity strength, we observe a wide range of heritability estimates across connections, spanning 0 to 0.99 (\({{h}^{2}}_{{\rm{mean}}}=0.45\), \({{h}^{2}}_{{\rm{SD}}}=0.2\)). Non-trivial genetic influences, quantified using the A component of the ACTE model, are observed for the majority of connections, with the AE model showing the best fit for 86.7% of edges, ACTE for 4.3%, and ACE for 1.3%. A total of 7.7% of connections are influenced only by environmental factors (CE model 6.8%, E model 0.9%). Anatomical projections of the most and least heritable edges are shown in Supplementary Fig. 2. To examine whether genetic influences are preferentially concentrated on specific types of inter-regional connections, we distinguish between hub and nonhub regions, resulting in three possible types of connections: rich (hub-to-hub), feeder (between a hub and a nonhub), and peripheral (nonhub-to-nonhub) links [see schematic in Fig. 1A37]. We find that mean heritability derived from the best-fitting biometric model is highest for rich, intermediate for feeder, and lowest for peripheral connections across nearly all values of k (Fig. 2B, C). The increase in heritability for rich links as a function of the hub-defining threshold, k, indicates that genetic influences are, on average, stronger for connections between the most highly connected brain regions [see also Fig. 1D]. The same pattern is replicated when taking genetic parameters from the full ACTE model, confirming that this result is not an artifact of our model-selection procedure (Supplementary Fig. 3B).

Contrasting with rich links between hubs, phenotypic variance in peripheral connectivity (between nonhubs) is predominantly influenced by unique environment (quantified by the model parameter E, Fig. 2D), while common environmental influences are consistently low across all link types (mean values 〈C〉 < 0.08 and 〈T〉 < 0.02 across all k thresholds). Critically, we obtain similar evidence of preferential genetic influences on hub connectivity when using different methods for parcellating or thresholding our connectomes (Supplementary Fig. 4) or when evaluating connection strength based on the number of reconstructed streamlines (streamline count, SC) between regions (Supplementary Fig. 3C, D).

To investigate whether genetic influences are specific to certain functional systems of the brain, we next categorize edges according to the major functional networks that they connect, as defined using a network parcellation38 of the HCPMMP1 atlas36 (Fig. 2E). Figure 2F shows the proportion of nodes with degree > k in each functional network. High-degree nodes are present in most networks until k ≈ 120, beyond which they are predominantly found in transmodal paralimbic and association networks; namely, the frontoparietal, cinguolo-opercular, and default mode systems.

Across all 12 canonical functional networks, rich links both within and between networks demonstrate significantly higher heritability than other types of connections (Fig. 2G–I, one-sided Welch’s t-test, comparing heritability of rich vs feeder and rich vs peripheral connections, all p < 1.9 × 10−12), indicating that the elevated genetic influences observed for rich links cannot be explained by the affiliation of hub nodes to any specific functional network. Moreover, the stronger heritability of rich links is evident across different connection distances (Supplementary Fig. 5A–C), suggesting that preferential genetic influences on hub connectivity cannot be explained simply by the longer average distance between hubs (Supplementary Fig. 1D). Other factors, such as the number of outliers excluded from the analysis (Supplementary Fig. 5D–F, K) or differences in the phenotypic variance of connectivity strength estimates across different edge types (Supplementary Fig. 5G–I), were also unable to account for the increased heritability of rich links.

Together, these findings indicate that genetic influences on phenotypic variance in connectivity strength are not distributed homogeneously throughout the brain, nor are they confined to specific functional networks or long vs short-range connections. Instead, they are most strongly concentrated on the connections between network hubs. These hubs are distributed throughout the cortex, with the most highly connected regions residing in transmodal networks.

Transcriptional coupling is elevated between connected hubs

Next, we investigate the transcriptional correlates of hub connectivity using data from the Allen Human Brain Atlas (AHBA)39, focusing on expression profiles of 10,027 genes surpassing our quality-control criteria40 within the 180 cortical regions of the left hemisphere, where spatial coverage in the AHBA is most comprehensive. Evaluating CGE across the full set of genes allows us to quantify global expression patterns across the brain without restricting the analysis to predefined gene categories. We secondarily test for enrichment of certain classes of genes, as detailed below. We quantify the transcriptional coupling between different brain regions using spatially-corrected correlated gene expression (CGE) (Fig. 1C and Supplementary Fig. 6) and define inter-regional connectivity using a binary group-representative matrix. The spatial correction is important as prior studies of C. elegans, mouse, and human nervous systems have shown that, across the brain, CGE decays exponentially as a function of distance7,11,24,41. Recent analyses of the mesoscale connectome of the mouse7 and microscale (cellular) connectome of C. elegans11 indicate that, after considering this bulk trend, connected pairs of hubs show the highest CGE, despite being separated by longer anatomical distances, on average, than other neural elements.

Figure 3A, B shows that the same effect is observed in humans: CGE is highest for rich, intermediate for feeder, and lowest for peripheral links. We obtain similar results under different connectome processing options (Supplementary Fig. 7), distance ranges (Supplementary Fig. 8), and when using connectivity data from an independent sample (Supplementary Fig. 9A). The consistency of this effect between human, mouse, and C. elegans [see Supplementary Fig. 10 for comparison] is striking given the large physiological differences between species, methods for connectome reconstruction (DWI, viral tract tracing, electron microscopy), analysis resolution (macroscale (mm to cm), mesoscale (μm to mm), microscale (individual cells and synapses)), and gene-expression assays (microarray, in situ hybridization, curation of published reports).

Fig. 3: Transcriptional coupling is elevated for connected brain network hubs.

A The degree distribution of the representative group-level connectome of brain regions in the left cortical hemisphere. Degree is computed from whole-brain connectivity. B Mean correlated gene expression (CGE) for rich (hub–hub), feeder (hub–nonhub), peripheral (nonhub–nonhub) connections as a function of the degree threshold, k, used to define hubs. The mean CGE across all network links is shown as a dotted black line. The shaded area corresponds to the standard error of the mean, circles indicate a statistically significant increase in CGE in a given link type compared to the rest of the network (one-sided Welch’s t-test, uncorrected p < 0.05). CGE estimates are corrected for distance effects, as detailed in the Methods section. C CGE within functionally defined networks as in Fig. 2E. Black dots represent CGE values for rich links (k > 105). CGE values within (D) and between (E) functional modules in the left hemisphere across different link types (rich, feeder, and peripheral). Inter-module rich links show significantly higher CGE compared to both feeder (one-sided Welch’s t-test, p = 0.03) and peripheral links (p = 1.5 × 10−4). Within functional modules, rich links show higher CGE compared to peripheral (p = 1.2 × 10−4) but not to feeder links (p = 0.5). F Gene contribution score t-statistic values (GCSt-stat) for cell-specific gene groups quantifying the contribution of individual genes towards increased CGE for rich compared to peripheral links. Neuronal gene groups (excitatory—excitatory neurons; inhibitory—inhibitory neurons) are colored blue; glial gene groups (OPC—oligodendrocyte progenitor cells, astroglia, endothelia—endothelial cells, microglia, oligodendrocytes) colored green; values for all other genes presented in light orange. Oligodendrocyte-related genes show a statistically significant increase in GCC compared to all other genes (one-sided Welch’s t-test, p = 2 × 10−11). For distributions presented in CF white dots represent median values for each distribution. The interquartile range is represented with a dark gray box, whiskers are represented with a light gray line. G The degree distribution of the representative group-level cortical connectome. H Mean microstructural profile covariance (MPC) for rich (hub–hub), feeder (hub–nonhub), peripheral (nonhub–nonhub) connections as a function of degree threshold, k used to define hubs. The MPC across all network links is shown as a dotted black line. Shaded area corresponds to the standard error of the mean, circles indicate a statistically significant increase in MPC in a given link type compared to the rest of the network (one-sided Welch’s t-test, uncorrected p < 0.05). Inset near the degree distribution shows examples of the intermediate surfaces used to assay microstructure across the cortical depth (reproduced with permission from49). Source data are provided as a Source Data file.

As with heritability (Fig. 2C), higher CGE occurs for connections between high-degree nodes distributed across the brain; i.e., the effect is not confined to a single functional network (Fig. 3C). Indeed, connected pairs of hubs demonstrate higher CGE both within (Fig. 3D) and between (Fig. 3E) functionally defined networks (one-sided Welch’s t-test, comparing CGE of rich vs feeder and rich vs peripheral connections, all p ≤ 0.02).

Expression values in the AHBA are extracted from bulk tissue samples, and thus agglomerate transcriptional information from many different cell types. It is, therefore, possible that inter-regional CGE may be related to similarity in regional cellular composition [see also ref. 42]. We thus repeated the CGE analysis, this time using only data from genes showing cell-specific expression for seven canonical cell types: excitatory and inhibitory neurons, oligodendrocyte progenitor cells, astroglia, endothelial cells, microglia, and oligodendrocytes43,44,45,46,47.

We find that all classes of cell-specific genes exhibit an increase in CGE for rich links relative to peripheral (Supplementary Fig. 11), with oligodendrocyte-related genes showing a significantly stronger contribution to elevated CGE between connected hubs (one-sided Welch’s t-test, p = 2 × 10−11, Fig. 3F) compared to all other genes. These findings suggest that connected hubs may have higher cytoarchitectonic similarity than other pairs of regions. Given that the CGE of cell-specific genes is a relatively indirect marker of cytoarchitecture, we conducted a more direct test of the hypothesis that connected hubs have more similar cytoarchitecture using the BigBrain atlas48, which is a high-resolution Merker-stained histological reconstruction of a postmortem human brain that provides an opportunity to map regional variations in cellular density as a function of cortical depth. Following Paquola et al.49, we estimate intensity profiles across 16 equivolumetric surfaces placed between the gray/white and pial boundaries of the cortical ribbon and compute the inter-regional microstructural profile covariance (MPC) as a proxy for cytoarchitectonic similarity. Mirroring the CGE and heritability findings, rich links exhibit elevated MPC compared to the feeder and peripheral edges (Fig. 3H). These convergent MPC and cell-specific CGE results indicate that connected hubs have a more similar cytoarchitecture than other pairs of brain regions.

Finally, a gene set enrichment analysis of gene groups related to elevated CGE between hubs identifies significant enrichment of 48 GO categories, notably featuring genes related to purine metabolism, ATP biosynthesis and metabolism, and mitochondrial function (pFDR < 0.05, Supplementary Table 1) that mirror those previously reported in the mouse brain7. These results suggest a close genetic link between hub connectivity and metabolic function [for additional considerations, see ref. 50].

Stochastic models of brain wiring do not capture the spatial topography of degree

The results above imply strong genetic control of hub connectivity, which seems at odds with recent modeling studies suggesting that simple stochastic wiring rules can generate networks with complex, brain-like topologies, including heavy-tailed degree distributions that signify the existence of hubs29,30,32,33. Investigating variability in the binary topology of connectivity—that is, the specific pattern of wiring between regions—is challenging, as such variations at the level of macroscale human connectomes are limited. It is thus possible that stochastic processes may give rise to the basic binary topology of hub connectivity, with variations in connectivity strength subsequently being influenced by genetic factors.

To investigate the role of stochastic processes in shaping hub connectivity, we fitted 13 different generative models of network wiring to the HCP connectome data. Under each model, synthetic connectomes are generated using probabilistic wiring rules. The models we consider here have been explored extensively in prior work33 and have the general form:

$${\theta }_{ij}={d}_{ij}^{\eta }\times {t}_{ij}^{\gamma },$$

where θij is a score that weights the probability of connecting nodes i and j, dij is the Euclidean distance between node pairs, and tij is a topological property of an edge that may confer functional value to the network. Each of the 13 models substitutes a different topological property for tij (definitions in Table 2). The exponents η and γ are free parameters fitted to the data to optimally match the topological properties of the actual human connectome, as defined using nodal distributions of degree, clustering, and betweenness, and the edge-level distribution of connection distances33.

In line with prior work33, we find that models in which connections form according to both spatial (wiring cost) and topological rules can fit the distributions of empirical network properties better than a model based on wiring cost alone (i.e., the “sptl” model), as shown in Fig. 4A. The best-fitting model, “deg-avg”, modulates a pure wiring cost term by favoring connectivity between pairs of nodes that already have a high average degree, and shows a good fit to the data (i.e., all fits, indexed by the Kolmogorov–Smirnov statistic, were KS < 0.21).

Fig. 4: Generative brain network models do not reproduce the spatial topography of brain network hubs.

A Each distribution represents estimates of model fit, as quantified by the maximum KS value of the top 100 networks (out of 10,000) produced by the model optimization procedure. The color of each box indicates conceptually related models, as determined by the specific topology metric used in the model [Table 2]. White dots represent median values for each distribution. The interquartile range is represented with a dark gray box, whiskers are represented with a light gray line. Models favoring homophilic connectivity between node pairs are shown in red, those favoring clustering in orange, those based on the degree in light blue, and a purely spatial model considering wiring costs alone is in dark blue. The specific wiring-rule names are shown along the horizontal axis, with formal definitions provided in Table 2. Cumulative distributions of: B node degree, k; C betweenness centrality, b; D clustering coefficient, c; and E edge length, d, for the empirical connectome (darker line) and 100 runs (lighter lines) of the best-fitting “deg-avg” model corresponding to the data points shown in A. F Anatomical locations of hubs defined for a single hemisphere at selected k thresholds for the empirical data (top) and the single run of the optimized “deg-avg” generative model demonstrating the best model fit across 10,000 runs (bottom). G Correlation between the degree sequences of the empirical data and the best-fitting generative model within a single hemisphere (Spearman’s ρ = −0.05, p = 0.49). H The distribution of correlation values quantifying the relationship between left hemisphere degree sequences of the empirical data and synthetic networks generated using the top 100 best-fitting parameter combinations for each of the 13 considered models, corresponding to the data points shown in A. Source data are provided as a Source Data file.

Despite this adequate fit to four key network properties of the human connectome (Fig. 4B–E), we find that node degree in the empirical and model networks have very different spatial distributions. As shown in Fig. 4F, hubs in the empirical data are distributed throughout the brain, whereas hubs in the network that demonstrates the best fit to data across 130,000 model runs are predominantly confined to the temporal cortex. As a result, the correlation between the degree sequences of the empirical and model networks is very low (ρ = −0.05, Fig. 4G). This low correlation is observed consistently across all models (Fig. 4H), and even when we fit model parameters to explicitly optimize the correlation between empirical and model degree sequences (Supplementary Fig. 12); across 260,000 model runs, the degree sequence correlation with the empirical data never exceeds ρ = 0.3.

Together, these findings indicate that while stochastic models of brain network wiring can capture the statistical properties (node- and edge-level distributions) of connectomes, they cannot reproduce the way in which these properties are spatially embedded and thus do not accurately replicate the precise pattern of wiring between connectome hubs.

Genetically constrained models offer improved fits to topological and topographical properties of the connectome

The limitations of stochastic models, coupled with our evidence for a strong genetic influence on hub connectivity, raise the question of whether models that include genetic information may show better performance than models based on cost and/or topology alone. To address this question, we focus on the best-fitting cost-topology model, “deg-avg”, and examine its performance relative to model variants that include a bias to form connections between pairs of regions with high CGE. We focus specifically on CGE, given our evidence for elevated CGE between pairs of hubs regions (Fig. 3).

Figure 5 compares model fit statistics for the original “deg-avg” model (denoted “ST” in Fig. 5A), and models in which connections are formed according to CGE alone (denoted “G”), wiring cost alone (denoted “S”), an interplay between CGE and wiring cost (denoted “SG”), and an interplay between CGE and topological constraints, as defined in the “deg-avg” model (denoted “TG”). We find that a model incorporating both topology and genetic information, such that connections are favored between regions with both high average degree and high CGE (“TG” model), shows the best fits, on average, to network topology, even surpassing the “deg-avg” (“ST”) model.

Fig. 5: Adding genetic constraints to generative models can improve fits to network topology and topography.

A Each distribution represents estimates of model fit, as quantified by the maximum KS value of the top 100 networks (out of 10,000) produced by the model optimization procedure. The color of each box indicates conceptually related models, as determined by the specific metric used in the model: models favoring connectivity between regions with similar gene expression are in green, a model based on degree and wiring cost is in light blue, and a purely spatial model considering wiring costs alone is in dark blue. “S”, “T”, “G” stand for space (wiring cost), topology, and gene expression, respectively. White dots represent median values for each distribution. Interquartile range is represented with a dark gray box, whiskers are represented with a light gray line. Cumulative distributions of B node degree, k; C betweenness centrality, b; D clustering coefficient, c; and E edge length, d, for the empirical connectome (darker line) and 100 runs (lighter lines) of the best-fitting “TG” model corresponding to the data points shown in A. F Anatomical locations of hubs defined for a single hemisphere at selected k thresholds for the empirical data (top) and the single run of the optimized “TG” generative model demonstrating the best model fit across 10,000 runs (bottom). These networks contain 177 regions (instead of 180 presented in Fig. 4F) due to the limited coverage of gene-expression data. G Correlation between the degree sequences of the empirical data and the best-fitting generative model within a single hemisphere (Spearman’s ρ = 0.23, p = 3.3 × 10−5). H The distributions of correlation values quantifying the relationship between left hemisphere degree sequences of the empirical data and synthetic networks generated using the top 100 best-fitting parameter combinations for each of the 6 considered models, corresponding to the data points shown in A. Source data are provided as a Source Data file.

Moreover, the spatial distribution of hubs in the “TG” model network demonstrating the best fit is more dispersed across the brain (Fig. 5F) compared to the classical “deg-avg” model, resulting in a higher correlation between degree sequences of the empirical and model networks (Fig. 5G). Overall, models including topology and CGE or CGE alone demonstrate more positive degree sequence correlations compared to models that do not include CGE (Fig. 5H). Our findings thus indicate that incorporation of genetic constraints into stochastic models of the connectome can improve fits to both network topology and spatial topography.


The complex topology of neural networks is thought to have been sculpted by competitive selection pressures to minimize wiring costs and promote complex, adaptive function10,51. Across diverse species, rich-club connections between hubs are among the most costly and topologically central links of the connectome3,4,5,7 and thus play a major role in determining how cost-value trade-offs are negotiated within a given nervous system. Here, we combine a multifaceted genetic analysis with mathematical modeling to examine the mechanisms that shape hub connectivity of the human connectome. We find that: (i) genetic influences on phenotypic variation in connection strength are principally concentrated on the rich links between hubs; (ii) connected hubs have highly correlated gene-expression patterns that are related to similarity in regional cytoarchitecture and energy metabolism; (iii) current stochastic models of network growth cannot reproduce the spatial topography of hubs; and (iv) adding genetic constraints to these models can improve performance. Together, these findings support a major role for genes in shaping the rich-club organization of the brain.

Our connectome-wide heritability analysis presents evidence for a non-uniform distribution of genetic influences across the brain, characterized by a gradient in which genetic influences are weak for peripheral connections between nonhubs, intermediate for feeder connections between hubs and nonhubs, and strongest for rich links between hubs. Critically, this effect cannot be attributed to connection distance or network affiliation, suggesting some degree of specificity to hubs located throughout the brain.

The most strongly connected hubs in our connectomes were located in transmodal paralimbic and association networks, which show disproportionate expansion in size and connectivity in human compared to nonhuman primates16,17,18. Given the high centrality and cost of these connections (Supplementary Fig. 13,4,5,7), the preferential genetic influence on rich-club connectivity that we observe supports the hypothesis that natural selection favors wiring patterns that provide high value for low cost and that selection pressures are strongly concentrated on the valuable, costly links between hubs2,10. This view is also supported by recent evidence that genes demonstrating accelerated divergence between humans and chimpanzees show elevated expression in transmodal networks52.

In contrast to rich-club connectivity, phenotypic variance in peripheral connections between nonhubs is primarily influenced by unique environment. Topologically peripheral connections are more strongly conserved between human and chimpanzee connectomes17. Moreover, the spatial topography and function of nonhub sensory areas are highly consistent across primates, presumably being specified early in development by evolutionarily conserved transcriptional gradients18. These conserved gradients may couple with simple physical processes to give rise to predominantly short-range connectivity between topologically peripheral pairs of regions31,53. Subsequent modifications to peripheral connectivity may be driven by activity-dependent mechanisms, resulting in a greater environmental influence on phenotypic variance in connection strength.

It has been proposed that evolutionary expansion of multimodal hubs untethers these regions from transcriptional anchors in sensory areas, resulting in distinctive, non-canonical anatomical and functional properties18. Our findings suggest that, despite this putative untethering, genes still play an important role in shaping phenotypic variance of hub connectivity. This result aligns with evidence that non-conserved network properties reflect evolutionary innovations that are driven by structural variation of DNA, yielding greater phenotypic variation within a species12 and higher trait heritability when compared to more strongly conserved properties54.

In addition to being highly heritable, pairs of connected hubs also show the highest levels of transcriptional coupling, as previously observed in the mesoscale mouse connectome7 and cellular connectome of C. elegans11 (Supplementary Fig. 10). In C. elegans this result is not explained by the developmental proximity (i.e., similarity in neuron birth time or cell lineage distance), neurochemical identity, or anatomical position of neuron pairs, but is instead related to the functional identity of hub neurons, which tend to be command interneurons. Our analysis of cell-specific genes suggests a similar result at the regional level in humans, as rich-link CGE was elevated for gene markers of seven different cell types, suggesting that network hubs have enhanced similarity in regional cytoarchitecture. This conclusion was supported by our MPC analysis of the BigBrain atlas. Our results align with the structural model of cortical connectivity, in which regions with similar cytoarchitecture are more likely to connect with each other, even over long distances55. More specifically, our findings suggest that hub areas are the most similar in their cellular composition, and that this similarity may play a critical role in how genes preferentially sculpt long-range inter-connectivity between hubs.

We also show that current stochastic models of network growth, despite capturing key statistical network properties of the connectome, do not reproduce the spatial locations of network hubs. Indeed, while the actual hubs of the human brain have a widespread anatomical distribution, hubs in the best-fitting (“deg-avg”) model network are concentrated around centrally located regions. In line with this result, recent work has shown that cost-neutral randomizations of brain connectivity, in which connections are progressively randomized while preserving total wiring cost and the existence (but not position) of hubs, almost always degrade the functional complexity of the network, disconnect high-cost hubs, and lead to a distinct hub topography in which the most highly connected nodes cluster near the center of the brain56. These findings suggest that actual brains are very close to optimally balancing wiring cost with topological complexity, and that hub connectivity plays a critical role in determining how this balance is realized [see also ref. 57].

Notably, we find that incorporating genetic constraints into the models improves their capacity to reproduce both network topology and the spatial topography of hubs. While there is still room for considerable further improvement, our findings indicate that combining topological, genetic, and possibly spatial information may offer a fruitful way forward for generative models of the human connectome. Indeed, some models suggest that random growth of connections, when coupled with changes in brain geometry and heterochronicity of connection formation across regions, can yield brain-like networks with realistic features29, including connectivity between regions with similar cytoarchitecture58,59. Although genes likely influence heterochronous development, future work that extends such models so that they can be directly fitted to empirical data in humans, and which considers which genes may be most relevant in shaping network wiring, will be important for delineating the precise roles of genetic, environmental, stochastic, and physical mechanisms in shaping connectome architecture.


Imaging data acquisition

We examined DWI data from two independent cohorts. The first was obtained from the Human Connectome Project (HCP35). We used the minimally processed DWI and structural data from the HCP for 972 participants (age mean ± standard deviation: 28.7 ± 3.7, 522 females), including a cohort of MZ and DZ twin pairs together with their non-twin siblings. Data were acquired on a customized Siemens 3T “Connectome Skyra” scanner at Washington University in St Louis, Missouri, USA using a multi-shell protocol for the DWI: 1.25 mm3 isotropic voxels, repetition time (TR) = 5520 ms, echo time (TE) = 89.5 ms, field-of-view (FOV) of 210 × 180 mm, 270 directions with b = 1000, 2000, 3000 s/mm2 (90 per b value), and 18 b = 0 volumes. Structural T1-weighted data were collected using 0.7 mm3 isotropic voxels, TR = 2400 ms, TE = 2.14 ms, FOV of 224 × 224 mm. Subject recruitment procedures and informed consent forms, including written informed consent to share de-identified data, were approved by the Washington University institutional review board. The full details can be found elsewhere60.

The second DWI data set came from individuals recruited as part of ongoing research conducted at Monash University. The experimental protocol was approved by Monash University’s Human Research Ethics Committee and was carried out in accordance with the approved guidelines. Written informed consent was obtained from all participants before testing. The Monash sample was used for replication of the CGE analysis, and comprised 439 participants with MRI data obtained on a Siemens Skyra 3T scanner at Monash Biomedical Imaging in Clayton, Victoria, Australia using the following parameters: 2.5 mm3 voxel size, TR = 8800 ms, TE = 110 ms, FOV of 240 × 240 mm, 60 directions with b = 3000 s/mm2 and seven b = 0 volumes. In addition, a single b = 0 s/mm2 was obtained with the reverse-phase encoding so distortion correction could be performed. T1-weighted structural scans were acquired using: 1 mm3 isotropic voxels, TR = 2300 ms, TE = 2.07 ms, FOV of 256 × 256 mm. Data for 15 subjects were excluded due to: low connectome density (n = 10, connectome density more than 3 standard deviations lower than the mean) or issues with cortical surface segmentation (n = 5), resulting in a final sample of 424 participants (age mean ± standard deviation: 23.5 ± 5.3, 190 females).

Image pre-processing

HCP DWI data were processed according to the HCP minimal pre-processing pipeline, which included normalization of mean b0 image across diffusion acquisitions, and correction for EPI susceptibility and signal outliers, eddy-current-induced distortions, slice dropouts, gradient-nonlinearities, and subject motion. T1-weighted data were corrected for gradient and readout distortions prior to being processed with Freesurfer (full details can be found in ref. 60).

Pre-processing for T1-weighted structural images in the Monash Sample consisted of visual screening for gross artifacts followed by the reconstruction of the gray/white matter interface and the pial surface using FreeSurfer v5.3.0 software. Surface reconstructions for each subject were visually inspected, with manual corrections performed as required to generate accurate surface models60.

Distortions in the Monash DWI data were corrected with TOPUP in FSL, using the forward and reverse-phase-encoded b = 0 images to estimate the susceptibility-induced off-resonance field61,62. We corrected for eddy-current distortions, volume-to-volume head motion, within-volume head motion, and signal outliers using the eddy tool in FSL (version 5.0.1163,64,65). This implementation of EDDY significantly mitigates motion-related contamination of DWI connectivity estimates66. DWI data were subsequently corrected for B1 field inhomogeneities using FAST in FSL62,67.

Connectome reconstruction

For both the HCP and Monash data sets, network nodes for each individual were defined using a recently-developed, data-driven group average HCPMMP1 parcellation of the cortex into 360 regions (180 per hemisphere36). An advantage of this parcellation is that it uses diverse structural and functional information to derive a consensus partition of the cortex into different areas. Each region has also been assigned to a distinct canonical functional network38, allowing us to examine results in relation to the organization of these classic systems. However, the resulting areas can vary considerably in size, which can affect regional connectivity estimates since larger regions are able to accommodate more connections. To ensure that our results were not driven by the use of this specific parcellation, we replicated our main findings using a random cortical parcellation consisting of 500 approximately equally sized regions (250 per hemisphere, generated using the approach described in ref. 68; code available at This approach offers a stringent test of the generalizability of our findings, as the parcellations vary in terms of both methods for construction (data-driven vs random) and resolution (360 vs 500 nodes).

We focus our analysis on cortical connectivity for simplicity, as we know of no unified parcellation of cortical and subcortical areas. Moreover, positional differences between cortex and subcortex can affect DWI connectivity estimates, and there are major differences between cortical and subcortical patterns of gene expression in the AHBA data that can be difficult to appropriately accommodate40.

Subsequent processing of the DWI data for both the HCP and Monash data was performed using the MRtrix369 and FMRIB Software Library70. Tractography was conducted in each participant’s T1 space using second-order integration over fiber orientation distributions (iFOD2)71. To further improve the biological accuracy of the structural networks, we applied Anatomically Constrained Tractography (ACT), which uses a tissue segmentation of the brain into cortical gray matter, subcortical gray matter, white matter, and cerebrospinal fluid to ensure that streamlines are beginning, traversing, and terminating in anatomically plausible locations72. Tissue types were determined using FSL software70. A total of 10 million streamlines were generated on a probabilistic basis using a dynamic seeding approach that evaluates the relative difference between the estimated and current reconstruction fiber density and preferential samples from areas of insufficient density73. This method helps mitigate biases related to the poor reconstruction of tracts from certain parts of the brain due to insufficient seeding. The resulting tractogram was then combined with the cortical parcellation for each subject to produce a network map of white matter connectivity. Streamline termination points were assigned to the closest region within a 5 mm radius.

Connection weights were quantified using both streamline count (number of streamlines connecting two regions, SC) and the mean fractional anisotropy (FA) of voxels traversed by streamlines connecting two regions, which is commonly used as a marker of white matter microstructure. We focused on SC and FA as measures of connectivity strength because they are the most widely used in the literature, but note that they can be influenced by numerous factors that are not directly related to physiological measures of communication capacity between two regions74. Moreover, while diffusion tractography remains the only available tool for in vivo connectivity mapping in humans, tractography algorithms can vary in their specificity and sensitivity for tract reconstruction75,76. To mitigate these effects, our data processing pipeline has been designed to limit contributions from spurious streamlines72 and head motion66. While the accuracy of all tractography methods remains an open challenge for the field77, we note that any errors in tract reconstruction should reduce our chances of identifying stronger genetic effects for rich links through heritability analysis, since noisy connectivity values will inflate estimates of the E parameter (which also accounts for measurement error) in our biometric models, and rich links tend to be long-range connections, which are more prone to tractography errors78. Our findings may thus provide a conservative estimate of genetic influences on hub connectivity.

Connectome thresholding

As connectomes are estimated with some degree of noise, it is common practice to threshold weak or inconsistent edges to focus on connections that can be more reliably estimated79. We, therefore, selected edges that were: (i) present in at least 30% of subjects; and (ii) were amongst the τ% strongest edges (based on the median streamline count) to achieve the desired connectome density. We note that multiple other thresholding approaches are available79,80,81, and there is no consensus as to which works best for different data sets. Since the desired connection density is arbitrary, we examined our main results across a range of densities: τ = 15%, 20%, 25% for 360 region parcellation, and τ = 5%, 10%, 15% for the higher resolution parcellation of 500 regions. We note that the actual connection density of the human connectome remains unknown, and we chose these thresholds to span a range commonly studied in the literature.

The connection matrix resulting from our thresholding procedure was then used as a binary mask for selecting edges for the heritability, gene-expression analyses, and generative modeling. This masking procedure thus restricted individual variability in the binary topology of connectomes across individuals (indeed, in healthy individuals such topology should be highly conserved). For heritability analysis, we used this group-representative connectome as a mask to extract FA-based connection weights and also repeated the analysis using streamline count as a measure of connection strength.

Rich-club organization

The connectivity of each region (node) in a network can be quantified by counting the number of connections to which it is attached. This measure is known as node degree. At a particular degree threshold, k, nodes can be labeled as hubs (degree > k) or nonhubs (degree ≤ k), subsequently classifying all connections within the network as “rich” (connection between two hubs), “feeder” (connection between a hub and a nonhub), and “peripheral” (connection between two nonhubs) (Fig. 1A5). To quantify the inter-connectivity between hub regions within a binary brain connectivity network, we used the topological rich-club coefficient ϕ(k):

$$\phi (k)=\frac{2{E}_{{> }\,k}}{{N}_{{> }\,k}({N}_{\!{> }\,k}-1)},$$

where N>k is the number of nodes with degree > k, and E>k is the number of edges between nodes with degree > k82. Therefore, the rich-club coefficient quantifies the density of the subgraph comprising nodes with a degree higher than the hub-defining threshold k.

Since nodes with a higher degree make more connections, and can thus be expected to have a higher connection density compared to other nodes, we compared the ϕ(k) of the empirical network to the mean value across a 1000 randomized null networks, ϕrand(k), generated by rewiring the edges of the empirical network while retaining the same degree sequence, using the randmio_und function from the Brain Connectivity Toolbox83, rewiring each edge 50 times per null network. This randomization method is commonly used in the literature3,4,5,7. Alternative approaches that preserve both the degree sequence and connectome wiring cost56,84,85 can be used to test rich-club organization in relation to geometric influences on connectome organization.

To assess whether the connections between high-degree nodes were also more likely to have stronger connection weights than expected by chance, we evaluated the weighted rich-club coefficient86:

$${\phi }^{w}(k)=\frac{{W}_{ \!\!{> }\,k}}{\mathop{\sum }\nolimits_{l = 1}^{{E}_{ {> }\,k}}{w}_{l}^{{\rm{rank}}}},$$

where W>k is the sum of weights in the subgraph with degree higher than k, and the denominator is the total sum of l strongest weights in the network. As a null model for the weighted rich-club coefficient, we separate the definitions of weighted and topological rich-club coefficients by randomly reassigning weights within the network while preserving the binary topology87 (instead of rewiring the links).

In both binary and weighted cases, we computed the normalized rich-club coefficient ϕnorm(k) as the ratio between the rich-club coefficient in the empirical network and the mean rich-club coefficient in the set of the corresponding randomized networks:

$${{{\Phi }}}_{{\rm{norm}}}(k)=\frac{\phi (k)}{\langle {\phi }_{{\rm{rand}}}(k)\rangle }.$$

Values of Φnorm > 1 indicate rich-club organization, where high-degree nodes are more densely interconnected (in the case of the topological rich-club) or have higher weights (in the case of the weighted rich-club) than be expected by chance. The statistical significance of the result is assessed by computing a p-value directly from the empirical null distribution of the 1000 randomized networks, ϕrand(k), as a permutation test5. We note that in all our analyses, we estimated the representative group connectome and consequently the node degree using SC-weighted connectomes. Where indicated, FA-weighted connectomes were used in analyses of connectivity weights.


We investigated the topological centrality of rich links using a measure called communicability88, estimated across a range of degree thresholds. The communicability, Cij, between a pair of nodes i and j, is calculated by accounting for all possible paths of length l between the nodes, weighted as 1/l!, so that shorter paths make a stronger contribution to the overall score. The communicability, Cij, for a binary matrix A is formally defined as:

$${C}_{ij}=\mathop{\sum }\limits_{l=0}^{\infty }\frac{{({A}^{l})}_{ij}}{l!}={({e}^{A})}_{ij}.$$

In a weighted network, communicability, \({C}_{ij}^{w}\), is defined using a weighted adjacency matrix W:


where \({S}^{-\frac{1}{2}}\) is the diagonal matrix with elements \(\frac{1}{\sqrt{{s}_{i}}}\) and si is the strength of node i. We estimated the mean binary and weighted communicability for rich links, as a function of the hub-defining threshold k, to evaluate whether rich links are topologically central within the human connectome (Supplementary Fig. 1E, F). An advantage of communicability is that, unlike classic measures of centrality, it does not assume that information is routed exclusively along the shortest paths in the network, which is likely to be an inappropriate assumption for brain networks2,89.

Heritability analysis

The HCP diffusion imaging data set includes 117 pairs of genetically confirmed monozygotic (MZ) twin pairs together with 69 of their non-twin siblings, as well as 60 dizygotic (DZ) same-sex twin pairs and 48 of their non-twin siblings. For each twin pair with more than one non-twin sibling, we selected one sibling at random (demographic details summarized in Table 1). Only twin pairs where both twins had genetically verified zygosity were included in the heritability analysis.

Table 1 Demographic data for twin groups and their non-twin siblings.

Heritability analysis relies on the assumption that both shared genetic factors and common environment contribute to the phenotypic similarity between twins within a pair, whereas unique environmental factors and non-shared genetic effects contribute to the differences observed between them. In the classical twin design, MZ twins are assumed to be genetically identical whereas DZ twins on average share half of their DNA, which is similar to non-twin siblings. Structural equation modeling can thus be used to decompose phenotypic variance and covariance in any particular trait into additive genetic (A), common environmental (C), and unique environmental (E) influences. Considering that twins raised together might have experienced a more similar environment compared to their non-twin siblings, including a set of non-twin siblings into the analysis allows us to separate the common environmental contributions into twin-specific (T) and twin non-specific (C) common environmental factors.

We used the binary group-representative cortical connectome mask described above to extract FA-weighted edges and applied standard structural equation modeling (SEM) to every connection in the connectome using OpenMx software90,91 in R. The analysis reported in the main text was performed on the 360 region36 cortical connectome at 20% density (12,924 unique connections) using FA as a connection weight. The analyses were subsequently reproduced using SC (Supplementary Fig. 3C, D) and at different connectome densities (Supplementary Fig. 4A–C) and using a higher resolution 500-region random cortical parcellation at 5%, 10%, and 15% densities (Supplementary Fig. 4D–F).

A range of biometric models—ACTE, ACE, AE, CE, E—were fitted to each edge defined by the group connectome mask in order to find connection-specific maximum likelihood estimates of additive genetic (A), twin-specific common environmental (T), twin non-specific common environmental (C) and unique environment (E) factors, using age and sex as covariates. Outlying connection weights for each analysis were removed using the boxplot function in R by keeping data points (w) in a range Q1 − 1.5 × IQR < w < Q3 + 1.5 × IQR where Q1 and Q3 are the first and third quartiles respectively and IQR is the interquartile range. The Akaike information criterion (AIC)92 was used to compare the goodness of fit of all tested models in order to find the most parsimonious model. For each edge, the model with the lowest AIC was selected. Consequently, the narrow-sense heritability (the proportion of variance attributable to additive genetic factors, referred to as heritability) was estimated for each connection using the best-fitting model. We also show heritability results using parameter estimates from the full ACTE model to ensure that our findings cannot be explained by our model-selection procedure (Supplementary Fig. 3B) and verify that outlier exclusion did not affect our findings (Supplementary Fig. 5D–F, K). We also replicated the finding when evaluating differences in heritability relative to a null model in which edge labels were permuted (Supplementary Fig. 13). The increase for rich links is also observed for the mean genetic variance, which quantifies the non-normalized variance attributable to genetic factors (Supplementary Fig. 14).

Gene-expression data

We used brain-wide gene-expression data from the Allen Human Brain Atlas (AHBA), which consists of microarray expression measures in 3702 spatially distinct tissue samples taken from six neurotypical postmortem adult brains39. Different brain regions were sampled across each of the six AHBA donors to maximize spatial coverage, resulting in ~400–500 tissue samples in each brain. The samples were distributed across cortical, subcortical, brainstem, and cerebellar regions, measuring the expression levels of 58,692 probes quantifying the transcriptional activity of 20,737 genes. Considering that only two out of six brains were sampled from both left and right hemispheres whereas the other four brains had samples collected only from the left hemisphere, we focused our analyses on the left cortex only.

The pre-processing procedures applied to the data are outlined below and the choices detailed in ref. 40. Briefly, probe-to-gene annotations were first updated using the Re-Annotator toolbox93 resulting in the selection of 45,821 probes corresponding to the total of 20,232 genes. Second, tissue samples annotated to the brainstem and cerebellum were removed. Then, intensity-based filtering40 was applied in order to exclude probes that do not exceed background noise in more than 50% of samples, excluding 13,844 probes corresponding to 4486 unique genes. Afterward, a representative probe for each gene was selected based on the highest correlation to RNA sequencing data in two of the six brains94. Gene-expression samples were assigned to regions-of-interest by generating donor-specific gray matter parcellations and assigning samples located within 2 mm of the parcellation voxels. To increase the accuracy of assigning samples to regions, the samples were first divided into four separate groups based on their location: hemisphere (left/right) and structure assignment (cortex/subcortex), so samples listed as coming from the left cortical hemisphere in the AHBA ontology are only mapped to left cortical voxels of the parcellation (applying a 2 mm distance threshold, almost 90% of all cortical and subcortical samples were assigned to a non-zero voxel of the parcellation). Then, samples assigned to the subcortical regions as well as the right hemisphere were removed. Finally, gene-expression measures within a given brain were normalized first by applying a scaled robust sigmoid normalization for every sample across genes and then for every gene across samples in order to evaluate the relative expression of each gene across regions, while controlling for donor-specific differences in gene expression [see ref. 40 for a validation]. Normalized expression measures in samples assigned to the same region were averaged within each donor brain and aggregated into a region × gene matrix consisting of expression measures for 10,027 genes over 180 (left hemisphere, HCP parcellation) and 250 regions (left hemisphere of the random parcellation), respectively.

Distances between region pairs that were subsequently used to account for the spatial effects on transcriptional coupling were estimated on the cortical surface (pial) using the annotation files for each parcellation mapped onto the spherical representation of the fsaverage cortical surface. First, all the vertices that correspond to a particular region of interest in the spherical representation were identified and their centroid coordinates were calculated. Then the centroid coordinates were mapped to the fsaverage cortical surface and the distances between each pair of regions were calculated using the toolbox fast_marching_toolbox in MATLAB.

Transcriptional coupling

The result of the above mapping of AHBA data was an expression profile for each brain region, quantifying transcriptional activity across 10,027 genes. We used these profiles to quantify transcriptional coupling, or correlated gene expression (CGE), between every pair of regions. We defined CGE as the Pearson correlation between the normalized expression measures of the genes available after pre-processing (n = 10,027). As shown in Supplementary Fig. 6A and described in ref. 40, CGE exhibits a strong spatial autocorrelation that can be approximated as an exponential relationship with separation distance, such that regions located in close proximity to each other share more similar gene expression. To investigate whether CGE differs between different topological classes of connections beyond any low-order spatial effect, we need to ensure that the distance between regions alone is not informative of their CGE. Otherwise, any similarity between region pairs will be driven by a mixture of two factors: the CGE signature that is reflective of the spatial gradient and CGE signature that corresponds to the edge-specific properties. To account for the low-order spatial effect we fitted an exponential function with form r(d) = Aed/n + B. The parameters A = 0.64, B = −0.19 and n = 90.4 capture the trend well, allowing us to retain the residuals for further analysis (Supplementary Fig. 6B), defined as \(\widehat{CG{E}_{ij}}=CG{E}_{ij}-r({d}_{ij})\). These distance-corrected residual CGE values were used in all CGE analyses.

To evaluate transcriptional coupling for different connection types, for every edge within the connectivity matrix, we assigned a distance-corrected CGE measure. At each degree threshold, k, for defining hubs (nodes with degree > k), we then computed the average CGE of rich, feeder, and peripheral links. Significant increases in the CGE for a given link type compared to the rest of the network were evaluated using a one-sided Welch’s t-test (p < 0.05). Note that our CGE analysis focused on examining differences between connected pairs of regions, following prior work7,11, and thus only considers a fraction of the full matrix of CGE values. The development of more accurate diffusion MRI estimates of inter-regional connectivity will facilitate more precise comparisons between the transcriptional properties of connected and unconnected pairs of regions.

Gene contribution score

To determine which functional gene groups contribute the most to any observed differences in CGE across different link types in the brain, we quantified the degree to which each gene contributes to the overall CGE between a pair of regions, following prior work7:

$$\widehat{CG{E}_{ij}}=CG{E}_{ij}-r({d}_{ij})=\frac{1}{N}\mathop{\sum }\limits_{a=1}^{N}[{\widetilde{g}}_{i}^{a}{\widetilde{g}}_{j}^{a}-r({d}_{ij})]=\frac{1}{N}\mathop{\sum }\limits_{a=1}^{N}GC{S}_{ij}^{a},$$

where N is the number of genes (N = 10,027), \({\widetilde{g}}_{i}^{a}{\widetilde{g}}_{j}^{a}\) the product of the z-score normalized expression values for gene a in regions i and j, and r(dij) is the previously defined spatial autocorrelation effect approximated as an exponential line (Supplementary Fig. 6). Therefore, the gene contribution score between a pair of regions i and j for gene a was defined as \(GC{S}_{ij}^{a}={\widetilde{g}}_{i}^{a}{\widetilde{g}}_{j}^{a}-r({d}_{ij})\).

We then assigned each gene a t-statistic quantifying the increase in GCS for rich compared to peripheral links (GCSt-stat), as these two groups constitute the most distinct link types. A high value indicates increased CGE in rich compared to peripheral links. These t-statistic measures were used in the enrichment analyses as gene scores for determining whether any functional gene groups made a stronger contribution to CGE than others.

Cell-specific genes

Given that the AHBA assays gene expression using bulk tissue samples, it is possible that regional variations in cellular architecture drive differences in CGE between different link types. To test this hypothesis, we conducted a second CGE analysis focused on subsets of genes that have previously been identified as cell-specific markers. The set of cell-specific genes was compiled based on data from five different single-cell studies that used postmortem cortical samples of human postnatal subjects. Genes identified in each study as a cell-specific marker or as specifically enriched within a cell type were aggregated into study-specific lists44,45,46,47. In the case of ref. 43, where the normalized gene-expression values were available for each cell type, we identified enriched genes as those with an average Fragments per kilobase million, FPKM > 5 and at least a four-fold enrichment over other cell types, as per authors recommendations. We then assigned genes within each of the resulting study-specific gene lists to one of seven canonical cell classes: astroglia, endothelial cells, excitatory neurons, inhibitory neurons, oligodendrocytes, and oligodendrocyte progenitor cell restricting each cell-class list to only contain genes unique to that class.

Gene-set enrichment analysis using gene score resampling

Gene-set enrichment analyses assess whether any functionally related groups of genes, annotated using Gene Ontology (GO), are associated with a selected phenotype. Every gene in our sample (n = 10,027 genes) was assigned a t-statistic score quantifying its contribution towards the increase in GCS for rich links relative to peripheral (GCSt-stat). Using these scores, we determined which specific functional groups of genes contributed to the observed increase in correlated gene expression. Functional gene group analysis was performed using version 3.2 of ErmineJ software95. Gene ontology annotations were obtained from GEMMA96 as on April 29, 2021. Gene Ontology terms and definitions were automatically downloaded by ErmineJ on April 29, 2021 as go.obo (data version 2021-02-01) and can be downloaded from We performed gene score resampling (GSR) analysis on the GCSt-stat scores testing the biological process GO categories with 5–100 genes available using the mean t-statistic score across genes to summarize the GO category and applying full resampling with 106 iterations. The resulting p-values were corrected across 5892 GO categories, controlling the false discovery rate (FDR) at 0.05 using the method of Benjamini and Hochberg97.

Recent work indicates that the default null models used in such analyses are insufficiently constrained for spatially embedded transcriptomic atlas data50. This problem can lead to inflated significance for some GO categories when testing for spatial correlations between regional variations in gene-expression patterns and measures of brain structure or function. The extent to which this problem generalizes to phenotypes defined for pairs of regions, such as the connectivity metrics considered here, is unclear. We nonetheless suggest caution in interpreting the findings of this analysis, as appropriate null models for the analysis of pairwise phenotypes have not yet been developed. We report the enrichment findings to test for consistency with prior findings in the mouse7.

Microstructural profiles

Our CGE analysis of cell-specific genes indicated that connected hubs have more similar cellular composition than other region pairs. To independently verify this result, we estimated the microstructural profile covariance (MPC) between each pair of regions using the BigBrain atlas, which is a Merker-stained 3D volumetric histological reconstruction of a human brain48,49. MPC was estimated using methods described in ref. 49 ( In brief, the MPC procedure involved constructing 16 equivolumetric surfaces between the pial and white matter boundaries, followed by systematic sampling of the intensity values along these surfaces at 163,842 matched vertices per hemisphere. The intensity profiles, reflecting depth-wise changes in cellular density and soma size, were corrected for the midsurface y-coordinate to account for an anterior–posterior increase in intensity values across the BigBrain related to coronal slicing and reconstruction. Standardized residual intensity profiles were averaged within areas of the HCPMMP1 (n = 360)36 and random (n = 500) parcellations. We quantified cytoarchitectural similarity between cortical areas by correlating areal intensity profiles (covarying for cortex-wide mean intensity profile), thresholding to retain only positive values (r > 0) and applying a log transformation, resulting in the measure of microstructural profile covariance (MPC)49. We repeated the same analysis using the 500-region random parcellation.

Notably, this analysis did not replicate the elevated MPC for rich links seen with the HCPMMP1 atlas (compare Fig. 3H with Supplementary Fig. 9B). This discrepancy likely reflects the fact that the HCPMMP1 parcellation more closely approximates boundaries between functional zones of the cortex, as it is based on a fusion of multimodal imaging data36. The random parcellation makes no attempt to capture such boundaries and may blur different cytoarchitectonic regions within the same network node, thus resulting in noisier MPC estimates. In this way, the MPC results appear to depend on the accurate approximation of cytoarchitectonic boundaries in the cortex.

Models of brain network wiring

To evaluate the role of stochastic processes in shaping connectome architecture, we evaluated a series of generative models of network wiring that have the general form:

$${\theta }_{ij}={d}_{ij}^{\eta }\times {t}_{ij}^{\gamma },$$

where θij is a score that weights the probability of connecting nodes i and j, dij is the Euclidean distance between nodes i and j, and tij is some topological property of nodes i and j or an edge between them. This topological term modulates the probabilities of forming an edge along with wiring cost (operationalized as dij). The exponents η and γ, act as weights on the distance and topological terms, respectively32,33. Numerous topological properties have been evaluated for tij in past work32,33, and we consider these same models here. A summary is provided in Table 2.

Table 2 A list of topological terms, tij, used in the generative models [Eq. (8)].

At each iteration, the computed connection score, θij, is used to calculate the probability of a given edge, (i, j), being formed in that iteration, as Pij = θij/Θ, where Θ is the sum of θij over all edges that have not yet been formed. Thus, at a given iteration, the model calculates the probability of each edge forming based on its distance and the current value of its topological term, tij. This topological value is recalculated at each iteration. Edges are added iteratively until the total number of edges is equal to the number of edges in the empirical connectome. Due to computational burden, and in line with prior analyses33, we fitted models to a single (left) hemisphere connectome defined using the HCPMMP136 parcellation, containing 5025 unique edges (20% whole cortex connectome density).

As per prior work33, we quantified model performance using the Kolmogorov–Smirnov (KS) statistic. The KS statistic quantifies the distances between distributions of key topological statistics of the network; as such, lower values indicate better model fit. We focused on four key metrics: node-level distributions of degree, clustering, and betweenness, and the edge-level distribution of connection distance33. The quality of model fit was defined as the maximum KS value across all four distributions; that is, model performance is defined by the property that is fitted most poorly. In principle, any number of other topological parameters could be used in this objective function, but these are some of the most widely used to characterize brain networks and were employed in prior work evaluating the same models32,33.

We optimize the free parameters η and γ as per the previous work33. Specifically, we randomly sample the parameter space and evaluate the fits of the resulting networks. We consider η values in the range from −4 to 4, allowing both positive and negative contributions for the topological terms, whereas wiring cost is always penalized with γ values ranging from −15 to 0. After sampling 2000 points in this space, Voronoi tessellation is then used to identify areas—or cells—of this space where the parameters produce networks with the best fits, as defined by the KS statistic. We then preferentially sample from cells with better fits. This procedure is repeated four times so that the algorithm gradually converges on an optimum. We ran each generative model on the group connectome 10,000 times and then evaluated each different model by comparing the 100 lowest energy values obtained from the optimization procedure.

For our analysis, we draw a critical distinction between the distribution and sequence of a topological property. The distribution refers to how a property is statistically distributed across the nodes of the network. The sequence refers to the exact assignment of a particular value to individual nodes or edges; in other words, how the property is spatially embedded in the brain. It is possible that two networks may have similar distributions for a given property, but very different underlying sequences.

The models we consider here are optimized to match distributions, not sequences. As we are specifically interested in understanding the mechanisms driving the precise way in which hubs are connected, and given evidence that the specific anatomical location of network hubs has important implications for network dynamics56, we seek to evaluate whether the models can not only generate networks with hubs, which would be shown by the accurate fitting of the degree distribution, but also whether the models yield hubs in the same anatomical regions as the empirical data, which would be shown by the accurate fitting of the degree sequence. To this end, we additionally evaluate the correlation between the degree sequences of the empirical and synthetic networks using the Spearman correlation coefficient, ρ. A high correlation between the model and data implies that hubs are located in the same anatomical regions across the two networks. Conversely, a low correlation indicates that the model does not accurately capture the spatial embedding of connectivity in the connectome and that hubs in the model network reside in anatomical locations that differ from the actual connectome.

An important consideration is that the correlation between model and empirical degree sequences was not a part of the objective function used in model fitting. We fitted the models using topological distributions and then evaluated their performance in capturing the empirical degree sequence. This procedure allows us to examine how well these models, as traditionally implemented, capture spatial properties of hub connectivity. However, this procedure also raises the question of whether it is possible to obtain a higher degree sequence correlation if model parameters are chosen to optimize this specific quantity. We, therefore, repeated the analysis after replacing the objective function with one that maximized the similarity between model and empirical degree sequences. Specifically, we optimized the Spearman correlation between model and empirical degree sequences with no other constraints to give the models the best possible chance of reproducing the empirically observed spatial topography of hub regions. The results of this analysis are shown in Supplementary Fig. 12. Qualitatively similar results were obtained when optimizing the Pearson correlation between model and empirical degree sequences.

Across the 13 models evaluated in our analysis, we find that the best-fitting model is the “deg-avg” model, which involves a trade-off between forming connections between highly connected nodes (i.e., node pairs with high average degree) and penalizing long-range connections (i.e., minimizing wiring cost). This result differs from past work, in which a homophilic attachment trade-off model that balances wiring cost with a preference for forming connections between nodes with similar neighbors offered the best fit to empirical connectome data32,33. This discrepancy may be related to our use of a higher resolution network parcellation, a connectome mapped at a different connection density, using a different tractography algorithm, and/or a different diffusion MRI processing pipeline. Investigating the effect of these factors on modeling results is an important direction of future work, but these potential effects do not change the substantive point of our results that current stochastic models do a poor job of reproducing the spatial embedding of hub connectivity, as across 260,000 runs of the 13 different models considered, no degree sequence correlation exceeded ρ = 0.3. We also note that our model fits (Fig. 4B–E) are in the same range as those reported by Betzel et al.33, indicating that our discrepant results are not due to differences in model accuracy.

Models incorporating transcriptional information are constructed using the same general form (Eq. (8)) while replacing one of the terms with pairwise distance-corrected CGE values gij, weighted using a parameter λ varying in the range [0, 200] that favors forming connections between regions with higher CGE.

Whereas our analysis mimics prior comprehensive evaluations of generative network models based on cost-value trade-offs32,33, other formulations and approaches are also possible98,99,100. It is also possible to define 3-parameter or higher-order models that incorporate spatial, topological, and genetic constraints, but we have focused on the simpler 2-parameter form here to enable efficient optimization and fair model comparison. It is also possible that alternative topological properties and non-genetic wiring rules may yield improved model performance, and so we cannot completely rule out a role for stochastic processes, or a more complex interaction between CGE, wiring cost, and topology, in shaping hub connectivity. Indeed, more abstract models suggest that stochastic wiring, acting in concert with developmental changes in brain geometry and heterogeneous timing of connection formation across regions, can indeed generate networks with brain-like properties34,59,101. However, a framework for directly fitting such models to human DWI data has not yet been developed.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

Structural brain networks were mapped from Human Connectome Project data35, which are freely available ( Gene-expression data were acquired from the Allen Human Brain Atlas, which is also freely available ( Microstructural profiles were derived from the freely accessible BigBrain Project ( Gene ontology annotations were obtained from GEMMA96 as on April 29, 2021. Gene Ontology terms and definitions were automatically downloaded by ErmineJ on April 29, 2021 as go.obo (data version 2021-02-01) and can be downloaded from Data necessary to generate the figures in this work are available at the associated repository data are provided with this paper.

Code availability

Functions to compute graph-theoretic measures are available as part of the Brain Connectivity Toolbox ( Custom MATLAB code to generate the figures in this work is available at


  1. 1.

    Bullmore, E. & Sporns, O. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10, 186 (2009).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  2. 2.

    Fornito, A., Zalesky, A. & Bullmore, E. Fundamentals of Brain Network Analysis (Academic Press, 2016).

  3. 3.

    Harriger, L., van den Heuvel, M. P. & Sporns, O. Rich club organization of macaque cerebral cortex and its role in network communication. PLoS ONE 7, e46497 (2012).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  4. 4.

    Towlson, E. K., Vértes, P. E., Ahnert, S. E., Schafer, W. R. & Bullmore, E. T. The rich club of the C. elegans neuronal connectome. J. Neurosci. 33, 6380 (2013).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  5. 5.

    van den Heuvel, M. P. & Sporns, O. Rich-club organization of the human connectome. J. Neurosci. 31, 15775 (2011).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  6. 6.

    van den Heuvel, M. P. & Sporns, O. Network hubs in the human brain. Trends Cogn. Sci. 17, 683 (2013).

    PubMed  Article  PubMed Central  Google Scholar 

  7. 7.

    Fulcher, B. D. & Fornito, A. A transcriptional signature of hub connectivity in the mouse connectome. Proc. Natl. Acad. Sci USA 113, 1513302113 (2016).

    Article  CAS  Google Scholar 

  8. 8.

    Mišić, B. et al. Network-level structure-function relationships in numan neocortex. Cereb. Cortex 26, 3285 (2016).

    PubMed  PubMed Central  Article  Google Scholar 

  9. 9.

    Buckner, R. L. et al. Cortical hubs revealed by intrinsic functional connectivity: mapping, assessment of stability, and relation to Alzheimeras disease. J. Neurosci. 29, 1860 (2009).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  10. 10.

    Bullmore, E. & Sporns, O. The economy of brain network organization. Nat. Rev. Neurosci. 13, 336 (2012).

    CAS  Article  Google Scholar 

  11. 11.

    Arnatkevičiūtė, A., Fulcher, B. D., Pocock, R. & Fornito, A. Hub connectivity, neuronal diversity, and gene expression in the Caenorhabditis elegans connectome. PLoS Comput. Biol. 14, e1005989 (2018).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  12. 12.

    Mueller, S. et al. Individual variability in functional connectivity architecture of the human brain. Neuron 77, 586 (2013).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  13. 13.

    Jung, R. E. & Haier, R. J. The parieto-frontal integration theory (P-FIT) of intelligence: converging neuroimaging evidence. Behav. Brain Sci. 30, 135 (2007).

    PubMed  Article  PubMed Central  Google Scholar 

  14. 14.

    Crossley, N. A. et al. The hubs of the human connectome are generally implicated in the anatomy of brain disorders. Brain 137, 2382 (2014).

    PubMed  PubMed Central  Article  Google Scholar 

  15. 15.

    Reardon, P. K. et al. Normative brain size variation and brain shape diversity in humans. Science 360, 1222 (2018).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  16. 16.

    Hill, J. et al. Similar patterns of cortical expansion during human development and evolution. Proc. Natl. Acad. Sci. USA 107, 13135 (2010).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  17. 17.

    Ardesch, D. J. et al. Evolutionary expansion of connectivity between multimodal association areas in the human brain compared with chimpanzees. Proc. Natl. Acad. Sci. USA 116, 7101 (2019).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  18. 18.

    Buckner, R. L. & Krienen, F. M. The evolution of distributed association networks in the human brain. Trends Cogn. Sci. 17, 648 (2013).

    PubMed  Article  Google Scholar 

  19. 19.

    Thompson, P. M. Genetic influences on brain structure. Nat. Neurosci. 4, 1253 (2001).

    CAS  PubMed  Article  Google Scholar 

  20. 20.

    Yamamoto, N., Tamada, A. & Murakami, F. Wiring of the brain by a range of guidance cues. Progress Neurobiol. 68, 393 (2003).

    Article  Google Scholar 

  21. 21.

    Kolodkin, A. L. & Tessier-Lavigne, M. Mechanisms and molecules of neuronal wiring: a primer. Cold Spring Harb. Perspect. Biol. 3, a001727 (2011).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  22. 22.

    Fornito, A. et al. Genetic influences on cost-efficient organization of human cortical functional networks. J. Neurosci. 31, 3261 (2011).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  23. 23.

    Vértes, P. E. et al. Gene transcription profiles associated with inter-modular hubs and connection distance in human functional magnetic resonance imaging networks. Philos. Trans. R. Soc. B 371, 735 (2016).

    Article  CAS  Google Scholar 

  24. 24.

    Fornito, A., Arnatkevičiūtė, A. & Fulcher, B. D. Bridging the gap between connectome and transcriptome. Trends Cogn. Sci. 23, 34 (2019).

    PubMed  Article  Google Scholar 

  25. 25.

    Arnatkevičiūtė, A., Fulcher, B. D. & Fornito, A. Uncovering the transcriptional correlates of hub connectivity in neural networks. Front. Neural Circuits 13, 47 (2019).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  26. 26.

    Baker, S. T. E. et al. Developmental changes in brain network hub connectivity in late adolescence. J. Neurosci. 35, 9078 (2015).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  27. 27.

    Oldham, S. & Fornito, A. The development of brain network hubs. Dev. Cogn. Neurosci. 36, 100607 (2018).

    PubMed  PubMed Central  Article  Google Scholar 

  28. 28.

    Ercsey-Ravasz, M. et al. A predictive network model of cerebral cortical connectivity based on a distance rule. Neuron 80, 184 (2013).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  29. 29.

    Song, H. F., Kennedy, H. & Wang, X.-J. Spatial embedding of structural similarity in the cerebral cortex. Proc. Natl. Acad. Sci. USA 111, 16580 (2014).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  30. 30.

    Henderson, J. & Robinson, P. Relations between the geometry of cortical gyrification and white-matter network architecture. Brain Connect. 4, 112 (2014).

    PubMed  Article  PubMed Central  Google Scholar 

  31. 31.

    Horvát, S. et al. Spatial embedding and wiring cost constrain the functional layout of the cortical network of rodents and primates. PLoS Biol. 14, e1002512 (2016).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  32. 32.

    Vértes, P. E. et al. Simple models of human brain functional networks. Proc. Natl. Acad. Sci. USA 109, 5868 (2012).

    ADS  PubMed  PubMed Central  Article  Google Scholar 

  33. 33.

    Betzel, R. F. et al. Generative models of the human connectome. Neuroimage 124, 1054 (2016).

    Article  Google Scholar 

  34. 34.

    Goulas, A., Betzel, R. F. & Hilgetag, C. C. Spatiotemporal ontogeny of brain wiring. Sci. Adv. 5, eaav9694 (2019).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  35. 35.

    Van Essen, D. C. et al. The WU-Minn Human Connectome Project: an overview. Neuroimage 80, 62 (2013).

    PubMed  PubMed Central  Article  Google Scholar 

  36. 36.

    Glasser, M. F. et al. A multi-modal parcellation of human cerebral cortex. Nature 536, 171 (2016).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  37. 37.

    van den Heuvel, M. P., Kahn, R. S., Goni, J. & Sporns, O. High-cost, high-capacity backbone for global brain communication. Proc. Natl. Acad. Sci. USA 109, 11372 (2012).

    ADS  PubMed  PubMed Central  Article  Google Scholar 

  38. 38.

    Ji, J. L. et al. Mapping the human brainas cortical-subcortical functional network organization. NeuroImage 185, 35 (2019).

    PubMed  Article  Google Scholar 

  39. 39.

    Hawrylycz, M. J. et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature 489, 391 (2012).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  40. 40.

    Arnatkevičiūtė, A., Fulcher, B. D. & Fornito, A. A practical guide to linking brain-wide gene expression and neuroimaging data. Neuroimage 189, 353–367 (2019).

    Article  Google Scholar 

  41. 41.

    Lau, H. Y. G., Fornito, A. & Fulcher, B. D. Scaling of gene transcriptional gradients with brain size across mouse development. Neuroimage 224, 117395 (2021).

  42. 42.

    Burt, J. B. et al. Hierarchy of transcriptomic specialization across human cortex captured by structural neuroimaging topography. Nat. Neurosci. 21, 1251 (2018).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  43. 43.

    Zhang, Y. et al. Purification and characterization of progenitor and mature human astrocytes reveals transcriptional and functional differences with mouse. Neuron 89, 37 (2016).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  44. 44.

    Li, M. et al. Integrative functional genomic analysis of human brain development and neuropsychiatric risks. Science 362, eaat7615 (2018).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  45. 45.

    Lake, B. B. et al. Integrative single-cell analysis of transcriptional and epigenetic states in the human adult brain. Nat. Biotechtol. 36, 70 (2018).

    CAS  Article  Google Scholar 

  46. 46.

    Darmanis, S. et al. A survey of human brain transcriptome diversity at the single cell level. Proc. Natl. Acad. Sci. USA 112, 7285 (2015).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  47. 47.

    Habib, N. et al. Massively parallel single-nucleus RNA-seq with DroNc-seq. Nat. Methods 14, 955 (2017).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  48. 48.

    Amunts, K. et al. Bigbrain: an ultrahigh-resolution 3D human brain model. Science 340, 1472 (2013).

    ADS  CAS  PubMed  Article  PubMed Central  Google Scholar 

  49. 49.

    Paquola, C. et al. Microstructural and functional gradients are increasingly dissociated in transmodal cortices. PLoS Biol. 17, e3000284 (2019).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  50. 50.

    Fulcher, B. D., Arnatkeviciute, A. & Fornito, A. Overcoming false-positive gene-category enrichment in the analysis of spatially resolved transcriptomic brain atlas data. Nat. Commun. 12, 2669 (2021).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  51. 51.

    Ramón y Cajal, S. Histology of the Nervous System of Man and Vertebrates (Oxford University Press, 1995).

  52. 52.

    Wei, Y. et al. Genetic mapping and evolutionary analysis of human-expanded cognitive networks. Nat. Commun. 10, 4839 (2019).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  53. 53.

    Henderson, J. A. & Robinson, P. A. Geometric effects on complex network structure in the cortex. Phys. Rev. Lett. 107, 018102 (2011).

    ADS  CAS  PubMed  Article  PubMed Central  Google Scholar 

  54. 54.

    Bertolero, M. A. et al. The human brain’s network architecture is genetically encoded by modular pleiotropy. Preprint at (2019).

  55. 55.

    Barbas, H. General cortical and special prefrontal connections: principles from structure to function. Annu. Rev. Neurosci. 38, 269 (2015).

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  56. 56.

    Gollo, L. L. et al. Fragility and volatility of structural hubs in the human connectome. Nat. Neurosci. 21, 1107 (2018).

    CAS  Article  Google Scholar 

  57. 57.

    Bassett, D. S. et al. Efficient physical embedding of topologically complex information processing networks in brains and computer circuits. PLoS Comput. Biol. 6, e1000748 (2010).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  58. 58.

    Beul, S. F., Barbas, H. & Hilgetag, C. C. A predictive structural model of the primate connectome. Sci. Rep. 7, 43176 (2017).

    ADS  PubMed  PubMed Central  Article  Google Scholar 

  59. 59.

    Goulas, A., Majka, P., Rosa, M. G. P. & Hilgetag, C. C. A blueprint of mammalian cortical connectomes. PLoS Biol. 17, e2005346 (2019).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  60. 60.

    Glasser, M. F. et al. The minimal preprocessing pipelines for the Human Connectome Project. Neuroimage 80, 105 (2013).

    PubMed  Article  PubMed Central  Google Scholar 

  61. 61.

    Anderson, J., Skare, S. & Ashburner, J. How to correct susceptibility distortions in spin-echo echo-planar images: application to diffusion tensor imaging. Neuroimage 20, 870 (2003).

    Article  Google Scholar 

  62. 62.

    Smith, S. et al. Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage 23, S208 (2004).

    Article  Google Scholar 

  63. 63.

    Andersson, J. & Sotiropoulos, S. An integrated approach to correction for off-resonance effects and subject movement in diffusion MR imaging. Neuroimage 125, 1063 (2016).

    Article  Google Scholar 

  64. 64.

    Andersson, J., Graham, M., Zsoldos, E. & Sotiropoulos, S. Incorporating outlier detection and replacement into a non-parametric framework for movement and distortion correction of diffusion MR images. Neuroimage 141, 556 (2016).

    PubMed  Article  PubMed Central  Google Scholar 

  65. 65.

    Andersson, J. et al. Towards a comprehensive framework for movement and distortion correction of diffusion MR images: within volume movement. Neuroimage 152, 450 (2017).

    PubMed  Article  PubMed Central  Google Scholar 

  66. 66.

    Oldham, S. et al. The efficacy of different preprocessing steps in reducing motion-related confounds in diffusion MRI connectomics. NeuroImage. 222, 117252 (2020).

    PubMed  Article  Google Scholar 

  67. 67.

    Zhang, Y., Brady, M. & Smith, S. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans. Med. Imaging 20, 45 (2017).

    Article  Google Scholar 

  68. 68.

    Notter, M. et al. Parcellation fragmenter. (2018).

  69. 69.

    Tournier, J.-D., Calamante, F. & Connelly, A. MRtrix: diffusion tractography in crossing fiber regions. Int. J. Imaging Syst. Technol. 22, 53 (2012).

    Article  Google Scholar 

  70. 70.

    Jenkinson, M., Beckmann, C. F., Behrens, T. E., Woolrich, M. W. & Smith, S. M. FSL. Neuroimage 62, 782 (2012).

    PubMed  Article  PubMed Central  Google Scholar 

  71. 71.

    Tournier, J.-D., Calamante, F. & Connelly, A. Improved probabilistic streamlines tractography by 2 nd order integration over fibre orientation distributions. Ismrm 88, 2010 (2010).

    Google Scholar 

  72. 72.

    Smith, R. E., Tournier, J.-D., Calamante, F. & Connelly, A. Anatomically-constrained tractography: improved diffusion MRI streamlines tractography through effective use of anatomical information. Neuroimage 62, 1924 (2012).

    PubMed  Article  PubMed Central  Google Scholar 

  73. 73.

    Smith, R. E., Tournier, J.-D., Calamante, F. & Connelly, A. SIFT2: Enabling dense quantitative assessment of brain white matter connectivity using streamlines tractography. Neuroimage 119, 338 (2015).

    PubMed  Article  PubMed Central  Google Scholar 

  74. 74.

    Jones, D. K. Challenges and limitations of quantifying brain connectivity in vivo with diffusion MRI. Imaging Med. 2, 341 (2010).

    Article  Google Scholar 

  75. 75.

    Zalesky, A. et al. Connectome sensitivity or specificity: which is more important? Neuroimage 142, 407 (2016).

    PubMed  Article  PubMed Central  Google Scholar 

  76. 76.

    Thomas, C. et al. Anatomical accuracy of brain connections derived from diffusion MRI tractography is inherently limited. Proc. Natl. Acad. Sci. USA 111, 16574 (2014).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  77. 77.

    Maier-Hein, K. H., Neher, P. F., Houde, J.-C., Cote, M.-A. & Descoteaux, M. The challenge of mapping the human connectome based on diffusion tractography. Nat. Commun. 8, 1349 (2017).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  78. 78.

    Zalesky, A. & Fornito, A. A dti-derived measure of cortico-cortical connectivity. IEEE Trans. Med. Imaging 28, 1023 (2009).

    PubMed  Article  Google Scholar 

  79. 79.

    de Reus, M. A. & van den Heuvel, M. P. Estimating false positives and negatives in brain networks. Neuroimage 70, 402 (2013).

    PubMed  Article  PubMed Central  Google Scholar 

  80. 80.

    Betzel, R. F., Griffa, A., Hagmann, P. & Mišić, B. Distance-dependent consensus thresholds for generating group-representative structural brain networks. Proc. Natl. Acad. Sci. USA 3, 475 (2018).

    Google Scholar 

  81. 81.

    Roberts, J. A., Perry, A., Roberts, G., Mitchell, P. B. & Breakspear, M. Consistency-based thresholding of the human connectome. Neuroimage 145, 118 (2016).

    PubMed  Article  Google Scholar 

  82. 82.

    Colizza, V., Flammini, A., Serrano, M. A. & Vespignani, A. Detecting rich-club ordering in complex networks. Nat. Phys. 2, 110 (2006).

    CAS  Article  Google Scholar 

  83. 83.

    Rubinov, M. & Sporns, O. Complex network measures of brain connectivity: uses and interpretations. Neuroimage 52, 1059 (2010).

    Article  Google Scholar 

  84. 84.

    Roberts, J. A. et al. The contribution of geometry to the human connectome. Neuroimage 124, 379 (2016).

    PubMed  Article  Google Scholar 

  85. 85.

    Betzel, R. F. & Bassett, D. S. Specificity and robustness of long-distance connections in weighted, interareal connectomes. Proc. Natl. Acad. Sci. USA 115, E4880 (2018).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  86. 86.

    Opsahl, T., Colizza, V., Panzarasa, P. & Ramasco, J. J. Prominence and control: the weighted rich-club effect. Phys. Rev. Lett. 101, 168702 (2008).

    ADS  PubMed  Article  CAS  PubMed Central  Google Scholar 

  87. 87.

    Alstott, J., Panzarasa, P., Rubinov, M., Bullmore, E. T. & Vértes, P. E. A unifying framework for measuring weighted rich clubs. Sci. Rep. 4, 1 (2014).

    Google Scholar 

  88. 88.

    Estrada, E. & Hatano, N. Communicability in complex networks. Phys. Rev. E 77, 036111 (2008).

    ADS  MathSciNet  Article  CAS  Google Scholar 

  89. 89.

    Goñi, J. et al. Resting-brain functional connectivity predicted by analytic measures of network communication. Proc. Natl. Acad. Sci. USA 111, 833 (2014).

    ADS  PubMed  Article  CAS  PubMed Central  Google Scholar 

  90. 90.

    Boker, S. et al. OpenMx: an open source extended structural equation modeling framework. Psychometrika 76, 306 (2011).

    MathSciNet  PubMed  PubMed Central  MATH  Article  Google Scholar 

  91. 91.

    Neale, M. C. et al. OpenMx 2.0: extended structural equation and statistical modeling. Psychometrika 81, 535 (2016).

    MathSciNet  PubMed  MATH  Article  Google Scholar 

  92. 92.

    Akaike, H. Information Theory and an Extension of the Maximum Likelihood Principle (Springer, 1998).

  93. 93.

    Arloth, J., Bader, D. M., Röh, S. & Altmann, A. Re-Annotator: annotation pipeline for microarray probe sequences. PLoS ONE 10, e0139516 (2015).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  94. 94.

    Miller, J. A. et al. Improving reliability and absolute quantification of human brain microarray data by filtering and scaling probes using RNA-Seq. BMC Genomics 15, 154 (2014).

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  95. 95.

    Gillis, J., Mistry, M. & Pavlidis, P. Gene function analysis in complex data sets using ErmineJ. Nat. Protoc. 5, 1148 (2010).

    CAS  PubMed  Article  Google Scholar 

  96. 96.

    Zoubarev, A. et al. Gemma: a resource for the reuse, sharing and meta-analysis of expression profiling data. Bioinformatics 28, 2272 (2012).

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  97. 97.

    Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57, 289 (1995).

    MathSciNet  MATH  Google Scholar 

  98. 98.

    Rubinov, M. Constraints and spandrels of interareal connectomes. Nat. Commun. 7, 13812 (2016).

    ADS  CAS  PubMed  PubMed Central  Article  Google Scholar 

  99. 99.

    Chen, Y., Wang, S., Hilgetag, C. C. & Zhou, C. Features of spatial and functional segregation and integration of the primate connectome revealed by trade-off between wiring cost and efficiency. PLoS Comput. Biol. 13, 1 (2017).

    Google Scholar 

  100. 100.

    Chen, Y., Wang, S., Hilgetag, C. C. & Zhou, C. Trade-off between multiple constraints enables simultaneous formation of modules and hubs in neural systems. PLoS Comput. Biol. 9, 1 (2013).

    Article  Google Scholar 

  101. 101.

    Beul, S. F., Goulas, A. & Hilgetag, C. C. Comprehensive computational modelling of the development of mammalian cortical connectivity underlying an architectonic type principle. PLoS Comput. Biol. 14, e1006550 (2018).

    ADS  PubMed  PubMed Central  Article  CAS  Google Scholar 

  102. 102.

    Arnatkeviciute, A. Data files to support reproducing analyses in “Genetic influences on hub connectivity of the human connectome” [Data set]. (Version v2). (2021).

  103. 103.

    Arnatkeviciute, A. Reproducing figures for “Genetic influences on hub connectivity of the human connectome” (Version v1). (2021).

Download references


We would like to thank Richard Betzel for sharing the optimization code for the generative modeling. This work was supported by the MASSIVE HPC facility ( A.F. was supported by the Sylvia and Charles Viertel Charitable Foundation. Data were provided [in part] by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University.

Author information




A.A., B.D.F., and A.F. designed the research; A.A., B.D.F., S.O., J.T., C.P., Z.G., K.A., Z.H., B.J., G.B., M.K., G.D., B.F., M.A.B., and A.F. contributed data and analytic tools; A.A. and S.O. analyzed the data; A.A. and A.F. wrote the paper; B.D.F., S.O., J.T., C.P., Z.G., K.A., Z.H., B.J., G.B., M.K., G.D., B.F., and M.A.B provided feedback on the drafts of the paper. A.F. supervised the work.

Corresponding author

Correspondence to Aurina Arnatkeviciute.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review informationNature Communications thanks the anonymous reviewers for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Source data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Arnatkeviciute, A., Fulcher, B.D., Oldham, S. et al. Genetic influences on hub connectivity of the human connectome. Nat Commun 12, 4237 (2021).

Download citation


By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.


Quick links

Nature Briefing

Sign up for the Nature Briefing newsletter — what matters in science, free to your inbox daily.

Get the most important science stories of the day, free in your inbox. Sign up for Nature Briefing