Weighted Stochastic Block Models of the Human Connectome across the Life Span

The human brain can be described as a complex network of anatomical connections between distinct areas, referred to as the human connectome. Fundamental characteristics of connectome organization can be revealed using the tools of network science and graph theory. Of particular interest is the network’s community structure, commonly identified by modularity maximization, where communities are conceptualized as densely intra-connected and sparsely inter-connected. Here we adopt a generative modeling approach called weighted stochastic block models (WSBM) that can describe a wider range of community structure topologies by explicitly considering patterned interactions between communities. We apply this method to the study of changes in the human connectome that occur across the life span (between 6 and 85 years old). We find that WSBM communities exhibit greater hemispheric symmetry and are spatially less compact than those derived from modularity maximization. We identify several network blocks that exhibit significant linear and non-linear changes across age, with the most significant changes involving subregions of prefrontal cortex. Overall, we show that the WSBM generative modeling approach can be an effective tool for describing types of community structure in brain networks that go beyond modularity.

and are the number of nodes in communities and . The mean within-community strength for community would be: ℎ = and the between-community for community would be

Participation coefficient
The participation coefficient of node $i$ ($P_i$) is a measure of how well-connected node $i$ is to its own community versus other communities. If a node is uniformly connected to all communities, the participation coefficient will be near 1. If a node is only connected to nodes of the same community, its participation coefficient will be 0. See [3], equation 4. It is defined as:
$$P_i = 1 - \sum_{c \in Z} \left( \frac{k_i(c)}{k_i} \right)^2$$
where $Z$ is the set of communities, $k_i$ is the weighted degree for node $i$, and $k_i(c)$ is the weighted degree between $i$ and all nodes in community $c$.
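As a rough illustration, the participation coefficient can be computed in a few lines of NumPy, assuming the standard form $P_i = 1 - \sum_c (k_i(c)/k_i)^2$. The function name and the handling of zero-degree nodes (set to 0) are assumptions made for this sketch and are not taken from the analysis code.

```python
import numpy as np

def participation_coefficient(W, labels):
    """P_i = 1 - sum_c (k_i(c) / k_i)^2 for a symmetric weighted matrix W."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    k = W.sum(axis=1)                          # weighted degree k_i
    P = np.ones(len(W))
    for c in np.unique(labels):
        k_c = W[:, labels == c].sum(axis=1)    # weighted degree of i into community c
        ratio = np.divide(k_c, k, out=np.zeros_like(k), where=k > 0)
        P -= ratio ** 2
    P[k == 0] = 0.0                            # assumption: isolated nodes get P = 0
    return P

# Toy chain 0-1-2-3 split into two communities (hypothetical data).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print(participation_coefficient(W, [0, 0, 1, 1]))
```

In this toy chain, the two end nodes connect only within their own community, while the two middle nodes split their weight evenly between both communities.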

Community assortativity
Community assortativity [4] for a given community compares the within-community strength, which we can think of as the on-diagonal community strength, with the maximum between-community strength, which we can think of as the maximum off-diagonal community strength.
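For illustration only, one way to express this comparison in Python is sketched below. It rests on two assumptions that this section does not confirm: block strength is taken as the mean edge weight of a block (excluding self-connections on the diagonal blocks), and "compares" is read as a difference, so that negative values would indicate a disassortative community. The function names are hypothetical.

```python
import numpy as np

def block_mean_strength(W, labels):
    """Mean edge weight within and between communities (assumed definition)."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    comms = np.unique(labels)
    B = np.zeros((len(comms), len(comms)))
    for a, r in enumerate(comms):
        for b, s in enumerate(comms):
            block = W[np.ix_(labels == r, labels == s)]
            if a == b:
                n = block.shape[0]
                # On-diagonal block: average over n*(n-1) ordered pairs, no self-loops.
                B[a, b] = (block.sum() - np.trace(block)) / (n * (n - 1)) if n > 1 else 0.0
            else:
                B[a, b] = block.mean()
    return comms, B

def community_assortativity(W, labels):
    """On-diagonal strength minus the largest off-diagonal strength (assumed form)."""
    _, B = block_mean_strength(W, labels)
    off_diagonal = B.copy()
    np.fill_diagonal(off_diagonal, -np.inf)    # ignore the diagonal when taking the max
    return np.diag(B) - off_diagonal.max(axis=1)
```

The sketch requires at least two communities; a ratio-based comparison would need only the last line changed.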

Nodal assortativity
The nodal assortativity [4] for node $i$ compares the node's weighted connectivity to its assigned community with its maximum weighted connectivity to any other community. Given node $i$'s community assignment, its weighted connection density to each community is computed first, and the nodal assortativity is then defined from the density to the assigned community and the maximum density to the other communities.
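A minimal Python sketch of this node-level comparison follows. It assumes that the weighted connection density of a node to a community is the summed edge weight divided by the number of possible partners in that community (excluding the node itself), and that the comparison is a difference. Both choices, and the function name, are assumptions of this sketch.

```python
import numpy as np

def nodal_assortativity(W, labels):
    """Density to own community minus max density to any other community (assumed form)."""
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)                   # ignore self-connections
    labels = np.asarray(labels)
    comms = np.unique(labels)
    n = len(labels)
    density = np.zeros((n, len(comms)))
    for b, c in enumerate(comms):
        members = labels == c
        # Possible partners in c: all members, minus the node itself if it belongs to c.
        n_partners = members.sum() - members.astype(int)
        density[:, b] = np.divide(W[:, members].sum(axis=1), n_partners,
                                  out=np.zeros(n), where=n_partners > 0)
    own = np.searchsorted(comms, labels)       # column index of each node's community
    rows = np.arange(n)
    own_density = density[rows, own]
    others = density.copy()
    others[rows, own] = -np.inf                # mask own community before taking the max
    return own_density - others.max(axis=1)
```

Under these assumptions, a node whose strongest density points outside its own community receives a negative score.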

Within-module (community) z-score
The within-module z-score [5] for node $i$ ($z_i$) is a measure of a node's relative within-community connectivity, given the within-community connectivity of all the nodes of the same community. It is defined as:
$$z_i = \frac{k_i(c_i) - \bar{k}(c_i)}{\sigma(c_i)}$$
where $c_i$ is the community containing node $i$, $k_i(c_i)$ is the within-community weighted degree of node $i$, and $\bar{k}(c_i)$ and $\sigma(c_i)$ are the mean and standard deviation of the within-community weighted degree distribution.
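The z-score can be sketched in NumPy as follows, assuming the usual form (within-community weighted degree, centered and scaled by the mean and standard deviation over the node's own community). Using the population standard deviation and returning 0 for communities with no spread are assumptions of this sketch.

```python
import numpy as np

def within_module_zscore(W, labels):
    """z_i = (k_i(c_i) - mean) / sd, computed separately within each community."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    z = np.zeros(len(W))
    for c in np.unique(labels):
        members = labels == c
        # Within-community weighted degree: only edges to nodes of the same community.
        k_within = W[np.ix_(members, members)].sum(axis=1)
        sd = k_within.std()                    # population SD (assumption)
        if sd > 0:
            z[members] = (k_within - k_within.mean()) / sd
    return z
```

Edges that leave a community do not enter the calculation, so a node with many between-community connections can still have a low z-score.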

Nodal versatility
The nodal versatility [6] for node $i$ ($V_i$) is an index of how readily a node is assigned a community with the same neighboring nodes. It is defined in terms of the pairwise membership probability $p_{ij}$, or in other words, the probability of nodes $i$ and $j$ resulting in the same community:
$$p_{ij} = \frac{1}{R} \sum_{r=1}^{R} a_r(i,j)$$
where $R$ is the number of repetitions of a community detection algorithm and $a_r$ is the agreement matrix, where:
$$a_r(i,j) = \begin{cases} 1, & \text{if } c_i = c_j \text{ for the } r\text{th repetition} \\ 0, & \text{else} \end{cases}$$
then the nodal versatility for node $i$ is:
$$V_i = \frac{\sum_j \sin(\pi p_{ij})}{\sum_j p_{ij}}$$
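The steps above can be sketched in Python, assuming the sine-based form of versatility, $V_i = \sum_j \sin(\pi p_{ij}) / \sum_j p_{ij}$, with $p_{ij}$ the fraction of repetitions in which nodes $i$ and $j$ share a community. The function names and the `(R, N)` input layout are choices made for this illustration.

```python
import numpy as np

def pairwise_membership_probability(partitions):
    """partitions: (R, N) array of community labels from R repetitions."""
    partitions = np.asarray(partitions)
    # Agreement a_r(i, j) = 1 when nodes i and j share a label in repetition r.
    agreement = partitions[:, :, None] == partitions[:, None, :]
    return agreement.mean(axis=0)

def nodal_versatility(partitions):
    """V_i = sum_j sin(pi * p_ij) / sum_j p_ij."""
    p = pairwise_membership_probability(partitions)
    # p_ii is always 1, so the denominator is never zero.
    return np.sin(np.pi * p).sum(axis=1) / p.sum(axis=1)

# Two repetitions on three nodes; node 1 switches community (hypothetical data).
print(nodal_versatility([[0, 0, 1],
                         [0, 1, 1]]))
```

Pairs that are always or never co-assigned contribute nothing to the numerator, so versatility is highest for nodes whose co-assignments hover around a probability of 0.5.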

Consensus convergence
Consensus convergence [7] is an index of how consistently nodes are assigned the same community membership, given a pairwise membership probability matrix. Its definition is expressed in terms of the set of nonzero entries in the agreement matrix and the size of that set.

Further workflow validation methods and results
For the generative model evaluation framework, we sought to show how the WSBM consensus model could effectively generate synthetic adjacency matrices. Given that the consensus community structure models were fit to the young adult representative matrix, there is a possibility that the WSBM model could be overfit to the input data. To measure the generalizability of the model, we measured the energy between synthetic networks generated from our consensus models and each adjacency matrix in our sample. Specifically, for each subject we measured the mean KS energy over 5000 iterations for each consensus model (WSBM and modular). We found that the WSBM consensus model produced lower energies (M = 0.4, SD = 0.013) on average than the modular consensus model (M = 0.42, SD = 0.014; t(1.24 × 10^3) = -27.28, p < 10^-9).
When evaluating the symmetry of our consensus partitions, we wanted to ensure that the symmetry demonstrated by the WSBM is not merely a result of the consensus fit procedure. Therefore, we repeated this hemispheric symmetry analysis 100 times, switching out the consensus WSBM partition for one of the 100 WSBM models of our fitting workflow (the models represented by Figure 2, panel b). These are models fit to the representative adjacency matrix at a given k, but not modified by consensus information. In 97 of the 100 WSBMs tested, the partition produced a distribution of hemispheric KS scores (across 620 subjects) significantly lower (Wilcoxon rank sum test; p < 10^-9; z-value range for significant comparisons: -25.87 to -8.95) than the distribution of hemispheric KS scores (across 620 subjects) derived from the modular partition.
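As a hedged illustration of a KS-based energy, the sketch below computes the two-sample Kolmogorov-Smirnov statistic for several node-level statistics and takes the largest one as the energy between an observed and a synthetic network. The choice of statistics and the use of the maximum are assumptions of this sketch; this section does not restate the definition used in the main analysis. The dictionary keys are hypothetical.

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    grid = np.concatenate([x, y])              # the gap can only change at data points
    cdf_x = np.searchsorted(x, grid, side="right") / len(x)
    cdf_y = np.searchsorted(y, grid, side="right") / len(y)
    return float(np.abs(cdf_x - cdf_y).max())

def ks_energy(observed, synthetic):
    """Largest KS statistic across named node-statistic distributions (assumed form)."""
    return max(ks_statistic(observed[name], synthetic[name]) for name in observed)

# Hypothetical node-level statistics for one subject and one synthetic network.
observed = {"strength": [1.0, 2.0, 3.0], "clustering": [0.1, 0.2, 0.3]}
synthetic = {"strength": [1.0, 2.0, 3.0], "clustering": [0.1, 0.2, 0.6]}
print(ks_energy(observed, synthetic))
```

Averaging this quantity over many synthetic networks per subject would give a per-subject mean energy of the kind compared between the two consensus models.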

Supplemental results using a different brain parcellation
In our original analysis, we used a brain parcellation consisting of 114 nodes in the cortical grey matter, derived from a clustering method applied to 1000 resting-state fMRI scans [8]. Many plausible parcellations of the cortical grey matter exist, constructed to optimize a variety of objective functions [9]. It is therefore good practice to perform brain network analysis on an additional parcellation, which is what we do here. For this reevaluation, we chose to use the Lausanne scale125 parcellation (scale125), containing 234 cortical nodes [10]. This parcellation was created by randomly subdividing the nodes of a widely-used structural parcellation [11]. As before, we recorded the streamline density between each pair of regions of the scale125 parcellation to create our subject-level data. By changing the number of nodes being used, we also changed the sparsity of the individual-level data, necessitating a new sparsity cutoff, which was set at 0.15, yielding 609 subjects to analyze. The young adult representative matrix was created from 50 subjects between 25-35 years old, with an edge-existence density of 17.7%. Following the creation of the young adult representative matrix, we performed the same analysis as described before (with the Yeo parcellation); however, we did not repeat the individual fit analysis because of its computational cost. Here we describe these results and find that they generally align with our previous findings.
Using the scale125 node definitions, we identified 11 communities using the WSBM consensus workflow (supplementary Figure S4). As before, the WSBM modeled some off-diagonal block interactions with high edge weights, such as interactions 6-8 and 4-10. The WSBM consensus model contains 3 communities that we could consider disassortative (communities 4, 5, and 10), while the modular consensus model does not contain any such communities (supplementary Table S1). Using MLR to measure strengths between block interactions, we see that in the WSBM partition, the top three MLR trends as measured by R^2 are off-diagonal, whereas the top three MLR trends for the modular partition are on the diagonal (supplemental Figure S5).
We again find that the WSBM model generates synthetic brain networks (supplemental Figure S6, panel a) with a lower energy than the modular model (t(1999) = -21.11, p < 10^-9). However, here we find that both the WSBM and modular models perform considerably better than the randomized counterpart model. When measuring how well each consensus partition respects brain symmetry (supplemental Figure S6, panel b), we find that, in contrast to the results reported with the Yeo parcellation, the modular partition has a slightly smaller between-hemisphere KS when measuring participation coefficient (t(1.18 × 10^3) = 4.84, p = 1.5 × 10^-6). The differences observed for within-module z-score and assortativity recapitulate what we previously observed (p < 10^-9 and p = 0.0077, respectively). Finally, measuring vector similarity/distance between individual subjects and the consensus vectors (supplemental Figure S6, panel c) yielded results analogous to the previous findings: the vector measurements given the WSBM partition are more similar and less distant than the vector measurements made using the modular partition (p < 10^-9 for both). In all tests, without and with covariates, the R^2 value of the WSBM trend is higher than the R^2 value of the modular trend.

Comparing community structure partitions across parcellation
We also assessed the performance of each community detection algorithm across parcellation choice (Yeo and scale125), to see if the community structure identified by a specific model for one parcellation would be statistically similar to the community structure identified by the same model with a different parcellation. Straightforward computation of a similarity is not feasible, as the number of nodes differs between the parcellations. To make this comparison, we obtained the spatial arrangement of the parcellations on the FreeSurfer average surface, which contains 163,842 vertices per hemisphere. For each vertex not part of the "medial gap" (the area where the two hemispheres are attached through the callosum and subcortical volumes), we identified the community assignment for that vertex. This resulted in 135,679 and 135,625 valid vertices for the left and right hemispheres, which were concatenated to create a vector of length 271,304. Projecting the communities onto the same geometric space allowed us to measure the variation of information (VI; distance) and adjusted Rand index (ARI; similarity) between the vectors of community assignments across parcellations. We computed both of these measurements to make sure that we could observe converging distance/similarity results [12]. We measured the empirical distance/similarity between the WSBM partitions in the Yeo and scale125 parcellations. We repeated this procedure for the modular partitions as well.
To assess the statistical significance of these spatial community structure distances/similarities, we employed a spin-based permutation test [13,14]. For each hemisphere of the fsaverage surface, we have a mapping of the points in a spherical space (Figure S7, panel a). For each permutation, and repeated within an iteration for each hemisphere, a sphere is randomly rotated along the x, y, and z axes. Based on this rotation, a spatial map originally in the corrected fsaverage space can be transformed to new points along the surface. The advantage of such a permutation method is the ability to maintain the spatial adjacencies of the parcellation. We performed 5000 spin permutations and recorded the VI and ARI between the unrotated Yeo partition and the rotated scale125 partition at each permutation, disregarding the medial gap areas. At each iteration, the WSBM scale125 and the modular scale125 partitions were rotated according to the same angles.
We found that for both the WSBM and modular community structures, the identified partitions are less distant than expected by chance. We also see that the modular community structures across parcellations are less distant than the WSBM community structures. Overall, this analysis shows that the spatial similarity within algorithm (and across parcellation) is significant for both methods. The empirical VI and ARI are outside the range of null values (empirical VI is to the left of the null distribution, empirical ARI is to the right of the null distribution) for each test (Figure S7). How should we interpret the finding that the modular partitions appear "closer" across the two parcellations than the WSBM partitions? We do not think that this necessarily indicates greater robustness on the part of modular partitions. An alternative interpretation is that modular partitions more strongly reflect spatial (proximity) effects (that are independent of partitions).
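For two vertex-wise label vectors of equal length, VI and ARI can both be derived from the contingency table of the two labelings. The NumPy sketch below uses the standard textbook formulas (VI in nats); the function names are chosen for this illustration and are not taken from the analysis code.

```python
import numpy as np

def contingency(x, y):
    """Counts of vertices for every pair of (label in x, label in y)."""
    _, xi = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, yi = np.unique(np.asarray(y).ravel(), return_inverse=True)
    table = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(table, (xi, yi), 1)
    return table

def entropy(q):
    q = q[q > 0]
    return float(-(q * np.log(q)).sum())

def variation_of_information(x, y):
    """VI = 2 H(X, Y) - H(X) - H(Y); 0 means identical partitions."""
    t = contingency(x, y)
    p = t / t.sum()
    return 2 * entropy(p.ravel()) - entropy(p.sum(axis=1)) - entropy(p.sum(axis=0))

def adjusted_rand_index(x, y):
    """Chance-corrected pair-counting similarity; 1 means identical partitions."""
    t = contingency(x, y)
    pairs = lambda m: m * (m - 1) / 2.0        # number of unordered pairs among m items
    same_both = pairs(t).sum()
    same_x = pairs(t.sum(axis=1)).sum()
    same_y = pairs(t.sum(axis=0)).sum()
    expected = same_x * same_y / pairs(t.sum())
    # Undefined (division by zero) when both labelings put every vertex in one community.
    return float((same_both - expected) / (0.5 * (same_x + same_y) - expected))
```

Because both measures depend only on how vertices are co-assigned, the label values themselves need not match across parcellations, and the number of communities may differ.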
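The spin procedure can be sketched as follows, under simplifying assumptions: vertices are unit vectors on a single sphere, rotation angles are drawn uniformly for each axis, relabeling uses a brute-force nearest-vertex search, and medial gap masking is omitted. All names are hypothetical. A surface with more than 100,000 vertices per hemisphere would need a spatial index rather than the dense dot product used here.

```python
import numpy as np

def rotation_matrix(ax, ay, az):
    """Compose rotations about the x, y, and z axes (angles in radians)."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def spin_labels(coords, labels, rng):
    """Rotate the labeled sphere, then read labels back at the original vertices."""
    angles = rng.uniform(0, 2 * np.pi, size=3)
    rotated = coords @ rotation_matrix(*angles).T
    # For unit vectors, the largest dot product identifies the nearest rotated vertex.
    nearest = np.argmax(coords @ rotated.T, axis=1)
    return np.asarray(labels)[nearest]

def spin_test(coords, fixed, spun, score, n_perm, seed=0):
    """Empirical score plus a null distribution from n_perm random spins of `spun`."""
    rng = np.random.default_rng(seed)
    null = np.array([score(fixed, spin_labels(coords, spun, rng))
                     for _ in range(n_perm)])
    return score(fixed, spun), null
```

Passing a distance such as VI or a similarity such as ARI as `score` gives the corresponding null distribution. Rotating two partitions with the same angles, as described above, amounts to reusing the same random seed for both calls.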