Network Communities of Dynamical Influence

Fuelled by a desire for greater connectivity, networked systems now pervade our society at an unprecedented level that will affect it in ways we do not yet understand. In contrast, nature has already developed efficient networks that can instigate rapid response and consensus when key elements are stimulated. We present a technique for identifying these key elements by investigating the relationships between a system’s most dominant eigenvectors. This approach reveals the most effective vertices for leading a network to rapid consensus when stimulated, as well as the communities that form under their dynamical influence. In applying this technique, the effectiveness of starling flocks was found to be due, in part, to the low outdegree of every bird, where increasing the number of outgoing connections can produce a less responsive flock. A larger outdegree also affects the location of the birds with the most influence, where these influentially connected birds become more centrally located and in a poorer position to observe a predator and, hence, instigate an evasion manoeuvre. Finally, the technique was found to be effective in large voxel-wise brain connectomes where subjects can be identified from their influential communities.


Influence of Network Perturbations
To validate the claim that the CDI are the communities with the greatest influence over the consensus process, we investigate the optimisation of consensus leadership. The speed of consensus can be captured analytically by assessing the first eigenvalue of the system matrix [26]. Consensus can be driven to a target value by applying a constant input perturbation to a set of vertices that have a directed connection to all other vertices [26]. Similar perturbation optimisations have been studied extensively in the context of leadership selection for the control of multi-agent and swarm systems [26][27][28][29][30]. In these cases, the perturbation is often constrained so that only a set number of leaders are chosen with a binary option for perturbation input, i.e. vertices are set as either leaders or followers. In the context of multi-agent systems, there is significantly more literature on minimising the steady-state variance about an input perturbation [27,28] than there is on fast convergence to consensus. There has been an attempt to tackle both problems, but this work was restricted to 1-D community and ring graphs [29]; there has also been an examination of how the proportion of leader vertices affects the convergence rate to consensus in multiplex networks [30]. Of most relevance to the work herein are globally bounded input perturbations that can be applied with a variable distribution to any combination of vertices [26]. For such a case, the first left eigenvector of the Laplacian matrix was identified as a sub-optimal resource allocation (equivalent to an input perturbation) for achieving fast convergence to consensus [26]. An improvement in this allocation has since been developed for directed k-outdegree graphs, where a near-optimal perturbation vector can be produced by combining first left eigenvectors from manipulated versions of the adjacency matrix [31].
It is worth noting that the globally bounded perturbation optimisation problem, to maximise convergence rate to consensus, has no verifiable solution. A numerical optimiser can produce near-optimal solutions, but detection of the global minimum is not guaranteed. A significant contribution of this article is the filtering out of local minima by highlighting the most effective network influencers, a process that functions at any network scale and is no longer restricted to k-outdegree graphs.

Results
The Communities of Dynamical Influence (CDI) are detected using the most dominant left eigenvectors of the Laplacian matrix, as described in the Methods section. The CDI are shown in Fig. 1 for 100 vertex, k = 10, nearest neighbour (k-NNR) networks. The k-NNR networks are generated by randomly distributing vertices in a square plane with each vertex connecting to its k nearest neighbours according to Euclidean distance. The first left eigenvector ($v_{L1}$) is always used to determine the dynamical influence of the CDI but, as shown in Fig. 1 where the axes are $v_{L2}$ and $v_{L3}$, the other input eigenvectors affect the community composition. This can be seen most clearly in Fig. 1b, where the second eigenvector of the Laplacian matrix ($v_{L2}$) divides the network in two as CDI only has the first two eigenvectors as input. $v_{L1}$ has only positive entries, so community vertices form a trail behind their community leader that ends close to the origin of the eigenvector coordinate system.
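The network construction and eigenvector extraction described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the random seed and the unit square are assumptions, "most dominant" is taken to mean the Laplacian eigenvalues with the smallest real part, and the handling of complex conjugate pairs described in the Methods is omitted.

```python
import numpy as np

def knn_adjacency(points, k):
    """Directed k-NN adjacency: a[i, j] = 1 if j is among the k nearest neighbours of i."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)                # a vertex is never its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]     # the k closest vertices for each row
    adj = np.zeros((len(points), len(points)))
    rows = np.repeat(np.arange(len(points)), k)
    adj[rows, nearest.ravel()] = 1.0
    return adj

def dominant_left_eigenvectors(adj, m):
    """Real parts of the m left eigenvectors of L = D - A whose eigenvalues lie closest to zero."""
    lap = np.diag(adj.sum(axis=1)) - adj          # D holds the outdegrees (row sums)
    vals, vecs = np.linalg.eig(lap.T)             # right eigenvectors of L^T = left eigenvectors of L
    order = np.argsort(vals.real)[:m]             # Laplacian eigenvalues have non-negative real part
    vecs = np.real(vecs[:, order]).copy()
    if vecs[:, 0].sum() < 0:                      # orient v_L1 so that its entries sum to a positive value
        vecs[:, 0] = -vecs[:, 0]
    return vals[order], vecs

rng = np.random.default_rng(0)                    # assumed seed, for repeatability only
points = rng.random((100, 2))                     # 100 vertices in a unit square
adj = knn_adjacency(points, k=10)
vals, vecs = dominant_left_eigenvectors(adj, m=3)
```

The columns of `vecs` play the role of the eigenvector coordinate system: the first column ranks influence, while the remaining columns separate the communities.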
The CDI can be used as an input to a perturbation optimiser, detailed in the Methods section (Algorithm 3), that optimises the network's convergence to consensus by maximising the first eigenvalue of the perturbed Laplacian matrix. Essentially, the optimisation identifies the most effective leaders in the network, those with the highest $v_{L1}$ values in each community, and applies a perturbation of variable magnitude to those vertices. The leadership perturbations applied from the CDI-based optimisation are detailed in Fig. 1, where the 3, 4 and 5 eigenvector CDI (Fig. 1b,c) produce the same result and a faster convergence than the 1, 2 and 6 eigenvector CDI. The optimal number of eigenvectors, in terms of effective consensus leadership, varies depending on the network. Using only one or two eigenvectors makes it more likely that important community divisions are not identified, as seen in Fig. 1a,b, which produce the slowest convergence by only identifying two communities. Using more eigenvectors can also result in a sub-optimal performance with 6 eigenvectors outperforming the 1 and 2 eigenvector case but not the 3, 4 or 5 eigenvector optimisation.
www.nature.com/scientificreports
In the following section, three eigenvectors are used for the detection of dynamical influence. CDI with three eigenvectors does not always produce the fastest consensus, but it is shown there to produce consistently good results (see Fig. 2). Three eigenvectors provide accurate community division whilst ensuring all communities are amongst the most influential. The issue with using more eigenvectors is that the detected communities may no longer reflect the most effective for leading network consensus. This has already been seen in Fig. 1e, where a community formed (light blue, top right) with vertices that had high $v_{L1}$ values because they were connected to more influential vertices in other communities, and not because they were effective network leaders on their own. A perturbation optimisation comparison between 3 and 4 eigenvector CDI, using the series of k-NNR networks from Fig. 2b, revealed that 4 eigenvector CDI often produced a superior community division, and hence faster convergence, but it was occasionally susceptible to significant inaccuracies. Such inaccuracies were a result of poor community designation for the same reason that Fig. 1e produced a sub-optimal result.

Validation of community influence. When referring to the influence of a community, we are referring to the influence of the influential vertices in that community. These vertices wield the greatest global influence within their local cluster, but this usually means that they are also the most locally influential vertices. To validate the claim that the CDI are influential, we performed a series of analyses comparing perturbations optimised to drive convergence in linear consensus. The vertices highlighted by this optimisation are those that can lead the network effectively to a new state of consensus, i.e. those with strong local and global network influence, which should align with the influential CDI vertices and their communities.
The CDI and k-means clustering algorithms are used as inputs to the perturbation optimiser, detailed in the Methods section (Algorithm 3). It should be noted that the k-means clustering requires the number of divisions to be defined and, hence, is set to detect the same number of communities as found by the CDI algorithm. The results of these optimisations are compared with the Communities of Influence (CoI) method [31] and a numerical optimiser [32] in Fig. 2. The CoI method generates optimised perturbations by detecting influence, using the first left eigenvector, and investigating how this influence changes when key vertices are removed from the network. This was shown to be effective in k-outdegree networks where the CoI method, using 5 input vectors, produced similar results to the output of a numerical optimiser [31]. In Fig. 2, the CDI optimiser is shown to produce similar results in k-outdegree networks to the CoI method, but achieves notably superior results when applied to variable k networks. In fact the CoI results are too low to be included for many of the network sizes reported in Fig. 2c,d. The CDI-based optimiser frequently exceeds the results of the numerical optimiser, which struggles to find globally optimal minima.
The k-means clustering algorithm developed by Ng et al. [33] also employs eigenvectors and a form of machine learning to define clusters. Figure 2 shows that the k-means clustering algorithm, when provided with the number of communities found by the CDI algorithm, can often identify the most effective network leaders by placing them in separate communities. The main differences between CDI and k-means clustering are seen in the variable outdegree k-NNR case, Fig. 2d, where the CDI performs consistently better.
The superiority of CDI in comparison with k-means clustering is made clearer by reducing the weight of each edge for a given graph. This alteration constricts the flow of information through the graph, with the result that perturbations need to be spread to a greater number of vertices to overcome this restriction. This phenomenon is seen in Fig. 3a,b for a 100 vertex k-NNR topology with edge weight set at 1 and 0.2 respectively. If the edge weight was reduced to 0, then the optimal perturbation would provide all vertices with the same perturbation, i.e. lead them all individually. By requiring more network leaders, the difference between CDI and k-means clustering can be seen even for the k-NNR topology, where the methods performed similarly in Fig. 2b. Comparing CDI in Fig. 3b with k-means clustering in Fig. 3c shows that the clusters generated by both methods are similar. In the bottom right quadrant of the figures, however, it can be seen that CDI detects a division that k-means does not, and this results in a faster convergence speed for the CDI-based optimisation. These small differences in community designation are also what differentiates the methods in Fig. 2d.
Responsive starling flock topology. Starling flocks tend towards a thickness of between 0.13 and 0.27, where the flock thickness is the ratio of the smallest to largest dimension of an ellipsoid having the same principal moments of inertia as the flock [22]. In Fig. 4, five examples of 1200 bird starling flock networks are presented with a thickness of 0.2, where the flock is modelled by randomly distributing birds from a uniform distribution within a rectangular prism [22]. The community distribution and optimised perturbations are shown for these representative examples that employ k-NNR topologies for the following outdegrees: k = 5, 7, 15, 25 and 50. The starling vertices remain in the same position for each analysis, with the number of nearest neighbours, k, the only change that affects the network structure.
What is of particular interest, from these results, is the position of the influential vertices in each example. In high k outdegree examples, such as Fig. 4c-e, the influential vertices, associated with the most influential communities, are centrally located. An optimised perturbation (using 3 eigenvector CDI) is shown to align with the most influential communities, therefore the perturbations become increasingly centrally located as the outdegree increases. For the k = 50 example, in Fig. 4e, these perturbations are primarily located in the centre of the model flock. For lower outdegree examples, such as Fig. 4a,b, the influential vertices, especially those associated with the most influential communities, are more evenly distributed throughout the flock, aided by the increase in the number of CDI present. Considering an actual starling flock where k ≈ 7, the topology makes it more likely, in comparison with a higher outdegree scenario, that one of the most influential birds will be near the site of a predator attack. Therefore, the low outdegree topology is more likely to have a highly influential bird involved in leading a predation avoidance manoeuvre.
Varying the outdegree for the starling flock model, while using the same distribution of 1200 birds, and applying an optimised leadership perturbation changes the value of the dominant eigenvalue $\lambda_1$ of the perturbed Laplacian matrix, which represents convergence speed. For outdegrees between k = 5 and k = 50, the $\lambda_1$ results roughly conform ($R^2 = 0.962$) to a power law distribution ($\lambda_1 = 0.0031k^{-0.184}$). A higher convergence speed is indicative of a graph that can be more effectively led by key vertices. The highest convergence speeds, therefore, usually belong to networks with lower outdegrees. Given that a lower outdegree tends to result in faster perturbation driven consensus, the reason for the outdegree remaining as high as 7 may be due to the requirements of maintaining a connected flock. If the outdegree is too low the flock may split whenever a perturbation is applied and may not reconnect. Defining a lower bound that ensures connectivity is beyond the scope of this paper, but lower bounds have been identified for static topologies including k-NNR graphs where the required k increases with the number of vertices [34].
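To make the reported fit concrete, the short sketch below simply evaluates the power law at the outdegrees used in Fig. 4. The function and variable names are illustrative assumptions, and the values are those of the fitted curve, not of the underlying optimisation results.

```python
def fitted_lambda1(k, scale=0.0031, exponent=-0.184):
    """Evaluate the reported fit lambda_1 = 0.0031 * k**(-0.184)."""
    return scale * k ** exponent

outdegrees = (5, 7, 15, 25, 50)
fitted = {k: fitted_lambda1(k) for k in outdegrees}

# Relative convergence speed of a k = 7 flock versus a k = 50 flock under the fit.
ratio = fitted[7] / fitted[50]
```

Under the fitted curve, the k = 7 topology converges roughly 1.4 times faster than the k = 50 topology.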
Brain similarity detection. The CDI method is applied in this section to large-scale human connectomes generated by Roncal et al. [35] from magnetic resonance imaging (MRI) scans carried out by Landman et al. [25]. Landman et al. used 21 healthy volunteers, where each subject was scanned twice with a short break between scan and rescan (scan 1 and scan 2 respectively) [25]. Note that one of the scans, for subject 127, could not be sourced and, hence, this article shall consider the remaining 20 volunteers.
The networks generated by Roncal et al. are undirected and the adjacency matrix, rather than the Laplacian, is used due to the scale of the network (see Methods section). Therefore, the CDI no longer identifies the most effective leaders of system consensus, as every vertex both leads and follows equally. This makes it impossible to differentiate between important sources and sinks of information in the network. The use of the adjacency matrix also means that the communities are ranked by popularity, i.e. the frequency with which a random walker would visit each vertex in the graph, rather than effective consensus leadership. Therefore, instead of drawing conclusions about a community's network leadership, the CDI is used as a similarity metric by detecting changes in the influential communities (whether sources or sinks) of brain connectomes.
In the work by Roncal et al. [35], the Frobenius Norm was successfully used to detect the similarity of these scan-rescan matrices created from Landman et al.'s study [25]. The Frobenius Norm is an established matrix distance measure [36], referred to as Frobenius Distance when assessing graph similarity. The result from Roncal et al.'s study [35] has not proven to be exactly reproducible with the published dataset [25], as the scan-rescan comparisons do not always produce the lowest values (i.e. greatest similarity). A superior similarity metric for this case was identified as Graph Edit Distance (GED), which is an inexact graph matching method defined as the cost of the least expensive sequence of edit operations that are needed to transform one graph into another [37]. Specifically, the results of edge GED [38] are displayed in Fig. 5a for the scan 1 and 2 (scan and rescan) comparisons. GED has been used as an identification method for matching fingerprints [37], but it fails to identify subject 113 from their scan-rescan comparison. The CDI based comparisons are shown in Fig. 5c,d for three and ten input eigenvectors respectively. The CDI communities with 3 input eigenvectors are also displayed in three-dimensional space (Fig. 5d) for subject 113, with the difference in community density providing an insight into why edge GED failed but CDI succeeded in recognising subject 113. The paths taken by communities in Fig. 5d are mostly similar with one non-matching community present from scan 2. However, the density of these communities is significantly different, with far more vertices in the scan 1 communities than those from scan 2. This also translates to the density of the connectivity network, where scan 1 has 841,097 vertices with a non-zero outdegree and scan 2 only has 647,049 vertices for subject 113.
This difference of 194,048 vertices is far larger than any other scan-rescan comparison, where the next largest difference is 25,554 vertices (mean number of non-zero outdegree vertices is 836,699 for the other scans). This difference in graph density appears to prevent edge GED from detecting a match.
The CDI community matching, with ten input eigenvectors (Fig. 5c), presents a similar accuracy to edge GED (Fig. 5a). But it is notable that 3 eigenvector CDI produces some poor matches in Fig. 5c, such as subjects 814 and 849. This suggests that the most influential communities have changed between scan and rescan for these subjects, as they are still recognisable when considering the less influential communities detected when using ten eigenvectors.
Similar results to Fig. 5c can be obtained by taking an approach based on normalised cut [11], where the Fiedler vector divides the graph by the sign of the vector's entries. This spectral bisection approach, as described in the Methods section, does not manage to clearly identify all subjects when applying the mean number of matching communities procedure. It also creates more off-diagonal false matches than Fig. 5c and does not give insight into any changes in neuronal influence.
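As a point of reference for the baseline discussed above, the Frobenius Distance between two adjacency matrices and the count of vertices with a non-zero outdegree can be sketched as follows. The two toy matrices are assumptions chosen for illustration; the real connectomes are far too large for dense arrays and would need a sparse representation.

```python
import numpy as np

def frobenius_distance(a, b):
    """Frobenius norm of the difference between two equally sized adjacency matrices."""
    return float(np.linalg.norm(a - b, ord="fro"))

def active_vertices(a):
    """Number of vertices with a non-zero outdegree (row sum) for non-negative weights."""
    return int(np.count_nonzero(a.sum(axis=1)))

# Toy "scan" and "rescan" graphs on the same three vertices (illustrative only).
scan_1 = np.array([[0, 2, 0],
                   [2, 0, 1],
                   [0, 1, 0]], dtype=float)
scan_2 = np.array([[0, 2, 0],
                   [2, 0, 0],
                   [0, 0, 0]], dtype=float)

distance = frobenius_distance(scan_1, scan_2)
lost = active_vertices(scan_1) - active_vertices(scan_2)
```

In this toy case the rescan loses one undirected edge, so one vertex drops out of the active set, which mirrors on a tiny scale the density difference reported for subject 113.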

Discussion
In this article a method for detecting Communities of Dynamical Influence (CDI) is proposed that uses the relationship between a selection of the most dominant left eigenvectors of the Laplacian to identify network divisions. The communities are shown to be led by the most influential vertices by comparing with a perturbation optimiser that maximises the network convergence rate to consensus. CDI defines community divisions that can be similar to those detected by k-means clustering, but k-means requires the user to define k. In contrast, with CDI the number of community divisions is a product of the network topology and the number of input eigenvectors. Three input eigenvectors are shown to produce consistently good results when using CDI-based consensus optimisation, which is used herein as an assessment of the quality of the influential communities that are found. The selected number of input eigenvectors is a trade-off for CDI, as more eigenvectors may highlight more or new communities, but those communities may no longer represent the communities that form through association with an influential leader. There is scope to consider, in future work, weighting the contribution of less dominant eigenvectors to ensure high accuracy community division that is still focused on the most influential leaders.
For a starling flock model, the CDI reveal the benefit to starlings of maintaining a low outdegree. Higher outdegrees were seen to reduce the responsiveness of the network, in the CDI-based perturbation optimisations, with the flock also becoming composed of fewer CDI. It was also seen that the most influential vertices became more centrally located within the flock, where they are unlikely to detect an incoming predator. The CDI did not reveal any optimality in the chosen starling outdegree; instead, it was suggested that k = 7 may have emerged as a compromise between ensured connectivity and fast response.
In this article a series of human brain networks, around 850,000 vertices in size, were analysed to identify neuronal communities. The identified communities enabled separate MRI scans to be clearly recognised as belonging to the same subject, especially when using CDI with ten input eigenvectors. We conjecture that the subjects with the lowest mean number of matching communities, when using CDI with the first three eigenvectors, are those that display the greatest change in neuronal activity, since these same subjects are clearly identified when considering the less influential communities detected by the 10 eigenvector CDI. Edge Graph Edit Distance is shown to be highly effective in identifying the scan similarity for the majority of matches, with one exception. This exception highlights the effectiveness of the CDI approach, where pathway/community similarity was still clear even when comparing graphs and communities of significantly different sizes.

Methods
A graph is defined as G = (V, E), where V is the set of vertices and E is the set of edges, which are unordered pairs of elements of V for an undirected graph and ordered pairs for a directed graph.
The adjacency matrix, A, is a square n × n matrix when representing a graph of n vertices. This matrix captures the network's connections where $a_{ij} > 0$ ($a_{ij}$ is the ijth entry of the graph's adjacency matrix) if there exists a directed edge from vertex i to j and 0 otherwise. Variable edge weights contain information on the relative strength of interactions, whilst uniform edge weighting either only represents the presence of a connection or is a result of all the edges having the same information carrying capacity. For an undirected graph, the adjacency matrix is symmetric with an edge (i, j) ∈ E resulting in $a_{ij} = a_{ji} > 0$. For a directed graph, the indegree is equal to the column sum, $\sum_i a_{ij}$, and the outdegree is equal to the row sum, $\sum_j a_{ij}$. The Laplacian matrix is defined as L = D − A where the degree matrix, D, is a diagonal matrix and the ith diagonal element is equal to the outdegree of vertex i. The first eigenvalue of the adjacency matrix ($\lambda_1$), referred to as the Spectral Radius, is the largest eigenvalue in magnitude and is associated with the Perron vector, which is an eigenvector that contains only positive entries. For the Laplacian matrix, in contrast, the first eigenvector is associated with the eigenvalue $\lambda_1 = 0$ [26]. For a directed graph, the left eigenvectors of the Laplacian matrix, $v_L$, are row vectors satisfying $v_L L = \lambda v_L$.
Communities of dynamical influence. The Communities of Dynamical Influence (CDI) are found by analysing the left eigenvectors of the Laplacian matrix as presented in Algorithm 1. The algorithm defines a coordinate system using only the real part of the selected eigenvectors for a chosen number of input eigenvectors. Hence, if any of the eigenvectors considered form a complex conjugate pair then the algorithm will only ever use one vector from the pair and then choose the next dominant eigenvector not in the pair, since the real part of the complex conjugates will be identical.
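The definitions above can be checked on a small example. The three-vertex directed graph below is an assumption made purely for illustration; the sketch computes the degrees, the Laplacian and the first left eigenvector, normalised so that its entries sum to one.

```python
import numpy as np

# Toy directed graph (illustrative): edges 0->1, 1->2, 2->0 and 2->1.
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [1, 1, 0]], dtype=float)

outdegree = adj.sum(axis=1)            # row sums
indegree = adj.sum(axis=0)             # column sums
lap = np.diag(outdegree) - adj         # L = D - A, with D holding the outdegrees

# Left eigenvectors of L are right eigenvectors of L transposed.
vals, vecs = np.linalg.eig(lap.T)
first = np.argmin(np.abs(vals))        # index of the eigenvalue closest to zero
v_l1 = np.real(vecs[:, first])
v_l1 = v_l1 / v_l1.sum()               # scale so the entries sum to one
```

Vertex 1, which two other vertices point to, receives the largest entry of `v_l1`, consistent with the reading of the first left eigenvector as a measure of influence.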
The CDI are based on first identifying the most effective community leaders. These vertices do not "follow" nodes with greater community influence, therefore they are identified as those with no outward connections to a vertex that is further from the origin in this eigenvector-based coordinate system.
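The leader rule above can be sketched as follows, assuming that "distance from the origin" means the Euclidean norm of each vertex's row in the eigenvector coordinate matrix. The function name and the toy inputs are hypothetical; this covers only leader identification, not the full Algorithm 1.

```python
import numpy as np

def community_leaders(adj, coords):
    """Vertices with no outgoing edge to a vertex that lies further from the origin."""
    radius = np.linalg.norm(coords, axis=1)        # distance of each vertex from the origin
    leaders = []
    for i in range(len(adj)):
        followed = np.flatnonzero(adj[i] > 0)      # vertices that vertex i points to
        if not np.any(radius[followed] > radius[i]):
            leaders.append(i)
    return leaders

# Toy directed graph (illustrative): edges 0->1, 1->2, 2->0 and 2->1.
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [1, 1, 0]], dtype=float)
# One-dimensional coordinate system built from the first left eigenvector of this graph.
coords = np.array([[0.25], [0.5], [0.25]])

leaders = community_leaders(adj, coords)
```

Here only vertex 1 qualifies, because the single vertex it points to lies closer to the origin than it does.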
A vertex is assigned to a community when there is a directed path from that vertex to a community leader, with each vertex on the path further from the origin of the eigenvector coordinate system than the previous. The number of communities is equal to the number of community leaders, where a vertex can belong only to one community. If a vertex is assigned to multiple communities, then it is kept in the community where it is most aligned with the community leader and removed from the others. This alignment is determined by comparing the position vector of the vertex with respect to those of the most influential vertices using the scalar product.

Perturbation driven consensus. The networks considered herein have n agents connected via local communication with a static, time-invariant, topology. A uniform signal $\mathbf{u} = u[1, 1, \ldots, 1]^T \in \mathbb{R}^n$ is supplied to all agents with different positive gains $c_i$, where i = 1, 2, …, n. The dynamics of this system are defined as

$$\dot{x}_i = \sum_{j=1}^{n} a_{ij}\,(x_j - x_i) + c_i\,(u - x_i) \qquad (1)$$

where $x_i$ is the state of the ith agent and u is the scalar target value that all agents must achieve. The resource allocation, $c_i$, ranges from 0 to 1, is globally bounded as $\sum_i c_i = 1$, and scales the comparison between the uniform input signal, u, and the current state $x_i$.
The global dynamics of the network can be expressed with respect to the Laplacian matrix as

$$\dot{x} = -(L + C)\,x + C\,\mathbf{u} \qquad (2)$$

where C is the perturbation matrix, $C = \mathrm{diag}(c) = \mathrm{diag}(c_1, \ldots, c_n)$. Spanning trees have been highlighted previously as a condition for consensus [39][40][41] since for a directed network G, defined by Eq. (2), consensus will eventually be achieved if all agents are reachable, via directed edges, from the vertices supplied with perturbation input.
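The perturbed consensus process can be simulated in a few lines. The sketch below assumes the global dynamics take the standard form x' = −(L + C)x + Cu with C = diag(c), uses a simple forward-Euler integrator, and reports the convergence rate as the smallest real part among the eigenvalues of L + C. The toy graph, gains, step size and function names are all assumptions.

```python
import numpy as np

def convergence_rate(lap, c):
    """Smallest real part of the eigenvalues of L + C; a larger value means faster consensus."""
    return float(np.linalg.eigvals(lap + np.diag(c)).real.min())

def simulate(lap, c, u, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of x' = -(L + C) x + C u, where u is the scalar target."""
    system = lap + np.diag(c)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x = x + dt * (-system @ x + c * u)
    return x

# Toy directed graph (illustrative): edges 0->1, 1->2, 2->0 and 2->1.
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [1, 1, 0]], dtype=float)
lap = np.diag(adj.sum(axis=1)) - adj

c = np.array([0.0, 1.0, 0.0])          # the whole perturbation budget goes to vertex 1
rate = convergence_rate(lap, c)
x_final = simulate(lap, c, u=2.0, x0=[0.0, 1.0, -1.0])
```

Every vertex in this graph is reachable from the perturbed vertex through the chain of followers, so all states settle at the target value u = 2.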

Perturbation optimisation.
A perturbation optimisation is presented as a method for validating the CDI algorithm's ability to identify the most influential communities and network leaders. The objective of this optimisation is to maximise the system's rate of convergence to consensus by applying a globally bounded perturbation to the vertices. It has been demonstrated that, by changing the coordinates to $\tilde{x} = x - \mathbf{u}$, Eq. 2 can be written as

$$\dot{\tilde{x}} = -(L + C)\,\tilde{x} \qquad (3)$$

where the diagonal elements of C can be optimised to maximise the magnitude of $\lambda_1(-(L + C))$, which is the most dominant (rightmost) eigenvalue (i.e. eigenvalue with the largest real part) of the negated and perturbed Laplacian matrix [26]. The first step is to optimise a perturbation applied only to the most influential vertex from each community, according to their $v_{L1}$ value, as detailed in Algorithm 2. If the optimiser reduces the perturbation applied to the influential vertex to less than or equal to zero then the community is discarded from the optimisation. Once the optimiser has converged, only the communities associated with influential vertices that still have positive perturbations are included in the next step of the optimisation.
For each of the selected CDI from Algorithm 2, an input vector $\omega_i$ is created with entries populated with their $v_{L1}$ entries if the vertex is in the community and values set to zero otherwise. These vectors are then manipulated, using the Power Optimisation method [31] in Algorithm 4, and combined to produce the final optimised perturbation, with weighting variables used to determine the ratio of each vector's contribution in Algorithm 3.
The Power Optimisation focuses resources on the most effective leaders by raising an eigenvector to a power, $\eta_i$, for a given input vector, $\omega_i$, according to

$$c = \frac{\omega_i^{\eta_i}}{\sum_j \left(\omega_i^{\eta_i}\right)_j} \qquad (4)$$

where the power is applied element-wise and the denominator ensures that $\sum_j (c)_j = 1$. Also note that C = diag(c) in the −(L + C) system. Algorithm 3 presents the perturbation optimisation procedure, where Eqs. 4 and 5 are used repeatedly with different inputs and constraints. A numerical optimiser, employing a sequential quadratic programming method [42], is used throughout the algorithm to maximise the dominant eigenvalue by optimising the power, $\eta = \{\eta_1, \ldots, \eta_i\}$, and weighting, $r = \{r_1, \ldots, r_i\}$, variables for the i input vectors. The algorithm first optimises the power variables using the power optimisation method for one input vector. This power is employed for checking if adding any more input vectors and numerically optimising only the new weighting variable will increase the value of $|\lambda_1(-(L + C))|$. If the convergence speed is improved then the new selection of input vectors will have their power variables numerically optimised, before repeating the search for new input vectors and optimising the new weighting variable. Once all input vectors have been checked both the weighting and power variables are optimised numerically.
To check that there are not any redundant input vectors, each vector is removed from the optimisation, starting with the first input vector, to check if that removal increases $|\lambda_1(-(L + C))|$. If removing the vector does not improve the performance then it is reintroduced and the process is repeated for the other communities' input vectors. The final combination of input vectors (i.e. communities) is optimised numerically by varying their weighting and power variables to maximise $|\lambda_1(-(L + C))|$.

Matching brain communities. Each brain connectome graph contains 1,827,240 voxels that each represent a 1 mm³ volume of the brain. The centre of each volume (voxel) forms a three dimensional grid with 1 mm spacing between neighbouring voxels. Each edge in the graph is defined as any two vertices that are connected by at least a single fibre, where an edge of weight 1 would represent a single fibre connection. This results in an undirected network of weighted edges where around half of these voxels have connections in the subjects considered here.
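The power step can be illustrated with a short sketch. It assumes the allocation is the element-wise power of a non-negative input vector, normalised so that its entries sum to one. The `combine` helper, which mixes several allocations with normalised weights, is only an assumed reading of how the weighting variables set each vector's contribution; it is not taken from the source.

```python
import numpy as np

def power_allocation(omega, eta):
    """Raise the positive entries of omega to the power eta and normalise to unit sum."""
    omega = np.asarray(omega, dtype=float)
    powered = np.zeros_like(omega)
    inside = omega > 0                       # vertices outside the community stay at zero
    powered[inside] = omega[inside] ** eta
    return powered / powered.sum()

def combine(allocations, weights):
    """ASSUMPTION: convex combination of allocations using weights rescaled to sum to one."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * a for w, a in zip(weights, allocations))

omega = np.array([0.25, 0.5, 0.25])          # illustrative input vector
c_flat = power_allocation(omega, eta=1.0)    # plain normalisation
c_sharp = power_allocation(omega, eta=2.0)   # a larger power concentrates on the leader
c_mixed = combine([c_flat, c_sharp], weights=[1.0, 3.0])
```

Increasing the power shifts the bounded perturbation budget towards the vertex with the largest entry while keeping the total fixed at one.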

Algorithm 2. Community Leader Optimisation.
For the large brain connectome graphs, the adjacency matrix was employed, rather than the Laplacian, to identify CDI. This was due to the difficulty that emerged in converging on λ 1 = 0 for such large matrices that contained more than one near zero eigenvalue. Both adjacency and Laplacian matrices can be used with the procedure in Algorithm 1. The left eigenvectors of these matrices are the same in certain cases, including graphs with a constant outdegree for each vertex. In other cases the eigenvectors vary but the first eigenvector contains all positive entries for both, while the following eigenvectors divide the network in a similar manner. It should also be noted that due to the undirected nature of the connectome data, the Laplacian matrix's ability to highlight the imbalance between outdegree and indegree is less relevant. In fact, for an undirected Laplacian matrix the first eigenvector is uniform with an imbalance in the indegree and outdegree of vertices required to determine influence from this eigenvector.
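The constant-outdegree case mentioned above can be verified directly. The short derivation below assumes every vertex has the same outdegree k, so that D = kI; the symbol μ for an eigenvalue of the adjacency matrix is introduced here only for illustration.

```latex
\begin{align*}
L &= D - A = kI - A, \\
vA = \mu v \quad &\Longrightarrow \quad vL = kv - vA = (k - \mu)\,v .
\end{align*}
```

Under this assumption every left eigenvector of A is also a left eigenvector of L, with eigenvalue k − μ, and an adjacency eigenvalue μ = k (the spectral radius when all row sums equal k) maps to the Laplacian eigenvalue of zero.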
The vertices included in influential communities from different graphs were compared to determine similarity. For this similarity comparison, the CDI were reduced in size by only including vertices with a large eigenvector entry. This eigenvector entry threshold was set at 0.01 (i.e. $(v_A)_i > 0.01$ where $v_A$ is any of the eigenvectors used in the CDI coordinate system). The similarity of two graphs was assessed by calculating the number of matching communities shared between both graphs. This comparison metric for assessing community matches, developed here, considers the shortest distance from all the vertices of one community to the nearest vertex that belongs to another. Vertices were considered overlapping if they are from the same voxel or they are in an adjacent voxel (i.e. maximum overlapping voxel distance ≈ √3 ≈ 1.74 mm). The percentage of overlapping vertices is calculated for each community comparison to find the highest percentage overlap between two communities in separate scans. The communities appear to reveal pathways in the brain as depicted in Fig. 5b. These pathways can sometimes vary in density of vertices and in length, which makes an exact match between two communities unlikely. Therefore, for a pair of communities to be considered a match their percentage overlap had to exceed a threshold value. The mean number of matching communities was determined by taking the mean number of matches from a range of threshold values between 50% and 90%, at 10% intervals. Note that any community can only be a member of one matching pair, i.e. if a community in scan 1 matches with multiple communities in another scan, only one of those matching pairs would be considered for the number of matching communities.

Algorithm 3. Perturbation optimisation using CDI.
Finally, it is worth noting that there are always errors in the images produced from MRI scans, even when using the same equipment and procedures, with small errors occurring because of slight changes in image orientation and magnetic field instability [43]. The mean number of matching communities is, therefore, also able to accommodate any small positional errors when detecting overlapping communities.

Spectral bisection. The Fiedler vector is associated with the second smallest eigenvalue of the Laplacian matrix, but the second eigenvector of the adjacency matrix, used here for the analysis of brain networks, also divides the network in a similar manner. These spectral bisections are completed three times to create 8 communities by first dividing the network according to the second eigenvector, with the sign of its entries determining community division. The second eigenvector is then assessed for both of these communities and more divisions made. For the final bisection of four communities into eight, the second eigenvector was used unless it did not generate two communities with values higher than the threshold used when assessing the mean number of matching communities, described previously. In this case the next eigenvector that divides the community, so that both divisions had values above the threshold, is selected. This ensures that all scans have eight communities with which to compare.
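The overlap-based matching described above can be sketched as follows. Several details are assumptions rather than the authors' procedure: communities are given as arrays of integer voxel coordinates on a 1 mm grid, overlap is measured from the first community towards the second only, and the one-to-one pairing is resolved greedily from the highest overlap downwards.

```python
import numpy as np

def overlap_percentage(comm_a, comm_b, max_dist=3 ** 0.5 + 1e-9):
    """Percentage of voxels in comm_a lying in the same or an adjacent voxel of comm_b."""
    diff = comm_a[:, None, :] - comm_b[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)   # distance to closest voxel of comm_b
    return 100.0 * float(np.mean(nearest <= max_dist))

def count_matches(comms_1, comms_2, threshold):
    """Greedy one-to-one pairing of communities whose overlap exceeds the threshold (assumed)."""
    pairs = sorted(((overlap_percentage(a, b), i, j)
                    for i, a in enumerate(comms_1)
                    for j, b in enumerate(comms_2)), reverse=True)
    used_1, used_2, matches = set(), set(), 0
    for pct, i, j in pairs:
        if pct > threshold and i not in used_1 and j not in used_2:
            used_1.add(i)
            used_2.add(j)
            matches += 1
    return matches

def mean_matching_communities(comms_1, comms_2, thresholds=(50, 60, 70, 80, 90)):
    """Mean number of matches over the threshold range of 50% to 90% in 10% steps."""
    return float(np.mean([count_matches(comms_1, comms_2, t) for t in thresholds]))

# Toy communities (illustrative voxel coordinates in mm).
path_a = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0]])
path_b = np.array([[0, 1, 0], [1, 1, 0]])

overlap = overlap_percentage(path_a, path_b)
mean_matches = mean_matching_communities([path_a], [path_b])
```

In the toy case three of the four voxels in the first path lie within one voxel of the second, so the pair counts as a match at the 50%, 60% and 70% thresholds but not at 80% or 90%.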
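A single bisection step of the kind described above can be sketched as follows. The sketch assumes a symmetric adjacency matrix and takes "second eigenvector" to mean the eigenvector of the second-largest eigenvalue. The threshold check used for the final bisection is omitted, and the function name and toy graph are hypothetical.

```python
import numpy as np

def bisect(adj, vertices):
    """Split a vertex subset by the sign of the second eigenvector of its sub-adjacency matrix."""
    sub = adj[np.ix_(vertices, vertices)]
    vals, vecs = np.linalg.eigh(sub)        # symmetric input; eigenvalues in ascending order
    second = vecs[:, -2]                    # eigenvector of the second-largest eigenvalue
    return vertices[second >= 0], vertices[second < 0]

# Toy undirected graph (illustrative): two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

left, right = bisect(adj, np.arange(6))
```

Applying `bisect` again to each returned subset, and then once more, would give the three rounds of division needed for eight communities on a sufficiently large graph.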