Ranking Regions, Edges and Classifying Tasks in Functional Brain Graphs by Sub-Graph Entropy

This paper considers analysis of human brain networks or graphs constructed from time-series collected from functional magnetic resonance imaging (fMRI). In the network of time-series, the nodes describe the regions and the edge weights correspond to the absolute values of the correlation coefficients of the time-series of the two nodes associated with the edge. The paper introduces a novel information-theoretic metric, referred to as sub-graph entropy, to measure the uncertainty associated with a sub-graph. Nodes and edges constitute two special cases of sub-graph structures. Node and edge entropies are used in this paper to rank regions and edges in a functional brain network. The paper analyzes task-fMRI data collected from 475 subjects in the Human Connectome Project (HCP) study for gambling and emotion tasks. The proposed approach is used to rank regions and edges associated with these tasks. The differential node (edge) entropy metric is defined as the difference of the node (edge) entropy corresponding to two different networks belonging to two different classes. Differential entropies of nodes and edges are used to rank the top regions and edges associated with the two classes of data. Using top node and edge entropy features separately, two-class classifiers are designed using a support vector machine (SVM) with radial basis function (RBF) kernel and the leave-one-out method to classify time-series for emotion task vs. no-task, gambling task vs. no-task and emotion task vs. gambling task. Using node entropies, the SVM classifier achieves classification accuracies of 0.96, 0.97 and 0.98, respectively. Using edge entropies, the classifier achieves classification accuracies of 0.91, 0.96 and 0.94, respectively.
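The network construction summarized above (edge weights as absolute values of correlation coefficients between regional time-series) can be sketched as follows; `build_brain_graph` is a hypothetical helper name for illustration, not code from the study:

```python
import numpy as np

def build_brain_graph(ts):
    """ts: (regions, timepoints) array of fMRI time-series.
    Returns a weighted adjacency matrix whose entries are the
    absolute Pearson correlations between pairs of regions."""
    corr = np.corrcoef(ts)       # region-by-region correlation matrix
    adj = np.abs(corr)           # edge weight = |correlation|
    np.fill_diagonal(adj, 0.0)   # no self-loops
    return adj

# toy example: 4 regions, 100 time points
rng = np.random.default_rng(0)
ts = rng.standard_normal((4, 100))
A = build_brain_graph(ts)
```

In the paper, the same construction is applied per subject over the 85 parcellated regions, yielding one 85 × 85 weighted graph per subject and condition.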

Node Importance. In a complex network, different nodes may have different usages. Some may be used more than others, whereas some nodes might control the dynamics of the whole network. Centrality measures describe these properties of the graph 22 . Statistical significance tests are commonly used to infer the most important regions and links associated with an external stimulation. Here we describe a statistical way to infer important regions during task states using generalized linear models (GLM). Among the network-theoretic measures commonly used to identify important nodes, we illustrate four centrality measures, namely degree centrality, eigenvector centrality, betweenness centrality and leverage centrality. Generalized linear models (GLM) 23 use multiple regression with false discovery rate control to infer the most important regions during a task. Degree centrality 24 defines the central nodes to be the ones having the highest number of connections with other nodes; this metric computes the importance of a node in the network simply by the number of other nodes with which it directly interacts. Eigenvector centrality 25 takes into account the centrality of immediate neighbors when computing the centrality of a particular node; in particular, eigenvector centrality is a positive multiple of the sum of the nearest nodes' centralities. It is computationally intensive compared to the other centrality metrics. Betweenness centrality 26 of a node represents its importance from the perspective of shortest paths in a graph: this metric is calculated as the fraction of the shortest paths between all pairs of nodes (except the node in consideration) of a graph that contain the given node 27 . Joyce et al. 22 introduced a measure of centrality called leverage centrality that quantifies the influence of a node in a graph on its neighboring nodes based on their degree distributions.
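As a rough illustration of two of these measures, the sketch below computes degree centrality and eigenvector centrality (via power iteration) on a toy graph; the function names are our own, and betweenness and leverage centrality are omitted for brevity:

```python
import numpy as np

def degree_centrality(adj):
    """Number of neighbours of each node (binary adjacency)."""
    return (adj > 0).sum(axis=1)

def eigenvector_centrality(adj, iters=200):
    """Power iteration: each node's centrality is proportional to the
    sum of its neighbours' centralities, i.e. the leading eigenvector
    of the adjacency matrix."""
    x = np.ones(adj.shape[0])
    for _ in range(iters):
        x = adj @ x
        x = x / np.linalg.norm(x)
    return x

# 4-node path graph 0-1-2-3: the middle nodes are more central
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
deg = degree_centrality(A)
eig = eigenvector_centrality(A)
```

On the path graph, degree centrality gives (1, 2, 2, 1) while eigenvector centrality also ranks the interior nodes highest but with graded values, illustrating how the two measures encode related yet distinct notions of importance.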
However, we note that centrality may depend not only on node degrees but also on the weights of the links between nodes. For example, if the weight of an edge is higher, the edge is more likely to be used. The information in the edge weights can therefore be used to develop a new importance measure. In addition, all of these centrality measures are only applicable when the topological structure of the network is clearly known for every individual sample. In stochastic networks, where the group behavior of a number of networks is of utmost importance, the extension of these measures for differentiating between two groups is not straightforward. More details about these network measures can be found in 3 .

Edge Importance. There have been a few previous works on understanding the importance of edges in brain states. Among them, Network Based Statistics (NBS) 28 is a popular method for testing hypotheses about the edges in a network using a t-test. It is used to identify the connections and networks comprising the connectome that are associated with a between-group difference in an experiment.
Node and Edge Importance to Predict Brain States. This paper introduces an information-theoretic approach to bridge the gap between understanding node and edge importance in brain networks (corresponding to states) and classifying those states. We note that information-theoretic centrality metrics have been proposed before, although in a different setting. Information-theoretic approaches have been used in communication engineering since the seminal 1949 paper of Shannon 29 . Information-theoretic concepts have also been applied to understand different types of complex systems, e.g., in chemical graph theory 30,31 . From a structural

Results
This section proposes the information-theoretic metrics for analyzing networks in order to extract important nodes and edges. It also demonstrates the classification results of applying node entropy and edge entropy to two different conditions on human brain networks. First, graph entropy, sub-graph entropy, node entropy and edge entropy are illustrated using a simple example. Second, important regions and edges are ranked based on the change in group (node and edge) entropy. Third, node and edge entropy values are used to design classifiers for classifying two connectivity states for the emotion and gambling tasks. The classification performance is compared with state-of-the-art network metrics for classification of states. The performance is also compared with a recently developed tensor-based model for task prediction. Fourth, we compare the graph entropy based centrality measure with commonly used centrality measures such as degree, betweenness, eigenvector and leverage. A comparison of graph entropy based centrality with structural centrality is also shown in Subsection S.7 and Fig. S13 in Supplementary Information. In addition, regions found through graph entropy are compared with the ones extracted by GLM and NBS. Lastly, the group-level differences of the whole brain network between task vs. no-task (or task 1 vs. task 2) are investigated.
The brain region parcellation is based on 38 . In this paper, for all subsequent brain networks, we use the regions of interest (85 in total) as defined in 38 .

Illustration of Graph Entropy. For a graph G = (V, E), let two nodes be v_i and v_j. The weight of the edge between the two nodes v_i, v_j is denoted by e_ij. We illustrate the approach to calculating graph entropy using the example graph shown in Fig. 2. (Fig. 2 caption: (b) Sub-graph from the example in (a), shown before (left) and after (right) normalization of edge weights; this 5-node sub-graph's normalized edge weights sum to 1. (c) Sub-graphs associated with node 2 (left) and node 4 (right) from the example in (a); to calculate the sub-graph entropy, the edge weights are normalized; these 4-node sub-graphs' normalized edge weights sum to 1. (d) Sub-graph containing edge 1-2 from the example in (a); this 5-node sub-graph's normalized edge weights sum to 1.)
The example graph depicted in Fig. 2(a) consists of 7 nodes and 10 edges. For simplicity, assume that the edge weights are already normalized, i.e., they sum to 1. In this scenario, we can calculate the graph entropy as follows.
• Identify the normalized edge weights q_ij and form the matrix Q such that Q(i, j) = q_ij.
• Calculate the entropy as H(G) = −Σ_(i,j) q_ij log2 q_ij.

In this example, a sub-graph is shown in Fig. 2(b). The entropy of this sub-graph, in bits, is calculated from its normalized incidence matrix using the same formula. Note that this sub-graph entropy is less than the entropy of the full graph, indicating that the sub-graph contains less randomness than the graph as a whole.
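The two steps above can be sketched in a few lines; `graph_entropy` is a hypothetical helper that normalizes a list of edge weights and applies the Shannon formula in bits:

```python
import math

def graph_entropy(weights):
    """weights: edge weights of a graph or sub-graph.
    Normalise them to sum to 1, then compute the Shannon
    entropy H = -sum(q * log2(q)) in bits."""
    total = sum(weights)
    q = [w / total for w in weights]
    return -sum(p * math.log2(p) for p in q if p > 0)

# a graph whose 4 edges carry equal weight has entropy log2(4) = 2 bits
print(graph_entropy([0.25, 0.25, 0.25, 0.25]))  # -> 2.0
```

Because the weights are renormalized inside the function, the same helper applies unchanged to any sub-graph: pass only the sub-graph's edge weights.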
The importance of a graph node can be regarded as dependent on the entropy of the sub-graphs in its immediate neighborhood. In order to calculate the entropy of the sub-graphs surrounding a node, we need to extract the structure of the sub-graphs containing that node. After that, based on the sub-graph complexity, we can calculate the sub-graph entropy. In this example, the sub-graphs containing nodes 2 and 4, respectively, are shown in Fig. 2(c). The entropy of the sub-graph related to node 2, in bits, is calculated from its normalized incidence matrix.
On the other hand, the entropy of the sub-graph related to node 4, in bits, is calculated from its normalized incidence matrix. Note that, although the degrees of nodes 2 and 4 are the same, their entropy values are different. The node entropy proposed in this paper is different from vertex strength 40 , where the strength of a vertex is calculated as the sum of the edge weights associated with that vertex.
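A minimal sketch of node entropy, under the simplifying assumption that the sub-graph associated with a node consists of the edges incident to it (the paper's Fig. 2(c) sub-graphs may include further neighborhood edges); it reproduces the observation that two nodes of equal degree can have different entropies:

```python
import math

def node_entropy(adj, v):
    """Entropy (bits) of the edges incident to node v,
    with weights renormalised to sum to 1."""
    w = [adj[v][j] for j in range(len(adj)) if adj[v][j] > 0]
    total = sum(w)
    return -sum((x / total) * math.log2(x / total) for x in w)

adj = [[0.0, 0.4, 0.4, 0.0],
       [0.4, 0.0, 0.0, 0.1],
       [0.4, 0.0, 0.0, 0.1],
       [0.0, 0.1, 0.1, 0.0]]
# nodes 0 and 1 both have degree 2, but node 0's edge weights are
# balanced (entropy 1 bit) while node 1's are skewed (~0.722 bits)
print(node_entropy(adj, 0))  # -> 1.0
```

This also illustrates why node entropy differs from vertex strength: nodes 0 and 1 would receive different strengths (0.8 vs. 0.5) and different entropies, but two nodes with equal strength can still differ in entropy if their weight distributions differ.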
In this example, a sub-graph containing edge 1 − 2 is shown in Fig. 2(d).
The normalized incidence matrix of this sub-graph yields an entropy of H = −Σ q_ij log2 q_ij = 1.5710 bits. This entropy is more than the node entropies calculated before, implying that the edge carries more information.
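Edge entropy can be sketched analogously; here we take the sub-graph of edge i-j to be all edges incident to either endpoint, an assumption made for illustration (the exact sub-graph definition follows the paper's Fig. 2(d)):

```python
import math

def edge_entropy(adj, i, j):
    """Entropy (bits) of the sub-graph of edges incident to node i
    or node j, with weights renormalised to sum to 1 (a sketch;
    the exact sub-graph structure follows the paper)."""
    n = len(adj)
    seen, w = set(), []
    for u in (i, j):
        for v in range(n):
            e = (min(u, v), max(u, v))
            if adj[u][v] > 0 and e not in seen:
                seen.add(e)
                w.append(adj[u][v])
    total = sum(w)
    return -sum((x / total) * math.log2(x / total) for x in w)

adj = [[0.0, 0.4, 0.4, 0.0],
       [0.4, 0.0, 0.0, 0.1],
       [0.4, 0.0, 0.0, 0.1],
       [0.0, 0.1, 0.1, 0.0]]
h = edge_entropy(adj, 0, 1)  # edges 0-1, 0-2 and 1-3, renormalised
```

Because an edge's sub-graph spans the neighborhoods of both endpoints, it typically contains more edges than a single node's sub-graph, which is consistent with the higher entropy observed in the example.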
Average Entropy from a Group of Graphs. In order to infer entropy information from a group of graphs, their sample average can be calculated. In this case, entropy values for each node and edge of each graph are calculated, and the average value across all graphs is computed. This average entropy acts as an unbiased estimator for the group. For proof, see Subsection S.10 in the Supplementary Information.

Importance of Nodes and Edges. Ranking of Regions. The importance of a node can be described by the complexity it contains. If the sub-graph entropy is able to explain most of the complexity of the network, then those sub-graphs are more important. In other words, if node entropy is higher, then that node is more important in the whole network. Hence, we rank the regions based on the node entropy H_G(v_i). From a group of graphs, node entropy is calculated for each node of every graph in the group. We then calculate the average of each node entropy over the whole group and rank the vertices based on the group-averaged node entropy. The algorithm to rank the regions based on node entropy is given in Algorithm 1. The ranking pipeline is also illustrated in Fig. S1 in Supplementary Information.
This scheme can be seen as maximizing the mutual information between a sub-graph and the whole graph. We provide a proof in Supplementary Information Subsection S.11.
We use the node entropy to rank the regions of the brain that are most important for the different conditions (emotion task, gambling task, no-task) using Algorithm 1. The result of the ranking process for the emotion task is shown in Table 1 and Fig. S2 in Supplementary Information. The regions of importance were consistent for almost every state, i.e., the regions that carried the most entropy did not change between task and no-task states.

Ranking of Edges. As before, the edge entropy of each edge for every graph is calculated from a group of graphs. We then compute the average of each edge entropy for the whole group and rank the edges based on the group-averaged edge entropy. The algorithm to rank the edges based on edge entropy is given in Algorithm 2.

Table 1. Region entropy and edge entropy rankings.
Edge entropy is then used to rank the functional edges of the brain by importance for the different conditions (emotion task, gambling task, no-task) using Algorithm 2. The result of this ranking process for the emotion task is shown in Table 1. The top-100 active edges are shown in Fig. S3 in Supplementary Information. The top-ranked edges were consistent for every state. In all the states, the most important edges are those criss-crossing the two hemispheres. Also, the edges are mostly concentrated in the frontal regions of the brain. This is also consistent with the nodes found in the regional ranking for each separate condition.

Ranking based on Differential Entropy. If the communication pattern among brain regions changes between two groups of tasks (or task vs. no-task conditions), then the change in pattern can be captured using the above-mentioned ranking procedure. In this scenario, the regions or links with the largest change in entropy between the two groups play a significant role in discriminating the two classes. Suppose, for region v_i, the node entropy for subjects belonging to group G1 (where G1 ∈ {Emotion, Gambling}) is given by H_G1(v_i) and, for group G2 (where G2 ∈ {No-task, Other Task}), by H_G2(v_i). The difference between these two values captures the change in graph entropy between the two groups of subjects for region i. We calculate the change in entropy (defined as the differential entropy) as |H_G1(v_i) − H_G2(v_i)|, where |x| is the absolute value of x. We then rank the regions in decreasing order of this value. The results from our experiment show empirically that this ranking can capture the significant distinguishing regions between two groups. The same argument and ranking procedure can be applied to edges as well. The algorithm is described in Algorithm 3.
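The differential entropy ranking just described can be sketched as follows; the helper name and toy data are hypothetical, and the same procedure applies to edges by substituting per-edge entropy vectors:

```python
def rank_by_differential_entropy(group1, group2):
    """group1, group2: lists of per-subject entropy vectors, one
    entropy value per region.  Average within each group, then rank
    regions by |H_G1(v) - H_G2(v)| in decreasing order."""
    n = len(group1[0])
    mean1 = [sum(s[i] for s in group1) / len(group1) for i in range(n)]
    mean2 = [sum(s[i] for s in group2) / len(group2) for i in range(n)]
    diff = [abs(a - b) for a, b in zip(mean1, mean2)]
    return sorted(range(n), key=lambda i: diff[i], reverse=True)

# toy data: region 1's entropy changes most between the two groups
task    = [[0.9, 1.8, 1.1], [1.1, 2.2, 0.9]]
no_task = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
print(rank_by_differential_entropy(task, no_task))  # -> [1, 0, 2]
```

Note that a region can rank highly here even if its absolute entropy is modest in both groups; what matters is the between-group change, which is why this ranking differs from the single-condition ranking.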
Some regions show the maximum change of entropy between two states. Although these regions may not be among the most complex regions, they provide the maximum change of information between the two states. We extract the regions that are important from the perspective of change of information in Table 2 for the different tasks. (Fig. 3 caption: Visualization of the important regions that have the highest differential entropy between two states for emotion task vs. no-task. Red: regions with higher node entropy during the emotion task; blue: regions with higher node entropy during no-task.)

Algorithm 3. Ranking of Regions and Edges for Two Groups based on differential entropy.

The corresponding regions of interest for emotion vs. no-task are shown in Fig. 3. In addition, the regions of interest for gambling vs. no-task and emotion vs. gambling are shown in Figs. S4 and S5, respectively (Supplementary Information).
The change in ranking for emotion vs. no-task was the highest for the fusiform cortex in the right hemisphere. For emotion vs. gambling, the regions with the maximum change in ranking for the individual tasks are: the left-hemisphere banks of the superior temporal sulcus, the left caudal anterior cingulate and the right fusiform cortex. To facilitate visualization of the edge ranking procedure, the top-ranked edges are overlaid on a brain template. Following the group ranking procedure based on edge entropy, the top 100 edges for each group are identified (e.g., for emotion vs. no-task). (Figure caption: Visualization of the important edges that have the highest differential entropy between two states for emotion task vs. no-task. Red: regions with higher node entropy during the emotion task; blue: regions with higher node entropy during no-task; yellow: regions that are not significant based on node entropy.) A close inspection of the results reveals several observations. First, the group ranking procedure reveals edges that are distributed throughout the whole brain, some of which criss-cross the hemispheres. Second, differential entropy elevates the edges that belong to frontal-parietal and frontal-subcortical areas, e.g., the frontal lobe, parietal lobe, temporal lobe, cingulate gyrus, limbic system, striatum, thalamus, brain stem, and amygdala.

Performance of Classifying Two Brain States. The leave-one-out classification performance using the top-25 region entropies and the significant edge entropies is shown in Table 3. The classification performance is compared with state-of-the-art network metrics for nodes. In addition, the classification performance for edges is also compared to NBS measures. A number of classifiers were tested, e.g., support vector machine (SVM), random forest, naive Bayes, and logistic regression. All the classifiers perform similarly with respect to the features.
Therefore, the results from the support vector machine with a radial basis function kernel are presented for illustration. The hyperparameters of the classifiers were tuned using in-fold validation. Among node-based and edge-based features, respectively, the support vector machine classifier with a radial basis function kernel and node entropy features performs best, achieving the highest accuracy, specificity and sensitivity for classifying two states.
Intersection and Union Sub-Graphs: Two sub-graphs are created from the intersection and union of the top regions and edges to compute sub-graph entropies for the different groups. The intersection sub-graph contains the subset of edges associated with the nodes of the top-25 regions. The union sub-graph contains the top-25 regions and the significant edges. The node and edge entropies associated with the union and intersection sub-graphs are also used for classification. These results are summarized in Table 3.
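The leave-one-out protocol used for all the classification results can be sketched generically as below. The paper uses an RBF-kernel SVM; here a trivial nearest-centroid classifier is substituted purely so the sketch stays self-contained, and any fit/predict pair (e.g. scikit-learn's `SVC(kernel="rbf")`) plugs into the same loop:

```python
def leave_one_out_accuracy(features, labels, fit, predict):
    """Train on all subjects but one, test on the held-out subject,
    repeat for every subject, and report the fraction correct."""
    correct, n = 0, len(features)
    for i in range(n):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        correct += predict(model, features[i]) == labels[i]
    return correct / n

# stand-in classifier: nearest class centroid
def fit_centroid(xs, ys):
    cents = {}
    for c in set(ys):
        pts = [x for x, y in zip(xs, ys) if y == c]
        cents[c] = [sum(col) / len(pts) for col in zip(*pts)]
    return cents

def predict_centroid(cents, x):
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return min(cents, key=lambda c: dist(cents[c], x))

# toy data: two well-separated classes in a 2-D feature space
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 1.0], [1.0, 0.9]]
y = [0, 0, 1, 1]
print(leave_one_out_accuracy(X, y, fit_centroid, predict_centroid))  # -> 1.0
```

In the paper's setting, each feature vector would hold a subject's top-25 node entropies (or significant edge entropies) and each label the condition (task vs. no-task, or task 1 vs. task 2).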
When we utilize the regional centrality measures based on the regions of Table 2 to classify task vs. no-task states or emotion vs. gambling states, the classifier achieves very good area under the curve (AUC) values (shown in Fig. S8 in Supplementary Information). Compared to the other centrality measures, the proposed centrality achieves consistently better prediction over the whole range of the receiver operating characteristic (ROC). Using edge entropies, the proposed classifier achieves very good mean AUC values, as shown in Fig. S9 in Supplementary Information.

Statistical Analysis of Results. Significance of Regions and Edges. The statistical significance of the top-ranked regions with the highest change in node entropy is investigated using a nonparametric permutation t-test applied separately to each highly ranked region. For emotion vs. no-task, out of the 25 regions shown in Table 2, the top 11 have a significant change in node entropy. For gambling vs. no-task, the top 15 regions have a significant change in node entropy. The same procedure, using the t-test, is also carried out for the four other centrality measures, i.e., degree, betweenness, eigenvector and leverage centrality. The significant regions found using the other centrality measures are shown in Tables S3 and S4 in Supplementary Information. The node entropy measure is always able to extract the regions found to be significant by the other measures. In addition, it finds other important regions not found by the state-of-the-art centrality measures. For the emotion task, the regions shown to be significant by node entropy, but not by other measures, include: left hippocampus, left amygdala, left accumbens, right caudate, right pallidum and right transversetemporal. Similarly, for the gambling task, the regions shown to be significant by node entropy, but not by other measures, include: left pericalcarine, right pericalcarine, right postcentral and right transversetemporal.
For edges, a nonparametric permutation t-test is carried out using edge entropy values on all edges, and the statistically significant edges are found using p = 0.05 with Bonferroni correction. The sub-networks containing the significant edges consist of the top-ranked edges from Algorithm 3. The numbers of edges with a significant change in edge entropy are 102, 118 and 83, respectively, for emotion vs. no-task, gambling vs. no-task and emotion vs. gambling. The sub-network containing these edges is shown in Fig. 4 for emotion vs. no-task.

Stability of Top Regions and Edges.
We use a rigorous leave-one-out technique to rank regions and edges in order to understand the stability of our method [41][42][43][44] . We run the proposed algorithm (Algorithm 3) 475 times, each time leaving one subject out and ranking the regions and edges based on Algorithm 3. We find that the top regions and edges obtained from this leave-one-out method are very stable, as shown by their histograms. For the emotion task, the top 21 regions (from Table 2) were ranked highly in all 475 runs; the remaining regions came up 470, 375, 360 and 325 times, respectively. Of the significant edges for the three tasks, 75%, 85% and 80% of the edges, respectively, came up 475 times. The numbers of occurrences of the regions (and edges) among the top-25 (and the significant edges, respectively) are illustrated in Figs S10 and S11 (Supplementary Information), respectively. The histogram for each case is quite flat, signifying that the important regions and edges were similar across most subjects. This indicates consistent group-level behavior for classification, i.e., the same features are used for classifying two states.
Quantifying Classification Significance. To further establish that the results are better than chance, we perform permutation tests. A permutation test involves computing a trivial baseline using permuted labels, i.e., the accuracy produced if there were "no signal" between the features and the label. We then determine whether our learned model performs significantly better than this baseline. Here, for each dataset (emotion vs. no-task, gambling vs. no-task, emotion vs. gambling), we performed 1000 iterations: each time, we randomly permuted the subject labels to effectively remove any relationship between the input features and the label, then trained a model on the training subset of this set and tested it on the remaining subset. Fig. S12 shows the distributions of accuracy scores for the three datasets. In each case, there is a significant difference between the centers of the distributions and the accuracy obtained by node entropy (p = 4.8213 × 10^−8, 7.7689 × 10^−11 and 9.8659 × 10^−10, respectively, for the three tasks). The same conclusion holds for edge entropy. In addition to the permutation tests, we use a binomial test to compare the leave-one-out classification accuracies (using node and edge entropy) to baseline accuracies, to determine whether each learner is significantly better than previous state-of-the-art classifiers. Node entropy performs significantly better than the next best method (tensor based) for classifying emotion vs. no-task with p = 7.9637 × 10^−7. In addition, it is significantly better than eigenvector centrality for classifying gambling vs. no-task (p = 7.3483 × 10^−4). Edge entropy is also better than the NBS-based methods for classification, with p = 4.0653 × 10^−4, 1.5673 × 10^−15 and 5.8537 × 10^−6, respectively, for the three classification tasks.
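The permutation test described above can be sketched as follows; the evaluator and data are toy stand-ins for the actual leave-one-out classifier, and the function name is our own:

```python
import random

def permutation_p_value(features, labels, evaluate, n_perm=200, seed=0):
    """Shuffle the labels to destroy any feature-label relationship,
    re-evaluate each time, and report the fraction of permuted runs
    that match or beat the accuracy on the true labels."""
    observed = evaluate(features, labels)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if evaluate(features, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one smoothing avoids p = 0

# toy evaluator: accuracy of thresholding a single feature at 0.5
def threshold_accuracy(xs, ys):
    preds = [1 if x > 0.5 else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

X = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y = [0, 0, 0, 1, 1, 1]
p = permutation_p_value(X, y, threshold_accuracy)
```

A small p indicates that the observed accuracy is unlikely under label-shuffled data, which is the sense in which the paper's classifiers are shown to beat the "no signal" baseline.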
The highest classification performance is achieved using node and edge entropy features associated with the union sub-graph.
Comparison of Node Entropy based Importance with Other Measures. To understand the relationship between the proposed measure and other well-known centrality measures in the fMRI literature, we use a scatter plot of the node entropy values for both task and no-task conditions against the other centrality measures in Fig. 5 for the emotion task. The gambling task follows a similar pattern and is not shown here. In addition, we calculate the mean correlation values of the centrality measures for a group of graphs (both simulated and real-world) in Table 4. The simulated graphs are first constructed using 85 nodes, with edge weights drawn from a uniform distribution on (0, 1). Next, the graphs are made sparse to match the sparsity of the real networks. For each graph, node entropies are measured and correlated with the other centrality measures. The average and standard deviation of the correlation values are then calculated. Correlation values are similarly calculated for the data from the emotion and gambling tasks. The scatter plot and the table indicate that our proposed centrality measure has very low correlation with degree, betweenness and leverage centrality, although it has a somewhat high correlation with eigenvector centrality. This implies that graph entropy provides a different dimension of importance in comparison with degree, betweenness and leverage, and somewhat similar information to eigenvector centrality.
We also performed GLM analysis of the two tasks. Based on the value of regression coefficients, we ranked the regions associated with each task separately. The ranked regions are shown in Table 5.

Comparison of Graph Entropies between Two States. The total graph entropy values between states corresponding to two conditions (task vs. no-task time points, or task 1 vs. task 2) are also compared. After calculating the two types of graph entropies for each subject, a one-sided t-test is carried out to determine whether the two states are significantly different. Graph entropy based p-values for the functional connectivity states are shown in Table 6. All the changes are statistically significant (p < 0.05). The corresponding group mean entropy values are also plotted and compared for the two different states (task vs. no-task conditions or others) in Fig. S14 in Supplementary Information. We use a standard box plot to visualize the span of entropy values for each group. For classification between two states, this feature achieves an area under the curve (AUC) greater than 0.7 in all cases. The sub-graph entropies of the two sub-graphs are also compared across different tasks and illustrated as box plots in Figs S15 and S16, respectively, in Supplementary Information.

Discussion
The important regions and edges extracted using only one condition are similar across all subjects. They are concentrated mainly in the frontal part of the brain. There are no significant differences between the important regions and edges for different conditions. These regions and their connections are commonly used by the brain to transfer information during tasks. Many of the significant regions are in the anterior cingulate gyrus, ventromedial frontal cortex, and inferior parietal brain regions. These regions are consistent with the previous works by Cole et al. 45 , Tomasi et al. 46 and Zuo et al. 18 . We provide theoretical justifications in Supplementary Information Subsections S.10, S.11 and S.12 for using edge strength and average graph entropy as measures of the group-level behavior of states, and show that maximizing sub-graph entropy leads to maximizing the mutual information between a sub-structure and the whole graph. Some of the regions extracted using one condition are small and noisy regions, such as the left and right temporal poles. These regions are ranked lower when differential entropy is used. Generally, smaller and noisier regions do not rank highly when differential entropy is used.

Emotion Task. Our definition of important regions between two different conditions, based on the change of information flow, could also extract the regions most responsible for the tasks. We also identify a number of useful brain functional areas that are activated mainly during emotion tasks as significant regions between task and no-task networks. These areas are the amygdala, caudate region, fusiform, striatum, and basal ganglia. The fusiform gyrus was identified as one of the main regions for face information processing by McCarthy et al. 47 . This region has also been identified as one of the main regions for face emotion processing 48,49 . We find this region among the top-5 regions in our ranking.
The pallidum, part of the basal ganglia, is also a very important region in terms of emotion processing. The nucleus accumbens area (in both the right and left hemispheres) is also identified as a significant region. The nucleus accumbens has been shown to be an important area for emotional processing in [50][51][52] . Specifically, Floresco et al. 52 hypothesize it to be an intermediary region regulating cognition and action. Areas of the anterior cingulate cortex have been related to cognition and emotion 53 . Moreover, regions of the anterior cingulate cortex (ACC) are related to intelligent behavior, i.e., emotional self-control, focused problem solving, error recognition, and adaptive response to changing conditions 54 . Also, Etkin et al. 55 showed its involvement in processing negative emotional stimuli. We find that hippocampal areas show significant changes both for emotion vs. no-task and emotion vs. gambling. The hippocampus has been correlated with emotional responses and acts in conjunction with the amygdala in the processing of emotional situations. The amygdala and hippocampal areas, two medial temporal lobe structures, are linked to two independent memory systems, each with its own characteristic functions. When a person faces emotional stimuli, the two regions interact to give rise to specific responses. Specifically, the amygdala can affect both the formation and the storage of memories that depend on hippocampal activation 56 . The hippocampus is associated with the amygdala response by forming episodic representations of the emotional stimuli. Although these regions are independent with respect to memory organization, they act in concert when emotional stimuli meet memory representations 56 .
The emotion task, based on visual face information, has a strong effect on regions of the visual cortex, specifically the V1 areas. The calcarine sulcus areas of both the right and left hemispheres show the largest change in information flow for both regions and edges. Areas of the parietal lobule are also identified as important regions for explaining the functional network. These regions may have been prominent because they have been shown to be responsible for processing higher-order facial features 57 . One of the surprising findings is the ranking of the caudate nucleus as an important region during the task. The caudate has generally been correlated with emotional processing, but not with respect to the reaction to the preference of face pictures 58,59 . It has also been identified as a neural correlate of emotion-based heart rate variability 60 . Hence, apart from the main hub locations for angry or fearful emotions, the subjects' brains may also try to process multiple dimensions of the visual stimuli. The edges extracted as important also support this regional involvement, as most of the regions in the edges are similar to those in Table 2. All the regions and edges have p-values < 0.05, indicating that they are statistically significant as well.
Gambling Task. The regions that show a significant change in information belong to the reward circuitry of the brain. Specifically, regions of the orbitofrontal cortex 61 , limbic system (amygdala, hippocampus) and basal ganglia nuclei (pallidum and striatal areas such as the caudate) show the largest change in entropy between gambling and no-task. Another area shown to be involved by the proposed ranking method is the nucleus accumbens. Knutson et al. have shown that activation in the nucleus accumbens is prominent in people performing a gambling task. However, it is conjectured that this activity is associated with the anticipation of reward. This further reinforces the efficacy of differential entropy for the ranking process using the gambling task without a monetary reward 62,63 . Moreover, reward processing is also correlated with reward-related functional activation in the nucleus accumbens 64 . In the case of reward prediction, a behavior invoked by the gambling task, significant activity is seen in the lateral orbitofrontal cortex and the striatum 65 . The basal ganglia region striatum is seen to be related to differentiating rewards from non-rewards 66 . The human limbic system is associated with neural responses to reward prediction 67 . In particular, the difference between the actual gain and the expected gain is associated with a neural circuitry of the mesolimbic dopamine system 68 . The gambling task also invokes areas related to decision making, e.g., the amygdala. Previous studies have shown that amygdala damage can interfere with decision making 69 . The amygdala is a critical part of the neural system that triggers somatic states from primary inducers, bringing back emotions for a secondary event. Functional disconnectivity of the amygdala regions has been shown to impair the acquisition of gambling tasks in rats and to alter their decision-making behavior 70 . The anterior cingulate cortex's involvement in cognition and conflict monitoring is well documented.
Specifically, findings have posed challenges concerning the way the anterior cingulate cortex processes errors 71 . The dorsal ACC in adults is also active when making risky selections; furthermore, reduced activity in these areas is correlated with greater risk-taking when making risky economic choices 72 . Other studies also suggest that the anterior cingulate is significantly correlated with performance on the gambling task 73 and with risk anticipation 74 . In addition, we extract significant regions from the frontal and parietal lobes whose entropies changed significantly during the gambling task. As before, the top-ranked edges also support the regional involvement, as most of the regions in those edges match the regions shown in Table 2. All the regions and edges had p-value < 0.05, indicating that they are statistically significant as well.

Graph entropy values can be used as a representative metric for neural state. In addition, the sub-graph entropy metric can be used to extract regions and edges that differ significantly between two states. Some of the regions found by sub-graph entropy are similar to those found by the traditional GLM (Table 5). By incorporating biologically meaningful regions, the edges extracted through the differential entropy based ranking procedure also outperform other centrality measures for classifying two states. Furthermore, the centrality information conveyed by graph entropy is different from degree, betweenness and leverage centrality: the scatter plots between node entropy and the other centralities (Fig. 5) are flat and wide, implying very little overlap in information content. Many of the regions extracted through sub-graph entropy are different, which indicates that sub-graph entropy conveys different information regarding functional connectivity compared to traditional methods.
Methods

Dataset. Two task-fMRI datasets collected from 475 subjects in the Human Connectome Project (HCP) Young Adult study 6,16 were used in this paper. The tasks chosen were emotion and gambling. These data are publicly available from the ConnectomeDB database at https://db.humanconnectome.org. All data were acquired on a customized Siemens 3 T Connectome Skyra scanner with the following parameters: task-fMRI was obtained with 2 mm isotropic voxels, TR = 720 ms and TE = 33.1 ms. The emotion processing task was carried out in two runs of 2:16 min with 176 frames per run. The gambling task lasted 3:12 min with 253 frames per run, for two runs 75,76 .

Description of Task. Emotion. This task was adapted from the one developed by Hariri et al. 77 . Participants are presented with blocks of trials that ask them to decide either which of two faces presented at the bottom of the screen matches the face at the top of the screen, or which of two shapes presented at the bottom of the screen matches the shape at the top of the screen. The faces have either an angry or a fearful expression. The task format is illustrated in Fig. 6. Six trials of the same task (face or shape) are repeated, with the stimulus presented for 2000 ms and a 1000 ms inter-task interval (ITI). Each block is preceded by a 3000 ms task cue ("shape" or "face"), so each block is 21 seconds long including the cue. Each of the two runs includes 3 face blocks and 3 shape blocks, with 8 seconds of fixation at the end of each run. The task is described based on the WU-Minn HCP 500 Subjects Data Release Manual available from https://www.humanconnectome.org/.
Gambling. This task was adapted from the one developed by Delgado et al. 78 . Participants play a card guessing game in which they are asked to guess the number on a mystery card (represented by a question mark "?") in order to win or lose money. Participants are told that potential card numbers range from 1 to 9 and to indicate whether they think the mystery card number is more or less than 5 by pressing one of two buttons on the response box. Feedback is the number on the card (generated by the program as a function of whether the trial is a reward, loss or neutral trial) together with either: 1) a green up arrow with "$1" for reward trials; 2) a red down arrow next to "-$0.50" for loss trials; or 3) the number 5 and a gray double-headed arrow for neutral trials. The "?" is presented for up to 1500 ms (if the participant responds before 1500 ms, a fixation cross is displayed for the remaining time), followed by feedback for 1000 ms. There is a 1000 ms inter-task interval with a "+" presented on the screen. The task is presented in blocks of 8 trials that are either mostly reward (6 reward trials pseudo-randomly interleaved with either 1 neutral and 1 loss trial, 2 neutral trials, or 2 loss trials) or mostly loss (6 loss trials pseudo-randomly interleaved with either 1 neutral and 1 reward trial, 2 neutral trials, or 2 reward trials). In each of the two runs, there are 2 mostly-reward and 2 mostly-loss blocks, interleaved with 4 fixation blocks (15 seconds each). The task format is shown in Fig. 7. The task is described based on the WU-Minn HCP 500 Subjects Data Release Manual available from https://www.humanconnectome.org/.

Preprocessing. The HCP task-fMRI data were first processed following the HCP "fMRIVolume" pipeline (v3.4) 79 , which includes gradient unwrapping, motion/distortion correction, registration to the structural scan, nonlinear registration into MNI152 space, and intensity normalization, as reported in 9 .
Subsequently, spatial smoothing and activation map generation using the generalized linear model implemented in FSL's FILM (FMRIB's Improved Linear Model with autocorrelation) 80 were performed. Additional details about the HCP "fMRIVolume" pipeline can be found in Barch et al. 76 . Using the Freesurfer cortical parcellation atlas 38 , 85 regions of interest were identified, as listed in Table S5 in the Supplementary Information. An illustration of this pipeline is shown in Fig. 8. The mean time-series of the voxels in every region were then extracted for each subject, separately for the task and no-task conditions. The task blocks (respectively, no-task blocks) were concatenated for each subject and each region corresponding to task (respectively, no-task). Linear, quadratic and cubic trends were then removed from these time-series.
Modeling the Brain Graph from fMRI. After mean time-series are extracted from predefined anatomical regions 38 from fMRI, a matrix of size R × T (note that R = |V|) is generated, where R is the number of regions and T is the number of time points. A node in the brain graph corresponds to a region of interest and is associated with one mean time-series. The absolute value of the Pearson correlation coefficient between two mean time-series is the weight of the edge connecting the two corresponding nodes; taking the absolute value ensures that all edge weights are nonnegative. Absolute values of the Pearson correlation coefficients are computed separately for the task and no-task states as defined before. Specifically, the network connectivity for a task is constructed from fMRI time points when a subject is performing a task (e.g., emotion, gambling) during a t-fMRI experiment [12][13][14][15] . The network connectivity for no-task is constructed from fMRI time points when a subject is not performing a task during a t-fMRI experiment 17 . Hence we obtain two adjacency matrices for each subject. The mapping process is shown in Fig. 9. Each adjacency matrix is made sparse by keeping the top correlating edges, choosing the same sparsity level S for all subjects.

Centrality Measures. Throughout our analysis, we assume that an undirected brain network is given by G = (V, E), where V is the set of vertices or nodes and E is the set of weighted edges. The number of nodes is |V|, which equals the number of regions or neuronal units (R), and the number of edges is |E|. In this section, we first define graph entropy based on the edge weights of the graph.
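The graph construction above (absolute Pearson correlation per node pair, followed by sparsification that keeps only the top correlating edges) can be sketched as follows. The paper used MATLAB; this is a minimal pure-Python illustration, and the function name and the `sparsity` parameter are illustrative, not from the paper.

```python
import math

def build_brain_graph(ts, sparsity=0.25):
    """Build a sparse, undirected brain graph from regional mean time-series.

    ts: list of R lists, each of length T (one mean time-series per region,
        assumed non-constant so standard deviations are nonzero).
    sparsity: fraction of strongest-|correlation| edges to keep.
    Returns an R x R adjacency matrix of absolute Pearson correlations.
    """
    R, T = len(ts), len(ts[0])
    means = [sum(x) / T for x in ts]
    sds = [math.sqrt(sum((v - m) ** 2 for v in x) / T) for x, m in zip(ts, means)]
    adj = [[0.0] * R for _ in range(R)]
    edges = []
    for i in range(R):
        for j in range(i + 1, R):
            cov = sum((ts[i][t] - means[i]) * (ts[j][t] - means[j])
                      for t in range(T)) / T
            w = abs(cov / (sds[i] * sds[j]))   # absolute Pearson correlation
            edges.append((w, i, j))
    edges.sort(reverse=True)                   # strongest edges first
    for w, i, j in edges[: int(sparsity * len(edges))]:
        adj[i][j] = adj[j][i] = w              # keep only the top fraction
    return adj
```

With `sparsity=1.0` every edge is kept, so perfectly correlated or anticorrelated regions both receive weight 1.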
Edge Weight of Graph. The edge weight e_ij between two nodes (v_i, v_j) is defined as the absolute value of the Pearson correlation coefficient between their corresponding time-series. Thus the edge weight e_ij is proportional to the magnitude of the correlation between the two time-series:

e_ij = | E[(X_i − E[X_i])(X_j − E[X_j])] | / (σ_i σ_j),

where E[X] represents the average value of a random variable X and σ_i is the standard deviation of the time-series X_i at node v_i. This implies that a higher e_ij means the two nodes behave more similarly, i.e., their interaction is stronger. Hence, the probability of communication between v_i and v_j is taken to be proportional to e_ij. We used four centrality measures for comparison, namely degree, betweenness, eigenvector and leverage 22 .
Degree Centrality. The degree 24 of node i is the number of neighbors connected to node i.

Eigenvector Centrality. Eigenvector centrality 25 e_i is calculated by Equation 1:

e_i = (1/λ) Σ_j a_{i,j} e_j. (1)
Here a_{i,j} is the (i, j)-th entry of the adjacency matrix of the graph and λ is a constant.
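The fixed-point relation in the eigenvector-centrality definition can be computed by power iteration: repeatedly multiply the current score vector by the adjacency matrix and renormalize. A stdlib-Python sketch (the function name and iteration count are illustrative):

```python
def eigenvector_centrality(adj, iters=200):
    """Power iteration: e <- A e, renormalized each step, converges to the
    leading eigenvector, i.e. e_i = (1/lambda) * sum_j a_ij * e_j."""
    n = len(adj)
    e = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(adj[i][j] * e[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        e = [v / norm for v in nxt]            # keep unit length
    return e
```

On a small non-bipartite graph (a triangle 0-1-2 with a pendant node 3 attached to node 0), node 0 receives the highest centrality because its neighbors are themselves central.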
Betweenness Centrality. The betweenness centrality 26 of node i, b_i, is defined by Equation 2:

b_i = Σ_{x ≠ i ≠ y} g_{xiy} / g_{xy}. (2)

Here g_{xy} is the number of shortest paths between any two nodes x and y, and g_{xiy} is the number of those paths that pass through node i.
Leverage Centrality. Leverage centrality l_i is a measure of the relationship between the degree of a given node (k_i) and the degree of each of its neighbors (k_j), averaged over all neighbors N_i, as reported in 22 , and is defined by Equation 3:

l_i = (1/|N_i|) Σ_{j ∈ N_i} (k_i − k_j) / (k_i + k_j). (3)

The following two metrics are used for statistical comparison with the graph entropy metrics.
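Leverage centrality is simple to compute directly from the adjacency matrix. A small stdlib-Python sketch, assuming the averaged degree-difference definition of Joyce et al. (function name illustrative):

```python
def leverage_centrality(adj):
    """l_i = mean over neighbors j of (k_i - k_j) / (k_i + k_j),
    where k_i is the degree (number of neighbors) of node i."""
    n = len(adj)
    nbrs = [[j for j in range(n) if adj[i][j] > 0] for i in range(n)]
    k = [len(nb) for nb in nbrs]               # node degrees
    return [sum((k[i] - k[j]) / (k[i] + k[j]) for j in nbrs[i]) / k[i]
            if k[i] else 0.0                   # isolated nodes get 0
            for i in range(n)]
```

For a star with three leaves, the hub (degree 3) gets leverage +0.5 and each leaf (degree 1) gets −0.5, reflecting that the hub dominates all of its neighbors.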
Generalized Linear Model. The generalized linear model 23 is a multiple regression of event blocks onto the fMRI time-series. If there are two conditions, e.g., task and no-task, regression coefficients are estimated for each condition on each time-series; their differences describe the activation map for each region. The regression coefficients are computed using the ordinary least squares technique 81 .
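For a single task regressor plus an intercept, the ordinary least squares coefficient reduces to a covariance-over-variance ratio. A minimal sketch of that special case (real GLM analyses such as FSL's FILM use multiple regressors and model temporal autocorrelation; the function name is illustrative):

```python
def ols_beta(regressor, ts):
    """OLS slope of a time-series ts on one task regressor (with intercept):
    beta = cov(x, y) / var(x)."""
    n = len(regressor)
    mx = sum(regressor) / n
    my = sum(ts) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(regressor, ts))
    var = sum((x - mx) ** 2 for x in regressor)
    return cov / var
```

For a boxcar regressor [0, 0, 1, 1] and a signal [1, 1, 3, 3], the estimated activation beta is 2.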
Structural Centrality. The structural centrality C(G) of a network is defined in 82 , where R is the number of nodes. If C(G) is high, then the network is more central, i.e., it is influenced by a few leading nodes. A comparison of structural centrality and node entropy is shown in Subsection S.7 in the Supplementary Information.

Graph Entropy. The probability of correlation between nodes (v_i, v_j) is obtained by normalizing the edge weights:

q_{i,j} = e_{ij} / Σ_{k,m} e_{km},

where q_{i,j} is the probability of correlation between nodes (v_i, v_j). It is easy to see that Σ_{i,j} q_{i,j} = 1. Note that the q_{i,j}'s can also be identified as entries of the normalized incidence matrix Q of graph G such that Q(i, j) = q_{i,j}.
This definition allows us to define the graph entropy as

H(G) = − Σ_{i,j} q_{i,j} log2 q_{i,j}. (6)

H(G) can be seen as the total amount of uncertainty in the whole network, and its unit is bits. This entropy measure was introduced in 35 . Graph entropy has an inverse relationship with structural centrality 82 .
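The graph entropy H(G) = −Σ q_{i,j} log2 q_{i,j} can be computed directly from the normalized edge weights; a stdlib-Python sketch (function name illustrative):

```python
import math

def graph_entropy(adj):
    """Total network uncertainty in bits: edge weights are normalized so the
    q_ij sum to 1, then H(G) = -sum q_ij * log2(q_ij) over all edges."""
    n = len(adj)
    weights = [adj[i][j] for i in range(n) for j in range(i + 1, n)
               if adj[i][j] > 0]
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights)
```

This also illustrates the two properties of H(G): a graph with a single edge (one q = 1) has entropy 0, while a triangle with equal weights attains the maximum log2(3) ≈ 1.585 bits for three edges.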
Some mathematical properties of graph entropy as in Eq. 6 that are of interest are listed below.
• If some q_{i,j} = 1, then H(G) = 0. In that case, region i always communicates with region j and no other regions communicate with each other. Here i and j are the leader nodes in the network.
• H(G) takes its maximum value when all q_{i,j}'s are equal. In that case all regions participate equally in the communication process and the system is homogeneous.

Sub-Graph Entropy. The entropy of a sub-graph G_s of G is defined as

H(G_s) = − Σ_{(k,m) ∈ G_s} q′_{k,m} log2 q′_{k,m},

where q′_{k,m} represents the normalized correlation coefficient between nodes (v_k, v_m) within that sub-graph. Node entropy is the sub-graph entropy when the sub-graph consists of a node together with its 1-hop neighbors; we define edge entropy as the sub-graph entropy when the sub-graph consists of an edge together with the 1-hop neighbors of its two end nodes.

Based on their differences in entropy, regions and edges are ranked in descending order. We also calculate the corresponding p-values using a permutation t-test. The regions with a significant change in entropy (p ≤ 0.05) are listed in a table; the edges with a significant change in entropy (p ≤ 0.05) are plotted as a sub-network on a brain template. To check whether the chosen rankings are stable, a leave-one-subject-out scheme was implemented to select the top regions and edges: in each iteration one subject is left out and the regions and edges are ranked based on the other 474 subjects. The occurrences of the most important regions and edges were plotted in a histogram [41][42][43] . To quantify the significance of the classification performance, permutation tests are performed. This involves computing a trivial baseline, namely the accuracy produced after permuting the labels, and then determining whether the learned model performs significantly better. Here, we perform 1000 iterations for each dataset; in each iteration we train a model on the training data and test it on the remaining instances. The classification performance of the proposed model is also compared with baseline methods using binomial tests: the baseline accuracy is used as the parameter of a binomial distribution, and the probability of achieving the accuracy attained by the proposed models is calculated.
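The differential-entropy ranking step can be sketched as follows, assuming node entropy is computed over the 1-hop sub-graph of each node with weights renormalized within that sub-graph (one plausible reading of the definitions above; the function names and the renormalization choice are illustrative):

```python
import math

def node_entropy(adj, i):
    """Entropy of node i's 1-hop sub-graph: weights of edges incident to i
    are renormalized to sum to 1 before applying the entropy formula."""
    w = [adj[i][j] for j in range(len(adj)) if adj[i][j] > 0]
    s = sum(w)
    return -sum((v / s) * math.log2(v / s) for v in w) if w else 0.0

def rank_by_differential_entropy(adj_task, adj_rest):
    """Rank regions by |H_task(i) - H_rest(i)|, largest difference first."""
    n = len(adj_task)
    diffs = [(abs(node_entropy(adj_task, i) - node_entropy(adj_rest, i)), i)
             for i in range(n)]
    return [i for d, i in sorted(diffs, reverse=True)]
```

For example, a region that communicates uniformly with three neighbors during the task (entropy log2 3) but with only one at rest (entropy 0) is ranked first.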
In addition, graph entropy values for regions were correlated with the four other centrality measures. We create scatter plots of regional entropy values vs. each of degree, betweenness, eigenvector and leverage centrality, and calculate the correlation between node entropy and each of the other centralities for every subject. The total graph entropy measure was used to differentiate the task vs. no-task conditions; we use a t-test and effect size to differentiate these two states at the group level. Furthermore, node and edge entropy values are compared using our algorithm, and the top-25 values are used to classify task vs. no-task states in the fMRI scan in each case (region, edge).

Software. MATLAB is used for running the experiments and generating the results. Custom MATLAB code was created for extracting the graph entropy measures. We used the Brain Connectivity Toolbox (BCT) 3 to calculate the centrality metrics. SVM classifiers were designed using the LIBSVM toolbox 83 .

Conclusion
The main contribution of this study is to demonstrate that well-defined brain states can be predicted using sub-graph entropy from t-fMRI data. We showed that there are important nodes and edges in functional connectivity that sufficiently distinguish two different brain states. This paper has introduced the notion of sub-graph entropy in general, and node and edge entropies in particular, to rank regions and edges in brain graphs in a quantitative manner. Results obtained by the proposed method have been compared with those from the generalized linear model (GLM), degree centrality, eigenvector centrality, betweenness centrality, leverage centrality and network-based statistics (NBS). In this paper, node and edge entropies have been defined based on 1-hop neighbors. Whether node and edge entropies defined using 2-hop neighbors provide more accurate prediction of brain network state needs further research. Future work will be directed towards applications of the technique in identifying dynamic states from fMRI tasks as well as from other temporally rich signals such as the electroencephalogram (EEG) 84,85 and magnetoencephalogram (MEG) 86,87 . While node and edge entropies have been used in this paper, identifying sub-graphs corresponding to certain tasks requires further research. Investigating applications of the technique to understand differences between the brain networks of populations with various diseases and healthy controls is also of interest. In many disease prediction applications, filtered versions of time-series have been found to be more discriminative of the disease state [42][43][44]88 . Thus, sub-graph entropy features could be extracted from filtered fMRI and then used for classification; this topic needs to be investigated further.

Data Availability
The datasets analyzed for this study are available to the public from the Human Connectome Project (Open Access Data) ConnectomeDB database.