An algorithm based on positive and negative links for community detection in signed networks

The community detection problem in networks has received a great deal of attention during the past decade. Most community detection algorithms take into account only positive links, so they are not suitable for signed networks. In our work, we propose an algorithm based on random walks for community detection in signed networks. Firstly, the local maximum degree nodes, each of which has a larger degree than its neighbors, are identified, and the initial communities are detected based on these nodes. Then, we calculate a probability for a node to be attracted into a community by positive links based on random walks, as well as a probability for the node to be kept away from the community on the basis of negative links. If the former probability is larger than the latter, then the node is added into the community; otherwise, the node cannot be added into any current community, and a new initial community may be identified. Finally, we use a community optimization method to merge similar communities. The proposed algorithm makes full use of both positive and negative links to enhance its performance. Experimental results on both synthetic and real-world signed networks demonstrate the effectiveness of the proposed algorithm.

signed networks, community detection methods in signed networks need to be developed. The challenge of the community detection problem in signed networks is that the community structure is ambiguous, because there are some negative links within communities and some positive links between communities. To address this challenge, researchers have put forward many community detection algorithms to obtain the best partition of signed networks.
Several algorithms have been extended from community detection algorithms for unsigned networks to solve the community detection problem in signed networks [20]. Yang et al. first proposed the FEC algorithm to detect communities in signed networks based on random walks. Subsequently, several two-stage clustering algorithms have been proposed [21-23]. For instance, the community modularity values are calculated separately from positive and negative links, and the communities are evaluated by the combination of these two modularity values [21]. The GN-H algorithm combines the GN algorithm and a hierarchical clustering algorithm to detect communities in signed networks [22]. Specifically, it uses the GN algorithm to detect communities based on the positive links, and then incorporates the negative links to obtain the final hierarchical clustering results. However, in these two-stage clustering algorithms, the latter stage is always affected by the previous stage, which may limit the performance of the algorithms. Liu et al. first formulated the community detection problem as a multi-objective problem (MOP), but the proposed objective functions still need further optimization and improvement to enhance performance [24]. Most previous studies mainly use positive links for community detection, and negative links are only used for adjustment. In fact, positive links attract a node into a community, while negative links push the node out of it. Negative links carry no less information than positive links. Thus, further study is needed to make full use of both positive and negative links for community detection in signed networks.
In our work, we propose a random walk-based algorithm named SRWA for community detection in signed networks based on positive and negative links. The overall framework of SRWA is to detect initial communities in a signed network, and then expand these initial communities by applying random walks. Firstly, a dense subgraph is detected based on the nodes whose degree is larger than that of their neighbours. Then, the initial community grows step by step by adding nodes that are more likely to be attracted into the community than to be rejected from it. Specifically, a node which is not in a current community has a positive probability of being in the community and a negative probability of being away from it. The positive probability is compared with the negative one to judge whether the node should be added into the community. If a node cannot be added into any current community, then a new initial community may be developed. Experimental results on both synthetic and real-world signed networks show the feasibility and effectiveness of the proposed algorithm.

Results
In this section, we present the comparative results of the proposed algorithm and three representative algorithms, i.e., FEC [20], MEAs-SN [24] and a method that optimizes the modularity based on Tabu search, which is implemented by Radatool (Tabu search for short) [21,25], on both real-world and synthetic signed networks.
Real-world and synthetic signed networks. Real-world signed networks. The first real social network is the U.S. Supreme Court justices network, which describes the voting behavior of the nine justices of the Supreme Court of the United States during the period 2006-2007 [26]. A positive link means that one justice supports the other, and a negative link indicates the opposite. Its community structure is shown in Fig. 1. We can see that the U.S. Supreme Court justices network is divided into two communities.
The Slovene parliamentary party network represents the relationships among ten parties of the Slovene parliament in 1994 [2]. Positive links mean that the parliamentary activities of two parties are similar, while negative links mean that their activities are dissimilar. Figure 2 shows the topological structure of the Slovene parliamentary party network and its community structure.
The Gahuku-Gama subtribes network reflects the political alliances and oppositions among 16 Gahuku-Gama subtribes, which are distributed in a particular area and are involved in warfare with each other [27]. Positive and negative links represent the political arrangements with positive and negative ties, respectively. Its community structure can be seen in Fig. 3.
The Sampson monastery network represents the social relationships between 18 monks in a New England monastery [28]. Sampson collected four kinds of social relationships among a group of monks, i.e., friendship, esteem, influence and sanction. Each type of relationship has both positive and negative aspects. Six variants of the Sampson monastery network can be obtained from the UCINET IV datasets. Each variant consists of 18 nodes, but the numbers of positive and negative links differ between the variants. The information about the six variants of the Sampson monastery network is given in Table 1 [29]. All these variants have three communities, because the 18 monks were divided into three groups, i.e., Young Turks, Outcasts, and Loyal Opposition [29].
The microarray expression data used to construct the gene network in this study originated from the Gene Expression Omnibus (GEO) with the accession number GSE23400 (http://www.ncbi.nlm.nih.gov/). There are 52 samples and each sample contains expression data of 54,675 probes, which are associated with genes according to the information of GPL570 (a microarray chip). According to the number of genes that a probe detects, probes can be classified into three categories: probes detecting a single gene, probes detecting more than one gene, and probes detecting no genes. We removed the probes which could not detect any gene in each sample, and calculated the expression value of each gene which could be detected by more than one probe. In addition, we calculated the Pearson correlation coefficient of each pair of genes based on their expression data. If the Pearson correlation coefficient between $gene_1$ and $gene_2$ is larger than 0.8 or smaller than −0.8, then a positive link or a negative link, respectively, is placed between $gene_1$ and $gene_2$. A positive link between $gene_1$ and $gene_2$ denotes that the two genes are positively correlated, and a negative link means that they are negatively correlated. In this way, a gene-gene interaction network (GIN, for short) is constructed, which includes 658 nodes and 3338 links, of which 2774 are positive and 564 are negative.

Note to Table 1 [29]: '$(N_p, N_n)$' denotes that the number of positive links in a network is $N_p$, and that of negative links is $N_n$.
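The thresholding rule described above can be illustrated with a small sketch. It is not the authors' pipeline: the function names, the toy expression profiles and the gene labels `geneA` to `geneD` are hypothetical, and only the ±0.8 cut-off on the Pearson correlation coefficient is taken from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def build_signed_links(expression, threshold=0.8):
    """Return (positive_links, negative_links) as sets of gene pairs.

    A pair gets a positive link if r > threshold and a negative link
    if r < -threshold; all other pairs stay unlinked.
    """
    genes = sorted(expression)
    positive, negative = set(), set()
    for idx, g1 in enumerate(genes):
        for g2 in genes[idx + 1:]:
            r = pearson(expression[g1], expression[g2])
            if r > threshold:
                positive.add((g1, g2))
            elif r < -threshold:
                negative.add((g1, g2))
    return positive, negative


# Hypothetical toy data: four "genes" measured in four "samples".
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, -1.0, 1.0, -1.0],
}
positive, negative = build_signed_links(expression)
```

In this toy example `geneA` and `geneB` rise together and receive a positive link, `geneC` falls while both rise and receives two negative links, and `geneD` is only weakly correlated with the others and stays isolated.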
Synthetic signed networks. In this work, we extended the Lancichinetti-Fortunato-Radicchi (LFR, for short) benchmark to signed networks [30]. A signed network generator is designed by combining an unsigned network generator with a program that controls the type of the links in the unsigned network [31]. The signed network generator is denoted as $SRN(N, k, maxk, t_1, t_2, minc, maxc, on, om, \mu, P^-, P^+)$. Here, $N$ is the number of nodes in a network; $k$ and $maxk$ are the average and maximum degree of nodes; $t_1$ and $t_2$ are the exponents of the degree and community size distributions; $minc$ and $maxc$ are the minimum and maximum community size; $on$ and $om$ are the number of overlapping nodes and the number of memberships of overlapping nodes. More importantly, $\mu$ is the fraction of links that each node shares with nodes in other communities, which controls the cohesiveness of the communities in the generated SRNs. The higher the value of $\mu$ is, the more ambiguous the community structure is. $P^-$ is the fraction of negative links within communities, while $P^+$ is the fraction of positive links between communities. Ideally, negative links should be between communities and positive links should be within communities. Thus, $P^-$ and $P^+$ are two parameters that adjust the noise level: for a fixed $\mu$, the larger the values of $P^-$ and $P^+$ are, the more ambiguous the community structure is.
In this experiment, we produce three groups of signed LFR benchmark networks. All groups share the parameters $maxk = 50$, $t_1 = 2$, $t_2 = 1$, $minc = 10$ and $maxc = 30$; the other parameters differ between groups. The first group contains 100 networks, which share the parameters $N = 128$ and $k = 16$; $\mu$ increases from 0.1 to 0.5 in steps of 0.1, $P^+$ increases from 0.0 to 0.8 in steps of 0.2, and $P^-$ increases from 0.0 to 0.6 in steps of 0.2 (5 × 5 × 4 = 100 parameter combinations). Each of the other two groups contains 12 networks. These two groups share the parameters $k = 10$, $\mu \in \{0.3, 0.5\}$, $P^+ \in \{0.1, 0.3, 0.5\}$ and $P^- \in \{0.1, 0.3\}$ (2 × 3 × 2 = 12 combinations). The number of nodes is set to 500 and 1000 in these two groups, respectively. The detailed information about each group is shown in Table 2.
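To make the role of the two noise parameters concrete, the following sketch assigns signs to the links of an unsigned network with known communities. It is only one possible reading of the description above, not the generator used in the experiments: the function `assign_signs`, its arguments and the toy network are hypothetical, and it assumes that $P^-$ is applied to the intra-community links and $P^+$ to the inter-community links as exact (rounded) fractions.

```python
import random


def assign_signs(edges, community, p_minus, p_plus, seed=0):
    """Return a dict mapping each edge to +1 or -1.

    Noise-free baseline: links inside a community are positive and links
    between communities are negative. Then a fraction p_minus of the intra
    links is flipped to negative and a fraction p_plus of the inter links
    is flipped to positive (assumed interpretation of P- and P+).
    """
    rng = random.Random(seed)
    intra = [e for e in edges if community[e[0]] == community[e[1]]]
    inter = [e for e in edges if community[e[0]] != community[e[1]]]
    signs = {e: +1 for e in intra}
    signs.update({e: -1 for e in inter})
    for e in rng.sample(intra, round(p_minus * len(intra))):
        signs[e] = -1
    for e in rng.sample(inter, round(p_plus * len(inter))):
        signs[e] = +1
    return signs


# Toy network: two communities of four nodes each, fully connected inside,
# plus four links between the communities.
community = {v: 0 if v < 4 else 1 for v in range(8)}
edges = [(a, b) for a in range(8) for b in range(a + 1, 8)
         if community[a] == community[b]]
edges += [(0, 4), (1, 5), (2, 6), (3, 7)]
signs = assign_signs(edges, community, p_minus=0.25, p_plus=0.5)
```

With 12 intra-community and 4 inter-community links, this call flips 3 intra links to negative and 2 inter links to positive; which links are flipped depends on the seed.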
Comparison with other algorithms. We verify the performance of the proposed algorithm (SRWA) by comparing it with three representative algorithms (FEC, MEAs-SN, and Tabu search) on both real-world and synthetic signed networks.
Comparison on real-world signed networks. As can be seen in Table 3, the proposed algorithm generates the true partition on several networks (i.e., the U.S. Supreme Court justices network, the Slovene parliamentary party network, the Gahuku-Gama subtribes network, and two variants, SAM-AFF4 and SAM-INFL, of the Sampson monastery network). In addition, the NMI and $Q_{signed}$ values obtained were larger than those of the other algorithms in most cases. We also examined the performance of the proposed algorithm on the gene-gene interaction network, the true partition of which is unknown. Although the $Q_{signed}$ value of the proposed SRWA (i.e., 0.2901) was smaller than that of Tabu search (i.e., 0.4577) on the gene-gene interaction network, the communities obtained by SRWA seem to be more reasonable than those obtained by Tabu search and the other compared algorithms. To be specific, on the gene-gene interaction network, SRWA detected 41 communities, among which 11 communities were confirmed to be related to certain biological processes by the database for annotation, visualization and integrated discovery (DAVID for short, https://david.ncifcrf.gov/summary.jsp) (see Table 4). For example, a community detected by SRWA contains seven nodes, which represent the genes ANKH, RP4-758J24.5, MIR6741, DNAJC30, NEIL2, NSMAF and XRN2, respectively. Interestingly, the above seven genes are all phosphoproteins, which are bound to phosphoric acid.
In addition, the other ten communities detected by SRWA correspond to the following biological functions: membrane, alternative splicing, splice variant, protein binding, signal peptide, sequence variant, splice variant and cytoplasm. Here, we refer to a community which is confirmed by DAVID to be related to a biological process as an effective community. The ratio of the effective communities to all communities detected by SRWA is 11/41 ≈ 0.268. In contrast, the ratios of the effective communities to all communities detected by the compared algorithms (FEC, MEAs-SN and Tabu search) are 0.017, 0.004 and 0.022, respectively, which are smaller than that of SRWA. Therefore, SRWA performed better than the other compared algorithms on the gene-gene interaction network.

Table 2. Information of LFR benchmark signed networks. '$N$' represents the number of nodes in a network; '$k$' denotes the average degree of nodes; $\mu$ is the fraction of links that each node shares with nodes in other communities; $P^-$ is the fraction of negative links within communities, while $P^+$ is the fraction of positive links between communities.

Comparison on synthetic signed networks. All algorithms are tested on three groups of synthetic signed networks. A total of 30 independent runs are conducted for each algorithm and the average results are shown.
(1) Comparison results on synthetic signed networks with 128 nodes. As can be seen from Fig. 4(a,b,e,f,i,j,m and n), when $P^- \le 0.2$, the NMI obtained by the proposed SRWA is larger than that obtained by MEAs-SN, but it is smaller than that obtained by FEC or Tabu search for a few detection problems, which shows that SRWA is not the best performer on all synthetic signed networks. However, in these situations, the NMI obtained by SRWA is larger than 0.90, meaning that SRWA obtains nearly true partition results. For example, when $\mu = 0.1$ and $P^- = 0$, the NMI obtained by the proposed algorithm is always 1 as $P^+$ increases from 0 to 0.8 (Fig. 4(a)), which means that in this situation SRWA recovers the true partition exactly. In addition, SRWA is more stable than FEC. To be specific, the performance of FEC decreases markedly as $\mu$, $P^+$ and $P^-$ increase. For instance, when $\mu = 0.2$ and $P^- = 0.2$, the value of NMI decreases sharply when $P^+$ increases from 0 to 0.8. Similarly, when $\mu = 0.1$ and $P^+ = 0.2$, the increase of $P^-$ causes large drops in the performance of FEC. If the values of $P^+$ and $P^-$ are both fixed, the value of NMI decreases as $\mu$ increases. This means that FEC is very sensitive to the parameters $\mu$, $P^+$ and $P^-$. The reason is that some uncertain factors, such as the random selection of the initial starting node, make FEC unstable. Although the increase of $\mu$, $P^-$ and $P^+$ may also cause the NMI of SRWA to decline, the decrease is smaller for SRWA than for FEC (Fig. 4).
When $P^- > 0.2$, the NMI value of SRWA is larger than those of the other algorithms (Fig. 4(c,d,g and h)). Although Tabu search achieves the largest NMI when $P^- \le 0.2$, the increase of $P^-$ causes large drops in its NMI. For example, when $\mu = 0.1$ and $P^- = 0.6$, the NMI of Tabu search is smaller than 0.3, whereas the NMI obtained by SRWA in the same situation is larger than 0.75. Thus, SRWA performs better than Tabu search when $P^- > 0.2$. This may be due to the fact that Tabu search is based on the maximization of modularity, which is less effective when the community structure is unclear. In other words, SRWA shows superior performance on signed networks with unclear community structures.
(2) Comparison results on synthetic signed networks with 500 and 1000 nodes. We also test the performance of SRWA on the synthetic signed networks with 500 and 1000 nodes. According to Fig. 5(a and b), when $P^- = 0.1$ the NMI obtained by SRWA is no less than 0.8, and in a few situations it is smaller than that achieved by Tabu search. This suggests that SRWA performs slightly less well than Tabu search for a few detection problems, which is similar to the results on the synthetic networks with 128 nodes. In addition, on these two groups of synthetic signed networks we find that SRWA may achieve larger NMI values than Tabu search when $\mu$ or $P^+$ is larger, e.g., $\mu = 0.3$ and $P^+ = 0.5$, or $\mu = 0.5$ and $P^+ \in \{0.3, 0.5\}$ (Fig. 5(b)). That is to say, when $P^- = 0.1$, SRWA becomes superior to Tabu search in terms of NMI as $\mu$ or $P^+$ increases.
In addition, SRWA performs the best in almost all cases when $P^- = 0.3$ (Fig. 5(c) and (d)). We conclude that the performance of SRWA is superior to that of the compared algorithms on the benchmark networks with 500 and 1000 nodes.

Discussion
In this work, we have proposed a new algorithm, named SRWA, for detecting community structures in signed networks. The key component of SRWA is that a node which is not in any current community may be added into a community on the basis of random walks, which makes full use of both positive and negative links between the node and the members of a community. We have tested the performance of SRWA, and compared it with other representative algorithms (FEC, MEAs-SN and Tabu search) on both real-world and synthetic signed networks. The experimental results have demonstrated the feasibility and effectiveness of SRWA. The features of the proposed algorithm can be summarized as follows. (1) SRWA has a good ability to detect communities in signed networks. Several other algorithms perform well on small-scale networks with clear community structures, but their detection results fall far short of expectations on large-scale networks with unclear community structures. The proposed SRWA shows its superiority over the competing approaches in terms of the quality of the communities found in signed networks with unclear community structures.
(2) SRWA is not sensitive to the initial nodes, and it does not need any prior knowledge of the community structure.
In our future work, we will focus on how the SRWA approach can be used to address problems in other related domains, such as disease module mining. So far, work on disease module mining has considered a biological network as a large graph including only positive links. However, the relations among the entities of a biological network are complex and cannot be modeled by positive links alone. From a signed biological network, we may discover some previously unknown information. In addition, it is also interesting to investigate bio-inspired computing models for community detection in complex networks, such as the probe machine [32] and spiking neural P systems [33].

Methods
A signed network can be abstracted as a graph $SN = (V_{SN}, E_P, E_N)$, where $V_{SN} = \{v_1, v_2, …, v_n\}$ is the set of nodes in the network, $E_P$ is the set of positive links and $E_N$ is the set of negative links. The graph can be expressed as an adjacency matrix $A$, where the element $a_{ij}$ represents the type of the link between the nodes $v_i$ and $v_j$ (i.e., $\langle v_i, v_j \rangle$). Specifically, if the link between $v_i$ and $v_j$ is positive, then $a_{ij} = 1$; if the link between $v_i$ and $v_j$ is negative, then $a_{ij} = −1$; if there is no link between them, then $a_{ij} = 0$.
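As a minimal illustration of this representation, the sketch below builds the signed adjacency matrix from two link lists. The function name, the 0-based node indices and the assumption that links are undirected (so the matrix is symmetric) are choices made for this example and are not stated in the text.

```python
def adjacency_matrix(n, positive_links, negative_links):
    """Build the n x n signed adjacency matrix of an undirected signed network.

    Entries are 1 for a positive link, -1 for a negative link and 0 otherwise.
    """
    a = [[0] * n for _ in range(n)]
    for i, j in positive_links:
        a[i][j] = a[j][i] = 1
    for i, j in negative_links:
        a[i][j] = a[j][i] = -1
    return a


# Four nodes: 0-1 and 1-2 are positive links, 2-3 is a negative link.
A = adjacency_matrix(4, positive_links=[(0, 1), (1, 2)], negative_links=[(2, 3)])
```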
Community detection in signed networks aims to find communities within which the links are positive and between which the links are negative. Let $C = \{C_1, C_2, …, C_m\}$ be a set of communities in a signed network. The community detection problem in the signed network can be described as:
The proposed algorithm aims to make full use of both positive and negative links to detect communities in a signed network. The overall framework of SRWA is presented in Table 5, and it consists of three main steps: (1) the initial communities are detected; (2) the initial communities are expanded based on random walks; (3) a procedure for community optimization is performed. In what follows, we introduce the details of SRWA.
Detecting initial communities in signed networks. A node with a large impact in a network always has a large number of neighbours. The importance of a node can be reflected by the node degree, which is the sum of the positive degree and the absolute value of the negative degree (Eq. 1).

$$\deg(v) = \deg_P(v) + |\deg_N(v)| \qquad (1)$$

where $\deg(v)$ represents the node degree, and $\deg_P(v)$ and $\deg_N(v)$ are the positive degree and the negative degree of the node, respectively. Specifically, if the degree of a node is larger than those of its neighbours, then the node is more likely to be the center of a community than its neighbours. The local maximum degree node is defined as a node which has a larger degree than its neighbors [13]. The local maximum degree nodes are discovered in the way described in previous work [13]. In this work, we identify the local maximum degree nodes among all nodes in a signed network based on node degrees. Here, an initial community in a signed network is defined as a dense subgraph, which includes a local maximum degree node as well as its close neighbors. Given a local maximum degree node ($node_1$), we identify its neighbour node ($node_2$) with the largest positive degree. The positive degree is used to identify $node_2$ because, as members of an initial community, $node_1$ and $node_2$ should be closely linked by positive links. $node_1$ and $node_2$ may have a common neighbour node ($node_3$), which is also detected based on positive degrees. An initial community consists of the nodes $node_1$, $node_2$ and $node_3$, together with the links among them.
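The procedure in this subsection can be sketched as follows. This is an illustrative reading, not the authors' implementation: the dictionary-of-dictionaries graph format and all function names are hypothetical, "larger degree" is assumed to mean strictly larger, $node_2$ and $node_3$ are assumed to be reached through positive links, ties are broken alphabetically, and only one initial community is returned per local maximum degree node.

```python
def make_signed_graph(links):
    """links: iterable of (u, v, sign) with sign in {+1, -1}; undirected."""
    adj = {}
    for u, v, sign in links:
        adj.setdefault(u, {})[v] = sign
        adj.setdefault(v, {})[u] = sign
    return adj


def degree(adj, v):
    """Eq. 1: positive degree plus the absolute value of the negative degree."""
    deg_p = sum(s for s in adj[v].values() if s > 0)
    deg_n = sum(s for s in adj[v].values() if s < 0)
    return deg_p + abs(deg_n)


def positive_neighbors(adj, v):
    return {w for w, s in adj[v].items() if s > 0}


def local_maximum_degree_nodes(adj):
    """Nodes whose degree is strictly larger than that of every neighbour."""
    return [v for v in adj
            if all(degree(adj, v) > degree(adj, w) for w in adj[v])]


def initial_community(adj, node1):
    """node1, its positive neighbour with the largest positive degree (node2),
    and their common positive neighbour with the largest positive degree (node3)."""
    members = {node1}
    candidates = positive_neighbors(adj, node1)
    if not candidates:
        return members
    pos_deg = lambda w: len(positive_neighbors(adj, w))
    node2 = max(sorted(candidates), key=pos_deg)
    members.add(node2)
    common = candidates & positive_neighbors(adj, node2)
    if common:
        members.add(max(sorted(common), key=pos_deg))
    return members


adj = make_signed_graph([
    ("a", "b", 1), ("a", "c", 1), ("b", "c", 1), ("a", "d", 1), ("a", "e", -1),
])
seeds = local_maximum_degree_nodes(adj)
community = initial_community(adj, seeds[0])
```

In this toy network node `a` has degree 4 (three positive links and one negative link) and is the only local maximum degree node; the initial community built around it is the positive triangle formed by `a`, `b` and `c`.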
Expanding communities. Let $Y = \{Y_k \mid k = 1, …, q\}$ be the set of all communities, where $q$ is the number of communities, $Y_k = (V_k, E_k^P, E_k^N)$ is the $k$-th community, $V_k$ is the set of nodes in the community, and $E_k^P$ and $E_k^N$ are the sets of positive and negative links in the community, respectively. Specifically, in the initial situation, $Y_k$ ($k = 1, …, q$) is an initial community.
Let the walker start from a node $u$ which does not belong to any current community. Then, the node $u$ can teleport to the current communities with probabilities that depend on the connections of nodes. The total probability theorem and a conditional probability model are used to calculate the positive probability of the node $u$ teleporting to a community based on positive links (i.e., $P^+(u \to Y_k)$, $k = 1, …, q$), as well as the negative probability that the node is away from the community based on negative links (i.e., $P^-(u \to Y_k)$, $k = 1, …, q$). If the positive probability of the node $u$ teleporting to a community is larger than the negative probability of $u$ being away from the community, then $u$ may be added into the community; otherwise, it is not in any current community, which implies that a new initial community should be formed.
There are $q$ initial communities, so we perform $q$ runs of random walks to calculate $P^+(u \to Y_k)$ and $P^-(u \to Y_k)$. At the $k$-th run of random walks, it is supposed that $u$ belongs to the $k$-th community. The graph of the $k$-th random walk process, $G_k$, is constructed by Eq. 2. First, we calculate the positive and negative probabilities of the walker teleporting from $u$ to the node $v_i$ ($i = 1, …, m$) in the graph $G_k$. The positive and negative probabilities are calculated in the same way, except that they are based on different kinds of links.

Scientific Reports | 7: 10874 | DOI:10.1038/s41598-017-11463-y
Take the calculation of the positive probability of the walker teleporting from $u$ to $v_i$ as an example. From time $t$ to $t + 1$, the walker has a teleporting probability $\alpha$ to jump, and a probability $1 − \alpha$ to stay. Usually, the teleporting probability $\alpha$ is 0.15 [34]. When the walker jumps, it may jump to a node with a transition probability. Suppose that the transition probability from $u$ to $v_i$ ($i = 1, …, m$) is the same for all $i$; then the transition probability vector is $d = (1/m, …, 1/m)^T$, where $m$ is the number of nodes in the $k$-th community, and $d$ is an $m \times 1$ vector. When the walker stays, it may reach a node based on the positive similarity between nodes. The positive similarity between nodes is calculated from the positive links. Here, we make use of the similarity definition provided by Jaccard to evaluate the positive similarity between the nodes $v_i \in V$ and $v_j \in V$ ($1 \le i, j \le m$) as follows [35-37].

$$Similar^+(v_i, v_j) = \frac{|\Gamma^+(v_i) \cap \Gamma^+(v_j)|}{|\Gamma^+(v_i) \cup \Gamma^+(v_j)|} \qquad (3)$$

where $\Gamma^+(v_i)$ ($\Gamma^+(v_j)$) is the positive neighborhood of $v_i$ ($v_j$), each member of which is connected with $v_i$ ($v_j$) by a positive link, and $|x|$ indicates the cardinality (i.e., number of elements) of the set $x$. Let $v_j = u$ in Eq. 3. $Similar^+(v_i, u)$ represents the positive similarity between $u$ and $v_i \in V$ ($1 \le i \le m$), and it is also denoted as $Similar^+(v_i)$ for short.
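A small sketch of the Jaccard-based positive similarity is given below. The graph format and function names are hypothetical, and two details are assumptions made for this example rather than statements from the text: the positive neighborhood does not include the node itself, and the similarity of two nodes without any positive neighbours is taken to be 0.

```python
def positive_neighbors(adj, v):
    """Nodes connected with v by a positive link."""
    return {w for w, s in adj[v].items() if s > 0}


def positive_similarity(adj, vi, vj):
    """Jaccard similarity of the positive neighborhoods of vi and vj."""
    ni, nj = positive_neighbors(adj, vi), positive_neighbors(adj, vj)
    union = ni | nj
    return len(ni & nj) / len(union) if union else 0.0


# Toy signed network: adj[u][v] is +1 or -1 for a positive or negative link.
adj = {
    "a": {"b": 1, "c": 1, "d": 1, "e": -1},
    "b": {"a": 1, "c": 1},
    "c": {"a": 1, "b": 1},
    "d": {"a": 1},
    "e": {"a": -1},
}
sim_bc = positive_similarity(adj, "b", "c")  # common: {a}; union: {a, b, c}
sim_bd = positive_similarity(adj, "b", "d")  # common: {a}; union: {a, c}
sim_de = positive_similarity(adj, "d", "e")  # e has no positive neighbours
```

The same function applied to negative neighborhoods would give the negative counterpart, under the assumption that the negative similarity mirrors the positive one.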
Table 5. The overall framework of SRWA.
Input: Signed network $SN$.
Output: Community set $Y$.
Step 1. Calculate the degree of each node in $SN$ by Eq. 1; find the nodes which have a larger degree than their neighbors, and put them in set $H$. /* Each node in $H$ is a local maximum degree node. */
Step 2. For each node $v_i$ in set $H$, discover initial communities $Y_{v_i}$; put the elements of $Y_{v_i}$ in set $Y$. /* $Y$ is the set of initial communities. */ /* $Y_{v_i}$ is the set of initial communities, in which each community contains $v_i$. */
Step 3. Merge the initial communities which are identical in $Y$; return $Y$ as the set of initial communities; put all nodes in initial communities in $V$; put the remaining nodes, which are not in initial communities, in $U$. /* $V$ is the set of nodes which are in initial communities. */ /* $U$ is the set of nodes which are not in any initial communities. */
Step 4. For each node $u_i$ in $U$ and an initial community $Y_k$ in $Y$, calculate $P^+(u_i \to Y_k)$ and $P^-(u_i \to Y_k)$ by Algorithm 1. /* $Y_k$ is the $k$-th community in $Y$. */ /* $P^+(u_i \to Y_k)$ means the positive probability of $u_i$ belonging to $Y_k$. */ /* $P^-(u_i \to Y_k)$ means the negative probability that $u_i$ is away from $Y_k$. */
Step 5. […] /* Temp_number is used to store the number of each initial community which is likely to contain $u_i$. */
Step 6. If |Temp_number| ≠ 0, then add $u_i$ to the community $Y_{best}$ that results in the largest positive probability.
Step 7. If |Temp_number| = 0, discover initial communities $Y_u$ for $u_i$; put all nodes included in $Y$ in $V$; put the remaining nodes, which are not included in $Y$, into $U$.
Step 8. Repeat Steps 4-7 until there is no node left in $U$.
Step 9. Merge the communities which are identical or similar in $Y$ by Eq. 14; return $Y$ as the set of communities.

Let the matrix $M^+$ be the normalization of the positive similarity between nodes in the $k$-th community. That is,

$$M^+_{ij} = \frac{Similar^+(v_i, v_j)}{\sum_{l=1}^{m} Similar^+(v_i, v_l)}$$

Here, $M^+$ can be considered as the transition matrix of a random walker. Suppose the positive probability of the walker teleporting from $u$ to $v_i$ is $s^+_t(i)$ at time $t$. In particular, in the initial situation, the positive probability of the walker teleporting from $u$ to $v_i$ is the normalization of the positive similarity between $u$ and $v_i$, i.e., $s^+_0(i) = Similar^+(v_i) / \sum_{j=1}^{m} Similar^+(v_j)$. At time $t + 1$, the positive probability $s^+_{t+1}$ is calculated as follows.

$$s^+_{t+1} = (1 − \alpha) \cdot (M^+)^T \cdot s^+_t + \alpha \cdot d \qquad (4)$$

Here, $s^+_{t+1}(i)$ captures the positive probability of the walker teleporting from $u$ to $v_i$ at time $t + 1$. Eq. 4 is iterated until $s^+$ converges. Suppose that, when the iteration has been completed, the stable state is $\pi^+ = (\pi^+_1, …, \pi^+_m)$; then $\pi^+$ satisfies $\pi^+ = (1 − \alpha) \cdot (M^+)^T \cdot \pi^+ + \alpha \cdot d$. In this situation, the $i$-th entry of $\pi^+$ denotes the conditional positive probability that the node $u$ teleports to $v_i$ when $u$ belongs to the $k$-th community.
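The iteration towards the stable state $\pi^+$ can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the authors' code: the update rule is taken from the stationary equation $\pi^+ = (1 − \alpha)(M^+)^T \pi^+ + \alpha d$ given in the text, $M^+$ is assumed to be obtained by row normalization, $d$ is assumed to be the uniform vector, and the function names, the tolerance and the toy similarity values are hypothetical.

```python
def row_normalize(similarity):
    """Divide each row by its sum so that every non-zero row sums to 1."""
    normalized = []
    for row in similarity:
        total = sum(row)
        normalized.append([x / total if total else 0.0 for x in row])
    return normalized


def stationary_vector(M, s0, alpha=0.15, tol=1e-10, max_iter=1000):
    """Iterate s <- (1 - alpha) * M^T s + alpha * d until s stops changing."""
    m = len(M)
    d = [1.0 / m] * m  # uniform transition probability vector (assumed)
    s = list(s0)
    for _ in range(max_iter):
        nxt = [(1 - alpha) * sum(M[j][i] * s[j] for j in range(m)) + alpha * d[i]
               for i in range(m)]
        if max(abs(a - b) for a, b in zip(nxt, s)) < tol:
            return nxt
        s = nxt
    return s


# Hypothetical positive similarities among three community nodes ...
similarity = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
# ... and between the outside node u and each of them.
sim_to_u = [0.6, 0.3, 0.1]

M = row_normalize(similarity)
s0 = [x / sum(sim_to_u) for x in sim_to_u]
pi = stationary_vector(M, s0)
```

Because the rows of `M` sum to 1 and `s0` is normalized, every iterate remains a probability vector, and the factor $1 − \alpha = 0.85$ makes the update a contraction, so the loop converges regardless of the starting vector. The same routine could be reused for the negative case by passing the normalized negative similarities instead.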
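Similarly, we calculate the negative similarity based on negative links by Eq. 5.

$$Similar^-(v_i, v_j) = \frac{|\Gamma^-(v_i) \cap \Gamma^-(v_j)|}{|\Gamma^-(v_i) \cup \Gamma^-(v_j)|} \qquad (5)$$

where $\Gamma^-(v_i)$ ($\Gamma^-(v_j)$) is the negative neighborhood of $v_i$ ($v_j$), each member of which is connected with $v_i$ ($v_j$) by a negative link.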
The negative similarities between nodes are normalized to get the transition matrix $M^-$. Suppose $s^-_t$ represents the conditional probability that $u$ is away from $v_i$ when $u$ belongs to the $k$-th community at time $t$. We also calculate the initial negative probability vector $s^-_0$, the $i$-th entry of which is the normalization of the negative similarity between $v_i$ and $u$. Then Eq. 6 is iterated.

$$s^-_{t+1} = (1 − \alpha) \cdot (M^-)^T \cdot s^-_t + \alpha \cdot d \qquad (6)$$

When the iteration has been completed, the $i$-th entry of the stable state $\pi^-$ represents the conditional negative probability that $u$ is away from $v_i$ when $u$ belongs to the $k$-th community. Next, the node $u$ has an average conditional positive probability $p^+(u \to Y_j \mid u \in G_k)$ to teleport to a community $Y_j$ when $u$ is connected to the nodes in the $k$-th community. Specifically, $p^+(u \to Y_j \mid u \in G_k)$ is the mean value of the conditional probabilities that $u$ teleports to the nodes of $Y_j$ in the graph $G_k$ (Eq. 7).

$$p^+(u \to Y_j \mid u \in G_k) = \frac{1}{|V_j|} \sum_{v_i \in V_j} \pi^+(i) \qquad (7)$$

where $V_j$ is the node set of the community $Y_j$. Similarly, $u$ also has an average conditional negative probability $p^-(u \to Y_j \mid u \in G_k)$ to be away from $Y_j$ when $u$ is connected to the nodes in the $k$-th community (Eq. 8).

$$p^-(u \to Y_j \mid u \in G_k) = \frac{1}{|V_j|} \sum_{v_i \in V_j} \pi^-(i) \qquad (8)$$
The probability that $u$ belongs to the $k$-th community is based on the positive similarity between $u$ and a node in the $k$-th community, and is calculated by Eq. 9. We also calculate the probability that $u$ does not belong to the $k$-th community by Eq. 10.
Finally, the positive probability for the node $u$ to teleport to a community $Y_j$, and the negative probability for $u$ to be away from $Y_j$, are calculated based on the theorem of total probability by Eqs 11 and 12.

$$P^+(u \to Y_j) = \sum_{k=1}^{q} p^+(u \to Y_j \mid u \in G_k) \cdot p^+(u \in G_k) \qquad (11)$$

$$P^-(u \to Y_j) = \sum_{k=1}^{q} p^-(u \to Y_j \mid u \in G_k) \cdot p^-(u \in G_k) \qquad (12)$$

where $p^+(u \to Y_j \mid u \in G_k)$ is the average conditional positive probability for $u$ teleporting to the community $Y_j$ when $u$ is connected to the nodes in the $k$-th community, while $p^-(u \to Y_j \mid u \in G_k)$ is the average conditional negative probability for $u$ being away from $Y_j$ on the same condition. The algorithm to calculate the positive and negative probabilities of a node belonging to each community is described in Table 6. If a node is more likely to be in a community than to be away from the community, then it is added into the community. Otherwise, it cannot be added into any current community. In this situation, the node can be considered as a new important node, and a new initial community which includes the new important node as well as its close neighbours may be detected. If a new initial community has been detected, then the number of current communities increases by one, and the above procedures are repeated to add nodes into communities. If a new initial community cannot be found, $u$ is added to the most likely community according to the tightness between $u$ and a community $Y_j$ ($j = 1, …, q$), as in Eq. 13.

$$tightness(u, Y_j) = \frac{num_1}{num_2} \qquad (13)$$

where $num_1$ denotes the number of nodes in the community $Y_j$ which have positive connections with the node $u$, and $num_2$ is the number of nodes in the community $Y_j$. The node is added to the community which has the largest tightness with it.
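The fallback rule based on tightness can be sketched as follows. The sketch assumes that the tightness is the ratio of $num_1$ to $num_2$, which is how the two quantities are read here; the function names, the graph format and the toy data are hypothetical, and ties are resolved in favour of the community with the smallest index.

```python
def tightness(adj, u, community):
    """Share of the community's nodes that are positively linked with u."""
    num1 = sum(1 for v in community if adj[u].get(v, 0) > 0)
    num2 = len(community)
    return num1 / num2


def most_likely_community(adj, u, communities):
    """Index of the community with the largest tightness to u."""
    return max(range(len(communities)),
               key=lambda j: tightness(adj, u, communities[j]))


# Toy data: adj["u"][v] is +1 or -1 for a positive or negative link of u.
adj = {"u": {"a": 1, "b": 1, "c": -1, "x": 1}}
communities = [{"a", "b", "c"}, {"x", "y", "z", "w"}]
best = most_likely_community(adj, "u", communities)
```

Here `u` is positively linked with two of the three nodes of the first community and with one of the four nodes of the second, so the first community is selected.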
Community optimization. Two or more communities may have a large number of common nodes. That is, these communities may be identical or similar. In this case, the expanded communities should be merged into one community. If communities $C_i$ and $C_j$ satisfy the following formula, then they can be merged into a larger community $C$ [38].

$$\frac{|C_i \cap C_j|}{\min(|C_i|, |C_j|)} \ge \xi \qquad (14)$$

where $\xi$ is a threshold. Let $\xi = 0.5$: when most members of the smaller community are also in the larger community, the two communities can be merged into one.

Table 6. Algorithm 1: calculating the positive and negative probabilities of a node belonging to each community.
Input: […] /* The nodes in $V$ are those within current communities. */
Output: $P^+(u_i \to Y_k)$, the positive probability of $u_i$ belonging to $Y_k$; $P^-(u_i \to Y_k)$, the negative probability that $u_i$ is away from $Y_k$.
Step 1. Construct the graph $G_k$ by Eq. 2. /* $G_k$ represents the graph of the $k$-th random walk process. */
Step 2. Calculate the positive similarity matrix $Similar^+$ and the negative similarity matrix $Similar^-$ by Eqs 3 and 5; normalize both $Similar^+$ and $Similar^-$ to obtain $M^+$ and $M^-$. /* $d$ is the transition probability vector and $\alpha$ is the teleporting probability. */
Step 5. Iterate Eq. 4 until $s^+_t$ converges, and let $\pi^+$ be the converged $s^+_t$; iterate Eq. 6 until $s^-_t$ converges, and let $\pi^-$ be the converged $s^-_t$. /* $s^+_t(i)$ means the positive probability of the walker teleporting from $u$ to $v_i$ at time $t$. */ /* $s^-_t(i)$ means the negative probability that $u$ is away from $v_i$ at time $t$. */ /* $\pi^+(i)$ denotes the positive probability of the walker teleporting from $u$ to $v_i$. */ /* $\pi^-(i)$ denotes the negative probability that $u$ is away from $v_i$. */
Step 6. Calculate $p^+(u \to Y_i \mid u \in G_k)$ by Eq. 7; calculate $p^-(u \to Y_i \mid u \in G_k)$ by Eq. 8. /* $p^+(u \to Y_i \mid u \in G_k)$ and $p^-(u \to Y_i \mid u \in G_k)$ denote the average conditional positive and negative probabilities that $u$ teleports to a community $Y_i$ when $u$ is connected to the nodes in the $k$-th community. */
Step 7. Calculate $p^+(u \in G_k)$ by Eq. 9; calculate $p^-(u \in G_k)$ by Eq. 10. /* $p^+(u \in G_k)$ means the positive probability that $u$ is connected to the nodes in the $k$-th community. */ /* $p^-(u \in G_k)$ means the negative probability that $u$ is connected to the nodes in the $k$-th community. */
Step 8. […]

In the normalized mutual information $NMI(P_R, P_F)$, $P_R$ and $P_F$ respectively represent the community partition result obtained by an algorithm and the real community partition; $N$ is the number of nodes; $X$ is a 2 × 2 matrix, and $X_{ij}$ is the number of nodes from the real community $i$ that also belong to the found community $j$; $X_{.j} = X_{1j} + X_{2j}$; $X_{i.} = X_{i1} + X_{i2}$. If the partitioning result $P_F$ is the same as $P_R$, then $NMI(P_R, P_F) = 1$; if they are completely different, then $NMI(P_R, P_F) = 0$.
In the signed modularity $Q_{signed}$, $w^+$ ($w^-$) represents the total positive (negative) strength of the $SN$, $C_i$ ($C_j$) represents the community which node $v_i$ ($v_j$) belongs to, and $\delta(C_i, C_j)$ is 1 if nodes $v_i$ and $v_j$ are in the same community; otherwise $\delta(C_i, C_j)$ is 0.
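For readers who want to reproduce the NMI scores, the sketch below computes NMI from two label lists. It uses the confusion-matrix form that is common in the community detection literature, which is assumed here to be the form intended by the definitions above (the counts $X_{ij}$, the row sums $X_{i.}$, the column sums $X_{.j}$ and $N$); it handles any number of communities, and the function name and the convention of returning 1 when both partitions consist of a single community are choices made for this example.

```python
import math
from collections import Counter


def nmi(real, found):
    """Normalized mutual information between two partitions.

    real[i] and found[i] are the community labels of node i in the real
    partition and in the partition found by an algorithm.
    """
    n = len(real)
    joint = Counter(zip(real, found))   # X_ij
    row = Counter(real)                 # X_i.
    col = Counter(found)                # X_.j
    numerator = -2.0 * sum(
        x * math.log(x * n / (row[i] * col[j])) for (i, j), x in joint.items()
    )
    denominator = (sum(c * math.log(c / n) for c in row.values())
                   + sum(c * math.log(c / n) for c in col.values()))
    if denominator == 0:
        return 1.0  # both partitions place all nodes in one community
    return numerator / denominator


same = nmi([0, 0, 1, 1], [1, 1, 0, 0])         # identical up to relabelling
independent = nmi([0, 0, 1, 1], [0, 1, 0, 1])  # partitions share no information
```

The score depends only on how the nodes are grouped and not on the label values, so the first call returns 1 although the labels are swapped, while the second call returns 0.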