Abstract
Critical edges in complex networks are edges that play a more significant role than other edges in the structure and function of the network. Identifying critical edges has attracted much attention because of its theoretical significance and its wide range of applications. Considering both the topological structure of networks and their ability to disseminate information, an edge ranking algorithm, BCC_{MOD}, based on cliques and paths in networks is proposed in this report. The effectiveness of the proposed method is evaluated with the SIR model, the susceptibility index S and the size of the giant component σ, and compared with well-known existing metrics, namely the Jaccard coefficient, the Bridgeness index, Betweenness centrality and the Reachability index, on nine real networks. Experimental results show that the proposed method outperforms these well-known methods in identifying critical edges with respect to both network connectivity and spreading dynamics.
Introduction
The structure and function of complex networks have attracted a great deal of attention in many branches of science^{1}. Networks mediate the spread of information, and sometimes a few initial seeds can affect large portions of a network. Such information cascades are observed in many situations, for example cascading failures in power grids, disease contagion between individuals, innovations and rumors propagating through social networks, and large grassroots social movements in the absence of centralized control. How to find critical nodes and edges is therefore an important and interesting issue. With the rapid development of internet media, information exchange between individuals is becoming more frequent and the mechanism of information diffusion is becoming more complex. Many methods are used to measure the importance of nodes in networks. Degree centrality^{2}, semi-local centrality^{3}, k-shell^{4} and H-index^{5,6} are based on node degrees. Closeness centrality^{7}, betweenness centrality^{8} and eccentricity centrality^{9} are based on paths in networks. PageRank^{10}, LeaderRank^{11} and HITS^{12} are based on eigenvectors. Sleep scheduling^{13} is one approach to saving the residual energy of wireless nodes in energy-constrained, large-scale industrial wireless sensor networks while maintaining network connectivity and reliability. Critical edges also play a significant role in the process of information diffusion. In complex networks it is sometimes impractical to forbid all communications of a node, so it is necessary to cut some important communication links instead. Critical edge analysis helps to guide or control information dissemination from a global perspective.
In order to explore the transmission of information, many studies have focused on network topology to find critical edges. Degree product^{14} assumes that edges connecting two high-degree nodes are critical. Betweenness centrality of edges^{15,16} and betweenness centrality of a group of edges^{17} assume that edges linking two connected components are important. Average node reachability and the maximum flow of a network characterize the ability of a network to transmit information, and critical edges have a serious influence on both^{18,19}. For the Jaccard coefficient^{20}, if node i and node j have many common neighbors, information can spread easily from node i to node j even without a direct connection, so edges whose endpoints have fewer common neighbors are more important. Complex networks may contain many cliques. For Bridgeness^{21}, if an edge is removed, information can still spread through the other edges of the clique that contains the removed edge, so, intuitively, edges in smaller cliques are more important.
Moreover, the ability to disseminate information is also an index for measuring the importance of edges. In online social networks, three different spreading mechanisms have been found: social spreading, self-promotion and broadcast^{22}. An edge is important if most of the information spreads through it^{23}.
In this report, we use only the topology of networks to rank the importance of edges, considering not only local characteristics (node degrees, cliques) but also global characteristics (betweenness centrality). The proposed method is compared with the Jaccard coefficient, Bridgeness, Betweenness centrality and the Reachability index under three evaluation metrics: the SIR model^{24,25}, the susceptibility index S^{26} and the size of the giant component σ^{27}. The comparison uses nine real networks that differ widely in their basic topological features. The results show that the proposed method can quickly decompose networks and has a greater impact on information spreading.
Results
If many different cliques contain both endpoints of an edge, the edge is not so important from the perspective of spreading. Based on this observation and on the betweenness centrality of edges, a new index BCC_{MOD} (Betweenness Centrality and Clique Model) is proposed to measure the importance of an edge e(u, v). BCC_{MOD} combines local and global characteristics. Removing edges with a high BCC_{MOD} score has a large effect on spreading. The performance of BCC_{MOD} is compared with that of the Jaccard coefficient, Bridgeness, Reachability and Betweenness. The results show that, in most cases, BCC_{MOD} decomposes networks more quickly and has a greater impact on information spreading than the other methods. The detailed definitions of the indices are given in the Methods section.
Data Description
Nine undirected and unweighted networks are used to evaluate the performance of the edge ranking method. (1) Jazz, a collaboration network between jazz musicians. (2) Oz, a network of friendship ratings between 217 residents of a residence hall on the Australian National University campus. (3) Highschool, a network of friendships between boys in a small high school in Illinois. (4) Innovation, a network of innovation spread among 246 physicians in four towns in Illinois, namely Peoria, Bloomington, Quincy and Galesburg. (5) Lesmis, a network of co-occurrences of characters in Victor Hugo's novel Les Misérables. (6) Train, a network of contacts between suspected terrorists involved in the Madrid train bombing of March 11, 2004, as reconstructed from newspapers. (7) PowerGrid, the power grid of the Western States of the United States of America. (8) Email, the email communication network at the University Rovira i Virgili in Tarragona, in the south of Catalonia, Spain. (9) Router, a network of autonomous systems of the Internet connected with each other. All data can be downloaded from Chicago network dataset^{28}, and the basic topological properties of these nine networks are shown in Table 1. To guarantee diversity, the nine networks differ widely in the total number of nodes and edges, average degree, maximum degree, average clustering coefficient and degree heterogeneity.
Evaluation metrics
Susceptibility index S, the size of giant component σ and SIR spreading model are used to evaluate the performance of ranking methods.
Susceptibility index S
For the network connectivity metric, the susceptibility index S is used to evaluate the performance of the methods. The susceptibility index S is defined as:
\(S={\sum }_{s<{s}_{max}}\frac{{n}_{s}{s}^{2}}{n}\)
where n_{s} is the number of components whose size equals s, s_{max} is the size of the giant component, and n is the size of the whole network. In detail, edges are first sorted in descending order of their ranking score, and the susceptibility index S is then calculated after removing the edges from the network one by one, from the highest to the lowest ranking score. In this report, the parameter p is defined as:
\(p=\frac{{m}_{r}}{m}\)
where m is the number of all edges and m_{r} is the number of removing edges.
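The removal procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes nodes are labelled 0..n-1, that `ranked_edges` is already sorted from highest to lowest score, and that S sums n_{s}s^{2}/n over all components strictly smaller than the giant one. The function names are hypothetical.

```python
from collections import defaultdict

def component_sizes(n, edges):
    """Sizes of all connected components of an undirected graph on nodes 0..n-1."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    sizes = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, size = [start], 0
        while stack:                      # iterative depth-first search
            x = stack.pop()
            size += 1
            for y in adj[x]:
                if not seen[y]:
                    seen[y] = True
                    stack.append(y)
        sizes.append(size)
    return sizes

def susceptibility(n, edges):
    """S = (sum of s^2 over components with s < s_max) / n  (assumed form)."""
    sizes = component_sizes(n, edges)
    s_max = max(sizes)
    return sum(s * s for s in sizes if s < s_max) / n

def susceptibility_curve(n, ranked_edges):
    """Return [(p, S)] for m_r = 0..m removed edges, with p = m_r / m (m > 0)."""
    m = len(ranked_edges)
    return [(m_r / m, susceptibility(n, ranked_edges[m_r:]))
            for m_r in range(m + 1)]
```

The value of p at which the curve peaks is the quantity compared across methods: the earlier the peak, the faster the ranking breaks the network apart.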
The results are shown in Table 2 and Fig. 1. They show that BCC_{MOD} reaches its largest S at the smallest p in Lesmis, Highschool, Jazz, Train, Email and Oz. In Innovation, all methods have the same effect. In PowerGrid and Router, the largest S of BCC_{MOD} appears second earliest. Thus the largest S of BCC_{MOD} appears earliest in most cases compared with the other methods, which demonstrates that BCC_{MOD} can break down the network quickly. Moreover, the largest S of BCC_{MOD} is the highest among all methods for all networks except Email and Router, which means that BCC_{MOD} causes the greatest damage to these networks. In terms of network connectivity, therefore, BCC_{MOD} quickly decomposes networks and causes the greatest damage in most cases.
The size of giant component σ
Besides the susceptibility index S, another metric, the size of the giant component σ, is used to evaluate the performance of the methods. In detail, edges are first sorted in descending order of their score, and the size of the giant component σ is then computed after removing the edges from the network one by one, from the highest to the lowest ranking score.
The results are shown in Fig. 2. The faster the curve falls, the better the method performs. In Fig. 2(b,c,f,h,i), the curve of BCC_{MOD} falls the fastest, which means that BCC_{MOD} can break down the network quickly. In Fig. 2(d,g), the falling speed of BCC_{MOD} is close to the best among all methods. In Fig. 2(a), the size of the giant component σ drops quickly, although it drops relatively slowly at the beginning. These results demonstrate that BCC_{MOD} can quickly decompose networks in most cases.
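One way to obtain the whole σ curve cheaply is to replay the removals backwards: start from the empty graph and re-add edges from the lowest to the highest score with a union-find structure. The sketch below makes several assumptions that are not in the text: nodes are labelled 0..n-1, `ranked_edges` is sorted from highest to lowest score, and σ is reported as a raw node count (divide by n for a normalized value). The function name is hypothetical.

```python
def giant_component_curve(n, ranked_edges):
    """sigma[m_r] = size of the largest component after removing the top m_r edges."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    largest = 1 if n > 0 else 0
    sigma = [largest]                       # state with every edge removed
    for u, v in reversed(ranked_edges):     # re-add lowest-ranked edges first
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru                 # union by size
            size[ru] += size[rv]
            largest = max(largest, size[ru])
        sigma.append(largest)
    sigma.reverse()                         # index by number of removed edges
    return sigma
```

For example, on the path 0-1-2-3 with the middle edge ranked first, `giant_component_curve(4, [(1, 2), (0, 1), (2, 3)])` gives `[4, 2, 2, 1]`.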
SIR model
In the SIR model, there are three states: (1) S(t) denotes the number of susceptible nodes, which are not yet infected but may become infected; (2) I(t) denotes the number of nodes that have been infected and can spread the disease or information to susceptible nodes; (3) R(t) denotes the number of nodes that have recovered from the disease, or lost interest in the information, and will never be infected again. In a network, at each step every infected node infects each of its susceptible neighbors with probability μ, and infected nodes recover with probability β (for simplicity, β = 1 in this report). The process stops when no infected node remains. To estimate the influence of a single node, we set that node to be infected and all other nodes to be susceptible. The normalized final affected scale is defined as
\(F({t}_{c},u)=\frac{{n}_{R}({t}_{c},u)}{n}\)
where n_{R}(t_{c}, u) is the number of finally affected nodes when node u is infected initially under the SIR model and F(t_{c}, u) is the normalized final scale. To estimate the influence of edges, we calculate the average influence of all nodes after removing a certain fraction of edges. We use the index
\({R}_{s}=\frac{{F}^{(1)}({t}_{c})-{F}^{(2)}({t}_{c})}{{F}^{(1)}({t}_{c})}\)
where F^{(i)}(t_{c}) is the average final infected scale over all nodes, i.e., \({F}^{(i)}({t}_{c})=\frac{1}{n}{\sum }_{u\in V}F({t}_{c},u)\), and F^{(1)}(t_{c}) and F^{(2)}(t_{c}) are the results for the original network and for the network after a fraction p of the edges has been removed, respectively.
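The SIR evaluation can be illustrated with a small Monte Carlo sketch. It rests on assumptions that the text does not spell out: each infected node tries to infect each susceptible neighbor independently with probability μ, β = 1 so a node is infectious for exactly one step, the seed counts towards n_{R}, and R_{s} is the relative drop (F^{(1)} − F^{(2)})/F^{(1)}. Function names and the `runs` and `seed` parameters are hypothetical.

```python
import random

def sir_final_size(adj, source, mu, rng):
    """One SIR run with beta = 1; returns n_R, the number of nodes ever infected."""
    recovered = {source}                 # every node that is or was infected
    infected = [source]
    while infected:
        newly = []
        for u in infected:
            for v in adj[u]:
                if v not in recovered and rng.random() < mu:
                    recovered.add(v)
                    newly.append(v)
        infected = newly                 # previous generation recovers (beta = 1)
    return len(recovered)

def average_scale(n, edges, mu, runs, rng):
    """F(t_c): mean of n_R(t_c, u) / n over all seeds u and all runs."""
    adj = {u: [] for u in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = sum(sir_final_size(adj, u, mu, rng)
                for u in range(n) for _ in range(runs))
    return total / (n * runs * n)

def relative_difference(n, edges, removed, mu, runs=200, seed=0):
    """R_s = (F1 - F2) / F1, with F2 measured after deleting `removed` edges."""
    rng = random.Random(seed)
    removed_set = {frozenset(e) for e in removed}
    kept = [e for e in edges if frozenset(e) not in removed_set]
    f1 = average_scale(n, edges, mu, runs, rng)
    f2 = average_scale(n, kept, mu, runs, rng)
    return (f1 - f2) / f1
```

With μ = 1 the outbreak always fills the seed's connected component, which gives a quick sanity check: cutting the middle edge of the path 0-1-2-3 halves the average outbreak, so R_{s} = 0.5.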
Table 3 shows the Spearman correlation coefficients between the ranking scores and the relative differences of the real infected scale R_{s} with μ/μ_{c} = 2, where \({\mu }_{c}=\frac{\langle k\rangle }{\langle {k}^{2}\rangle -\langle k\rangle }\) in this report; all results are averaged over 200 independent runs. Edges are sorted in descending order of score and divided into 50 parts. At each step, only one part (2% of the edges) is removed while the other 49 parts remain, and the corresponding relative difference of the real infected scale is calculated. Finally, two sequences are obtained (the scores of each 2% part of the edges and the corresponding relative differences of the real infected scale), and the Spearman correlation coefficient between them is computed. Table 3 shows that BCC_{MOD} has the largest Spearman correlation in PowerGrid, Lesmis, Router, Jazz, Innovation, Train and Email. These results demonstrate that the edges that BCC_{MOD} removes preferentially have a greater impact on the dissemination of real information.
Figure 3 shows the relative differences of the real infected scale R_{s} after removing the top 5% of ranked edges under different infection rates. BCC_{MOD} has a higher R_{s} under different infection rates than the Jaccard, Bridgeness, Betweenness and Reachability methods. In general, removing the top 5% of edges ranked by BCC_{MOD} has a significant impact on information spreading.
Figure 4 shows the relative differences of the real infected scale R_{s} for different fractions p of removed edges with μ/μ_{c} = 2. BCC_{MOD} has a higher R_{s} than the other methods across the different fractions of removed edges. These results demonstrate that, when only a small part of the edges is removed, BCC_{MOD} has a greater impact on information spreading than the other methods.
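The two numerical ingredients of this comparison, the threshold μ_{c} and the Spearman coefficient, can be sketched as follows. The threshold is assumed to take the usual form ⟨k⟩/(⟨k²⟩ − ⟨k⟩), and the Spearman coefficient is computed as the Pearson correlation of average ranks (ties share their mean rank). How a single score is assigned to each 2% part is not stated in the text, so that step is left out. Function names are hypothetical.

```python
def epidemic_threshold(n, edges):
    """mu_c = <k> / (<k^2> - <k>) for an undirected graph on nodes 0..n-1.

    Undefined (division by zero) when every node has degree 1.
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    k1 = sum(deg) / n
    k2 = sum(d * d for d in deg) / n
    return k1 / (k2 - k1)

def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation (assumes neither sequence is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

In the setting described above, `x` would hold the 50 part scores and `y` the 50 measured R_{s} values, with the infection rate set to twice `epidemic_threshold(n, edges)`.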
Discussion
In this report, the results show that if many different cliques contain both endpoints of an edge, then the edge is not important from the perspective of spreading. We propose a global structural index, called BCC_{MOD}, and compare it with four well-known topological indices using the susceptibility index S, the size of the giant component σ and the SIR model. The results show that BCC_{MOD} performs well in identifying critical edges with respect to both network connectivity and spreading dynamics. As indicated by the experiments on the SIR model, BCC_{MOD} is effective in quantifying the spreading influence of edges. This will help in real-life applications such as controlling the spread of diseases or rumors and withstanding targeted attacks on network infrastructure. Moreover, formal definitions of cliques generally assume that network links are undirected. In directed networks the definition of cliques has to be modified^{29,30}, and the algorithm for mining critical edges changes correspondingly. Although the method performs well, its high computational complexity prevents its use in large-scale networks. In BCC_{MOD}, the degrees of all nodes must be determined (running time O(m)), and the time complexity of calculating the betweenness centrality of all edges in undirected networks is O(mn)^{31}. The time complexity of finding all cliques in undirected networks is O(M(n)), where M(n) is the cost of multiplying two n × n matrices^{32} (for sparse matrices, M(n) is O(n^{2})). The computational complexity of BCC_{MOD} is therefore O(mn + M(n)) in undirected networks. BCC_{MOD} is a global index with a moderate computational load and is expected to be applicable to small and medium-sized undirected networks. Optimizing the algorithm for large-scale networks and for directed networks will be part of our future work.
Besides the SIR model, there are other well-known dynamical processes that can be used to measure the importance of edges. For example, the susceptible-infected-susceptible (SIS) spreading model^{33} can examine how much information passes through an edge over a period of time.
Methods
Betweenness centrality
The betweenness centrality of edges expresses the idea that the more shortest paths between node pairs pass through an edge e(u, v), the more important that edge is. The betweenness centrality of an edge e(u, v)^{15} is defined as:
\(BC(u,v)={\sum }_{s\ne t}\frac{{\delta }_{st}(u,v)}{{\delta }_{st}}\)
where δ_{st} is the number of all shortest paths between node s and node t, and δ_{st}(u, v) is the number of those shortest paths that pass through the edge e(u, v). The larger the score BC is, the more important the edge is.
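Edge betweenness for unweighted graphs can be computed with a Brandes-style accumulation (one breadth-first search per source followed by a backward pass). The sketch below is illustrative and assumes a simple undirected graph with nodes labelled 0..n-1. It counts each unordered pair {s, t} once; counting ordered pairs instead would double every score without changing the ranking. The function name is hypothetical.

```python
from collections import deque

def edge_betweenness(n, edges):
    """Return {frozenset((u, v)): BC(u, v)} for an unweighted undirected graph."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    bc = {frozenset(e): 0.0 for e in edges}
    for s in range(n):
        dist = [-1] * n
        sigma = [0] * n                  # number of shortest s-v paths
        preds = [[] for _ in range(n)]   # predecessors on shortest paths
        dist[s], sigma[s] = 0, 1
        order = []
        queue = deque([s])
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = [0.0] * n                # dependency of s on each node
        for w in reversed(order):        # farthest nodes first
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bc[frozenset((v, w))] += c
                delta[v] += c
    # every unordered pair was visited from both endpoints, so halve the totals
    return {e: val / 2.0 for e, val in bc.items()}
```

On the path 0-1-2-3, the middle edge lies on the shortest paths of four node pairs and each end edge on three, so the scores are 4.0 and 3.0.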
Critical edge identification method
Generally, from the perspective of information spreading, the more important the two endpoints of an edge are, the more important the edge is. On the other hand, if many different cliques contain e(u, v), then even if e(u, v) is removed, information can still spread easily from u to v (or from v to u) through other edges in these cliques. Based on these two points, combined with the betweenness centrality of edges, a new index BCC_{MOD} (Betweenness Centrality and Clique Model)
can be defined to measure the importance of an edge e(u, v), where BC(u, v) is the betweenness centrality of edge e(u, v), k_{u} and k_{v} are the degrees of node u and node v respectively, and C(u, v)_{i} is the number of cliques of size i containing edge e(u, v) (in this report, a clique means any fully connected subgraph, not necessarily a maximal one). For example, C(u, v)_{4} = 3 means that there are three cliques of size 4 containing edge e(u, v). In this method, the larger the score is, the more important the edge is. For example, as shown in Fig. 5(a,c), the degrees of nodes 1 and 2 are 7 and 8 respectively. In Fig. 5(a) (maximum clique size 4), C(1, 2)_{3} is 5 and C(1, 2)_{4} is 2. When edge e(1, 2) is removed, there are still many paths from node 1 to node 2, so the effect on spreading is small. However, in Fig. 5(c) (maximum clique size 3), where C(1, 2)_{3} is 1, removing edge e(1, 2) has a large effect on spreading, since there is only one path (1, 3, 2) from node 1 to node 2. Table 4 shows the effect probability p_{e} of nodes 2, 3 and 9, with node 1 as the original infected source, under the SIR spreading model with a full contact process. Taking node 2 as an example, in Fig. 5(a,b) its effect probability is 0.3733 and 0.2240 respectively under μ = 0.2. However, in Fig. 5(c,d), the effect probability of node 2 is 0.2392 and 0.0380 respectively under μ = 0.2.
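The clique counts C(u, v)_{i} can be obtained by enumerating cliques inside the common neighborhood of u and v: every clique of size j among the common neighbors corresponds to one clique of size j + 2 containing the edge. The sketch below is a brute-force illustration with exponential worst-case cost, not the authors' algorithm. It assumes comparable node labels (e.g. integers) and reports only sizes i ≥ 3. Function names are hypothetical.

```python
def build_adjacency(edges):
    """Undirected adjacency as a dict: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def clique_counts_for_edge(adj, u, v):
    """Return {i: C(u, v)_i}, the number of size-i cliques containing edge (u, v)."""
    counts = {}

    def extend(candidates, size):
        # `size` is the size of the current clique, which already holds u and v;
        # every node in `candidates` is adjacent to all members of that clique.
        for w in sorted(candidates):
            counts[size + 1] = counts.get(size + 1, 0) + 1
            # keep only larger labels so that each clique is generated once
            extend({x for x in candidates if x > w and x in adj[w]}, size + 1)

    extend(adj[u] & adj[v], 2)
    return counts
```

This matches the example in the text: five common neighbors with two edges among them give C_{3} = 5 and C_{4} = 2. For an edge of the complete graph on four nodes, the result is `{3: 2, 4: 1}`.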
The Jaccard coefficient of an edge e(u, v) is defined as
\(J(u,v)=\frac{|{\Gamma }_{u}\cap {\Gamma }_{v}|}{|{\Gamma }_{u}\cup {\Gamma }_{v}|}\)
where u and v are the two endpoints of the edge e(u, v) and Γ_{u} is the set of u's neighbors.
The Bridgeness index of an edge e(u, v) is defined as
\(B(u,v)=\frac{\sqrt{{S}_{u}{S}_{v}}}{{S}_{e(u,v)}}\)
where S_{u}, S_{v} and S_{e(u, v)} are the sizes of the largest cliques that contain node u, node v and edge e(u, v), respectively.
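Both baseline indices are easy to prototype. The sketch below assumes the common forms |Γ_{u} ∩ Γ_{v}| / |Γ_{u} ∪ Γ_{v}| for the Jaccard coefficient and sqrt(S_{u} S_{v}) / S_{e(u, v)} for Bridgeness, and finds largest cliques by exhaustive search, which is only practical for small graphs. Node labels are assumed to be comparable (e.g. integers) and all function names are hypothetical.

```python
import math

def build_adjacency(edges):
    """Undirected adjacency as a dict: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def jaccard(adj, u, v):
    """Shared neighbours of u and v divided by all neighbours of u or v."""
    return len(adj[u] & adj[v]) / len(adj[u] | adj[v])

def max_clique_size_containing(adj, nodes):
    """Size of the largest clique containing every node in `nodes`
    (the given nodes are assumed to be pairwise adjacent)."""
    nodes = list(nodes)
    candidates = set.intersection(*(adj[x] for x in nodes))

    def grow(cands):
        best = 0
        for w in sorted(cands):
            # restrict to larger labels adjacent to w: each clique is tried once
            best = max(best, 1 + grow({x for x in cands if x > w and x in adj[w]}))
        return best

    return len(nodes) + grow(candidates)

def bridgeness(adj, u, v):
    """sqrt(S_u * S_v) / S_e for the edge (u, v)."""
    s_u = max_clique_size_containing(adj, [u])
    s_v = max_clique_size_containing(adj, [v])
    s_e = max_clique_size_containing(adj, [u, v])
    return math.sqrt(s_u * s_v) / s_e
```

For two triangles joined by a single edge, the joining edge has Jaccard coefficient 0 and Bridgeness 1.5, while an edge inside a triangle has Bridgeness 1.0. Under the ranking conventions described earlier, a lower Jaccard value and a higher Bridgeness value both mark the joining edge as the more critical one.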
The Reachability index of edge e(u, v) is defined as
where V is the number of nodes, G_{e} is the subnetwork obtained by removing an edge e(u, v) from the original network and \(R(s;{G}_{e(u,v)})\) is the number of nodes reachable from a node s over G_{e}.
References
1. Newman, M. E. The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003).
2. Bonacich, P. Factoring and weighting approaches to status scores and clique identification. J. Math. Sociol. 2, 113–120 (1972).
3. Chen, D., Lü, L., Shang, M.-S., Zhang, Y.-C. & Zhou, T. Identifying influential nodes in complex networks. Phys. A 391, 1777–1787 (2012).
4. Kitsak, M. et al. Identification of influential spreaders in complex networks. Nat. Phys. 6, 888 (2010).
5. Lü, L., Zhou, T., Zhang, Q.-M. & Stanley, H. E. The h-index of a network node and its relation to degree and coreness. Nat. Commun. 7, 10168 (2016).
6. Pastor-Satorras, R. & Castellano, C. Topological structure and the h index in complex networks. Phys. Rev. E 95, 022301 (2017).
7. Freeman, L. C. Centrality in social networks conceptual clarification. Soc. Netw. 1, 215–239 (1978).
8. Freeman, L. C. A set of measures of centrality based on betweenness. Sociom. 35–41 (1977).
9. Hage, P. & Harary, F. Eccentricity and centrality in networks. Soc. Netw. 17, 57–63 (1995).
10. Brin, S. & Page, L. The anatomy of a large-scale hypertextual web search engine. Comput. Networks 30, 107–117 (1998).
11. Lü, L., Zhang, Y.-C., Yeung, C. H. & Zhou, T. Leaders in social networks, the delicious case. PLoS ONE 6, e21202 (2011).
12. Kleinberg, J. M. Authoritative sources in a hyperlinked environment. J. ACM 46, 604–632 (1999).
13. Zhou, Y., Hao, J. K. & Glover, F. Memetic search for identifying critical nodes in sparse graphs. arXiv preprint arXiv:1705.04119 (2017).
14. Giuraniuc, C. et al. Trading interactions for topology in scale-free networks. Phys. Rev. Lett. 95, 098701 (2005).
15. Girvan, M. & Newman, M. E. Community structure in social and biological networks. Proc. Natl. Acad. Sci. 99, 7821–7826 (2002).
16. Wang, Z., He, J., Nechifor, A., Zhang, D. & Crossley, P. Identification of critical transmission lines in complex power networks. Energies 10, 1294 (2017).
17. Zio, E. et al. Identifying groups of critical edges in a realistic electrical network by multi-objective genetic algorithms. Reliab. Eng. Syst. Saf. 99, 172–177 (2012).
18. Saito, K., Kimura, M., Ohara, K. & Motoda, H. Detecting critical links in complex network to maintain information flow/reachability. In Pacific Rim International Conference on Artificial Intelligence, 419–432 (Springer, 2016).
19. Wong, P. et al. Finding k most influential edges on flow graphs. Inf. Syst. 65, 93–105 (2017).
20. Hamers, L. et al. Similarity measures in scientometric research: The Jaccard index versus Salton's cosine formula. Inf. Process. Manag. 25, 315–318 (1989).
21. Cheng, X.-Q., Ren, F.-X., Shen, H.-W., Zhang, Z.-K. & Zhou, T. Bridgeness: a local index on edge significance in maintaining global connectivity. J. Stat. Mech: Theory Exp. 2010, P10011 (2010).
22. Pei, S., Muchnik, L., Tang, S., Zheng, Z. & Makse, H. A. Exploring the complex pattern of information spreading in online blog communities. PLoS ONE 10, e0126894 (2015).
23. Zhu, H., Yin, X., Ma, J. & Hu, W. Identifying the main paths of information diffusion in online social networks. Phys. A 452, 320–328 (2016).
24. Kimura, M., Saito, K. & Motoda, H. Blocking links to minimize contamination spread in a social network. ACM Trans. Knowl. Discov. Data 3, 9 (2009).
25. Newman, M. E. Spread of epidemic disease on networks. Phys. Rev. E 66, 016128 (2002).
26. Bunde, A. & Havlin, S. Fractals and disordered systems (Springer Science & Business Media, 2012).
27. Dereich, S. et al. Random networks with sublinear preferential attachment: the giant component. The Annals Probab. 41, 329–384 (2013).
28. Chicago network dataset – KONECT, October (2016).
29. Seidman, S. B. Clique-like structures in directed networks. J. Math. Sociol. 3, 43–54 (1980).
30. Palla, G., Farkas, I. J., Pollner, P., Derényi, I. & Vicsek, T. Directed network modules. New J. Phys. 9, 186 (2007).
31. Brandes, U. A faster algorithm for betweenness centrality. J. Math. Sociol. 25, 163–177 (2001).
32. Cazals, F. & Karande, C. A note on the problem of reporting maximal cliques. Theor. Comput. Sci. 407, 564–568 (2008).
33. Pastor-Satorras, R. & Vespignani, A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. 86, 3200 (2001).
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China under Grant Nos 61433014 and 61673085.
Author information
Contributions
E.Y.Y. and D.B.C. designed the research and prepared all figures. E.Y.Y. performed the experiments and analyzed the data. All authors wrote the manuscript.
Corresponding author
Correspondence to DuanBing Chen.
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.