Introduction

Many man-made infrastructures such as the Internet, transportation networks, and electric power grids can be represented as complex networks1. Because these complex networks play an important role in society, their robustness is pivotal2,3,4. However, most of these infrastructures have been found to be heterogeneous and to have a power-law degree distribution1,5,6. With their “heavy-tailed” properties, these complex networks have been found to be robust against random attacks; however, they are rather fragile under malicious attacks, especially cascade-based attacks7,8.

Cascading failures are common in modern social networks. For example, in electrical power grids, when a power transmission station or a power line goes down, its power will be shifted to the nearby stations (lines). In most cases, neighbouring stations can manage the extra load. However, in some extreme circumstances, these neighbouring stations may become overloaded and fail, resulting in a redistribution of their loads to their neighbours. Ultimately, the redistribution effort may lead to a cascading failure in which a large number of power transmission stations (lines) are overloaded and, consequently, malfunction9. Cascading failures may also take place on the Internet. The load on an Internet router represents data packets that must be transmitted per unit of time, and overloading corresponds to congestion10. Rerouting data packets from a congested router to another might spread the congestion to a large fraction of subnetworks. Some Internet collapses have been caused by congestion11. Another example is a power grid, in which each component is designed to deal with a specific load of power. On August 14, 2003 in Canada and the northeastern United States, a massive power blackout occurred that led to a cascading failure12. A similar breakdown occurred in southern Oregon on August 10, 1996 13,14.

Cascading failures in complex networks have been widely studied over the past few decades15,16,17,18,19,20,21,22,23,24. Different cascading failure models have been proposed to reproduce cascading phenomena. Motter et al. first proposed the “C-L” model in ref. 18, performing experiments on both random and scale-free networks that focused on cascading triggered by the failure of a single node. The “C-L” model obtained good results that were consistent with experts’ intuition about how cascading failures occur. Crucitti et al.19 introduced a dynamical model that considered the dynamical redistribution of flow in networks, in which overloaded nodes obstruct network traffic rather than removed. Zhao et al.20 provided a mathematical proof of the “C-L” model in scale-free networks that analysed the cascading breakdown in scale-free networks in terms of phase transitions. Feng et al.23 proposed an approach of simple, self-consistent probability equations to study cascading behaviours in interdependent networks and showed that this approach can greatly simplify the mathematical analysis of systems ranging from single-layer networks to various types of interdependent networks. Hu et al.24 used a percolation approach to study more realistic coupled networks system in which both interdependent and interconnected links exist and found rich and unusual phase-transition phenomena—including mixed first- and second-order hybrid transitions.

Based on different cascading failure models, various strategies have been proposed to enhance network robustness against cascading failures. Koç et al.25 proposed a robust metric for cascading failures on power grid networks; an entropy-based metric was introduced in ref. 26. Wang et al.27 studied cascading failures on the Internet. Based on a new cascading edge model, they proposed some methods to protect the Internet from cascading failures. However, all the above methods focused only on cascades triggered by removing one or two nodes, and such methods cannot evaluate the overall robustness of networks against cascading failures and may not be useful in many practical applications. Additionally, these methods rarely take the cost involved in updating the real-world systems into account.

Considering only the cascading failure resulting from removing individual nodes in networks is insufficient because many of the remaining nodes are still connected; therefore, the network still maintains its integrity to a certain extent. In contrast, in this paper, we first design a new robustness measure to evaluate the overall robustness of networks against cascading failures. In this robustness measuring scheme, the network is attacked through cascading failures repeatedly until the entire network collapses. During this process, after each cascade attack, the remaining large network components are calculated.

Based on this measure, we propose a memetic algorithm (MA) called MA-Rcf that enhances network robustness against cascading failures. Memetic algorithms form a popular branch of evolutionary algorithms (EAs) that successfully combine global and local searches and have been shown to be more efficient and more effective than traditional EAs for many problems28,29,30. In a previous study, we designed a new memetic algorithm named MA-RSFMA, which improves the robustness of scale-free networks against malicious attacks that achieved a good performance31. Thus, the algorithm proposed here, MA-Rcf, is based on the framework developed for MA-RSFMA and makes use of the properties of cascading failures to design new operators; that is, a new local search operator that considers the characteristic of cascading failures is designed for MA-Rcf. In MA-Rcf, the degree distribution of networks is also kept unchanged to minimize the costs of updating real-world systems. Both synthetic and real-world networks are used to validate the performance of the MA-Rcf. The experimental results show that the MA-Rcf can enhance the network robustness against cascading failures efficiently. Moreover, some properties of robust networks are also analysed.

Methods

Robustness Measure for Cascading Failures

A network can be modelled as a graph, G = (V, E), where V = {1, 2, …, N} is a set of N nodes and E = {ejk| j, kV and j ≠ k} is a set of M links. In ref. 18, Motter et al. proposed the “C-L” model for cascading failures in which, for a given network, at each time step, one unit of the relevant quantity (such as energy or goods) is exchanged between each pair of nodes and transmitted along the shortest connecting path. The “load” at a node consists of the total number of shortest paths passing through it32,33. Each node carries the maximum load that it can handle, and in man-made networks, node capacity is limited by economic costs. The capacity Ci of node i and its initial load Li have the following proportional relation:

where the constant α is a tolerance parameter, and is the initial load of the ith node. Initially, the network operates in a free-flow state insofar as α ≥ 0. However, the failure of a node for any reason triggers the dynamics of the redistribution of loads. When the load at a node becomes larger than the node’s capacity, the node fails. This forces the load previously carried by that node to shift to its neighboring nodes, which in turn, can cause them to fail. Consequently, subsequent failures can occur, and this step-by-step process is a cascading failure18.

In ref. 34, Schneider et al. proposed an effective robustness measure, R, to evaluate networks’ ability to resist targeted attacks on individual nodes. The R measure is based on the “giant component,” namely, the largest connected component left in the network after each node removal. To calculate R, the network must be attacked until only separated nodes are left. Thus, R can evaluate the robustness of entire networks. Therefore, we combine the “C-L” model with R to design a new measure, Rcf, which can evaluate the overall robustness of networks against cascading failures. With the original property of “C-L” model in mind, the process for calculating Rcf is described below.

Step 1. Ssum ← 0 and t ← 1, where Ssum is the accumulated size of the giant components and t is the index of cascaded attack rounds;

Step 2. Calculate the initial load and Ci of each node;

Step 3. Remove the node with the maximum load and all the edges connected to it;

Step 4. If the number of remaining nodes is equal to 1, go to Step 6;

Step 5. Recalculate the of each remaining node:

If a remaining node i is overloaded, namely, Li > Ci, then remove node i and the edges connected to it, then, go to Step 4;

If no remaining node is overloaded, calculate the relative size of the giant component St, SsumSsum+St, tt+1, then, go to Step 3;

Step 6. Calculate the robustness measure Rcf against cascading failures as follows,

where T is the total number of rounds that a cascading failure-based attack needs to destroy the entire network, reducing it to only one node. Obviously, T may vary even for networks of the same size; therefore, the normalization factor 1/N ensures comparability for the robustness of networks with different sizes.

Memetic Algorithm to Enhance Rcf

Memetic algorithms have been shown to be highly capable of searching for the optimal solution in optimization problems28,29,30. In ref. 31, we designed a new memetic algorithm named MA-RSFMA to improve the robustness of scale-free networks against malicious attacks, and it achieved a good performance31. Thus, based on the framework of MA-RSFMA, in this paper, we propose a new memetic algorithm, MA-Rcf, to enhance the overall robustness of networks against cascading failures. By considering the intrinsic property of cascading failures, we design a new local search operator for MA-Rcf that takes Rcf as its objective function while keeping the degree of each node unchanged. Next, we introduce the representation of chromosomes and the initialization process. Then, we describe the evolutionary operators, including the newly designed local search operator. Finally, we summarize the entire framework of the MA-Rcf algorithm.

Representation and initialization

In the MA-Rcf algorithm, each chromosome represents a network. Initially, MA-Rcf has a population with W chromosomes. The initialization process for MA-Rcf is the same as that used for MA-RSFMA31. During the initialization, because we need to preserve the number of links and the degree of each node, each chromosome is generated by randomly adjusting a fraction of the edges in the initial network, G0—that is, the connections of two randomly chosen edges that have no common nodes are swapped in the network. During the initialization, the goal is to generate different networks with the same degree distribution; therefore, any edge-swapping operations that can keep the network connected are accepted without checking whether the swap improves the robustness of the network. The details of this initialization process are summarized in Algorithm 1 (also refer to ref. 31 for more information).

Evolutionary operators

In evolutionary algorithms, crossover operators are often performed to exchange genetic information among the individuals in the population. In ref. 31, a new crossover operator that operates on complex networks is designed that exchanges the structures of two networks effectively without changing their degree distributions. We also employ this crossover operator in this paper. This crossover operator (whose details are summarized in Algorithm 2) acts on two randomly selected parent chromosomes and generates a pair of child chromosomes. Please refer to ref. 31 for more information about this crossover operator.

The local search operator is another important operator in MAs. In ref. 35, Tanizawa et al. found that networks with an “onion-like” structure, where nodes with almost the same degree are connected to each other, are more robust under targeted attacks than those without such onion-like structures. Inspired by this, to search for networks that are the most robust against cascading failures, we design a new local search operator that lets nodes with similar loads connect to each other. The principle of this operator is simple: if a node with small load connects to a node with a very large load, the small-load node will crash immediately if the large-load node fails, because the small-load node has insufficient capacity to handle the extra load. Therefore, connecting nodes with similar loads to each other have a high probability of avoiding such situations. Suppose edges eij and epq are selected and are swapped to eip and ejq. This swap is accepted only if

where Li, Lj, Lp and Lq are the loads of the corresponding nodes, and β is a parameter in the range of [0, 1] that controls the acceptance level for the difference in loads between nodes. The smaller the value of β is, the stronger the constraint is and, thus, the larger the reduction in load differences is. Consequently, this operator can effectively guarantee that nodes with similar loads will be connected to each other, which enhances the search for load-balanced networks. The details of this local search operator are given in Algorithm 3.

MA-Rcf uses binary tournament selection in each generation to select the chromosomes for the next population. Binary tournament selection involves a “tournament” between two chromosomes chosen randomly from the population in which the winner is the chromosome whose fitness is better. Binary tournament selection is a popular method for selecting better chromosomes from a population in an evolutionary algorithm.

Implementation of MA-Rcf

In MA-Rcf, the initialization operator is first used to generate an initial population with W chromosomes. In each generation, the crossover operator is applied to the population first, and then, the local search operation is conducted. After performing the crossover operator, a new child population is obtained. Then, the local search operator and the binary tournament selection operator are applied to both the parent and child populations to generate the child population for the next generation. Finally, the best chromosome in the last population is the most robust network found. The framework of MA-Rcf is summarized in Algorithm 4.

Results

In this section, because scale-free networks have become an important type of network, experiments are conducted on both synthetic scale-free networks and real-world networks to validate the performance of MA-Rcf. We also study some of the network properties of the robust networks obtained by MA-Rcf. The synthetic scale-free networks were generated using the BA model5, and their average degree was set to 4. In ref. 31, Zhou et al. showed that MA-RSFMA can improve the robustness of scale-free networks against malicious attacks effectively; consequently, we also compare networks optimized by MA-Rcf with those optimized by MA-RSFMA to investigate the different properties of both optimized networks.

The parameter α in (1) reflects the capacity of a node to handle its load. A larger α indicates a stronger node. The value of α is always assumed to be in the range [0, 1]: a α > 1 is unrealistically large18,19. In this work, we assume that ability of a node to handle its loads is average (neither very strong nor very weak). Thus, in the following experiments, α is set to a median value, 0.5.

In the local search operator, the tolerance parameter β controls the percentage by which loads can differ between connected nodes. Therefore, we first conducted an experiment to find an appropriate value for β. This experiment used BA networks with 100 nodes. The robustness obtained by MA-Rcf under different values of β is plotted in Fig. 1. The results are averaged over ten independent runs on each sampled point. As Fig. 1 shows, MA-Rcf achieves the highest robustness when β equals 0.8. Thus, β is set to 0.8 in the following experiments.

Figure 1
figure 1

The effect of parameter β on the performance of MA-Rcf.

The other parameters of MA-Rcf were set as follows: the population size W was set to 10, the crossover probability pc and the local search probability pl were set to 0.8 and 0.5, respectively, and the maximum number of objective function evaluations was set to 5 × 104.

Experiments on Synthetic Networks

In this experiment, scale-free networks with different scales were used to test the performance of MA-Rcf. First, an experiment was carried out to test the effectiveness of the local search operator. In this experiment, versions of MA-Rcf both with and without the local search operator were tested on BA networks with 100, 200, 300, and 500 nodes. The obtained robustness values are listed in Table 1, which shows that the MA-Rcf version with the local search operator always performs better than the version without the local search operator. Therefore, the local search operator in MA-Rcf is effective.

Table 1 The Rcf values of BA networks obtained by MA-Rcf with and without the local search operator.

Next, some experiments were conducted to test the ability of MA-Rcf to search for the most robust networks. Network structure optimization is a hard optimization problem. In existing works, the hill climbing algorithm34 and the simulated annealing algorithm36,37 are widely used to address this problem. Thus, we compared the performance of MA-Rcf with that of the hill climbing algorithm34 and the simulated annealing algorithm36,37. The maximum number of objective function evaluations for these two algorithms was also set to 5 × 104 to obtain the results of these three algorithms at the same computational cost.

We tested BA networks with 100, 200, 300, and 500 nodes. The best, worst and average values of Rcf of the three algorithms over 10 independent runs are reported in Table 2. In addition, the corresponding curves of the average robustness obtained by the different algorithms are plotted in Fig. 2. As shown, MA-Rcf obtains the highest robustness values among these algorithms on all test networks. That is, MA-Rcf always finds more network structures more robust to cascading failures than do the other algorithms.

Table 2 The Rcf of BA networks of different sizes obtained by the three tested algorithms.
Figure 2
figure 2

A comparison between MA-Rcf and existing algorithms on BA networks of different sizes.

It is useful to study the robustness of the network structures obtained MA-Rcf. Thus, the network topologies of BA networks before and after optimized by MA-Rcf are plotted in Fig. 3, where the size of each node is proportional to its degree. As shown, before optimization, low degree nodes are often connected to nodes with high degrees; consequently, the entire network is composed of numerous star networks with hub nodes. However, the optimized networks which have higher Rcf, the low degree nodes are more likely to be connected to other low degree nodes and the high degree nodes are more likely to be connected to other high degree nodes. The entire network is a hub-node-connected structure. Considering the property of cascading failures, it is easy to understand why hub-node-connected networks have a stronger ability to resist cascade failures: when a hub node fails, the neighbouring hub nodes can withstand the additional loads effectively, preventing the spread of cascading failures.

Figure 3
figure 3

The network topology before and after optimization by MA-Rcf.

The size of each node is proportional to its degree.

Next, we carried out an experiment to test how well the networks obtained by MA-Rcf resist cascading failures. We simulated the process of cascaded failures on BA networks with 200 nodes until the size of the giant component decreased to 1. The decreasing progress of the size of the giant component St is plotted in Fig. 4. As shown, along with the increasing cascade attack circle t, the MA-Rcf optimized network, which has a higher Rcf value, protects the giant component more effectively. The area between the curve of the “Initial BA network” and the curve of the “MA-Rcf optimized network” in Fig. 4 represents the amount of cascade failure mitigation, which corresponds to improving network robustness against cascade failures by 153%. These results show that the networks obtained by MA-Rcf can resist cascading failures effectively.

Figure 4
figure 4

The change in the relative size of the giant component St over a series of cascaded attack circles t.

The BA network has 200 nodes. The Rcf values of the initial BA network, the optimized network obtained by MA-RSFMA and the optimized network obtained by MA-Rcf were 0.0521, 0.0558 and 0.1319, respectively.

In ref. 31, Zhou et al. found that the onion-like network in which nodes with similar degree connect to each other can contribute to resisting high node degree attacks, we plotted the decreasing process of the size of St of networks optimized by MA-RSFMA under cascading failures in Fig. 4, which has 200 nodes. In Fig. 5, we separately plotted the decreasing process of St of networks optimized by both MA-Rcf and MA-RSFMA under high node degree attacks. In each attack circle, the node with largest degree and all the edges connected to it are removed. To perform a fair comparison, the parameters for MA-RSFMA were set to the same as those for MA-Rcf, namely, the population size was set to 10, the crossover probability and the local search probability were set to 0.8 and 0.5, respectively, and the maximum number of objective function evaluations was set to 5 × 104. In Fig. 4, under cascaded attack circles, the size of the St of the MA-RSFMA optimized networks decreases as fast as that of the initial BA network—even more sharply after the first several attack rounds. Moreover, under high node degree attack circles, the size of the St of the MA-Rcf optimized network decreases as fast as initial BA networks (see Fig. 5 for more details). In other words, the MA-Rcf algorithm cannot contribute to resisting high node degree attacks. These two experiments show that although the network structures optimized by these two algorithms have some similarity, their ability to resist cascading failures is significantly different.

Figure 5
figure 5

The change in the relative size of the giant component St under high node degree attack circles t.

The BA network has 200 nodes. The Rcf values of the initial BA network, the optimized network obtained by MA-RSFMA and the optimized network obtained by MA-Rcf were 0.0521, 0.0558 and 0.1319, respectively.

We are also interested in whether other important network properties might have changed because of the optimization. Therefore, in Table 3 shows the results of assessing the assortativity coefficient38, the average shortest path length and the global communication efficiency39 of networks both before and after being optimized by MA-Rcf. As shown, before the optimization, the BA networks have negative assortativity coefficient values that become positive after the optimization. In other words, the correlation degree of the networks changes from disassortativeness to assortativeness after the optimization. This occurs because the optimization process promotes the connection of high degree nodes with other high degree nodes. After the optimization, the average shortest path length of BA networks is slightly increased while the global communication efficiency has a slight decrease, which means that the optimization process has no significant effect on network communication efficiency.

Table 3 The changes in important network properties of BA networks with different sizes before and after optimization by MA-Rcf and MA-RSFMA, including the assortative coefficients (A), the average shortest path length (L) and the global communication efficiency (C).

Because the networks obtained by MA-RSFMA are also assortative, it is interesting to study the difference in terms of the network properties of networks obtained by both MA-Rcf and MA-RSFMA; these properties are listed in Table 3. We can see that when both algorithms optimize the same network, the network obtained by MA-Rcf is far less assortative than that obtained by MA-RSFMA. In addition, the average shortest path length of the MA-Rcf is only half that of MA-RSFMA. Moreover, the networks obtained by MA-Rcf have higher global communication efficiency.

Experiments on Real World Networks

In this section, MA-Rcf is applied to two real-world networks. One is an electrical power grid in Western Europe (mainly Portugal and Spain)40, labelled the WE Power grid network. It has 217 nodes and 320 edges. The average degree of the WE Power grid network is 2.95. The other network is the US air network—the US air transportation system41—consisting of 332 airports and 2126 air routes, in which the nodes represent airports and the edges present flights. The average degree of the US air network is 12.81. These two real networks are well connected and without any separate component.

The hill climbing algorithm, simulated annealing algorithm and MA-Rcf are used to independently optimize the above two networks. The obtained robustness values are reported in Table 4. As we can see, MA-Rcf always performs better than the two other algorithms. The network topologies of these two real networks before and after optimized by MA-Rcf are shown in Fig. 6. Comparing the network structure before and after optimization, we can see that, in the optimized networks, nodes with similar degrees are more likely to connect with each other, and the hub nodes are more closely connected to each other than before. Consequently, even for an existing network, MA-Rcf can find a structure more robust against cascading failure. By comparing the robust structure with the initial structure, we can find several key edges, which—if changed—would increase network robustness significantly. Considering the cost of optimization, there is no need to change all the edges of the real network; instead, MA-Rcf can help find the key edges.

Table 4 The robustness of networks after optimization with different algorithms on two real world networks.
Figure 6
figure 6

The network topology of two real world networks before and after optimization by MA-Rcf.

The size of each node is proportional to its degree.

The assortativity coefficient, average shortest path length and global communication efficiency of these two real networks both before and after optimization are reported in Table 5. As listed, the WE Power network has a positive assortativity coefficient while the US air network has a negative assortativity coefficient. After optimization by MA-Rcf, the WE Power network has stronger assortativeness, while the disassortativeness of the US air network gets weaker. This result is the same as the results of the experiments with synthetic networks, further verifying that networks with more hub nodes connected to each other have a stronger ability to resist cascading failures. Interestingly, after optimization by MA-Rcf, the average shortest path length of these two networks decreases while their global communication efficiency increases. In other words, MA-Rcf can not only increase a network’s global robustness against cascading failures but can also increase its global communication efficiency—even when applied to real networks.

Table 5 The changes in some important network properties of real world networks before and after optimization by MA-Rcf, including the assortative coefficients (A), the average shortest path length (L) and the global communication efficiency (C).

Discussions

Securing network infrastructure is critical in today’s society. When studying networks subject to cascading failures, considering only the damage from one or even several nodes is insufficient. In this paper, we describe the design of a more comprehensive index that can evaluate the ability of networks to resist cascading failures. To enhance networks resistance to cascading failures, we propose a memetic algorithm, MA-Rcf. Then, to test the performance of MA-Rcf, we tested it on both synthetic and real networks. The topologies of the robust networks obtained by MA-Rcf are shown and some of their network properties are discussed. From experiments comparing with MA-Rcf other network optimization algorithms, we can conclude that MA-Rcf achieves a better performance, showing that MA-Rcf is an effective algorithm for enhancing the robustness of networks against cascading failures.

When dealing with complex networks, the large computational complexity of calculating shortest paths limits the algorithms that rely on such calculations from being applied to large-scale networks. For example, the computation of Rcf in Equation (2) needs to calculate the shortest path of the network under each cascading attack circle; consequently, MA-Rcf is unable to optimize large networks at a low computational cost. However, studying the effects of MA-Rcf on cascading failures is still meaningful. MA-Rcf provides an opportunity to explore network structures that are robust against cascading failures. In this paper, we apply MA-Rcf to networks with specific sizes and study the topological structure and network properties of robust networks. The experiments show connecting hub nodes to each other more closely would be a good strategy when designing networks that are robust against cascading failures. Moreover, extending this discovery to large-scale networks is not difficult. In contrast with other traditional algorithms, MA-Rcf is a competitive algorithm for optimizing networks against cascading failures.

Additional Information

How to cite this article: Tang, X. et al. Mitigate Cascading Failures on Networks using a Memetic Algorithm. Sci. Rep. 6, 38713; doi: 10.1038/srep38713 (2016).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.