Introduction

Maximizing network resilience is of great importance because it helps to mitigate the impact of perturbations or failures and suggests an emergency solution to repair the network1,2,3,4. Recently, considerable research effort has been devoted to enhancing network resilience against malicious attacks5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25, including immunization strategies5,6,10,11,12,13,14,15,16 and topological construction methods17,18,19,20,21,22,23,24,25. Most of the immunization strategies map the problem onto the identification of vital nodes, which, if immunized, would mitigate the diffusion of a large-scale failure. However, these strategies do not fundamentally improve network resilience at the level of topological structure, and it is impossible to find a universal index that quantifies the importance of a node well in every situation16.

The problem of maximizing network resilience with topological construction is to find an optimal set of edge swaps (or edge additions). The heuristic edge-swap (ES) methods17,18,19,20,21 can enhance network resilience by modifying a network into a specific onion-like structure. However, these methods become prohibitively expensive to compute, especially for large-scale networks; moreover, the networks optimized by the ES methods undergo a substantial change in topological structure (towards onion-like structures), which affects the functionality of the original networks. In the heuristic edge-addition (EA) methods22,23,24,25, new edges are added to a given network between the nodes with the lowest degrees. The EA methods have low computational complexity; however, they have little effect on resilience optimization. Furthermore, neither the ES nor the EA methods can optimize network resilience globally. As a consequence, they cannot maintain the topological functionality of a network well, and their performance on resilience improvement cannot be guaranteed.

Measurement of resilience is essential for addressing the resilience optimization problem, yet there are no universally accepted indices of network resilience. Conventionally, the resilience (or robustness) of networks is measured by critical (percolation) threshold2,3,4,5,6 which is equivalent to the maximum external force in physical elastic systems. Hence, the measurement cannot fully characterize the elastic properties of nonlinear networks (see also Fig. S1). Ref.17 defined a robustness measurement R, but without mathematical deductive inference and physical properties. Other defined resilience metrics7,8 vary between extremes such as recoverability, adaptability and absorptivity9. Therefore, the problem of maximizing network resilience remains unsolved despite an abundance of heuristic methods17,18,19,20,21,22,23,24,25. More efforts are required for a general approach to maximize network resilience.

Here we address the problem of optimal resilience by finding an optimal (that is, minimal) set of structural edges. After introducing network resilience indices that can reflect the most essential resilience properties of network structure, we provide an optimal solution of the problem by means of a unified theoretical framework and the proposed indices. Further, we propose an algorithm of posteriorly adding (PA) edges to solve the resilience-optimization problem in artificial random networks and real networks26,27,28. Compared with competing approaches17,23, our algorithm achieves better network resilience performance. The main contributions of this paper are as follows: (1) by mapping a complex network onto a physical elastic system, we introduce indices of network resilience, which can better characterize the elastic properties for nonlinear networks, compared with the conventional metrics; (2) based on the proposed indices, we present a unified theoretical framework and a PA algorithm, which can maximize network resilience with minimal costs (i.e., with an optimal (that is, minimal) set of structural edges), in contrast to the heuristic ES17 and EA23 methods.

Methods

Resilience indices

The resilience metrics of networks in this paper were formulated by mapping a complex network onto a physical elastic system and can be commonly used in complex networks. For a physical elastic system, resilience is defined as the capacity of a material to absorb energy during elastic deformation, which can be measured by elastic potential energy (elastic strain energy), i.e.,

$${E}_{p}={\int }_{\sigma =0}^{{\sigma }_{c}}-F{\rm{d}}\sigma $$
(1)

where F is the external force (stress), σ is the elastic deformation and σc is the critical elastic deformation. For a linear physical elastic system, the external force (or elastic deformation) is also a resilience metric, and it is equivalent to the elastic potential energy. The value of Ep in Equation (1) lies in the range [0, ∞).

In analogy with the physical elastic system, the proposed network resilience refers to the network deformation under external force, including initial attacks, disruptions or perturbations. Let the fraction of removed nodes, q, represent the external force, and let the fraction of failed nodes, 1 − G(q), denote the elastic deformation under that force, where G(q) is the fraction of the largest (giant) connected component3,12,29. We further suppose that the size and shape of a complex network can be restored during elastic deformation if the external force is withdrawn. The elastic potential energy of a complex network, Ep, can then be given by (see also Fig. S1a, and the detailed explanation in Supplementary Information Section S1):

$${E}_{p}={\int }_{{\rm{1}}-G(q)=0}^{{\rm{1}}-G(q)={\rm{1}}}-q{\rm{d}}(1-G(q))={\int }_{G(q)=0}^{G(q)={\rm{1}}}q{\rm{d}}G(q)={\int }_{q=0}^{q=1}G(q){\rm{d}}q$$
(2)

where q ϵ [0, 1], G(q) ϵ [0, 1] and 1 − G(q) ϵ [0, 1]. If q = 0 (the network is not attacked), G(q) = 1 and 1 − G(q) = 0; if q > qc (the network breaks down), G(q) = 0 and 1 − G(q) = 1 − G(qc) = 1, where qc is the critical external force (the critical threshold) and G(qc) is the critical giant connected component29. The value of Ep in Equation (2) lies strictly in the range [0, 0.5].
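As a quick check of the upper limit, consider a fully connected graph, for which the giant component shrinks linearly with the fraction of removed nodes, G(q) = 1 − q. Under that assumption, Equation (2) evaluates to

$${E}_{p}={\int }_{q=0}^{q=1}(1-q){\rm{d}}q={\left[q-\frac{{q}^{2}}{2}\right]}_{0}^{1}=\frac{1}{2}$$

which coincides with the upper end of the stated range.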

Considering that the network system is a nonlinear discrete-time system, the integral in Equation (2) can only be evaluated numerically. Here we provide numerical versions of Equation (2) based on the rectangular and trapezoid approximation methods, respectively:

$${E}_{p}=\frac{1}{N}{\sum }_{q=\frac{1}{N}}^{1}G(q)$$
(3)
$${E}_{p}=\frac{1}{N}{\sum }_{{q}_{l}=\frac{1}{N}}^{1}\frac{G({q}_{l})+G({q}_{l-1})}{2}$$
(4)

where N is the total number of nodes in the network, 1/N is the normalized minimum integration step, which corresponds to dq in Equation (2), and q, ql and ql−1 are fractions of removed nodes with ql − ql−1 = 1/N. The value of Ep in Equations (3) and (4) lies strictly in the range [1/N, 0.5], where the two limits correspond to a star network and a fully connected graph, respectively. This is because (1) a star network breaks down if its vital node is removed, and (2) if a fully connected graph is attacked maliciously (or randomly), the fraction of its largest (giant) connected component equals 1 minus the fraction of attacked nodes, i.e., G(q) = 1 − q. Although the error of numerical integration in Equation (4) is smaller than that in Equation (3) (see the detailed explanation in Supplementary Information Section S1), we select Equation (3) as the numerical version of Equation (2) in the simulations in the Results section, to allow comparison with the method in ref.17. Note that in ref.17, the right side of Equation (3) is defined only as a robustness measure R, without mathematical derivation or physical interpretation.
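The two numerical versions can be illustrated with a minimal Python sketch. It assumes an undirected graph on nodes 0..N−1 given as an edge list, and a high-degree adaptive attack in which ties are broken by the smallest node label; the function names (`attack_curve`, `ep_rectangular`, `ep_trapezoid`) and the tie-breaking rule are illustrative choices, not taken from the paper.

```python
def build_adjacency(n, edges):
    """Adjacency sets for an undirected graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def giant_fraction(adj, alive):
    """Size of the largest connected component among `alive`, divided by N."""
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / len(adj)


def attack_curve(n, edges):
    """Return [G(0), G(1/N), ..., G(1)] under a high-degree adaptive attack."""
    adj = build_adjacency(n, edges)
    alive = set(range(n))
    curve = [giant_fraction(adj, alive)]
    for _ in range(n):
        # Recompute degrees among surviving nodes (assumed tie-break: smallest label).
        target = max(alive, key=lambda u: (len(adj[u] & alive), -u))
        alive.remove(target)
        curve.append(giant_fraction(adj, alive))
    return curve


def ep_rectangular(curve):
    """Equation (3): (1/N) * sum of G(q) for q = 1/N, ..., 1."""
    n = len(curve) - 1
    return sum(curve[1:]) / n


def ep_trapezoid(curve):
    """Equation (4): (1/N) * sum of (G(q_l) + G(q_{l-1})) / 2."""
    n = len(curve) - 1
    return sum((curve[l] + curve[l - 1]) / 2 for l in range(1, n + 1)) / n
```

For a complete graph on 10 nodes the curve is G(q) = 1 − q, so the trapezoid sum gives 0.5 while the rectangular sum gives 0.45; the gap is the discretization error of Equation (3), which shrinks as N grows.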

Beyond that, complex networks have other resilience indices, such as an elastic coefficient (also called the modulus of elasticity), the critical external force (critical threshold, qc) and the elastic complementary energy (all of which are defined in Supplementary Information Section S1), just as physical elastic systems do. The traditional measure of network resilience, the critical threshold (qc), reflects only the critical external force, which makes it unsuitable for nonlinear systems. For a nonlinear network, the elastic potential (or complementary) energy can better characterize its elastic properties because it covers both the elastic coefficient and the critical threshold (see Fig. S1b, and the detailed explanation in Supplementary Information Section S1).

Theoretical framework

If a certain fraction (q) of vital nodes is intentionally removed from a network and the network breaks down into many finite (disconnected) components, i.e., q = qc, the network will undergo a structural collapse and no giant connected component will exist, i.e., G(qc) = 0. Let the vector C = (C1, …, Ck, …, CK) represent the finite components, whose normalized sizes are s1, …, sk, …, sK (s1 > … > sk > … > sK), where k is the serial number of a finite component ordered by size, and K is the number of finite components in the collapsed network. Similar to the definition of the critical giant components, we define the “weak cores” (e.g., C1,c, C2,c in Fig. 1) as the critical finite components. A critical finite component is a special critical giant component caused by an attack, as a finite component is regarded as a subnet. If an edge between the “weak cores” and the critical giant component (Cc,c in Fig. 1) is added, the failure of the finite component can be avoided unless the critical giant component G(qc) fails. Therefore, the weak cores can be used for maximizing network resilience.

Figure 1

Optimal edges and weak cores: (a) an original network (Zachary network26). (b) The collapsed network including the isolated nodes, the finite components Ci and the critical giant component Cc,c, by sequentially removing top vital nodes (34, 1, 3, 33 and 2). The size and sequence of finite components (si, qi) and the critical threshold (qc) have been conserved in the process of malicious attacks. (c) The search of critical finite components (Ci,c, “weak cores”) and optimal edges. The optimal set of edges including e5,14, e8,28 and e13,27 is identified, where the optimal edges are adaptively connected between the least influencer in the critical finite component with the top latent resilience and the critical giant component. (d–f) The ratios of increments of resilience and critical threshold by adding one, two and three of the optimal edges e5,14, e8,28, e13,27, respectively, where Δqc = qcI − qc.

For simplicity, we investigated the case of adding only one optimal edge, eij, to maximize the network resilience (elastic potential energy); an optimal set of edges is provided in the follow-up section. There are only 4 ways to add this edge eij: (i) in the same finite component Ck (i, j ϵ Ck, sk > 2/N), (ii) in the same critical giant component Cc,c (i, j ϵ Cc,c), (iii) between two different finite components Ca, Cb (i ϵ Ca, j ϵ Cb, sa > sb), where sa and sb are the sizes of Ca and Cb, respectively, and (iv) between a finite component Ca (i ϵ Ca) and the critical giant component Cc,c (j ϵ Cc,c). After adding an edge in any of the above 4 ways, from Equation (2), the increment of elastic potential energy of network can be given by

$$\Delta {E}_{p}={\int }_{q=0}^{q=1}[{G}_{I}(q)-{G}_{O}(q)]{\rm{d}}q$$
(5)

where GI(q) and GO(q) are the fractions of the giant connected component of the modified network (after adding the edge eij) and of the original network, respectively, and ΔEp ϵ [0, 0.5). Note that the least important nodes in the “weak cores” (Ci,c) and the critical giant components should be selected as the terminal nodes of edge eij in cases (iii) and (iv), so that these nodes are unlikely to be targeted by a malicious attack.

For cases (i) and (ii), because \({G}_{I}(q)\approx {G}_{O}(q)\) and \({q}_{a}-{q}_{1}=1/N\) (where qa and q1 are fractions of removed nodes) (see Fig. 2a), the increment of elastic potential energy can be obtained from Equation (5) as

$$\Delta {E}_{p}^{{\rm{i}},{\rm{ii}}}={\int }_{q=0}^{q=1}[{G}_{I}(q)-{G}_{O}(q)]{\rm{d}}q={\int }_{{q}_{1}}^{{q}_{a}}[{G}_{I}(q)-{G}_{O}(q)]{\rm{d}}q\approx 0$$
(6)
Figure 2

Comparisons of increments of elastic potential energy by adding edges in 4 possible ways. Here, q1 = qa − 1/N, q3 = qb − 1/N. (a) The increment of elastic potential energy by adding an edge between the two different nodes in the same finite component Ca (case (i)) (or in the critical giant component Cc,c (case (ii))). (b) The increment of elastic potential energy by adding an edge between the two nodes from two different finite components (case (iii)). (c) The increment of elastic potential energy by adding an edge between the “weak core” Ca,c (or Cb,c in d) and the critical giant component Cc,c (case (iv)).

Suppose that the two finite components Ca and Cb fail at qa and qb (qa < qb), respectively, where qa and qb are fractions of removed nodes, and Ca,c and Cb,c are the corresponding “weak cores” of Ca and Cb, respectively. Accordingly, the increments of elastic potential energy in cases (iii) and (iv) are respectively given by (see Fig. 2b,c)

$$\Delta {E}_{p}^{{\rm{iii}}}={\int }_{q=0}^{q=1}[{G}_{I}(q)-{G}_{O}(q)]{\rm{d}}q={\int }_{{q}_{a}}^{{q}_{b}}[{G}_{I}(q)-{G}_{O}(q)]{\rm{d}}q$$
(7)
$$\Delta {E}_{p}^{{\rm{iv}}}={\int }_{q=0}^{q=1}[{G}_{I}(q)-{G}_{O}(q)]{\rm{d}}q={\int }_{{q}_{a}}^{{q}_{c}}[{G}_{I}(q)-{G}_{O}(q)]{\rm{d}}q+{\int }_{{q}_{c}}^{{q}_{cI}}{G}_{I}(q){\rm{d}}q$$
(8)

where qcI is the critical threshold of the modified network, \({\rm{\Delta }}{E}_{p}^{{\rm{iii}}}\in [0,0.5)\) and \({\rm{\Delta }}{E}_{p}^{{\rm{iv}}}\in [0,0.5)\). Comparing Equation (8) (case (iv), Fig. 2c) with Equation (7) (case (iii), Fig. 2b), one can see that the increment of elastic potential energy in case (iv) is greater than that in case (iii), due to qc > qb.

Algorithm

The above analysis shows that the optimal edge, eij, must be located between a “weak core” and the critical giant component (i.e., case (iv)) (strict theoretical proof in Supplementary Information Section S4). Moreover, from Equation (8), it can be observed that increment of the elastic potential energy in case (iv) depends on two key factors: the size and the failed sequence (such as qa in Fig. 2c and qb in Fig. 2d) of the finite component. Greater finite component size and smaller failed sequence result in greater increment of elastic potential energy, as shown in Fig. 2c,d.

By comparing the increment of the elastic potential energy from Equation (8) for every finite component, the sequence of the set of increments of the elastic potential energy \({\rm{\Delta }}E=\{{\rm{\Delta }}{E}_{p}^{1,c},\ldots ,{\rm{\Delta }}{E}_{p}^{k,c},\ldots ,{\rm{\Delta }}{E}_{p}^{K,c}\}\) can be given, where \({\rm{\Delta }}{E}_{p}^{1,c} > \ldots > {\rm{\Delta }}{E}_{p}^{k,c} > \ldots > {\rm{\Delta }}{E}_{p}^{K,c}\). Accordingly, the sequential set of edges, \(e=\{{e}_{i,j}^{1},\ldots ,{e}_{i,j}^{k},\ldots ,{e}_{i,j}^{K}\}\), can be obtained, where i ϵ Ck and j ϵ Cc,c (or Cc,c, the critical giant component of the modified network). The first element in the set of edges, \({e}_{i,j}^{1}\), is an optimal edge which, if added to the network, would improve the resilience of the network maximally. The sequential set of optimal edges can be obtained by repeating the above procedure. On this basis, a highly scalable algorithm, PA, is proposed for maximizing resilience. The algorithm terminates when the number of added edges reaches a predefined limit. Fig. 3 shows the overall flowchart of the algorithm (a more detailed description of the PA algorithm is given in Supplementary Information Section S5). By adding the edges from the optimal set sequentially, the resilience of the network can be enhanced maximally.
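The selection of a single case (iv) edge can be sketched in Python as follows. This is a simplified brute-force illustration, not the PA algorithm of Supplementary Information Section S5: it assumes an HDA attack, treats the network as collapsed once the giant fraction drops to a hypothetical threshold `collapse_frac`, uses the lowest original degree as a stand-in for the "least important" node, and re-runs the attack for each candidate edge to measure the gain in Ep from Equation (3). All function and parameter names are assumptions made for the example.

```python
def build_adjacency(n, edges):
    """Adjacency sets for an undirected graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def components(adj, alive):
    """Connected components of the subgraph induced by `alive`, largest first."""
    seen, comps = set(), []
    for start in sorted(alive):
        if start in seen:
            continue
        seen.add(start)
        comp, stack = [], [start]
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)


def hda_attack(n, edges, collapse_frac=None):
    """Run an HDA attack; return (E_p from Equation (3), components at collapse)."""
    adj = build_adjacency(n, edges)
    alive = set(range(n))
    total, collapsed = 0.0, None
    for _ in range(n):
        hub = max(alive, key=lambda u: (len(adj[u] & alive), -u))
        alive.remove(hub)
        comps = components(adj, alive)
        g = len(comps[0]) / n if comps else 0.0
        total += g
        if collapsed is None and collapse_frac is not None and g <= collapse_frac:
            collapsed = comps  # snapshot of the collapsed network
    return total / n, collapsed


def best_case_iv_edge(n, edges, collapse_frac=0.4):
    """Return (gain in E_p, edge) for the best single case (iv) edge found."""
    base, comps = hda_attack(n, edges, collapse_frac)
    if not comps:
        return 0.0, None
    adj = build_adjacency(n, edges)

    def low_degree(u):
        # Assumed proxy for "least important": lowest degree in the original graph.
        return (len(adj[u]), u)

    giant = comps[0]                                  # critical giant component
    finite = [c for c in comps[1:] if len(c) >= 2]    # candidate finite components
    j = min(giant, key=low_degree)
    best_gain, best_edge = 0.0, None
    for comp in finite:
        i = min(comp, key=low_degree)
        gain = hda_attack(n, list(edges) + [(i, j)])[0] - base
        if gain > best_gain:
            best_gain, best_edge = gain, (i, j)
    return best_gain, best_edge
```

Calling `best_case_iv_edge` repeatedly, each time appending the returned edge to the edge list, would give a sequential set of edges up to a chosen limit. Re-running the full attack per candidate is far more expensive than the scaling reported for PA, so this sketch is only suitable for small graphs.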

Figure 3

The overall flowchart of the proposed PA algorithm.

The resilience-improvement algorithm scales as \(O(2\alpha K(M+N)\log (M+N))\), where M is the number of edges of the network, α (α ≪ N) is the number of pre-set optimal edges and K (K ≪ M) is the number of large finite components (a more detailed explanation is given in Supplementary Information Section S5). Generally, the number of large finite components K in a collapsed network is small, because the size distribution of the finite components follows a power law at the tail29. This high scalability allows us to find the edges that optimally enhance network resilience in large-scale networks.

Results

Effectiveness

We demonstrate the effectiveness of our approach on the Zachary (Karate club) network26, the Gansu (GS)27 and Henan (HN) power grids28 as well as artificial random networks, i.e., scale-free (SF) networks and Erdős-Rényi (ER) networks. Figure 1d–f demonstrates the effectiveness of the proposed algorithm in maximizing the resilience of a simple network (Zachary network26) against malicious attack (high degree adaptive, HDA). The network resilience is increased by 30%, 63% and 72% by adding one, two and three edges, respectively. Figure 4a–c shows the structures of the SF, GS and HN networks optimized by the proposed method (the structure of the optimized ER network is shown in Fig. S2). For example, in Fig. 4c, before optimization, the finite components (green) C1, C2 will emerge if the vital nodes (such as high degree nodes (purple)) v1, v2 are maliciously removed from the original network; after optimization by adding optimal edges (red), the emergence of C1, C2 is avoided under the same attacks. This case explains why the proposed method can substantially improve network resilience. As a practical example, networked micro grids can enhance power system resilience5.

Figure 4

The optimized network structures. (a) The random SF network with N = 2000 nodes, M = 4000 edges, and power-law index γ = 3. (b) The GS power grid with N = 1569 nodes and M = 2163 edges. (c) The HN power grid with N = 310 nodes and M = 466 edges. In all cases, the test networks are modified by adding optimal edges (red), and the proportion of added edges to all edges of the original networks is 3.5%.

In Fig. 5a–c, we show the mitigation of malicious attacks for the SF network, GS and HN power grids (ER network in Fig. S3), respectively. The dashed lines correspond to the sizes of the giant component G(q) in each original network, and the coloured solid lines correspond to the typical modified networks under the different numbers of added edges (from 20 to 180, 2 to 32 and 1 to 16 for SF, GS and HN, respectively). The coloured areas give increments of the resilience (elastic potential energy) under malicious attacks. By adding only 4.5% of edges to the SF network, GS and HN power grids under HDA attacks (Fig. 5d–f), the resilience of the three networks was increased by 44%, 187% and 740%, respectively.

Figure 5

Mitigation against malicious attacks, improved resilience and critical threshold. (a–c) Mitigation against malicious attacks. The dashed lines correspond to the sizes of the giant components in each original network, the coloured solid lines to optimal modified networks under the different numbers of added edges and the coloured areas give the mitigation against malicious attacks (resilient increment). We compare the ratios (r = ΔEp/Ep) of increased resilience of our algorithm (PA) with other methods (ES, LD) under two modes of malicious attacks (HDA and CI) for each network in (d–f). The abscissa, w, indicates the proportion of added (or swapped in the ES method) edges to all edges of the original networks. Here CI represents CI2 (other ratios of increased resilience and critical thresholds by CI1, CI2, CI3 and CI4 attacks are shown in Figs S4 and S5). The related comparisons of critical threshold increases for each network are shown in (g–i).

We compare the proposed algorithm with the heuristic strategies, i.e., ES17 and EA23 in Fig. 5d–f. Remarkably, the heuristic strategies (ES and EA) improve the network resilience greatly. Furthermore, the improvement ratios of the network resilience by our algorithm are the optimal ratios and are greater than those of the heuristic strategies17,23 under the same proportion of added (or swapped) edges. In the same three figures, we investigate the effect of the resilience improvement of our algorithm on two different malicious attacks, i.e., the widely used HDA3 and the optimal collective influence (CI)12 (see also Figs S4 and S5). Our algorithm performs very well under both attacks. The network resilience is improved by 36%, 223% and 762% (by adding or swapping 4.5% edges) in the SF network, GS and HN power grids, respectively, under the CI attack. Furthermore, if the critical threshold is used as the resilience measure, our algorithm also outperforms the other strategies17,23 (Fig. 5g–i).

For networks with a community structure29 (such as the Zachary network, the GS and HN power grids), our algorithm produces a better network resilience and greater critical threshold than those complex networks with no community structure (such as SF and ER networks), as shown in Figs 5 and S3, because the networks lead to a few large finite components when they are attacked maliciously. In addition, better improvements of network resilience and critical percolation threshold can be obtained in the SF network (Fig. 5) than in the ER network (Fig. S3). As the top vital (hub) nodes of the SF network are removed sequentially, its serious heterogeneity will generate a few large finite components, which contributes to the consequences. Figure S6 shows that the network resilience and the critical thresholds of the original and the improved ER networks are increased, which indicates that they follow nearly the same rising trend in the original and the improved networks as the average degree. From Fig. S7, one can observe that the improvements in the network resilience and the critical threshold remain nearly unchanged regardless of the network size.

Unchanged network functionality

The functionality of a network is commonly related to its topological features17,29. It is fundamental and necessary to keep a network’s functionality unchanged when optimizing its resilience. We tested the effects of the topological structural changes on the functionalities of the optimized networks, i.e., the SF network, and the GS and HN power grids. The cumulative distributions of degree, shortest path distance and betweenness were used for measuring the functionality. As shown in Fig. 6, those functionality measures hardly changed. Other topological characteristics, including the clustering coefficient and the network diameter, also remain nearly unchanged (Table S2). Therefore, the networks optimized by our algorithm are not only more resilient against malicious attacks but also exhibit little change to their functionalities compared with the original networks.
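One of these checks, the cumulative degree distribution, can be sketched in Python as below. The sketch assumes that the cumulative distribution means the fraction of nodes with degree at least k, and it summarizes the change by the largest absolute gap between the two curves; both the convention and the helper names (`cumulative_degree_distribution`, `max_abs_gap`) are assumptions made for illustration rather than the procedure used for Fig. 6.

```python
def cumulative_degree_distribution(n, edges):
    """Return a list p where p[k] is the fraction of nodes with degree >= k."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    k_max = max(degree)
    return [sum(1 for d in degree if d >= k) / n for k in range(k_max + 1)]


def max_abs_gap(p, q):
    """Largest absolute difference between two cumulative distributions."""
    length = max(len(p), len(q))
    # Beyond the maximum degree the cumulative fraction is zero.
    p = p + [0.0] * (length - len(p))
    q = q + [0.0] * (length - len(q))
    return max(abs(a - b) for a, b in zip(p, q))
```

A small gap between the distributions of the original and the modified edge lists would indicate that the added edges left the degree structure largely intact; the same comparison pattern applies to distance and betweenness distributions once those are computed.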

Figure 6

Unchanged network functionality. The network functionality is characterized by the network topological structure. The test networks with w = 0.25 and w = 0.4 were modified by our algorithm based on HDA attacks; the networks with w = 0 denote the original networks. (a–c) The cumulative degree distribution p(k). (d–f) The cumulative shortest path distance distribution p(d), where d is the shortest path distance between nodes. (g–i) The cumulative betweenness distribution p(b), where b represents the betweenness of a node or edge.

Discussion and Conclusion

Intentional attacks and the corresponding defences are always the two opposite sides of network security. To enhance network resilience against malicious attacks, we introduce network resilience indices by mapping a complex network onto a physical elastic system; we then propose a unified theoretical framework and a general approach (the PA algorithm) to solve the problem of resilience optimization. As mentioned before, neither the ES methods nor the EA methods can maintain the topological functionality of a network well, and their performance on resilience improvement cannot be guaranteed, since they are unable to optimize network resilience globally under a theoretical framework. In contrast, our algorithm can maximize network resilience by adding optimal edges between the “weak cores” and the critical giant component (Fig. 1), with minimal costs. This is because, after optimization by our method, the emergence of large finite components can effectively be avoided under the same attacks (Figs 1 and 4). Moreover, the proposed indices of network resilience characterize the elastic properties of nonlinear networks better than conventional metrics such as the critical threshold. Case studies show that our algorithm achieves better performance on resilience improvement of networks, compared with competing approaches17,23.

As edges are added to reach a certain proportion, the growth of network resilience slows down, especially for realistic networks, because the number of large-scale finite components generated by malicious attacks becomes increasingly smaller. Thus, it is necessary to balance the maximum resilience improvements with the costs of modifying a network to find an optimal compromise for the application of our method.

The proposed theory is strictly valid, and can be applied to any real network. Our solution to the optimal resilience problem demonstrates its importance because it can be used to enhance network resilience, guide the design of technological resilient systems, and offer fast and effective ways to mitigate the collapse of networks against malicious attacks, or furnish a self-healing solution to reconstruct existing failed infrastructure systems.