Localized recovery of complex networks against failure

Resilience of complex networks to failure has been an important issue in network research for decades, and recent studies have begun to focus on the inverse problem of recovering network functionality by strategically healing missing nodes or edges. However, the effect of network recovery is far from fully understood, and a general theory is still missing. Here we propose and study a general model of localized recovery, in which a group of neighboring nodes is restored in an invasive way, spreading from a seed node. We develop a theoretical framework to compare the effects of random recovery (RR) and localized recovery (LR) in complex networks, including Erdős-Rényi networks, random regular networks, and scale-free networks. We find detailed phase diagrams for the subnetwork of occupied nodes and the “complement network” of failed nodes under RR and LR. By identifying the two competing forces behind LR, we present an analytical and numerical approach that guides the choice of an appropriate recovery strategy and estimates its effect, using the degree distribution of the original network as the only input. Our work therefore provides insight for quantitatively understanding the recovery process and its implications for infrastructure protection in various complex systems.

Note B: Derivation of the critical recovery probability $r_c(\mathrm{LR})$ and the giant-component fraction $P_\infty(\mathrm{LR})$
Recall that the original random network contains $N$ nodes whose degrees are generated by $G_0(x) = \sum_{k=0}^{\infty} P(k)\,x^k$ in the limit $N \to \infty$, and that $q$ is the fraction of functional nodes after the initial random failure. We divide the localized recovery process into two regimes. (i) We first recover a fraction $r$ of failed nodes according to the LR strategy, and assume that every failed node outside the recovery area is still active (i.e., present). Hence, after the localized recovery process, all nodes and edges of the original network are present. (ii) We then remove the failed nodes outside the recovery area. By doing so, we obtain the network of occupied (i.e., functional and recovered) nodes.
Since the initial failure is random, after the LR process there are, on average, $(1-q)(1-r)N$ failed nodes and $q(1-r)N$ functional nodes outside the recovery area. This observation indicates that a fraction $1-r$ of nodes lies outside the recovery area. Set $s = 1 - r$.
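The two-stage construction above can be explored numerically. The following is a minimal sketch, not the authors' simulation code: it assumes an Erdős-Rényi original network, reads the recovery area as the first $rN$ nodes reached shell by shell (breadth-first) from a random seed, and stops early if the seed's component is exhausted. All function names are hypothetical.

```python
import random
from collections import deque


def er_graph(n, mean_degree, rng):
    """Adjacency lists of an Erdos-Renyi graph G(n, p) with p = mean_degree / (n - 1)."""
    p = mean_degree / (n - 1)
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def recovery_area(adj, seed, size):
    """First `size` nodes reached by breadth-first search (shell by shell) from `seed`."""
    if size <= 0:
        return set()
    area = {seed}
    queue = deque([seed])
    while queue and len(area) < size:
        u = queue.popleft()
        for v in adj[u]:
            if v not in area:
                area.add(v)
                queue.append(v)
                if len(area) == size:
                    break
    return area


def largest_component_fraction(adj, keep):
    """Size of the largest connected cluster induced on the set `keep`, divided by N."""
    seen, best = set(), 0
    for start in keep:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in keep and v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / len(adj)


def localized_recovery(adj, q, r, rng):
    """Return (giant fraction of occupied nodes, giant fraction of the complement network)."""
    n = len(adj)
    # Assumption: each node independently survives the initial failure with probability q.
    functional = {u for u in range(n) if rng.random() < q}
    # Assumption: the recovery area holds about r*N nodes around a uniformly random seed.
    area = recovery_area(adj, rng.randrange(n), round(r * n))
    occupied = functional | area            # functional or recovered nodes
    complement = set(range(n)) - occupied   # failed nodes outside the recovery area
    return (largest_component_fraction(adj, occupied),
            largest_component_fraction(adj, complement))


if __name__ == "__main__":
    rng = random.Random(1)
    adj = er_graph(1000, 5.0, rng)
    print(localized_recovery(adj, q=0.3, r=0.4, rng=rng))
```

Averaging the two returned fractions over many seeds and network realizations, and sweeping $r$ at fixed $q$, gives a rough numerical estimate of where the giant component of the complement network disappears.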
We first consider regime (i). Let $A_s(k)$ be the number of nodes with degree $k$ outside the recovery area. The probability that a node outside the recovery area has degree $k$ is
$$P_s(k) = \frac{A_s(k)}{sN}. \tag{S4}$$
With one more node being checked, $A_s(k)$ changes as
$$A_{s-1/N}(k) = A_s(k) - \frac{P_s(k)\,k}{\langle k(s)\rangle}, \tag{S5}$$
where $\langle k(s)\rangle = \sum_k P_s(k)\,k$. Following [2], in the limit $N \to \infty$, (S4) and (S5) yield the differential equation
$$-s\,\frac{dP_s(k)}{ds} = P_s(k) - \frac{k\,P_s(k)}{\langle k(s)\rangle}.$$
By direct differentiation, the solution can be expressed as
$$P_s(k) = \frac{P(k)\,f^k}{G_0(f)}$$
and
$$\langle k(s)\rangle = \frac{f\,G_0'(f)}{G_0(f)},$$
where $f \equiv G_0^{-1}(s)$.
Next, instead of treating regime (ii) by removing the failed nodes outside the recovery area, we consider the opposite operation: we first (iia) remove the edges connecting the recovery area to the outside and then (iib) remove the functional nodes outside the recovery area. We thus obtain the "complement network" composed of failed nodes outside the recovery area. $r_c(\mathrm{LR})$ is the critical threshold at which a giant component in the complement network first forms.
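As a quick check of regime (i), consider an Erdős-Rényi network with mean degree $\lambda$. The worked steps below assume that the solution of the differential equation has the form reported in [2], $P_s(k) = P(k) f^k / G_0(f)$ with $G_0(f) = s$; under that assumption the degree distribution outside the recovery area stays Poissonian with a reduced mean.

```latex
% Assumption: P_s(k) = P(k) f^k / G_0(f) with G_0(f) = s, as in [2].
\begin{align*}
G_0(x) &= e^{\lambda (x-1)}, \qquad P(k) = e^{-\lambda}\,\frac{\lambda^k}{k!},\\
G_0(f) = s \;&\Longrightarrow\; f = 1 + \frac{\ln s}{\lambda},\\
P_s(k) &= \frac{e^{-\lambda}\,\lambda^k f^k}{k!\; e^{\lambda (f-1)}}
        = e^{-\lambda f}\,\frac{(\lambda f)^k}{k!},\\
\langle k(s) \rangle &= \sum_k k\,P_s(k) = \lambda f = \lambda + \ln s .
\end{align*}
```

For instance, with $\lambda = 5$ and $s = 0.5$ the mean degree outside the recovery area is $5 + \ln 0.5 \approx 4.31$. It drops below $\lambda$ because high-degree nodes are more likely to be reached by the growing recovery area. The expression is meaningful only for $s > e^{-\lambda}$, where $f > 0$.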
In regime (iia), note that the number of edges belonging to the nodes on the outer shell of the recovery area, say (part of) shell $l$, minus those connecting inward to shell $l-1$, […]. Since loops may exist, the number of edges connecting the recovery area to the outside is […]. This step can be viewed as the outcome of a bond percolation with occupation probability $\tilde{p}$ given by […]. Hence, the probability generating function of the nodes' degree distribution, $\tilde{G}_0(x)$, becomes [3,4] […] (S10).
Finally, in regime (iib), another node percolation is applied with "occupation" probability $1-q$, since the initial failure is random. (Recall that a node is functional with probability $q$ after the initial random failure. Hence, it is removed in regime (iib) with probability $q$.) Let […] be the generating function of the underlying branching process. The size distributions of the clusters that can be reached from a randomly chosen edge, and of the clusters that can be traversed by starting from a randomly chosen node, are generated, respectively, by [1] […]. The mean size of small clusters is […] (S13). The diverging point of (S13) marks the critical recovery probability $r_c(\mathrm{LR})$, at which a giant component of the complement network emerges. Hence, $r_c = 1 - s_c$ is determined by […], where $f \equiv G_0^{-1}(s)$. The fraction $S$ of the giant component in the complement network of failed nodes outside the recovery area satisfies […].
The point $G_1'(1) = 2$ marks the watershed of the evolution of phase diagrams under RR.
Here, we take ER networks as an example, where $G_1'(1) = \lambda$ is just the average degree of the original network. In Fig. 2(a) we plot the phase diagram for $\lambda = 5$. Fig. S1(a) and Fig. S1(b) below show the phase diagrams for $\lambda = 2$ and $\lambda = 1.6$, respectively. Clearly, when $\lambda$ decreases from 5 to 2, the three combined phases IB, IC, and IIC contract and vanish simultaneously at the critical point $\lambda = 2$. When $\lambda$ keeps decreasing, three new combined phases IIA, IIIA, and IIIB emerge simultaneously.
[…] (7) and (13) can be recast as […]. It is straightforward to check that (S16) is equivalent to (S17) by relating $w$ to $u$ through $w = (1-r)(u-1) + 1$. The (combined) phase diagrams in Fig. S2 are quantitatively similar to those in Fig. 4, in that $r_c(\mathrm{LR}) > r_c(\mathrm{RR})$ for any $q < q_c$, a signature of homogeneous networks.
$r$ is set to 0.1 for all three networks when calculating $\gamma_\infty$, $\tilde{P}_\infty$, $P_\infty(\mathrm{RR})$ and $P_\infty(\mathrm{LR})$.
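The RR watershed at mean branching factor 2 mentioned in the ER example above can be illustrated with a short calculation. This sketch assumes the standard site-percolation criterion for uncorrelated random networks, which is not spelled out in the text: a uniformly random fraction $p$ of nodes has a giant component when $p\,\kappa > 1$, with $\kappa = G_0''(1)/G_0'(1)$ (equal to $\lambda$ for ER networks). Under RR the occupied nodes form a random fraction $q + (1-q)r$ and the failed nodes a random fraction $(1-q)(1-r)$. The function name is hypothetical.

```python
def rr_thresholds(q, kappa):
    """Critical recovery fractions under random recovery (RR).

    Assumes the site-percolation criterion p * kappa > 1 for a random node
    fraction p, where kappa is the mean branching factor of the original network.

    Returns (r_occ, r_comp), both clipped to [0, 1]:
      r_occ  : the occupied network has a giant component for r > r_occ
      r_comp : the complement network has a giant component for r < r_comp
    """
    if not 0.0 <= q < 1.0:
        raise ValueError("q must lie in [0, 1)")
    if kappa <= 0.0:
        raise ValueError("kappa must be positive")
    # Occupied fraction q + (1 - q) * r equals 1 / kappa at the threshold.
    r_occ = (1.0 / kappa - q) / (1.0 - q)
    # Failed fraction (1 - q) * (1 - r) equals 1 / kappa at the threshold.
    r_comp = 1.0 - 1.0 / (kappa * (1.0 - q))

    def clip(x):
        return min(1.0, max(0.0, x))

    return clip(r_occ), clip(r_comp)


if __name__ == "__main__":
    for lam in (5.0, 2.0, 1.6):
        print(lam, rr_thresholds(0.1, lam))
```

For $\kappa > 2$ the two giant components coexist over a window $r_{\mathrm{occ}} < r < r_{\mathrm{comp}}$. At $\kappa = 2$ the two thresholds coincide, and for $\kappa < 2$ there is a window in which neither network has a giant component, consistent with the change of phase structure described for $\lambda = 5$, $2$ and $1.6$.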

Note F: Predicting robustness of real networks under RR and LR
Three real-world scale-free networks are studied here. The first is Internet […] Table S1.
For Internet, we find that $\gamma_c = 2.9 > \gamma = 2.1$ in Table S1. This implies that the "recovering impetus" is dominant when $q = 0.3$, and LR is the better strategy. This conclusion is supported by the numerical result $r_c(\mathrm{RR}) = 0.69 > r_c(\mathrm{LR}) = 0.43$. The theoretical critical recovery fraction $\tilde{r}_c = 0.48$ is a good approximation of (in fact, an upper bound for) $r_c(\mathrm{LR})$. A similar analysis can be applied to $\gamma_\infty$, $\tilde{P}_\infty$, and the networks Circuit and Protein. In all the situations considered, our theoretical predictions are well supported by the numerical simulations. They can thus guide the choice of an appropriate recovery strategy and estimate the desired recovery fraction, using the degree distribution of the original network as the only input.

Note G: RR and LR on correlated networks
We compare RR and LR on two real-life correlated networks. The first is the metabolic network Reactome, which has $N = 5973$ nodes [8]. Its Pearson correlation coefficient is $\rho = 0.24$, so it is an assortative network. The second, called Facebook, is a social network describing Facebook user-user relationships [9]. It has $N = 2888$ nodes and Pearson correlation coefficient $\rho = -0.67$. Therefore, Facebook is a disassortative network. […] and $P_\infty(\mathrm{LR})$ is prominent. For example, under a relatively severe random error with $q = 0.2$, the giant cluster in the network of failed nodes in Reactome contains more than 30% of the nodes when half of the failed nodes are recovered by RR, but only about 2% of the nodes when the LR strategy is used. Evidently, LR is much more powerful than RR for healing Reactome.
In contrast, we observe that $P_\infty(\mathrm{LR}) > P_\infty(\mathrm{RR})$ for Facebook, which suggests that disassortative mixing may hinder the LR process. This agrees with the intuition that low-degree nodes linked to a recovered hub slow the recovery progression under LR, since low-degree nodes contribute little to the giant component of occupied nodes.
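The Pearson correlation coefficient $\rho$ quoted above for Reactome and Facebook measures degree-degree correlation across edges. The following minimal sketch uses one common definition (Newman's degree assortativity for an undirected simple graph given as an edge list); whether the cited data sets were processed in exactly this way is an assumption, and the function name is hypothetical.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each undirected edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every edge in both directions so the measure is symmetric in its ends.
    xs, ys = [], []
    for u, v in edges:
        xs.extend((degree[u], degree[v]))
        ys.extend((degree[v], degree[u]))
    m = len(xs)
    if m == 0:
        raise ValueError("edge list is empty")
    mean = sum(xs) / m  # xs and ys hold the same multiset, so they share one mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / m
    var = sum((x - mean) ** 2 for x in xs) / m
    if var == 0:
        raise ValueError("undefined for graphs in which every edge end has the same degree")
    return cov / var


if __name__ == "__main__":
    star = [(0, 1), (0, 2), (0, 3)]   # a hub linked only to leaves
    print(degree_assortativity(star))
```

A star graph, in which a hub connects only to degree-one nodes, is the extreme disassortative case and gives $\rho = -1$. Positive values, as reported for Reactome, indicate that high-degree nodes tend to link to other high-degree nodes.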