Non-Markovian recovery makes complex networks more resilient against large-scale failures

Lin, Zhao-Hua; Feng, Mi; Tang, Ming; Liu, Zonghua; Xu, Chen; Hui, Pak Ming; Lai, Ying-Cheng

doi:10.1038/s41467-020-15860-2

Article
Open access
Published: 19 May 2020

Nature Communications volume 11, Article number: 2490 (2020)

Abstract

Non-Markovian spontaneous recovery processes with a time delay (memory) are ubiquitous in the real world. How does the non-Markovian characteristic affect failure propagation in complex networks? We consider failures due to internal causes at the nodal level and external failures due to an adverse environment, and develop a pair approximation analysis taking into account the two-node correlation. In general, a high failure stationary state can arise, corresponding to large-scale failures that can significantly compromise the functioning of the network. We uncover a striking phenomenon: memory associated with nodal recovery can counter-intuitively make the network more resilient against large-scale failures. In natural systems, the intrinsic non-Markovian characteristic of nodal recovery may thus be one reason for their resilience. In engineering design, incorporating certain non-Markovian features into the network may be beneficial to equipping it with a strong resilient capability to resist catastrophic failures.

Introduction

The dynamics of failure propagation on complex networks constitute an active area of research in network science and engineering with significant and broad applications. The functioning of a modern society relies on the cooperative working of many networked systems such as electrical power grids, transportation networks, computer and communication networks, and business networks, yet these networks typically possess a complex structure and are vulnerable to failures and intentional attacks. Among the diverse failure scenarios, one of the most severe types is cascading failures¹, where the failure of some nodes causes their neighbors to fail and the process propagates through the network, disabling a large fraction of the nodes and causing the system to malfunction at a large scale^{2,3,4,5,6,7,8,9,10,11,12,13,14,15}. Classic examples of cascading failures include power blackouts (the collapse of power grids)^{5,6}, traffic jams¹⁶, and even economic depression^{14,17}. Previous studies mostly focused on how cascading failures occur, how network structures and failure propagation are related, and on network robustness and vulnerability to failure propagation^{18,19,20,21,22}.

A tacit assumption employed in many previous studies of cascading failures is irreversible failure propagation, where a node, if it has failed, cannot recover and is no longer able to function actively. A failed node is then removed from the network completely, including all the links associated with it. There are real-world situations of networked systems, such as financial and transportation networks, where failed nodes can recover from malfunctioning spontaneously after a collapse^{23,24,25,26,27,28}. In general, there are two types of failure-and-recovery scenarios²⁹: internal and external. In the first type, a node fails because of internal causes (e.g., the occurrence of some abnormal or undesired dynamical behaviors within the node), which is independent of the states of its neighbors. In this case, the node can recover spontaneously after a period of time. An example is the failure of a company characterized by a drop in its market value due to poor management, followed by recovery due to internal restructuring. The second type is external failures, where a node’s failure is externally triggered, e.g., by the failures of its neighboring nodes. After a period of time, as its local “environment” is improved, the node is able to recover spontaneously. The time of recovery depends not only on the specific type of failure-and-recovery mechanism, i.e., whether internal or external, but also on the individual node and its position in the network. For example, for a given node in the network, it may take longer to complete an internal restructuring process to recover from a failure due to an internal than an external cause. Previous computation and mean-field analysis have revealed that cascading dynamics incorporating a failure-and-recovery mechanism can exhibit a rich variety of phenomena such as phase transitions, hysteresis, and phase flipping^{29,30,31,32,33}. 
With respect to the resilience responses of networks, the effects of removing a fraction of nodes and links on network functions were studied^{34,35,36,37}, demonstrating that resilience can be used to characterize the critical functionality of the network with applications in complex infrastructure engineering^{36,37}.

In spite of the variations in the recovery dynamics across networks or even across nodes in the same network, the process can generally be classified into two distinct types: Markovian and non-Markovian. In a Markovian recovery process, an event occurs at a fixed rate and the interevent time follows an exponential distribution^{38,39,40}, rendering the process memoryless. In contrast, a non-Markovian recovery (NMR) process has memory, as the current state of a node depends not only on the most recent state but also on the previous states. In this case, the interevent time distribution is not exponential but typically exhibits a heavy tail. For example, in human activity and interaction dynamics, the occurrences of contacts among the individuals in a social network can be characteristically non-Markovian, for which there is mounting empirical evidence^{41,42,43,44,45,46,47,48}. Non-Markovian recovery processes also occur in biochemical reactions⁴⁹ and in the financial markets^{12,50}. We note that, in the context of spreading dynamics on complex networks, the effects of non-Markovian processes, due to their high relevance to the real world, have attracted growing attention^{51,52,53,54,55,56,57,58}. From the point of view of mathematical analysis, incorporating memory into the dynamical process makes analytic treatment challenging.
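The memoryless property can be made concrete with a small numerical check. The sketch below is illustrative only: the function names are hypothetical, and the non-Markovian case is represented by a fixed recovery delay, which is one simple distribution with memory rather than the heavy-tailed case mentioned above. It compares the probability that a failed node stays failed for a further interval, given how long it has already been failed.

```python
import math

def exp_conditional_survival(elapsed, extra, mu):
    """P(T > elapsed + extra | T > elapsed) for an exponential recovery time with rate mu."""
    return math.exp(-mu * (elapsed + extra)) / math.exp(-mu * elapsed)

def fixed_delay_conditional_survival(elapsed, extra, tau):
    """Same conditional probability when recovery takes exactly tau (needs elapsed < tau)."""
    if elapsed >= tau:
        raise ValueError("the node has already recovered")
    return 1.0 if elapsed + extra < tau else 0.0
```

For the exponential case the result does not depend on `elapsed`, which is the memoryless property. For the fixed delay it switches from 1 to 0 as `elapsed` approaches `tau`, so the time already spent in the failed state matters.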

While the impacts of non-Markovian processes on spreading dynamics have been reasonably well documented^{51,52,53,54,55,56,57,58}, there has been little work so far addressing the influence of non-Markovian recovery processes on failure propagation dynamics. In this paper, we address this issue systematically through a comparison study of two types of dynamical processes: one with Markovian recovery and the other with non-Markovian recovery. In the Markovian recovery (MR) model, failures due to internal and external causes recover at different constant rates. In the NMR model, such a constant rate cannot be defined. We thus resort to the recovery time. In particular, we assume that nodes that failed due to internal and external causes take different times to recover, so a memory effect is naturally built into the model. For each model, we develop a mean-field theory and an analysis based on the pair approximation (PA)^{29,59,60,61,62} that retains the two-node correlation but ignores any correlation of higher orders. Comparison with numerical simulations indicates that both the mean-field theory and the PA analysis capture the key features of the failure propagation dynamics qualitatively, but the PA analysis yields results that are in better quantitative agreement with numerics. The striking and counterintuitive finding is that a non-Markovian character with a memory effect makes the network more resilient against large-scale failures. There are two implications. Firstly, in physical, biological, or other natural networked systems, the intrinsic non-Markovian character of nodal recovery may be one reason for the resilience of these networks and their existence in a harsh environment. Secondly, in engineering and infrastructure design, incorporating certain non-Markovian features into the network may help strengthen its resilience and robustness.

Results

Spontaneous recovery models

For general failure propagation dynamics on a network, a node can be in one of two states: an active (labeled as A-type) state in which the node functions properly and an inactive state (I-type) in which the node has failed. To distinguish the causes for a node to become inactive, we label an inactive node due to internal or external failure as X-type or Y-type, respectively.

In the NMR model, an A-type node may fail spontaneously at the rate β₁ to become an X-type node, or it may fail at the rate β₂ to become a Y-type node when the number of its A-type neighboring nodes is less than or equal to a threshold integer value m that sets the limit on neighboring support for proper functioning of a node. Without loss of generality, we assume that external failures occur more frequently than internal failures: β₁ < β₂. This is often the case as internal failures can be made less probable by building up the capability of the nodes through better equipment and/or management, while external failures are uncontrollable and more difficult to avoid. For example, falling stocks may be the result of unanticipated changes in the market rather than poor management. In a road network, failures are caused more often by congestion than by physical failures. Once a node becomes inactive, it takes time τ₁ to recover from an internal failure (when the node is of the X-type) or time τ₂ to recover from an external failure (when the node is of the Y-type). The non-Markovian characteristic is taken into account through the incorporation of a memory effect into the model. In particular, the nodes that will recover at time t constitute those that were turned into X-type inactive nodes at the time t − τ₁ and those turned into Y-type inactive nodes at the time t − τ₂. Here, we assume τ₁ > τ₂, for the reason that repairing a node or restructuring the management due to the malfunctioning of the node itself would need more time. For example, reorganizing a company or repairing a road often takes more time. The failure processes characterized by the rates β₁ and β₂ as well as the recovery processes as determined by τ₁ and τ₂ are schematically illustrated in Fig. 1.

**Fig. 1:** Schematic illustration of NMR and MR models.

Note that the case of τ₁ < τ₂ may also arise in the real world. For example, for an infrastructure network in civil engineering, when an earthquake strikes and destroys buildings (nodes), the time to rebuild can be longer than that required for recovering from internal failures, e.g., the collapse of a roof due to some material failure. Our computations of this case yield qualitatively similar results to those in the case of τ₁ > τ₂—see Supplementary Note 3 for detail.

The MR and NMR models differ only in the recovery processes. In the MR model, an inactive node of the X-type or the Y-type recovers at a constant rate μ₁ or μ₂, respectively, as illustrated in Fig. 1. Consequently, the number of nodes recovered at time t depends only on the number of inactive nodes of both X-type and Y-type at the previous time step.

To develop theories for failure propagation on networks with MR or NMR processes and to identify the key differences between the two types of dynamics, we focus on random regular networks. In the numerical simulations, we use a relatively large network size N = 3 × 10⁴ with the degree k = 35. In the NMR model, the recovery times are taken to be τ₁ = 100 and τ₂ = 1 for the X-type and Y-type of nodes, respectively. In the MR model, the values of the recovery rates are set to be μ₁ = 1/τ₁ = 0.01 and μ₂ = 1/τ₂ = 1 so that they correspond to the same scales for the recovery times in the NMR model (see Supplementary Note 1 for a more detailed explanation). The threshold values in both models are m = 15. Synchronous updating is invoked in simulations with the time step Δt = 0.01.
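The update rules described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the parameter dictionary `p` are hypothetical, the graph helper builds a k-regular ring lattice and randomizes it with degree-preserving edge swaps (so it needs an even k, unlike k = 35 used in the paper), and checking internal failure before external failure within one time step is an assumed tie-breaking rule.

```python
import random

A, X, Y = 0, 1, 2  # active, internally failed, externally failed

def regular_random_graph(n, k, rng, swaps_per_edge=10):
    """Hypothetical helper: k-regular ring lattice randomized by edge swaps (k even)."""
    assert k % 2 == 0 and k < n
    nbrs = [set() for _ in range(n)]
    for i in range(n):
        for s in range(1, k // 2 + 1):
            nbrs[i].add((i + s) % n)
            nbrs[(i + s) % n].add(i)
    edges = [(i, j) for i in range(n) for j in nbrs[i] if i < j]
    for _ in range(swaps_per_edge * len(edges)):
        e1, e2 = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[e1], edges[e2]
        if len({a, b, c, d}) < 4 or d in nbrs[a] or b in nbrs[c]:
            continue  # the swap would create a self-loop or a duplicate link
        for u, v in ((a, b), (c, d)):  # drop the old links
            nbrs[u].discard(v)
            nbrs[v].discard(u)
        for u, v in ((a, d), (c, b)):  # add the swapped links
            nbrs[u].add(v)
            nbrs[v].add(u)
        edges[e1], edges[e2] = (a, d), (c, b)
    return [sorted(s) for s in nbrs]

def step(state, timer, nbrs, p, rng, memory):
    """One synchronous update; memory=True gives NMR (fixed delays), False gives MR."""
    new_state, new_timer = state[:], timer[:]
    for i, s in enumerate(state):
        if s == A:
            n_active = sum(state[j] == A for j in nbrs[i])
            if rng.random() < p["beta1"] * p["dt"]:
                # internal failure; the timer counts the steps until recovery (NMR only)
                new_state[i], new_timer[i] = X, round(p["tau1"] / p["dt"])
            elif n_active <= p["m"] and rng.random() < p["beta2"] * p["dt"]:
                new_state[i], new_timer[i] = Y, round(p["tau2"] / p["dt"])
        elif memory:
            new_timer[i] = timer[i] - 1  # deterministic recovery after tau1 or tau2
            if new_timer[i] <= 0:
                new_state[i] = A
        else:
            mu = 1.0 / p["tau1"] if s == X else 1.0 / p["tau2"]
            if rng.random() < mu * p["dt"]:  # recovery at a constant rate
                new_state[i] = A
    return new_state, new_timer
```

A run would start from a list of states and matching timers (for example, `round(tau1 / dt)` for every initially X-type node) and call `step` repeatedly, recording the fractions of A, X, and Y nodes after each call.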

Markovian recovery process

Mean-field theory: We start with setting up the dynamical equations for MR dynamics and comparing results with simulations. Based on the mean-field theory in “Mean-field theory for MR dynamics” of “Methods” section, we first examine the behavior of E_t([I]) in Eq. (4). Figure 2a shows the dependence of E_t([I]) on the fraction of failed nodes [I]. It can be seen that E_t exhibits two different types of behaviors over a large part of [I]: E_t ~ 0 for a wide range of small [I] values (low failure) and E_t ~ 1 for a range of large [I] values (high failure). In the low failure state, external failure events rarely occur. In the high failure state, an active node is supported by an insufficient number of active neighbors and external failure events almost always happen. It implies that the stationary state [I] can possess two branches: setting E_t = 0 in Eq. (5) gives [I] = 1 − 1/(β₁/μ₁ + 1) as the low-failure branch, while setting E_t = 1 gives [I] = 1 − 1/(β₂/μ₂ + β₁/μ₁ + 1) as the high-failure branch. The two branches are shown in Fig. 2b (dashed and solid curves) in terms of the dependence of [I] on β₁, for μ₁ = 0.01, β₂ = 2, and μ₂ = 1 as an example. To check which branch the system would take on and whether there are two states for some range of parameters, the simulation results for moving the value of β₁ up (circles) and down (squares) are shown in Fig. 2b for comparison. As the values of β₁ are increased or decreased, the initial state is taken to be the final state corresponding to the previous value of β₁—the adiabatic process. The results indicate that: (i) the values of [I] from simulations follow the two branches given by the mean-field approximation, and (ii) the low-failure (high-failure) branch is followed when moving β₁ up (down) until a particular value at which there is a jump to the high-failure (low-failure) branch—the signature of a hysteresis. 
The results also imply that if one starts from the initial conditions [X]₀ ≠ 0 and [Y]₀ = 0, there exists a critical value of β_c ≈ 0.007 for a sudden increase in the number of failed nodes when [X]₀ is small as the system will follow the low-failure branch. However, for large [X]₀, the critical value β_c becomes β_c ≈ 0.003 as the system will follow the high-failure branch. A plot of β_c against [X]₀ will therefore exhibit two plateaus with β_c ≈ 0.007 for small [X]₀ and β_c ≈ 0.003 for large [X]₀.
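The two closed-form branches can be evaluated directly. The sketch below uses hypothetical function names and only the two expressions quoted above, together with the steady-state relations [X] = (β₁/μ₁)[A] and, on the high-failure branch with E_t = 1, [Y] = (β₂/μ₂)[A].

```python
def branch_failed_fraction(beta1, mu1, beta2, mu2, external_on):
    """Stationary [I] with E_t = 0 (external_on=False) or E_t = 1 (external_on=True)."""
    load = beta1 / mu1 + (beta2 / mu2 if external_on else 0.0)
    return 1.0 - 1.0 / (load + 1.0)

def branch_composition(beta1, mu1, beta2, mu2, external_on):
    """Return ([X], [Y], [A]) on the chosen branch."""
    active = 1.0 - branch_failed_fraction(beta1, mu1, beta2, mu2, external_on)
    x = beta1 / mu1 * active
    y = beta2 / mu2 * active if external_on else 0.0
    return x, y, active
```

For β₁ = 0.004, μ₁ = 0.01, β₂ = 2, and μ₂ = 1, the low-failure branch gives roughly (0.286, 0, 0.714) and the idealized high-failure branch roughly (0.118, 0.588, 0.294). The latter is only an approximation because E_t is close to, but not exactly, 1 in the high-failure state.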

The mean-field approximation not only simplifies the analysis but also provides insights into the dynamical process. For example, the mean-field theory suggests the ratios β₁/μ₁ and β₂/μ₂ as key parameters. In general, solutions can be obtained numerically by solving Eq. (5) together with Eq. (4). The results are shown in Fig. 2c as a phase diagram. For parameters falling into the regions corresponding to the low-failure (high-failure) phase, the system will evolve into a low-failure (high-failure) state. For parameters in the bistable phase, the system will evolve either to a low-failure or a high-failure state, depending on the initial conditions. The high-failure and low-failure phase boundaries meet at the critical point determined by β₁/μ₁ ≈ 0.745 and β₂/μ₂ ≈ 1.020.

In addition to the stationary state, the evolution of the system can also be studied by iterating Eqs. (1) and (2) for a given initial condition. Figure 3a shows the evolution of the MR dynamics as obtained by the mean-field theory for β₁ = 0.004 and β₂ = 2. In the three-dimensional space formed by [A], [X], and [Y], the sum rule [A] + [X] + [Y] = 1 defines a triangular plane, as shown in Fig. 3. At any time t, the state of the system is characterized by a point in the plane. The results show that the MR dynamics will evolve into either the low-failure or the high-failure state (filled circles), depending on where the system begins. The mean-field theory also gives a separatrix, the line traced out by the open circles: initial points on opposite sides of the separatrix evolve into different states. For [X]₀ > 0.38, the system will evolve to a high-failure state with ([X], [Y], [A]) given approximately by (0.119, 0.580, 0.301). For [X]₀ < 0.38, the system may evolve to the high-failure state or a low-failure state at around (0.285, 0, 0.715). Numerical results are shown in Fig. 3b, verifying all the features predicted by the mean-field theory. For example, the high-failure state is given by ([X], [Y], [A]) ~ (0.124, 0.579, 0.298) and the low-failure state is at around (0.287, 0, 0.713), both of which are quite close to the values predicted by the mean-field theory.
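Iterating Eqs. (1) and (2) can be sketched with a forward Euler scheme. Eq. (4) for E_t is not reproduced in this section, so the code assumes a binomial form: the probability that at most m of a node's k neighbors are active when each neighbor is independently active with probability [A]. That form, the function names, and the Euler scheme itself are assumptions made for illustration.

```python
from math import comb

def external_failure_prob(active, k=35, m=15):
    """Assumed form of E_t: P(at most m of k independent neighbors are active)."""
    return sum(comb(k, j) * active**j * (1.0 - active)**(k - j) for j in range(m + 1))

def integrate_mr(x0, y0, beta1, beta2, mu1, mu2, t_max=1000.0, dt=0.01, k=35, m=15):
    """Forward Euler iteration of Eqs. (1) and (2); returns the final ([X], [Y], [A])."""
    x, y = x0, y0
    for _ in range(round(t_max / dt)):
        a = 1.0 - x - y                   # sum rule [A] + [X] + [Y] = 1
        e = external_failure_prob(a, k, m)
        dx = beta1 * a - mu1 * x          # Eq. (1)
        dy = beta2 * e * a - mu2 * y      # Eq. (2)
        x, y = x + dt * dx, y + dt * dy
    return x, y, 1.0 - x - y
```

With β₁ = 0.004, β₂ = 2, μ₁ = 0.01, and μ₂ = 1, a run started at [X]₀ = 0.1, [Y]₀ = 0 should settle on the low-failure state and a run started at [X]₀ = 0.9, [Y]₀ = 0 on the high-failure state, in line with the bistability described above.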

**Fig. 3:** Evolutionary properties of MR dynamics.

Pairwise approximation theory for the MR model: It is possible to formulate a theory that takes into account two-node spatial correlation based on the pairwise approximation (PA). The basic idea is to follow the evolution of different types of links, i.e., links that connect different pairs of neighboring nodes⁶². The PA method has been used widely in studying epidemic and information spreading^{63,64,65}, and in coevolving voter models and adaptive games with two or more strategies^{66,67,68,69}. In “Effect of nodal correlation: pairwise approximation for the MR model” of “Methods” section, we develop a PA-based theory for the MR model.

**Fig. 4:** Comparison of simulation results with predictions from PA analysis and mean-field theory for the MR model.

Figure 4 presents a comparison of the predictions of the PA analysis and mean-field theory with the numerical results, where Fig. 4a shows the time evolution of [X]_t and [Y]_t from the initial state [X]₀ = [Y]₀ = 0 for β₁ = 0.009, β₂ = 2.0, μ₁ = 0.01, and μ₂ = 1. While both mean-field and PA theories capture the key features in time evolution, the results of PA are in better agreement with those from simulations. It is useful to understand the dynamical behaviors in the MR model qualitatively (so as to enable a meaningful comparison with those of the NMR model later). For this purpose, we identify several stages in the time evolution as marked in Fig. 4a. In the early stage, i.e., t ∈ [t_O, t_A], most nodes are active and have more than m active neighbors, so the condition n_A ≤ m for external failure is not met. As a result, only internal failures occur and [X]_t grows but [Y]_t decreases and eventually vanishes. For t ∈ [t_A, t_B], [X]_t has grown large enough that active nodes start to fail into Y-type nodes, leading to fewer active nodes in the system and triggering more external nodal failures. This results in the observed rapid increase in [Y]. In the later stage t ∈ [t_B, t_C], there are more failed nodes than active ones. While the failed nodes of X and Y types can recover with their respective rates, the remaining or recovered active nodes will more likely fail again through external than internal causes due to the many failed nodes surrounding the active nodes. Consequently, in this later stage, [Y]_t increases and [X]_t decreases toward their respective steady-state values for t → ∞, with [Y] > [X] when the system evolves into a high-failure state. The PA analysis captures the behavior of [X]_t over time and the onset of [Y]_t better than the mean-field analysis. Figure 4b shows the phase diagram for μ₁ = 0.01 and μ₂ = 1.0. The mean-field phase diagram is the same as that shown in Fig. 2c, where it can be seen that the results of the PA analysis (solid curves) are indeed in better agreement with the simulation results than the predictions of the single-node mean-field theory.

Note that Fig. 2 reveals the emergence of a critical value β_c of the spontaneous failure rate beyond which the system incurs a large-scale failure starting from the initial conditions [X]₀ ≠ 0 and [Y]₀ = 0. The critical rate β_c is calculated by starting the system from these initial conditions for different values of β₁ (for a fixed value of β₂ = 2.0) and searching for the value of β₁ beyond which the system attains a high-failure state (see Supplementary Fig. 1 in Supplementary Note 2). The critical value thus depends on [X]₀, the initial fraction of failed nodes due to an internal mechanism. Figure 4c shows the numerically obtained functional relation β_c([X]₀) (open circles), together with two types of theoretical prediction (PA analysis and mean-field theory). As the initial fraction [X]₀ is increased from a near-zero value, β_c remains at a relatively high constant value (about 0.007). As [X]₀ increases through the value of about 0.4, the value of the critical rate suddenly decreases to about 0.003. We see that, again, the prediction of the abrupt change in β_c by the PA analysis is more accurate than that by the mean-field theory.

What is the physical meaning of the abrupt decrease in the critical value of the spontaneous failure rate as displayed in Fig. 4c? A higher value of β_c means that the network system is more resilient to large-scale failures as it requires a larger rate value to drive the system into a high-failure state. As the fraction of initially failed nodes is increased, the network as a whole is more prone to large-scale failure so we expect the value of β_c to decrease. Because of the lack of any memory effect in the ideal, Markovian type of recovery process, i.e., after a node fails, it either recovers instantaneously or does not recover (with probabilities determined by the rate of recovery), we expect a characteristic change in the system dynamics as characterized by the value of the critical rate β_c to occur in an abrupt manner. Indeed, as Fig. 4c reveals, as the fraction of initially failed nodes is increased through a threshold value, there is a sudden decrease of about 50% in the value of β_c, giving rise to a first-order type of transition. This behavior of abrupt transition may not occur in reality because of the assumed Markovian recovery process, which is ideal and cannot be expected to arise typically in the physical world. In the next section, we will demonstrate that making the dynamics more physical by assuming non-Markovian type of recovery process will drastically alter the picture of transition in Fig. 4c.

Non-Markovian recovery process

To analyze failure propagation dynamics in systems with NMR, a viable approach is to construct difference equations that relate the fractions of types of nodes and links at time t + Δt to those at time t. It is necessary to keep track of the time when a node becomes the X or Y type as well as the time at which a link becomes type UV. In “Pairwise approximation theory for the NMR model” of “Methods” section, we develop a PA analysis for the NMR model. Figure 5 shows the simulation results from the NMR model, together with predictions of the PA analysis and mean-field approximation for Δt = 0.01. The time evolution of [X]_t and [Y]_t is shown for the parameter setting β₁ = 0.009, β₂ = 2.0, τ₁ = 100 (thus μ₁ = 0.01), and τ₂ = 1 (thus μ₂ = 1). The initial conditions are [X]₀ = [Y]₀ = 0. Both theories capture the key features of the dynamics. Comparing with results from the MR model [e.g., Fig. 4a], we see that the time evolution of the dynamical variables in the NMR model is different from that in the MR model, in spite of the approximately identical steady-state values.
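The bookkeeping of failure times can be illustrated at the mean-field level with cohort queues. This sketch is not the PA analysis developed in Methods: it tracks only node fractions, assumes a binomial form for E_t (the probability that at most m of k independent neighbors are active), and assumes that initially failed nodes all failed at t = 0. All function names are hypothetical.

```python
from collections import deque
from math import comb

def external_failure_prob(active, k=35, m=15):
    """Assumed form of E_t: P(at most m of k independent neighbors are active)."""
    return sum(comb(k, j) * active**j * (1.0 - active)**(k - j) for j in range(m + 1))

def integrate_nmr(x0, y0, beta1, beta2, tau1, tau2, t_max=1000.0, dt=0.01, k=35, m=15):
    """Difference equations with memory: a cohort that fails at t recovers at t + tau."""
    n1, n2 = round(tau1 / dt), round(tau2 / dt)
    # Each queue holds the fraction of nodes that failed in each of the last n steps.
    # The initial failed nodes are placed last, so they recover after n1 (or n2) steps.
    x_cohorts = deque([0.0] * (n1 - 1) + [x0])
    y_cohorts = deque([0.0] * (n2 - 1) + [y0])
    x, y = x0, y0
    for _ in range(round(t_max / dt)):
        a = max(0.0, 1.0 - x - y)
        new_x = beta1 * dt * a                                   # internal failures
        new_y = beta2 * dt * external_failure_prob(a, k, m) * a  # external failures
        x_cohorts.append(new_x)
        y_cohorts.append(new_y)
        x += new_x - x_cohorts.popleft()  # nodes that failed tau1 ago recover now
        y += new_y - y_cohorts.popleft()  # nodes that failed tau2 ago recover now
    return x, y, 1.0 - x - y
```

In this sketch, a run with β₁ = 0.002, β₂ = 2, τ₁ = 100, and τ₂ = 1 that starts from [X]₀ = 0.9 drops back to a low-failure state once the initial cohort recovers at t = τ₁, which mirrors the heuristic argument given later for Fig. 5c.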

**Fig. 5:** Benefit of non-Markovian recovery to making the network more resilient against large-scale failures.

To describe the key features of the NMR model, we divide the evolution into five stages with the respective time intervals [t_O, t_A], [t_A, t_B], [t_B, t_C], [t_C, t_D], and [t_D, t_E], as shown in Fig. 5a. In the earliest stage [t_O, t_A], [X]_t increases due to internal failures but [X]_t is insufficient to cause external failures. The behavior is similar to that in the MR model, but the duration is shorter and the rise in [X]_t is steeper in the NMR model. The reason is that the memory effect in the NMR model allows the recovery of X-type nodes to take place only after τ₁ steps, while the recovery occurs probabilistically in the MR model. In the narrow time window of [t_A, t_B], [X]_t attains a level high enough to trigger the onset of many external failures. As a result, the failed nodes constitute the majority in the system and [A]_t decreases sharply, giving rise to the sharp increase in [Y]_t. The Y-type nodes recover deterministically after τ₂ (τ₂ < τ₁) into active nodes. In the period [t_B, t_C], the recovery of Y-type nodes refuels the system with active nodes that can participate in two paths: more internal and external failures. For t_C < τ₁, the existing X-type nodes have yet to recover and [X]_t continues to increase but at a slower pace due to the external failure path, while [A]_t reduces slightly.

In the time window [t_C, t_D], the initial internally failed nodes begin to recover as t_C > τ₁, in addition to the recovery of the Y-type nodes. The A-type nodes due to recovery will be more likely to become Y-type as the failed nodes remain the majority (due to the parameter setting β₂ > β₁ in this example). This leads to the observed increase in [Y]_t and decrease in [X]_t in the time interval [t_C, t_D]. In the final stage [t_D, t_E], [X]_t stops decreasing because the recovery of X-type nodes at the time t ≳ t_D is due to those failed internally at t ≳ t_B for which the number was small. However, the recovery of Y-type nodes at a shorter time scale supplies fresh active nodes. The fraction of failed nodes [X]_t + [Y]_t is so high, i.e., approaching the high-failure state, that the dynamics lead to a higher steady value of [Y] than [X] in long time. For time well beyond t_D, both [X] and [Y] become steady.

Figure 5b shows the phase diagram of the NMR model analogous to Fig. 4b for the MR model, with μ₁ = 0.01 and μ₂ = 1.0. The results of the PA analysis (solid curve) are in better agreement with the simulation results than those obtained from the mean-field theory (dashed curve). The difference in dynamics in the NMR model also alters how the value of β_c needed to sustain a high-failure state depends on [X]₀. Carrying out the same analysis as for the MR model (see Supplementary Fig. 1 in Supplementary Note 2 for details), we get the relationship β_c([X]₀) for attaining a high-failure state for a given initial condition, as shown in Fig. 5c. The pair approximation, again, gives a more accurate prediction than the mean-field theory.

The result in Fig. 5c demonstrates the striking effect of a non-Markovian type of recovery with memory on the failure propagation dynamics, which is in stark contrast to the ideal case of a Markovian process as exemplified in Fig. 4c. In particular, as the fraction [X]₀ of initially failed nodes is increased from a near-zero value to one, the value of β_c first decreases continuously and smoothly until it reaches a minimum, after which β_c increases relatively rapidly to a high value of about 0.006 for [X]₀ ≈ 0.3. For [X]₀ > 0.3, the value of β_c remains approximately constant at 0.006. Comparing Fig. 5c with Fig. 4c, we see two major, characteristic differences. Firstly, the abrupt decrease in the Markovian case is replaced by a gradual process in the non-Markovian case, essentially converting a first-order-like process to a second-order one. Secondly and more importantly, β_c recovers from its minimum value and remains at a high value regardless of the value of [X]₀ insofar as it exceeds about 30%. This means that the system can maintain its degree of resilience even when the initial fraction of failed nodes reaches 100%! This contrasts squarely with the behavior in the Markovian case, where the system resilience is reduced dramatically even when only about 40% of the nodes failed initially. In this sense, we say that a non-Markovian type of memory effect makes the network system more resilient against failure propagation.

While the behavior in Fig. 5c is counterintuitive, a heuristic reason is as follows. For an initial state with many initial X-type nodes, the few remaining nodes will switch from being active to the Y-type and back. All the initial X-type nodes will have to wait for the time period τ₁ to recover. At that time, the system becomes one with only a few failed nodes—effectively equivalent to one with small [X]₀ value and requiring a larger β_c value to evolve into the high-failure state. In a range of small [X]₀, a smaller β_c can already cause more active nodes to become Y-type, helping maintain the system in a high-failure state as described for Fig. 5a. Theoretical support for the behavior is provided by the PA analysis and mean-field theory, as shown in Fig. 5c.

In addition to the different time evolution in the MR and NMR models, there are also cases where the same initial conditions [X]₀, [Y]₀, and [A]₀ would lead to different final states. Figure 6 shows the final states starting from any [X]₀ and [Y]₀ in the [X]₀ − [Y]₀ plane (the basin structure), with β₁ = 0.004, β₂ = 2.0, μ₁ = 0.01, and μ₂ = 1.0. The results from the mean-field theory (Fig. 6a) and direct simulations (Fig. 6b) show essentially the same features. (Results from the initial-condition setting [X]₀ ≠ 0 and [Y]₀ = 0.0 are presented in Supplementary Fig. 2 of Supplementary Note 2.) It is useful to contrast the final states of the MR and NMR models. From Fig. 3, an initial state, e.g., [X]₀ = [Y]₀ = 0.5, will evolve into a high-failure state in the MR model, but it will end up in a low-failure state in the NMR model. This means that the NMR process can make the system more resilient to failures. (More examples can be found in Supplementary Fig. 3 of Supplementary Note 2 where different steady states from the two models are presented.)

**Fig. 6:** Basin structure of NMR model.

MR and NMR dynamics on heterogeneous networks

So far our analysis and simulations have been carried out for MR and NMR dynamics on random regular networks. We find that altering the network structure causes little change in the qualitative results. For example, we have carried out simulations on scale-free networks of size N = 3 × 10⁴ with degree range $[k_{\min}, \sqrt{N}]$ and degree distribution P(k) ~ k^−γ. Figure 7 shows the results of β_c versus [X]₀ for the MR and NMR dynamics for networks with γ = 3. Because of the heterogeneity in the nodal degree distribution, the threshold on external failure is specified as a fraction of a node's neighbors that have failed, here one-half.

Comparing results with Fig. 4c for MR dynamics and Fig. 5c for NMR dynamics in random regular networks, we see that the key features are similar when the underlying random regular networks are replaced by scale-free networks. We have also carried out numerical simulations on four additional types of synthetic and empirical networks: (a) networks with degree–degree correlation, (b) networks with a community structure, (c) empirical arenas-email network, and (d) empirical friendship-hamster network, with results presented in Supplementary Notes 4 and 5 for the former and latter two cases, respectively. These results, together with Fig. 7, suggest that, for heterogeneous networks, a non-Markovian process tends to enhance the network resilience against large-scale failures.

Discussion

The intrinsic memory effect associated with non-Markovian processes makes it challenging to analyze the underlying network dynamics, but new and surprising phenomena can arise. Most previous studies treated Markovian processes through either a mean-field type of theory^{60,61} or an effective degree approach⁵⁹. For non-Markovian processes, the mean-field approximation can still be applied^{29,31,32,33}, but it is necessary to invoke a higher-order theory such as the PA analysis. Our work presents such an example in the context of failure propagation in complex networks.

Our study has demonstrated that, in both models, the network can evolve into a low-failure or a high-failure state, with the latter corresponding to the undesired state of large-scale failure. Both the mean-field and PA theories are capable of predicting the dynamical behaviors of failure propagation, and the performances of the theories are gauged by simulation results, revealing that the more laborious pair approximation gives results in better quantitative agreement with the numerics. Our systematic computations on different complex networks and two types of theoretical analyses have uncovered a striking phenomenon: the non-Markovian memory effect in the nodal recovery can counter-intuitively make the network more resilient against large-scale failures.

Our finding also calls for the incorporation of non-Markovian memory into the design of communication, computer, and infrastructure networks in various engineering disciplines. We hope our work will stimulate interest in examining and exploiting non-Markovian processes in various network dynamical processes. We have also carried out a systematic study of the effects of Markovian versus non-Markovian recovery on network synchronization using the paradigmatic Kuramoto network model, with the main finding that non-Markovian recovery makes the network more resilient against large-scale breakdown of synchronization (Supplementary Note 6).

Methods

Mean-field theory for MR dynamics

Let [A]_t, [X]_t, and [Y]_t be the fractions of A-type, X-type, and Y-type nodes in the system at time t, respectively. A hierarchical set of dynamical equations for the MR model can be constructed to include increasingly longer spatial correlation. The equations for the evolution of the fractions of different types of nodes are:

$$\frac{{\mathrm{d}}{[X]}_{{t}}}{{\mathrm{d}}t}={\beta }_{1}{[A]}_{{t}}-{\mu }_{1}{[X]}_{{t}},$$

(1)

and

$$\frac{{\mathrm{d}}{[Y]}_{{t}}}{{\mathrm{d}}t}={\beta }_{2}{E}_{{t}}{[A]}_{{t}}-{\mu }_{2}{[Y]}_{{t}},$$

(2)

where the first term in each equation gives the supply to [X] ([Y]) due to internal (external) failures and the second term represents the drop in [X] ([Y]) due to recovery. Note that, because of the relation

$${[A]}_{{t}}=1-{[X]}_{{t}}-{[Y]}_{{t}}\equiv 1-{[I]}_{{t}},$$

(3)

an equation for [A]_t is unnecessary. The quantity E_t is the probability that an A-type node has j ≤ m A-type neighbors at time t, in which case the node fails externally at the rate β₂.

In general, the quantity E_t involves the correlation between two neighboring nodes. To connect Eqs. (1) and (2) so as to retain the simplicity of a single-node theory, we use the approximation

$${E}_{{t}}([I])=\mathop{\sum }\limits_{{j} = 0}^{{m}}{C}_{{k}}^{{k-j}}{({[I]}_{{t}})}^{k-j}{(1-{[I]}_{{t}})}^{j},$$

(4)

where ${C}_{{k}}^{{k-j}}=k!/(j!(k-j)!)$. Equations (1)–(4) form a closed set of equations, from which the fractions of different types of nodes can be solved. This is the simplest single-site mean-field approximation for the MR dynamics that ignores any spatial correlation. Despite its simplicity, it is capable of revealing the key features in the stationary state, in which Eqs. (1) and (2) require the fraction of failed nodes [I] to satisfy

$$[I]=1-\frac{1}{({\beta }_{2}/{\mu }_{2}){E}_{{t}}([I])+({\beta }_{1}/{\mu }_{1})+1},$$

(5)

which can be solved for [I] self-consistently with Eq. (4). Equation (5) implies that [I] depends only on the ratios β₁/μ₁ and β₂/μ₂ within the mean-field approximation, and so do the other fractions [A], [X], and [Y].
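The self-consistent condition formed by Eqs. (4) and (5) can be explored numerically. The sketch below is one possible way to do so, not the authors' code: it scans [0, 1] for sign changes of the mismatch between the two sides of Eq. (5) and refines each crossing by bisection. The function names, the grid resolution, and the parameter values in the usage line are illustrative assumptions; `r1` and `r2` stand for β₁/μ₁ and β₂/μ₂.

```python
from math import comb


def external_failure_prob(inactive, k, m):
    """Eq. (4): probability that a degree-k active node has at most m active
    neighbors, each neighbor being inactive independently with probability [I]."""
    return sum(comb(k, j) * inactive ** (k - j) * (1.0 - inactive) ** j
               for j in range(min(m, k) + 1))


def residual(inactive, k, m, r1, r2):
    """Left-hand side minus right-hand side of Eq. (5)."""
    e = external_failure_prob(inactive, k, m)
    return inactive - (1.0 - 1.0 / (r2 * e + r1 + 1.0))


def stationary_states(k, m, r1, r2, grid=2000, tol=1e-12):
    """Return every [I] in [0, 1] at which Eq. (5) holds (stable or unstable)."""
    roots = []
    for i in range(grid):
        lo, hi = i / grid, (i + 1) / grid
        f_lo = residual(lo, k, m, r1, r2)
        f_hi = residual(hi, k, m, r1, r2)
        if f_lo == 0.0:
            roots.append(lo)
        elif f_lo * f_hi < 0.0:
            # Bisection on an interval that brackets a sign change.
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                f_mid = residual(mid, k, m, r1, r2)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
    return roots


# Hypothetical parameter values, chosen only to exercise the solver.
print(stationary_states(k=10, m=4, r1=0.05, r2=5.0))
```

When the scan returns more than one root, the smallest and largest correspond to candidate low-failure and high-failure states; deciding which one the dynamics reach requires integrating Eqs. (1) and (2) from the initial condition.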

Effect of nodal correlation: pairwise approximation for the MR model

Our PA based analysis begins by defining [UV]_t as the fractions of UV type of links in the system at time t, where U, V ∈ {A, X, Y}. A connection that stems out from a node can be classified by a type. For example, for a node with the current state being A-type, each link that it carries can be classified into the AA, AX, or AY type, depending on the state of the node at the other end of the link. Taking into account every link from every node, we have that the fractions of links satisfy

$$\sum _{{\mathrm{U,V}}\in \{{\mathrm{A,X,Y}}\}}{[UV]}_{{t}}=1,$$

(6)

with [UV]_t = [VU]_t for U ≠ V.

In general, the equations of single-node quantities, e.g., Eq. (2), necessarily involve quantities of more extensive spatial correlation because of the interplay between the failure of a node and the states of its neighboring nodes. Since [AI]_t/[A]_t = ([AX]_t + [AY]_t)/[A]_t is the probability that a given neighbor of an A-type node is inactive, regardless of whether that neighbor is of X or Y type, the probability that there are exactly j neighbors of A-type and (k − j) inactive neighbors of either X or Y type is

$${C}_{{k}}^{{k-j}}{\left(\frac{{[AI]}_{{t}}}{{[A]}_{{t}}}\right)}^{{k-j}}{\left(1-\frac{{[AI]}_{{t}}}{{[A]}_{{t}}}\right)}^{j},$$

(7)

where k is the degree of the node. The quantity E_t in Eq. (2), as schematically depicted in Fig. 8a, is thus given by

$${E}_{{t}}=\mathop{\sum }\limits_{{j} = 0}^{{m}}{C}_{{k}}^{{k-j}}{\left(\frac{{[AI]}_{{t}}}{{[A]}_{{t}}}\right)}^{{k-j}}{\left(1-\frac{{[AI]}_{{t}}}{{[A]}_{{t}}}\right)}^{j},$$

(8)

which indicates explicitly that the dynamics of single-node quantities are governed by the two-node quantity [AI]_t. This is reminiscent of the BBGKY (Bogoliubov-Born-Green-Kirkwood-Yvon) hierarchy of equations for the distribution functions in a system consisting of a large number of interacting particles in statistical physics⁷⁰. Only under the approximation [AI]_t ≈ [A]_t[I]_t (so that the two-node correlation can be neglected) does Eq. (8) reduce to Eq. (4), which closes the set of single-node mean-field equations.
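Written out, the decoupling step amounts to replacing the conditional probability in Eq. (8) by the unconditional fraction of inactive nodes:

$$\frac{{[AI]}_{t}}{{[A]}_{t}}\approx \frac{{[A]}_{t}{[I]}_{t}}{{[A]}_{t}}={[I]}_{t}\quad \Longrightarrow \quad {E}_{t}\approx \mathop{\sum }\limits_{j=0}^{m}{C}_{k}^{k-j}{({[I]}_{t})}^{k-j}{(1-{[I]}_{t})}^{j},$$

which is the form of Eq. (4).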

**Fig. 8: Schematic diagram for the PA analysis of the MR model.**

To proceed, we derive the dynamical equations for [UV]_t that will in general involve more extensive spatial correlation. For example, a link of the type AA would evolve into a different type depending on the neighborhoods of the two nodes, effectively a small cluster of nodes. To develop a manageable approximation, we retain the two-node correlation and decouple any longer spatial correlation in terms of one-node and two-node functions. This is the idea behind PA for obtaining a closed set of equations. In particular, the dynamical equations for [AX]_t and [AA]_t are

$$\begin{aligned}\frac{{\mathrm{d}}{[AX]}_{t}}{{\mathrm{d}}t}=\ &{\mu }_{1}{[XX]}_{t}+{\mu }_{2}{[YX]}_{t}+{\beta }_{1}{[AA]}_{t}\\ &-{\mu }_{1}{[AX]}_{t}-({\beta }_{1}+{\beta }_{2}{E}_{t}^{\prime}){[AX]}_{t},\end{aligned}$$

(9)

and

$$\begin{aligned}\frac{{\mathrm{d}}{[AA]}_{t}}{{\mathrm{d}}t}=\ &2{\mu }_{1}{[AX]}_{t}+2{\mu }_{2}{[AY]}_{t}\\ &-2({\beta }_{1}+{\beta }_{2}{E}_{t}^{\prime\prime}){[AA]}_{t},\end{aligned}$$

(10)

where

$${E}_{{t}}^{\prime}=\mathop{\sum }\limits_{j = 0}^{{m}}{C}_{{k-1}}^{{k-1-j}}{\left(\frac{{[AI]}_{{t}}}{{[A]}_{{t}}}\right)}^{{k-1-j}}{\left(1-\frac{{[AI]}_{{t}}}{{[A]}_{{t}}}\right)}^{j},$$

(11)

is the probability of an A-type node having j ≤ m A-type neighbors among its (k − 1) neighbors, given that one neighbor is inactive, and

$${E}_{{t}}^{^{\prime\prime} }=\mathop{\sum }\limits_{{j = 0}}^{{m-1}}{C}_{{k-1}}^{{k-1-j}}{\left(\frac{{[AI]}_{{t}}}{{[A]}_{{t}}}\right)}^{{k-1-j}}{\left(1-\frac{{[AI]}_{{t}}}{{[A]}_{{t}}}\right)}^{j},$$

(12)

is the probability of an A-type node having j ≤ m − 1 A-type neighbors among its (k − 1) neighbors, given that one neighbor is active. Figure 8 illustrates the meanings of E_t, ${E}_{{t}}^{\prime}$, and ${E}_{{t}}^{^{\prime\prime} }$ schematically. The terms in Eqs. (9) and (10) account for how the recovery and failure processes affect the fractions of AX-type and AA-type links. The complete set of dynamical equations is listed in Supplementary Note 1, which can be solved iteratively to yield the time evolution of the fractions of node types and link types given an initial condition. The steady-state quantities can be obtained through a sufficiently large number of iterations.
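The three probabilities in Eqs. (8), (11), and (12) share one binomial form and differ only in the number of neighbors considered and the upper limit of the sum. The sketch below evaluates them from the node and link fractions; the helper names are hypothetical, and it assumes the link fractions are normalized as in Eq. (6) so that [AI]_t/[A]_t is a probability.

```python
from math import comb


def at_most_active(n, max_active, q):
    """Probability that at most `max_active` of n neighbors are active when each
    neighbor is inactive independently with probability q."""
    if max_active < 0:
        return 0.0
    return sum(comb(n, j) * q ** (n - j) * (1.0 - q) ** j
               for j in range(min(max_active, n) + 1))


def pair_probabilities(frac_a, frac_ax, frac_ay, k, m):
    """Return (E, E', E'') of Eqs. (8), (11), and (12) for a degree-k node."""
    q = (frac_ax + frac_ay) / frac_a          # [AI]/[A]
    e = at_most_active(k, m, q)               # Eq. (8): all k neighbors
    e_prime = at_most_active(k - 1, m, q)     # Eq. (11): one neighbor known inactive
    e_dprime = at_most_active(k - 1, m - 1, q)  # Eq. (12): one neighbor known active
    return e, e_prime, e_dprime


# Illustrative fractions only; they are not taken from the paper.
e, e1, e2 = pair_probabilities(frac_a=0.5, frac_ax=0.15, frac_ay=0.10, k=4, m=2)
print(e, e1, e2)
```

A useful consistency check is that conditioning on the state of one neighbor recovers the unconditioned quantity, E = q·E′ + (1 − q)·E″ with q = [AI]/[A], which follows directly from the binomial forms above.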

Pairwise approximation theory for the NMR model

For the NMR model, we let ${[{U}^{l}]}_{{t}}$ be the fraction of nodes of type U at time t, which became type U from some other type only l time steps ago, and ${[{U}^{{l}_{1}}{V}^{{l}_{2}}]}_{{t}}$ be the fraction of links of the UV type when the corresponding node(s) associated with a link became that of the labeled type l₁ and l₂ time steps ago. The time evolution of the fraction of X-type nodes in the NMR model is given by

$${[{X}^{l}]}_{{t}+\Delta {t}}=\left\{\begin{array}{cc}{\beta }_{1}\Delta t{[A]}_{{t}},&l\in [0,\Delta t);\\ {[{X}^{l-\Delta {t}}]}_{{t}},&l\in [\Delta {t},{\tau }_{1}];\\ 0,&l\in ({\tau }_{1},\infty ).\end{array}\right.$$

(13)

The first line in Eq. (13) gives the new supply due to internal failure of A-type nodes in the time duration [t, t + Δt). The second line accounts for the nodes which were inactive for a duration l − Δt at time t but have not reached the time for recovery at time t + Δt. The third line states that all X-type nodes that came to existence τ₁ earlier have been recovered. Similarly, the time evolution of the fraction of Y-type nodes is given by

$${[{Y}^{l}]}_{{t}+\Delta {t}}=\left\{\begin{array}{cc}{\beta }_{2}\Delta t{E}_{{t}}{[A]}_{{t}},&l\in [0,\Delta t);\\ {[{Y}^{l-\Delta t}]}_{{t}},&l\in [\Delta t,{\tau }_{2}];\\ 0,&l\in ({\tau }_{2},\infty ),\end{array}\right.$$

(14)

where E_t is defined in Eq. (8) and [AI]_t = [AX]_t + [AY]_t. The fractions of X-type and Y-type nodes, regardless of how long they have been in the corresponding state, are given by ${[X]}_{{t}}=\mathop{\sum }\nolimits_{l = 0}^{{\tau }_{1}}{[{X}^{l}]}_{{t}}$ and ${[Y]}_{{t}}=\mathop{\sum }\nolimits_{l = 0}^{{\tau }_{2}}{[{Y}^{l}]}_{{t}}$, respectively. The fraction of active nodes follows from [A]_t = 1 − [X]_t − [Y]_t.

To develop a PA analysis for failure propagation dynamics with NMR, we construct the equations for the time evolution of UV-types of links and retain spatial correlation up to two neighboring nodes. Our derivation of the counterparts of Eqs. (13) and (14) in the MR case suggests the necessity to examine the history of the inactive node(s) associated with a link. For example, the time evolution of the links in ${[A{X}^{l}]}_{{t}}$ is governed by

$${[A{X}^{l}]}_{t+\Delta t}=\left\{\begin{array}{ll}{\beta }_{1}\Delta t{[AA]}_{t}+{\beta }_{1}\Delta t\left({[{X}^{{\tau }_{1}}A]}_{t}+{[{Y}^{{\tau }_{2}}A]}_{t}\right),&l\in [0,\Delta t);\\ {[{X}^{{\tau }_{1}}{X}^{l-\Delta t}]}_{t}+{[{Y}^{{\tau }_{2}}{X}^{l-\Delta t}]}_{t}+(1-{\beta }_{1}\Delta t-{\beta }_{2}\Delta t{E}_{t}^{\prime}){[A{X}^{l-\Delta t}]}_{t},&l\in [\Delta t,{\tau }_{1}];\\ 0,&l\in ({\tau }_{1},\infty ),\end{array}\right.$$

(15)

where ${E}_{{t}}^{\prime}$ is defined in Eq. (11). The first line represents the new supply to AX-type links due to an internal failure of one of the active nodes in an AA-type link, and due to an internal failure together with a recovery of the inactive node in links of the XA and YA types. The second line includes the supply to AX^l-type links due to recoveries in links of the XX and YX types, as well as the links of AX^(l−Δt) type that became AX^l type in the most recent duration Δt. The last line comes from the fact that an X-type node must recover after a time τ₁ since it became inactive. The fraction of links of AX-type, regardless of how long the inactive node in the link has been in the X state, is given by ${[AX]}_{{t}}=\mathop{\sum }\nolimits_{l = 0}^{{\tau }_{1}}{[A{X}^{l}]}_{{t}}$. The fraction of AA-type links then evolves in time as

$$\begin{aligned}{[AA]}_{t+\Delta t}=\ &2(1-{\beta }_{1}\Delta t-{\beta }_{2}\Delta t{E}_{t}^{\prime})({[A{X}^{{\tau }_{1}}]}_{t}+{[A{Y}^{{\tau }_{2}}]}_{t})\\ &+{[{X}^{{\tau }_{1}}{X}^{{\tau }_{1}}]}_{t}+{[{Y}^{{\tau }_{2}}{Y}^{{\tau }_{2}}]}_{t}+2{[{X}^{{\tau }_{1}}{Y}^{{\tau }_{2}}]}_{t}\\ &+(1-2{\beta }_{1}\Delta t-2{\beta }_{2}\Delta t{E}_{t}^{\prime\prime}){[AA]}_{t},\end{aligned}$$

(16)

where ${E}_{t}^{^{\prime\prime} }$ is defined in Eq. (12). Equations for other types of links can also be constructed (Supplementary Note 1). Equations (15) and (16) are analogous to Eqs. (9) and (10) in the MR model. The number of equations is determined by the divisions of τ₁ and τ₂ into the small time steps Δt, which increases rapidly when Δt is small compared with the other time scales in the NMR dynamics.

A crude approximation analogous to the mean-field theory can be developed for the NMR model by retaining only the fractions of nodes in the equations, which can be done by decoupling two-node quantities such as [AI]_t via [AI]_t ≈ [A]_t[I]_t. The resulting equations governing the fractions of different types of nodes become

$${[X]}_{{t}+\Delta {t}}={\beta }_{1}\Delta t{[A]}_{{t}}+{[X]}_{{t}}-{[{X}^{{\tau }_{1}}]}_{{t}},$$

(17)

and

$${[Y]}_{t+\Delta t}={\beta }_{2}\Delta t{E}_{t}{[A]}_{t}+{[Y]}_{t}-{[{Y}^{{\tau }_{2}}]}_{t},$$

(18)

where E_t takes on the approximate form in Eq. (4). Equations (17), (18), and (4) form a set of equations that can be solved to yield the fractions of different types of nodes. The first two terms in Eqs. (17) and (18) correspond to the increase in inactive nodes due to failure and due to those remaining inactive, and the last term corresponds to recovery. The number of equations, again, depends on the choice of Δt. This is the mean-field approximation for the NMR model that ignores any spatial correlation.
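The age-structured updates in Eqs. (13) and (14), together with the totals in Eqs. (17) and (18) and the closure in Eq. (4), map naturally onto a pair of fixed-length queues: each step pushes the newly failed cohort in at age zero and drops the cohort that has reached age τ. The following is a minimal sketch under stated assumptions, not the authors' C++ implementation. The function names are hypothetical, a single degree k is used (random regular network), and any initially failed X-type nodes are assumed to be spread uniformly over ages, which the text does not specify.

```python
from collections import deque
from math import comb


def external_failure_prob(inactive, k, m):
    """Eq. (4): probability that an active node has at most m active neighbors."""
    return sum(comb(k, j) * inactive ** (k - j) * (1.0 - inactive) ** j
               for j in range(min(m, k) + 1))


def simulate_nmr_mean_field(beta1, beta2, tau1, tau2, k, m,
                            dt=0.01, steps=5000, x0=0.0):
    """Iterate Eqs. (13), (14), (17), (18); return a list of ([X]_t, [Y]_t)."""
    n1 = round(tau1 / dt) + 1   # age bins 0, dt, ..., tau1
    n2 = round(tau2 / dt) + 1   # age bins 0, dt, ..., tau2
    # Index 0 holds the youngest cohort, index -1 the cohort about to recover.
    # Assumption: initial X-type nodes are spread uniformly over all ages.
    x_age = deque([x0 / n1] * n1, maxlen=n1)
    y_age = deque([0.0] * n2, maxlen=n2)
    x_total, y_total = x0, 0.0
    history = []
    for _ in range(steps):
        active = 1.0 - x_total - y_total
        e = external_failure_prob(x_total + y_total, k, m)
        new_x = beta1 * dt * active        # first line of Eq. (13)
        new_y = beta2 * dt * e * active    # first line of Eq. (14)
        x_total += new_x - x_age[-1]       # Eq. (17)
        y_total += new_y - y_age[-1]       # Eq. (18)
        x_age.appendleft(new_x)            # shifts ages by dt, drops age tau1
        y_age.appendleft(new_y)
        history.append((x_total, y_total))
    return history


# Hypothetical parameters, for illustration only.
x_final, y_final = simulate_nmr_mean_field(0.1, 0.0, 2.0, 1.0, k=10, m=4)[-1]
print(x_final, y_final)
```

With β₂ = 0 the stationary value can be checked by hand: every age bin holds β₁Δt[A], so [X] = c/(1 + c) with c = β₁Δt·n₁, where n₁ is the number of age bins. The memory cost grows as τ/Δt, in line with the remark above that the number of equations depends on the choice of Δt.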

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The source data underlying Figs. 2–7 and Supplementary Figs. 1–12 are available at https://github.com/zhlin2328/Codes-for-NCOMMS-19-1125220.

Code availability

C++ codes to reproduce the data in the main text and the Supplementary Information are available at https://github.com/zhlin2328/Codes-for-NCOMMS-19-1125220.

References

Motter, A. E. & Lai, Y.-C. Cascade-based attacks on complex networks. Phys. Rev. E 66, 065102 (2002).
Zhao, L., Park, K. & Lai, Y.-C. Attack vulnerability of scale-free networks due to cascading breakdown. Phys. Rev. E 70, 035101(R) (2004).
Zhao, L., Park, K., Lai, Y.-C. & Ye, N. Tolerance of scale-free networks against attack-induced cascades. Phys. Rev. E 72, 025104(R) (2005).
Galstyan, A. & Cohen, P. Cascading dynamics in modular networks. Phys. Rev. E 75, 036109 (2007).
Bialek, J. W. Why has it happened again? Comparison between the UCTE blackout in 2006 and the blackouts of 2003. In Power Tech 2007 IEEE Lausanne 51–56 (IEEE, 2007)
Dobson, I., Carreras, B. A., Lynch, V. E. & Newman, D. E. Complex systems analysis of series of blackouts: Cascading failure, critical points, and self-organization. Chaos 17, 026103 (2007).
Gleeson, J. P. Cascades on correlated and modular random networks. Phys. Rev. E 77, 046117 (2008).
Rosato, V. et al. Modelling interdependent infrastructures using interacting dynamical models. Int. J. Crit. Infrastruct. 4, 63–79 (2008).
Huang, L., Lai, Y.-C. & Chen, G. Understanding and preventing cascading breakdown in complex clustered networks. Phys. Rev. E 78, 036116 (2008).
Simonsen, I., Buzna, L., Peters, K., Bornholdt, S. & Helbing, D. Transient dynamics increasing network vulnerability to cascading failures. Phys. Rev. Lett. 100, 218701 (2008).
Yang, R., Wang, W.-X., Lai, Y.-C. & Chen, G. Optimal weighting scheme for suppressing cascades and traffic congestion in complex networks. Phys. Rev. E 79, 026112 (2009).
Takayasu, M., Watanabe, T. & Takayasu, H. Econophysics Approaches to Large-Scale Business Data and Financial Crisis: Proceedings of Tokyo Tech-Hitotsubashi Interdisciplinary Conference and APFA7 (Springer, 2010)
Huang, L. & Lai, Y.-C. Cascading dynamics in complex quantum networks. Chaos 21, 025107 (2011).
Wang, W., Lai, Y.-C. & Armbruster, D. Cascading failures and the emergence of cooperation in evolutionary-game based models of social and economical networks. Chaos 21, 033112 (2011).
Liu, R.-R., Wang, W.-X., Lai, Y.-C. & Wang, B.-H. Cascading dynamics on random networks: crossover in phase transition. Phys. Rev. E 85, 026110 (2012).
Li, D. et al. Percolation transition in dynamical traffic network with evolving critical bottlenecks. Proc. Natl Acad. Sci. USA 112, 669–672 (2015).
Parshani, R., Buldyrev, S. V. & Havlin, S. Critical effect of dependency groups on the function of networks. Proc. Natl Acad. Sci. USA 108, 1007–1010 (2011).
Watts, D. J. A simple model of global cascades on random networks. Proc. Natl Acad. Sci. USA 99, 5766–5771 (2002).
Dodds, P. S. & Watts, D. J. Universal behavior in a generalized model of contagion. Phys. Rev. Lett. 92, 218701 (2004).
Simonsen, I., Buzna, L., Peters, K., Bornholdt, S. & Helbing, D. Transient dynamics increasing network vulnerability to cascading failures. Phys. Rev. Lett. 100, 218701 (2008).
Buldyrev, S. V., Parshani, R., Paul, G., Stanley, H. E. & Havlin, S. Catastrophic cascade of failures in interdependent networks. Nature 464, 1025 (2010).
Ganin, A. A. et al. Resilience and efficiency in transportation networks. Sci. Adv. 3, e1701079 (2017).
Nudo, R. J. Recovery after brain injury: mechanisms and principles. Front. Hum. Neurosci. 7, 887 (2013).
Shang, Y. Impact of self-healing capability on network robustness. Phys. Rev. E 91, 042804 (2015).
Hu, F., Yeung, C. H., Yang, S., Wang, W. & Zeng, A. Recovery of infrastructure networks after localised attacks. Sci. Rep. 6, 24522 (2016).
White, S. R. et al. Autonomic healing of polymer composites. Nature 409, 794 (2001).
Toohey, K. S., Sottos, N. R., Lewis, J. A., Moore, J. S. & White, S. R. Self-healing materials with microvascular networks. Nat. Mater. 6, 581 (2007).
Desmurget, M., Bonnetblanc, F. & Duffau, H. Contrasting acute and slow-growing lesions: a new door to brain plasticity. Brain 130, 898–914 (2007).
Majdandzic, A. et al. Spontaneous recovery in dynamical networks. Nat. Phys. 10, 34 (2014).
Podobnik, B. et al. Network risk and forecasting power in phase-flipping dynamical networks. Phys. Rev. E 89, 042807 (2014).
Podobnik, B. et al. Predicting the lifetime of dynamic networks experiencing persistent random attacks. Sci. Rep. 5, 14286 (2015).
Podobnik, B. et al. The cost of attack in competing networks. J. R. Soc. Interface 12, 20150770 (2015).
Majdandzic, A. et al. Multiple tipping points and optimal repairing in interacting networks. Nat. Commun. 7, 10850 (2016).
Council, N. R. et al. Disaster Resilience: A National Imperative (The National Academies Press, Washington DC, 2012).
Gao, J., Barzel, B. & Barabási, A.-L. Universal resilience patterns in complex networks. Nature 530, 307–312 (2016).
Ganin, A. A. et al. Operational resilience: concepts, design and analysis. Sci. Rep. 6, 1–12 (2016).
Linkov, I. & Trump, B. D. The Science and Practice of Resilience (Springer, 2019)
Pastor-Satorras, R., Castellano, C., Van Mieghem, P. & Vespignani, A. Epidemic processes in complex networks. Rev. Mod. Phys. 87, 925 (2015).
Wang, W., Tang, M., Stanley, H. E. & Braunstein, L. A. Unification of theoretical approaches for epidemic spreading on complex networks. Rep. Prog. Phys. 80, 036603 (2017).
de Arruda, G. F., Rodrigues, F. A. & Moreno, Y. Fundamentals of spreading processes in single and multilayer complex networks. Phys. Rep. 756, 1–60 (2018).
Barabasi, A.-L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207 (2005).
González, M. C., Hidalgo, C. A. & Barabási, A. L. Understanding individual human mobility patterns. Nature 453, 779–782 (2008).
Simini, F., González, M. C., Maritan, A. & Barabási, A. L. A universal model for mobility and migration patterns. Nature 484, 96–100 (2012).
Zhao, Z.-D. et al. Emergence of scaling in human-interest dynamics. Sci. Rep. 3, 3472 (2013).
Zhao, Z.-D., Huang, Z.-G., Huang, L., Liu, H. & Lai, Y.-C. Scaling and correlation of human movements in cyber and physical spaces. Phys. Rev. E 90, 050802(R) (2014).
Pappalardo, L. et al. Returners and explorers dichotomy in human mobility. Nat. Commun. 6, 8166 (2015).
Zhao, Y.-M., Zeng, A., Yan, X.-Y., Wang, W.-X. & Lai, Y.-C. Unified underpinning of human mobility in the real world and cyberspace. N. J. Phys. 18, 053025 (2016).
Yan, X.-Y., Wang, W.-X., Gao, Z.-Y. & Lai, Y.-C. Universal model of individual and population mobility on diverse spatial scales. Nat. Commun. 8, 1639 (2017).
Bratsun, D., Volfson, D., Tsimring, L. S. & Hasty, J. Delay-induced stochastic oscillations in gene regulation. Proc. Natl Acad. Sci. USA 102, 14593–14598 (2005).
Scalas, E., Kaizoji, T., Kirchler, M., Huber, J. & Tedeschi, A. Waiting times between orders and trades in double-auction markets. Phys. A 366, 463–471 (2006).
Vazquez, A., Racz, B., Lukacs, A. & Barabasi, A.-L. Impact of non-Poissonian activity patterns on spreading processes. Phys. Rev. Lett. 98, 158702 (2007).
Iribarren, J. L. & Moro, E. Impact of human activity patterns on the dynamics of information diffusion. Phys. Rev. Lett. 103, 038702 (2009).
Van Mieghem, P. & Van de Bovenkamp, R. Non-Markovian infection spread dramatically alters the susceptible-infected-susceptible epidemic threshold in networks. Phys. Rev. Lett. 110, 108701 (2013).
Jo, H.-H., Perotti, J. I., Kaski, K. & Kertész, J. Analytically solvable model of spreading dynamics with non-Poissonian processes. Phys. Rev. X 4, 011041 (2014).
Kiss, I. Z., Röst, G. & Vizi, Z. Generalization of pairwise models to non-Markovian epidemics on networks. Phys. Rev. Lett. 115, 078701 (2015).
Starnini, M., Gleeson, J. P. & Boguñá, M. Equivalence between non-Markovian and Markovian dynamics in epidemic spreading processes. Phys. Rev. Lett. 118, 128301 (2017).
Sherborne, N., Miller, J., Blyuss, K. & Kiss, I. Mean-field models for non-Markovian epidemics on networks. J. Math. Biol. 76, 755–778 (2018).
Feng, M., Cai, S.-M., Tang, M. & Lai, Y.-C. Equivalence and its invalidation between non-Markovian and Markovian spreading dynamics on complex networks. Nat. Commun. 10, 3748 (2019).
Valdez, L. D., DiMuro, M. A. & Braunstein, L. A. Failure-recovery model with competition between failures in complex networks: a dynamical approach. J. Stat. Mech. 2016, 093402 (2016)
Böttcher, L., Nagler, J. & Herrmann, H. J. Critical behaviors in contagion dynamics. Phys. Rev. Lett. 118, 1–5 (2017).
Böttcher, L., Luković, M., Nagler, J., Havlin, S. & Herrmann, H. J. Failure and recovery in dynamical networks. Sci. Rep. 7, 41729 (2017).
Keeling, M., Rand, D. & Morris, A. Correlation models for childhood epidemics. Proc. R. Soc. Lond. Ser. B 264, 1149–1156 (1997).
benAvraham, D. & Köhler, J. Mean-field (n, m)-cluster approximation for lattice models. Phys. Rev. A 45, 8358 (1992).
Mata, A. S. & Ferreira, S. C. Pair quenched mean-field theory for the susceptible-infected-susceptible model on complex networks. EPL 103, 48003 (2013).
Gross, T., D’Lima, C. J. D. & Blasius, B. Epidemic dynamics on an adaptive network. Phys. Rev. Lett. 96, 208701 (2006).
Ji, M., Xu, C., Choi, C. W. & Hui, P. M. Correlation and analytic approaches to co-evolving voter models. N. J. Phys. 15, 113024 (2013).
Zhang, W., Xu, C. & Hui, P. M. Spatial structure enhanced cooperation in dissatisfied adaptive snowdrift game. Eur. Phys. J. B 86, 196 (2013).
Zhang, W., Li, Y. S., Du, P., Xu, C. & Hui, P. M. Phase transitions in a coevolving snowdrift game with costly rewiring. Phys. Rev. E 90, 052819 (2014).
Choi, C. W., Xu, C. & Hui, P. M. Adaptive cyclically dominating game on co-evolving networks: numerical and analytic results. Eur. Phys. J. B 90, 190 (2017).
Harris, S. An Introduction to the Theory of the Boltzmann Equation (Courier Corporation, 2004)

Acknowledgements

The authors would like to thank Zhenhua Wang for helpful discussions. This work was supported by the National Natural Science Foundation of China (Grant Nos. 11975099, 11575041, 11675056 and 11835003), the Natural Science Foundation of Shanghai (Grant No. 18ZR1412200), and the Science and Technology Commission of Shanghai Municipality (Grant No. 14DZ2260800). Y.C.L. would like to acknowledge support from the Vannevar Bush Faculty Fellowship program sponsored by the Basic Research Office of the Assistant Secretary of Defense for Research and Engineering and funded by the Office of Naval Research through Grant No. N00014-16-1-2828.

Author information

Authors and Affiliations

State Key Laboratory of Precision Spectroscopy and School of Physics and Electronic Science, East China Normal University, Shanghai, 200241, China
Zhao-Hua Lin, Ming Tang & Zonghua Liu
Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai, 200241, China
Mi Feng & Ming Tang
School of Physical Science and Technology, Soochow University, Suzhou, 215006, China
Chen Xu
Department of Physics, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
Pak Ming Hui
School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ, 85287, USA
Ying-Cheng Lai

Contributions

Z.-H.L., M.T., and Z.H.L. designed research; Z.-H.L. performed research; Z.-H.L., M.F., M.T., Z.H.L., C.X., and P.M.H. contributed analytic tools; Z.-H.L., M.F., M.T., Z.H.L., C.X., P.M.H., and Y.-C.L. analyzed data; Z.-H.L., M.T., Z.H.L., P.M.H., and Y.-C.L. wrote the paper.

Corresponding authors

Correspondence to Ming Tang or Zonghua Liu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Communications thanks Francisco Rodrigues, Igor Linkov and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lin, ZH., Feng, M., Tang, M. et al. Non-Markovian recovery makes complex networks more resilient against large-scale failures. Nat Commun 11, 2490 (2020). https://doi.org/10.1038/s41467-020-15860-2

Received: 23 September 2019
Accepted: 26 March 2020
Published: 19 May 2020
DOI: https://doi.org/10.1038/s41467-020-15860-2
