Equivalence and its invalidation between non-Markovian and Markovian spreading dynamics on complex networks

Epidemic spreading processes in the real world depend on human behaviors and, consequently, are typically non-Markovian in that the key events underlying the spreading dynamics cannot be described as a Poisson random process and the corresponding event time is not exponentially distributed. In contrast to Markovian type of spreading dynamics for which mathematical theories have been well developed, we lack a comprehensive framework to analyze and fully understand non-Markovian spreading processes. Here we develop a mean-field theory to address this challenge, and demonstrate that the theory enables accurate prediction of both the transient phase and the steady states of non-Markovian susceptible-infected-susceptible spreading dynamics on synthetic and empirical networks. We further find that the existence of equivalence between non-Markovian and Markovian spreading depends on a specific edge activation mechanism. In particular, when temporal correlations are absent on active edges, the equivalence can be expected; otherwise, an exact equivalence no longer holds.

Because +∞ τ Ψ rec (τ )dτ is a monotonically decreasing function of τ in Supplementary Eq. (27), we have ω inf (τ ) = 0 and ω inf (τ ) = 1/ϑ (1) (0) while the integral +∞ τ Ψ rec (τ )dτ does not decrease to zero. Using the fact that the infection time distribution follows the distribution we have which is an exponential time distribution typical of a Markovian process and strictly follows the stationary state of Supplementary Eq. (25). If, for τ = τ 0 , the following integral vanishes, we then have ω inf (τ 0 ) = 0 and ω inf (τ 0 ) = 1/ϑ (1) (0). However, in this case the quantities ω inf (τ 0 ) and ω inf (τ 0 ) are not physically meaningful because an infected node has recovered when the value of the integral +∞ τ Ψ rec (τ )dτ decreases to zero. Taken together, the analysis enables us to draw the conclusion that, for SIS processes with type-I activation mechanism, only when the infection times are distributed exponentially will the whole process be equivalent to a Markovian process.

SUPPLEMENTARY NOTE 2. THEORETICAL ANALYSIS OF NON-MARKOVIAN SIS MODEL WITH TYPE-II ACTIVATION MECHANISM
We first derive the infection rate η(τ ) of infected node j of age τ . Define A l←i (κ, τ ; t) as the two-dimensional probability density function that the edge l ← i is active with age κ and the infection age of node i is τ . The probability density function that node i stays in the infected state aged τ at time t is given by It is worth noting that, for τ = 0, the following equality holds: At time t, the probability that the edge l ← i is active with age ranging from κ to κ+dκ and node i is in the infected state with infection age in the interval (τ, τ +dτ ) is A l←i (κ, τ, t)dκdτ . In this case, node i returns to the susceptible state with probability ω rec (τ )dτ . The evolution The probability that the edge l ← i transmits disease outward is ω inf (κ)dκ. The probability that neither recovery nor transmission occurs is 1 − ω rec (τ )dτ − ω inf (κ)dκ. The evolution equation of A l←i (κ, τ ; t) can then be written as Since an infected node i with infection age in the range from zero to +∞ can transition into the susceptible state aged zero insofar as recovery is possible, we obtain the probability density function that node i returns to the susceptible state of zero age as To describe the time evolution of S i (τ ; t), we assume that the ages of two connected nodes are uncorrelated, and thus Φ i←j (τ ; t) is the probability density function that node i in the susceptible state of age τ is infected by node j at time t. During the period (t, t + dt), a susceptible node i aged τ is infected by its neighbors with probability N j=1 a ij Φ i←j (τ ; t)dτ , Mechanism of probability flows with type-II activation mechanism. The magnitude of A l←i (κ, τ ; t) is represented by the color of the squares, where a darker color indicates a larger value. a At time t, there are two distributions, A l←i (κ, τ ; t) and S i (τ ; t). b Each square in A l←i (κ, τ ; t) loses probability densities ω inf (κ)dκA l←i (κ, τ ; t) and ω rec (τ )dτ A l←i (κ, τ ; t), and converges in the integrals c An amount of the density +∞ 0 τ 0 ω rec (τ )A l←i (κ, τ ; t)dκdτ flows into the susceptible state, i.e., S i (0; t + dt), an amount τ 0 ω inf (τ )A l←i (κ, τ ; t)dκ flows into the ordinate axis of A l←i (κ, τ ; t + dt), i.e., A l←i (0, τ + dτ ; t + dt), +∞ 0 S i (τ, t) N j=1 a ij Φ i←j (τ ; t)dτ flows into the infected state: A l←i (0, 0; t + dt). d A series of probabilities flows results in two distributions A l←i (κ, τ ; t = dt) and S i (τ ; t + dt) at time t + dt.
where N is the network size and a ij is the element of the network adjacency matrix. We thus have where Regardless of the value of the active age, from zero to ∞, of the edge l ← i in the active state, if it propagates the disease, it is still in the active state but of age zero. We thus have Since a susceptible node i with any susceptibility age ranging from zero to ∞ can transition into the infected state aged zero insofar as the infection process occurs, we obtain the probability density function that the edge l ← i becomes the active state aged τ = κ = 0 as Supplementary Fig. 1 illustrates the probability flow between A l←i (κ, τ ; t) and S i (τ ; t) and those in their respective interiors. To reduce the dimension of the evolution equations, we define which is the infection rate η(τ ; t) of an infected node j aged τ at time t. Equation (37) can be rewritten as From Supplementary Eq. (34), we get Because of the relations Ψ rec (τ ) = e − τ 0 ωrec(τ )dτ and Ψ inf (κ) = e − κ 0 ω inf (κ )dκ , we can rewrite Supplementary Eq. (42) as Equation (38) can then be rewritten as From Supplementary Eqs. (32) and (38) as well as the condition Ψ rec (0) = 1, we get Because of the relation Supplementary Eq. (46) can be written as Moreover, using the relations With the definition of η(τ ; t) in Supplementary Eq. (45), we get For ∀t ≥ 0, we have η(0; t) = ψ inf (0).
Solving the partial derivative of t for Supplementary Eq. (50), we get The partial derivative of t for η(τ ; t) at τ = 0 is thus zero: and η t (0; t − τ ) = 0. From Supplementary Eq. (52), we obtain the solution series: η t (dτ ; t − τ + dτ ) = 0, η t (2dτ ; t − τ + 2dτ ) = 0, . . . , η t (τ ; t) = 0. We thus have where η(τ ; t) is independent of t. Letting η(τ ) = η(τ ; t), we rewrite Supplementary Eq. (50) as Performing Laplace transform on both sides of the equation, we obtain the following integral equation: Finally, Supplementary Eqs. (34), (36), (38) and (39) can be reduced to: where When the system reaches a steady state, we obtain the differential equations of probability density functions of node i being in the susceptible and infected states as andĨ Solving these equations, we obtain Defining we can reduce the equation toĨ Combining Supplementary Eq. (56) and the definition of λ eff : and exchanging the integration order, we obtain whereψ Note that 1 2πi when we take the integration path as a contour of an infinite radius, i.e., s = re iθ , where r → +∞, − π 2 < θ < π 2 , and We thus have where C is a contour that encloses the entire Re(s) > 0 region [2].

SUPPLEMENTARY NOTE 3. SECOND-ORDER MEAN FIELD THEORY AND ERROR ANALYSIS
A. Second-order mean field theory For SIS type of spreading dynamics, there are three distinct types of mean field theories, in the order of increase in the prediction accuracy and computational complexity: (1) the classical mean field theory based on the assumption that all nodes are equivalent [3], (2) heterogeneous mean field theory in which all the nodes with the same degree are assumed to be equivalent to each other [4], and (3) quench mean field theory where the state changes of individual nodes are treated separately [5,6]. In these theories, the dynamical correlation between any pair of connected nodes is ignored, i.e., the states of any two connected nodes are regarded as independent. The three types of theories thus belong to first-order mean field analysis. To capture the dynamical correlation, a second-order approximation taking into account the evolution of the states of nodal pairs together is needed [7][8][9][10][11][12][13]. In fact, the pairwise approximation approach is a second-order theory, which regards nodal pairs with the same degree as equivalent [7][8][9]13]. The master equation approach deals with not only the state changes of the nodes with the same degree, but also the state changes of their neighbors. With approximations, it can be reduced to the evolution equations of pairwise approximation and mean field framework [14].
Here we develop a second-order mean field theory to deal with non-Markovian SIS dynamics. We first define the states of every pair of nodes. Specifically, for each pair of nodes, there are four different states: the II or AA state indicating that both nodes have been infected, the SI or SA state denoting that one node is susceptible and the other has been infected, the IS or AS state specifying that one node has been infected and the other is susceptible, and the SS state signifying that both nodes are susceptible. Note that the nodal pair states II , SI , IS are for type-I edge activation mechanism, and AA , SA , AS are for type-II edge activation mechanism.
The computational complexity required by the second-order mean field theory can often be prohibitively high. To make an analysis feasible, we restrict our study to the homogeneous case where the nodal pairs are equivalent in the Erdös-Rényi (ER) type of random networks. For type-I edge activation mechanism, we define II (τ 1 , τ 2 ; t) as the probability density function that one node in a pair is in the infected state with the age of τ 1 and the other node is in the susceptible state with the age of τ 2 at time t. Similarly, the probabilities SI (τ 1 , τ 2 ; t), IS (τ 1 , τ 2 ; t), and SS (τ 1 , τ 2 ; t) can be defined. The evolution equations of the pair node states for type-I activation are Equations (79) For type-II edge activation mechanism, we define AA (κ 1 , τ 1 , κ 2 , τ 2 ; t) as the probability density function that a directed edge from an infected node with age κ 1 to an infected node with age κ 2 is in the active state at age κ 1 , while the inverted edge is in the active state with age κ 2 . We define the probability density functions SA (τ 1 , κ 2 , τ 2 ; t), AS (κ 1 , τ 1 , τ 2 ; t), and SS (τ 1 , τ 2 ; t) accordingly. The evolution equations of a nodal pair states can be written as Equations (98)-(101), respectively, describe that in the AA nodal pair state, disease transmission from an infected node with age τ 1 and directed edge age κ 1 to an infected node of age τ 2 and directed edge age κ 2 converts the nodal pair into the AA state with κ 1 = 0, the recovery of an infected node with age τ 1 transforms the pair into the SA state, disease transmission from the infected node with age τ 2 and directed edge age κ 2 to the infected node of age τ 1 and directed edge age κ 1 turns the pair into the AA state with κ 2 = 0, and the recovery of the infected node with age τ 2 places the pair into the AS state. Similarly, Supplementary Eqs. (102)-(105), respectively, indicate that in the SA state, the susceptible node is infected by its neighbors other than the infected node aged at τ 2 so that the pair is turned into the AA state, the susceptible node is infected by the infected node of τ 2 and so the pair gets into the AA state, and recovery of the infected node aged at τ 2 places the pair into the SS state. Equations (106)-(109) indicate, respectively, that in the AS state, disease transmission from an infected node with age τ 1 and directed edge age κ 1 to a susceptible node of age τ 2 changes the pair into the AA state, recovery of the infected node with age τ 1 transforms the pair into the SS state, and the susceptible node is infected by its neighbors other than the infected node and thus the nodal pair transitions into the AA state. Equations (110), (111), and (112) mean, respectively, that for the SS state pair, the susceptible node aged at τ 1 is infected and thus the pair moves into the AS state, and the susceptible node aged at τ 2 is infected and so the pair changes into the SA state.
To verify the accuracy of the second-order mean field theory, we set the infection time and recovery time distributions to be the Weibullean type. Supplementary Fig. 2 shows that the theory is capable of predicting the simulation results more accurately than the first-order theory for ER random networks, even for the situation where the disease has a high decay rate [e.g., α I = 4 in Supplementary Figs. 2(c) and 2(d)].

B. Error Analysis
Our development of the second-order theory, due to its relatively high accuracy, enables an analysis of the source of errors in the first-order mean field theory through a comparison between the results from the first-order and the second-order theories. (For a network with N nodes, the most accurate description is the N th-order mean field theory, but it is practically infeasible to analyze theories with order higher than two.) Because the dynamical correlation between any nodal pair is completely ignored in the first-order theory, the error analysis enables an understanding of the effects of dynamical correlation on non-Markovian dynamics with respect to different edge activation mechanisms.
(120) a,b Time evolution for type-I and type-II mechanism, respectively. c,d Extinction process with α I = 4 for type-I and type-II mechanism, respectively. In all panels, the solid symbols represent the simulation results averaged over 100 realizations on ER random networks, the open symbols represent the results of theoretical solutions obtained from the first-order mean field theory, and the half solid symbols correspond to the theoretical results from Supplementary Eqs. (75)-(113) in the second-order theory. Diamonds, circles, triangles, and stars are for α I = 0.5, 1, 2, 4, respectively. The random networks have size N = 10 4 and mean degree k ≈ 10. Other parameters are β I = 1, α R = 2, and β R = 0.5.
For SI (τ 1 , τ 2 ; t), we can obtain its evolution equation as = S(τ 1 + dt; t + dt)I(τ 2 + dt; t + dt) − S(τ 1 ; t)I(τ 2 ; t) dt =S(τ 1 + dt; t + dt) I(τ 2 + dt; t + dt) − I(τ 2 ; t) dt + I(τ 2 ; t) S(τ 1 + dt; t + dt) − S(τ 1 ; t) dt For type-I edge activation mechanism, we can get another form of the first-order mean field theory as For type-II edge activation mechanism, we have  135) and find that the quantity k − 1 is replaced by k . This means that the first-order mean field theory attempts to include the non-existent process of infection between two susceptible nodes, which is another source of error.  (152), and find that k − 1 in the latter is replaced by k in the former, which means that the first-order mean field erroneously takes into account the non-existent infection process between two susceptible nodes.
The error analysis indicates that the two activation rules have distinct effects of dynamical correlation on active edges. For type-I edge activation mechanism, the errors in the firstorder mean field theory mainly result from the temporal correlations on active edges, while for type-II edge activation mechanism, the errors are caused by the causality between disease transmission through the active edge and infection of the susceptible node.

SUPPLEMENTARY NOTE 4. EFFECT OF INFECTION TIME DISTRIBUTION ON TRANSIENT TIME
The effect of the inter-event time distribution on the speed of spreading dynamics is an issue of interest [15][16][17][18][19]. Because the time that the system has reached a steady state cannot be sharply defined, we set the empirical rule to determine the transient time T half as the time at which the infected density reaches the average value between the initial density I(0) and the steady-state densityĨ. Supplementary Fig. 3 shows that a narrower distribution of the inter-event times, which corresponds to a smaller value of α I , makes it easier for the system to reach a final steady state.

SUPPLEMENTARY NOTE 5. EFFECT OF DEGREE DISTRIBUTION ON TRAN-SIENT TIME
In general, the network topology can affect the spreading speed as well [20][21][22][23][24][25]. To address this issue, we investigate the relationship between transient lifetime and the degree distribution. To be concrete, we study the transient time on scale-free networks constructed from the uncorrelated configuration model [28]. Supplementary Fig. 4(a) shows that a smaller value of the power-law degree exponent γ enables the spreading dynamics to reach its steady state faster. The lower-cutoff degree D low of the degree distribution can also affect the transient lifetime, as shown in Supplementary Fig. 4(b), where a larger value of D low can also expedite the spreading dynamics in approaching the final steady state. These results indicate that effects of degree distribution on transient time for non-Markovian spreading dynamics are consistent with those for Markovian dynamics.  Figure 4. Effect of network degree distribution on transient lifetime. Shown is the transient time versus two specific network structural parameter: power-law degree exponent γ and lower-cutoff degree D low . In general, a smaller value of γ corresponds to the network with more hub nodes. a-b Results for SIS dynamics with respect to γ and D low , respectively. The solid and open symbols represent simulation and theoretical results, respectively. Diamonds and circles are for type-I and type-II activation, respectively. In a, the value of D low is fixed at five, while that of γ is set to be 2.3 in panel b. The networks have size N = 10 4 and the structural cutoff is characterized by the maximum degree k max ∼ N 1/2 . Other parameters are α I = 2, β I = 1, α R = 2, and β R = 0.5.

SUPPLEMENTARY NOTE 6. STATIONARY PROBABILITY DENSITY DIS-TRIBUTIONS
To validate our proposed first-order mean field theory for non-Markovian processes, we examine the stationary probability density functionsĨ(τ ) andS(τ ) defined as and compare the theoretical predictions with the simulation results in Supplementary Fig. 5. Especially, we count the numbers of infected and susceptible nodes at every discrete state age (at the age difference ∆τ = 0.001) when the system reaches a steady state, and divide the numbers by ∆τ × N to get the simulated stationary probability density. The theoretical predictions agree with the simulation results for random, scale-free, and Hamsterster networks. The difference between SIS models with type-I and type-II edge activation mechanisms is that the results ofS(τ ) for the former decrease more smoothly than the latter in the region of small τ values for α I = 2, 4, indicating that newly recovered nodes in the latter are more susceptible to infection than the ones in the former. For α I = 2, 4, the infection rate increases with the age of the active edge. By type-II edge activation mechanism, the ages of the active edges depend only on the infected nodes and thus may be greater than those in type-I edge activation, resulting in some greater infection rate. For α I = 1, the Weibull distribution reduces to an exponential distribution. In this case, the non-Markovian SIS with type-I activation mechanism is completely equivalent to a Markovian SIS process. We set For α R = {0.5, 1, 2, 4} and β R = 0.5, we adjust the value of λ eff through changing β I , where and Γ is the gamma function. In Supplementary Figs. 6(a,b), we present results from Monte Carlo simulations for the three types of networks. The theoretical results from the Markovian process are obtained from a first-order mean field analysis. For all values of λ eff (or β I ), there is a good agreement between the infected density for α I = 1 with the theoretical prediction of the Markovian model.

SUPPLEMENTARY NOTE 8. SPECIAL CASE II: NON-MARKOVIAN SPREAD-ING ON STRONG HETEROGENEOUS NETWORKS
We test our theory on scale-free networks with a strong heterogeneous degree distribution, where the values of the network parameters are: size 10000, power-law degree exponent 2.3, lower cutoff degree being five, upper cutoff degree being about 100, and average degree approximately 11.9. A network in this ensemble has a large number of hub nodes. As shown in Supplementary Fig. 7, even in this relatively extreme case, our theory is able to predict the simulation results. Furthermore, when the spreading has reached the steady state, the approximate equivalence between non-Markovian and Markovian dynamics for both type-I and type-II activation mechanisms holds. Diamonds, circles, triangles, and stars correspond to α I = 0.5, 1, 2, 4, respectively. Other parameters are β I = 1, α R = 2, and β R = 0.5. c,d Approximate equivalence between non-Markovian and Markovian processes with type-I and type-II activation mechanism, respectively. Solid, dashed, dotted, and dot-dashed curves, respectively, represent the results for α R = 0.5, 1, 2, 4. Triangles and circles correspond to results from the Markovian process and analytical solutions, respectively. Other parameters are α I = 1 and β R = 0.5.

SUPPLEMENTARY NOTE 9. SPECIAL CASE III: NON-MARKOVIAN SPREAD-ING ON NETWORKS WITH TYPE-III EDGE ACTIVATION MECHANISM
Besides the two edge activation mechanisms in the main text, other types of edge activation mechanisms exist. For example, rule #2 in Ref. [26] prescribes that the age of an active link is solely determined by the age of the infected node [27], which is similar but not totally identical to type-II edge activation treated in the main text. For convenience, we call this rule as type-III edge activation mechanism.
Because of the absence of any temporal correlation on active edges in type-III activation, the evolution equations are similar to those for type-II edge activation [Supplementary a At t 0 = 0, node i is infected, and the active age is zero, so the infection rate is ω inf (0) = 0. b At t 1 = 1, the active edge transmits disease and node j moves into the infected state, but the age of the active edge from node i to node j remains unchanged. In this case, the infection rate is ω inf (1) = 4. c At time t 2 = 2, node j has recovered but node i is still in the infected state. The age of the active age is two. In this case, the infection rate is relatively large: ω inf (2) = 32. As a result, node i will make node j infected again in a short time.
Equations (63)-(68) suggest an equivalence between SIS dynamics with type-II activation and Markovian SIS dynamics, the effective infection rate can be obtained as We can then conclude that SIS dynamics with type-III activation is equivalent to Markovian SIS dynamics with the effective infection rate given by To verify the above analysis, we simulate the dynamical processes and compare results with theoretical predictions. Supplementary Fig. 8 shows the time evolution of the infected density of the entire network for three different types of networks. We see that the meanfield predictions agree well with those of non-Markovian type of SIS spreading dynamics. We also find that a smaller value of α I results in a larger scale outbreak on the networks, which is consistent with the cases of type-I and type-II edge activation mechanisms.
We further test the equivalence between non-Markovian and Markovian spreading process with type-III activation mechanism. Representative results are shown in Supplementary  Fig. 9. It can be seen that, regardless of the network structure, the stationary infected density for α I = 0.5, 1.0, 2.0 agrees well with results from the simulation of Markovian process and from the analytical solution of the Markovian dynamics from Eq. (24) or Eq. (27) in the main text. However, the curve for α I = 4 deviates markedly from the case of Markovian dynamics. To explain this phenomenon, we note that the infection rate is while the infection times follow a Weibullean distribution: For α I > 1, ω inf (κ) increases with κ. A greater value of α I results in faster growth of ω inf (κ) with κ. Since the age of an active edge cannot be zero after it transmits the disease, the value of the infection rate ω inf (κ) will keep increasing until the corresponding infected node recovers. Supplementary Fig. 10 shows an example of how the infection rate of an active edge increases sharply with κ. The persistent enhancement effect of infection rate with time makes it harder for the mean-field theory to capture the dynamical correlation on the active edge, generating deviations between the simulation result for α I = 4 from the mean-field prediction.