Simulating SIR processes on networks using weighted shortest paths

We present a framework to simulate SIR processes on networks using weighted shortest paths. Our framework maps the SIR dynamics to weights assigned to the edges of the network, which can be done for Markovian and non-Markovian processes alike. The weights represent the propagation time between the adjacent nodes for a particular realization. We simulate the dynamics by constructing an ensemble of such realizations, which can be done by using a Markov Chain Monte Carlo method or by direct sampling. The former provides a runtime advantage when realizations from all possible sources are computed as the weighted shortest paths can be re-calculated more efficiently. We apply our framework to three empirical networks and analyze the expected propagation time between all pairs of nodes. Furthermore, we have employed our framework to perform efficient source detection and to improve strategies for time-critical vaccination.


I. INTRODUCTION
Spreading processes [1][2][3], like the spread of infectious diseases, information, or failures, naturally occur in socio-technological systems.A broad class of these systems can be described as networks, whose connectivity patterns have been shown to be heterogeneous [4].Whereas infinite time properties are well understood [2,3], the actual dynamical properties of spreading processes on complex networks still lacks understanding.In order to exactly solve the spreading process one must be able to evaluate the evolution of the process configurations.The evolution of the probabilities of configurations is given as a solution to a set of coupled differential equations called master equation [5,6], which cannot be solved directly in most cases.Therefore, different approximations are used: (i) neglecting dynamical correlations by assuming statistical independence at first neighbourhood, pair or node level in mean field approximation [7][8][9][10], (ii) neglecting loops in the network structure with message passing or belief propagation techniques [11][12][13], or (iii) neglecting the temporal evolution by mapping to the percolation theory [14][15][16].
In this paper, we propose how to exactly transform spreading dynamics to a shortest path problem on an ensemble of weighted networks using a Monte Carlo approach.In this framework, weights represent interaction time delays along edges.We are able to extract a temporal distance measures that describes the expected time the spreading process needs to propagate from one node to another.The mapping is applicable to the generalized Susceptible Infected Recovered (SIR) model [17] without memory (exponential inter-event distribution/Poisson process) and with memory (arbitrary inter-event distributions).Contrary to the message passing methods [11][12][13], our method does not neglect loops in the network structure.We show the equivalence between the time respecting paths (shortest paths) on the constructed weighted networks and the propagation times of spreading dynamics.Our results provide new insights into the global and local spreading dynamics on complex networks and have direct applications for source detection and time-critical vaccination.Finally, in a limit of process time, we establish the connection with bond percolation theory [14,15,18,19].

II. MAPPING OF SPREADING DYNAMICS
Let us start with the Poissonian SIR model, which is memoryless, such that the next state only depends on the present state of the system.This process is defined by two parameters (β, γ), which describe transitions where susceptible nodes become infected with rate β from infected neighbours and transitions where infected nodes recover with rate γ.This process has exponential inter-event time distributions: ψ(t) = βe −βt (spread) and φ(t) = γe −γt (recovery).However, many realistic processes have memory [20] and their inter-event time distributions are non-exponential.Therefore, we allow for arbitrary inter-event density distributions (ψ(t), φ(t)) for generalized SIR processes.For a given network G and class of generalized SIR spreading models, we show how to create weighted networks {G k } which encode realizations of the stochastic spreading dynamics.Each weighted network G k (see Fig. 1a) encodes one possible outcome of the spreading process from every potential source node in a network.In particular, a time-respecting weighted network instance G k is created by taking the input network G and assigning weights to the edges of the network instance with the Inverse Smirnov transform [21] FIG. 1.
Temporal distance of spreading dynamics.as where x and y are uniform random numbers ∈ [0, 1], Φ −1 (x) and Ψ −1 (y) are inverse functions of the cumulative inter-event distributions Φ(t) = t 0 dt φ(t ) and Ψ(t) = t 0 dt ψ(t ).The quantities Ψ −1 (y) and Φ −1 (x) respectively represent the samples of the transmission and recovery time obtained with the Inverse Smirnov transform of inter-event distributions.In the special case of the Poissonian SIR process one obtains If nodes recover faster than a transmission occurs, the weight is set to infinity to indicate no transmission through the edge.
We denote the distance as shortest paths on weighted networks, , where χ ij is the set of all possible paths from node v i to node v j on network G k and ρ k,l denotes the weights defined in Eq. (1).Importantly, this distance is equivalent to the propagation time from node i to j, i.e.Instances of the ensemble of weighted networks can be generated independently, or by traversing elements of the ensemble with Markov Chain.The Markov Chain (rejection-free Gibbs sampler -see Supplemental Material sec.2) consists of transitions W i,j = w(G i → G j ) over the set of weighted networks {G k }, where each weighted network corresponds to one realization of the spreading process from every potential source node.This allows us to construct transitions between weighted networks by changing the weights on a randomly selected edge by assigning a new weight according to Eq. (1) (see Fig. 1b).The existence of a stationary distribution of this Markov Chain is guaranteed by detailed balance property and the uniqueness by ergodicity (see Appendix B).After each weight change, all to all shortest paths can be dynamically recalculated [24,25] with computational complexity O N 2 log 3 N , where N denotes the number of nodes in a network (see Supplemental Material, Sec. 4 for more details).In other words, the Gibbs sampling-as well as independent sampling-constructs the probability distribution P (G) of the weighted network ensemble, such that it is a statistically exact representation of the stochastic process.
From the ensemble of weighted networks, due to ergodicity, the expectation f (G) can be estimated as the average over n samples of weighted networks: By changing functions f (G), we estimate different properties such as total outbreak size, temporal evolution, expected propagation time or source likelihood (see Appendix C).

III. TEMPORAL MAPS -FROM NODE TO NETWORK PERSPECTIVE
Starting from a given node, what is the expected time after which the disease reaches each other node in the network?The answer to this question is encoded in the ensemble of weighted networks.For SI processes (γ = 0), the expected propagation time is finite and can be estimated as the average over n samples of weighted networks as , v j ) denotes the shortest path.In Fig. 1c, we show an explicit example from the perspective of a specific source node.The advantage of the temporal maps we have created is they encode this type of information for all possible source nodes.The result is shown in Fig. 2 for three empirical networks: the email network [22], the airport network [26], and the Petser online social network [23].The plots show the average temporal distance or expected propagation time D ij for each pair of nodes i, j, and therefore encompass information about the expected spreading dynamics.In particular, we have ordered the nodes by their characteristic spreading timescale, which we define as the expected time after which a given node infects half of the network, τ i , as where Θ(•) denotes the Heaviside step function, which is one when t ≥ D ij and zero otherwise.A small value of τ i means that node i is of high importance or influence.For each observed realization, we rank nodes according to their estimated probability score according to the Direct Monte Carlo method [28], in case of a tie we prioritize the node with a higher estimator according to the temporal distance method.We compare different methods as explained in the text: Direct Monte Carlo source probability ranking, source detection ranking by topological distance [29], and the estimation from our mapping (temporal distance).Results are averaged over 30 observed realizations and error bars show one standard deviation, from top to bottom.
In Fig. 2 we observe that, apart from local fluctuations, the average global picture seems to follow a trend which in the figure corresponds to a smooth gradient of the color.We visualize this behavior with the contour lines that are superposed in the figure, and which we created by generating a coarse grained version of the matrices D ij [27].The contour lines therefore separate regions in which a pair of nodes with the characteristic spreading timescales τ and τ has, on average, a temporal distance lower than some value D * .We take the coincidence of this coarse grained measure with the microscopic temporal distances as evidence for the appropriateness of the definition of the characteristic spreading timescale.
The question how the average propagation time between two individuals scales with the system size is crucial for many real applications and of special theoretic interest.In particular, if this distance scales logarithmically with the system size, an epidemic can spread nearly instantaneously in arbitrary large networks.However, if it scales like N σ , in large enough systems the disease will not spread through the network in realistic timescales.We can address this question for the SI-model (γ = 0) in an elegant fashion by establishing an analogy to the optimal path problem in disordered networks [30][31][32], which are generated by assigning weights ρ to edges drawn from some distribution f (ρ).In [30] the authors have shown that the optimal path length [33] in these weighted networks scales differently with the system size depending on f (ρ), and distinguish between strong and weak disorder.For the example of Erdos-Renyi networks, in the case of strong disorder the optimal path length scales as d ∝ N 1/3 , and for weak disorder as d ∝ log N .The temporal distance we have defined earlier is equivalent to the optimal paths.Therefore, we can use this analogy to study the scaling of the average propagation time with the system size.From [30] it is know that if f (ρ) is taken as the exponential distribution, only weak disorder can occur, and the average distance scales with the logarithm of the systems size for Erdos-Renyi networks.This means that for the memory-less spreading process, where ψ is the exponential distribution, the average propagation time scales with the logarithm of the system size and diseases spread quickly in systems of any size.However, if the inter-event distribution is for instance given by a lognormal distribution (non-Markovian case), we know from [30] that strong disorder can occur such that the average distance and hence the average propagation time scales like d ∝ N 1/3 .In this case, the finite time properties of the spreading process are drastically different to the non-Markovian case, as the average propagation time becomes very large as the size of the system increases, such that the disease will not spread globally in realistic timescales.
To conclude, although it has recently been shown that the infinite time steady states of non-Markovian processes can be mapped to Markovian ones [34], their finite time dynamics can be dramatically different.

IV. APPLICATIONS
Our framework provides a series of direct applications.Here, we discuss three possible applications: computationally efficient source detection, time critical vaccination, and node influence ranking.

A. Source detection
For a given snapshot r * of a spreading process at time t 0 on a network, how does one detect the source node that generated the snapshot?Source detection [12,13,26,28,35,36] is an important theoretical and practical problem in network science.It is at least as hard as any problem in NP-complete class [35,37], because it belongs to #P-complete class [38] and we approach it by probabilistic kernel estimations [28,39] of the likelihood that source node generated the observed snapshot.Our framework can be used to perform this task, as the ensemble of weighted networks encompasses information about different realizations of the spreading process from all possible sources.
For the observed realization r * , our source probability estimates for every node v i are constructed from ensemble averages s i ∝ m j f (G j ).From weighted network G j , we extract realization based on temporal distances and calculate the similarity ϕ j ∈ [0, 1] to the observed realization r * .Similarity ϕ j is the normalized number of equal corresponding states between two realizations.Finally, we use the Gaussian kernel function for probability estimations [28,39] (see Appendix C).We compare our source probability estimations with direct Monte Carlo approach [28] and topological distance approach [29].The direct Monte Carlo approach generates large number of simulations n = 10 6 −10 8 , from each potential source node v i , and source probability estimation s i ∝ n i , where n i is the number of simulations that match the observed realization r * .The topological distance approach calculates the average topological distance d i from each potential source v i to other infected nodes in the observed realization r * and source estima- The result is shown in Fig. 3a, where source ranked probability estimates for different methods are shown.Indeed, the method based on the ensemble of weighted networks (n = 10 5 instances) performs better than a topological distance estimation [29] and correctly assigns a high probability to the source candidate identified by the direct Monte Carlo method [28].

B. Time critical vaccination
Now, let us consider the following vaccination scenario.We observe at time t 0 the outbreak of a disease and we have a certain amount, m, of vaccines (see Fig. 4a).We can vaccinate m individuals instantaneously, but the vaccines will only be effective after some time ∆t (a vaccinated individual can get infected before t 0 + ∆t, and in this case this vaccine portion is wasted, see Fig. 4b).Who shall we vaccinate?Our framework can be used to improve the vaccination strategy.The idea is that, at time t 0 , we vaccinate only individuals who-with high probability-will not get infected before t 0 + ∆t.Therefore, we estimate the probability that a node i will not be infected before t 0 + ∆t as where θ denotes the source node and Θ(•) is one when d G k (v θ , v i ) ≥ t 0 + ∆t and zero otherwise.We vaccinate nodes proportional to p.In Fig. 4b we show using Petster network that this procedure performs better than randomly choosing from all susceptible nodes at t 0 .Interestingly, vaccinating proportional to degree (hubs strategy) performs even worse than the fully random strategy, since the hubs usually are infected earlier, which increases the chance to waste vaccine portions.

V. EQUIVALENCE TO BOND PERCOLATION
In the following, we state the connection to bond percolation theory.From the perspective of source node The full proof is given in the Appendix A. Hence, |S p (v i )| denotes the size of the connected bond percolation component, where the transmissibility parameter [18] from source v i is given by The transmissibility quantifies the probability that an infected node transmits infection along a link to a susceptible neighbour before it recovers.The second integral accounts for the conditional probability of transmission up to the fixed recovery time τ , and the first integral account for all possible recovery times τ .Furthermore, to fix dynamical correlations (β ≈ γ) that were missing in [18], we generalize and formalize the transmissibility for the first neighborhood or, more precisely, the probability p n,k that k out of n directed links will be active where Ψ(τ ) = τ 0 ψ(t)dt.For Poissionan process this becomes The generalized transmissibility for the first neighborhood (10) establishes the connection to semi-directed bond percolation [19] (see Appendix D).To have exact mapping of the SIR process to percolation, having a single percolation parameter p is no longer accurate when β ≈ γ and generalized transmissibility p n,k for the first neighborhood is needed to capture all dynamical correlations.

VI. DISCUSSION
We have mapped spreading dynamics on complex networks to an ensemble of weighted networks.This mapping allows us to define a temporal distance as shortest paths between nodes on an ensemble of weighted networks.We concentrate on the stochastic formulation of the generalized continuous and discrete time Susceptible Infected Recovered (SIR) spreading dynamics without memory (exponential inter-event distribution) and with memory (arbitrary inter-event distributions).
Specifically, we overcome the limitations of previous methods such as message passing, mean-field approximations and kinetic Monte Carlo methods.In contrast to message passing, our framework is applicable for arbitrary network structures without neglecting loops in the network structure.Furthermore, our method takes into account dynamical correlations between node states overcoming the assumptions of mean-field like approximations [8,10].In contrast to the well known historical Gillespie or kinetic Monte Carlo methods [20,28,[40][41][42][43][44], we do not have to specify initial conditions upfront and we can sample new realizations from the previous ones by making local random perturbations on the weighted networks.The average temporal distance provides insights into the local and global dynamical properties of the spreading process.We have shown that for some non-Markovian processes, this average propagation time scales as a polynomial with the system size, even if the underlying network is small world.In stark contrast to the Markovian case, this means that for large enough systems the disease does not spread globally in a realistic timescale.We have shown that our framework allows for direct applications like source detection, time-critical vaccination, and a ranking of nodes according to their importance for the spreading dynamics.Combining our mapping framework with existing vaccination strategies is likely to further improve the results, especially when time is a crucial factor.
Note that previous studies [26,[45][46][47] for estimating the spreading arrival times were concentrated on a global disease spread with different approximations and theoretical understanding of effective distances [48] for metapopulation models.However, the exact mathematical equivalences for mapping generalized spreading dynamics at a microscopic level were still lacking.Furthermore, in a limit of process time, we establish the connection with bond percolation theory.Previously the outcome in in-finite time of the SIR process have been successfully mapped to the percolation [14,15,18,19] but not for the finite time.Which bridges the gap between inference problems and percolation theory.Contrary to previous state-of-the-art source inference studies [12,13,28], estimations with our method require same computational effort to sample realizations at an arbitrary point in time t and conduct inference not only for discrete time but also for continuous time dynamics.Finally, our mapping to weighted networks naturally provides a dynamical interpretation of the optimal paths problem in disordered networks [30,31].

APPENDIX A: BOND PERCOLATION LIMIT EQUIVALENCE
Here, we prove that, by letting t go to infinity, we obtain a realization equivalent to the normal bond percolation realization, to which we refer as bond percolation limit equivalence: |S p (v i )| denotes the size of the set of nodes v k that are reachable by any finite shortest path length from the source node v i .The term |S p (v i )| denotes the size of the connected bond percolation component with transmissibility p parameter from a source v i [18].
Next, we show that our mapping establishes links with finite length ρ i,j < ∞ with the probability which is equal to the transmissibility p [18]: In our mapping, the link with finite length ρ i,j < ∞ is formed when −ln(x)/β ≤ −ln(y)/γ, for x and y as uniform random numbers ∈ [0, 1].
where f x,y (x, y) is joint density function of variables x and y.As they are independent, f x,y (x, y) = f x (x)f y (y).
As both x and y are uniform random numbers ∈ [0, 1], their densities are which is equal to transmissibility parameter p [18].Therefore, we conclude that both bond percolation and time augmented bond percolation are the equivalent stochastic processes for the formation of edges when the finite edge weights are taken into account.
Note that graphs G have continuous non-negative real weights and that the density function for weight ρ of a specific link is ∞ 0 f (τ, ρ)dτ , where f (τ, ρ) is the joint density for recovery time τ and transmission time ρ: where 1 [0,τ (ρ) is the identity function which is equal to one on a range [0, τ and δ(t − ∞) is the Dirac delta distribution.Before proceeding, we will use the discrete approximation of the non-negative weights i.e. density functions φ(τ ), ψ(ρ), f (τ, ρ) become probability mass functions g φ (τ ), g ψ (ρ), g φ,ψ (τ, ρ).This approximation essentially tells us that we use the finite number of possible non-negative weights to approximate density distributions, which is the case for all numeric simulations of continuous functions.This enables us to use the formalism of Markov Chain for discrete state space.
By design, the Markov Chain has stationary distribution P(G), which is equal to the probability that this graph is generated by our mapping: The probability of a particular instance weighted graph due to the independence is a product of probabilities over weights on edges generated by our mapping.The probability of having a weight τ i,j is g φ (τ i,j ) and probability of having a weight ρ i,j for a given τ i,j is g ψ|φ (ρ i,j |τ i,j ).

Convergence to stationary state
The existence of stationary distribution of the Markov Chain in discrete time and discrete states is guaranteed by detailed balance property and the uniqueness by ergodicity.
Next, we show why the detailed balance holds if we choose the transition P (G i → G j ) such that the weight of the randomly selected edge (i * , j * ) is changed by generating new weight with our mapping.
We write the detailed balance condition: The probability of generating the particular graph G i is: where ρ i k,l denotes the transmission weight on edge (k, l) in G i and τ i k,l denotes the recovery time for edge (k, l) in G i .Similarly, for graph G j we can write the probability P (G j ).Now, the transition between G i and G j happens by selecting the random edge (m * , n * ) with uniform probability c and changing the weight on it.
We can extract the edge (m * , n * ) in G i from equation (17): and similarly for graph G j : Now, the ratio is since the graphs G i and G j share all the weights except on the randomly selected edge (m * , n * ).Next, we see that this ratio is equal to the ratio of transitions rates : and we conclude that the detailed balance holds.

Ergodicity
The second thing that needs to be proved is ergodicity.We know, that Markov Chain is ergodic if it is irreducible, aperiodic and positive recurrent.A finite state Markov chain is irreducible and positive recurrent if there's a finite number N such that any state can be reached from any other state in exactly N steps.To prove this, we consider two generic states G i and G j and show that there is a positive probability for reaching state G j from state G i in finite number of steps.Any state G j can be reached from state G i with the transition path where weights are changed link by link.Since number of distinct link in network is finite so it is the number of transitions N .As the transitions are independent the probability P (G i → G j ) is equal to the product ), which is positive since every P (G k → G k+1 ) is positive.Every product is positive as the weights in G come from the distributions ψ and φ.As the probability P (G i → G i ) > 0 that means that the Markov Chain is also aperiodic.

APPENDIX C: INFERENCE VIA SAMPLING
Now we state the following theorem from Markov Chain theory [49].Theorem Let {G 1 , ..., G n } be random variables distributed according to the Markov Chain that satisfies the detailed balance and ergodic property with initial condition G 0 and let f (G) be any real valued scalar function, then the following holds: • The probability distribution G n converges to the stationary distribution: • Time average converges to the average over stationary distribution almost surely where f (G) function is used for estimation of different properties over graph G.
We list different estimation functions that can be used with time augmented bond percolation mapping: • Expected total outbreak size from node v i can be estimated with the following function: where Θ (x) is the Heaviside step function, which equals to 1 when x > 0 i.e. d (v i , v j ) < ∞ and 0 otherwise.
• Expected evolution of number of infected nodes at time t from node v i can be estimated with the following function: where Θ(x) is the Heaviside step function, which equals to 1 when x ≥ 0 i.e. t ≥ d (v i , v j ) and 0 otherwise.
• The expected distance time for the spreading to propagate from node v i to node v j can be estimated with the following function: • The estimation of microscopic configuration e.g.probability of specific realization described by the set of infected nodes R * (observation) from specific source node v i can be estimated with kernel density estimator [28,39]: where ϕ j is the similarity function between specific realization R * and estimated realization from weighted graph G from source node v i and a is the kernel width parameter.The similarity ϕ j is calculated as ,where r c * denotes the complement i.e. set of noninfected nodes and N is the normalization constant i.e. total number of nodes in a network.Function Θ(x) equals to 1 when t ≥ d (v i , v j ) and 0 otherwise.Function Θ (x) equals to 1 when d (v i , v j ) > t and 0 otherwise.
• Estimations based on averages of i.i.d.samples have convergence rate O(n −1/2 ), where the Berry-Esseen theorem gives bounds on the constants for convergence.Main advantage of using the kernel estimators [39] is that they have a faster convergence rate.They have physical analogy to using the information potential field to non-parametrically estimate realization probability densities.
• Finally, we do not need to set the Kernel width parameter a in advance.We can choose the parameter a as the infimum of the set of parameters for which the estimations have converged [28].

APPENDIX D: SEMI-DIRECTED BOND PERCOLATION
Previously, the SIR dynamics without memory has been mapped [19] to the semi-directed bond percolation networks.However, still this mapping was only describing the asymptotic dynamics (t = ∞).In this section, we formalize this mapping with generalization of transmissibility from the link level to the neighbourhood level in order to make connection with the semi-directed bond percolation.Furthermore, we show that in our mapping we only need to change small adjustments so that it becomes exact mapping with weights.

Newman transmissibility for the link
Let us look at the continuous time SIR model, where infected node transmits the disease to susceptible node at an average rate β and infected nodes recover at the constant rate γ.So the probability of recovering in any short time interval dt is γdt and the probability of transmitting the disease in any short time interval dt is βdt.
The probability of recovering in any short time interval dt is γdt and the probability that the node is still infected after a τ time is: and the probability that the node remains infected this long and then recovers in the interval τ + dτ is: e −γτ γdτ , which is a standard formulation waiting times is probability density function φ(τ ) = γe −γτ for the exponential distribution with parameter γ.Now, we can use the same analogy and say that the waiting time for transmission when there is no recovery to stop it is the probability density function ψ(t) = βe −βt .Note, that the transmission through edge can happen only prior to the recovery of node at time τ which is described by the pdf φ(τ ).
Then the total probability of through edge is: Exact transmissibility for the neighbourhood Consider the focal node with n neighbours, the transmissibility for the first neighbourhood (with all correlations) or more precisely the probability T n,k that k out of n nodes get infected prior to the recovery time of central node is: Proof If we know the recovery time τ of central node, the conditional probability of transmission through any edge is τ 0 ψ(t)dt = 1 − e −βt .Now, as we have n nodes the probability that k nodes gets infected if the recovery time of central node is τ is: where (1 − e −βt ) k describes that through k transmission passes and through n − k does not passes e (−βt)(n−k) and there are n k different combinations of k nodes.Then, we need to to get the joint distribution P (n,k),τ = P (n,k)|τ P τ : And finally, we integrate over recovery time +∞ 0 P (n,k),τ dτ to loose dependence on τ : After integration we get the following expression: The T n.k is a valid distribution: (1) k T n.k = 1 and (2) T n.k ≥ 0 (all arguments to the gamma functions are real and non-negative).
For general distributions φ(•) and ψ(•), the generalized transmissibility for the neighbourhood is: where Ψ(τ ) = Let us recall the mapping from the main paper.In particular, a time-respecting weighted network instance G k is created by taking the input network G and assigning weights to the edges with the Inverse Smirnov transform [21]: where x and y are uniform random numbers ∈ [0, 1], Φ −1 (x) and Ψ −1 (y) are inverse functions of the cumulative inter-event distributions: Φ(t) = t 0 dt φ(t ) and Ψ(t) = t 0 dt ψ(t ).The quantities Ψ −1 (y) and Φ −1 (x) respectively represent the samples of the transmission and recovery time obtained with the Inverse Smirnov transform of inter-event distributions.
Note, that in the above mapping the random variables x and y were generated per each edge.The exact mapping is obtained by generating a random variable y for each node that is then shared among each adjacent outgoing edges, where one assumes that each undirected link in the original network can be interpreted as a bidirectional edge.One then performs the aforementioned procedures for both directions.In this way, we can generate weights that are semi-directed and take into the account the conditional dependence between edges.The properties shown in Sec.1-3 hold similarly for this case.

APPENDIX E: AIRPORT NETWORK SPREADING
In order to generate the effective distance matrix on the world airport transportation network, we use the world airport transportation system data [50] [51].Model of spreading is a simple model of diffusion of the infected agents along the airport transportation network.Each node represents an airport and each edge represents a connection where the diffusion of infected and susceptible travellers happen.From the flux data between nodes, we estimate the corresponding transmission rate per flight β i,j for every link in the network.The rate is now the estimated ratio of number of passengers traveling from node v i to node v j and the total outgoing traffic i.e. β i,j = F ij / k F i,k .Here, we model a discrete time SIR spreading process, where β i,j is estimated from flux travel data.The recovery rate γ ≈ 0, which describes rapid pandemic spreading, where infected airport can not fully get recovered in short time scale.Then the corresponding transmission inter-event density distribution is ψ(t) = ∞ T =1 δ(t − T )β(1 − β) T −1 i.e. geometric distribution -discrete analog of exponential distribution and φ(t) = δ(t − ∞), where δ(t) denotes the Dirac delta distribution.Essentially, the weights on the edges are samples of geometric distribution, whose rate parameter is estimated from data.
FIG. 1.Temporal distance of spreading dynamics.Figure (a) illustrates the mapping of spreading dynamics by adding weights to edges, which represent propagation time delays.The weighted network instance represents one realization of the stochastic dynamics from an arbitrary source node.The propagation time from the source node to any other node is equal to the length of the shortest path in weighted network and it is color coded.Figure(b).Markov Chain transitions Wi,j = w(Gi → Gj) between different instances of weighted networks in the ensemble.Randomization of the weight of an edge (denoted by red) corresponds to the transition between the states (realizations of the process).The source node is denoted with blue and the orange circle represents the set of nodes that are reachable within propagation time T the from source node.In Figure(c) nodes in a network are placed on a circle, where the angular coordinate represents the average propagation time from the source node, increasing in the clockwise direction (from blue to red).This visualization represents the temporal distances from the source node with index 500 from temporal distance matrix in Fig.2a.

FIG. 2 .
FIG. 2. Temporal maps Dij, which represent the expected propagation time from node vi to node vj.Nodes are ordered by their characteristic spreading timescale.(a) for the email network [22] (β = 0.01), (b) for the airport network (β estimated from flux data see Appendix E), (c) for the Petster network [23]) (β = 0.01).The contourlines are with respect to the coarse graining explained in the text.

FIG. 3 .
FIG. 3.Source detection.We compare source detection performance for different methods for the SIR model β = 0.7, γ = 0.3 (Figurea) and β = 0.7, γ = 0.7 (Figureb) at time T = 5 (discrete time) on the 4-connected lattice (30x30 nodes).For each observed realization, we rank nodes according to their estimated probability score according to the Direct Monte Carlo method[28], in case of a tie we prioritize the node with a higher estimator according to the temporal distance method.We compare different methods as explained in the text: Direct Monte Carlo source probability ranking, source detection ranking by topological distance[29], and the estimation from our mapping (temporal distance).Results are averaged over 30 observed realizations and error bars show one standard deviation, from top to bottom.

FIG. 4 .
FIG. 4. Time critical vaccination.a) Shows an illustration of the time critical vaccination process using a Albert Barabasi graph (80 nodes, mean degree k ≈ 6).Vaccination candidates are highlighted by the diamond symbols.Red note denote infected individuals, and blue nodes are susceptible.Figure b and c show the total number of infected nodes up to time t for the different vaccination strategies described in the text (m = 0.2 of the total number of nodes).The first dashed vertical line denotes T0 = 3, and the second one T0 + τ = 13.Figure b for the discrete time SIR β = 0.03, γ = 0.01 and Figure c for the discrete time SIR β = 0.05, γ = 0.01 with source node 10 in the Petster network.

τ 0 ψ
(t)dt.For n = k = 1, we get the standard transmissibility T n=1,k=1 = Exact mapping: semi-directed bond percolation with weights Equivalence to bond percolation.Average outbreak size in time estimated with the proposed mapping (star markers) and comparison with bond percolation late-time outbreak size (dotted lines).In a limit of time the mapping becomes a bond percolation (see 11).Results were obtained for the SIR dynamics (continuous time) with different transmission rates β (0.3, 0.03, 0.003, 0.0003) and recovery rate γ = 0.001 and with n = 10 4 simulations on the 4-connected two dimensional regular lattice (11x11).vi , the shortest path d (v i , v j ) indicates the first infection time of node v j .Therefore all nodes that are not reachable within the temporal distance t stay in the susceptible state.If a node is reachable within the temporal distance t, then it was infected.By letting t go to infinity we obtain asymptotic realization equivalence to bond percolation.In particular, the size of the set of nodes v j that are reachable in any finite time from the source node v i , |S p (v i )|, is given by