Locating multiple diffusion sources in time varying networks from sparse observations

Hu, Zhao-Long; Shen, Zhesi; Cao, Shinan; Podobnik, Boris; Yang, Huijie; Wang, Wen-Xu; Lai, Ying-Cheng

doi:10.1038/s41598-018-20033-9

Article
Open access
Published: 08 February 2018

Zhao-Long Hu¹^na1,
Zhesi Shen ORCID: orcid.org/0000-0001-8414-7912²^na1,
Shinan Cao³,
Boris Podobnik^4,5,
Huijie Yang⁶,
Wen-Xu Wang^2,6 &
…
Ying-Cheng Lai^7,8

Scientific Reports volume 8, Article number: 2685 (2018)

Abstract

Data based source localization in complex networks has a broad range of applications. Despite recent progress, locating multiple diffusion sources in time varying networks remains an outstanding problem. Bridging structural observability and sparse signal reconstruction theories, we develop a general framework to locate diffusion sources in time varying networks based solely on sparse data from a small set of messenger nodes. A general finding is that large degree nodes produce more valuable information than small degree nodes, a result that contrasts with that for static networks. Choosing large degree nodes as the messengers, we find that sparse observations from a few such nodes are often sufficient for any number of diffusion sources to be located for a variety of model and empirical networks. Counterintuitively, sources in more rapidly varying networks can be identified more readily with fewer required messenger nodes.

Introduction

Diffusion and propagation processes taking place in complex networks are ubiquitous in natural and technological systems^1,2. Examples of those processes include air or water pollution diffusion^3,4, disease or epidemic spreading in human society^5,6, virus invasion in computer and mobile phone networks^7,8, and behavior propagation in online social networks⁹. Once a negative diffusion or propagation emerges, it is imperative to locate its sources quickly and precisely to enable timely and appropriate control strategies to prevent and/or inhibit the spreading process. A number of methods have been proposed and tested recently to address the source localization problem of propagation processes in complex networks, which include those based on maximum likelihood estimation¹⁰, dynamic message passing¹¹, belief propagation¹², hidden geometry of contagion¹³, and inverse spreading^14,15. A related problem of practical significance is to identify super spreaders for effective control of spreading^16,17. However, most existing approaches are designed specifically for static networks. In the real world, time varying networks are ubiquitous, such as frequently changing social contacts via meetings, emails, phone calls and online software^18,19,20,21. Recently, a source detection framework based on one snapshot observation of the entire network was proposed and demonstrated for an empirical temporal network of sexual contacts²².

Those works focus primarily on source localization for propagation processes. However, source localization for diffusion processes is rarely studied. Here we concentrate on diffusion processes, as they constitute a good approximation for different types of dynamical processes (e.g., synchronization and other nonlinear processes amenable to linearization)². Very recently, considering that multiple sources may exist (e.g., air or water pollution, rumors), a general framework for locating multiple sources of diffusion processes on static networks was presented²³. Developing effective frameworks to locate sources in time varying networks is an outstanding problem in network science and engineering. The essential difference between diffusion on a time varying network and on a static network is illustrated in Fig. 1. Specifically, in Fig. 1(a), due to the various time intervals in which different edges are activated, a spreading process starting at node b cannot reach node a at any time. In contrast, for a static network with the same structure as shown in Fig. 1(a), the spreading process can reach all nodes in the network. To our knowledge, there has been no solution to the problem of locating multiple diffusion sources associated with general dynamical processes on arbitrary time varying networks from local observations²⁴. The purpose of this paper is to provide an optimal solution. In particular, exploiting a combination of the structural observability and sparse signal reconstruction theories, we develop a general source localization framework that is applicable to arbitrarily time varying networks with any number of sources. We demonstrate that sparse data from a small set of messenger nodes are capable of identifying multiple diffusion sources accurately and efficiently, even in the absence of detailed information about the network structure, such as link weights, and in the presence of measurement noise. The framework is established analytically and validated through extensive numerical tests of model and empirical networks.

Results

Framework of locating multiple sources on time-varying networks

A time-varying network with N nodes is generally defined by a node set V = {v₁, v₂, ..., v_N} with a set E of time varying edges, where (v_i, v_j, w_ji, t) ∈ E denotes a directed edge pointing from node v_i to node v_j with link weight w_ji at activation time t. In this paper, we consider the following class of discrete-time diffusion processes on such time varying networks:

$$x_{i}(t+1)=x_{i}(t)+\beta \sum_{j=1}^{N}\left[w_{ij}(t+1)\,x_{j}(t)-w_{ji}(t+1)\,x_{i}(t)\right],$$

(1)

where x_i(t) is the state of node i at time t, capturing, e.g., the fraction of infected individuals or the concentration of a water or air pollutant at place i, β is the constant diffusion coefficient, and w_ij(t) is the link weight at time t, where self loops are a result of the diffusion process². For an undirected network, we have w_ij(t) = w_ji(t). (Diffusion dynamics in continuous time can be treated similarly - see Sec. S1 in Supplemental Information (SI)). The nodes from which observations are made are the messenger nodes. When the outputs from the messenger nodes are taken into account, the system becomes

$$\left\{\begin{array}{l}{\bf{x}}(t+1)=A(t+1)\,{\bf{x}}(t),\\ {\bf{y}}(t)=C\,{\bf{x}}(t),\end{array}\right.$$

(2)

where the state vector ${\bf{x}}(t)\in {{\mathbb{R}}}^{N}$ comprises all nodes in the network at time t and A(t + 1) = I + βL(t + 1). In A(t + 1), $I\in {{\mathbb{R}}}^{N\times N}$ is the identity matrix, L(t) = W(t) − D(t) is the network Laplacian matrix at time t, $W(t)\in {{\mathbb{R}}}^{N\times N}$ is the weighted adjacency matrix of elements w_ij(t), and $D(t)\in {{\mathbb{R}}}^{N\times N}$ is a diagonal matrix of elements d_i(t) denoting the total out-weight ${\sum }_{j\in {\Gamma }_{i}(t)}{w}_{ji}(t)$ of node i with Γ_i(t) being the neighboring set of i at time t. The vector ${\bf{y}}(t)=[{y}_{1}(t);{y}_{2}(t);\cdots ;{y}_{q}(t)]$ represents the q measurable outputs from q messengers at time t, and $C\in {{\mathbb{R}}}^{q\times N}$ is the output matrix, where C_ij = 1 if output y_i(t) is measured from node j. The basic difference between source nodes and passive nodes is that, initially (t = t₀), the states of the former and latter are nonzero and zero, respectively. Without loss of generality, we set t₀ = 0. Thus, if the initial states of all nodes can be recovered from the measurements of the messenger nodes at a later time (t > 0), all sources can be identified. A solution to this problem can be obtained by exploiting the observability condition in canonical control theory. Specifically, we consider instants of time t = 0, 1, ..., T and rewrite Eq. (2) as

$${\bf{Y}}=\left(\begin{array}{c}{\bf{y}}(0)\\ {\bf{y}}(1)\\ \vdots \\ {\bf{y}}(T)\end{array}\right)=\left(\begin{array}{c}C\\ CA(1)\\ \vdots \\ CA(T)A(T-1)\cdots A(1)\end{array}\right){\bf{x}}(0)\equiv O\cdot {\bf{x}}(0),$$

(3)

where ${\bf{Y}}\in {\mathbb{R}}^{q(T+1)}$ is the stacked output vector, ${\bf{x}}(0)\in {\mathbb{R}}^{N}$ is the initial state vector, q is the number of messenger nodes, and $O\in {\mathbb{R}}^{q(T+1)\times N}$ is the observability matrix. To locate the diffusion sources accurately, Eq. (3) must have a unique solution given the output vector Y from the set of messenger nodes. Classic observability theory stipulates that x(0) can be fully and uniquely determined if and only if matrix O has full rank, i.e., rank(O) = N.

If we observe only a single node v, matrix O may not have full rank. As a result, only the initial states of a subset of nodes in x(0) can be reconstructed. The number of nodes whose initial states can be reconstructed is rank(O), which defines the observable centrality N_OR({v}) of v, i.e., N_OR({v}) = rank(O). Analogously, for a given set Q of nodes, we have an associated matrix C and can obtain rank(O), which defines the observable range N_OR(Q) of Q, i.e., N_OR(Q) = rank(O). Note that N_OR({v}) ≤ N and N_OR(Q) ≤ N. Thus we can define a normalized observable centrality n_OR({v}) ≡ N_OR({v})/N and a normalized observable range n_OR(Q) ≡ N_OR(Q)/N.
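As a rough illustration of Eqs. (2) and (3), the sketch below builds A(t) = I + βL(t) and the observability matrix O for a toy three-node temporal network, then reports rank(O) for a chosen messenger set. The helper names (`diffusion_matrix`, `observability_matrix`, `observable_range`), the unit link weights and β = 0.5 are assumptions made for illustration only. The numerical rank used here presumes known weights and is merely a stand-in for the structural analysis described in the text.

```python
import numpy as np

def diffusion_matrix(W, beta):
    """Return A = I + beta * (W - D) for one time step.

    W[i, j] is the weight of the link carrying state from node j to node i;
    D is diagonal with D[i, i] = sum_j W[j, i], the total out-weight of node i.
    """
    D = np.diag(W.sum(axis=0))
    return np.eye(W.shape[0]) + beta * (W - D)

def observability_matrix(A_seq, messengers):
    """Stack C, C A(1), ..., C A(T)...A(1) as in Eq. (3); A_seq = [A(1), ..., A(T)]."""
    n = A_seq[0].shape[0]
    C = np.zeros((len(messengers), n))
    C[np.arange(len(messengers)), np.asarray(messengers, dtype=int)] = 1.0
    blocks, P = [C], np.eye(n)
    for A in A_seq:
        P = A @ P              # P = A(t) A(t-1) ... A(1)
        blocks.append(C @ P)
    return np.vstack(blocks)

def observable_range(A_seq, messengers):
    """N_OR(Q) = rank(O) for the messenger set Q (numerical rank)."""
    return int(np.linalg.matrix_rank(observability_matrix(A_seq, messengers)))

# Hypothetical toy network: edge 0-1 is active at t = 1, edge 1-2 at t = 2.
W1 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
W2 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float)
A_seq = [diffusion_matrix(W1, 0.5), diffusion_matrix(W2, 0.5)]
print(observable_range(A_seq, [0]), observable_range(A_seq, [0, 2]))  # prints: 2 3
```

In this toy case node 0 alone cannot recover the initial state of node 2, because edge 1-2 is activated only after edge 0-1; adding node 2 as a second messenger makes the network fully observable.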

Since information about the link weights may not be available, a direct calculation of rank(O) is not feasible. A resolution is to analyze the structural observability^25,26,27,28, which is a highly nontrivial task for time varying networks. Our idea is to exploit the independent paths in static mappings of the underlying network²⁹, as shown in Fig. 1(b). In particular, a mapping from a time varying network to a static network can be obtained by cloning all nodes into different layers that correspond to different time t. If an edge is active at t [as shown in Fig. 1(a)], the two nodes at both ends of the edge in the corresponding layers in Fig. 1(b) will be connected. Note that the direction of links in Fig. 1(b) is reversed with respect to the actual direction of diffusion in Fig. 1(a) - a consequence of the duality relation between structural observability and controllability²⁸.
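The role of activation order can be illustrated with a simplified reachability check. The sketch below is not the layered static mapping of Fig. 1(b); it only tracks which nodes a source can influence when state moves one hop per time step, as in Eq. (1). The function names and the edge format `(u, v, t)` are assumptions introduced here.

```python
from collections import defaultdict

def time_respecting_reach(edges, source, T):
    """Nodes whose state can be influenced by `source` within T steps.

    edges: iterable of (u, v, t), meaning a directed link u -> v is active at
    time t (1 <= t <= T). State moves one hop per time step, so a path must
    use strictly increasing activation times.
    """
    active = defaultdict(list)
    for u, v, t in edges:
        active[t].append((u, v))
    reached = {source}
    for t in range(1, T + 1):
        # Only nodes reached before time t can pass state along at time t.
        newly = {v for u, v in active[t] if u in reached}
        reached |= newly
    return reached

def undirected(contacts):
    """Expand undirected contacts (u, v, t) into both directed links."""
    return [(a, b, t) for u, v, t in contacts for a, b in ((u, v), (v, u))]
```

For example, with contacts `[(0, 1, 1), (1, 2, 2)]` a process starting at node 0 reaches all three nodes, whereas one starting at node 2 never reaches node 0, although the aggregated static network is connected.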

Figure 1(c) illustrates the quantity N_OR({a}) when node a is chosen as a messenger node. There is a single independent path, i.e., a → c, such that N_OR({a}) = 2 (one independent path and a itself). If a and d are messengers [Fig. 1(d)], there are two independent paths and N_OR({a, d}) = 4 (including the two messengers themselves). In this case, the network is fully observable. The key to source localization is thus to identify all independent paths from messenger nodes in the static mappings of the original time varying network. In this paper, to generate a time-varying network, we propose a uniform activation network model in which random activations are imposed on a static network. Specifically, let z be the number of times (activations) an edge is active in a time interval, which is randomly selected from a uniform distribution $U(1,z_{\max })$ with $z_{\max }$ denoting the maximum number of activations. After z is given for each edge, the active time associated with each activation is uniformly chosen from the distribution U(1, T) under the constraint that a link cannot be activated twice (or more) at one active time.
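A minimal sketch of the uniform activation network model follows. It assumes that U(1, z_max) and U(1, T) denote discrete uniform distributions over integers, and the function name, the `seed` argument and the dictionary output format are hypothetical choices made for illustration.

```python
import random

def uniform_activation_network(static_edges, T, z_max, seed=None):
    """Assign random activation times to each edge of a static network.

    For every edge, draw the number of activations z uniformly from
    {1, ..., z_max}, then draw z distinct active times uniformly from
    {1, ..., T}. Returns a dict mapping each edge to its sorted active times.
    """
    if z_max > T:
        raise ValueError("z_max must not exceed T (active times are distinct)")
    rng = random.Random(seed)
    schedule = {}
    for edge in static_edges:
        z = rng.randint(1, z_max)  # inclusive on both ends
        # Sampling without replacement enforces one activation per active time.
        schedule[edge] = sorted(rng.sample(range(1, T + 1), z))
    return schedule
```

Calling `uniform_activation_network([(0, 1), (1, 2)], T=10, z_max=4, seed=1)` yields one list of between one and four distinct times in 1..10 for each edge.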

Estimate of observable range

For a set Q of messenger nodes, N_OR(Q) is exactly the number of independent paths plus the number of the messengers, which can be calculated by using the maximum flow algorithm. Here, we provide a theoretical estimate of the number of independent paths. As shown in Fig. 1, since every node has a self-loop, if there exists a link for a certain layer (t > 0), there must exist a path starting from the layer to the top layer (t = 0), as shown in Fig. 1(d). Moreover, there exists at most one independent path starting from one node in a given layer (t > 0). Thus, for a messenger node v, the maximum number of independent paths from v for all layers is the number of layers in which v has a link that points to other nodes. This number is simply the number l_v of distinct activations of v, where each activation (active time) corresponds to a layer with a link going out from v (see Sec. S2 in SI for more details). Thus, since the overlap among independent paths from v is negligible, we have n_OR({v}) ≈ (l_v + 1)/N, based on which the quantity n_OR(Q) of node set Q can be estimated as

$$n_{\rm{OR}}(Q)\approx \sum_{i\in Q}(l_{i}+1)/N.$$

(4)

The fraction p of messenger nodes is thus p = q/N, where q is the number of messengers.
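The estimate in Eq. (4) is straightforward to evaluate once the activation schedule is known. The sketch below assumes an undirected network stored as a dictionary from edges to active times, and it caps the estimate at 1 because n_OR cannot exceed unity; the cap and the function names are assumptions added here, not part of the original formula.

```python
def distinct_activations(schedule):
    """l_v: number of distinct times at which node v has at least one active edge."""
    times = {}
    for (u, v), active_times in schedule.items():
        times.setdefault(u, set()).update(active_times)
        times.setdefault(v, set()).update(active_times)
    return {node: len(ts) for node, ts in times.items()}

def estimated_observable_range(schedule, messengers, n_nodes):
    """Eq. (4): n_OR(Q) ~ sum_{i in Q} (l_i + 1) / N, capped at 1 (assumed cap)."""
    l = distinct_activations(schedule)
    total = sum(l.get(i, 0) + 1 for i in messengers)
    return min(1.0, total / n_nodes)
```

For a hypothetical schedule `{(0, 1): [1, 3], (1, 2): [3, 4], (2, 3): [2]}` in a network of N = 10 nodes, node 1 has l = 3 distinct activations, so observing it alone gives an estimate of (3 + 1)/10 = 0.4.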

For the uniform activation network model, if the number of distinct activations, l_v, cannot be directly measured, we can use the activation times distribution $U(1,z_{\max })$ and the active time distribution U(1, T) to estimate the average number 〈l〉 of distinct activations. Specifically, for a node with k edges, we denote their activations by z¹, ..., z^k. The probability of the number of distinct activations being l for one node with z¹, ..., z^k is given by (see Sec. S2 in SI)

$$P(l|(z^{1},\cdots ,z^{k}))=c_{1}\binom{T}{l}\sum_{j=\max (z^{1},\ldots ,z^{k})}^{l}(-1)^{l-j}\binom{l}{j}\prod_{i=1}^{k}\binom{j}{z^{i}},$$

(5)

where c₁ is a normalization constant satisfying

$$\sum _{l=\,\max ({z}^{1},\ldots ,{z}^{k})}^{\min (T,\sum _{i}{z}^{i})}P(l|({z}^{1},\cdots ,{z}^{k}))=1.$$

(6)

Therefore, for one node associated with z¹, ..., z^k, the average number of distinct activations is

$$\langle l\rangle_{\{z^{1},\ldots ,z^{k}\}}=\sum_{l=\max (z^{1},\ldots ,z^{k})}^{\min (T,\sum_{i}z^{i})}l\,P(l|(z^{1},\cdots ,z^{k})).$$

(7)

For a node of degree k, the average number of distinct activations is

$$\langle l\rangle ={c}_{2}\sum _{{z}^{1}=1}^{{z}_{\max }}\ldots \sum _{{z}^{k}=1}^{{z}_{\max }}{\langle l\rangle }_{\{{z}^{1},\ldots ,{z}^{k}\}},$$

(8)

where ${c}_{2}={({z}_{\max })}^{-k}$. Given 〈l〉 for each node, for the entire messenger set Q, the normalized observable range can be approximated as

$$n_{\rm{OR}}(Q)\approx \sum_{i\in Q}(\langle l_{i}\rangle +1)/N.$$

(9)
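Equations (5) to (8) can be evaluated directly for small k. The sketch below normalizes the inclusion-exclusion counts numerically rather than using a closed form for c₁, and it enumerates all z_max^k activation combinations in Eq. (8), so it is only practical for small degrees. The function names are hypothetical.

```python
from itertools import product
from math import comb

def distinct_activation_pmf(T, z):
    """P(l | z^1, ..., z^k) from Eq. (5), normalized as required by Eq. (6)."""
    lo, hi = max(z), min(T, sum(z))
    weights = {}
    for l in range(lo, hi + 1):
        inner = 0
        for j in range(lo, l + 1):
            term = comb(l, j)
            for zi in z:
                term *= comb(j, zi)
            inner += (-1) ** (l - j) * term
        weights[l] = comb(T, l) * inner
    norm = sum(weights.values())  # equals 1 / c_1
    return {l: w / norm for l, w in weights.items()}

def mean_distinct_activations(T, z):
    """Eq. (7): average number of distinct activations for fixed z^1, ..., z^k."""
    return sum(l * p for l, p in distinct_activation_pmf(T, z).items())

def mean_for_degree(T, k, z_max):
    """Eq. (8): average over all (z^1, ..., z^k) with c_2 = z_max ** (-k)."""
    combos = product(range(1, z_max + 1), repeat=k)
    return sum(mean_distinct_activations(T, z) for z in combos) / z_max ** k
```

As a sanity check, for T = 3 and two edges with one activation each, the two active times coincide with probability 1/3, so `distinct_activation_pmf(3, (1, 1))` gives P(1) = 1/3 and P(2) = 2/3, and the mean is 5/3.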

Messenger selection

Considering the cost of allocating messengers for monitoring the state of the whole network, finding a minimum set of messengers through independent paths represents the most efficient way to locate sources. Moreover, the set can be used to characterize the source locatability of the network. The difficulty is that this task is NP-complete³⁰. We employ an alternative approach by exploiting a greedy optimization algorithm to maximize the observable range n_OR through selection of the messenger set (see Sec. S3 in SI). In addition, sub-modularity^31,32 is exploited to reduce the computational cost; it guarantees that the greedy solution attains at least a fraction (1 − 1/e) ≈ 0.63 of the global optimum.
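The greedy idea can be sketched as follows. This is not the algorithm of Sec. S3 in the SI: it assumes the matrices A(1), ..., A(T) are known, uses the numerical rank of the observability matrix as the objective, and re-evaluates every candidate in each round instead of exploiting sub-modularity for lazy evaluation. All function names are hypothetical.

```python
import numpy as np

def observable_range(A_seq, messengers):
    """rank(O) for messenger set Q, with O stacked as in Eq. (3)."""
    n = A_seq[0].shape[0]
    C = np.zeros((len(messengers), n))
    C[np.arange(len(messengers)), np.asarray(messengers, dtype=int)] = 1.0
    blocks, P = [C], np.eye(n)
    for A in A_seq:
        P = A @ P
        blocks.append(C @ P)
    return int(np.linalg.matrix_rank(np.vstack(blocks)))

def greedy_messengers(A_seq, target=None):
    """Add, one at a time, the node that most increases the observable range."""
    n = A_seq[0].shape[0]
    target = n if target is None else target
    chosen, best = [], 0
    while best < target and len(chosen) < n:
        candidates = [v for v in range(n) if v not in chosen]
        scores = [observable_range(A_seq, chosen + [v]) for v in candidates]
        top = int(np.argmax(scores))
        if scores[top] <= best:   # no candidate improves the range
            break
        chosen.append(candidates[top])
        best = scores[top]
    return chosen, best
```

On a three-node chain in which edge 0-1 is active at t = 1 and edge 1-2 at t = 2, this procedure selects the middle node alone. That node has two distinct activations, so the estimate (l + 1)/N = 1 agrees with the full rank it achieves.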

We test our framework using model and empirical networks. Figure 2 shows the observable centrality of nodes for Erdős-Rényi (ER)³³ random and scale-free (SF)³⁴ networks. Three features are found, which do not occur for static networks³⁵. First, nodes of larger degree k have a higher observable centrality N_OR, in sharp contrast to what happens in a static network where both driver and messenger nodes tend to avoid large degree nodes due to their small controllable and observable range. Second, N_OR gradually approaches the upper limit T + 1 as k increases. Third, N_OR is nearly independent of the network structure and depends mainly on T and $z_{\max }$. The theoretical prediction [Eq. (9)] and numerical results agree well with each other.

The results in Fig. 2 suggest that large-degree nodes be chosen as the messengers (denoted as the max-deg strategy). To validate this strategy, we compare it with the more elaborate strategy of greedy optimization. As shown in Fig. 3, n_OR resulting from the max-deg strategy is quite close to that from the greedy strategy, especially for relatively larger values of $z_{\max }$. The great advantage of the max-deg strategy is that it is based on local information only, whereas the greedy strategy requires global information about the network. Another remarkable finding is that a very small fraction p of messenger nodes is sufficient to fully locate multiple sources (n_OR = 1) for both ER and SF networks. We also test our framework using three empirical time varying networks, as shown in Fig. 4. It should be noted that the number of distinct activations l of every node is available for these networks. We see that a quite small value of p can ensure a complete localization of diffusion sources in all the empirical networks. For both model and empirical networks, numerical calculations are in good agreement with theoretical predictions (see Sec. S3 in SI for more details).

A counterintuitive phenomenon is that, in both model and real networks, it is relatively easier to locate diffusion sources in more rapidly changing (more frequently updating) networks as the set of required messenger nodes is smaller (e.g., comparing ${z}_{\max }=5$ with ${z}_{\max }=30$ in Fig. 3 and hour with day in Fig. 4). A heuristic explanation is that more rapid changes in the network structure in fact limit the spreading patterns from sources, facilitating source localization from a relatively smaller number of messenger nodes.

Actual localization of multiple diffusion sources

We articulate an efficient and robust method to actually locate the sources based on the already identified messenger set. In a realistic situation, the number of sources is much smaller than the network size, so the vector x(0) in Eq. (3) has many zero elements. The sparsity of x(0) can be exploited to greatly reduce the required measurements from messengers by using the compressive sensing (CS) paradigm for sparse signal reconstruction^36,37. Specifically, Eq. (3) can be solved and accurate reconstruction of x(0) can be achieved through solutions of the following convex-optimization problem:

$$\min \,\Vert {\bf{x}}(0)\Vert_{1}\quad {\rm{subject}}\ {\rm{to}}\quad {\bf{Y}}=O\cdot {\bf{x}}(0),$$

(10)

where $\Vert {\bf{x}}(0)\Vert_{1}=\sum_{i=1}^{N}|x_{i}(0)|$ is the L₁ norm of x(0), while ${\bf{Y}}\in {\mathbb{R}}^{qM}$ and $O\in {\mathbb{R}}^{qM\times N}$. Here M is the number of consecutive measurements made by the messengers. Because of the linear independence of the rows in matrix O and the sparsity of x(0), it is feasible to reconstruct x(0) even when M is much smaller than T + 1. We define n_M ≡ M/(T + 1) to compare with the data amount T + 1 required by the conventional solution for x(0). To be more realistic, we include both measurement noise and uncertainties in the link weights in Eq. (2), which is reformulated as

$$\left\{\begin{array}{l}{\bf{x}}(t)=\hat{A}(t)\,{\bf{x}}(t-1),\\ \hat{{\bf{y}}}(t)=C\,{\bf{x}}(t)\cdot ({\bf{1}}+\varepsilon ),\end{array}\right.$$

(11)

where the measurement y(t) is contaminated by white truncated Gaussian noise of zero mean and variance σ²: $\varepsilon \sim {\mathscr{N}}({\bf{0}},{\sigma }^{2}{\bf{1}})$, where ${\bf{0}}\in {\mathbb{R}}^{q}$ is the zero vector and ${\bf{1}}\in {\mathbb{R}}^{q}$ is the vector of ones. We assume that the uncertainties in the link weights W are also truncated Gaussian: $\hat{w}_{ij}(t)=w_{ij}(t)(1+\varepsilon ^{\prime })$, where $\varepsilon ^{\prime }\sim {\mathscr{N}}(0,{\sigma ^{\prime }}^{2})$. The random noise is restricted to positive values to make sure that the values of measurements and link weights are nonnegative. Here we use multiplicative noise to ensure that, on average, the ratio of the measurements remains the same with or without noise during the dynamics. To quantify the performance of source localization, we use the standard AUROC (area under a receiver operating characteristic) metric³⁷, where AUROC = 1 indicates the existence of a threshold to fully distinguish between sources and passive nodes whereas AUROC = 0.5 indicates that the two types of nodes cannot be distinguished (Sec. S7 in SI).
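The noise-free problem in Eq. (10) can be posed as a linear program. The sketch below is one common way to do so, assuming SciPy is available; the function name `l1_reconstruct` and the choice of the HiGHS solver are assumptions for illustration, and the original work may have used a different solver.

```python
import numpy as np
from scipy.optimize import linprog

def l1_reconstruct(O, Y):
    """Solve min ||x||_1 subject to O x = Y via a linear program.

    Split x = p - n with p, n >= 0; minimizing sum(p) + sum(n) then
    equals minimizing the L1 norm of x.
    """
    O = np.asarray(O, dtype=float)
    n = O.shape[1]
    c = np.ones(2 * n)              # objective: sum(p) + sum(n)
    A_eq = np.hstack([O, -O])       # O p - O n = Y
    res = linprog(c, A_eq=A_eq, b_eq=np.asarray(Y, dtype=float),
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]
```

For instance, with O = [[1, 1, 0], [0, 1, 1]] and Y = [1, 1] the system is underdetermined, yet the minimum-L₁ solution is the single-source vector (0, 1, 0). With noisy measurements the equality constraint may be infeasible or may overfit the noise, in which case the function raises an error or returns a poor estimate.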

We use empirical networks (as in Fig. 4) to test the performance of our CS based source localization method. As shown in Fig. 5(a) and (b), AUROC increases with n_M. When n_M is small, AUROC shows a large deviation, indicating that the location of the sources strongly affects the accuracy of source localization for the given messengers; once n_M exceeds some value, say 0.5, AUROC is close to 1 and the standard deviation decreases markedly, implying that sources at any location can be accurately located. We also compared the performance of source localization for different messenger selection strategies (see Sec. S4 and Fig. S5 in SI). Figure 5(c) and (d) show the localization accuracy versus measurement noise σ and weight uncertainty σ′. We see that relatively high accuracy can still be achieved even when the noise variance approaches unity. Nonetheless, in some simulations the AUROC is small (see Sec. S5 and Fig. S6 in SI for the distributions of AUROC), and performance in these cases may be improved by increasing the number of messengers or the length of the observation time. Further efforts are still needed to determine how to balance the cost of adding more messengers against that of increasing the observation time.
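For reference, AUROC can be computed from its pairwise interpretation: the probability that a randomly chosen source receives a higher score than a randomly chosen passive node, with ties counted as one half. The sketch below assumes that the score of node i is the magnitude of its reconstructed initial state; this scoring choice and the function name are assumptions.

```python
def auroc(scores, is_source):
    """Probability that a random source outscores a random passive node.

    scores: one value per node (e.g., |x_i(0)| from the reconstruction).
    is_source: truthy for source nodes, falsy for passive nodes.
    """
    pos = [s for s, y in zip(scores, is_source) if y]
    neg = [s for s, y in zip(scores, is_source) if not y]
    if not pos or not neg:
        raise ValueError("need at least one source and one passive node")
    # Count pairs where the source wins; a tie contributes 1/2.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

A perfect ranking gives 1.0, identical scores for all nodes give 0.5, and a fully inverted ranking gives 0.0, which corresponds to the worse-than-random cases (AUROC < 0.5) that can occur under strong noise.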

In real systems, we cannot know the time-varying network structure in advance, which prevents us from selecting the optimal messengers. However, if the network structure evolves with periodicity or follows some patterns, e.g., the activation dynamics of each edge remain stable for a long period, we can construct a rough network based on the past interactions and select messengers using its structural properties, e.g., nodal degree and estimated observable range. To test the effectiveness of our method in such a situation, we divide the time-varying network into two parts according to the order of each edge’s activation time: the first part, from which a rough network is constructed and a set of messengers is selected, and the second part, on which the source localization is applied. Figure 6(a–c) displays the activation time distributions of the three empirical networks, which indicate circadian rhythms, and illustrates the dividing time point used in the simulation. Messengers are selected using the greedy algorithm and the max-deg strategy, ensuring full observability of the first-part network, and are then used to locate the sources on the second-part network. As shown in Fig. 6(d), our source localization method performs well for both strategies on the empirical networks.

Discussions

Source localization is significant for preventing negative diffusion processes and reducing damage. Combining structural observability theory with sparse signal reconstruction, we succeed in developing a general framework for locating multiple diffusion sources in time varying networks, an extremely challenging problem in complex dynamical systems. The framework allows us to define an observable centrality for each node and to locate any number of sources by observing a small number of messenger nodes with larger values of observable centrality and exploiting the natural sparsity of sources. Appealing features of our framework include the requirement of only a small number of measurements and robustness against noise and uncertainties in system parameters. We offer analytic formulas for the observable centrality and the minimum number of messenger nodes, which are validated using model and empirical networks. A general finding based on our framework is that large degree nodes produce more valuable information than small degree nodes, a result opposite to that for static networks based on structural observability theory. As a result, choosing larger degree nodes as messenger nodes is more efficient for locating multiple sources in time varying networks; in contrast, small degree nodes are often selected as messenger nodes in static networks. A counterintuitive finding is that sources in a more rapidly varying network can be located more readily than in a slowly changing network. A heuristic explanation for this phenomenon is that frequent changes of the network structure in general produce more independent paths in the static mapping of the original time varying network. As a result, the number of necessary messenger nodes is reduced and the sources become relatively easier to localize.
When dealing with time-varying networks, forward planning is an unavoidable issue, because in many real systems the future structure of the time-varying network cannot be obtained in advance. If the network structure evolves periodically or follows some pattern, we can select messengers by fully exploiting the structural information embedded in the past interactions; if the evolution of the time-varying network is totally random, then selecting messengers randomly may be the only way. In this paper, multiplicative noise is considered to test the robustness of our method. Although the average performance is still satisfactory, the worst cases are even worse than random guessing (AUROC < 0.5) when the noise is strong. Therefore, it is very important to develop a more robust and efficient inference framework that can deal with different noise settings. One possible improvement is to relax the constraint Y = O ⋅ x(0) and instead minimize ||Y − O ⋅ x(0)||₂ + λ||x(0)||₁, at the cost of adding a tuning parameter λ. Another possible way is to develop a probabilistic approach which can utilize the distribution of noise to give a maximum likelihood estimation of the sources.
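The penalized relaxation mentioned above can be explored with a standard proximal-gradient scheme. The sketch below uses iterative soft-thresholding (ISTA) and, as an assumption, the squared residual 0.5·||Y − O·x||₂² that is customary in LASSO formulations rather than the unsquared norm written in the text. The function names, the iteration count and the step-size rule are illustrative choices, and O is assumed to be nonzero.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (shrinks each entry toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(O, Y, lam, n_iter=500):
    """Minimize 0.5 * ||Y - O x||_2^2 + lam * ||x||_1 by ISTA."""
    O = np.asarray(O, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Step size 1/L, where L is the squared spectral norm of O.
    step = 1.0 / np.linalg.norm(O, 2) ** 2
    x = np.zeros(O.shape[1])
    for _ in range(n_iter):
        gradient = O.T @ (O @ x - Y)
        x = soft_threshold(x - step * gradient, step * lam)
    return x
```

Larger values of λ drive more entries of the estimate to exactly zero at the price of a larger residual, so in practice λ would have to be tuned to the noise level.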

Our framework has potential applications in addressing many problems relevant to source localization, such as consensus, synchronization on power grid networks, and locating the sources of epidemic spreading and rumor spreading in society, online social communities and computer networks. Moreover, our work has implications for disease diagnosis and therapy, such as identifying the focal sources of epilepsy and tumors in the human body. Because of the significance and broad application potential of the source localization problem, we expect that the theory and practical algorithms presented in this work will stimulate further efforts, e.g., a more efficient and accurate algorithm to identify a minimum set of messenger nodes and a new framework applicable to systems with strong nonlinear properties.

Data availability statement

Data can be accessed at http://www.sociopatterns.org/datasets.

References

Vespignani, A. Modelling dynamical processes in complex socio-technical systems. Nat. Phys. 8, 32–39 (2012).
Article MathSciNet CAS Google Scholar
Gomez, S. et al. Diffusion dynamics on multiplex networks. Phys. Rev. Lett. 110, 028701 (2013).
Article ADS CAS PubMed Google Scholar
Pope, C. A. III et al. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. Jama 287, 1132–1141 (2002).
Article CAS PubMed PubMed Central Google Scholar
Shao, M., Tang, X., Zhang, Y. & Li, W. City clusters in china: air and surface water pollution. Front. Ecol. Environ. 4, 353–361 (2006).
Article Google Scholar
Neuman™n, G., Noda, T. & Kawaoka, Y. Emergence and pandemic potential of swine-origin h1n1 influenza virus. Nature 459, 931–939 (2009).
Article ADS CAS PubMed PubMed Central Google Scholar
Hvistendahl, M., Normile, D. & Cohen, J. Despite large research effort, h7n9 continues to baffle. Science 340, 414–415 (2013).
Article ADS CAS PubMed Google Scholar
Lloyd, A. L. & May, R. M. How viruses spread among computers and people. Science 292, 1316–1317 (2001).
Article CAS PubMed Google Scholar
Wang, P., González, M. C., Hidalgo, C. A. & Barabási, A.-L. Understanding the spreading patterns of mobile phone viruses. Science 324, 1071–1076 (2009).
Article ADS CAS PubMed Google Scholar
Centola, D. The spread of behavior in an online social network experiment. Science 329, 1194–1197 (2010).
Article ADS CAS PubMed Google Scholar
Pinto, P. C., Thiran, P. & Vetterli, M. Locating the source of diffusion in large-scale networks. Phys. Rev. Lett. 109, 068702 (2012).
Article ADS PubMed Google Scholar
Lokhov, A. Y., Mézard, M., Ohta, H. & Zdeborová, L. Inferring the origin of an epidemic with a dynamic message-passing algorithm. Phys. Rev. E 90, 012801 (2014).
Article ADS Google Scholar
Altarelli, F., Braunstein, A., Dall’Asta, L., Lage-Castellanos, A. & Zecchina, R. Bayesian inference of epidemics on networks via belief propagation. Phys. Rev. Lett. 112, 118701 (2014).
Article ADS PubMed Google Scholar
Brockmann, D. & Helbing, D. The hidden geometry of complex, network-driven contagion phenomena. Science 342, 1337–1342 (2013).
Article ADS CAS PubMed Google Scholar
Zhu, K. & Ying, L. Information source detection in the sir model: A sample-path-based approach. IEEE/ACM Trans. Netw. 24, 408–421 (2016).
Article Google Scholar
Shen, Z., Cao, S., Wang, W.-X., Di, Z. & Stanley, H. E. Locating the source of diffusion in complex networks by time-reversal backward spreading. Phys. Rev. E 93, 032301 (2016).
Article ADS MathSciNet PubMed Google Scholar
Kitsak, M. et al. Identification of influential spreaders in complex networks. Nat. Phys. 6, 888–893 (2010).
Article CAS Google Scholar
Morone, F. & Makse, H. A. Influence maximization in complex networks through optimal percolation. Nature 524, 65–68 (2015).
Article ADS CAS PubMed Google Scholar
Fu, L., Shen, Z., Wang, W.-X., Fan, Y. & Di, Z. Multi-source localization on complex networks with limited observers. Europhys. Lett. 113 (2016).
Zejnilovic, S., Gomes, J. & Sinopoli, B. In 51st Annual Allerton Conference on. IEEE, 2013.
Zhu, K., Chen, Z. & Ying, L. Locating the contagion source in networks with partial time stamps. Data Mining and Knowledge Discovery 30, 1217–1248 (2015).
Article Google Scholar
Holme, P. & Saramäki, J. Temporal networks. Phys. Rep. 519, 97–125 (2012).
Article ADS Google Scholar
Antulov-Fantulin, N., Lancic, A., Smuc, T., Stefancic, H. & Sikic, M. Identification of patient zero in static and temporal networks: Robustness and limitations. Phys. Rev. Lett. 114, 248701 (2015).
Article ADS PubMed Google Scholar
Hu, Z.-L., Han, X., Lai, Y.-C. & Wang, W.-X. Optimal localization of diffusion sources in complex networks. Royal Society Open Science 4, 170091 (2017).
Article MathSciNet PubMed PubMed Central Google Scholar
Wang, W.-X., Lai, Y.-C. & Grebogi, C. Data based identification and prediction of nonlinear and complex dynamical systems. Phys. Rep. 644, 1–76 (2016).
Article ADS MathSciNet MATH Google Scholar
Shields, R. W. & Pearson, J. B. Structural controllability of multi-input linear systems. Rice University ECE Technical Report (1975).
Mayeda, H. On structural controllability theorem. IEEE Trans. Automat. Contr. 26, 795–798 (1981).
Article MathSciNet MATH Google Scholar
Liu, Y.-Y., Slotine, J.-J. & Barabási, A.-L. Observability of complex systems. Proc. Natl. Acad. Sci. 110, 2460–2465 (2013).
Article ADS MathSciNet CAS PubMed PubMed Central MATH Google Scholar
Pósfai, M. & Hövel, P. Structural controllability of temporal networks. New J. Phys. 16, 123055 (2014).
Article MathSciNet Google Scholar
Pósfai, M. Structure and controllability of complex networks. Ph.D. thesis, Eötvös Loránd University, Budapest (2014).
Nemhauser, G. L., Wolsey, L. A. & Fisher, M. L. An analysis of approximations for maximizing submodular set functions-I. Math. Program. 14, 265–294 (1978).
Golovin, D. & Krause, A. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. J. Artif. Intell. Res. 42, 427–486 (2011).
Erdős, P. & Rényi, A. On the evolution of random graphs. Publ. Math. Inst. Hungar. Acad. Sci 5, 17–61 (1960).
Barabási, A.-L. & Albert, R. Emergence of scaling in random networks. Science 286, 509–512 (1999).
Liu, Y.-Y., Slotine, J.-J. & Barabási, A.-L. Controllability of complex networks. Nature 473, 167–173 (2011).
Candès, E. J., Romberg, J. & Tao, T. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52, 489–509 (2006).
Donoho, D. L. Compressed sensing. IEEE Trans. Inf. Theory 52, 1289–1306 (2006).
Hanley, J. A. & McNeil, B. J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36 (1982).

Acknowledgements

We thank M. Pósfai for valuable discussions. Z.-L.H. and Z.S. contributed equally to this work. This work was supported by NSFC under Grant Nos. 61174150, 61573064, 71401037 and 61074116, the Fundamental Research Funds for the Central Universities and the Beijing Nova Program. Y.-C.L. would like to acknowledge support from the Vannevar Bush Faculty Fellowship program sponsored by the Basic Research Office of the Assistant Secretary of Defense for Research and Engineering and funded by ONR through Grant No. N00014-16-1-2828.

Author information

Zhao-Long Hu and Zhesi Shen contributed equally to this work.

Authors and Affiliations

College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua, 321004, Zhejiang, China
Zhao-Long Hu
School of Systems Science, Beijing Normal University, Beijing, 100875, China
Zhesi Shen & Wen-Xu Wang
School of Finance, University of International Business and Economics, Beijing, 100029, P. R. China
Shinan Cao
Center for Polymer Studies, Boston University, Boston, Massachusetts, 02215, USA
Boris Podobnik
Faculty of Civil Engineering, University of Rijeka, 51000, Rijeka, Croatia
Boris Podobnik
Business School, University of Shanghai for Science and Technology, Shanghai, 200093, China
Huijie Yang & Wen-Xu Wang
School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, Arizona, 85287, USA
Ying-Cheng Lai
Department of Physics, Arizona State University, Tempe, Arizona, 85287, USA
Ying-Cheng Lai

Contributions

Conceived and designed the research: W.W.X., P.B., L.Y.C., C.S. Performed the research: H.Z.L., S.Z., W.W.X., C.S. Wrote the paper: P.B., Y.H., W.W.X., L.Y.C.

Corresponding author

Correspondence to Wen-Xu Wang.

Ethics declarations

Competing Interests

The authors declare that they have no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Hu, ZL., Shen, Z., Cao, S. et al. Locating multiple diffusion sources in time varying networks from sparse observations. Sci Rep 8, 2685 (2018). https://doi.org/10.1038/s41598-018-20033-9

Received: 24 May 2017
Accepted: 10 January 2018
Published: 08 February 2018
DOI: https://doi.org/10.1038/s41598-018-20033-9
