Reconstructing propagation networks with temporal similarity

Liao, Hao; Zeng, An

doi:10.1038/srep11404

Published: 18 June 2015

Scientific Reports volume 5, Article number: 11404 (2015)

Abstract

Node similarity significantly contributes to the growth of real networks. In this paper, based on the observed epidemic spreading results, we apply node similarity metrics to reconstruct the underlying networks hosting the propagation. We find that the reconstruction accuracy of the similarity metrics is strongly influenced by the infection rate of the spreading process. Moreover, there is a range of infection rates in which the reconstruction accuracy of some similarity metrics drops nearly to zero. To improve the similarity-based reconstruction method, we propose a temporal similarity metric which takes into account the time information of the spreading. The reconstruction results are remarkably improved with the new method.

Introduction

One of the key features in complex networks is the similarity between nodes¹. The intrinsic similarity between nodes is one of the mechanisms driving the growth of networks². Consequently, nodes in a network may appear to have some level of similarity in topology. An accurate estimation of nodes’ topology similarity is fundamental to many applications in network science, including link prediction³, personalized recommendation⁴, spurious link identification^5,6, backbone extraction^7,8,9, community detection^10,11 and network coarse-graining^12,13. However, how to estimate the topology similarity between nodes still remains a challenge in which the optimal solution depends significantly on the problems we are facing. For example, in recommender systems it has already been pointed out that a more effective similarity metric should be biased to small degree nodes to enhance diversity of the recommendation⁴. For the problem of spurious link identification⁵, the similarity metric should be combined with the betweenness index to avoid removing the important links connecting communities¹⁴. The concept of similarity is applied to compare sampled networks in order to detect damage in the original networks¹⁵.

Spreading, as an important dynamical process in networks, has been used to simulate many real processes including epidemic contagion^16,17,18, cascading failure¹⁹, rumor propagation^20,21,22 and others^23,24,25. Recently, one fundamental problem about the spreading process has attracted increasing attention: reconstructing propagation networks from observed spreading results²⁶. In some real systems, partial data on the spreading process are available, but the underlying structure of the propagation network is not accessible. For example, the propagation of risk in financial systems²⁷ and the diffusion of chemicals in neural systems²⁸ are important dynamical processes for these systems. However, the inter-bank lending relations are commercial secrets²⁹ and the synaptic connections between neurons are very difficult to detect³⁰. Therefore, reconstructing the propagation network from the collected spreading data is very meaningful for understanding these real systems. Moreover, knowing the propagation networks can help us to hinder the propagation in the context of epidemic spreading. For instance, one effective way is to immunize the nodes that connect different clusters in the propagation networks³¹.

Very recently, the compressed sensing theory has been introduced to infer the propagation networks³². This technique, though effective, has relatively high computational complexity which prevents its application in large scale networks. Real networks, especially in online social systems, can contain millions of nodes. An efficient algorithm should be based only on local information. To solve this problem, some local similarity metrics have been applied to inferring the propagation networks³³. The basic idea is that the nodes’ similarity in the “infection pattern” is connected with their similarity in topology. In other words, nodes receiving similar information/virus in spreading are more likely to be connected in the propagation networks. However, the similarity-based methods only use the final spreading results as input information. In reality, one may be able to access more detailed spreading information even including the time stamp that records when the information/virus reached the node. If such information is used properly, it may significantly improve the inference accuracy.

Many problems, such as link prediction³ and personalized recommendation⁴, are related to network reconstruction, but they are essentially different. In link prediction and personalized recommendation, the main task is to estimate the likelihood that a currently nonexistent link will exist in the future³. A method that places many future links at the top of the likelihood ranking has high accuracy. In network reconstruction, accuracy is not the only focus. A well-performing method should also avoid ranking false links highly, since this may result in a significant difference between the reconstructed network and the real network. Therefore, one may reach completely different conclusions even if the same similarity method is applied to these two different types of problems¹⁴. In this context, the performance of the existing similarity metrics has to be reexamined when applied to network reconstruction.

In this paper, we first systematically study the performance of different similarity metrics in reconstructing the propagation networks. Some methods with high accuracy in predicting missing links perform very badly in reconstructing the propagation networks under some infection rates. We find that this is because these similarity metrics overwhelmingly suppress high degree nodes, so that the links are mostly assigned to nodes that are supposed to have low degree. Moreover, we find a phenomenon called “more is less”: when the infection rate is higher than the critical value, each piece of information/virus covers a large part of the network, so that the similarity metric fails to capture the local structure of the network. In order to solve this problem, we propose a temporal similarity metric that incorporates the time information of the spreading results. The simulation results in both artificial and real networks show that the reconstruction accuracy is remarkably improved with the new method.

Results

Problem Statement

We make use of the well-known Susceptible-Infected-Recovered (SIR) model to simulate the spreading process on networks³⁴. Although it is an epidemic spreading model, it has also been applied to model the information propagation process³⁵. While we use here the terminology of news propagation, our results remain applicable to the epidemic spreading case.

A social network with N nodes and E links can be represented by an adjacency matrix A, with A_ij = 1 if there is a link between nodes i and j and A_ij = 0 otherwise. In our model, each node has a probability f of submitting a piece of news to the network. As there are N nodes in the network, there will be about f × N pieces of news propagating in the network. The propagation of the news follows the rule of the SIR model: after news α is submitted (or received) by a node, it infects each of this node’s susceptible neighbors with probability μ. After attempting to infect its neighbors, the node is marked as recovered. During the spreading, we record all the news that each node receives and the time step at which this happens. At the end, the information on the news received by nodes is stored in a matrix R, with R_iα = 1 if i has received news α and R_iα = 0 otherwise. When R_iα = 1, the time step at which i received α is recorded in T_iα. In the simulation, we use a parallel update of the nodes’ status: the time step is advanced after all infected nodes have finished their attempts to infect neighbors. In the next time step, all the nodes newly infected in the previous time step attempt to infect their neighboring nodes. The main task is to use the information stored in R and T to rebuild the network A. The notations of important variables are presented in Table 1.
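The spreading rule described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function name `simulate_news_spreading`, the dictionary-based storage of R and T, and the convention that the submitting node is stamped with time step 0 are all choices made for this example.

```python
import random


def simulate_news_spreading(adj, f, mu, rng):
    """Spread news items with SIR dynamics and parallel updates.

    adj: dict mapping each node to a list of its neighbours (hypothetical input format).
    Returns (R, T): R[i] is the set of news ids received by node i and
    T[i][a] is the time step at which node i received news a.
    """
    nodes = sorted(adj)
    R = {i: set() for i in nodes}
    T = {i: {} for i in nodes}
    news_id = 0
    for source in nodes:
        if rng.random() >= f:      # each node submits one news item with probability f
            continue
        R[source].add(news_id)
        T[source][news_id] = 0     # assumption: the submitter is stamped with step 0
        infected = [source]
        step = 0
        while infected:
            step += 1
            newly_infected = set()
            for i in infected:
                for j in adj[i]:
                    # a node is susceptible if it has not yet received this news item
                    if news_id in R[j] or j in newly_infected:
                        continue
                    if rng.random() < mu:
                        newly_infected.add(j)
            # parallel update: receipts are recorded only after all attempts of this step
            for j in newly_infected:
                R[j].add(news_id)
                T[j][news_id] = step
            infected = sorted(newly_infected)   # the previous spreaders are now recovered
        news_id += 1
    return R, T
```

With μ = 1 on a connected network every news item reaches every node and T_iα equals the shortest-path distance from the submitter, which is a convenient sanity check for any implementation of this rule.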

Table 1 Variable notations in this paper.

Similarity metrics

The methods we use to reconstruct the network are based on node similarity. The basic idea is that nodes receiving many news items in common are similar and tend to be linked in the network. Therefore, the similarity s_ij between node pair ij can be used to estimate the likelihood L_ij for two nodes to have a link in the network. With R, many similarity methods can be used to calculate the similarity between nodes. The performance of these methods has been extensively investigated in ref. 36. Here, we mainly consider four representative methods: the Common Neighbors (CN)¹, Jaccard (Jac)³⁷, Resource Allocation (RA)³⁸ and Leicht-Holme-Newman (LHN)³⁹ indices.

We select these four indices because we want to explore different types of similarity definitions. The CN and RA similarities favor high degree nodes. The Jaccard similarity reduces the advantage of high degree nodes by normalizing the number of common news items by the size of the union of the received news. The LHN similarity penalizes high degree nodes even more than the Jaccard similarity. By comparing the results of CN, Jac and LHN, we can investigate the influence of different degree penalty schemes (i.e. CN: no penalty; Jac: medium penalty; LHN: strong penalty) on the network reconstruction results.
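To make the different penalty schemes concrete, the sketch below computes the four indices from the received-news sets. It assumes the standard forms of these indices transferred to the news setting: Jaccard divides by the size of the union of received news, RA weights each common news item by the inverse of the number of nodes that received it, and LHN divides by the product of the numbers of news items received by the two nodes. The function name and data layout are hypothetical.

```python
from itertools import combinations


def classic_similarities(R):
    """R: dict mapping node -> set of received news ids.

    Returns a dict metric name -> {(i, j): score} for all pairs with i < j.
    """
    nodes = sorted(R)
    # coverage[a]: number of nodes that received news a (assumed RA normalisation)
    coverage = {}
    for i in nodes:
        for a in R[i]:
            coverage[a] = coverage.get(a, 0) + 1

    scores = {"CN": {}, "Jac": {}, "RA": {}, "LHN": {}}
    for i, j in combinations(nodes, 2):
        common = R[i] & R[j]
        union = R[i] | R[j]
        degree_product = len(R[i]) * len(R[j])
        scores["CN"][(i, j)] = float(len(common))                          # no degree penalty
        scores["Jac"][(i, j)] = len(common) / len(union) if union else 0.0  # medium penalty
        scores["RA"][(i, j)] = sum(1.0 / coverage[a] for a in common)
        scores["LHN"][(i, j)] = len(common) / degree_product if degree_product else 0.0  # strong penalty
    return scores
```

For two nodes that both received many news items, the LHN denominator grows with the product of their counts while the Jaccard denominator grows only with their union, which is why LHN suppresses high degree nodes more strongly.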

Since we have access to the time step T_iα at which news α is received by node i, we can further improve the similarity with T_iα. If two nodes receive the news at closer time steps, they are more likely to be connected in the network. Therefore, for each similarity method, we design an improved method based on the temporal information of the news propagation. The improved methods are respectively the Temporal Common Neighbors (TCN), Temporal Jaccard (TJac), Temporal Resource Allocation (TRA) and Temporal Leicht-Holme-Newman (TLHN) indices. The detailed description of the methods can be found in the Methods section.
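A minimal sketch of the temporal idea, assuming each common news item is weighted by the inverse of the absolute time-step difference and contributes nothing when both nodes received it in the same step (the convention stated in the Methods section). The use of the absolute value, the function names and the Jaccard-style normalisation by the union size are assumptions of this example.

```python
from itertools import combinations


def temporal_common_neighbours(R, T):
    """R[i]: set of news ids received by i; T[i][a]: time step at which i received a."""
    scores = {}
    for i, j in combinations(sorted(R), 2):
        total = 0.0
        for a in R[i] & R[j]:
            dt = abs(T[i][a] - T[j][a])
            if dt > 0:              # receipts in the same step are given zero weight
                total += 1.0 / dt
        scores[(i, j)] = total
    return scores


def temporal_jaccard(R, T):
    """Temporal weights in the numerator, size of the news union in the denominator."""
    tcn = temporal_common_neighbours(R, T)
    scores = {}
    for (i, j), value in tcn.items():
        union = len(R[i] | R[j])
        scores[(i, j)] = value / union if union else 0.0
    return scores
```

A pair that received a news item in consecutive steps gets weight 1 for that item, while a pair separated by several steps gets a smaller weight, so direct neighbours are favoured over nodes that merely share the same news.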

Metrics

We adopt three metrics to evaluate the performance of different methods. The first one is the standard metric of the area under the receiver operating characteristic curve (AUC)⁴⁰. Each method above gives a score to all node pairs in the network and the AUC represents the probability that a true link has a higher score than a nonexisting link. To obtain the value of the AUC, we pick a true link and a nonexisting link in the network and compare their scores. We randomly pick n pairs of such links in total. The number of times that the real link has a higher similarity score s_ij than the nonexisting link is denoted as n₁. Moreover, we use n₂ to denote the number of times that the real link and the nonexisting link have the same score s_ij. The AUC value is then calculated as follows:

AUC = (n₁ + 0.5 n₂) / n.

If links were ranked at random, the AUC value would be equal to 0.5. We tested different values of n and found that the AUC in different realizations is already very stable for n > 10⁴. Therefore, we set n = 10⁵ in this paper.
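The sampling procedure for the AUC can be sketched as follows. The function name `sampled_auc` and the representation of scores as a dictionary keyed by node pairs are assumptions of this example; sampling is done with replacement.

```python
import random


def sampled_auc(scores, true_links, n, rng):
    """Estimate the AUC from n random (true link, nonexisting link) comparisons.

    scores: {(i, j): s_ij} for all node pairs; true_links: set of pairs that are real links.
    """
    real = sorted(p for p in scores if p in true_links)
    fake = sorted(p for p in scores if p not in true_links)
    n1 = 0   # comparisons in which the real link scores higher
    n2 = 0   # comparisons in which both scores are equal
    for _ in range(n):
        s_real = scores[rng.choice(real)]
        s_fake = scores[rng.choice(fake)]
        if s_real > s_fake:
            n1 += 1
        elif s_real == s_fake:
            n2 += 1
    return (n1 + 0.5 * n2) / n
```

Counting ties with weight 0.5 is what makes a completely uninformative score, where all pairs are tied, give exactly 0.5.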

The second and third metrics require the reconstruction of the network. The node pairs are ranked in descending order according to s_ij and the E top-ranked links (we assume that we know roughly the number of real links in the network) are used to reconstruct the network. The precision of the reconstruction, as the second metric, is assessed by the overlap of the links in the reconstructed network and the real network. If m out of the E top-scoring links also occur in the real network of size E, the precision is m/E. The precision metric can be regarded as a complementary measurement to AUC. The third metric is the Pearson correlation between node degree in the reconstructed network and in the real network. AUC and precision measure the reconstruction performance at the individual level, i.e. whether the top-ranked links exist in the network or not. The degree correlation, on the other hand, evaluates the methods at a collective level, i.e. whether the methods can correctly infer the degree of nodes.
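The two reconstruction-based metrics can be illustrated with a short sketch. The function names are hypothetical, and ties in the ranking are broken deterministically by pair label here purely to keep the example reproducible; the text itself describes random selection among tied pairs.

```python
import math


def reconstruct_top_links(scores, E):
    """Keep the E highest-scoring pairs (ties broken by pair label, an assumption)."""
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return set(ranked[:E])


def precision(predicted, true_links):
    """Fraction m/E of the E predicted links that are real links."""
    return len(predicted & true_links) / len(true_links)


def degree_correlation(predicted, true_links, nodes):
    """Pearson correlation between node degrees in the two networks."""
    def degrees(links):
        deg = {v: 0 for v in nodes}
        for i, j in links:
            deg[i] += 1
            deg[j] += 1
        return [deg[v] for v in nodes]

    x, y = degrees(predicted), degrees(true_links)
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")   # the correlation is undefined when all degrees are equal
    return cov / math.sqrt(var_x * var_y)
```

A reconstruction can score well on precision and still distort the degree sequence, for example when the few wrong links are all attached to the same low degree nodes, which is why the two quantities are reported separately.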

Artificial networks

We first analyze the methods in two classic artificial networks: (i) Small-World networks (SW), generated by the Watts-Strogatz model⁴¹ and (ii) Scale-free networks, generated by the Barabasi-Albert model (BA)⁴². The spreading process has two parameters: infection rate μ and news submission probability f. With the Common Neighbor (CN) method as an example (see the results of other methods in Fig. S1, Fig. S2 and Fig. S3 in the Supplementary Information), we study the influence of these two parameters on the network reconstruction results in Fig. 1. The AUC, precision and degree correlation in the parameter space (μ, f) for both BA and SW networks are shown. One can see that μ significantly affects the results in each panel. In BA networks, the values of μ that give the highest AUC, precision and degree correlation are nearly the same (around 0.1). However, in SW networks the optimal μ for AUC and precision is different from the optimal μ for degree correlation. More specifically, to achieve the highest AUC and precision, μ in SW needs to be around 0.15, whereas the best μ for degree correlation is around 0.25. In ref. 33, it has already been pointed out that the optimal μ for AUC is roughly equal to 1/〈k〉. Unlike μ, the effect of f on the results is monotonic. All three metrics increase remarkably as f increases. Once f exceeds a threshold, the three metrics are affected only slightly by f (see Fig. S4 in the SI for the dependence of the three metrics on f).

We further compare the performance of different similarity methods. To this end, we present the dependence of AUC, precision and degree correlation on μ for the CN, Jac, RA and LHN methods in Fig. 2. As the observed propagation results in real systems are usually limited, we use a relatively small f in this figure, i.e. f = 0.5. As we discussed for Fig. 1, when CN is applied, one can observe a pronounced peak when tuning μ. The reason for this peak has already been explained in ref. 33. Here, the interesting phenomenon happens when different similarity methods are compared. For Jac and LHN, the peaks in AUC still exist. However, when precision and degree correlation are considered, the curves of these two metrics drop suddenly within a certain range of μ, which we refer to as the “special range” of μ. The “special range” is due to two reasons: the similarity degeneracy and the degree penalty of the similarity metrics. The similarity degeneracy mainly explains the “special range” in the CN method. It means that when μ is in the “special range” so many node pairs have the same similarity that one cannot set a simple threshold to cut the top-E links to reconstruct the propagation network. In this case, many links need to be randomly selected from a large number of candidates, resulting in a low reconstruction precision. The degree penalty mainly explains the “special range” in the Jac and LHN methods. These similarity metrics overwhelmingly suppress high degree nodes, so that the links are mostly assigned to nodes that are supposed to have low degree. A quantitative analysis and explanation of the “special range” are reported in the SI.
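The similarity degeneracy can be made visible by counting how many pairs are tied at the score of the E-th ranked pair. The helper below is a hypothetical diagnostic written for this illustration, not part of the method described in the paper.

```python
def tie_at_cutoff(scores, E):
    """Inspect the ranking at the top-E cutoff.

    Returns (threshold, open_slots, tied): the score of the E-th ranked pair,
    the number of top-E slots that must be filled from pairs with that score,
    and the number of pairs that share that score.
    """
    ranked = sorted(scores.values(), reverse=True)
    threshold = ranked[E - 1]
    above = sum(1 for s in ranked if s > threshold)
    tied = sum(1 for s in ranked if s == threshold)
    return threshold, E - above, tied
```

When `tied` is much larger than `open_slots`, most of the reconstructed links are effectively drawn at random from the tied candidates, which is the situation described for CN in the “special range”.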

During the news propagation process, the time stamp at which the news reaches each node is recorded. We thus use the temporal information of the news propagation to improve the existing similarity methods (see the Methods section). Here, we present the advantage of these temporal similarity methods in Figs 3 and 4. In Fig. 3, we show the dependence of the AUC on f and μ. In Fig. 3(a), μ = 1/〈k〉 and one can see that TCN and TJac significantly outperform CN and Jac, respectively (see the results of other temporal similarity methods in Fig. S5 in the SI). In Fig. 3(c), μ = 1/〈k〉 again, but the curves of the original similarity methods and the temporal similarity methods overlap, indicating that the received news under this μ dominates the similarity. In Fig. 3(b, d), one interesting feature of the temporal similarity methods can be observed when tuning μ. When μ is large, the AUC of the classic similarity methods is very low. This is because the news proposed by every node can reach a large part of the network, so that the news coverage can no longer reflect the topology of the network. However, when the TCN and TJac methods are applied, the AUC can remain close to 1 even when μ is as large as 0.1. These results indicate that the temporal information is crucial to the network reconstruction from the propagation processes. However, we have to remark that, when μ is small, as we see in Fig. 3, the temporal information cannot improve the AUC.

In Fig. 4, we study the dependence of degree correlation on f and μ respectively when the temporal similarity methods are used. Clearly, the temporal similarity methods cannot improve the correlation and the special range of μ still exists. This is easy to understand as the degree correlation is mainly determined by the normalization factor of the similarity methods. Therefore, when selecting the temporal similarity method, one still needs to be very careful, as an inappropriate method may still result in a negative degree correlation and very low reconstruction accuracy.

As shown above, the different similarity metrics yield very different results when varying the spreading parameters. In practice, one needs to estimate the spreading parameters before selecting the most appropriate similarity metric to reconstruct the network. For instance, in a social network context, μ can be estimated by the mean-field approximation of the epidemic spreading process. By fitting the evolution of the number of infected nodes with the mean-field curve, one can roughly estimate the parameter μ in the mean-field model^43,44. As for f, one can estimate it by M/(N * t), where M is the number of news items proposed by users in a period of time t and N is the number of users in the social network. These three values are normally publicly accessible in real online systems.

Real undirected networks

We further apply the methods to real networks. Firstly, the methods are applied to real undirected networks. We consider nine empirical networks including both social networks and nonsocial networks: (i) Dolphin: an undirected social network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand⁴⁵. (ii) Word: adjacency network of common adjectives and nouns in the novel David Copperfield written by Charles Dickens⁴⁶. (iii) Jazz: a music collaboration network obtained from the Red Hot Jazz Archive digital database. It includes 198 bands that performed between 1912 and 1940, with most of the bands from 1920 to 1940⁴⁷. (iv) E.coli: the metabolic network of E.coli⁴⁸. (v) USAir: the US air transportation network, a dataset publicly available at http://vlado.fmf.uni-lj.si/pub/networks/data/default.htm. (vi) Netsci: a coauthorship network between scientists who published on the topic of network science⁴⁶. (vii) Email: an email communication network⁴⁹. (viii) TAP: a yeast protein binding network generated by tandem affinity purification experiments⁵⁰. (ix) PPI: a protein-protein interaction network⁵¹. We only take into account the giant component of these networks. This is because the similarity score of a pair of nodes located in two disconnected components is zero according to CN and its variants.
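Restricting a network to its giant component is a routine preprocessing step. A minimal breadth-first-search sketch is given below, assuming the network is stored as a dictionary of neighbour lists (a hypothetical format chosen for this example).

```python
from collections import deque


def giant_component(adj):
    """Return the adjacency dict restricted to the largest connected component."""
    seen = set()
    best = set()
    for start in adj:
        if start in seen:
            continue
        component = {start}
        queue = deque([start])
        while queue:                      # breadth-first search from `start`
            u = queue.popleft()
            for v in adj[u]:
                if v not in component:
                    component.add(v)
                    queue.append(v)
        seen |= component
        if len(component) > len(best):
            best = component
    return {u: [v for v in adj[u] if v in best] for u in best}
```

Since news cannot cross between disconnected components, two nodes in different components never share a news item, so their similarity is zero under every index built on common news.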

The results of the similarity methods on these networks are reported in detail in Table 2. Consistent with the results in the artificial networks, the temporal similarity methods significantly outperform the classic similarity methods (though not necessarily in degree correlation). In Table 2, TJac outperforms TCN in both AUC and precision. The results of the TLHN and TRA methods are reported in Table S2. The special range is also observed when the LHN method is applied to real networks. For example, in the email network, the degree correlation becomes negative when μ > 0.1 and the precision is significantly lowered (from 0.2 to 0.02). However, we also observe that Jac no longer leads to the sudden drop of correlation and precision in the real networks we considered. Comparing all the methods, the TRA method generally enjoys the highest accuracy.

Table 2 Basic properties of real undirected networks and the performance of the CN, TCN, Jac and TJac methods on these networks.

Real directed networks

The methods are also applied to real directed networks. We considered several real directed networks to validate our methods. Results of TCN and TJac are shown in Table 3 and results of the TLHN and TRA methods are shown in Table S4. The networks include Prisoners (friendship network between prisoners, dataset available at http://www.casos.cs.cmu.edu/index.php), St. Marks FW (food web in the St. Mark area collected by http://www.cosinproject.org/), C. elegans neural (neural network of C. elegans)⁵², C. elegans metabolic (metabolic network of C. elegans)⁵² and PB (hyperlinks between the blogs of politicians, available at http://incsub.org/blogtalk/images/robertackland.pdf).

Table 3 Basic properties of real directed networks and the performance of the CN, TCN, Jac and TJac methods on these networks.

As in the undirected networks, the temporal similarity methods have a much higher AUC and precision than the classic similarity methods. However, one can also see that AUC and precision in directed networks are on average lower than in undirected networks. This indicates that it is generally more difficult to reconstruct directed networks via similarity metrics. We also studied the effect of μ on the results in directed networks. We observe that the improvement of the temporal similarity methods becomes more significant when μ is larger. Moreover, the special range of both the Jac and LHN methods exists when adjusting μ in directed networks. Taking the Neural network as an example, when LHN is applied and μ > 0.08, the degree correlation becomes negative and the precision decreases from 0.15 to 0.07. We remark that the results on other networks are similar.

We select real networks from diverse backgrounds in order to study the performance of the similarity methods in different situations. Tables 2 and 3 show that the method with the highest accuracy is almost unchanged across networks. This means that the performance of similarity methods with respect to accuracy is robust. However, when the degree correlation is measured, the results depend more on the networks, as shown in Tables 2 and 3. The degree correlation measures whether the node degree in the reconstructed network is correlated with the node degree in the real network. In this case, a method that performs well in one type of network is not guaranteed to perform well in other types of networks. For example, if the degree distribution of the real network is very heterogeneous, CN would work better in recovering the node degree (as the nodes’ CN similarity score is proportional to their degree). If the degree distribution is homogeneous, Jac or LHN similarity measures may outperform CN in degree correlation due to the higher accuracy.

Other similarity metrics

Besides the four similarity metrics, we tested some other similarity metrics such as the Cosine index (Cos)⁵³, Hub depressed index (HDI)³⁸, Hub promoted index (HPI)⁴², Sorensen index (SSI)⁵⁴, Preferential attachment index (PA)⁴², Asymmetric Index (AS)³³. For each method, we also study its temporal version. The description of these methods and their results are presented in SI (see Fig. S10 and Table S5, S6, S7).

We study the influence of different parameters (i.e. N, 〈k〉, μ) on the performance of different similarity metrics in network reconstruction. We find that the temporal similarity metrics significantly outperform the corresponding traditional similarity metrics, especially when μ is large. When 〈k〉 increases, the precision of both traditional and temporal similarity metrics tends to increase. When N increases, the precision of both traditional and temporal similarity metrics tends to decrease. However, as 〈k〉 and N increase, the temporal metrics consistently outperform the traditional metrics. Therefore, it is better to use the temporal similarity metrics to reconstruct networks.

When different similarity metrics are compared, we find that the CN and RA indices have a smaller drop of precision in the “special range” than other similarity metrics such as LHN, SSI, HPI, HDI, Cos and Jac. This is because the latter group of metrics all have some form of penalty based on node degree. In LHN, the drop of precision in the special range is most significant. The “special range” effect is much less obvious when the temporal similarity metrics are used. In LHN, however, an observable drop of precision in the “special range” still exists. This is because the degree penalty is most severe in LHN. We then compare the results of different metrics on SW and BA networks. In SW networks, all the temporal metrics can reach a very high precision (close to 1) when μ is large. However, the TRA method reaches its highest value later (i.e. a larger μ is needed) than the other methods. In BA networks, THPI reaches the highest precision.

In summary, if the time information of the spreading is unknown, it is better to use RA and CN to reconstruct the network as their precision is not affected much by the “special range” effect. If the time information of the spreading is available, it is better to use THPI to reconstruct the network, as it performs similarly to other metrics in homogeneous networks and performs best in heterogeneous networks.

Discussion

In this paper, we applied several standard similarity metrics to reconstruct the propagation network based on the observed spreading results. We find that even though some similarity methods such as Jaccard and LHN perform well in link prediction, they may cause problems when they are used to reconstruct networks, as they penalize too heavily the nodes that received many news items and assign a large number of links to the nodes that are supposed to have low degree. We find that the resource allocation method not only has high reconstruction accuracy, but also yields structural properties similar to those of the original network. Finally, we take into account the temporal information of the propagation process and we find that such information can significantly improve the reconstruction accuracy of the existing similarity methods, especially when the infection rate is large.

The range of the infection rate in which the performance of Jaccard and LHN suddenly drops is denoted as a “special range” in this paper. The special range cannot be observed if one uses AUC to assess the network inference results. It can only be seen when one picks the top-ranked predicted links and uses them to reconstruct the network. This means that in the “special range”, even though existing links are still highly ranked in general by these link prediction algorithms (high AUC), only a few of them are actually located at the top of the ranking (low precision). Therefore, the discovery of this “special range” not only warns that a well-performing similarity method is not necessarily effective in all cases, but also highlights the fact that the precision of the predicted links needs to be measured when judging the performance of the similarity methods. This is also an important message for link prediction research, in which AUC is usually adopted as the only metric to evaluate the prediction results.

Some problems remain unsolved. For example, our methods currently require full time information. When only partial time information is available, the temporal similarity methods must be modified. In addition, our work only considers the simplest epidemic spreading model. Other more realistic models describing disease contagion and information propagation need to be examined⁵⁵. Furthermore, similar problems in other fields also need to be addressed. For instance, most link prediction methods are based on the observed network topology. When the time information of the observed links is available, the similarity methods should be modified accordingly to incorporate the temporal information of the network. Node similarity is also a basic network feature for community detection. Improving the community detection accuracy with the time information could be an important problem. We believe that our work will inspire possible solutions to the above-mentioned problems in the near future.

Methods

The original similarity methods and the improved ones based on time information are listed below.

(i) Common Neighbours (CN) The common neighbor index is the simplest one to measure node similarity by directly counting the overlap of news received, namely

s_ij = Σ_α R_iα R_jα,

where R_iα = 1 if i has received news α and R_iα = 0 otherwise.

(ii) Temporal Common Neighbours (TCN) This method, based on the common neighbor index, takes into account the time-step difference between two nodes receiving the news in common. The formula reads

where T_iα records the time step at which i received α. If two nodes receive the news at a closer time step, they are more likely to be connected in the network.

(iii) Jaccard Index (Jac) This index was proposed by Jaccard³⁷ over a hundred years ago. It can prevent the large degree nodes from having too high similarity with other nodes. The index is defined as

(iv) Temporal Jaccard Index (TJac) The Jaccard index can also be improved by T_iα as

(v) Resource Allocation Index (RA) The similarity between i and j is defined as the amount of resource j received from i³⁸, which is

(vi) Temporal Resource Allocation Index (TRA) The improved RA method reads

(vii) Leicht-Holme-Newman Index (LHN) This index assigns high similarity to node pairs that have many common neighbours compared to the expected number of such neighbours³⁹. It is defined as

(viii) Temporal Leicht-Holme-Newman Index (TLHN) Similar to the above three improved methods, the formula is

In all the temporal similarity methods above, we set (T_iα − T_jα)⁻¹ = 0 when T_iα = T_jα. In this case, i is definitely not the node that passes the news to j, so i and j are unlikely to be connected in the network. We adopt this setting because it suits our step-by-step spreading model. Note that in other problems such as link prediction and recommendation, the case of T_iα = T_jα may have to be treated differently.
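The displayed formulas for indices (ii) to (viii) are not reproduced in this text. As a sketch only, the verbal definitions above are consistent with the following forms. The shorthand k_i, k_α, c_ij and w_ij^α is introduced here for illustration and does not appear in the original; the absolute value in the time difference and the normalization of RA by k_α (following the usual resource allocation index) are assumptions.

```latex
% Assumed shorthand, not defined in the text above:
%   k_i      = \sum_\alpha R_{i\alpha}   (number of news items received by node i)
%   k_\alpha = \sum_i R_{i\alpha}        (number of nodes that received news \alpha)
%   c_{ij}   = \sum_\alpha R_{i\alpha} R_{j\alpha}   (number of common news items)
%   w_{ij}^{\alpha} = |T_{i\alpha} - T_{j\alpha}|^{-1}, set to 0 when T_{i\alpha} = T_{j\alpha}
\begin{align*}
s_{ij}^{\mathrm{CN}}  &= \sum_{\alpha} R_{i\alpha} R_{j\alpha}, &
s_{ij}^{\mathrm{TCN}} &= \sum_{\alpha} R_{i\alpha} R_{j\alpha}\, w_{ij}^{\alpha}, \\
s_{ij}^{\mathrm{Jac}} &= \frac{c_{ij}}{k_i + k_j - c_{ij}}, &
s_{ij}^{\mathrm{TJac}} &= \frac{\sum_{\alpha} R_{i\alpha} R_{j\alpha}\, w_{ij}^{\alpha}}{k_i + k_j - c_{ij}}, \\
s_{ij}^{\mathrm{RA}}  &= \sum_{\alpha} \frac{R_{i\alpha} R_{j\alpha}}{k_{\alpha}}, &
s_{ij}^{\mathrm{TRA}} &= \sum_{\alpha} \frac{R_{i\alpha} R_{j\alpha}\, w_{ij}^{\alpha}}{k_{\alpha}}, \\
s_{ij}^{\mathrm{LHN}} &= \frac{c_{ij}}{k_i k_j}, &
s_{ij}^{\mathrm{TLHN}} &= \frac{\sum_{\alpha} R_{i\alpha} R_{j\alpha}\, w_{ij}^{\alpha}}{k_i k_j}.
\end{align*}
```

In each temporal variant only the counting of common news changes, while the degree-dependent normalization of the underlying index is kept, which is in line with the observation in the Results that the temporal information does not change the degree correlation much.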

Additional Information

How to cite this article: Liao, H. and Zeng, A. Reconstructing propagation networks with temporal similarity. Sci. Rep. 5, 11404; doi: 10.1038/srep11404 (2015).

References

1. Newman, M. E. The structure and function of complex networks. SIAM Rev. 45, 167–256 (2003).
2. Papadopoulos, F., Kitsak, M., Serrano, M. Á., Boguñá, M. & Krioukov, D. Popularity versus similarity in growing networks. Nature 489, 537–540 (2012).
3. Clauset, A., Moore, C. & Newman, M. E. Hierarchical structure and the prediction of missing links in networks. Nature 453, 98–101 (2008).
4. Zhou, T. et al. Solving the apparent diversity-accuracy dilemma of recommender systems. Proc. Natl. Acad. Sci. USA 107, 4511–4515 (2010).
5. Guimerà, R. & Sales-Pardo, M. Missing and spurious interactions and the reconstruction of complex networks. Proc. Natl. Acad. Sci. USA 106, 22073–22078 (2009).
6. Liao, H. et al. Ranking reputation and quality in online rating systems. PLoS ONE 9, e97146 (2014).
7. Carmi, S., Havlin, S., Kirkpatrick, S., Shavitt, Y. & Shir, E. A model of Internet topology using k-shell decomposition. Proc. Natl. Acad. Sci. USA 104, 11150–11154 (2008).
8. Serrano, M. Á., Boguñá, M. & Vespignani, A. Extracting the multiscale backbone of complex weighted networks. Proc. Natl. Acad. Sci. USA 106, 6483–6488 (2009).
9. Quax, R., Apolloni, A. & Sloot, P. M. A. The diminishing role of hubs in dynamical processes on complex networks. J. R. Soc. Interface 10, 20130568 (2013).
10. Palla, G., Derényi, I., Farkas, I. & Vicsek, T. Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005).
11. John, B., Sebastian, F., Nicholas, G., Seth, B. & Vincent, A. A. J. Stability in flux: community structure in dynamic networks. J. R. Soc. Interface 8, 1031–1040 (2011).
12. Gfeller, D. & De Los Rios, P. Spectral coarse graining of complex networks. Phys. Rev. Lett. 99, 038701 (2007).
13. Zeng, A. & Lu, L. Y. Coarse graining for synchronization in directed networks. Phys. Rev. E 83, 056123 (2011).
14. Zeng, A. & Cimini, G. Removing spurious interactions in complex networks. Phys. Rev. E 85, 036101 (2012).
15. Ciulla, F., Perra, N., Baronchelli, A. & Vespignani, A. Damage detection via shortest-path network sampling. Phys. Rev. E 89, 052816 (2014).
16. Meloni, S., Arenas, A. & Moreno, Y. Traffic-driven epidemic spreading in finite-size scale-free networks. Proc. Natl. Acad. Sci. USA 106, 16897–16902 (2009).
17. O'Dea, R., Crofts, J. J. & Kaiser, M. Spreading dynamics on spatially constrained complex brain networks. J. R. Soc. Interface 10, 20130016 (2013).
18. Travençolo, B. & da F. Costa, L. Accessibility in complex networks. Phys. Lett. A 373, 89–95 (2008).
19. Buldyrev, S. V., Parshani, R., Paul, G., Stanley, H. E. & Havlin, S. Catastrophic cascade of failures in interdependent networks. Nature 464, 1025–1028 (2010).
20. Kitsak, M. et al. Identification of influential spreaders in complex networks. Nature Phys. 6, 888–893 (2010).
21. Comin, C. H. & da F. Costa, L. Identifying the starting point of a spreading process in complex networks. Phys. Rev. E 84, 056105 (2011).
22. Doerr, B., Fouz, M. & Friedrich, T. Why rumors spread so quickly in social networks. Communications of the ACM 55, 70–75 (2012).
23. Garas, A., Schweitzer, F. & Havlin, S. A k-shell decomposition method for weighted networks. New J. Phys. 14, 083030 (2012).
24. Medo, M., Cimini, G. & Gualdi, S. Temporal effects in the growth of networks. Phys. Rev. Lett. 107, 238701 (2011).
25. Da Silva, R. A. P., Viana, M. P. & da F. Costa, L. Predicting epidemic outbreak from individual features of the spreaders. J. Stat. Mech. 2012, P07005 (2012).
26. Altarelli, F., Braunstein, A., Dall’Asta, L., Lage-Castellanos, A. & Zecchina, R. Bayesian inference of epidemics on networks via belief propagation. Phys. Rev. Lett. 112, 118701 (2014).
27. Battiston, S., Puliga, M., Kaushik, R., Tasca, P. & Caldarelli, G. DebtRank: too central to fail? Financial networks, the FED and systemic risk. Sci. Rep. 2, 541 (2012).
28. Robinson, L. D., Hermans, A., Seipel, T. A. & Wightman, M. R. Monitoring rapid chemical communication in the brain. Chem. Rev. 108, 2554–2584 (2008).
29. De Masi, G., Iori, G. & Caldarelli, G. Fitness model for the Italian interbank money market. Phys. Rev. E 74, 066112 (2006).
30. Bullmore, E. & Sporns, O. Complex brain networks: graph theoretical analysis of structural and functional systems. Nat. Rev. Neurosci. 10, 186–198 (2009).
31. Chen, Y., Paul, G., Havlin, S., Liljeros, F. & Stanley, H. E. Finding a better immunization strategy. Phys. Rev. Lett. 101, 058701 (2008).
32. Shen, Z., Wang, W. X., Fan, Y., Di, Z. R. & Lai, Y. C. Reconstructing propagation networks with natural diversity and identifying hidden sources. Nat. Commun. 5, 4323 (2014).
33. Zeng, A. Inferring network topology via the propagation process. J. Stat. Mech. 11, 11010 (2013).
34. Dorogovtsev, S. N., Goltsev, A. V. & Mendes, J. F. F. Critical phenomena in complex networks. Rev. Mod. Phys. 80, 1275 (2008).
35. Moreno, Y., Nekovee, M. & Pacheco, A. F. Dynamics of rumor spreading in complex networks. Phys. Rev. E 69, 066130 (2004).
36. Lu, L. Y. & Zhou, T. Link prediction in complex networks: a survey. Physica A 390, 1150–1170 (2011).
37. Jaccard, P. Étude comparative de la distribution florale dans une portion des Alpes et des Jura. Bulletin de la Société Vaudoise des Sciences Naturelles 37, 547 (1901).
38. Zhou, T., Lu, L. Y. & Zhang, Y. C. Predicting missing links via local information. Eur. Phys. J. B 71, 623–630 (2009).
39. Leicht, E. A., Holme, P. & Newman, M. E. Vertex similarity in networks. Phys. Rev. E 73, 026120 (2006).
40. Hanley, J. A. & McNeil, B. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36 (1982).
41. Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998).
42. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Barabasi, A. L. Hierarchical organization of modularity in metabolic networks. Science 297, 1551–1555 (2002).
43. Bailey, N. T. J. The mathematical theory of infectious diseases and its applications (Hafner Press, New York, 1975).
44. Chen, D. B., Xiao, R. & Zeng, A. Predicting the evolution of spreading on complex networks. Sci. Rep. 4, 6108 (2014).
45. Lusseau, D. et al. Incorporating uncertainty into the study of animal social networks. Behav. Ecol. Sociobiol. 54, 1809–1815 (2003).
46. Newman, M. E. Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006).
47. Gleiser, P. M. & Danon, L. Community structure in jazz. Adv. Complex Syst. 6, 565 (2003).
48. Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. & Barabasi, A. L. The large-scale organization of metabolic networks. Nature 407, 651–654 (2000).
49. Guimera, R., Danon, L., Diaz-Guilera, A., Giralt, F. & Arenas, A. Self-similar community structure in a network of human interactions. Phys. Rev. E 68, 065103 (2003).
50. Gavin, A. C. et al. Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 631–636 (2002).
51. Mering, C. V. et al. Comparative assessment of large-scale data sets of protein-protein interactions. Nature 417, 399–403 (2002).
52. Duch, J. & Arenas, A. Community detection in complex networks using extremal optimization. Phys. Rev. E 72, 027104 (2005).
53. Salton, G. & McGill, M. J. Introduction to modern information retrieval (McGraw-Hill, Auckland, 1983).
54. Sorensen, T. A method of establishing groups of equal amplitude in plant sociology based on similarity of species and its application to analyses of the vegetation on Danish commons. Biol. Skr. 5, 1–34 (1948).
55. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M. & Hwang, D. U. Complex networks: structure and dynamics. Phys. Rep. 424, 175–308 (2006).

Acknowledgements

We thank Prof. Yi-Cheng Zhang and Dr. Matus Medo for fruitful discussions and comments. This work was partially supported by the EU FP7 Grant 611272 (project GROWTHCOM) and the Opening Foundation of Alibaba Research Center for Complex Sciences, Hangzhou Normal University (Grant No. PD12001003002008 and PD12001003002006). A.Z. acknowledges support from the Youth Scholars Program of Beijing Normal University (Grant No. 2014NT38). H.L. acknowledges support from the Guangdong Key Laboratory Projects (Grant No. 2012A061400024 and 2014A030313553), the China 863 project (Grant No. 2015AA015305) and NSF China projects (Grant No. U1301252, 61170076, 61471243).

Author information

Authors and Affiliations

Guangdong Province Key Laboratory of Popular High Performance Computers, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, P.R. China
Hao Liao
School of Systems Science, Beijing Normal University, Beijing, 100875, P.R. China
An Zeng
Institute of Information Economy, Alibaba Business School, Hangzhou Normal University, Hangzhou, 310036, P.R. China
Hao Liao & An Zeng
Department of Physics, University of Fribourg, Chemin du Musée 3, Fribourg, CH-1700, Switzerland
Hao Liao

Contributions

A.Z. designed the research, H.L. performed the experiments, H.L. and A.Z. analysed the data and wrote the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Received: 10 December 2014
Accepted: 20 May 2015
Published: 18 June 2015
DOI: https://doi.org/10.1038/srep11404
