Timing Information Propagation in Interactive Networks

Animal behavior is greatly influenced by interactions with peers as well as with the environment. Understanding the flow of information between individuals can help decipher their behavior. This applies at both the microscopic and macroscopic levels, from cellular communication to coordinated actions by humans. The aim of this work is to provide a simple but sufficient model of information propagation with which to learn from natural coordinated behavior, and to apply this knowledge to engineered systems. We develop a probabilistic model to infer the propagation of information in a network of communicating agents with different degrees of interaction affinity. A further focus of the work is estimating the time needed to reach an agreement among all agents. We experiment with swarms of robots to emulate the communication of biological and social media groups, for which we are able to provide upper bounds on the time needed to reach a global consensus, as well as to identify individuals responsible for slow convergence.

therefore at the epicentre of the emergence of collective behaviors is rather naive. In an effort to move past this simple model, we introduce a stochastic element to the interactions between individuals within their range of communication. The next section defines the characteristics of this probabilistic variable and alludes to the physical elements in nature that it can model.
Our contribution is twofold. First, we aim to probabilistically model the propagation of information in a network of interacting agents. Many works have proposed complex models to represent the propagation of information, especially infection spreading [20][21][22][23], with a large number of parameters and settings. These models are built on the assumption that the infection rate, the state of the individual, the time and age of infection, etc. are known, which can be difficult to establish in a real case study. The work proposed in 21 goes as far as to require a pedestrian model to accurately estimate infection transmission in air travel. In addition, most proposed propagation models rely on scenario-specific parameters, such as a Susceptible-Infected-Recovered (SIR) model for infection spreading, which cannot be used to study the propagation of signals in animal groups, for instance. The strength of our model lies in abstracting the quantity being propagated to a piece of information (visual cues, chemical signals, messages) with the likelihood of transmission as an attribute (line of sight, infection probability, influence of users). This eliminates the dependence of the model on the scenario being studied and renders it applicable to multiple domains of study. Here, we model information propagation by stripping away these details down to a few assumptions, namely (1) a static or slowly changing network, (2) the propagation of a single piece of information and (3) an information transmission probability per node. The purpose is to show that this simple model is enough to explain the emergence and occurrence of certain events. Behaviors like the synchronized flashing exhibited by Photinus carolinus fireflies, and the tendency of certain fireflies to defect from the group, form one interesting scenario to study 24 . In that regard, the flashing lights are construed as information being sent to the rest of the group.
The model is then a tool to infer the influence of these defectors on others outside their line of sight by studying the propagation of information between individuals in different regions. As a sub-contribution, we explore the validity of centrality as a contributor to information propagation. Centrality has often been a key parameter in techniques for detecting propagation sources and selecting influential nodes in a network [25][26][27] . In this work, we reveal the drawbacks of relying on centrality alone and propose a metric based on conviction and influence probability to boost the propagation of information.
Second, armed with this model, and considering that one of the incentives for studying collective behavior in nature is to gather the knowledge to engineer new systems (e.g. optimized transportation routes, robot swarms as emergency responders), we analyse the timing characteristics of information propagation and the ways it could lead to new technologies. To illustrate this, we study leader-following consensus, common in decision-making problems within groups of interacting individuals, in which the purpose is for the group to reach and agree on the opinion of a leader. This can be modelled as an information propagation paradigm in which an opinion is a piece of information, and convergence means the agents in the network agreeing on it. Reaching consensus has been observed to be a feature that groups, both animal and human, strive for in order to make decisions and establish certainty over a choice of action 28 . When it comes to consensus, research has focused on developing controllers that are more resilient and that guarantee convergence. To our knowledge, little effort has been dedicated to exploring the convergence times of information propagation. The work of Başar et al. 29 defines the expected convergence rate of Quantized Metropolis consensus. The assumption, though, is that the graph remains connected in every sequence and that transmission occurs to a single node per step with uniform probability. A noisy environment increases the chances of information loss, and inter-individual conflicts might arise, especially within heterogeneous groups; all of this increases the time needed to reach a consensus. This has been considered by Cheng et al. 30 , in which the effect of noise-attenuation gain on leader-following consensus was explored to define a bound on the convergence rate. The results were limited to gains of a certain class of functions.
From the field of evolutionary graph theory, the authors of 31 define the exact fixation probability and time of a Moran process for arbitrary small graphs. Most of these works rely on Markov chains to model information propagation, which renders the states intractable as the network grows in size. Our aim is to provide a probabilistic estimate of information propagation time that is of practical use to modern real-time systems, especially those with hard timing constraints. In this work, we integrate the probabilistic model of information propagation with a timing analysis technique to estimate a probabilistic worst case convergence time (pWCCT) for information propagation.

Information from an Observation to a Conviction
Communication among individuals that share feelings of uncertainty, and that are prone to share their state and the state of the environment around them with their neighbors, has been observed to be at the root of the emergence of collective behaviors 2 . The time it takes for a behavior to form and spread in a group relies heavily on the efficiency of the communication medium as well as on the affinity of the individuals to receive and share information; in other words, their affinity for cooperation. The intuition here is to model the direct interaction between an individual and its neighbors, and the indirect interaction with the rest of the group, in a probabilistic manner that projects this affinity and the loss of information.
We want to define the probability of an individual receiving a piece of information broadcast by any other individual in the network, and not necessarily by its neighbors. Suppose that individual A sends a piece of information that will reach individual B with probability p_AB = Pr(B|A). We define the information conviction as the probability of an individual holding the information broadcast by another individual in the network. The conviction of B to have the information is Pr(A) · p_AB. This is true provided B has a single source of information, A in this case. Otherwise, we need to define what it means to share information, and how to condition the information propagation on other sources of information. The example in Figure 1 shows that X has multiple such sources. We model this as a message-passing system in which a message transmits the confidence of a node to pass on the information to a particular neighbor. It can be viewed as an individual sending a message to broadcast its ability to propagate the information; in other words, its conviction of holding the information and the confidence with which it would send it to a neighbor, given what its other neighbors are saying through the received messages. Hence, we recognize two types of messages:
• Received messages represent the messages π received by a node. They indicate the opinion of the source node on how likely the information is to reach a certain node from its neighboring nodes;
• Shared messages define the messages λ that a node shares with its neighbors. Message λ_XY_1, for instance, describes the likelihood of Y_1 getting the information from a specific source X and no other.
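The single-source case above can be computed directly. As a minimal sketch, and purely for illustration, we also assume that convictions arriving from multiple independent sources combine as a noisy-OR; the paper's actual combination rules are the message updates described in Methods, and the function names here are our own.

```python
def conviction_single(pr_source: float, p_transmit: float) -> float:
    """Conviction of B with a single source A: Pr(A) * p_AB."""
    return pr_source * p_transmit

def conviction_multi(incoming: list[float]) -> float:
    """Illustrative noisy-OR of independent incoming convictions
    (an assumption for this sketch, not the paper's message-passing rule)."""
    miss = 1.0  # probability that no source delivers the information
    for pi in incoming:
        miss *= 1.0 - pi
    return 1.0 - miss
```

For instance, two independent sources each giving a 0.5 conviction yield a combined conviction of 0.75 under this assumption.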
The algorithm is local in that a node relies only on the opinion of its neighbors from the received messages to build its conviction of an information reaching it given that the information was observed and broadcast from remote parts of the network. That being said, the interaction graph is defined as a directed graph that can contain cycles (See Methods). Initially, the only observations are the messages shared by the broadcasting source whereas all other messages are initialized to zero. To reach the information conviction of every individual considering direct and indirect interactions, messages are updated iteratively from previous observations of the information propagation in the network. We direct the reader to the Methods section for a detailed description of this process.
For illustrative purposes, a homogeneous network of 50 individuals is depicted in Figure 2A, in which the information has been set to have equal chances of reaching a neighboring node or getting lost in the process from any node in the network. Iterating over the procedure described by Algorithm 1 (see Methods) produces the results in Figure 2, in which we can observe the accumulation of the belief in the reception of an information broadcast at node ν_0 throughout different iterations. Once the algorithm converges, the conviction of every individual to hold the information is as exhibited in Figure 2D. The neighbours of the broadcasting source are observed to have a high conviction that they hold the information, which is expected. Surprisingly though, we notice that individuals such as {ν_14, ν_16, ν_23}, far from the source, with a degree of separation higher than two nodes, and in different clusters altogether for that matter, have quite a high information conviction. This tells us that the propagation of information in a network of interacting individuals might not depend entirely on the distance to the source of the information and line of sight. This leads us to question whether centrality is the reason behind these observations. This notion is further studied in later sections, where we show its validity and determine the circumstances under which it no longer holds.

Probabilistic Worst Case Convergence Time
The transfer of knowledge acquired from studying animal group behaviors to artificial systems has seen a surge of interest in the research community as of late. In light of this, there is a need to study certain characteristics inherent to these modern systems, which might, in turn, aid in understanding some behaviors in animals and humans alike that are still unpredictable. One of these characteristics is timing performance. In particular, this paper aims at providing a probabilistic estimate of the worst case time for a group to reach a consensus over a piece of information. We develop a methodology to obtain the exceedance probability curve (or Complementary Cumulative Distribution Function), which describes the probability that the convergence time will exceed a certain threshold. We refer to this as the probabilistic worst-case convergence time (pWCCT). This is of particular interest to a system designer who might have strict timing restrictions: for example, the need to ensure that the time for a robot swarm to autonomously agree on a task assignment does not exceed a certain deadline. For every individual ν_i, we define I_{ν_i}(t) as a probability mass distribution (PMD) that describes the probability of the individual ν_i receiving the information from its neighbors at time t. The intuition is that, for an individual to receive the information at time t = T, its neighbors that hold the information must have failed to transfer it at t < T. The PMDs I_{ν_i} represent the timing behaviour of the corresponding individuals ν_i. To estimate the timing for a consensus to be reached, we need to look at the timing behavior of the network. Two elements are to be considered: (i) how to translate the individual timing behaviors into a group timing, namely a probabilistic worst case convergence time (pWCCT); and (ii) what possible structures to study to achieve a bound on the pWCCT.
[Figure 2 caption: as new states are observed, the conviction of certain nodes grows larger than others. This is observed not to be entirely dependent on the distance to the broadcasting source, as some nodes in different regions of the network exhibit higher conviction than immediate neighbours.]
In Figure 3, information is propagated to A and B with their corresponding PMDs. To determine the timing of this information flow, we need to enumerate the times at which both A and B have the information, and define the probability of reaching every state. In the figure, every arrow represents one of these cases. Convolving the two distributions produces a probability distribution that describes these possible cases. To estimate the exact timing of the information propagation, one needs to consider all possible paths the information might take to propagate from an individual to the rest of the network. The complexity of this grows exponentially as the size of the network increases. However, since our main concern lies in estimating the worst case convergence time, it is sufficient to consider the longest path in the graph of the network. This is similar to picking one node at a time to receive the information from the rest of the network at every time-step. As one can expect, this consideration results in an upper bound on the convergence time that is quite pessimistic. In reality, individuals in a group are more likely to transmit information to multiple individuals at the same time. In order to tighten this estimate of the probabilistic worst case convergence time, instead of considering the longest path that spans all nodes, we look for a spanning tree with minimum branching. Among other assumptions (detailed in Methods), we have to assume that the information propagates sequentially to all the individuals, as the inclusion of branches hinders the use of convolution.
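The sequential (longest-path) bound above reduces to repeated convolution of the per-node PMDs. A minimal sketch follows, with illustrative geometric-style reception-time PMDs built from a per-node transmission probability; the geometric form and the truncation horizon are our own modelling assumptions for the example, not quantities fixed by the paper.

```python
def convolve(p, q):
    """Discrete convolution of two PMDs indexed by time-step."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def node_pmd(p_success, horizon):
    """Illustrative reception-time PMD: succeed at step t after t failures
    (geometric, truncated at `horizon` steps)."""
    return [(1.0 - p_success) ** t * p_success for t in range(horizon)]

def sequential_chain_pmd(pmds):
    """PMD of the total time along a sequential chain of receptions."""
    total = [1.0]  # identity for convolution: all mass at t = 0
    for pmd in pmds:
        total = convolve(total, pmd)
    return total
```

Chaining n nodes this way yields the distribution of the sum of their reception times, which is exactly the pessimistic sequential bound described above.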
The way to combine the PMDs of the nodes in the branches with the rest of the spanning tree is by defining a merge operator (described more thoroughly in Methods). This takes into account the possibility that certain individuals receive the information at the same time, akin to a parallel process.
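The merge operator itself is defined in Methods; as a hedged sketch, one plausible reading is that a branch is absorbed by taking the distribution of the later of two independent arrival times, computed from the cumulative distributions. This is our own interpretation for illustration, not the paper's exact operator.

```python
def merge(p, q):
    """Sketch of a merge: PMD of max(T_p, T_q) for independent arrival-time
    PMDs p and q, i.e. the time by which BOTH parallel branches have
    received the information (one plausible reading; see Methods)."""
    n = max(len(p), len(q))
    fp = fq = 0.0
    out, prev = [], 0.0
    for t in range(n):
        fp += p[t] if t < len(p) else 0.0  # CDF of p up to t
        fq += q[t] if t < len(q) else 0.0  # CDF of q up to t
        cur = fp * fq                      # Pr(both arrived by t)
        out.append(cur - prev)
        prev = cur
    return out
```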

Results
As mentioned previously, collective behavior is the product of sharing some kind of information among a group of individuals. Our work spans any system that can be represented as a network of agents propagating a piece of information. In what follows, we illustrate the efficiency of our framework on a few examples of information propagation, namely rumour spreading and consensus reaching in a robot swarm. In this section, we showcase the model presented in this paper as (1) a way to study behaviors and understand the reasons certain events progress in the manner that they do; and (2) a practical solution to issues encountered in engineered collective systems.
A Rumour and its Counter-Rumour. We study the case of rumour spreading as an example of information propagation in social media. We analysed 7 major news events from the PHEME rumour dataset 32 , which spans 297 Twitter conversations discussing rumours. We focus on one of the rumours that spread during the Sydney hostage situation of 2014. Figure 4 shows the progression of the rumour claiming that the hostages were held by terrorists associated with ISIS. The rumour started with the following tweet:
@User_zero: SYDNEY SIEGE: Gunman forces hostages to hold up ISIS flag in window.
We observe the tweets that countered the false rumour and their spread throughout the network. We notice an explosion of tweets denying the rumour at around 23:45, after a tweet published by a user that we refer to as Bob (for confidentiality purposes):
@Bob: Flag in window of Sydney Lindt cafe not an ISIS flag. Reads: 'There is no God but Allah and Muhammad is the messenger of God'. [2014-12-14 23:45:51]
However, this was not the first attempt at correcting the false rumour. A tweet previously published by a different user (Alice) revised the claim with visual evidence:
@Alice: These not the same. 1st Shahadah flag, 2nd is specifically claimed by IS(ISIS).
In an endeavour to understand why the first counter-tweet by Alice did not have much of an impact on correcting the rumour whereas the tweet by Bob did, we study the spreading of the information as modelled in previous sections and examine interesting patterns. The procedure to build a group interaction model is fully described in the Methods section; the model is then used to simulate the information propagation in two different scenarios and obtain the conviction of every user in the network to have the information. The first scenario represents the propagation of the information in the network from Alice, and the second scenario sees the information spreading from Bob.
Figure 5a illustrates the conviction of receiving the information by every individual in the network if the information was propagated initially by Alice and Bob respectively, based on their distance from the source. The heat map generated for Bob shows a wider outreach and propagation of the information to the other users in the network compared to Alice. This in part explains the progression of information observed in Figure 4. We also explore how this analysis could be exploited to prevent rumour spread. Figure 5a.1 can be divided into three sections: in the first, Alice propagates the information but fails to reach her immediate followers; the second section exhibits a sudden darkening, which indicates a user with a great conviction of receiving the information from Alice and a higher influence on their immediate followers and indirect relationships; and in a last section, two-thirds of the way through the network, another user is able to further propagate the information through the rest of the network.
Analyzing the spreading of information starting at these critical users results in the heat maps of Figure 5b. Both maps demonstrate a wider outreach than that observed for Alice. These users exhibit high conviction as well as high influence on their direct and indirect neighbours. This could be an incentive to rely on the framework to detect critical users that will spread the right information in times of crisis and quickly quench any rumours that might give rise to unforeseen chaotic behaviors.
Timing and Resilience. We experimented with our framework as an analysis tool where we estimate timing characteristics and study the resilience of a network to loss of information.
pWCCT and Kilobots. An interesting question to answer is how fast the spreading of an infection could occur, either through the body, as with cancerous cells contaminating neighbouring healthy cells 33 , or at the population level, such as the spread of influenza. A recent work tackled this issue to determine the takeover time. The propagation model by Ottino et al. 34 assumed that, at every time-step, an infected node transmits the infection to a single neighbor chosen at random from any infected individual in the network. With our framework, we go beyond this simple model: while we do not claim to estimate the exact probability distribution of the time for infection spreading, we model the probability of the infection spreading from any infected individual and upper bound the worst takeover time in a probabilistic manner under two assumptions: (a) the infection can spread to more than one individual in a single time-step, and (b) every individual might have a different transmission affinity.
Here, we employed a swarm of small robots to emulate the propagation of information (infection, in the example above) throughout a network. To validate the ability of our model to upper bound the worst case convergence time towards a consensus, a set of experiments on a real swarm of robots was performed in which a number of Kilobots 35 form a swarm of different topologies (Figure 6a shows the robots in a scale-free topology), and the aim is to study the time to converge to a consensus over a piece of information. Kilobots are simple robots that rely on infra-red communication, which renders message transmissions very susceptible to noisy environments. Our motivation for experimenting on a real robot swarm is to appraise the performance of our framework when it deals with physical characteristics of swarm communication, such as collision and interference, that are difficult to simulate. Figure 6c plots the results obtained for 30 runs of the same experiment on a scale-free topology. We observe that the distribution of the convergence times has a high variance, partially due to the fact that the scale-free topology contains a number of clusters connected by a small number of links. This renders the nodes at these links highly critical: whether they succeed in transmitting the message or not strongly affects the convergence time. The convergence time recorded for this particular topology was in the range [3.8, 17.6] s.
Looking at what we estimated as the worst case convergence time expected for this topology, we examine Figure 6d. We plot the exceedance probability distributions for mdp ∈ {0.5, 0.6, 0.7, 0.8, 0.85}, varying the message drop probability (mdp). Our interest lies in the plots for mdp_1 = 0.6 and mdp_2 = 0.7 (refer to Methods), which show the worst convergence time estimated at different exceedance probabilities. In other words, WCCT = 170 s and WCCT = 360 s for mdp_1 and mdp_2 respectively, exceeded with a very low probability of Pr = 10^-13. This upper bounds the time to convergence for the swarm of Kilobots, including extreme cases. The pWCCT estimation based on both convolution and merge presented here defines a safe upper bound on the information propagation in a network, one that may be pessimistic when only ordinary situations are expected. However, its use is warranted in hard real-time systems that call for strict requirements on their timing behaviors, such as robot-aided space exploration or robot emergency responders. It also offers a flexible measure to upper bound this timing characteristic by varying the exceedance probability threshold.
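The exceedance curves in Figure 6d are complementary cumulative distributions of the estimated convergence-time PMD. Reading off a pWCCT at a chosen probability threshold can be sketched as follows; the function names and the discrete time-step indexing are our own choices.

```python
def exceedance(pmd):
    """Complementary CDF: Pr(convergence time > t) for each time-step t."""
    tail = sum(pmd)
    out = []
    for p in pmd:
        tail -= p
        out.append(tail)
    return out

def pwcct_at(pmd, threshold):
    """Smallest time-step whose exceedance probability is at or below the
    chosen threshold; tightening the threshold yields a safer bound."""
    for t, pr in enumerate(exceedance(pmd)):
        if pr <= threshold:
            return t
    return len(pmd) - 1
```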
The advantage of our model over what is proposed in the literature is its ability to estimate a pWCCT for different topologies, and its adaptability to different scenarios. In addition to the scale-free topology presented here, results on other topologies, such as a snake-like topology and a topology where the robots are randomly distributed among obstacles, are presented later as additional results.
A Chain is No Stronger than Its Weakest Link. From our study of rumour propagation, it is clear that preventive measures that rely on robustness analysis of the network are of utmost importance. This extends to non-biological systems, such as wireless sensor networks (WSNs). Designing a network that is entirely fault-tolerant can be too expensive in terms of time, cost and expertise. Our model can instead be used to select critical nodes in a WSN to be hardened against faults. Broadly speaking, our model is a tool to analyze a network and detect the weaker individuals that interfere with the propagation of information. To demonstrate this, we simulate a robot swarm in a random geometric topology in which a selected number of robots have been hardened, i.e. they have been given message drop probabilities mdp = 0 to ensure message transfer. In addition, instead of working with a homogeneous swarm, we randomly assign a different mdp to every node in the graph. The purpose behind this choice is to observe networks that are heterogeneous in terms of information propagation (e.g. due to individual preferences, different noise levels in the environment, etc.). We compare against random selection, in which a set of nodes is picked at random. We also mentioned previously that our observations of information propagation might be related to centrality. We examine this claim by comparing our method against a centrality-based selection, which relies on the topology of the network and picks the nodes that have a high out-degree. We run every scenario 100 times while varying the number of selected elements m. This produces the results summarized in Figure 7.
While there is no major improvement in terms of shorter convergence time when the number of selected elements is small, we observe a major decrease in the time to reach consensus with m = 50 selected elements. The first observation shows that although our method reached global consensus in less time than the original scenario (with no hardened elements), the times reached were statistically similar to a centrality-based approach. In this case, since m is small, both techniques were choosing geometrically similar nodes.
The choice of working with a heterogeneous swarm, at least in terms of mdp, is highlighted in the results of Figure 7b,c. The idea is that a centrality-based technique selects nodes with high connectivity regardless of their information conviction I_{ν_i}. Instead, we locate the weaker links in the network that might hinder the spreading of information throughout the swarm, especially in terms of convergence time. We do so by identifying the nodes that have a high conviction of obtaining the information but are failing to transmit it to their neighbours. In other words, we favor a node that we consider a weak link based on its ability to further propagate the information in the network, even if its connectivity is low. For the sake of brevity, we did not include the results for a homogeneous network, but it is interesting to note that our model exhibits behavior similar to centrality-based methods when considering a homogeneous swarm of robots where the mdp is constant across the network. This leads us to conclude that centrality is a good measure for selecting influential nodes; by itself, however, it fails to promote the propagation of information, especially in more complex configurations.
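The weak-link criterion described above (high conviction, poor onward transmission) can be sketched with an illustrative score. The scoring formula and the use of the best outgoing edge probability are our own simplifications for illustration, not the paper's exact metric.

```python
def weak_links(conviction, edge_prob, k):
    """Rank nodes by conviction * (1 - best outgoing transmission
    probability): nodes likely to hold the information but unlikely to
    pass it on are the hardening candidates."""
    score = {}
    for v, c in conviction.items():
        out_p = [p for (u, _), p in edge_prob.items() if u == v]
        score[v] = c * (1.0 - max(out_p, default=0.0))
    return sorted(score, key=score.get, reverse=True)[:k]
```

Under this sketch, a node that almost certainly holds the information but whose best outgoing link is unreliable is ranked above a well-connected node with low conviction, which is the behavior that distinguishes the method from out-degree centrality.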

Discussion
Information propagation effects have been categorized by Arif et al. 14 , in the case of rumours, into four patterns: Giant, Snowball, Fizzle and Babble. The giant and snowball effects, which both exhibit rapid information propagation driven by high- and low-exposure individuals respectively, were of particular interest since they present patterns that emergency responders look out for to maximize spreadability and stave off the emergence of chaotic behaviors. We observe both of these effects in Figure 5, from Bob and Alice respectively. Although the giant effect can be intuitively interpreted, the snowball effect is an observed fact that is not completely understood. Here, we introduced a simple but sufficient model of information propagation with the aim of studying the emergence of behaviors. The purpose was to explore the effect of stripping the propagation model of scenario-specific details, such as the volume of shared tweets or the rate at which the information is transmitted. This showed that the few assumptions made in modeling information propagation were enough to discern the patterns that lead to the emergence of the observed event. The study promotes the practice of bottom-up investigation when it comes to modeling information flow, and of properly identifying and isolating the originator of specific events and behaviors, as is the case with the giant and snowball effects in rumour propagation research. For instance, we show that, although the snowball effect starts with low-exposure individuals, it is mainly due to highly influential individuals picking up the information from low-exposure sources. Since the model quantifies the information conviction of individuals, it is able to detect easily influenced individuals who are themselves apt to influence others.
Being able to identify the major players in information propagation is of utmost importance, especially in crisis situations. We suggest that the framework can readily be used to spread the right information and maximize the likelihood of giant effects. In fact, we exploited this characteristic to analyze artificial networks, in which the framework was used to detect the weaker links that hinder information spreading and, as a result, produced more resilient systems. The framework was extended with a timing analysis technique to study information consensus in a group. We showed that, despite the few assumptions made to model the information propagation, which express neither the full intricacies of the diffusion of information nor those of the environment, it is able to provide a probabilistic measure of the worst convergence time towards a consensus. Future work could see the framework extended to uncover the history of current events, such as determining the time and place of origin of an infection. Admittedly, the framework presented can model a single piece of information with multiple sources; future studies could see it extended to handle multiple, possibly conflicting, pieces of information in the network.

Methods
Group interaction model. The interaction between the group individuals is represented by a directed graph G = (V, E), where V = {ν_i | i ∈ (0, n]} depicts the n individuals in a group and E = {e_ij | (i, j) ∈ (0, n]} the information flow from ν_i to ν_j. The graph can have cyclic interactions, where individuals exchange information in both directions, which makes the existence of both edges e_ij and e_ji possible. This is observed in pods of bottlenose dolphins, where whistles are exchanged to identify whether a dolphin belongs to a certain group 36 . Since we want to study groups of individuals in noisy environments that are apt not to cooperate, a probability of the information propagating from ν_i to ν_j is defined as p_ij = Pr(ν_j | ν_i) and is assigned to every pair (ν_i, ν_j).
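The interaction graph can be sketched as a dictionary of directed edges carrying per-edge transmission probabilities; the class layout and method names are our own, and cycles (both e_ij and e_ji present) are permitted as in the model above.

```python
class InteractionGraph:
    """Directed interaction graph with p_ij = Pr(ν_j receives | ν_i sends)."""

    def __init__(self):
        self.p = {}          # (i, j) -> transmission probability p_ij
        self.nodes = set()

    def add_edge(self, i, j, p_ij):
        self.nodes.update((i, j))
        self.p[(i, j)] = p_ij  # (j, i) may also be added: cycles are allowed

    def out_neighbors(self, i):
        return [j for (u, j) in self.p if u == i]
```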
Conviction through message passing. The message-passing framework implements the idea that a node builds its conviction of an information reaching it, given that the information was observed at one or multiple sources, by "listening" to the opinion of its neighbors about their own observations. This happens through an exchange of messages loosely based on the message-passing algorithm described in 37 . In other words, the messages are a way to virtually strengthen a node's belief that it will hold a piece of information, by observing the likelihood of its neighborhood bringing the information to it. The observations are modelled through the received messages. The node, by updating its conviction, shares a message to broadcast its conviction to hold and transmit the information. The existence of cycles in the graph imposes a recursive process in which a node iteratively reinforces its opinion until all messages converge.
The example in Figure 1 shows that X has multiple neighbors, for which we recognize two types of messages: received messages and shared messages. The received messages are built on the direct relationship between a node X and its neighbors N(X) and the indirect influence from its second-degree neighbors N(N(X)). They are interpreted as a node receiving the opinion of its neighbors on the likelihood that they will transmit the information given the state of the rest of the network, and are defined in Equation 1. Shared messages are of particular interest since they can be decomposed to represent how much influence a node has on its individual neighbors. Message λ XY 1 , for instance, describes the level of conviction that Y 1 will get the information from the specific source X and no other. They are defined in Equation 2. The summation excludes the message π YX i received from Y i , since the shared message λ XY i represents the degree to which Y i is convinced that the information is coming from X, which eliminates the possibility that π YX i will be holding the information.
We define Ψ(X) as the conviction of individual X that it holds the information observed to have spread from a particular node in the network. Given the interaction graph defined above and an observation of the information at one or multiple broadcasting sources, the way to properly build the conviction of the other individuals in the network is by traversing the graph and gradually updating the state of the messages as information is observed. We do this in an iterative process that is summarized in Algorithm 1. Intuitively speaking, since the information is only observed at the sources, the messages, which depict the conviction of an individual holding the information, are initialized to zero, except for the messages shared by the broadcasting nodes. The state of the messages is then updated as dictated by Equations 1 and 2 by following the flow of the information, and the process is repeated until convergence. We are then able to define an information conviction for every node in the graph.
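The fixed-point loop of Algorithm 1 can be sketched as follows. The paper's exact update rules (Equations 1 and 2) are not reproduced here; as a hedged stand-in we use a common independent-cascade-style update, where a node's conviction is the probability that at least one in-neighbor holds and transmits the information:

```python
# Illustrative sketch of an iterative conviction update, NOT the paper's
# Equations 1 and 2: psi[v] = 1 - prod over in-neighbors i of (1 - p_iv * psi[i]).

def propagate_conviction(graph, sources, tol=1e-9, max_iter=1000):
    """graph: {i: {j: p_ij}}; sources: nodes observed to hold the information."""
    nodes = set(graph)
    psi = {v: (1.0 if v in sources else 0.0) for v in nodes}
    # Collect in-edges so each node can "listen" to its neighborhood.
    in_edges = {v: [] for v in nodes}
    for i, nbrs in graph.items():
        for j, p in nbrs.items():
            in_edges[j].append((i, p))
    for _ in range(max_iter):
        delta = 0.0
        for v in nodes - set(sources):
            miss = 1.0
            for i, p in in_edges[v]:
                miss *= 1.0 - p * psi[i]  # neighbor i fails to deliver
            new = 1.0 - miss
            delta = max(delta, abs(new - psi[v]))
            psi[v] = new
        if delta < tol:  # all messages converged
            break
    return psi

conv = propagate_conviction({0: {1: 0.5}, 1: {2: 0.5}, 2: {}}, sources={0})
# For this 0 -> 1 -> 2 chain: conv[1] == 0.5, conv[2] == 0.25
```

The iteration mirrors the recursive reinforcement described above: cycles are handled naturally because updates are repeated until the largest message change falls below a tolerance.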
Probabilistic worst case convergence time. The multiple outbreaks of spruce budworm that ravaged North American forests in almost every decade of the first half of the 20th century 38 constitute a scenario of infection propagation that is still being studied extensively. Having a tool to estimate the time for an infection to spread in a community could lead to better prevention methods and open doors to understanding the propagation patterns. The probabilistic model defined so far is an essential part of reaching this goal.
For every individual ν i , we define a probability mass distribution (PMD) I ν i (t) that describes the probability of the individual receiving the information from its neighbors at time t. The conviction term entering this distribution represents the conviction that individual ν i will receive the information from one of its neighbors.
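One simple way to turn a per-step conviction into such a timing PMD is a geometric first-success model; this specific form is an illustrative assumption, not the paper's equation:

```python
# Hypothetical sketch of building the timing PMD I_nu(t) from a per-step
# conviction psi: if nu has probability psi of receiving the information at
# each discrete step, the arrival time follows a geometric law.
# This particular construction is an assumption for illustration only.

def arrival_pmd(psi, horizon):
    """Return [Pr(arrival at t) for t = 1..horizon] under a geometric model."""
    return [psi * (1.0 - psi) ** (t - 1) for t in range(1, horizon + 1)]

pmd = arrival_pmd(0.5, 3)
# pmd == [0.5, 0.25, 0.125]; the remaining mass is arrival after the horizon
```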
The PMDs I ν i represent the timing behavior of the corresponding individuals ν i . To estimate a bound on the time for a consensus to be reached, we need to look at the collective timing behavior in the form of a probabilistic worst case convergence time (pWCCT), which raises two concerns: (i) how to translate the individual timing behavior into a group timing and (ii) what possible structures to study in order to achieve a tighter bound on the pWCCT.
Looking at the first issue, we consider how to combine, in terms of timing, all possible ways in which the information could propagate in the network. We rely on the convolution operator to implement this. Since the interest lies in estimating the worst time to convergence, and in order to estimate a tighter upper bound on the pWCCT, instead of considering the longest path that spans all nodes, we look for a spanning tree with minimum branching.
For the convolution to work, however, we have to assume that (i) the PMDs are independent and identically distributed (i.i.d.) and (ii) the information propagates in a sequential manner to all the individuals (see Figure 8). The inclusion of branches in the considered structure hinders the use of convolution alone. We combine the nodes in the branches with the main spanning tree by defining a merge operator, which looks into the probability of the information taking longer to reach the nodes in the branch than the main path, and vice-versa. This takes into account the possibility that certain individuals receive the information at the same time, akin to a parallel process. In the structure of Figure 8c, the merging looks at the probability of the information reaching x at time t while the information has already spread to Λ at time s ≤ t, and vice-versa. This reduces the pessimism of the convolution-based pWCCT estimation and offers a tighter bound.
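The two timing combinators can be sketched on discrete PMDs represented as lists where `pmd[t] = Pr(arrival at time t)`; the function names and list representation are assumptions for illustration:

```python
# `convolve` models sequential propagation (arrival times add);
# `merge` models a branch joining the main path, i.e. the distribution
# of the LATER of two independent arrival times.

def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for s, pa in enumerate(a):
        for t, pb in enumerate(b):
            out[s + t] += pa * pb
    return out

def merge(a, b):
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    cdf_a = [sum(a[: t + 1]) for t in range(n)]
    cdf_b = [sum(b[: t + 1]) for t in range(n)]
    # Pr(max = t) = Pr(a = t, b <= t) + Pr(b = t, a < t)
    return [a[t] * cdf_b[t] + b[t] * (cdf_a[t] - a[t]) for t in range(n)]

seq = convolve([0.5, 0.5], [0.5, 0.5])  # sequential: times add
par = merge([0.5, 0.5], [0.5, 0.5])     # parallel: later arrival wins
```

In this toy case the merged CDF dominates the convolved one at every t, which is the intuition behind the tightness claim of Theorem 1 below.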

Theorem 1. Let Λ be a simple path and x be a node in a graph G such that x ∉ Λ. Then, merging I x (t) and I Λ (t) yields a tighter bound on the pWCCT than convolution.
We introduce a small lemma to prove this theorem. Proof. Since the pWCCT is an exceedance probability distribution, to prove tightness of the bound, it is enough to show that the CDF of merging at any given time t is larger than that of convolution. Formally, we want to prove that Pr(merge ≤ t) ≥ Pr(convolution ≤ t) for all t.

Rumour data collection. We relied on the Twitter API to collect the relevant information to build the network connecting users Alice and Bob to the users involved in the rumour, and compiled the data in the form of a group interaction model as described above. The parameter of most importance in the graph model is the probability p ij of an individual transmitting information to its neighbors. In the context of social media networks and interactions among humans, this refers to what is commonly addressed as social influence. On that account, metadata were extracted to represent the influence probability p ij , defined as the probability of user i influencing the opinion of user j. It is common in the literature to rely on the rate of communication to quantify influence 39,40 .
We define this parameter accordingly: in order to better represent the influence between individuals, we estimate it based on two further quantities, the level of trust between users f t and the popularity of the user in the network f p .

where Φ(i) represents the set of individuals following user i.
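A possible reading of this construction can be sketched as follows; the paper's exact formula is not reproduced, so the weighting, the rate-based trust term, and the follower-based popularity term are all hypothetical placeholders:

```python
# Illustrative reconstruction of the influence probability p_ij, combining a
# communication-rate trust term f_t with a follower-based popularity term f_p.
# The specific combination rule and weight w are assumptions, not the paper's.

def influence(msgs_i_to_j, msgs_j_total, followers, user_i, total_users, w=0.5):
    """msgs_i_to_j: messages from i seen by j; msgs_j_total: all messages j saw.
    followers: {user: set of followers}, i.e. the sets Phi(i); w balances terms."""
    f_t = msgs_i_to_j / msgs_j_total if msgs_j_total else 0.0  # trust via rate
    f_p = len(followers.get(user_i, ())) / total_users          # popularity
    return w * f_t + (1 - w) * f_p

followers = {"alice": {"bob", "carol"}}
p = influence(3, 10, followers, "alice", total_users=4)
# f_t = 0.3, f_p = 0.5, so p = 0.4 with equal weights
```

Both terms lie in [0, 1], so the resulting p ij is a valid edge probability for the group interaction model defined above.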