The inclusion of mobility processes is a key ingredient in the modeling of the geographic spread of epidemics. Recently this has been made evident in the modeling effort concerned with the diffusion of the 2009 H1N1 pandemic in which several papers have obtained estimates of the epidemic parameters and unfolding based on the knowledge of human travel and mobility patterns1,2,3. Models that explicitly take into account the mobility patterns of individuals range from relatively coarse-grained approaches that consider aggregated traveling flows to highly detailed structured metapopulation or agent-based models allowing for the description of billions of individuals4,5,6. However, the available data on human mobility and interaction are descriptive of human behavior as long as information concerning the unfolding of the epidemic does not induce changes in the population's behavior, for at this point the model has to include the population's behavioral changes that in turn alter the epidemic spreading. Although behavioral changes are in many cases triggered by the policy-making effort of public institutions and agencies, self-initiated changes in behavior induced by transmission of information about the disease both from the media and the local environment (friends, colleagues, etc.) are often initiated by the population's individuals. Self-initiated behavioral changes are elusive to modeling because of the difficulty involved in quantifying these changes and an overall lack of relevant data. In this case mathematical and computational modeling represents a very effective tool for exploring the impact of behavioral changes on the epidemic. For this reason there has been an increasing focus in the development of formal models aimed at closing the epidemic-spreading→behavioral-changes→epidemic-spreading feed-back loop for the past decade (see7 for a recent review). 
However, only a few of these theoretical and computational approaches have considered the spatially structured nature of populations and the effect of behavioral and mobility changes in the large-scale spreading of the epidemic8,9,10.

Here we consider a metapopulation model that incorporates self-initiated changes in human behavior in response to an epidemic outbreak and study how these reactions influence the spread of infectious diseases. The model is general enough to include a number of different behavioral responses. Ultimately, we find that prevalence-based behavioral changes do not affect the invasion threshold, although the number of subpopulations affected by the outbreak does depend considerably on population behavior. In particular, we introduce a simple mechanism that provides individuals with the propensity to avoid locations affected by the epidemic. Although the aim of such a self-initiated behavior is to prevent an individual's exposure to the disease, it may lead to the unanticipated effect of facilitating its spread to new locations. The results presented in this paper underline the importance of the proper consideration of self-initiated behavioral responses to the spreading of epidemics.


In order to describe the large-scale spreading of infectious diseases mathematically we use a metapopulation approach11. This framework describes a set of spatially structured interacting subpopulations as a network whose links denote the mobility of individuals across subpopulations. Each subpopulation consists of a number of individuals that are divided into several classes according to their dynamical state with respect to the modeled disease – for instance: susceptible, infected, removed, etc. The internal compartmental dynamics models the contagion dynamics by considering that people in the same subpopulation are in contact and may change their state according to their interactions and the disease dynamics. Finally, subpopulations also interact and exchange individuals due to mobility from one subpopulation to another. Figure 1 shows a schematic representation of the metapopulation system. The global invasion threshold that marks the point beyond which a local outbreak reaches other subpopulations and spreads throughout the metapopulation system not only depends on the infection parameters, but on the mobility rates of individuals as well12,13 and thus differs from the single population epidemic threshold. Previous works have considered fully Markovian dynamics for the movement of individuals among subpopulations and more recent analyses have focused on the analytical description of models with recurrent patterns.

Figure 1

Schematic representation of the Metapopulation System.

A population of individuals is divided into V subpopulations connected to each other through a heterogeneous network. Within each subpopulation, individuals are classified according to their dynamical status as Susceptible (S), Infected (I) or Removed (R). In the absence of behavioral changes (blue arrows), individuals move from one subpopulation to another at a rate λ, following the shortest path connecting the two subpopulations. The discontinuous arrows represent the second mechanism of behavioral reaction, in which people travel avoiding places with high prevalence at the cost of longer travel paths.

The model and the invasion threshold

Here we consider a general scenario in which individuals have memory of their original subpopulations, to which they return after having reached their destination. More explicitly, we define a population of size N partitioned into V subpopulations. Each individual is assigned an origin subpopulation, its home, among the V subpopulations. The subpopulations are interconnected by edges that represent the mobility connections among them. We can therefore see the metapopulation system as a network of V nodes with an assigned degree distribution P(k) that defines the probability that any given subpopulation is connected to k other subpopulations. Given the set of subpopulations, we denote the number of subpopulations with k connections by Vk. A standard convenient representation of such a system is provided by quantities defined in terms of the degree k (ki denotes the degree of subpopulation i):

$$N_k = \frac{1}{V_k} \sum_{i \,\mid\, k_i = k} N_i \qquad (1)$$

The quantity Nk indicates the average number of individuals in a subpopulation of degree k. This is a mean-field approximation that considers all subpopulations with a given degree k as statistically equivalent, thus allowing for the introduction of degree-block variables that depend only upon the subpopulation degree. While this is clearly an approximation of the system description, it has been successfully applied to many dynamical processes on complex networks and it is rooted in the empirical evidence presented in the analysis of mobility networks14,15,16,17,18. In the following we assume the population distribution

$$N_k = \bar{N} \frac{k}{\langle k \rangle} \qquad (2)$$

where $\bar{N} = \sum_k N_k P(k)$ is the average number of individuals per node in the subpopulation network and $\langle k \rangle$ is the average degree. The mobility of the population is defined as follows: at every time step each of the Ni individuals (N = Σi Ni) of subpopulation i travels with probability λij = λ. For simplicity, let us first consider the case in which destinations j are chosen at random with probability proportional to the population size of the destination node (similar results are obtained by choosing the destination uniformly at random). Individuals then move along the shortest paths to their destination nodes. This is a plausible assumption that corresponds to a traffic of individuals on each edge proportional to the edge betweenness. Data on the real network support this assumption, showing a linear statistical association between traffic flows and betweenness in the airport network (see supplementary online information). Additionally, once a traveler arrives at its destination, it must return to its origin subpopulation along the same shortest route.
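The mobility rule described above can be illustrated with a minimal sketch. It assumes that every resident of the origin travels independently with probability λ and that destinations are drawn with probability proportional to their population size; the function name `sample_trips` and the exclusion of the origin from the candidate destinations are illustrative assumptions, not details given in the text.

```python
import random


def sample_trips(populations, lam, origin, rng):
    """Draw the destinations of the trips leaving `origin` in one time step.

    populations: list of subpopulation sizes N_i (integers).
    lam: per-individual travel probability (the mobility rate).
    Assumption: the origin itself is never chosen as a destination.
    """
    candidates = [j for j in range(len(populations)) if j != origin]
    weights = [populations[j] for j in candidates]
    # One Bernoulli trial per resident, i.e. a binomial number of travelers.
    n_travelers = sum(rng.random() < lam for _ in range(populations[origin]))
    # Destinations are chosen proportionally to the destination population.
    return rng.choices(candidates, weights=weights, k=n_travelers)


rng = random.Random(0)
trips = sample_trips([100, 200, 700], lam=0.5, origin=0, rng=rng)
```

Each entry of `trips` is the destination index of one traveler; routing along the shortest path would be applied to each trip afterwards.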

For the epidemic dynamics in each subpopulation i we first consider the minimal SIR (Susceptible Infected Removed) model, according to which individuals are partitioned into Si (t), Ii (t) and Ri (t) compartments, denoting the number of susceptible, infected and removed individuals at time t, respectively. The transition rate from the susceptible to the infected state is given through the usual force of infection βIi /Ni , while infected individuals recover at a rate µ. The SIR model11,19 is characterized by the average number of infectious individuals produced by a single infected individual in a fully-susceptible population, the so-called reproductive number R0 = β/µ. In a stochastic model if R0 > 1 an outbreak may take place11,19. For the case of metapopulation models, the previous condition on R0 for the subpopulation outbreak holds. However, if the mobility rate of the individuals of the originally infected subpopulation is not enough to ensure the seeding of other subpopulations before the waning of the local epidemic, the outbreak does not spread globally. This is equivalent to the existence of a second reproductive number at the subpopulation level R* that depends on the mobility parameters and defines the threshold for the epidemic invasion of a finite fraction of subpopulations12,20,21,22.

In order to derive the condition for global spread in our model with origin-destination dynamics let us consider a metapopulation system in which a small set of initially infected subpopulations is experiencing an outbreak with R0 > 1. In the early stage of the epidemic, the number of subpopulations experiencing an outbreak is small and we describe the disease spreading at the level of the metapopulation system as a branching process, using a tree-like approximation relating the infected subpopulations at generation n to the infected subpopulations at generation n − 1. The average number of infected individuals in subpopulations of degree k during the evolution of the epidemic is αNk , where α is a disease-dependent parameter expressing the total number of individuals in the population that have been in the infectious state. Furthermore, in the SIR model, each infected individual stays in the infectious state for an average time µ −1. Thus, the total number of infected individuals circulating through the network at the n − 1 generation is:

Those individuals can trigger the start of an epidemic in a non-infected subpopulation with probability $1 - R_0^{-\gamma}$, where γ is the number of infectious individuals in generation n − 1 that have visited the subpopulation23. In order to provide a quantitative estimate of this number we consider that if individuals follow the shortest route through the network, then the probability that a node of degree k is visited by any individual is proportional to the average betweenness bk of nodes of that degree. This allows us to write explicitly (as shown in the Materials and Methods section) the branching ratio that provides the average number of subpopulations that will be infected by each subpopulation experiencing an outbreak, defining the following global invasion threshold:
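As a small numerical illustration of the seeding step, the sketch below uses the standard branching-process result for the SIR model, in which γ infectious visitors start a local outbreak with probability 1 − R0^(−γ) when R0 > 1 and with probability zero otherwise. The function name `outbreak_probability` is hypothetical.

```python
def outbreak_probability(r0, seeds):
    """Probability that `seeds` infectious visitors start a local outbreak.

    Standard SIR branching-process result, assumed here: each seed's chain
    of infections dies out with probability 1/R0, independently of the others.
    """
    if r0 <= 1.0:
        return 0.0  # below the local epidemic threshold no outbreak occurs
    return 1.0 - r0 ** (-seeds)


# With R0 = 2, a single seed succeeds half of the time and three seeds
# succeed with probability 1 - 1/8.
p_one = outbreak_probability(2.0, 1)
p_three = outbreak_probability(2.0, 3)
```

The rapid saturation of this probability with the number of seeds is what makes the mobility rate, which controls how many infectious travelers reach a subpopulation, the natural control parameter for the global invasion.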

which can also be expressed as a threshold condition for the mobility rate λ as

This threshold condition is central to the analysis because it links the mobility rate and the mobility patterns of individuals to the eventual global spreading of the disease, and thus lets us study how behavioral changes that affect human mobility may alter the course of epidemic outbreaks. As detailed above, the analytic calculations rest on several simplifications and assumptions and reduce to the critical point of a simple branching process. In order to test the robustness and reliability of the calculation, in Fig. 2 we compare results from individual-based simulations with the analytic prediction expressed in Eq. (5). The individual-based simulations are detailed in the Materials and Methods section and are based on stochastic, discrete binomial models of transmission and mobility24,25. Here we report networks of size V = 3,000 with N = 3×10^6 individuals. These relatively small sizes are the most interesting, since the mean-field approximations used in the calculations are expected to hold for large-scale random graphs. The figure shows that the mean-field framework gives a good estimate of the mobility threshold beyond which the movement of individuals from infected subpopulations to susceptible ones is large enough to seed the latter and spread the epidemic to a system-wide scale. For the sake of completeness, in the supplementary online material we report simulations for sizes up to V = 3×10^4 and N = 3×10^7 that confirm the analytical results.

Figure 2

Mobility Threshold.

To compare the analytical insights with numerical results, we show the fraction of diseased subpopulations D/V as a function of the mobility rate λ. The analytical value (Eq. (5)) is indicated by the arrow and the green triangle. Full circles are results from numerical simulations and represent the average over at least 100 stochastic runs (the line is a visual guide). The value of α has been approximated following ref. 19. The substrate topology is an uncorrelated scale-free network generated according to the uncorrelated configuration model30 with degree exponent γ = 2.5, V = 3000 subpopulations and N = 3×10^6 individuals. Other parameters are indicated in the figure.

Self-initiated behavioral changes

During the outbreak of an acute infectious disease it is natural to expect self-initiated changes in human behavior and variations in individuals' mobility patterns. The extent of behavioral change depends on the risk perceived by individuals, which in turn depends on the severity of the disease, its prevalence within the population and the information available about it. Behavioral changes have been shown7 to modify the disease state of individuals26,27, model parameters28 and contact structure29. In our system, human responses to the presence of a disease might have a direct impact on mobility and traveling habits, since avoiding infected areas is a natural attitude of individuals and more drastic reactions, such as not traveling at all, may arise spontaneously, as documented in the recent epidemic. In order to model behavioral changes in our framework we consider that individuals react to prevalence-based information and study two mechanisms of behavioral change. The simplest one assumes that the probability pij(t) of traveling from subpopulation i to subpopulation j at time t is related to the level of infection at the destination subpopulation, so that the higher the prevalence of the infection at the destination, the less likely the individual is to travel, i.e., pij(t) = 1 − Ij(t)/Nj(t). If travel is not cancelled, the individual moves along the shortest path to the destination. This mechanism is the one that has led to the decline in the number of passengers arriving at airports in Mexico, both domestically and internationally, in 2009 due to the H1N1 influenza outbreak.
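The departure-probability mechanism can be written down directly from pij(t) = 1 − Ij(t)/Nj(t). The following is only a sketch: the function names and the handling of an empty destination are assumptions made for illustration.

```python
import random


def departure_probability(infected_dest, population_dest):
    """p_ij(t) = 1 - I_j(t) / N_j(t): travel is less likely toward infected places."""
    if population_dest <= 0:
        return 0.0  # assumption: nobody travels to an empty subpopulation
    return 1.0 - infected_dest / population_dest


def keeps_trip(infected_dest, population_dest, rng):
    """Decide stochastically whether a planned trip actually takes place."""
    return rng.random() < departure_probability(infected_dest, population_dest)


# A destination with 25% prevalence is visited with probability 0.75.
p = departure_probability(25, 100)
```

Because the probability depends only on the prevalence at the destination, it equals one in a fully susceptible system, which is why this mechanism cannot shift the invasion threshold.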

The second mechanism of mobility change (see Fig. 1) models behavioral reactions that induce changes in traveling routes. Specifically, given an individual who is traveling from an origin (subpopulation i) to a destination (subpopulation j), we assume that the traveler will try to avoid traversing infected nodes, except when the next move leads to the destination. This process is not deterministic and consists of a trade-off between the risk associated with visiting a given subpopulation and the increase of the travel path length to the final destination. We assume that the risk perception associated with the visit of a given subpopulation depends on the prevalence of the disease in that subpopulation. However, staying away from infected subpopulations has the associated cost of traveling through alternative routes. We therefore assume that individuals move to the neighboring subpopulation l that minimizes the cost function cl(t) = hδl + (1 − h)Il(t)/Nl(t), where δl is the change in distance to the destination, which takes the value −1 if node l lies on the shortest path to the destination, 0 if it is at the same distance from the destination as the current node and +1 otherwise. The parameter h tunes the strength of the behavioral response: for h = 1 the shortest path is always followed, whereas h = 0 corresponds to a path minimizing the risk of traversing infected areas.
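A minimal sketch of the rerouting rule, assuming the cost function c_l = h·δ_l + (1 − h)·I_l/N_l with δ_l computed from precomputed hop distances to the destination. The function name `next_hop`, the dictionary-based inputs and the tie-breaking rule (first neighbor with minimal cost) are illustrative assumptions.

```python
def next_hop(neighbors, dist_to_dest, current, dest, prevalence, h):
    """Choose the neighbor of `current` that minimizes the rerouting cost.

    neighbors: list of nodes adjacent to `current`.
    dist_to_dest: dict node -> hop distance to `dest`.
    prevalence: dict node -> I_l / N_l at the current time step.
    h: 1 follows the shortest path, 0 only minimizes the infection risk.
    """
    if dest in neighbors:
        return dest  # the last hop is taken regardless of prevalence

    def cost(node):
        # delta is -1, 0 or +1 because adjacent nodes differ by at most one hop
        delta = dist_to_dest[node] - dist_to_dest[current]
        return h * delta + (1.0 - h) * prevalence[node]

    return min(neighbors, key=cost)


# Node 1 is on the shortest path but heavily infected; node 2 is a clean detour.
dist = {0: 2, 1: 1, 2: 2, 3: 0}
prev = {1: 0.9, 2: 0.0}
shortest = next_hop([1, 2], dist, current=0, dest=3, prevalence=prev, h=1.0)
detour = next_hop([1, 2], dist, current=0, dest=3, prevalence=prev, h=0.1)
```

With h = 1 the traveler goes through node 1, whereas with h = 0.1 the cost of node 1 is 0.71 against 0 for node 2, so the detour is taken. Travelers on such detours are what carries the infection to subpopulations off the original shortest path.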

Synthetic metapopulation system

As for the analysis of the global invasion threshold, we simulate a metapopulation network of V = 3000 nodes and N = 3×106 individuals as detailed in the Materials and Methods section. In Figs. 3 and 4 we report the behavior of the density of infected subpopulations D/V at the end of the global epidemic as a function of both the basic reproductive number R0 and the traveling diffusion rate λ. The curves report the baseline case without behavioral changes and the case in which behavioral changes are implemented. The results readily show that in all analyzed cases the metapopulation system exhibits an invasion threshold which is independent of human behavioral changes. This feature of the model can be traced back to the fact that the behavioral changes are prevalence based. Analogously to the basic reproductive number, the invasion threshold is determined by the average number that each infected subpopulation will generate in a fully susceptible metapopulation system. Clearly in this regime the prevalence-based behavioral changes are irrelevant and the threshold value is thus not affected. As we increase the value of the parameters above the epidemic threshold we start observing differences in the two cases with respect to the number of subpopulations affected by the epidemic. In particular, it is not difficult to separate the effects of the two mechanisms of behavioral changes. When people have the option of deciding whether or not to engage in travel, the fraction of diseased subpopulations at the end of the outbreak decreases with respect to the null case, thus pointing out that this kind of response is beneficial. The reason for the reduction in D/V is rooted in the effective reduction of the mobility rate of individuals, which leads to a smaller exposure of susceptible individuals to the disease both while traveling and at home. A different scenario emerges in the case that individuals attempt to avoid infected subpopulations. 
For values of the parameter R0 and λ close to the invasion threshold the rerouting of individuals on different paths leads to a reduction of the outbreak probability in the subpopulations along the origin-destination path. This leads to a final reduction of the subpopulations experiencing an outbreak. For increasing R0 and λ this probability saturates to one and all subpopulations visited on the original and the rerouted path experience an outbreak. In this regime the number of subpopulations affected by the epidemic is much larger than the case without behavioral changes. The enhancement of the epidemic size is visible in Fig. 3 and occurs close to the invasion threshold onward. A full discussion with analytical arguments of this behavior is provided in the supplementary online material and points out the important and counter-intuitive effects that may be generated by behavioral changes in the population.

Figure 3

Effects of behavioral changes in synthetic networks.

The figure compares the fraction of diseased subpopulations D/V when behavioral reaction mechanisms are active with the situation in which such behavioral responses are not taken into account (null model). (A) Dependence of D/V on the mobility rate λ for random scale-free networks generated according to the uncorrelated configuration model30. Symbols represent the results obtained when individuals do not react to the presence of the disease (error bars are smaller than symbol sizes). The remaining results correspond to the mechanisms of behavioral change: “DP” stands for “departure probability” and represents the mechanism in which individuals decide whether or not to travel; “RR” (rerouting) corresponds to the case in which people travel while trying to minimize the risk of infection by avoiding subpopulations with high prevalence, at the cost of longer travel paths. The results confirm that the invasion threshold is independent of behavioral changes and that the latter have a significant impact on the invasion dynamics of the metapopulation. The points are averages over at least 100 stochastic runs, with µ = 0.04 and h = 0.1. See the main text for further details. (B) Relative difference in the number of subpopulations experiencing an outbreak between the RR and baseline scenarios as a function of λ. The behavior is non-linear: rerouting first induces a decrease, close to the invasion threshold, and then a sharp increase in the number of affected subpopulations.

Figure 4

Effects of behavioral changes in data-driven scenarios.

Comparison of the fraction of diseased subpopulations D/V for the full (behavioral reaction mechanisms are active) and null (behavioral responses are not taken into account) limits of the metapopulation system. We plot D/V as a function of the mobility rate λ. The results confirm even in this case that the invasion threshold is independent of behavioral changes. Moreover, as for synthetic networks, epidemic awareness enhances the disease spreading as given by the increase in the number of subpopulations affected by the disease. The averages were taken over at least 100 stochastic realizations and we fix µ = 0.04 and h = 0.1. See the main text for further details.

Data-driven simulations

While the previous computational results were obtained in a synthetic and in some ways simplified metapopulation scheme, we also carried out Monte Carlo simulations using the Air Transportation Network (see Materials and Methods). This network is highly heterogeneous with respect to both the number of connections between urban areas (subpopulations) and the traffic wil, given in terms of the number of passengers and available seats on a given direct route connecting two subpopulations i and l. Taking into account these traffic patterns, we assume that initially the number of individuals in subpopulation i is proportional to its strength Ni = Σl wil. In this realistic setting, instead of shortest-path routing we implement a traffic-based routing in which individuals follow paths to the destination proportionally to the actual traffic toward it. Moreover, as the travel flows are not homogeneous, we also consider that individuals choose their destinations proportionally to the strength of each possible target subpopulation and that all travelers spend a time τ at their destinations before coming back home. In this way, we preserve the inter-city traffic patterns. The parameter τ is drawn from a uniform distribution with mean equal to the traveled distance dil. Sensitivity analyses for these parameters and those defining the behavioral changes are reported in the Supplementary Information file, where we also report the simulation for a more realistic model in which the additional class of exposed individuals (the SEIR model) is added to the natural history of the disease. In Fig. 4 we report the results obtained for the data-driven model where all mechanisms of behavioral changes are at work, confirming the results obtained for the synthetic metapopulation network. 
We observe: i) a global invasion threshold that does not depend on behavioral changes; and, ii) that changes in traveling routes dominate the model behavior by considerably enhancing the spread of epidemics affecting a number of subpopulations compared to the case with no behavioral changes.
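The strength-based initialization used in the data-driven setting can be sketched as follows. The edge-list format, the function names and the toy weights are assumptions made for illustration; each undirected route is assumed to be listed once with its traffic wil.

```python
from collections import defaultdict


def node_strengths(weighted_edges):
    """Strength of each node: the sum of the traffic on its routes.

    weighted_edges: iterable of (i, l, w_il) tuples, one per undirected route.
    """
    strength = defaultdict(float)
    for i, l, w in weighted_edges:
        strength[i] += w
        strength[l] += w
    return dict(strength)


def initial_populations(weighted_edges, total_population):
    """Split `total_population` among the nodes proportionally to strength."""
    strength = node_strengths(weighted_edges)
    total = sum(strength.values())
    return {node: total_population * s / total for node, s in strength.items()}


# Toy network with three hypothetical airports and two routes.
routes = [("A", "B", 30.0), ("A", "C", 10.0)]
populations = initial_populations(routes, total_population=800)
```

In this toy case the strengths are 40, 30 and 10, so the 800 individuals are split as 400, 300 and 100. The same strengths can serve as weights when travelers choose their destinations.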


The results provided by the analysis of the effect of behavioral changes in both synthetic and data-driven simulations make it clear that behavioral changes inspired by the best intention of slowing down and containing the epidemic may give rise to the opposite effect. The numerical analysis shows that the disease spreading, as given by the number of subpopulations with local outbreaks, increases when travelers decide to bypass the subpopulations with a high number of infected individuals. The effect of behavioral change is clearly observed in the data-driven simulations by plotting the invasion tree in the USA for an epidemic starting in New York. The invasion tree specifies the disease progression by defining a directed link i → j from the infecting to the infected subpopulation, i.e. from the origin subpopulation i of the infected individuals who started the epidemic in subpopulation j. Fig. 5A shows that in the absence of behavioral changes the infection tree is heterogeneous and has several hubs. This is not the case in the presence of behavioral changes, where the entire tree originates from the initially infected subpopulation, as shown in Fig. 5B. The rationale behind this finding is that the increased flow of individuals going through alternative paths brings the infection to new subpopulations that would otherwise be infected by other subpopulations. This is a very interesting finding, as one can think of the whole process in terms of a social dilemma: individuals adopt a sort of selfish behavior by avoiding highly infected spots but, as a consequence, the disease invades a larger fraction of the subpopulations in the metapopulation system. Thus, what is beneficial at the individual level turns out to have a negative impact on the whole population, especially in the cases where the epidemic has pervaded the system (large R0 and λ). As Figs. 3 and 4 show, it is also worth stressing that the effects of this kind of behavioral change depend nontrivially on both R0 and λ. Moreover, the features observed here are emphasized by models that take into account a natural history of the disease that allows for exposed individuals to become infectious after a latency period. Those individuals are subject to behavioral changes that make them avoid subpopulations that are already infected, but they do not have limitations to their mobility, thus providing an even more efficient mechanism for the large-scale spreading of the infectious disease.
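The invasion tree discussed above can be assembled from a log of seeding events. The sketch below assumes that the simulation records tuples (time, source, target) whenever infected travelers whose home is `source` arrive in `target`; this record format and the function names are hypothetical. The first event reaching a subpopulation defines its incoming link.

```python
def build_invasion_tree(seeding_events, seed):
    """Map each invaded subpopulation to the subpopulation that infected it.

    seeding_events: iterable of (time, source, target) tuples in any order.
    seed: the initially infected subpopulation, which has no infector.
    """
    infector = {}
    for _, source, target in sorted(seeding_events, key=lambda e: e[0]):
        if target == seed or target in infector:
            continue  # only the first seeding of each subpopulation counts
        infector[target] = source
    return infector


def out_degree(infector):
    """Number of subpopulations directly infected by each node of the tree."""
    counts = {}
    for source in infector.values():
        counts[source] = counts.get(source, 0) + 1
    return counts


events = [(3, "A", "C"), (1, "A", "B"), (2, "B", "C"), (4, "C", "A"), (5, "A", "D")]
tree = build_invasion_tree(events, seed="A")
hubs = out_degree(tree)
```

A tree dominated by the seed, as in the scenario with behavioral changes, shows up as an out-degree concentrated on the seed node, while several nodes with large out-degree correspond to the hubs of the null model.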

Figure 5

Invasion tree.

Invasion tree describing the air transportation network inside the USA of an epidemic starting in New York. The invasion tree specifies the disease progression by defining a directed link ij from the infecting to the infected subpopulation. In panel (A) we show the invasion tree for the null model when no behavioral reactions are considered. In panel (B) we show the invasion tree starting from the same initial conditions but consider both mechanisms of behavioral reaction to be active. In order to provide a clear representation we consider in both cases just the first 100 infected nodes among the total 425 in a single run, respectively. The color scale is a measure of time and is the same for both cases. At the time step in which the first subpopulation is infected from the seed it is yellow. At the time step in which the last subpopulation (among the first 100 in both cases) is infected it is red. All the other time steps are in the gradient between these two limits. Panel (A) shows clearly that in the absence of behavioral changes the infection tree is heterogeneous and contains several hubs that are infected first and that determine the time scale of the spreading infection to smaller airports. This is not the case in the presence of behavioral changes where the entire tree originates and grows much faster from the initially infected subpopulation as shown in panel (B). In both cases, we fix the parameters to R0 = 2, µ = 0.04 and λ = 10−6.

The present finding points out the importance and relevant effects of behavioral changes. Other behavioral reactions may be considered separately or in concomitance with the mechanisms studied here, thus providing more complicated dynamic behavior. The addition of simple behavioral models in the analysis of the geographical spreading of infectious diseases opens the path to a multitude of effects that can have a major impact in both our understanding of epidemiological data and in the definition of public health intervention.


Invasion threshold

In order to relate the global epidemic spreading with the network betweenness we note that the number of seeding infectious individuals for each node of degree k is where bk is the algorithmic betweenness (equal to the topological betweenness in the case of shortest path routing) of subpopulations of degree k and btot is the normalization factor accounting for the betweenness of all nodes. For reproductive numbers close to the epidemic threshold we can approximate the infection probability (see the SI file) and write explicitly the recursive expression for the number of infected subpopulations with degree k at the nth generation as

It is important to remark again that this expression is obtained by assuming that the network is random, nodes with a given degree k are statistically equivalent and the number of infected subpopulations is negligible with respect to the total number of subpopulations so that . This also allows us to consider the branching process as uncorrelated and the seeding of each subpopulation as an independent event. Furthermore, in this case it is possible to use the general result bk ∼ k^η relating the node betweenness and degree in random networks. The above expression yields the branching ratio used to calculate the invasion threshold.

Internal nodes' dynamics

In each node an SIR dynamics takes place over a well-mixed population of initial size Ni. As time passes, Ni(t) changes according to the number of individuals that have arrived at and left the node. At each time step, one step of the SIR process takes place within every node. The state of every individual inside a node i is changed according to the following probabilities: a susceptible individual becomes infected with probability p(S→I), determined by the force of infection βIi(t)/Ni(t), and an infected individual recovers with probability p(I→R) = µ. Specifically, the number of individuals that change state is drawn from a binomial distribution with the probability p(S→I) (or p(I→R)) and the susceptible population size Si(t) (or the infected population size Ii(t)) as parameters. Note that in this scenario R0 only enters the internal dynamics; individuals traveling through node i take part in the SIR dynamics and thus can change their state while at node i.
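One step of the within-node dynamics can be sketched with binomial draws. The infection probability used here, p(S→I) = 1 − (1 − β/N)^I with β = R0·µ, is an assumption consistent with the force of infection βI/N for small β/N and is not necessarily the exact expression used in the simulations; the function names are hypothetical.

```python
import random


def binomial(n, p, rng):
    """Number of successes in n independent trials of probability p."""
    return sum(rng.random() < p for _ in range(n))


def sir_step(s, i, r, beta, mu, rng):
    """Advance the (S, I, R) counts of one well-mixed node by one time step."""
    n = s + i + r
    # Assumed form: a susceptible escapes each of the i infected with prob. 1 - beta/n.
    p_inf = 1.0 - (1.0 - beta / n) ** i if n > 0 else 0.0
    new_infected = binomial(s, p_inf, rng)
    new_removed = binomial(i, mu, rng)
    return s - new_infected, i + new_infected - new_removed, r + new_removed


rng = random.Random(7)
state = (990, 10, 0)
for _ in range(50):
    state = sir_step(*state, beta=0.08, mu=0.04, rng=rng)  # R0 = beta / mu = 2
```

In a full simulation the counts passed to `sir_step` would include the travelers currently present at the node, since they take part in the local dynamics.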

Synthetic networks

We consider topologically uncorrelated random graphs generated by the uncorrelated configuration model30. The simulations presented here consider a degree distribution P(k) ∼ k^−γ with kmin ≤ k ≤ kmax, γ = 2.5 and kmin = 2. The population of each node is assigned according to Eq. (2).
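A degree sequence with P(k) ∼ k^−γ on kmin ≤ k ≤ kmax can be drawn as in the sketch below. The value of kmax is not stated in this section; the example uses the structural cutoff kmax = √V that is customary for the uncorrelated configuration model, which is an assumption, as is the function name.

```python
import math
import random


def sample_degree_sequence(n_nodes, gamma, k_min, k_max, rng):
    """Draw n_nodes degrees from P(k) ~ k**(-gamma) on [k_min, k_max].

    The draw is repeated until the degrees sum to an even number, so that
    the stubs can be paired into edges by a configuration model.
    """
    allowed = list(range(k_min, k_max + 1))
    weights = [k ** (-gamma) for k in allowed]
    while True:
        degrees = rng.choices(allowed, weights=weights, k=n_nodes)
        if sum(degrees) % 2 == 0:
            return degrees


V = 3000
degrees = sample_degree_sequence(V, gamma=2.5, k_min=2,
                                 k_max=int(math.sqrt(V)), rng=random.Random(42))
```

Pairing the stubs at random while rejecting self-loops and multiple edges would then produce the network itself; that step is omitted here.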

Airports network

The airport network data set is composed of passenger flights operating in the time period November 1, 2000 to October 31, 2001 as compiled by OAG Worldwide (Downers Grove, IL) and analyzed previously in Ref. 31. It consists of 3,618 nodes (airports) and 14,142 links. We use a weighted network in our analysis. Airports corresponding to a metropolitan area have been collapsed into one node in the original database.

Computational implementation

Each simulation starts with a small number of infected individuals. Specifically, we randomly choose a small fraction of subpopulations (less than 1% of the nodes) and within these subpopulations only 1% of the individuals are infected, assuring that the condition is fulfilled. In the simulations the traffic and spreading dynamics have the same time scale, so at each time step a diffusion step is performed first and then the SIR internal dynamics is evaluated. For the diffusion of individuals we assume that the number of individuals starting a trip at time t is given by a binomial distribution with mobility rate λ and subpopulation size Ni (t) as parameters. Destinations are chosen according to the strength wi of each node. To keep the nodes' populations constant over time (excluding transient individuals traveling through the node), once an individual reaches its destination it starts a new trip back to its origin. Simulations run until the stationary state I(t) = 0 is reached. Note that in this framework only travelers are explicitly followed and that we only keep a counter for the number of non-traveling individuals inside each node. Individuals are labeled according to their origin population and both the disease dynamics and the mobility process are simulated with discrete, binomial processes (See Materials and Methods). Finally, simulations of the behavioral changes are made more realistic by considering that infected individuals diffuse at a lower rate, λI , than susceptible subjects. This simulates the fact that most of the infected individuals will exhibit symptoms that prevent them from traveling. This happens in influenza-like diseases where only a fraction of infected individuals is asymptomatic and can continue with regular mobility habits. Without loss of generality we assume that λI = λ/2 and we report in the supplementary online material the sensitivity analysis for different values of this parameter. 
In synthetic networks agents select which neighbor to visit next on their ways to their destinations, presuming that following the shortest path is the preferred solution in an infection-free scenario. In the simulated airport network individuals are routed proportionally to real traffic edges and individuals traveling between subpopulations are tracked in time and the evolution of the disease is monitored. In addition, we study the invasion dynamics and measure the number of diseased subpopulations at time t, D(t). All numerical results reported henceforth are averages taken over at least 100 realizations of the initial conditions and the stochastic dynamics.
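The diffusion step with a reduced mobility of infected individuals can be sketched as follows. The text specifies λI = λ/2 for infected individuals; the assumption that removed individuals travel at the full rate λ, as well as the function names, are illustrative choices.

```python
import random


def binomial(n, p, rng):
    """Number of successes in n independent trials of probability p."""
    return sum(rng.random() < p for _ in range(n))


def sample_departures(s, i, r, lam, rng, infected_factor=0.5):
    """Number of S, I and R residents that start a trip in this time step.

    infected_factor: ratio lambda_I / lambda (0.5 reproduces lambda_I = lambda / 2).
    Assumption: removed individuals travel at the same rate as susceptibles.
    """
    lam_infected = infected_factor * lam
    return (binomial(s, lam, rng),
            binomial(i, lam_infected, rng),
            binomial(r, lam, rng))


rng = random.Random(3)
leaving_s, leaving_i, leaving_r = sample_departures(900, 80, 20, lam=0.01, rng=rng)
```

Within a time step this draw would come first, followed by the within-node SIR update, matching the order described above.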