Introduction

Rapid urbanization and increased mobility bring new challenges for controlling epidemics1. Estimates show that more than half of the world’s population already lives in cities, and further large increases are expected, especially in Asia and Africa. Challenges presented by new megacities include the rapid spread of new epidemics, which can become worldwide threats due to increased global connectivity2,3,4. Poor housing conditions in rapidly growing cities can exacerbate epidemic threats, especially in the case of insect and rodent vector diseases and geohelminths5,6. Governments need to look for innovative solutions for monitoring and controlling epidemics7. An important part of these considerations is understanding the relationship between disease spread and human mobility, which have been previously linked on global scales2,3. In this paper we explore the effectiveness of pervasive technologies, specifically mobile phone data, in predicting and understanding the emergence of mosquito-borne disease outbreaks in urban environments. Cell phone data has been shown to be valuable in monitoring mobility patterns in near real-time8. Such information has a large potential in epidemiological modeling and control9, yet it has often been unreliable and difficult to obtain with traditional methods, especially in developing countries with rapidly changing urban environments and limited resources to conduct travel surveys.

We study the influence of human mobility on the spread of the mosquito-borne dengue virus, as inferred from a large-scale mobile communication dataset in the city-state of Singapore. Previous studies either focused on this problem at the scale of countries or regions2,9,10,11,12,13,14, essentially treating cities as well-mixed nodes in a larger travel network, or used small-scale data of human movement inside cities collected through surveys15, or relied only on theoretical and aggregate models of intra-city human mobility16,17,18,19,20. In contrast, we employ a large-scale dataset of human mobility to study the connection between intra-city mobility and dengue spread. We focus on comparing a dengue transmission model based on people’s real commuting patterns (as inferred from the mobile phone dataset) with the observed dengue cases and with simulations employing random mobility models. This allows us to measure the impact of the mobility model on the accuracy of modeling the spatial distribution of dengue cases. We especially focus on comparing random mobility, which results in perfect mixing of the population, with more structured mobility models, effectively evaluating the importance of intra-city human mobility in dengue spread.

Dengue fever is a mosquito-borne viral infection, transmitted by female mosquitoes of the species Aedes aegypti and Aedes albopictus when biting humans. The infection causes flu-like symptoms with occasional complications that can be fatal. There are four strains of the virus, and infection with one strain produces lifelong immunity to that type. However, a second infection with a different type increases the risk of severe complications. Dengue continues to be a global threat, with about half the world’s population being at risk of infection21. Worldwide, there are more than 50 million infections every year, leading to half a million hospitalizations and up to 25 thousand deaths. Dengue is prevalent in tropical and sub-tropical climates worldwide, mostly in urban and semi-urban areas. Prevention and control depend solely on controlling the mosquito populations. Vaccines are under active development, with a first-generation vaccine having become available recently22. Dengue affects Singapore in particular, and two major outbreaks were observed in 2013 and 2014 (Fig. 1(a)).

Figure 1

Temperature dependency of the dengue cases and schematic representation of the human-vector interactions in the epidemiological model. (a) Weekly observed dengue cases and average temperature in Singapore from January 2013 to December 2014. Two outbreaks took place during the summers of those two years. A correlation can be observed between temperature and the number of reported cases of people affected by the disease. (b) Compartmental classification for dengue. Humans can occupy one of the four top compartments: susceptible, in which individuals can acquire the infection through contacts (bites) with infectious mosquitoes; exposed, in which individuals are infected but are not yet able to transmit the virus; infectious, in which individuals are infected and can transmit the disease to susceptible mosquitoes; and recovered or removed, in which individuals are no longer infectious. The density of mosquitoes changes according to the seasonal transition from Aquatic (A) to Adult Mosquitoes (V). As in the human case, mosquitoes can occupy three different compartments, and they die at a rate that depends on the temperature.

The modeling of dengue outbreaks has attracted the interest of many researchers in many disciplines, from physics to computational biology. Existing models investigate, for example, the variability of the mosquito population23, the variability of the human population24, the vertical transmission between mosquitoes (that is, the transmission between mosquito generations)25 as well as seasonal patterns26. Otero et al.27,28 presented a dengue model that takes into account the evolution of the mosquito population. Another study shows that dengue appears to travel in waves10. As the flight range of mosquitoes is limited to a few hundred meters15, it is generally assumed that humans carry the dengue virus to previously dengue-free areas and infect local mosquitoes. There is evidence that the spread of mosquito-borne diseases is related to human mobility16. Various agent-based simulations suggest that the mobility of humans could be the main driving force behind the spread of the dengue virus29,30. Teurlai et al.13 showed that human mobility, estimated from the road network, influences the spread at a national scale in Cambodia. House-to-house human movements in particular seem to play a key role in Iquitos, Peru16. Related malaria studies show that human mobility, estimated from cell-phone networks, drives the dissemination of malaria parasites as well12. Recently, Wesolowski et al. studied the impact of human mobility on the emergence of dengue epidemics in Pakistan14 using mobile phone-based mobility.

Considering the threat presented by dengue, especially in cities, many authors have studied the effect of dengue fever in urban environments7,11,17,18,19,20. While these works generally assume that intra-city mobility is an important factor for dengue epidemics, a direct quantification of this effect is still lacking. For the first time, we analyze the effectiveness of mobile phone usage data in predicting the spread of dengue in an urban environment such as Singapore. In doing so, we compare random mobility patterns with the real ones estimated from anonymized mobile phone usage records in an agent-based model of dengue transmission adapted from previous studies11,14. This way, we are able to characterize the effect of human mobility on the spread of vector-borne diseases at urban scales, as well as the effectiveness of using mobile phone data to estimate disease epidemics at this scale.

Results

We propose an agent-based dengue transmission model in which humans and mosquitoes are represented as agents and humans go through the epidemic states of dengue23,24,25,29. To model dengue dynamics, we use a stochastic population model based on the ordinary differential equation (ODE) framework employed by Lourenco and Recker to describe a dengue outbreak in Madeira, Portugal11 and then used by Wesolowski and colleagues to model the dengue outbreak in Pakistan14. The epidemiological model depends on both temperature-dependent and constant parameters, as described in the Methods section and reported in Tables 1 and 2. We employ an agent-based approach for humans, while we model localized mosquito subpopulations stochastically. As a necessary simplification, we only consider one serotype of dengue; in this case, individuals can only be infected once. The physical environment in which the epidemic takes place is a regular grid, composed of \(320\,{\rm{m}}\times 320\,{\rm{m}}\) cells, overlaid on the city of Singapore.

Table 1 Temperature-dependent parameters.
Table 2 Constant parameters.

The model is composed of two phases: (i) the reaction phase, defined by the epidemiological model (see the Materials and Methods section for details and Fig. 1(b) for a schematic overview), in which disease transmission takes place within each grid cell; and (ii) the diffusion phase, in which agents are moved from one grid cell to another according to the mobility model under consideration. The mobility flows aggregated at the census district level for the different mobility models are reported in Fig. 2. Each day consists of two reaction phases, corresponding to day and night, and two diffusion phases, corresponding to people’s morning and evening commute.
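The alternation of reaction and diffusion phases within one simulated day can be illustrated with a minimal sketch. It is not the published simulation code: the function name `simulate_day`, the callback `reaction` and the assumption that a day starts with the morning commute are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_day(home, work, reaction):
    """Run one day: diffusion, reaction, diffusion, reaction.

    home, work: integer arrays with each agent's home and work cell index.
    reaction:   hypothetical callback reaction(cells, phase) standing in for
                the within-cell transmission step of the epidemiological model.
    Assumption: the day starts with the morning commute.
    """
    location = work.copy()        # diffusion: morning commute to work cells
    reaction(location, "day")     # reaction: daytime transmission at work
    location = home.copy()        # diffusion: evening commute back home
    reaction(location, "night")   # reaction: night-time transmission at home
    return location

# Toy usage: four agents on a three-cell grid; the callback only records
# how many agents occupy each cell during each reaction phase.
home = np.array([0, 0, 1, 2])
work = np.array([2, 1, 1, 0])
occupancy = []
simulate_day(home, work,
             lambda cells, phase: occupancy.append(
                 (phase, np.bincount(cells, minlength=3).tolist())))
```

In this sketch the mobility model enters only through the `home` and `work` arrays, so swapping mobility models amounts to swapping the work (and, where applicable, home) assignment.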

Figure 2

Commuting flows from home to work locations aggregated at the level of the 55 planning areas. The location of each node corresponds to the centroid of its area and its size corresponds to the incoming degree, i.e. the total number of agents that commute to that area every day. In this figure we report only the most significant nodes in terms of incoming flow (i.e. greater than the 95th percentile of the distribution). (a) The major hub in the mobile phone data mobility model corresponds to the Central Business District, where the majority of the jobs are located. (b) The random mobility model has different hubs randomly distributed in space. (c) The Levy-distribution and (d) the radiation model show similar patterns, with a homogeneous distribution over the territory without significant hubs; however, the mobility derived from the radiation model is more concentrated in the central part of the city.

In this work, we consider four different mobility models (see Fig. 2) and compare their predictive power for the dengue outbreaks of 2013 and 2014 in Singapore. In each mobility model, we assign a home and a work location (grid cell) to each agent, who is assumed to commute between the two daily. The models differ in how this assignment is made: (1) mobile phone data: we use anonymized call detail records of one mobile phone operator in Singapore, collected over a two-month period in 2011, which allow us to estimate home and work cells for 2.3 million agents; (2) random work location: in this case, we still use the home cells estimated from the mobile phone data, but work locations are assigned randomly; (3) Levy-distribution: each agent is assigned a random home location based on the mobile phone data and a work cell is chosen such that the commuting distance follows a truncated Levy-distribution; (4) radiation model: we use census data to distribute the home locations of agents31 and then we choose work cell locations according to the radiation model of Simini et al.32. In total, there are 2,598 grid cells with either a home or work location in them. A more detailed description of the mobility models is given in the Materials and Methods section, while we present a comparison between the mobility models in the Supplementary Material, in Figs S1 to S5. Most notably, flows of people at the district level are highly correlated between the mobile phone data and the radiation model (\(r=0.938\)), somewhat less correlated between the mobile phone data and the Levy-distribution model (\(r=0.901\)) and significantly less correlated between the mobile phone data and random mobility (\(r=0.304\)). We therefore conclude that the radiation, Levy-distribution and random mobility models give successively worse approximations of real mobility patterns.
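The district-level comparison of two mobility models can be illustrated as follows. The sketch assumes that \(r\) is the Pearson correlation computed over all origin-destination pairs of the aggregated flow matrices; the exact aggregation used for the reported values is described in the Supplementary Material and may differ. The function names `od_matrix` and `flow_correlation` are hypothetical.

```python
import numpy as np

def od_matrix(home_area, work_area, n_areas):
    """Count commuters for every (home area, work area) pair."""
    flows = np.zeros((n_areas, n_areas), dtype=int)
    # np.add.at accumulates correctly when the same pair occurs repeatedly.
    np.add.at(flows, (home_area, work_area), 1)
    return flows

def flow_correlation(flows_a, flows_b):
    """Pearson correlation between two flattened flow matrices."""
    return float(np.corrcoef(flows_a.ravel(), flows_b.ravel())[0, 1])

# Toy usage with three areas: the same homes, two different work assignments.
home = np.array([0, 0, 1, 2, 2, 1])
work_model_a = np.array([1, 1, 0, 2, 0, 0])
work_model_b = np.array([1, 2, 0, 2, 0, 1])
r = flow_correlation(od_matrix(home, work_model_a, 3),
                     od_matrix(home, work_model_b, 3))
```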

Besides the mobility model, we have two main variable parameters: the number of mosquitoes per human, \({x}_{v}\), and the average bite rate of mosquitoes, \(a\) (more thorough definitions of these and a discussion of model parameters are given in the Materials and Methods section). We perform a sensitivity analysis on these by exploring the phase space \({x}_{v}\in [0.004,0.1]\) and \(a\in [0.14,0.26]\). This allows us to calibrate our model to the population of agents in Singapore; this is a necessary step since exact estimation of these parameters is especially challenging in real-world conditions, and several parameters in the epidemiological model cannot be reliably measured in real-world conditions, only in controlled laboratory experiments33. In our approach, we use the best available estimates from the literature for most parameters, while allowing \({x}_{v}\) and \(a\) to vary to deal with this inherent uncertainty. We select the best parameter combination for each mobility model to evaluate our results.

We start our simulations with initial conditions for infected human agents based on the observed number and distribution of cases in January 2013, while we obtain the initial mosquito populations by running the population dynamic model for an initial warm-up period as described in the Materials and Methods section. To account for the stochastic nature of the simulation, for each parameter value, we ran the simulation 100 times and report the median and average values in the following.

Temporal analysis

In Fig. 3 we compare the number of observed cases with the median number of infected cases estimated from our simulations during the epidemiological weeks of 2013 and 2014 for the four different mobility models. In particular, for each mobility model we report the pair of parameters \({x}_{v}\)-\(a\) that maximizes the \({R}^{2}\) between the simulated and observed number of cases. Each mobility model is able to predict the temporal evolution of the dengue outbreaks quite well, since the epidemiological dynamics mainly depend on the temperature. The optimal values of the parameters \({x}_{v}\)-\(a\) differ between mobility models, as reported in the legends of Fig. 3. In order to find the best pair of parameter values, we compute the \({R}^{2}\) between the observed and the predicted number of cases between the 12th and 26th epidemiological weeks, when the epidemic peaked during the study period. We show optimal \({R}^{2}\) values in Table 3 and display the variation of log \({R}^{2}\) in the phase space in Fig. S13 in the Supplementary Information. The Mobile Phone Data and the Levy Distribution mobility models have the best accuracy, with \({R}^{2}\) values of 0.65 and 0.61 respectively, while the Random and the Radiation mobility models tend to overestimate the number of cases, with \({R}^{2}\) values of 0.52 and 0.57 respectively. Nevertheless, we still conclude that all models reproduce the main trends in the epidemic well.
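The parameter selection described above can be sketched as a grid search. Two assumptions are made here that the text does not specify: \({R}^{2}\) is taken to be the coefficient of determination, \(1-S{S}_{{\rm{res}}}/S{S}_{{\rm{tot}}}\), and weekly series are indexed from zero, so that weeks 12 to 26 correspond to indices 11 to 25. The callable `simulate` is a hypothetical stand-in for the median weekly case count of the simulation runs.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R^2)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def calibrate(observed, simulate, xv_values, a_values, weeks=slice(11, 26)):
    """Return (best R^2, x_v, a) over the parameter grid.

    simulate(x_v, a) is a hypothetical callable returning the simulated
    weekly case counts; only the weeks in `weeks` enter the score.
    """
    observed = np.asarray(observed, dtype=float)
    best = None
    for xv in xv_values:
        for a in a_values:
            predicted = np.asarray(simulate(xv, a), dtype=float)
            score = r_squared(observed[weeks], predicted[weeks])
            if best is None or score > best[0]:
                best = (score, xv, a)
    return best

# Toy usage: a stand-in "simulator" whose output scales with x_v * a.
base = np.arange(104, dtype=float)
toy = lambda xv, a: 1000.0 * xv * a * base
best_fit = calibrate(toy(0.05, 0.2), toy, [0.004, 0.05, 0.1], [0.14, 0.2, 0.26])
```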

Figure 3

Temporal analysis. We report the comparison between the best simulated scenario and the observed number of dengue cases during the 2013–2014 outbreaks. Parameter values for \({x}_{v}\) (average number of mosquitoes per human) and \(a\) (mosquito bite rate) are displayed in the figure legends for each case.

Table 3 Prediction accuracy (\({R}^{2}\)) for the best pair of parameters \({x}_{v}\), \(a\) for the different models.

Spatial analysis

In this section we show the results of our simulations and compare them with the spatial distribution of the number of reported cases in 2013–2014 in Singapore. The distribution of Ae. aegypti expanded during the decade from 2003 to 2013 and the percentage of houses with mosquito breeding in 2013 and 2014 was significantly higher than in previous years34. As expected, the dengue case distribution pattern in 2013 and 2014 was in line with the geographical spread of Ae. aegypti in the country34. The biggest clusters remain in Tampines in the eastern part of the island; however, more are now in the west and north. In order to quantify the effect of human mobility on the spatial propagation of dengue in Singapore, we compare the results of our model with the observed cases by considering the four different mobility patterns: (1) mobile phone data; (2) random; (3) Levy; (4) radiation. We show the cumulative spatial distribution of observed cases in Fig. 4 and of simulated cases in the four models in Fig. 5, with the \({x}_{v}\) and \(a\) parameters that give the best estimate for the temporal patterns (as reported in Fig. 3). We can see that mobility plays an important role in predicting the spatial distribution of the number of cases. Indeed, the cases predicted by the random mobility model are uniformly distributed across the city, while the other mobility models allow us to detect key hotspots of the outbreaks similar to those in the observed scenario.

Figure 4

Observed dengue cases. Cumulative spatial distribution of observed dengue cases during the 2013 and 2014 outbreaks.

Figure 5

Spatial analysis. We report the heatmap of the cumulative number of cases for the four mobility models. For each simulated scenario we report the results with the best parameter values, as shown in the figures.

To better distinguish between the predictive power of different mobility models, we computed structural similarity (SSIM) scores35,36 (see the Supplementary Information for a description) for each mobility model in each epidemiological week, with the best parameters \({x}_{v}\)-\(a\), and compare their distributions in Fig. 6. We can observe that the mobile phone mobility model and the radiation model perform similarly, consistently approximating the observed spatial distribution well during the time period of our study. The Levy-distribution model performs slightly worse, while the random mobility model gives significantly worse results. Looking at the results in Fig. 5, we find that the overall distribution of the infected cases for the random mobility model corresponds well to the average population density (i.e. the average of work and home locations in each cell). This means that our random mobility model achieves a good mixing among the population. The difference from the real distribution of dengue cases and from the other mobility models highlights that uniform mixing among the population does not account for the spread of dengue in Singapore; thus mobility patterns are an important factor. While previous large-scale epidemiological studies often treat cities as well-mixed nodes in a larger travel network2,9,10,11,12,13,14, our results show that disease spreading can exhibit important localized patterns inside cities as well, in line with studies done previously on smaller samples of the population or aggregate models of human mobility15,16,17,18,19,20. It is still unclear how the intra-city and inter-city epidemiological models are best reconciled; we note that frequent travelers are often a non-uniform sample of the total population of any city, so local and long-range spread of infectious diseases can have complex interwoven patterns.
The further difference between the Levy and radiation mobility model is consistent with previous work which found the radiation model to best reproduce the statistical properties of human commuting32. Furthermore, the good results obtained from the mobile phone data show that the home-work commuting estimated from this dataset indeed accounts for the most important factors in human mobility in Singapore.
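To make the SSIM comparison concrete, the following sketch computes a simplified, single-window SSIM between two case maps on the same grid. This is an illustrative assumption and not the exact procedure used for Fig. 6: SSIM is usually evaluated in local windows and then averaged, whereas here the whole grid is treated as one window. The stabilizing constants use the conventional factors 0.01 and 0.03, and the function name `global_ssim` is hypothetical.

```python
import numpy as np

def global_ssim(x, y, data_range=None):
    """Single-window SSIM between two equally shaped case maps.

    Simplification: means, variances and covariance are taken over the
    whole grid instead of over local sliding windows.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if data_range is None:
        data_range = max(x.max(), y.max()) - min(x.min(), y.min())
    if data_range == 0:
        return 1.0  # both maps are constant and identical
    c1 = (0.01 * data_range) ** 2   # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Toy usage: a map with one hotspot compared with itself and with a
# uniformly mixed map that has the same total number of cases.
hotspot = np.zeros((4, 4))
hotspot[1, 1] = 10.0
uniform = np.full((4, 4), hotspot.mean())
score_same = global_ssim(hotspot, hotspot)
score_uniform = global_ssim(hotspot, uniform)
```

In this toy case the uniform map scores far lower than the identical map, which mirrors the qualitative behavior reported for the random mobility model.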

Figure 6

Spatial analysis. Boxplot of the values of the SSIM index for each week during the 2013–2014 outbreaks, using the best parameters \({x}_{v}\)-\(a\) shown in Fig. 3. SSIM index values were calculated for each epidemiological week during the outbreak for each of the 100 simulation runs. The distributions of these values are shown as boxplots for each mobility model in this figure. The boxplots show the minimum, first quartile, median, third quartile and maximum of the SSIM values observed. We see that in all cases the range of the data is quite small; the mobile phone data and radiation model results are clearly distinguished from the random mobility and Levy-distribution results.

Discussion

More than 80% of the world’s population is at risk from at least one vector-borne disease3. The populations most at risk are those living in poverty in tropical and subtropical areas, but as the case of Singapore shows, highly developed cities and countries still need continued efforts to prevent outbreaks34. Rapid urbanization, the increase in international travel and trade, changes in agriculture and environmental change have increased the spread of vector populations, putting more and more people at risk. Mobile phone data can give real-time mobility information that can be combined with infectious disease surveillance data and seasonally varying environmental data to map these changing patterns of vulnerability in cities that are changing every day. In this paper we proposed an agent-based model in order to explicitly simulate the epidemic spread of the disease as governed by the transmission dynamics of the dengue virus through human-mosquito interactions and promoted by population movements across the city-state of Singapore. In this methodology humans and mosquitoes are represented as agents and humans go through the epidemic states of dengue.

We modelled four different mobility patterns: 1) mobility estimated from mobile phone data, 2) random mobility patterns, 3) mobility estimated from census data following a Levy distribution model and 4) mobility estimated from census data following a radiation model. We were able to reproduce the main temporal and spatial patterns of the dengue outbreak in 2013 and 2014. Our results show that human mobility is a very important factor in the spread of vector-borne diseases such as dengue even on the short scale corresponding to intra-city distances. This is evidenced by the large difference found between the observed spatial pattern of dengue cases and the ones obtained by the completely random mobility model which corresponds to a “perfect mixing” among the population. This extends the results obtained from the previous work of Wesolowski et al.14 who showed how human mobility determines the spread of dengue on the scale of a country and studies that investigated the relationship between human mobility and spread of vector diseases on different spatial scales7,9,12,17,19,20,37. We believe that our main contribution is showing that human mobility patterns are important for the spread of vector-borne diseases even on intra-city scales; this is in contrast to previous studies which often assume cities to be a well-mixed environment for the purpose of epidemiology and study disease spreading between cities and regions2,9,11. It is an interesting question for future work to what extent this result applies to other types of diseases, e.g. airborne infections that require only shorter co-location of people to spread, thus are able to exploit mixing of population in a more rapid way.

Furthermore, we found that more sophisticated models of intra-city mobility can give good estimates of the spatial spread of dengue, opening up the possibility of incorporating these into the modeling and control of vector diseases in urban environments. The proposed methods could be integrated into urban planning in near real time. Mobile phone data is an obvious candidate for this purpose, giving real-time information on people’s movements. A major limitation of mobile phone data generated by national operators is the difficulty of capturing cross-border travel patterns: it is not possible to monitor with high accuracy the flux of people travelling to the city. On the other hand, we found that the radiation model of people’s commuting behavior performs similarly well, opening up the possibility of improving the prediction of disease spread if accurate census data is available. Thus, we believe our methods can be readily used in other cities where these mobility models can be estimated, while accuracy will be affected by the overall predictability of human movements and the regularity of commuting patterns8,32,38,39,40. In conclusion, we note that the methods we presented here can be readily generalized to consider different mosquito-borne diseases such as dengue, chikungunya, malaria and yellow fever, and different sources and models of human mobility, and thus have large potential for better understanding, control and prevention of vector disease epidemics in urbanized areas.

Materials and Methods

The code used for our simulations is available online41.

Mobile phone data

Anonymized call detail records (CDRs) were collected over a two-month period in 2011 by one of the mobile phone operators in Singapore with a significant market share (a statistical analysis is reported in Fig. S9 in the supplementary materials). The data comprises more than 2 billion records in total and includes the approximate time and location of events (including phone calls and text messages). Locations are collected at the cell tower level with further noise applied for privacy reasons. We use this data to assign two “favorite” locations to each user: (i) home and (ii) work. In Singapore, according to a study by the Land Transport Authority, about 80% of all trips go to either a work or a home location42. This implies that infection with the dengue virus in Singapore very likely happens either at home or at work; thus we focus on commuting between these two locations when modeling human mobility in this paper. To estimate home and work locations, we perform a spatial clustering of the CDRs, creating overlapping clusters of events which are spatially close to each other (a threshold of 500 m was used to account for the potential uncertainty regarding which one of several nearby antennas a phone connects to). To be able to distinguish between home and work locations, we performed this clustering procedure separately for records generated between 8 pm and 6 am on weekdays and during weekends (for home locations) and records generated between 10 am and 4 pm on weekdays (for work locations). After this procedure, we selected the largest cluster in each case and kept only users who had at least 10 events in both clusters. Following this procedure, there are 2,307,230 users to whom we can assign a home and a work location. We then assign users’ home and work locations to a \(320\,{\rm{m}}\times 320\,{\rm{m}}\) grid overlay G, which we use as the basis of the epidemic simulation. We display the distribution of these home and work locations in Fig. S2, while we show the nonempty grid cells in Fig. S12 in the Supplementary Material.
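The time-window logic of the home and work assignment can be illustrated with a simplified sketch. It deliberately replaces the 500 m spatial clustering described above with a simpler rule, namely taking the most frequently visited \(320\,{\rm{m}}\) grid cell in each time window, so it should be read as an approximation of the procedure rather than a reimplementation. The record layout `(timestamp, x_m, y_m)` with projected coordinates in meters and all function names are assumptions.

```python
from collections import Counter
from datetime import datetime

CELL_SIZE_M = 320.0   # grid resolution used in the paper
MIN_EVENTS = 10       # minimum number of events required per location

def grid_cell(x_m, y_m):
    """Map projected coordinates (meters) to a (column, row) grid cell."""
    return (int(x_m // CELL_SIZE_M), int(y_m // CELL_SIZE_M))

def is_home_time(ts):
    """Weekends, or weekdays between 8 pm and 6 am."""
    return ts.weekday() >= 5 or ts.hour >= 20 or ts.hour < 6

def is_work_time(ts):
    """Weekdays between 10 am and 4 pm."""
    return ts.weekday() < 5 and 10 <= ts.hour < 16

def favorite_cell(events, time_filter):
    """Most visited cell in the time window, or None if too few events.

    Simplification: uses the modal grid cell instead of the largest
    500 m spatial cluster.
    """
    counts = Counter(grid_cell(x, y) for ts, x, y in events if time_filter(ts))
    if not counts:
        return None
    cell, n_events = counts.most_common(1)[0]
    return cell if n_events >= MIN_EVENTS else None

# Toy usage for one synthetic user (7 March 2011 was a Monday).
events = ([(datetime(2011, 3, 7, 22, 0), 100.0, 100.0)] * 12
          + [(datetime(2011, 3, 7, 11, 0), 1000.0, 700.0)] * 12
          + [(datetime(2011, 3, 7, 12, 0), 5000.0, 5000.0)] * 3)
home_cell = favorite_cell(events, is_home_time)
work_cell = favorite_cell(events, is_work_time)
```

A user would be retained only if both `home_cell` and `work_cell` are not `None`, which corresponds to the requirement of at least 10 events in both locations.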

To show that the mobile phone dataset is representative of Singapore, we compare the distribution of the home locations identified by our clustering procedure with official census data from 201031 (see Figs S10 and S11 in the SI). Singapore is divided into 55 urban planning areas43 and we compare the number of home locations identified in each of them with the 2010 census data31. With a correlation coefficient of 0.96, the two spatial distributions are highly linearly correlated, as shown in Fig. S10 in the supplementary materials. Furthermore, we note that the penetration of mobile phones (number of active mobile phone subscriptions compared to the total population) in Singapore was above 140% at the time of our study44, one of the highest rates in the world. We therefore expect that almost all of the population has a mobile phone and that many people have more than one subscription. As the flight range of mosquitoes is limited to a few hundred meters16, it is generally assumed that humans carry the dengue virus to previously dengue-free areas and infect local mosquitoes. For this reason, in our model mosquitoes do not travel between cells. Therefore, in the computational implementation each day is represented by two steps: daytime, during which the population stays at work, and night-time, during which the population stays at home.

Mobility models

We use four different models to estimate the mobility of people and assign home and work locations to our agents. The first is the mobility model defined according to the real estimation of mobility patterns from CDR data as described above. In the second mobility model, for each agent we take the home location from the mobile phone data, while the work location is assigned randomly (according to a uniform distribution) among the 2598 cells. The third mobility model is defined in the following way: for each agent we choose a random home cell of the grid, while the work location is chosen at a distance \(d\) that follows a truncated Levy distribution38, such that \(P(d)\sim {(d+{d}_{0})}^{-\beta }\exp (\,-\,d/k)\), where \(P(d)\) is the probability of having a distance \(d\) between home and work location, \({d}_{0}=100\,{\rm{m}}\), \(\beta =2\) and \(k=1500\,{\rm{m}}\). The fourth mobility pattern has been generated according to the radiation law of human mobility32. Accordingly, we generated a mobility pattern as follows: i) we assigned to each cell a number of inhabitants randomly distributed (normal distribution) according to the census data; ii) for each cell we consider that 80% of the inhabitants are commuters, while the other 20% work and live in the same cell; iii) for the commuters, we computed the flows between home and work locations based on the radiation law, which reads \(\langle {T}_{ij}\rangle ={T}_{i}\frac{{m}_{i}{n}_{j}}{({m}_{i}+{s}_{ij})({m}_{i}+{n}_{j}+{s}_{ij})}\), where \({T}_{i}\) is the total number of commuters from location \(i\), \({m}_{i}\) and \({n}_{j}\) are the populations of locations \(i\) and \(j\) respectively, and \({s}_{ij}\) is the total population in the circle centered at \(i\) and touching \(j\), excluding the source and the destination population.
The displacement of the agents for the different mobility models is reported in Figs S2 to S5 in the supplementary materials. A comparison of the generated mobility models shows that the radiation model and the model generated from the mobile phone data are the most similar, while there is almost no correlation with the random mobility model, as shown in Fig. S1 in the Supplementary materials.
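The two parametric mobility laws can be written down directly from the formulas above. The sketch below is illustrative only: the function names are hypothetical, the Levy weight is left unnormalized, and drawing a commuting distance from a finite list of candidate distances is an assumed discretization that the text does not specify.

```python
import math
import random

def levy_weight(d, d0=100.0, beta=2.0, k=1500.0):
    """Unnormalized truncated Levy weight P(d) ~ (d + d0)^(-beta) exp(-d/k).

    Distances are in meters; defaults are the parameter values given above.
    """
    return (d + d0) ** (-beta) * math.exp(-d / k)

def sample_commute_distance(candidate_distances, rng):
    """Draw one distance from candidates, weighted by the truncated Levy law."""
    weights = [levy_weight(d) for d in candidate_distances]
    return rng.choices(candidate_distances, weights=weights, k=1)[0]

def radiation_flux(T_i, m_i, n_j, s_ij):
    """Expected flow <T_ij> of the radiation law.

    T_i:  commuters leaving location i
    m_i:  population of the origin, n_j: population of the destination
    s_ij: population within the circle centered at i and touching j,
          excluding the origin and destination populations
    """
    return T_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))

# Toy usage: distances in multiples of the 320 m cell size, and the flow
# between two equally populated cells with nobody living in between.
rng = random.Random(0)
distances = [320.0 * step for step in range(1, 20)]
drawn = sample_commute_distance(distances, rng)
flow = radiation_flux(T_i=100, m_i=10, n_j=10, s_ij=0)
```

With no intervening population and equal origin and destination populations, the radiation law sends half of the commuters to the destination, since \({m}_{i}{n}_{j}/({m}_{i}({m}_{i}+{n}_{j}))=1/2\).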

Epidemiological data

Information about the weekly number of reported dengue cases in Singapore was collected from the Singapore government’s official one-stop portal45. During the 2013 dengue outbreak, a significant rise in the number of dengue fever cases was reported in Singapore, with a total of 22318 cases and 8 deaths. In the week of 16–22 June 2013, a record 842 dengue cases were reported in Singapore in a single week. This figure was far beyond the highest number of cases per week in the years 2010, 2011 and 2012. The number of weekly dengue fever cases exceeded the epidemic threshold of 237. A similarly high number of cases was reported over the course of 2014, with the weekly number of reported cases peaking at 891. In the following years, the number of dengue cases was significantly lower due to increased efforts to control the mosquito population. We show the total number of dengue cases during 2013 and 2014 in Fig. 1(a). For the spatial analysis of dengue outbreaks, we use a dataset compiled from data published by the National Environment Agency (NEA). Data was collected twice a week from May 2013 (except for a gap in October 2013) from SGCharts Charting Singapore’s Data46. The data provides the number of dengue cases in local spatial clusters that were established dynamically based on the location of recent cases. Spatial clusters are typically a few hundred meters in size, encompassing multiple city blocks. This gives us a good representation of the spatial spread of dengue, while still protecting the privacy of the people affected. We display the cumulative spatial distribution of dengue cases in Fig. 4.

Climate data

We collected data about climate conditions in Singapore during the years 2013 and 2014, in which two outbreaks took place during the respective summers. In Fig. 1(a) we show the number of dengue cases during the epidemiological weeks of 2013 and 2014 together with the average temperature. The impact of daily temperature fluctuations on dengue virus transmission by A. aegypti mosquitoes has been extensively studied; the results indicate that the weekly mean temperature is a statistically significant predictor of increases in dengue incidence in Singapore, highlighting the hazardous impact of climatic factors on the intensity and magnitude of dengue outbreaks47. This relationship can be observed for the outbreaks of 2013 and 2014 in the comparison between reported cases and temperature in Fig. 1(a). Weather data including mean temperature (MeanT, °C), minimum temperature (MinT, °C), maximum temperature (MaxT, °C), rainfall (Rain, mm), relative humidity (RH, %) and wind speed (WindS, m/s) were obtained from the National Environment Agency, Singapore (NEA)48.

Epidemiological model

The epidemiological model is shown schematically in Fig. 1(b). Motivated by research showing that mosquitoes have a very limited flight range and that infection is carried by human movement15,16,29,30, we assume mosquitoes to have a fixed location, i.e. there is no interaction between mosquito populations in distinct grid cells. For this reason, humans are treated as distinct agents, while the values for mosquitoes are aggregated at the cell level. The transitions in the proposed epidemiological model depend on temperature-dependent parameters, as reported in Table 1 and described in the Supplementary materials (see also Figs S6 and S7 in the Supplementary materials). The constant parameters are described in Table 2.

Humans

In the stochastic framework, we represent each human as an agent \(i\), who at each timestep \(t\) can be described by a pair \({(N,c)}_{t,i}\), where \(N=S,E,I,R\) is the epidemic state (susceptible, exposed, infected and recovered, respectively), and \(c\) denotes the grid cell where the agent resides. In our mobility models, \(c\) alternates between a fixed home and work location, either inferred from the mobile phone usage data in the realistic scenario or generated randomly. We denote by \({S}_{t,c}\), \({E}_{t,c}\), \({I}_{t,c}\) and \({R}_{t,c}\) the total number of susceptible, exposed, infected and recovered humans in cell \(c\) for timestep \(t\). We further use \({N}_{t,c}\equiv {S}_{t,c}+{E}_{t,c}+{I}_{t,c}+{R}_{t,c}\) for the total number of humans. We assume each agent to be susceptible to the virus initially (\(S\)). Upon challenge with infectious mosquito bites (\({\lambda }^{v\to h}\)), individuals enter the incubation phase (\(E\)) with a mean duration of \(1/{\gamma }^{h}\) days, later becoming infectious (\(I\)) for \(1/{\sigma }^{h}\) days and finally recovering (\(R\)) with life-long immunity.

In each timestep, state transitions can occur with probabilities \({\lambda }_{t,c}^{v\to h}/2\), \({\gamma }^{h}/2\) and \({\sigma }^{h}/2\) for the \(S\to E\), \(E\to I\) and \(I\to R\) transitions respectively (as each reaction timestep takes half a day, we obtain the transition probabilities by halving the daily transition rates). We evaluate the transitions individually for each human agent as a Bernoulli trial, and update the state accordingly. While the rates \({\gamma }^{h}=0.5\,day{s}^{-1}\) and \({\sigma }^{h}=0.25\,day{s}^{-1}\) are constants11,14, the rate \({\lambda }_{t,c}^{v\to h}\) is related to the mosquito population of the grid cell where the human agent is currently residing:

$${\lambda }_{t,c}^{v\to h}=a{\dot{\varphi }}^{v\to h}\frac{{I}_{t,c}^{v}}{{N}_{t,c}}=a{\dot{\varphi }}^{v\to h}\frac{{V}_{t,c}}{{N}_{t,c}}{\rho }_{t,c}^{I}\propto V{\rho }^{I}$$
(1)

where \(a\) is the biting rate (i.e. how many humans a mosquito bites on average per day), \({\dot{\varphi }}^{v\to h}\) is the disease transmission rate per bite, and \({I}_{t,c}^{v}\) is the total number of infected mosquitoes in cell \(c\) at time \(t\) (i.e. \(a\tfrac{{I}_{t,c}^{v}}{{N}_{t,c}}\) gives the probability that an infected mosquito bites the given human agent during this timestep). Furthermore, \({V}_{t,c}\) is the total number of mosquitoes in cell \(c\) and \({\rho }_{t,c}^{I}={I}_{t,c}/{N}_{t,c}\) represents the fraction of infected individuals in that cell. The change in compartments of human agents is then expressed by the following equations:

$${t}_{t,c}^{S\to E}=BD({S}_{t,c},{\lambda }_{t,c}^{v\to h}/2)$$
(2)
$${t}_{t,c}^{E\to I}=BD({E}_{t,c},{\gamma }^{h}/2)$$
(3)
$${t}_{t,c}^{I\to R}=BD({I}_{t,c},{\sigma }^{h}/2)$$
(4)
$${S}_{t+1,c}={S}_{t,c}-{t}_{t,c}^{S\to E}$$
(5)
$${E}_{t+1,c}={E}_{t,c}+{t}_{t,c}^{S\to E}-{t}_{t,c}^{E\to I}$$
(6)
$${I}_{t+1,c}={I}_{t,c}+{t}_{t,c}^{E\to I}-{t}_{t,c}^{I\to R}$$
(7)
$${R}_{t+1,c}={R}_{t,c}+{t}_{t,c}^{I\to R}$$
(8)

where \(BD(n,p)\) represents a sample taken from a binomial distribution with \(n\) trials and success probability \(p\). We note that during the simulation, the transition numbers \(t\) are not calculated by sampling a binomial distribution, but by performing an independent trial for each human agent with the appropriate transition probability and recording the number of successes. While the resulting \(t\) values are equivalent to sampling a binomial distribution directly, performing the individual trials allows us to track the state of each agent individually. This is necessary to update the populations in the next step based on the movement of agents determined by the mobility model used.
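As an illustration, the per-agent evaluation of Eqs (1)–(8) can be sketched in Python as follows. This is a minimal sketch and not the simulation code used in this study: the function names, the data layout (one list entry per agent) and the numbers in the usage lines are assumptions made for the example; only the values of \({\gamma }^{h}\) and \({\sigma }^{h}\) and the halving of the daily rates are taken from the text.

```python
import random

GAMMA_H = 0.5   # E -> I rate per day (value stated in the text)
SIGMA_H = 0.25  # I -> R rate per day (value stated in the text)
NEXT_STATE = {"S": "E", "E": "I", "I": "R"}


def vector_to_human_rate(a, phi_vh, infected_mosquitoes, n_humans):
    """Eq. (1): daily infection rate for a susceptible human in one cell."""
    if n_humans == 0:
        return 0.0
    return a * phi_vh * infected_mosquitoes / n_humans


def step_humans(states, cells, rate_by_cell, rng):
    """Advance all agents by one half-day step, in place.

    states: list of "S", "E", "I" or "R", one entry per agent
    cells: grid cell currently occupied by each agent
    rate_by_cell: dict mapping cell -> daily rate from Eq. (1)
    Returns a dict {(cell, "S->E"): count, ...} of realized transitions.
    """
    daily_rate = {"E": GAMMA_H, "I": SIGMA_H}
    counts = {}
    for i, (state, cell) in enumerate(zip(states, cells)):
        if state == "R":
            continue  # recovered agents keep life-long immunity
        rate = rate_by_cell.get(cell, 0.0) if state == "S" else daily_rate[state]
        if rng.random() < rate / 2:  # halve the daily rate for a half-day step
            new_state = NEXT_STATE[state]
            states[i] = new_state
            key = (cell, f"{state}->{new_state}")
            counts[key] = counts.get(key, 0) + 1
    return counts


# Usage with made-up numbers: three agents in two cells.
rng = random.Random(42)
states = ["S", "E", "I"]
cells = [0, 0, 1]
rates = {0: vector_to_human_rate(a=0.5, phi_vh=0.5, infected_mosquitoes=2, n_humans=100)}
transitions = step_humans(states, cells, rates, rng)
```

Summing the returned counts per cell gives the same quantities as the binomial samples in Eqs (2)–(4), while the updated `states` list retains the per-agent information needed for the subsequent mobility step.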

Mosquitoes

We model the vector population in each grid cell stochastically, where mosquitoes have two pertinent life-stages: aquatic (eggs, larvae and pupae, \(A\)) and adult females (\(V\))49. We keep track of the number of mosquitoes in each grid cell and calculate the transitions between the classes stochastically, based on rates calculated from the parameters of the model, some of which depend on the temperature. For this, we denote the total number of mosquitoes in each class by \({A}_{t,c}\) and \({V}_{t,c}\) respectively for timestep \(t\) and cell \(c\). We then calculate the changes in the number of mosquitoes of each class in each cell according to the following rules.

$${d}^{A}=BD({A}_{t,c},{\dot{\mu }}_{A}^{v}/2)$$
(9)
$${t}^{A\to V}=BD({A}_{t,c}-{d}^{A},{\dot{\varepsilon }}_{A}^{v}/2)$$
(10)
$${d}^{V}=BD({V}_{t,c},{\dot{\mu }}_{V}^{v}/2)$$
(11)
$${t}^{V\to A}=PD\left(cf\frac{{\dot{\theta }}_{A}^{v}}{2}\left(1-\frac{{A}_{t,c}}{{K}_{t,c}}\right){V}_{t,c}\right)$$
(12)

and then update the mosquito populations accordingly

$${A}_{t+1,c}={A}_{t,c}-{d}^{A}-{t}^{A\to V}+{t}^{V\to A}$$
(13)
$${V}_{t+1,c}={V}_{t,c}-{d}^{V}+{t}^{A\to V}$$
(14)

Here \(PD(x)\) represents a sample taken from a Poisson distribution with a mean of \(x\). The coefficients \(c\) and \(f\) are the fraction of eggs hatching to larvae and the fraction of female mosquitoes hatched from all eggs, respectively. For simplicity, and for lack of quantitative estimates for the local mosquito population, we assume both to be equal to one49. Moreover, \({\dot{\varepsilon }}_{A}^{v}\) denotes the rate of transition from aquatic to adult mosquitoes, \({\dot{\mu }}_{A}^{v}\) and \({\dot{\mu }}_{V}^{v}\) are the mortality rates for aquatic and adult mosquitoes, and \({\dot{\theta }}_{A}^{v}\) is the intrinsic oviposition rate. The logistic term \((1-\frac{A}{{K}_{t,c}})\) can be understood as the physical/ecological capacity available to receive eggs, scaled by the carrying capacity term \({K}_{t,c}\) in each cell. The effective carrying capacity \({K}_{t,c}\) is defined as:

$${K}_{t,c}={x}_{v}\frac{{W}_{c}+{H}_{c}}{2}$$
(15)

where \({x}_{v}\) is the average number of mosquitoes per human, and \({W}_{c}\) and \({H}_{c}\) are the numbers of people whose work or home location, respectively, is in cell \(c\). This form assumes that the number of mosquitoes in a cell scales with the average number of people found there, i.e. the mean of the nighttime population (defined by the number of home locations in that cell) and the daytime population (defined by the number of work locations). Depending on the efficiency of vector control mechanisms, the number of female Aedes mosquitoes per residence varies greatly between countries. In Puerto Rico, the number of mosquitoes appears to be between 5 and 10 per home50, whereas in Singapore, this number is estimated as slightly greater than 0.2 per home51. This means that the average number of mosquitoes per human in Singapore should be in the range from 0.004 to 0.01. Note that such incorporation of aquatic mosquitoes in our models assumes that every cell contains some breeding sites, which is necessary to sustain a mosquito population if we do not allow mosquitoes to travel between cells.
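A single half-day update of the mosquito population in one cell, following Eqs (9)–(15), could be sketched as below. This is an illustrative sketch under stated assumptions rather than the code used in this study: the function names are hypothetical, the binomial and Poisson samplers are simple textbook implementations chosen to avoid external dependencies, the rate values in the usage lines are placeholders (the actual rates are temperature-dependent), and clamping the oviposition term at zero when \(A\) exceeds \(K\) is an assumption not stated in the text.

```python
import math
import random


def binomial(n, p, rng):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


def poisson(mean, rng):
    """Poisson sample via Knuth's multiplication method (adequate for small means)."""
    if mean <= 0:
        return 0
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def carrying_capacity(x_v, n_work, n_home):
    """Eq. (15): K = x_v * (W_c + H_c) / 2."""
    return x_v * (n_work + n_home) / 2


def step_mosquitoes(A, V, K, mu_A, eps_A, mu_V, theta, rng, c=1.0, f=1.0):
    """One half-day update of aquatic (A) and adult (V) counts in one cell."""
    d_A = binomial(A, mu_A / 2, rng)            # aquatic deaths, Eq. (9)
    t_AV = binomial(A - d_A, eps_A / 2, rng)    # emergence of adults, Eq. (10)
    d_V = binomial(V, mu_V / 2, rng)            # adult deaths, Eq. (11)
    # Mean number of new aquatic individuals, Eq. (12); clamped at zero
    # if A exceeds K (an assumption of this sketch).
    mean_eggs = c * f * theta / 2 * (1 - A / K) * V if K > 0 else 0.0
    t_VA = poisson(max(mean_eggs, 0.0), rng)
    return A - d_A - t_AV + t_VA, V - d_V + t_AV  # Eqs (13) and (14)


# Usage with placeholder daily rates (not the temperature-dependent values).
rng = random.Random(1)
K = carrying_capacity(x_v=0.01, n_work=1000, n_home=3000)
A_next, V_next = step_mosquitoes(A=15, V=8, K=K, mu_A=0.1, eps_A=0.1,
                                 mu_V=0.05, theta=1.0, rng=rng)
```

Because mosquitoes are only tracked as counts per cell, aggregate sampling is sufficient here, in contrast to the per-agent trials used for humans.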

All the aquatic mosquitoes (\({A}_{t}^{V}\)) that become adult mosquitoes at time \(t\) are susceptible (\({S}_{t}^{V}\)). They can become exposed (\({E}_{t}^{V}\)) if they bite an infected human, and they become infected (\({I}_{t}^{V}\)) after an incubation time, as shown in Fig. 1(b). Both the aquatic and the adult mosquitoes can die with given probabilities (\({\mu }_{A}\) and \({\mu }_{V}\) respectively). Similarly to the human epidemiological model, the equations describing the vector dynamics are:

$${t}_{t,c}^{{S}^{V}\to {E}^{V}}=BD({S}_{t,c}^{V},{\lambda }_{t,c}^{h\to v}/2)$$
(16)
$${t}_{t,c}^{{E}^{V}\to {I}^{V}}=BD({E}_{t,c}^{V},{\dot{\gamma }}^{v}/2)$$
(17)
$${S}_{t+1,c}^{V}={S}_{t,c}^{V}-{t}_{t,c}^{{S}^{V}\to {E}^{V}}$$
(18)
$${E}_{t+1,c}^{V}={E}_{t,c}^{V}+{t}_{t,c}^{{S}^{V}\to {E}^{V}}-{t}_{t,c}^{{E}^{V}\to {I}^{V}}$$
(19)
$${I}_{t+1,c}^{V}={I}_{t,c}^{V}+{t}_{t,c}^{{E}^{V}\to {I}^{V}}$$
(20)

where the human-to-vector transition rate \({\lambda }_{t,c}^{h\to v}\) is defined as14:

$${\lambda }_{t,c}^{h\to v}=a{\dot{\varphi }}^{h\to v}{S}_{t,c}^{V}\frac{{I}_{t,c}^{v}}{{N}_{t,c}}.$$
(21)

These transitions depend on two temperature-dependent parameters, \({\dot{\gamma }}^{v}\) and \({\dot{\varphi }}^{h\to v}\).

Summary

Using these equations, running the model means iterating the following two steps: 1) evaluate the change of state for every human using individual Bernoulli trials, and the change in mosquito populations in each cell using Eqs (12) and (14); 2) update the locations of human agents based on the mobility model and recalculate the number of humans of each class in each cell accordingly. We can characterize the mosquito population dynamics and the epidemics based on the ODE representation of the previous model (see SI for the corresponding equations). Using these, we can derive the basic offspring number (\(Q\)), that is, the mean number of viable female offspring produced by one female adult during its entire time of survival (and in the absence of any density-dependent regulation) as:

$$Q=\frac{{\dot{\varepsilon }}_{A}^{v}}{{\dot{\varepsilon }}_{A}^{v}+{\dot{\mu }}_{A}^{v}}\frac{cf{\dot{\theta }}_{A}^{v}}{{\dot{\mu }}_{V}^{v}}$$
(22)

All parameters defining \(Q\) are temperature-dependent (see below). For a fixed temperature \({T}_{0}\) it is possible to derive expressions for the expected population sizes of each mosquito life-stage modelled. These are used to initialize the system, given the temperature present at the initial timepoint:

$$A({T}_{0})=K\left(1-\frac{1}{Q({T}_{0})}\right),\qquad V({T}_{0})=K\left(1-\frac{1}{Q({T}_{0})}\right)\frac{{\dot{\varepsilon }}_{A}^{v}({T}_{0})}{{\dot{\mu }}_{V}^{v}({T}_{0})}$$
(23)
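For concreteness, Eqs (22) and (23) can be evaluated as in the following Python sketch. The function names and the rate values in the usage lines are illustrative assumptions (the actual rates are temperature-dependent), and returning zero populations when \(Q\le 1\) is an assumption of this sketch for the case in which Eq. (23) would yield non-positive values.

```python
def offspring_number(eps_A, mu_A, mu_V, theta, c=1.0, f=1.0):
    """Eq. (22): basic offspring number Q at a fixed temperature."""
    return eps_A / (eps_A + mu_A) * c * f * theta / mu_V


def equilibrium_populations(K, eps_A, mu_A, mu_V, theta, c=1.0, f=1.0):
    """Eq. (23): expected aquatic and adult population sizes (A, V)."""
    Q = offspring_number(eps_A, mu_A, mu_V, theta, c, f)
    if Q <= 1:
        # Assumption of this sketch: no sustained population when Q <= 1.
        return 0.0, 0.0
    A = K * (1 - 1 / Q)
    V = A * eps_A / mu_V
    return A, V


# Usage with placeholder daily rates and a carrying capacity of 20.
Q = offspring_number(eps_A=0.1, mu_A=0.1, mu_V=0.05, theta=1.0)
A0, V0 = equilibrium_populations(K=20.0, eps_A=0.1, mu_A=0.1, mu_V=0.05, theta=1.0)
```

The adult population follows from the aquatic one through the balance between emergence and adult mortality, which is why \(V\) is computed from \(A\) rather than independently.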

Including the humans, the expression for dengue’s basic reproductive number is defined similarly to previous modeling approaches52,53 but without human mortality:

$${\dot{R}}_{0}=\frac{V}{N}\frac{{a}^{2}{\dot{\varphi }}^{v\to h}}{{\sigma }^{h}{\dot{\mu }}_{V}^{v}}$$
(24)

We note that our model necessarily includes some simplifications. Most importantly, parameter values for mosquito population modeling come from controlled experiments performed in laboratory studies49. It seems prohibitively challenging to estimate these parameters directly in the wild, as tracking individual mosquitoes is infeasible; studies can test the applicability of the models by comparing predictions to estimates of observed mosquito population sizes. Furthermore, accurately measuring mosquito populations itself presents difficulties in real-world conditions. We note that uncertainties in parameters are inherently linked in our model; e.g. a shorter mosquito lifespan could be offset by a higher bite rate, as is evident from Eq. 24. As a result, any calibration process among the parameter values will likely be degenerate. Another main limitation of our dataset is that we have no estimate of any existing immunity to dengue in the population. While dengue has multiple strains, and partial or full immunity can be acquired after being infected with a specific strain, the picture is quite complex. Similarly to the uncertainty of parameters for mosquitoes, uncertainty in the size of the susceptible population is linked to variations in other parameters. For this reason, we do not perform a scaling of the population size, but use the sample obtained from the mobile phone data, which covers a large part of Singapore’s population. We deal with these issues by using established values and temperature-dependent forms from the literature for most parameters11,14,49, while exploring a phase space determined by variations in a small number of parameters, namely the bite rate (\(a\)) and the average number of mosquitoes per human (\({x}_{v}\)). Finding an ideal combination for this pair of parameters allows us to calibrate the model for Singapore, while avoiding overfitting.
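The degeneracy between parameters can be made explicit by evaluating Eq. (24) numerically. The short Python sketch below uses a hypothetical function name and made-up parameter values; it only illustrates that doubling the adult mortality rate (i.e. halving the lifespan) is exactly compensated by increasing the bite rate by a factor of \(\sqrt{2}\).

```python
import math


def basic_reproductive_number(V, N, a, phi_vh, sigma_h, mu_V):
    """Eq. (24): R0 = (V / N) * a^2 * phi_vh / (sigma_h * mu_V)."""
    return (V / N) * a ** 2 * phi_vh / (sigma_h * mu_V)


# Made-up values for illustration only.
base = basic_reproductive_number(V=100, N=10000, a=0.5, phi_vh=0.5,
                                 sigma_h=0.25, mu_V=0.1)
# Halve the mosquito lifespan (double mu_V) and raise the bite rate by sqrt(2).
offset = basic_reproductive_number(V=100, N=10000, a=0.5 * math.sqrt(2),
                                   phi_vh=0.5, sigma_h=0.25, mu_V=0.2)
same_r0 = math.isclose(base, offset)
```

Since both parameter sets give the same value, observed case counts alone cannot distinguish between them, which motivates restricting the calibration to \(a\) and \({x}_{v}\).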

In summary, as initial conditions for the simulations we consider N = 2,307,230 agents derived from the mobile phone data as described above. At the beginning of the simulations, i.e. January 1st 2013, we set the number of initially infected agents \({I}_{init}\) as retrieved from the Singapore government’s official portal45: in particular, \({I}_{init}=242\) infected individuals in 93 different cells of the grid G. In order to keep the outbreaks alive, we ensured that the number of infected individuals in the system always satisfies \(I\ge 100\), as visible in Fig. 3. The initial numbers of aquatic and adult mosquitoes were computed for each value of the parameter \({x}_{v}\) by starting the mosquito dynamics on January 1st 2011. For each day from January 1st 2011 to December 31st 2012 we collected the temperature and simulated the dynamics of aquatic and adult mosquitoes in each cell, given the population estimated from the mobile phone data and following Eqs 12 and 14. In this way, for each value of the parameter \({x}_{v}\) we obtained a stable number of aquatic and adult mosquitoes on the first day of the simulation.