Introduction

Dengue is one of the most prevalent vector-borne arboviruses, transmitted between humans by Aedes mosquitoes. The number of dengue cases reported to the World Health Organization (WHO) has increased steadily from an average of less than a thousand cases per year globally in the 1950s to over 3.34 million in 20161,2. Moreover, a recent study estimated that 390 million dengue infections occur each year3 while another indicated that 3.9 billion people in 128 countries are at risk of infection4. The number of cases is still increasing as the disease spreads to new areas and explosive outbreaks occur. This global expansion of dengue virus (DENV) can be due to several causes, including environmental and climate factors. Climate is an important determinant of vector-borne disease epidemics as it directly influences the abundance and distribution of the vector. Together with environmental factors such as the degree of urbanisation, climate variations may result in the geographic expansion of mosquitoes and dengue into new areas5.

Studies have shown that a positive association exists between temperature and dengue transmission6,7,8,9. One possible reason is that warmer temperatures lead to a higher abundance of Aedes mosquitoes by increasing their survival and development rates10,11. Furthermore, the extrinsic incubation period (EIP; the time required for viruses to become transmissible after the initial infection of a mosquito) of DENV may be shortened under warmer conditions12,13, resulting in increased efficiency of viral transmission by allowing more mosquitoes to transmit the virus after exposure. The effect of rainfall amount on dengue incidence, on the other hand, is more controversial. Rainfall has been generally thought to be a positive predictor of dengue vector abundance as it provides essential mosquito habitats11,14. Recent studies, however, have demonstrated that excess rain may also flush away eggs, larvae or pupae of mosquitoes and remove mosquito breeding sites15,16, hence negatively effecting dengue incidence. Another study has found that dry seasons followed by a period of excess rainfall can lead to a high risk of dengue outbreaks, indicating that either positive or negative effects can occur, however, with different time lags17. Therefore, a better understanding of how rainfall affects dengue infection rates in areas with large outbreaks or increasing incidence could provide better projections of climate effects on dengue expansion.

The South-East Asia and Western Pacific regions, two of the six WHO regions, together have accounted for nearly 75% of the current global burden of dengue infections1. A rapid increase of dengue incidence has been recently observed near the centre part of the Western Pacific region, such as in Guangzhou (the capital city of Guangdong province in southern China) and southern Taiwan18,19. Hong Kong, located geographically close to both Taiwan and Guangzhou, has a typical East Asian subtropical monsoon climate, characterised by higher temperatures and copious amounts of rain in the spring and summer months20, which may favour dengue vector abundance and cause outbreaks later during or after summer. Despite the successful dengue control measures introduced after the first local case was identified in Hong Kong in 200221, there has been a gradual increase in dengue incidence after 2014 with the highest number of dengue so far recorded in 2018. Identifying the role of climate variables, particularly the effects of rainfall, on recent distribution trends in Hong Kong is urgently needed in order to understand the recent large outbreak and geographical expansion of dengue in the Western Pacific region or globally. However, the effects of pre-summer rainfall on dengue incidence around southern China are often ignored since the rainfall season normally has a lead time up to three to five months before the outbreaks. These lag effects were often not considered in many of the previous studies in this region22,23,24,25.

This study aims to enable prediction the annual number of dengue cases using seasonal monthly temperature and rainfall data before summer and assessing the effects of climate variability on the recent increase of dengue incidence in Hong Kong. Additionally, given that all dengue outbreaks/infections in Hong Kong occurred as either local clusters or sporadic cases in certain areas instead of large regional scale epidemics, the study utilised a generalized linear mixed model that contains area-specific random effects to account for environmental factors which are difficult to measure and fixed effects to account for climate effects on dengue transmission. Understanding how the climate may affect annual dengue incidence is important in order to forecast future possible dengue outbreaks, thereby allowing for more effective mosquito control measures in each region, leading to early preparedness and prevention of dengue.

Results

Area-specific dengue cases in Hong Kong

After the first local case of dengue fever was discovered in 2002, a large dengue outbreak occurred in the same year and another in 2018 while sporadic cases mainly occurred between 2014 and 2017. We divided Hong Kong into 3 areas: (i) New Territories-South (NTS), (ii) New Territories-North (NTN), and (iii) Hong Kong Island & Kowloon (HKL), based on the administrative districts, the population density, and the usage of the Mass Transit Railways (see Methods). The existence of spatial heterogeneity of dengue infections was observed between the defined areas (Fig. 1 and its inset). For example, HKL and NTS together consisted of all the infections during 2018 outbreak, while HKL and NTN contained all the sporadic cases from 2014 to 2017. Note the special case of the year 2002 when 17 out of 20 locally acquired dengue cases, which included one case transmitted via blood transfusion, were all located around a construction site on an island geographically at the centre of the three areas (shown in red in Fig. 1). After the blood transfusion case was removed, the remaining cases were equally assigned to the three areas with one extra case assigned to NTN.

Figure 1
figure 1

The area division: New Territories - South (NTS; light blue), New Territories - North (NTN; orange), and Hong Kong & Kowloon (HKL; green) along with the selected automatic weather stations (white circles) and rainfall stations (light yellow circles). The grey lines indicate the official district borders. As Ma Wan Island (red) is located at the center, all the 17 non-blood transfusion cases on that island are assigned into the 3 areas in our study: 5 cases in NTS, 6 cases in NTN, and 5 cases in HKL. The inset provides annual local case numbers from 2002 to 2018.

Climate predictors selection

In order to select suitable climate parameters for area-specific annual dengue forecasting, we compared 4 models with different combinations of temperature (either monthly mean or minimum; indicated by Tmean or Tmin) and rainfall (either monthly total or maximum; indicated by Rtot or Rmax). These climate parameters were suggested by the prior studies conducted in nearby countries8,26,27, but their association with recent dengue outbreaks in Hong Kong is still unknown and the effects of rainfall on dengue spreading are mixed. Furthermore, spatial differences in the climate data were not considered when the effects of monthly climate predictors were analysed8. Thus, all the climate data within the studied years from 2002 to 2018 were retrieved from 11 weather stations and 2 automatic rainfall stations across the defined areas (white and yellow circles Fig. 1), with the averages from January to August shown in Fig. 2. In general, the highest temperatures and the highest rainfalls were recorded between June and August and after April, respectively.

Figure 2
figure 2

The monthly climate data for the whole of Hong Kong from 2002 to 2018. (a) Monthly mean and monthly minimum temperature, and (b) monthly total and monthly maximum rainfall from January to August for the years 2002, 2003–2017 combined, and 2018 are shown. Climate data were retrieved from 11 weather stations and 2 automatic rainfall stations across the three pre-defined areas.

Since most of the local (indigenous) dengue cases occurred during the summer seasons starting in or after August (Fig. 3) and in Hong Kong mosquitoes are not commonly observed during the winter until March28,29, for each combination of climate parameters we used monthly climate predictors (Fig. 4) from March to August in the 3 defined areas to forecast area-specific annual dengue cases using a Poisson mixed effects model. The predictors were first normalised in order to compare the effects between each predictor. The stepwise algorithm based on the Akaike Information Criterion with a correction (AICc) was used to select the monthly predictors for each model (Supplementary Table S1). The best-fitting model was determined according to both the AICc and Bayesian Information Criterion (BIC).

Figure 3
figure 3

The number of monthly reported local dengue cases from 2002 to 2018. Different colours represent the number of dengue cases in different years.

Figure 4
figure 4

The normalised climate predictors for each area using (a) the monthly mean temperature and (b) the monthly total rainfall. The areas NTS, NTN, and HKL are defined as in Fig. 1.

Area-specific annual dengue forecasting

The results showed that the models with the Tmean climate parameters set performed better than the models with the Tmin parameters (Table 1). The best-fitting model, Model: Tmean + Rtot was used as our predictive model for dengue cases while Model: Tmean + Rmax was used as an alternative model as both the AICc and BIC results were equal to only 1 for Model: Tmean + Rmax. The best-fitting model estimated using the Laplace approximation is given by:

$$\begin{array}{lll}{\rm{\log }}\,({\mu }^{i}) & = & -6.2+3.594\times {T}_{3}^{i}+2.542\times {T}_{4}^{i}+1.21\times {T}_{5}^{i}+10.072\times {T}_{7}^{i}\\ & & -4.002\times {R}_{4}^{i}-4.029\times {R}_{5}^{i}-8.047\times {R}_{6}^{i}+{\alpha }^{i}\end{array}$$
(1)

where the standard deviation of the random intercepts σ = 0.789. A significance test was performed for each fixed effect predictor using the likelihood ratio test. P-values were less than 0.05 for 8 out of 9 regression coefficients (Table 2). The random intercepts represent different levels of effects resulting from unobserved area-specific factors, which may include varying land usages or population densities between different areas.

Table 1 The AICc and BIC of each model with the selected predictors. The most optimal set of predictors was determined using the StepAICc for each climate parameters set. Model: Tmean + Rtot was selected as the best-fitting model and Model: Tmean + Rmax was used as the alternative model.
Table 2 The Likelihood Ratio Test (LRT) results for each selected climate predictor in the best-fitting model. Each predictor was removed and compared to the original model using ANOVA. The p-value was then calculated given that the test statistic of the LRT is asymptotically approximating a χ2 distribution. Tm and Rm denote the temperature and rainfall predictors in the mth month of the year.

The model performance was evaluated using leave-one-out cross-validation (LOOCV) and leave-one-year-out (also called leave-three-out) cross-validation (LOYOCV). For LOOCV, the observation in one area in a given year was removed before refitting the model. The best-fitting model successfully predicted the major outbreaks by year and by area using LOOCV (Fig. 5a–c, Supplementary Table S2 and Supplementary dataset S1). 42 out of a total of 51 (82.4%) total observations by year and by area (annual incidence rates in 17 years and 3 areas) were within the 95% confidence interval of the annual incidence predicted by our model. Note that the upper and the lower bounds of the confidence intervals were rounded to the nearest integer when assessing the prediction accuracy of observations in Table S2. 5 out of 6 observations of area-specific outbreaks in 2002 and 2018 were able to be predicted. Although the 2018 outbreak in NTS was not predicted, the model still identified an outbreak (defined as the annual number of dengue cases  >  2) in that area. When LOYOCV was used, the model predicted outbreaks (defined as the annual number of dengue cases  >  2) in 5 years (i.e., the years 2002, 2003, 2007, 2015 and 2018), which include the two major outbreak years (2002 and 2018) and a year with sporadic cases (2015) (Fig. 5d).

Figure 5
figure 5

Comparison between observed and predicted number of annual dengue cases. (ac) Observed and predicted number of annual dengue cases in each pre-defined area (NTS, NTN, and HKL) with 95% confidence intervals using leave-one-out cross-validation. The predicted values represent the mean of the Poisson distribution of the annual dengue cases in each area, estimated using a generalized linear mixed model with a restricted maximum likelihood method. (d) Observed and predicted number of annual dengue cases in the whole Hong Kong area (ALL) with 95% confidence intervals using leave-one-year-out cross-validation.

The best-fitting model produced an MSEtr of 0.592, an MSEva of 3.538 and an MSEratio (MSEvaMSEtr) of 5.976 using LOOCV (Table 3). Both the MSEva and the MSEratio of the best-fitting model were the lowest compared with the alternative (Model: Tmean + Rmax), fixed effects (same predictors as the best-fitting model but without random effects), and full models, indicating the best prediction performance among all the tested models. The alternative model using Tmean and Rmax produced an MSEtr of 0.672 and an MSEva of 5.143, which confirmed that the chosen best model fitted the data better than the alternative. The fixed effects model performed similarly to but slightly worse than the best-fitting model. The full model, including all predictors, had the lowest MSEtr value but the highest MSEva and the highest MSEratio, indicating an overfitting phenomenon. The best model also performed better than the other models when the Normalised Mean Squared Error (NMSE) was used. When LOYOCV was used, the MSEratio (12.629) of the best-fitting model was slightly higher than that obtained from LOOCV results. Although the fixed effects model can perform similarly or slightly better according to the MSEva (or NMSEva), the best-fitting model performed better than the other models according to the MSEratio (or NMSEratio) (Supplementary Table S3). These results demonstrate that while working with a small number of dengue observations in Hong Kong, a mixed effects model with AICc-selected variables can reduce model overfitting.

Table 3 The mean squared errors (MSE) and normalised mean squared errors (NMSE) of each model tested using leave-one-out cross-validation. The MSE and NMSE for the training and the validation sets, and the ratios of the MSE and NMSE of the validation set to that of the training set, respectively, are listed. The best-fitting model (Model: Tmean + Rtot) was compared with the alternative model (Model: Tmean + Rmax), the fixed effects model (same predictors as the best-fitting model but without random effects) and the full model (mixed effects with all predictors).

Effects of climate variables

The predictive model showed that all monthly mean temperature predictors (T3, T4, T5, and T7) were positively correlated with the number of annual cases, while monthly total rainfall predictors (R4, R5, and R6) were all negatively correlated. These behaviors were also found in the stepAICc results of the other 3 models with different climate parameter sets (Supplementary Table S1), confirming the positive correlations of the temperature and the negative correlations of the early rainfall predictors with the number of annual cases. Among all the temperature predictors, T7 (the mean temperature in July) was the most significant predictor (p < 10−8) and its regression coefficient had the highest magnitude, indicating the strongest effect. The coefficients of the 3 rainfall predictors (from April to June) were of similar magnitude, indicating a much longer delayed effect of rainfall compared to temperature. In the NTS and HKL areas, relatively high values for T3, T5, and T7 and relatively low values for R4 and R5 were found in 2018, which may explain the the outbreak in 2018 in those two areas.

We further examined the marginal effects of the 4 predictors (T7, R4, R5, and R6) with the greatest magnitude of regression coefficients on the relative risk RR (Supplementary Fig. S1). RR was defined as the number of annual cases divided by the average number of annual cases. Marginal effects of the collected temperature (without normalization) \(T{\prime} \) and rainfall \(R{\prime} \) (without normalization) were also obtained after refitting the same model (Supplementary Fig. S2). If the rainfall is lower than 200 mm (per month) between May and June, or lower than 100mm in April, a higher relative risk of dengue incidence is observed.

Discussion

This study is the first to explore the association between seasonal climate variables and dengue incidence while taking into account the spatial heterogeneity of the dengue cases in Hong Kong. After dividing Hong Kong into 3 areas, a Poisson mixed effects model was fitted using the monthly temperature and rainfall data from the early months of the year to forecast annual dengue incidence in the defined areas. Given that most of the dengue infections occurred in August and September, the model successfully predicts annual dengue incidence using seasonal weather data from the preceding months. Instead of using a distributed lag model, our approach allows us to capture the impact of the weather in each month on the annual number of dengue cases, which is more suitable for regions affected by seasonal epidemics8.

Because most of the outbreaks occurred in August and September, the period of the delay of climate effects can be estimated. Our results showed that the temperature is positively associated with the relative risk within time lags of at least 1 or 2 months (from T7 to the start of the outbreak), which is consistent with previous studies17,30,31,32,33,34. These lagged effects could be caused by the incubation periods of dengue viruses and the mosquito’s reproductive life cycle. The delayed effect of rainfall from between 3 and 5 months (from R4, R5 and R6 to the start of the outbreak) in our study is also consistent with recent findings that rainfall can be negatively associated with the risk of dengue outbreaks with a lead time up to 5 months17.

The three defined areas NTS, NTN, and HKL in our study were selected to reflect the spatial heterogeneity of the dengue cases in Hong Kong. This spatial heterogeneity may be caused by area-specific factors other than climate variables; for instance, urbanisation, transportation, population density, etc., could all contribute to the expansion of dengue. For the purpose of our study, as we mainly focus on the impact of climate, those area-specific factors were considered as random effects in our model. Introducing random effects into the model can lead to higher accuracy of parameter estimation especially when the dataset contains non-independent observational units that are hierarchical in nature35.

The significance of understanding the climate-dengue association lies in three aspects. First, being a highly populated region with more than 50 million people travelling to the city annually36, infectious diseases can spread rapidly not only within Hong Kong but also to other countries due to its high level of contact mixing within- and between-countries37,38,39. Therefore, whether an even larger dengue outbreak will occur in Hong Kong in the future has become a primary global health concern.

Second, we demonstrated that in addition to global warming, the variation in rainfall intensity associated with the East Asian monsoon may play a significant role in the recent dengue expansions in some East Asian subtropical countries/cities. An increasing risk of seasonal dengue epidemics has been observed in both southern Taiwan and Hong Kong in the recent years, and similar to our findings, a negative association between early rainfall and dengue infections has previously been identified in southern Taiwan8. As Hong Kong and Taiwan are both located in the East Asian subtropics, they are both affected by the East Asian rainy season occurring annually between February and June40. Our results suggest that the variability during or even before the East Asian monsoon (pre-summer) rainy season may pose a high risk for dengue expansion in the region and complex non-linear effects should be studied further. As for Guangzhou, we noticed that before the 2014 dengue outbreak period (the largest one in Guangzhou), heavy rainfall mixed with drought (or low initial water level) were observed in that particular year41,42.

Third, our findings on rainfall effects can be used to inform mosquito control strategies in Hong Kong and possibly in other nearby countries similarly affected by the East Asian monsoon. The authorities of Hong Kong often enhance mosquito control measures while increasing public awareness especially during periods of heavy rainfall43,44. Our results suggest that in this particular region, the relative risk of incidence is higher when the intensity of rainfall before summer is lower. This implies that stepping up mosquito control measures to reduce mosquito activities is the most critical when the early rainfall amount is lower than the expected level.

In conclusion, we have developed a generalized linear mixed model that is able to successfully forecast annual dengue incidence across Hong Kong by taking into account both the climate data and the unobserved area-specific factors. Our findings may provide a valuable assessment on how to plan dengue prevention and control measures more effectively in Hong Kong.

Methods

Study area

Hong Kong is officially divided into 18 districts, which are grouped into three main areas: Hong Kong Island, Kowloon, and the New Territories (Supplementary Fig. S3). Based on the geography, population density, and the usage of the Mass Transit Railways (MTR), which is the most heavily used public transportation in Hong Kong, we redivided the city into three main areas. We first divided the New Territories into two regions: New Territories-South (NTS) and New Territories-North (NTN) such that NTS consists of the smaller and outlying islands while NTN consists of the region connected to Mainland China. We then merged Hong Kong Island and Kowloon into Hong Kong Island & Kowloon (HKL) because of their high similarities in terms of population density (Supplementary Fig. S4) and public transportation network. Kwai Tsing District was reassigned from the New Territories into this area as it also has a high population density, is located close to Kowloon, and is linked to the area by the busiest MTR line45.

Dengue data

The confirmed numbers of local dengue fever cases from 2002 to 2018 were collected through press releases issued by the Centre for Health Protection (CHP) of the Department of Health of the Hong Kong Government46. The official statements indicate the location and the date of every local case confirmed through laboratory tests such as NS1 antigen and IgM serological assays, where the possible infection location is determined by either the residence or the travel history of the patient. The annual number of dengue cases was defined as the total number of confirmed local (indigenous) cases in each year (Fig. 1 inset); the onset of dengue fever symptoms in all years between 2002 and 2018 was observed between June and December (Fig. 3). Imported cases, confirmed by CHP, were excluded in order to rule out travel-related factors influencing infection rates.

Meteorological data

Both the daily mean temperature and the daily total rainfall from 2002 to 2018 inclusive were retrieved from the Hong Kong Observatory47. A total of 11 out of the available 49 automatic weather stations were selected to represent all districts in Hong Kong (Fig. 1). The stations were chosen as they provided the most complete temperature and rainfall data in those years; any weather stations with more than 30% missing data were excluded. The selected station with the highest number of missing data is in Tuen Mun District since it only started recording from 2007 onwards. Two additional automatic rainfall stations on the island were selected due to the incomplete rainfall data from the weather stations in Hong Kong Island. Daily weather data were then averaged across different stations in each defined area to obtain average area-specific weather data. Monthly mean (minimum) temperature is defined as the average (minimum) of the daily mean temperature while the monthly total (maximum) rainfall is defined as the total (maximum) of the daily total rainfall. In general, a strong seasonal pattern was observed among the climate parameters from 2002 to 2018 (Fig. 2). An increasing trend in monthly temperature and rainfall was observed from January until July and June, respectively.

Model framework

Given the similar seasonality of dengue epidemics in Southern Taiwan and Hong Kong where infections mostly appeared in the latter half of the year starting in summer, we extended a previous study of forecasting dengue annual cases in Southern Taiwan by fitting a model using the temperature and rainfall data from the earlier part of the year to predict the subsequent area-specific outbreaks8. In order to account for any unobserved factors relating to the studied areas, we developed a Poisson generalized linear mixed model (GLMM) to fit the confirmed cases in the different defined areas. Let \({Y}_{t}^{i}\) denote the annual number of dengue cases and \({\mu }_{t}^{i}\) denote the expected number of dengue cases, which is the corresponding distribution mean in area i and year t, then:

$${Y}_{t}^{i} \sim Poiss({\mu }_{t}^{i})$$
(2)

and the model is specified by:

$$log({\mu }_{t}^{i})=\alpha +{\alpha }^{i}+{\sum }_{m={m}_{1}}^{{m}_{n}}({\beta }_{Tm}\times {T}_{mt}^{i}+{\beta }_{Rm}\times {R}_{mt}^{i})$$
(3)

where \({T}_{mt}^{i}\) and \({R}_{mt}^{i}\) denote the mth month’s (ranges from m1 to mn) normalised temperature and normalised rainfall predictors in the year t and area i, βTm and βRm denote the regression coefficient of \({T}_{mt}^{i}\) and \({R}_{mt}^{i}\) respectively, α represents the intercept, and αi represents the area-specific random intercept.

The monthly predictors were obtained using monthly climate data that were normalised into the range [0,1] from the collected temperature \(T{\prime} \) and rainfall \(R{\prime} \) data, so that the effects of the predictors on the responses are easier to compare when fitting the model33,48:

$$\begin{array}{lll}{T}_{m} & = & (T{\prime} -min(T{\prime} ))/(max(T{\prime} )-min(T{\prime} ))\\ {R}_{m} & = & (R{\prime} -min(R{\prime} ))/(max(R{\prime} )-min(R{\prime} ))\end{array}$$
(4)

Despite a number of zero counts in the dengue case data for certain years, we did not use zero-inflated models as they assume that excess zeros are generated by a separate process from the count values (e.g. underreporting of infections). We assumed that the zero values represent the true case numbers as CHP conducts strict dengue prevention and control measures in Hong Kong.

Model selection

A two-step procedure was used for the model selection. First, a candidate model was chosen to represent each of the four climate parameter sets (Tmean or Tmin with Rtot or Rmax). The stepwise algorithm based on the Akaike Information Criterion with correction (AICc), also known as the stepAICc, was used to select the fixed effect variables (monthly climate predictors) for each of the candidate models in a forward selection approach (Supplementary Fig. S5). The stepwise algorithm has been commonly used in selecting relevant environmental or climatic determinants to predict disease occurrences8,30. This approach iteratively added a new variable to the model, starting with zero variables, to fulfil the selection criteria (such as smaller AIC or AICc) until the most optimal set of variables was reached. For our case, the AICc was used instead of AIC due to the small number of observations. For the purpose of model selection, all the models were estimated using Maximum Likelihood by Laplace approximation provided by the R package, glmmTMB49, in order to compare models with different fixed effects. Laplace approximation is a suggested method for fitting Poisson mixed effects models50. Secondly, the candidate models with the most optimal set of variables were then compared with each other based on their AICc and Bayesian Information Criterion (BIC). The model with the lowest AICc and BIC (or only AICc) was chosen as the best-fitting model.

Model evaluation

Leave-one-out cross-validation (LOOCV) was used in order to evaluate the best-fitting model’s overall performance and to ensure that the model does not overfit. During each iteration, the observation of the annual number of dengue cases in a particular year and area was excluded while fitting the model using the restricted maximum likelihood method. The number of annual dengue cases of this particular observation was then estimated (predicted) given the coefficients produced by the fitted model in that iteration. In order to measure the performance of the model against the observation data, the Mean Squared Error (MSE) was calculated for both training and validation sets using the following general formulae:

$$\begin{array}{lll}MS{E}_{tr} & = & {\sum }_{j=1}^{17}{\sum }_{i\ne j}{({\widehat{\mu }}_{i}^{(-j)}-{\mu }_{i})}^{2}/(17\cdot (17-1))\end{array}$$
(5)
$$\begin{array}{l}MS{E}_{va}={\sum }_{j=1}^{17}{({\widehat{\mu }}_{j}^{(-j)}-{\mu }_{j})}^{2}/17\end{array}$$
(6)

where j indicates the index of a year to be tested, \({\widehat{\mu }}_{i}^{(-j)}\) indicates the estimated number of cases in year i when the number in year j was removed, and μi is the observed number of cases in year i. When pre-defined areas were considered, the highest value of index j became 51 to represent the total number of observations of annual dengue incidence by year and by area. In addition, leave-one-year-out (leave-three-out) cross-validation (LOYOCV) was also used to evaluate the best-fitting model when the observations of the annual number of dengue cases in all three areas in a particular year were removed and used as the validation set while fitting the model.

Bootstrapped confidence intervals were computed. For each predicted \(\widehat{\mu }\) of a model, we generated 1000 samples of outcomes from a Poisson distribution with the rate parameter equal to the estimated parameter (\(\widehat{\mu }\)) of that fitted model. Based on these 1000 simulated outcomes, we refitted the model 1000 times to obtain the distribution of the parameter’s values. The 95% confidence interval defines a range of values between the 97.5% and 2.5% quantiles. We considered the model to be satisfactory if it managed to successfully predict the two major outbreaks (within confidence intervals) in 2002 and 2018 in the LOOCV process and if it produced relatively low values for MSEva or MSEratio (defined as MSEva/MSEtr). The Normalised Mean Squared Error (NMSE) was also calculated by dividing the MSE of the validation or training set by the range of the observed variable.