Effects of climatic and environmental factors on mosquito population inferred from West Nile virus surveillance in Greece

Mosquito-borne diseases’ impact on human health is among the most prominent of all communicable diseases. With limited pool of tools to contrast these diseases, public health focus remains preventing mosquito-human contacts. Applying a hierarchical spatio-temporal Bayesian model on West Nile virus (WNV) surveillance data from Greece, we aimed to investigate the impact of climatic and environmental factors on Culex mosquitoes’ population. Our spatio-temporal analysis confirmed climatic factors as major drivers of WNV-transmitting-Culex mosquitoes population dynamics, with temperature and long periods of moderate-to-warm climate having the strongest positive effect on mosquito abundance. Conversely, rainfall, high humidity, and wind showed a negative impact. The results suggest the presence of statistically significant differences in the effect of regional and seasonal characteristics, highlighting the complex interplay between climatic, geographical and environmental factors in the dynamics of mosquito populations. This study may represent a relevant tool to inform public health policymakers in planning preventive measures.


Effect of climatic and environmental factors
The fixed effects, i.e. the linear effect of the covariates on the mosquito abundance (in log-scale) are shown in Fig. 3. On the left side are presented the posterior estimates for the coefficients of the covariates, together with the 95% posterior credibility interval (PCI).Note that the covariates are standardized, therefore the magnitude of their effects is comparable between one another.The average monthly temperature has the strongest positive effect on mosquito abundance, with a mean estimate of 0.238 (SD: 0.0053, PCI: 0.228-0.249).Similarly, the consecutive number of grow days (CGD) in the concurrent and previous month have a positive effect, with mean estimates of 0.153 (SD: 0.0037, PCI: 0.146-0.160)and 0.168 (SD: 0.0039, PCI: 0.160-0.175),respectively.Additionally, positive Figure 1.Culex mosquito abundance in logarithmic scale, stratified by trap (columns) and month-year (rows).

Spatial and temporal random effect
We now consider the results of the hierarchical temporal (month and calendar year) and spatial component.The posterior mean for the temporal effect for month is positive in the period from April to September, with increasing values from May to August and decreasing values from August to October.A possible explanation may include differences in larvae development and spraying procedures.As for the temporal effect for calendar year, there is no clear patterns except for a noticeable lower value for the years 2018 and 2021.This suggest an unexpected deviation in mosquito abundance that is not consistent with the effect of the covariates and the spatial effect, and it is likely due to external factors not considered in the model.The combined temporal random effect by month and year (Fig. 3, right panel) depicts a more clear picture.The overall pattern is consistent across most years, with higher positive values observed in the central months of July and August, excluding April and October.In addition, there is a noticeable increase in variability in the months of April and October, likely due to the limited number of observations in those months.The posterior means for April are mostly positive, while October shows the opposite trend with all negative values.This deviation may be attributed to the model's smoothing effect, which prioritizes adhering to the behavior of central months over a more precise estimation in April and October (see Supplementary Figure S1 for the posterior distribution of the spatial and temporal effects).
The posterior mean for the spatial random effect shows a very smooth behaviour (left panel of Fig. 4), with no large deviations localized in specific traps.This might suggest that the local variability at trap level is well captured by the covariates and the temporal effect.On the contrary, there is a clear separation between the Central Macedonia region, with predominantly negative values, and the South-West part of Crete and Western Greece.This disparity could be influenced by proximity to the coast or differences in elevation.Moreover, differences in

Model performance and predictive ability
Our model provides a better fit in the months from June to September, where there is a higher availability of data (Fig. 5).Conversely, modeling the months of April and October proves to be more challenging due to the limited amount of data available, both in terms of calendar year and trap location.The difficulty in modeling the years 2018 and 2021 is also evident and reflects the high variability previously discussed in the temporal effects.www.nature.com/scientificreports/ In order to check the robustness of the model in terms of its predictive ability, we evaluated MAE and meansquared error (MSE, in log scale) and the mean-absolute error (MAE), defined as where y i , ŷi are the class indices (see 20 for the details), both in train and test set in a leave one out procedure.In particular, in the first part of Table 1, we report the errors computed by using one year at a time as test set, and the remaining years as training.Similarly, in the second part of Table 1, we report the errors computed by using one region at a time as test set, and the remaining years as training.Overall, the results shows a convincing robustness of the model, with test errors always below 1.5 for MAE and 1.6 for MSE, except for the years 2018 and 2021 (exceptional years as previously described).As for the regions, the results are comparable but less clear.This may be due to two main reasons: first, the number of observation is disproportionally biased towards Central Macedonia, with an expected increase in the errors when this is used as test set; second, unlike for the years for which the priors are independent, in the spatial dimension is present a distance dependence correlation that might influence the results when the traps of a whole region are removed from the training set.

Discussion
In this study, a hierarchical Bayesian spatio-temporal model was employed on WNV surveillance data from Greece to investigate the impact of climatic and environmental factors on the Culex mosquito population abundance.The potential of this approach was also to explore the possibility of using the gained knowledge to predict mosquitoes abundance.Our results suggested that temperature is a major driver of the mosquito population, which is consistent with previous publications, exploring this association 19,[21][22][23][24][25][26][27] .Paz & Albersheim 21 , observed a positive association between abundance of Culex pipiens and the average temperature with two-week lags.This is indirectly captured in our model by the number of grow days in the previous month, that has the second most important positive association.Similarly, in a study on WNV vectors' abundance, Bisanzio et al. 22 reported a significant positive relation between weekly abundance of Culex pipiens and Culex modestus with the temperature in the week before trapping.This lags might be explained following the impact that temperatures have on mosquito development: Soh & Aik 27 observed a positive association between temperature and Culex larval habitats in Singapore.Similarly, Gardner et al. 24 , specifically on WNV vectors, and Ukubuiwe et al. 28 , on general abundance of Culex quinquefasciatus, reported a faster immature development, as well as faster larval growth and shorter duration of immature phase in warmer rearing water, respectively.Observing the abundance of WNV vectors Culex pipiens, Ruybal et al. 19 reported a monotonic and non-linear inverse relation between adult female development and temperature, as well as a direct relation between larval survival and temperature under 27 • C. In addition, the same authors observed a negative relation between adult female and larval survival, and higher temperature 19 .Similarly, Ciota et al. 17 , in a study on factors driving the general abundance of Culex pipiens, Culex quinquefasciatus, and Culex restuans, observed a positive association between mosquito development and increased temperatures, with the association weakening after 24 • C.These two latter results are especially important, as they allows uncertainty over the role of climate change in the distribution and epidemiology of mosquito-borne diseases.In fact, it is common concern that climate change could increase the distribution areas of arbovirus and their vectors, as well as the prevalence of arboviral diseases.However, the number of CWD and CDD, which serve as proxies for soil moisture, also play a synergistic role in shaping the mosquito population.High soil moisture levels are positively associated with mosquito survival due to the high surface area to volume ratio of mosquitoes, which makes them susceptible to desiccation 29 .Water availability is another important factor that positively affects mosquito abundance.Although rain might prove beneficial, providing natural and artificial water bodies for female mosquitoes to lay their egg, heavy rainfall can be detrimental to the mosquito population, as it can flush away breeding habitats and immature individuals, especially among Culex pipiens 24,30 , as well as reducing daily survival rates among adults 31 .Our study results support these hypotheses, showing a negative association between high levels of rainfall both in the current as well as in the month before trapping, and mosquito abundance.Similar results were found by Soh & Aik 27 , who observed a negative association between rainfall and Culex larval habitats in Singapore.However, there is not yet consensus on this association.Bisanzio et al. 22 reported a positive relation between cumulative rain in the 10 days before trapping and weekly abundance of Culex pipiens.Reisen et al. 32 found both a positive and negative correlation between rainfall and abundance of the population of Culex tarsalis in different regions of California.A possible explanation for the duplicity of this association was provided by Valdez et al. 33 , whom observed in their model that elevated heterogeneity in daily levels of rainfall reduces the dependence of Culex quinquefasciatus abundance on rainy days.Another possible explanation was reported by Carrieri et al. 34 , who observed that, in urban environments, breeding site are mostly related to human environments and, therefore, rainfall does not influence the seasonal trends of mosquito abundance.Furthermore, following Wang et al. 23 , the impact of rainfall is clearly lower than the one of temperature, and-in addition-develops over longer time (these authors suggest 35 days before capture).
Other climatic and environmental conditions have shown to be associated with mosquito abundance in our model: NDWI and NDVI presented a positive association with mosquito population, whereas CWD and CDD in the previous month, and the average wind speed presented a negative association.A huge number of dry days, together with an high amount of rainfall in the previous month does not favour large Culex abundances.This results suggest that non-extreme drought/rainfall conditions are more likely to enhance Culex abundance (due to, e.g., development of small-to-medium-sized artificial water bodies).Just as per temperature and rainfall, similar results have been already presented, e.g., NDVI have been already found to be positively associated with the abundance of Culex modestus 22 , whereas wind speed was associated if not directly with mosquito abundance, with their WNV infection rate 35 .
Concerning its ability to predict the abundance of mosquitoes, our model successfully captures the bellshaped curve of mosquito abundance while maintaining enough flexibility to adapt to the different spatial characteristics.For a large portion of the traps, the model predictions align with the observed abundance.However, traps that display more oscillatory behavior remain challenging to model and it is uncertain whether these fluctuations are due to exceptional weather conditions or unrepresentative trap abundances.For this reason, perhaps a more homogeneous distribution of trap sites would provide help to better uncover the effect of the gradient of temperature, soil moisture, and rainfall on WNV-transmitting Culex mosquitoes abundance.In addition, it is worth noting the quite unique behavior in Crete (Fig. 7), which is characterized by a large number of low values, especially during April and May: unlike other regions that have a similar distribution shape, Crete exhibits a significantly left-skewed distribution.These spatial variability of Crete could be attributed to factors such as elevation or proximity to the sea; and though the spatial component of the model captures this behavior to a certain extent, capturing the very small values of abundance in just one region might require a more complex approach.
Our study has certain limitations.Firstly, the data presents several challenges, including spatial heterogeneity, with two-thirds of the traps located in the region of Central Macedonia, and a significant amount of disproportionate information availability for a huge portion of traps and years.This potential source of selection bias for the observed areas results from the origin of the data that we used.The areas, which were more explored (e.g., more traps for more years), were the ones with higher prevalence of WNV disease.It is important to keep this in mind when interpreting our results: though the model might apply to general Culex pipiens population abundance, we cannot exclude an influence of the distribution of the trapping sites on our observations, and therefore we suggest to always contextualizing the reported estimate to WNV surveillances.On our side, to address these challenges, we define a hierarchical model that is able to capture the spatio-temporal autocorrelation characteristics of the data, possibly due to unmeasured variables (e.g.different availability of hosts due to nesting or emigration of preferred host species, development of stronger defensive behaviors by potential hosts, shift in feeding patterns 36 ).The model also allows appropriately dealing with the huge amount of missing values in the observed mosquito abundances, by borrowing information across both the temporal and the spatial dimension (there is no direct imputation as the model is able to handle missing data by combining the information from the nearby similar traps).In addition, it is paramount to understand the generalizability of the data: as the model is based on the mosquito abundance in traps, inevitably the amount of mosquitoes that are predicted in other areas represent the abundance that would be found in a trap, if one was set at the specific location, rather than the overall mosquito abundance.A limitation to this approach might be that our results were not adjusted by trap type.Following 37 , different mosquito traps perform better in different context, also related to climatic and environmental factors.Hence, the type of traps used for the collection of mosquito might influence some of the results we observed.
We considered that a more thorough understanding of real-world setting mosquito population dynamic together with an improved ability to predict mosquitoes' population abundance would be relevant both to support policy makers in their evaluation of optimal planning of preventive interventions, as well as in enhancing preparedness and anticipation of the potentially changing risk of virus transmission resulting from climate variability.We observed that the analysis of the impact of climatic and environmental factors on mosquito population abundance could be challenging due to the uneven distribution of trap locations and the high variability in captured mosquitoes.In fact, despite representing a fairly complete and extensive cross-section of the mosquito population in Greece over a decade, the considered data remains very complex to study due to the high amount of missing values.A better understanding of these key factors is crucial for improving statistical models for population dynamics and could help in the development of early warning systems to reduce human exposure through vector control.
Our results provide strong evidence that climatic factors play a significant role in shaping the dynamics of the mosquito population; however, it remains challenging to fully understand the underlying processes.This complexity is compounded perhaps by the intrinsic bipartite life cycle of mosquitoes, where larvae develop in aquatic habitats and adults are terrestrial 38 , and which could benefit from further research.In addition, our results could support future studies aiming at disentangling the potential relationship between mosquito abundance and incidence of WNV disease.Nevertheless, on a general level, the results evaluated using the mean absolute error are similar to previous results on modelling mosquito abundance 20 .The approach proposed in this study shows promising results and provides insights into the role of climatic and environmental factors on mosquito population dynamics.

Model
We model the mosquito abundance y(s, j, t) observed at site s, year j and month t.We assume to have a set of traps {s i } (i=1,...,I) .For each i, the set of indices k ∈ A i , with A i comprising the years for which we have observa- tions for the trap i.Moreover, the set of indices j ∈ M i k , with M i k comprising the months of year k for which we have observations for the trap i.The model assumes the mosquito abundance to be a realization of a Poisson process, (with expected number of cases E i taken to be the average mosquito abundance by trap), with a mean comprising a spatio-temporal component and a set of covariates.The model is estimated using Integrated Nested Laplace approximations 40,41 , using the R-software package INLA (version 22.04.16,www.r-inla.org) with default priors for the coefficients.This Bayesian approach provides an intuitive and explicit estimation of model uncertainty, accounting for spatially and temporally correlated errors.Similar Bayesian spatio-temporal processes have been already used in the epidemiological literature to model, for instance, HIV mortality, thanks to their flexibility 42,43 .The hierarchical spatio-temporal process is the following: where β is the vector of coefficients for the fixed effects.The spatial effect f is modelled via a Gaussian random field with a Matérn covariance function, using an approximated stochastic partial differential equations approach 40,41 .The covariance between measurements taken at two trap sites (s i , s i ′ ) i� =i ′ , separated by a distance d = ||s i − s i ′ || , is given by where Ŵ is the gamma function, K ν is the modified Bessel function of the second kind, and ρ and ν are positive parameters of the covariance.The estimated posterior values for (θ 1 , θ 2 ) provided in the Results Section cor- respond to the transformations The range of the spatial process is controlled by parameter ρ .Higher values of the parameter induce a faster decay in the correlation with distance, which imply a small range spatial process.On the contrary, lower values will indicate a spatial process with a large range.The parameter ν controls smoothness of the spatial process.Finally, the parameter σ 2 a general scale parameter.The first temporal component γ (for months) consists of a first order difference prior, so that given a grid {κ 0 , . . ., κ K } , we have Here the we use a regular grid that corresponds to the months considered.This prior is the Bayesian equivalent of spline smoothing, so that large differences on the effect of two consecutive months are penalized.The second temporal component ξ (for years) consists of a Gaussian prior with independent components, since there is no clear evidence on the dependence among years.Moreover, we use non-informative priors for the coefficients of the linear part.All posterior expectations and probabilities were estimated using 1000 samples.
The model and the predictions are obtained using a high-resolution Delaunay triangulation mesh of the study area (see Supplementary Figure S3), constructed starting from the official boundaries obtained from Eurostat (https:// ec.europa.eu/ euros tat/ web/ gisco/ geoda ta).The mesh is obtained via the function inla.mesh.2d of the package INLA, using parameters max.edge= (1, 5) • 0.95 , cutoff= 0.06 , min.angle=30.

Figure 2 .
Figure2.Distribution of traps over the Greece territory and median culex abundance highlighted with color (darker shades corresponds to higher abundances) and size (larger areas corresponds to higher abundances).The map has been created using the package ggplot2 within the R software (https:// www.r-proje ct.org/, version 4.3.1),with official boundaries obtained from Eurostat (https:// ec.europa.eu/ euros tat/ web/ gisco/ geoda ta).

Figure 3 .
Figure 3. Left: Estimated posterior mean for the fixed effect and corresponding 95% Bayesian credible intervals for potential predictors of mosquitoes abundance, reported from the most positive association to the most negative.Considered factors were: average temperature (TAVG), consecutive number of growth days in previous month (CGD_l1), consecutive number of growth days in current month (CGD), vegetation abundance index (NDVI), water abundance index (NDWI), consecutive number of wet days (CWD), consecutive number of dry days (CDD), average rainfall (RAIN), average wind speed (WIND), consecutive number of wet days in previous month (CWD_l1), average rainfall in previous month (RAIN_l1), and consecutive number of dry days in previous month (CDD_l1).Rigth: Combined posterior mean and standard deviation (SD) for the random effects for month and year.Different mean values are reported with different colours, whereas different SDs are reported with different sizes of the squares.

Figure 4 .
Figure 4. Left: Estimated posterior mean.Right: Estimated posterior standard deviation for the spatial random effect.

Figure 6 .
Figure 6.Observed (blue/squares) vs predicted (red/triangles) values for the year 2021 (model estimated using data from 2011-2020), stratified by trap, while the number at the top-left corresponds to the MSE.The traps are ordered as in Fig. 1.

Table 1 .
Observed vs fitted logarithmic mosquitoes abundance values, stratified by month and year.The dashed lines correspond to the ±1 interval, the red (triangle) dots represent the amount of pair that fall within these limits, while the number at the bottom-right corresponds to the proportion of observations that fall in the interval.Training and prediction errors (Mean Absolute Error and Mean Squared Error) for different years (for each calendar year, the model is estimated using all the years except the considered one, used for prediction) and different regions.