Multiple invasions, Wolbachia and human-aided transport drive the genetic variability of Aedes albopictus in the Iberian Peninsula

The Asian tiger mosquito, Aedes albopictus, is one of the most invasive species in the world. Native to the tropical forests of Southeast Asia, over the past 30 years it has rapidly spread throughout tropical and temperate regions of the world. Its dramatic expansion has resulted in public health concerns as a consequence of its vector competence for at least 16 viruses. Previous studies showed that Ae. albopictus spread has been facilitated by human-mediated transportation, but much remains unknown about how this has affected its genetic attributes. Here we examined the factors that contributed to shaping the current genetic constitution of Ae. albopictus in the Iberian Peninsula, where the species was first found in 2004, by combining population genetics and Bayesian modelling. We found that both mitochondrial and nuclear DNA markers showed a lack of genetic structure and the presence of worldwide dominant haplotypes, suggesting regular introductions from abroad. Mitochondrial DNA showed little genetic diversity compared to nuclear DNA, likely explained by infection with maternally transmitted bacteria of the genus Wolbachia. Multilevel models revealed that greater mosquito fluxes (estimated from commuting patterns and tiger mosquito population distribution) and spatial proximity between sampling sites were associated with lower nuclear genetic distance, suggesting that rapid short- and medium-distance dispersal is facilitated by humans through vehicular traffic. This study highlights the significant role of human transportation in shaping the genetic attributes of Ae. albopictus and promoting regional gene flow, and underscores the need for a territorially integrated surveillance across scales of this disease-carrying mosquito.

www.nature.com/scientificreports/ epidemiology of several mosquito-borne diseases since the tiger mosquito is a competent laboratory vector of at least 16 viruses 5,6 . While Ae. aegypti is considered as the principal vector of dengue and Zika, Ae. albopictus is a less efficient epidemic vector (with local exceptions in some cases), having developed an enhanced transmission for chikungunya facilitated by genetic adaptation (E1-A226V substitution) of the ECSA strain [6][7][8][9] . Ae. albopictus has a role on sporadic autochthonous disease transmission of arboviruses (as chikungunya, dengue and Zika) in Europe, and therefore its surveillance and control are considered a regional priority 10 . Invasive alien species have historically been spread by humans. However, advances in transportation logistics resulting in higher air traffic and sea-born trading are driving a more rapid dispersal of non-indigenous species in the world, including many vectors of human diseases 11 . Ae. albopictus is one of top invasive alien species in the world, being considered, together with Ae. aegypti, the most costly invasive species 12,13 . Its expansion in Europe coincides with the wave of introductions of invasive species in the continent, which started almost 40 years ago as a result of globalization 14 . Considering its limited flight range (less than 200 m/day) 15 , a crucial element for its success is the longevity of its desiccation-resistant eggs 16 , which can be passively transported by humans through commercial shipping of used tires and aquatic plants 3 and ground vehicles 10,17 . In Spain, Ae. albopictus was first detected in Sant Cugat del Vallès, Catalonia, in 2004 18 as a result of the nuisances associated with its aggressive anthropophilic behaviour 19 . Almost twenty years later, this species is well established in the Mediterranean coast of Spain and it is now colonizing inner territories of the Iberian Peninsula 17,20 .
A biological invasion is a three-step process, which involves: initial dispersal, establishment of self-sustaining populations and spread to neighbouring habitats 21,22 . It is, however, during the initial dispersal when proactive management efforts can be more cost-effective for preventing the establishment of invasive species 22,23 . For instance, successful surveillance in New Zealand in the 1990s intercepted the entrance of Ae. albopictus 24 , which has not been reported so far in this country. On the other hand, once this has become established, it is highly difficult to eradicate 16 . In this respect, studying Ae. albopictus population genetic structure and identifying its dispersal routes, its main drivers and scales, is crucial to understand the tiger mosquito's spread and design integrated surveillance and preparedness strategies 25,26 . Previous population genetic studies pointed to a worldwide chaotic dispersion pattern in Ae. albopictus, with Europe harbouring several distinct genotypes that have been linked to multiple independent introductions (e.g. 25,27,28 ). Furthermore, in Europe human transportation networks have been shown to have facilitated the spread of this mosquito and conditioned its genetic and demographic patterns [29][30][31][32] .
As different marker types can tell different stories 33,34 , it is crucial to analyse both mitochondrial (mtDNA) and nuclear (nDNA) DNA data to understand what major factors contributed to shaping the genetic attributes of invasive populations. In this regard, the presence of maternally-transmitted endosymbiotic bacteria may be relevant, as they can distort mtDNA phylogenies and reduce mtDNA diversity, a process which may have little to no effects on nDNA variation 35 . If a maternally-inherited symbiont confers a selective advantage to its hosts, the mtDNA variants originally associated with the symbiont can rapidly spread through the host population and go to fixation, thus resulting in a great increase in the frequency of a single or few mtDNA haplotypes 36 . Such symbionts are widespread in many arthropod species, being Wolbachia the most common maternallyinherited symbiont on the planet 37,38 . For this reason, incorporation of nDNA data is essential to corroborate results derived from mtDNA.
It is now widely accepted that a progress in the study of the ecology of Ae. albopictus requires an interdisciplinary approach 11,39 . For instance, population genetics and multilevel modelling analyses can be crucial to shed light on Ae. albopictus patterns of dispersal. Besides, it has been shown that genetic diversity in this species has been related to a higher adaptive potential 29 and a diversification of its interactions with the pathogens it carries 39 . Understanding movement patterns in its populations in endemic areas is crucial to disentangle the dynamics of disease transmission between vectors and humans 15 . In this study we assess geographic patterns of mtDNA and nDNA variation in Ae. albopictus and use this information to investigate the species' dispersal routes across Spain. We also compare features of mtDNA and nDNA variation and assess whether they are incongruent, and test for Wolbachia presence to ascertain whether mtDNA diversity is affected by this maternally-transmitted parasite.

Results
Genetic variation and structure, and temporal trends of genetic diversity. The nuclear DNA alignment (second internal transcribed spacer of ribosomal DNA-ITS2) included 470 sequences of 286 bp (424 sequences from Spain + 30 from France + 16 from Greece), while for the mitochondrial DNA alignment (cytochrome c oxidase gene subunit 1-COI) we obtained 471 sequences of 513 bp (424 from Spain + 40 from France + 7 from Greece) (Fig. 1, Table 1). Haplotype and nucleotide diversity estimates calculated at the province level ranged from 0.490 to 1 and from 0.004 to 0.064 for ITS2, respectively, and from 0 to 0.350 and from 0 to 0.0014 for COI (Table 1).
In the Iberian Peninsula, overall ITS2 genetic diversity was between three (haplotype diversity) and 10 (nucleotide diversity) times higher than for COI. Indeed, overall haplotype and nucleotide diversities were 0.7084 ± 0.025 and 0.00566 ± 0.00048 for ITS2, respectively, and 0.2196 ± 0.027 and 0.00054 ± 0.00008 for COI. We found 78 haplotypes defined by 83 polymorphic sites (21 parsimony informative) for ITS2, whereas only 18 haplotypes defined by 20 polymorphic sites (5 parsimony informative) were detected in the case of COI (Fig. 2).
No apparent association between haplotypes and geography was detected. Both ITS2 and COI haplotype networks showed the presence of worldwide dominant haplotypes and the absence of a clear genetic structure along the study area (Fig. 2). Specifically, in the ITS2 network, the most frequent and centrally placed haplotypes were observed in localities spanning the entire extension of the study area and were shared with other areas of the world. However, 89.7% of the haplotypes found were newly described (70/78), the majority being low-frequency haplotypes. In the case of COI, we obtained a much simpler star-like network defined by one  Table 1. Basic genetic statistics of Ae. albopictus sampled provinces and countries for mtDNA (COI) and nuclear DNA (ITS2). N, sample size; S, number of polymorphic sites; HD, haplotype diversity; π, nucleotide diversity. Note that each province contains multiple sampling sites, as shown in Fig. 1.

Province/Country
Year of first detection Sampling period N COI S COI HD COI (mean ± SE) π COI (mean ± SE) N ITS2 S ITS2 HD ITS2 (mean ± SE) π ITS2 (mean ± SE) www.nature.com/scientificreports/ main central haplotype connected to many low-frequency haplotypes with little genetic differentiation from the dominant sequence. Haplotypes were separated by no more than three nucleotide substitutions, although 83.3% of the haplotypes found were unique (15/18). At the province level, we found a significant positive correlation between genetic diversity and colonisation time for COI, meaning that the provinces that were first colonised by Ae. albopictus bear higher mitochondrial genetic diversity (R 2 = 0.497, p = 0.007 for haplotype diversity, R 2 = 0.597, p = 0.002 for nucleotide diversity; Fig. 3). However, a deeper look showed that COI's variation in nucleotide diversity over time was almost non-existent (π range: 0-0.0014), likely indicating a statistically significant but biologically irrelevant pattern. On the contrary, we found a significant negative correlation between ITS2 nucleotide diversity and colonisation time (R 2 = 0.376, p = 0.022), which displayed higher variation (π range: 0-0.022); the same relationship with haplotype diversity was not significant (Fig. 3).
Whereas these province-level correlations may be limited, to some extent, by the different sample sizes in each province, we also analysed the genetic distances among the individual sampled mosquitoes. In agreement with the haplotype network, multidimensional scaling (MDS) analysis revealed no apparent evidence of distinct genetic groups or geographic consistency within the nuclear dataset, although it did show a certain level of drift over time (Fig. 4). Most of the samples were grouped into a unique cluster in the centre of the MDS, with the exception of a few samples collected in 2014 and 2015 at the geographic extremes of the study area that laid slightly outside the main cluster (i.e. samples from the southernmost provinces of Málaga, Granada and Murcia, from the northernmost provinces of Gipuzkoa and Girona, and from the Balearic Islands in the east). The twodimensional MDS solution accounted for 30% of the variance in the data (goodness of fit = 0.302).
Again using the individual sampled mosquitoes, a Mantel test of correlation revealed a significant positive correlation between ITS2 pairwise genetic distances and spatial distance (r = 0.163, p = 0.001), and negative correlations with potential tiger mosquito flux (r = − 0.022, p = 0.036) and spatial proximity (r = − 0.042, p = 0.001). The Mantel test did not find evidence of correlation between ITS2 genetic distance and temporal distance. Modelling genetic distance from spatial distance, temporal distance and mosquito flux. We use Bayesian multiple-membership multilevel zero-inflated beta regressions to model ITS2 genetic distance as a function of mosquito flux, spatial proximity, spatial distance, and temporal distance (see "Methods"). These models rely on each pair of sampled mosquitoes as the unit of analysis and they suggest that ITS2 genetic distance is better explained by the combination of potential tiger mosquito flux, spatial proximity, spatial distance and temporal distance than by any of these variables on their own or in smaller combinations. This is most clearly seen in our leave-one-out cross validation (LOO) comparison, in which the expected log pointwise predictive density for model 6 (M6) is 138 points lower than the next best model, with a standard error of only 20 (Supplementary Table S1). It is also seen in the Bayesian R-squared comparison, in which M6 also has the highest value, although in this case the differences are small compared to the standard errors (complete model comparisons are presented in Supplementary Fig. S1 and Supplementary Table S2). The slope estimates for the main effects in the equation for the mean (μ) of the beta distribution are shown in Fig. 5A and Supplementary Table S2, and should be read in conjunction with the slope estimates for the main effects in the equation for the probability of zeros (π), shown in Fig. 5B and Supplementary Table S2. The coefficients on the spatial proximity and potential tiger mosquito flux variables are negative in all models of μ and positive in all models of π, even when both variables are included together (model 5 -M5-and M6), indicating that greater spatial proximity and mosquito fluxes are associated with lower ITS2 genetic distance between sampled mosquitoes, with a higher probability of zero distances. In contrast, the coefficient on the temporal distance variable is positive in all models of μ and negative in all models of π (albeit with its posterior distribution overlapping zero in M5 and M6), indicating that greater passage of time between samples is associated with greater ITS2 genetic distance and with lower probability of zero distances ( Supplementary Figs. S2, S3). In M6 we find that greater spatial distance is also associated with lower probabilities (π) of zero ITS2 genetic distance (Fig. 5B). Although greater spatial distance is also associated with lower ITS2 genetic distance in the model of μ (Fig. 5A), the combined effect of the two components is an overall positive relationship between spatial distance and ITS2 genetic distance (Fig. 6A). Figure 6A shows the effects predicted by M6 of changes in inter-point distance (reflected simultaneously in the spatial distance and spatial proximity variables) on ITS2 genetic distance. The range of inter-point distance values used for these predictions is the same as that observed in the data. Potential mosquito flux is held at its  Table 1  www.nature.com/scientificreports/ median, the sampling years are set to 2011 and 2015 (to give the widest range observed) and the sample pair (for purposes of the random intercepts) is arbitrarily selected. The overall pattern is non-linear, reflecting the combination of the spatial proximity and spatial distance variables. There is an initial steep increase in predicted ITS2 genetic distance for samples taken within a few meters of one another, which can be seen most clearly in the inset plot of Fig. 6A. Beyond these highly proximate samples, predicted ITS2 genetic distance then rises more gradually, driven by the spatial distance variable. Figure 6B shows the effects predicted by M6 of changes in potential mosquito flux on ITS2 genetic distance. The range of fluxes used for these predictions is the same as that observed in the data. Inter-point distance is held at its median and used for calculating the spatial proximity and spatial distance variables, the sampling years are set to 2011 and 2015 (to give the widest range observed) and the sample pair (for purposes of the random intercepts) is arbitrarily selected. Here we see a steep drop in predicted ITS2 genetic distance as potential mosquito flux increases from 0 to several hundred per day, with effect of increased mosquito fluxes becoming weaker at higher values, reflecting the log-linear relationship used in the model. Although in this case the pattern is overwhelmed by the uncertainty of the predictions (the wide posterior predictive distributions shown in the lighter  Figure 7 shows the combined effects of inter-point distance and potential tiger mosquito flux changing together. We see, first, how the lowest values of inter-point distance (bottom edge of plot) correspond with the lowest predicted ITS2 genetic distances while the lowest values of potential tiger mosquito flux (left edge of plot) correspond with the highest. When both variables are at their lowest (bottom left corner), inter-point distance is determinative: predicted ITS2 genetic distance is low here regardless of potential tiger mosquito flux. Beyond these lowest values, we see how predicted tiger mosquito flux acts to hold predicted ITS2 genetic distance down even as distance increases. Although predicted ITS2 genetic distances never reach their lowest values at these longer distances, the potential tiger mosquito fluxes of around 26,000 mosquitoes per day (this is the potential flux, for example, between Barcelona municipality and El Prat de Llobregat) keep the predicted ITS2 genetic distance within the range of 0.0035-0.0036, even at distances of 100 km.
Note, finally, that although the ITS2 genetic distances and effect sizes shown in Figs. 5, 6 and 7 are small overall, this should be interpreted in light of the distribution of observed genetics distances, which ranged from 0 to only 0.07, with a standard deviation of 0.01.

Discussion
Aedes albopictus has spread worldwide and particularly in Europe at a fast pace, making it one of the 100 most invasive species on Earth 12 . In Spain, after Ae. albopictus was first found near Barcelona in 2004 18 , a continuous spread along the Mediterranean coast was observed 17 . All Mediterranean provinces are currently colonised, along with the Basque Country in the northern coast and several inland territories 20 , where the species continues to expand. In light of this, understanding Ae. albopictus dispersal routes across scales is crucial for planning effective early warning surveillance in non-invaded areas and implementing surveillance and control activities in the areas already colonised. The same information can also be valuable in predicting the transmission risk of pathogens by this vector. Knowledge of the effect of vehicles and transport infrastructures in the genetic structure of vector populations and their overall spreading capacity can comprehensively show our potential as a natural selective force and the existing contradictions between globalization and our efforts to combat biological invasions and pests 40 .
Our models indicate that at very small spatial scales (i.e. several meters) the genetic variation measured by ITS2 is sharply reduced, likely representing seasonal mosquito pools that come from a main source. Beyond Figure 5. Estimated relationship between ITS2 pairwise genetic distance, spatial distance (sp_dist; geodesic distance between sample locations in meters), spatial proximity (sp_prox; measured as the negative exponential of distance), potential tiger mosquito flux (mosq_flux; estimated from commuting patterns and tiger mosquito population distribution) and temporal distance (yr_diff; measured as absolute difference between years in which samples were taken) on the beta mean (μ) parameter (A) and on the zeros (B) in the zero-inflated beta regression models. Parameters are estimated from a set of Bayesian multilevel zero-inflated Beta regressions with multiple-membership random intercepts for the samples and sampling years represented in each pair. www.nature.com/scientificreports/ these scales, genetic variability steadily increases with spatial distance, as a clear positive correlation between dispersal distance and genetic variation appears. This is reflected in the combined effects of the (small scale) spatial proximity variable and the linear spatial distance variable. More interestingly, our models suggest that human transportation has a role in shaping Ae. albopictus nuclear genetic structure by means of passive dispersal of adult tiger mosquitoes in cars: there is a clear negative relationship between mosquito flux and genetic variation. Previous studies have also highlighted that, although Ae. albopictus has low natural dispersal capabilities, human-aided transport (especially cars) has probably facilitated significantly the tiger mosquito's movement and invasion process 29,30,32,41 . Our findings are consistent with these but go a step further by suggesting that the "hitchhiking" of Ae. albopictus in cars observed in Eritja, Palmer, Roiz, Sanpera-Calbet and Bartumeus 30 actually helps to explain observed patterns of population genetics. While genetic variability increases with spatial distance, car transport can strongly reduce this effect. The high-resolution of our dataset makes it possible to show the significant role of human transportation in shaping the genetic constitution of Ae. albopictus and promoting regional gene flow. Mosquito movement can be affected by human activities like commuting and human-made structures like roads, which combined act as bridges for dispersal by favouring gene flow and promoting genetic mixing 32 . At a much broader spatial scale, we find a general lack of genetic structure and geographic consistency among haplotypes across the large expanse of the Iberian Peninsula, with numerous haplotypes shared among several distant areas (e.g. the most abundant ITS2 haplotypes found in Spain were also detected in several other European and Asian countries). This suggests that human-mediated large-scale dispersal of Ae. albopictus is also common, and point to a pattern of regular introductions of the species from abroad, through e.g. transportation of used tires and aquatic plants, which allows for survival and establishment at long distances of whole batches of eggs. Previous genetic studies of Ae. albopictus have also showed little or no genetic structure according to geography, both in the native and introduced range of the species (reviewed in 39 ). Such a genetic feature is likely the consequence of the high level of human-mediated spread from several genetically distinct source populations followed by global dispersal, and it is concordant to what has been found in other wide-ranging invasive insect species, especially those that are closely associated with humans, e.g. the German and American cockroaches (Blattella germanica and Periplaneta americana, respectively) 42,43 and the longhorn crazy ant Paratrechina longicornis 44 . Taken together, our results highlight the role of human activities in promoting unintentional mid-and long-distance dispersal and thus in shaping the current genetic structure of insect species commonly found in human-modified landscapes.
Our study has some limitations that should be considered. First, the modelled genetic diversity range, which was obtained from ITS2 pairwise genetic distances, is low. This likely has to do with the relatively low variability of the analysed genetic marker. Indeed, nuclear genes, as well as mitochondrial fragments, are expected to be less variable and bear lower resolving power than highly mutating markers, e.g. microsatellites. Second, although we rely on a relatively large number of sampling sites, additional sampling across larger areas of Spain could provide www.nature.com/scientificreports/ better balance and a wider range of values for the models. Third, intragenomic heterogeneity (i.e. presence of multiple haplotypes within the same individual) of ITS2 has been reported in several mosquito species including the genus Aedes 45,46 , and this can pose a challenge in DNA sequencing and analysis. In this study, all individuals presenting intragenomic heterogeneity were thus excluded from analysis. As for future directions, although ITS2 has proved to be a useful marker for studies on the spread of Ae. albopictus 47 , the reassessment of the species' genetic diversity and population structure through the use of molecular markers with greater variability and/or potential, such as microsatellites and single nucleotide polymorphisms (SNPs), could provide more detailed and deeper insights into the description of fine-scale dispersal patterns, gene flow and introduction routes. Global patterns of mtDNA and nuclear variation were highly discordant, with mtDNA showing little genetic diversity and a single star-like haplotype network. This is in agreement with previous studies showing Ae. albopictus levels of nuclear variation within the range of most insects, but extremely low mtDNA variation both within and among populations [48][49][50][51] . Interestingly, we found a high infection rate (79%) of Wolbachia in the studied Ae. albopictus samples. Wolbachia is a genus of maternally-inherited endosymbiotic bacteria that is known to induce male killing, feminization, parthenogenesis and cytoplasmic incompatibility, which facilitate its spread within the arthropod population 52 . Wolbachia is capable of inducing selective sweeps in mtDNA, i.e. fixation of a single or few mtDNA haplotypes that may become widespread in the host population through cytoplasmic hitchhiking driven by Wolbachia invasions 36 . Selective sweeps on mtDNA have been shown to not only reduce haplotype diversity producing a characteristic single star-like network, but also to cause the remaining set of haplotypes to deviate from neutrality 35 . Within Culicidae, the natural presence of Wolbachia has been documented in more than 30 species (e.g. [53][54][55][56], with Ae. albopictus harbouring significantly lower mtDNA diversity than the uninfected species 55 . In light of this, our results point to Wolbachia as causative agent for the lack of mitochondrial www.nature.com/scientificreports/ polymorphism here recovered, as suggested for other insect species, e.g. Acraea butterflies 57 and the cherry fruit fly Rhagoletis cerasi 58 . Nevertheless, it has to be noted that Wolbachia can show seasonal fluctuations in infection rates 59 , which were not addressed in this study. Hence, the infection rate reported here may not capture a representative overview of Wolbachia infection. Moreover, the low variability in mtDNA in the introduced ranges of Ae. albopictus could also be caused by demographic processes, such as genetic drift or population bottleneck during rapid colonization 60 . However, demographic processes cause changes in variation for both mitochondrial and nuclear markers, even though mtDNA is expected to show a stronger response 61 . Furthermore, the lack of mtDNA variation was also found in the native range of Ae. albopictus ( 48 , but see 62 ), strengthening our hypothesis of a Wolbachia-induced selective sweep. Alternatively, we cannot rule out that the low COI diversity here detected may be at least partially due to the length of the analysed fragment, as higher diversity has been observed when targeting larger COI fragments (> 1300 bp) 63,64 . Further research is needed to test this hypothesis. Mitochondrial DNA has been extensively used to shed light on the geographic origin of invasive Ae. albopictus populations and arthropods in general 49,50,65 , and disentangle their phylogeographic history 62,66 . Nevertheless, there is broad recognition that mtDNA can commonly be under selection, thus challenging the assumption of neutrality postulated by several population, phylogeographic and phylogenetic studies 33,67,68 . Selection can arise due to e.g. mito-nuclear co-evolution, adaptation of mtDNA to different climatic/environmental conditions, and selective sweeps of beneficial mtDNA haplotypes 68 . Mitochondrial DNA is a single, linked molecule with low or no recombination, meaning that selective processes such as selective sweeps can have a profound impact on the apparent rate of genetic drift 33 . Therefore, we suggest caution should be used in drawing conclusions from mtDNA alone and, whenever possible, aim for a multilocus approach to achieve a correct understanding of the genetic population structure and history of the study species.

Methods
Sample collection and DNA extraction. Ae. albopictus samples were collected during the period 2011-2015 at 140 locations encompassing most of the current species distribution in Spain (13 provinces mostly located along the Mediterranean coast; Fig. 1, Table 1). Additional samples were collected from two areas of France (Nice N 43° 38′ 32′′ E 7° 5′ 27′′ and Montpellier N 43° 40′ 39'' E 4° 2′ 35′′) and two regions of Greece (the area of Pylaia, in Salonica N 40° 27′ 22′′ E 23° 13′ 20′′ and Sykia N 40° 2′ 19'' E 23°56′ 22′′), which were used as complementary data from these Mediterranean areas (Table 1). Sample collection was performed using different methods according to the life stage (adults, larvae or eggs) (see Supplementary Methods for details). All samples were stored in absolute ethanol and kept at -20 °C until genetic analyses.
DNA was extracted from the whole bodies of mosquitoes (adults or larvae) in a final volume of 250 µL using the HotShot protocol 69 . Ae. albopictus nuclear and mitochondrial gene sequencing. We amplified two gene regions, including one nuclear ribosomal gene (ITS2) and one mitochondrial fragment (COI). The following primers were used for amplification and sequencing: for ITS2, primers ITS-CP-P1A (5′-GTG GAT CCT GTG AAC TGC AGG ACA CATG-3′) and ITS-CP-P1B (5′-GTG TCG ACA TGC TTA AAT TTA GGG GGTA-3′) 70 , and for COI, primers LCOI490 (5′-GGT CAA CAA ATC ATA AAG ATA TTG G-3′) and HCO2198 (5′-TAA ACT TCA GGG TGA CCA AAA AAT CA-3′) 71  Wolbachia screening. To test whether mtDNA variability could be affected by the presence of endosymbiotic bacteria of the genus Wolbachia, we analysed 62 randomly selected samples (13.2% of the overall samples). Individuals were selected aiming to cover all sampling years and all studied provinces. Two molecular markers were used for detecting Wolbachia infection, namely wsp and 16S rDNA. Primers used for amplification and sequencing were: for the wsp marker, primers 81F (5′-TGG TCC AAT AAG TGA TGA AGA-3′) and 691R (5′-AAA AAT TAA ACG CTA CTC CA-3′) 75,76 , while for 16S, primers 16SF (5′-CGG GGG AAA AAT TTA TTG CT-3′) and 16SR (5′-AGC TGT AAT ACA GAA AGT AAA-3′) 77,78 . Amplification conditions followed Wiwatanaratanabutr 53 in the case of wsp, and Heddi, Grenier, Khatchadourian, Charles and Nardon 78 in the case of 16S. PCR products were visualized on a 1% agarose gel. To validate the results, two PCR replicates were run for each sample (following Carvajal, Hashimoto, Harnandika, Amalin and Watanabe 54 ). A third replicate was run for samples that showed incongruent results based on the two prior replicates. Wolbachia infection was confirmed by two successful amplifications of both molecular markers. In light of the high infection rate detected in the selected samples (see Results), we decided to exclude COI from population structure and multilevel modelling analyses (see below), as mtDNA diversity was likely affected by the presence of Wolbachia.
Genetic diversity and structure. Genetic diversity indices including number of haplotypes, number of polymorphic sites, haplotype diversity and nucleotide diversity for mtDNA and nuclear DNA were estimated for the whole dataset and for each analysed province using DNASP 6.11.01 79 . To estimate gene genealogies, we used HAPLOVIEWER, which turns trees built from traditional phylogenetic methods into haplotype genealogies 80 . We estimated the phylogeny using a maximum-likelihood approach as implemented in RAxML 7.7.1 81  www.nature.com/scientificreports/ GTR CAT model rate of heterogeneity and no invariant sites for COI, and a gamma model rate of heterogeneity and invariant sites (GTR+G+I) for ITS2. The most appropriate model of nucleotide evolution was selected using jModelTest 2.1.3 82 under the Akaike information criterion (AIC). Input data were COI or ITS2 sequences from each individual, subsequently collapsed into haplotypes. The best tree was selected for network construction in HAPLOVIEWER. Population structure of nuclear DNA was characterized using classical multidimensional scaling (MDS), also known as principal coordinates analysis 83 . Pairwise distances between ITS2 sequences were calculated in MEGA by estimation of evolutionary divergence over sequence pairs, using the Kimura 2-parameter substitution model. MDS analysis was then performed with the pairwise genetic distance matrix, using the cmdscale function of the stats package in R 4.1.2 84 .

Construction of covariates.
Our analysis of the drivers of genetic distance focused on four constructed covariates: spatial distance, spatial proximity, temporal distance, and potential tiger mosquito flux. We constructed the spatial distance variable as the geodesic distance between the locations of each sampled mosquito (inter-point distance) in meters, calculated using the Vincenty method on the WGS84 ellipsoid as implemented in the geosphere package for R 85 . We constructed spatial proximity as the negative exponential of spatial distance in km, represented as e -d , where e is Euler's constant and d is spatial distance in km. This approach gives high values for spatial proximity (0.82-1) within the 200 m buffer often taken as the tiger mosquito's maximum flying distance, with values then quickly dropping and nearing zero by 5 km. The combination of these spatial distance and spatial proximity variables together is intended to capture the effects of inter-point distance at different scales. We constructed the temporal distance variable as the absolute number of years elapsed between the sampling times when each member of the sample pair was captured, rounded up to the nearest year (another way of thinking about this is as the number of mosquito seasons between each capture time). We constructed potential tiger mosquito flux as the potential daily bidirectional gross number of Ae. albopictus moving between each pair of municipalities. We estimated this by combining municipality-level Ae. albopictus risk estimates from the Mosquito Alert citizen science platform 86  Modelling the influence of spatial proximity, mosquito flux, and temporal distance on genetic distance. We explored drivers of nuclear genetic structure by carrying out simple Mantel tests of correlations 87 between ITS2 pairwise genetic distance and potential tiger mosquito flux, spatial distance, spatial proximity, and temporal distance, using pairs of individual mosquitoes as the unit of analysis. We implemented all tests in the ecodist package for R 88 , using 1000 permutations with bootstrap confidence limits estimated using 500 iterations.
We then used Bayesian multiple membership multilevel regressions to model ITS2 pairwise genetic distances between sampled mosquito pairs as a function of potential tiger mosquito flux between sampling sites, spatial distance and spatial proximity between sampling sites, and temporal distance.
We relied on zero-inflated beta regressions 89 , in which the distribution of the stochastic component is a mixture of a beta distribution and a degenerate distribution in 0. Following Figueroa-Zúñiga, Arellano-Valle and Ferrari 90 and Branscum, Johnson and Thurmond 91 , we used the beta distribution for ITS2 genetic distance values in (0, 1) because it is a highly flexible distribution that is defined on that interval. As these authors have done, we parameterized the beta distribution in terms of a mean (μ) and a parameter (ϕ) that captures precision. The probability density function of a variable (y) in this parameterization is: Since the ITS2 genetic distances in our dataset include zeros, for which the beta distribution is undefined, we used a zero-inflation component in our models. In addition to making it possible to include the zeros in the analysis (without adding artificial noise to them), this approach also has the advantage of treating them separately, which should minimize any problems associated with the inadvertent sampling of siblings (see Supplementary Methods). As explained in Ospina and Ferrari 92 , the probability density function of a variable y for this zeroinflated beta (zib) mixture is: where 0 < π < 1 is the probability of observing an ITS2 distance of 0, and μ and ϕ are defined as above.
We treated the observed ITS2 genetic distances between each sample pair i as independent random variables y 1 ,…,y n drawn from this zero inflated beta distribution such that y i ∼ zib y|π i , µ i , φ i . We used a logit link to model both π and μ, and a log link to model ϕ. We fit six models (M1-M6) with different combinations of mosquito flux, spatial proximity, temporal distance, and spatial distance as covariates. The models for μ are specified as following: (1) b y|µ, φ = Ŵ(φ) Ŵ(µφ)Ŵ(µφ)Ŵ((1 − µ)φ) y µφ−1 1 − y (1−µ)φ−1 , 0 < y < 1 (2) zib y|π, µ, φ = π, if ITS2 distance = 0 (1 − π)b y|µ, φ , if ITS2 distance ∈ (0, 1) www.nature.com/scientificreports/ where α j represents a random intercept for each of the years in which the mosquitoes in pair i were sampled, and α k represents a random intercept for each of the sampled mosquitoes. This multiple membership approach allows us to account for the effects of sample year and sample itself, which could otherwise confound results given that each pair may be connected to other pairs by the sampling years and sampling that they share 93 . β 1 , β 2 , β 3 and β 4 are the slopes on the spatial proximity (sp_prox), log mosquito flux (ln_mosq_flux), temporal distance (yr_diff) and spatial distance (sp_dist) variables, and α 0 is the overall model intercept.
We explicitly model the ϕ parameter of our beta distributions as a function of the provinces in which each sample in the pair was taken. This allows us to explore how the precision of the beta distribution changes across geography. The choice of provinces as the areal units for this part of the model is based on these units representing patterns of human settlement and activity that should be of relevance to tiger mosquito spreading patterns, while also being large enough to avoid adding too many additional parameters to an already complicated model. The model for ϕ is the same in all six models: Finally, we modelled π (the probability of ITS2 pairwise distance being equal to zero) as: The systematic part of the model for π is almost identical to the systematic part of the model for μ. The only difference is the absence of the sample random intercepts, which are excluded here for computational reasons.
Details on model fitting and accuracy estimation are outlined in the Supplementary Methods.

Data availability
Newly generated mitochondrial and nuclear DNA sequence data were deposited in GenBank under accession numbers OP060971-OP060988 and OP077008-OP077085.