Origin and expansion of the mosquito Aedes aegypti in Madeira Island (Portugal)

Historically known as the yellow fever mosquito, Aedes aegypti invaded Madeira Island in 2005 and was the vector of the island’s first dengue outbreak in 2012. We have studied genetic variation at 16 microsatellites and two mitochondrial DNA genes in temporal samples of Madeira Island, in order to assess the origin of the invasion and the population structure of this mosquito vector. Our results indicated at least two independent colonization events occurred on the island, both having a South American source population. In both scenarios, Venezuela was the most probable origin of these introductions, a result that is in accordance with the socioeconomic relations between this country and Madeira Island. Once introduced, Ae. aegypti has rapidly expanded along the southern coast of the island and reached a maximum effective population size (Ne) in 2012, coincident with the dengue epidemic. After the outbreak, there was a 10-fold reduction in Ne estimates, possibly reflecting the impact of community-based vector control measures implemented during the outbreak. These findings have implications for mosquito surveillance not only for Madeira Island, but also for other European regions where Aedes mosquitoes are expanding.

). The majority of these departures were associated with positive F is values, indicative of heterozygote deficits. Most heterozygote deficits were detected at a single locus, AC4, which accounted for 15 out of the 48 significant tests. Micro-checker results suggested that locus AC4 had a high probability of having null alleles in all but Fx05L and PM14A samples (Supplementary Table S3). The consistent heterozygote deficits and suspicion of null alleles lead us to remove locus AC4 from subsequent analyses of population structure. There were a total of 348 significant pairwise genotypic association tests out of 1851 performed. However, no pair of loci was consistently associated across samples, which suggests an absence of linkage disequilibrium among loci. Microsatellite polymorphism in Ae. aegypti from Madeira was low to moderate, with mean over sample AR ranging from 1.7 (AC7) to 6.6 (88AT1) and mean He from 0.081 (AC7) to 0.789 (88AT1) (Supplementary Table S3). Expected heterozygosity was significantly lower in the other two localities of Madeira when compared to the mean over-years of Funchal (Wilcoxon signed-rank tests, Paúl do Mar: p = 0.001; Santa Cruz: p = 0.004). In Paúl do Mar, no significant differences in He were found between 2013 and 2014 (Wilcoxon signed-rank test, p = 0.168). The proportion of unrelated individuals obtained by ML-RELATE was above 70% in all samples except for Santa Cruz. This sample had the highest frequency of related individuals (41%) with a high proportion of full sibs (15%) and backcrosses (20%).
The overall mean AR and He across temporal samples from Madeira was significantly different from those of the Brazilian samples (Wilcoxon signed-ranks tests, Santos: AR p < 0.001, He p = 0.033; São Sebastião: AR p = 0.018, He p = 0.015) but comparable to the sample of Caracas, Venezuela (Wilcoxon signed-ranks tests, Caracas: AR p = 0.159, He p = 0.433).
With the exception of Funchal in 2014, when the larval sample (Fx2014L) was less polymorphic, estimates of He and AR were largely similar between larval and adult samples collected in the same locality and year (Supplementary Table S3). The degree of relatedness among individuals was comparable between larval and adult samples, as shown in Supplementary Fig. S1. These results suggest that the genetic variation captured by both sampling methods is comparable. Therefore, larval and adult samples were pooled in subsequent temporal analyses to represent a single sample per collection year.
Effective population size and demographic stability. In Funchal, single-sample estimates of effective population size based on the linkage disequilibrium method (LD-N e ) increased overtime, from a minimum of 3.4 in 2005 to a maximum of 657.0 in 2012, the year of the dengue epidemic (Table 2). After 2012, there was ten-fold reduction of LD-N e . This pattern of temporal variation was not evident in the two-sample estimates of N e (F s -N e , Table 2). These estimates varied between 291.1 and 401.3 with no apparent trend for increase/decrease over years and with overlapping 95%CI.
For both methods, N e estimates were consistently higher in Funchal, when compared to the other localities (Table 2). In Paúl do Mar, there was a 5-fold increase of LD-N e between 2013 and 2014. The lowest LD-N e estimate was obtained for the only sample available from Santa Cruz. The estimates of LD-N e obtained for the three South American continental samples analyzed in this study were consistently lower than those obtained for Funchal and Paúl do Mar, except for the samples of 2005 and 2013, respectively (Table 2).
Significant departures from mutation-drift equilibrium were detected by heterozygote tests only under the TPM model (Table 2). There was a consistent surplus of loci with apparent heterozygote excess in all Madeiran population structure and origin. A first STRUCTURE analysis was performed with the Madeira dataset only and the results for the three best K values (K = 2 to K = 4) are shown in Fig. 2. Graphical representations of Evanno's ΔK can be seen in Supplementary Fig. S2. The sample from Paúl do Mar consistently formed a homogenous distinct genetic cluster in the three population structure scenarios. In the K = 3 clusters scenario, population partitioning corresponds to the geographic localities sampled, with distinct genetic clusters for Santa Cruz, Paúl do Mar and Funchal. The K = 4 scenario maintains the geographic substructuring but separates the samples from Funchal into two different genetic backgrounds. This genetic partitioning within Funchal was not confirmed by the DAPC analysis (Fig. 3). Madeira samples were divided into two principal genetic clusters. The first discriminant function separates Paúl do Mar from Funchal and Santa Cruz while subdivision of these two localities in discriminant function two is less pronounced, judging from the respective eigenvalues. A second STRUCTURE analysis was conducted with the complete dataset comprising the samples of Madeira and South America genotyped in this study, along with the dataset of Gloria-Soria et al. 20 . The best K obtained was K = 2, reflecting the known segregation of the African Aedes aegypti formosus from out-of-Africa Aedes aegypti aegypti populations ( Fig. 4a; see also Gloria-Soria et al. 20 ). All Madeiran individuals were homogenously assigned to the Ae. aegypti aegypti group. When the analysis was repeated without the African samples, the best value of K was equal to four (Fig. 4b). This partitioning reflected the previously shown three continental Asian/ Pacific, North-Central American and South American clusters 20 along with one additional cluster that grouped all Madeira island samples with a subset of South-American samples. A third STRUCTURE analysis was performed with samples of the fourth cluster only (Fig. 4c). This analysis gave a best K = 2 and grouped the Madeiran samples mainly with Venezuelan samples from Caracas. A few individuals from São Sebastião, Brazil, were also assigned to this Madeira/Caracas cluster. The other cluster comprised individuals mainly from Brazil, Colombia and non-Caracas Venezuelan samples.
Results of a DAPC analysis conducted with the Madeira/South America subset confirmed a closer relationship between Madeira Island and the samples of Caracas, Venezuela, and São Sebastião, Brazil (Fig. 5).

Mitochondrial DNA analysis. Summary statistics of genetic variation for each mtDNA gene in Madeira
Island are shown in Supplementary Table S4. Partial COI sequences were obtained for 202 individuals  An asterisk indicates the plot representing the optimal K as determined by the delta K method. Legend: 1-9: Funchal populations; 10-12: Paúl do Mar populations; 13-Santa Cruz population. For additional population details, see Table 1.  Table S6). The 764 bp alignment revealed the presence of three distinct haplotypes and nucleotide diversity (π) of 0.00317. Partial ND4 sequences were analyzed from 191 mosquitoes (Supplementary Table S5). The 351 bp alignment revealed the presence of four haplotypes and π = 0.00545. Neutrality tests were non-significant for both genes. A Median-Joining haplotype network for the concatenated COI/ND4 sequences is shown in Fig. 6. Of the five different haplotypes identified, haplotype COI_1/ND4_1 was present in over 90% of all individuals in all localities and it was the only haplotype detected in Paúl do Mar in the two years sampled ( Table 3). The second most frequent haplotype (COI_2/ND4_2) was separated by 23 mutational steps from the central COI_1/ND4_1   We performed a phylogenetic analysis with concatenated COI-ND4 sequences to infer the phylogenetic relationships between Madeira and worldwide Ae. aegypti sequences 22 . The resulting phylogenetic tree revealed that Ae. aegypti in Madeira is represented by members of the two major mtDNA lineages known for this species (Fig. 7) 23 . The main COI_1/ND4_1 is included in the West African lineage and clusters with Venezuelan and USA haplotypes. Haplotypes COI_1/ND4_3, COI_1/ND4_4 and COI_3/ND4_4 are also included in this clade. Haplotype COI_2/ND4_2 is included in the East Africa lineage that contains sequences from Asia, Central America, Caribe and Brazil.

Discussion
The results of the present study indicate that the recently established Ae. aegypti population of Madeira has derived from at least two independent introductions, possibly occurring at different time-points. Venezuela is the most likely geographic origin but a Brazilian source population, at least for one of the introductions, cannot be fully excluded. After the initial colonization, Ae. aegypti has rapidly expanded throughout the southern part of the island, reaching its maximum effective population size in 2012, which was coincident with the dengue   28,29 . Furthermore, MDE tests did not provide evidence of a population contraction associated with a founder event mediated by only a few individuals. This may reflect a higher evolution rate of microsatellites, with mutation rates 30 around 10 −4 -10 −3 , or that the size of the founding population was still sufficient to maintain a representative gene pool of this species.
Both genetic and historical evidence support an initial introduction of Ae. aegypti in Funchal, possibly by maritime transportation. Funchal is the capital and the major urban centre of the island, where the international harbour is located. It was in Funchal that the first Ae. aegypti specimens were collected in October 2005 9 . In agreement, Funchal displays the highest genetic diversity for both mtDNA and microsatellites and the largest N e estimates recorded in Madeira, suggesting a longer established population. After the introduction, Ae. aegypti rapidly expanded eastwards and westwards of Funchal, reaching Santa Cruz in 2008 10 and Paúl do Mar in 2012 12 . Again, genetic data supports an earlier colonization of Santa Cruz, judging from the higher haplotype diversity when compared to Paúl do Mar, with a single haplotype. The lower N e estimates also agree with subsequent colonisations of those two localities after the initial introduction of Ae. aegypti in Funchal. This expansion was most probably human-mediated, through road transportation along the only highway that connects most of the southern part of the island. In fact, the pronounced island's topography is likely to act as a natural barrier to active dispersal of this mosquito and this may be the reason why Ae. aegypti has not yet established in the northern part of the island. The influence of human movement in shaping genetic diversity, structure and differentiation was also observed on other Ae. aegypti insular populations as in the Antilles islands 31 and in the Pacific region 32 .
Concatenated mtDNA sequences revealed the presence of two highly divergent haplotypes separated by 23 mutation steps (COI_1/ND4_1 and COI_2/ND4_2). These two haplotypes were detected in the initial sample of Funchal 2005, providing evidence that the initial colonization was made by at least two different maternal lineages. However, the presence of a unique mtDNA haplotype (COI_3/ND4_4) in Santa Cruz may suggest one additional introduction on the island. Santa Cruz is a county where the International Airport of Madeira is located, providing an opportunity for airplane-mediated transportation of mosquitoes, an occurrence that has been previously detected 33 . While this haplotype was only detected in 2014, we cannot exclude an earlier introduction due to the lack of sample availability from previous years in this locality.
Both Bayesian clustering and DAPC analyses suggest a South American origin of Ae. aegypti in Madeira. All Madeira samples grouped in a distinct genetic cluster together with specimens from Venezuela, mainly from Caracas. Moreover, the most frequent mtDNA haplotype in Madeira (COI_1/ND4_1) is the same haplotype found in all Caracas specimens sequenced in this study. A Venezuelan origin of Ae. aegypti in Madeira is not surprising given that Madeira has an important emigrant community living in Caracas 34 . During summer, there is extensive movement between Caracas and Madeira because of holidaying by the migrant community. Coincidently, the only direct flight connecting Madeira and South America is between Funchal and Caracas 16 . A Venezuelan origin of Ae. aegypti also agrees with the insecticide susceptibility profile of Ae. aegypti in Madeira. This population was found to be resistant to three different insecticide classes and resistance was associated with knockdown resistance (kdr) mutations F1534C and V1016I and elevated expression of detoxification enzymes 35 . Similarly, insecticide resistance studies in Ae. aegypti from Venezuela revealed high frequencies of F1534C and V1016I kdr mutations and increased activity of glutathione-S transferases, esterases and mixed-function oxidases 36,37 , the same profile as that observed in Madeiran Ae. aegypti.
In addition to a Venezuelan origin, at least one introduction may have derived from Brazil. There were 7 individuals from São Sebastião, Brazil, which grouped in the Madeira/Caracas genetic cluster and 21 individuals from Madeira with genetic ancestry closest to the Brazilian/Colombian cluster. Moreover, the phylogenetic tree indicates that haplotype COI_2/ND4_2 groups in a clade in sequences from Brazil but not from Venezuela. However, this may simply reflect that the number of specimens sequenced from Venezuela was too low to capture all the haplotype diversity. Therefore, we cannot exclude the possibility of haplotype COI_2/ND4_2 also being present in Venezuela but not sampled.
Estimates of LD-N e in Funchal consistently increased overtime from the first sample time-point in 2005 until reaching its highest value in 2012. Coincidently, 2012 was the year of the dengue epidemic on the island 38 . Such an increase in N e agrees with the rapid expansion of this mosquito vector on the island. The precise ecological conditions driving this expansion are not fully understood but the mild temperate climate suitable for sustaining a mosquito population throughout the year coupled with an extensive availability of breeding sites (flower pots) 13 in urban and rural areas may have facilitated adaptation and subsequent population expansion on the island 14 .
The utility of genetic markers in assessing the impact of vector control on the mosquito population has been tested previously, with varying degrees of success [39][40][41] . Interestingly, estimates of F s -N e did not show any trend of temporal variation. While two-sample estimates are sensitive to population fluctuations, these methods are influenced by the initial (T 0 ) genetic variation of the population, since N e is retrieved from an unbiased estimate of allelic variance 42,43 . Therefore, the initial low microsatellite polymorphism of the Madeiran Ae. aegypti population may have affected the sensitivity of F s -N e in detecting population contractions. Coincidently, F s -N e samples of Funchal in the order of the hundreds are within the average values obtained in a previous study analyzing a global dataset for Ae. aegypti 44 .
After the 2012 outbreak, estimates of N e significantly decreased in 2013 and 2014. Heterozygosity tests and mode-shift allele detected a recent bottleneck during this period. These results suggest that the vector control measures implemented after the dengue outbreak were effective in reducing Ae. aegypti densities. It should be emphasized that vector control during the dengue outbreak of Madeira was predominantly based on community-based larval source reduction, enforced by a strong communication campaign led by the local health authorities 12 . Given the high insecticide resistance of Ae. aegypti on the island, alternative non-insecticidal methods are essential to contain the mosquito population and, subsequently, prevent arbovirus transmission. In this context, larval source reduction is a valid option for Madeira. This method is also recommended by the World Health Organization as a primary vector control tool for Ae. aegypti 45 .
To conclude, Ae. aegypti has recently arrived in Madeira and rapidly expanded its population size to levels able to sustain transmission of an arbovirus epidemic. The Venezuelan origin is coherent with the socioeconomic relations of this insular territory with that country and highlights the importance of monitoring mosquito populations at points of entry such as international harbours and airports. This study also provided evidence for the effectiveness of non-insecticidal vector control methods. The relatively small effective size of this island vector population may also be regarded as advantageous for the implementation of vector control tools that rely on genetically modified mosquitoes 46 .

Methods
Mosquito samples. Mosquito collections were performed in three localities of Madeira: Funchal, the capital city; Santa Cruz and Paúl do Mar, representing the eastern and western distribution limits of Ae. aegypti in the island (Fig. 1). Collections were made at six time points in Funchal (2005, 2009, 2011, 2012, 2013 and 2014) and two (2013 and 2014) in Paúl do Mar. In addition, mosquito samples from two localities in Brazil (Santos and São Sebastião) and one in Venezuela (Caracas) were also analysed. The two Brazilian localities represent coastal cities with major international harbours. Caracas is the capital of Venezuela and located ca. 30 km away from the major harbour city La Guaira. Sample details, including the sampling method, collection year and sample sizes, are available in Table 1.
Both immature and adult mosquitoes were sampled. Collected immatures (eggs and larvae) were reared to adults under insectary conditions. Adults were identified to species using morphological keys 47  Microsatellite genotyping. A total of 16 microsatellites were genotyped by fragment size analysis of polymerase chain reaction amplified products. Primer sequences and PCR conditions followed previously published protocols [49][50][51] and are described in Supplementary Table S1. Fragment size analysis was performed by capillary electrophoresis on an ABI3130xl genetic analyser (Applied Biosystems), at the DNA Analysis Facility at Science Hill, Yale University. Microsatellite alleles were scored using GENEMARKER software (SoftGenetics, PA, USA).
The genotypes obtained were integrated into the microsatellite genotypic database of Gloria-Soria et al. 20  Microsatellite data analysis. Expected heterozygosity (H e ) and the inbreeding coefficient (F is ) were estimated using GENEPOP 52 . The same software was used to perform exact tests of departure from Hardy-Weinberg proportions and of linkage disequilibrium (LD) among pairs of loci. Estimates of allele richness (AR) were obtained for each population by the statistical rarefaction approach implemented in HP-RARE 53 . The software Micro-checker 2.2.3 54 was used to test for the presence of null alleles (99% confidence interval) at each locus/ sample. Two estimates of current effective population size (N e ) were made: single-sample estimates based on the linkage disequilibrium method 55 ; and two-sample temporal estimates, based on the F-statistic of Jorde & Ryman 56 . Calculations were performed using NeEstimator v2 57 . Because rare alleles may bias linkage disequilibrium N e estimates, alleles with frequency below 0.05 at each locus were removed from the analysis.
Evidence of recent population perturbations was assessed by heterozygosity tests as implemented in BOTTLENECK version 1.2.02 58 . Expected heterozygosity estimates assuming mutation drift equilibrium (MDE) were calculated from the number of alleles and sample size under two mutation models, considered more suitable for microsatellites: the stepwise mutation model (SMM) and a two-phased model (TPM) with 30% multistep mutations (variance = 30%). Although the SMM has been considered as better suited for the type of mutation process most frequent in microsatellites (i.e. DNA slippage 59 ), there is evidence that intermediate mutation models such as the TPM with increasing proportions of multistep mutations are less prone to detect false positives 60 . Wilcoxon tests were used to assess significance between observed and MDE-expected heterozygosities, as recommended for analysis with less than 20 loci 61 . In addition, the allele frequency distribution method was also used 60 . Under MDE, an L-shaped allele frequency distribution is expected, whereas a shifted distribution due to loss of low-frequency alleles is consistent with a recent bottleneck 62 .
In order to assess the degree of relatedness among individuals, the maximum-likelihood method implemented in ML-RELATE was used 63 . For each pair of individuals, log-likelihood estimates are calculated for four pedigree classes: unrelated, half-siblings, full-siblings and parent-offspring.
Bayesian clustering analysis was performed using the software STRUCTURE version 2.3.4. 64 , in order to assess within-island population subdivision and to determine the most likely source populations of Ae. aegypti in Madeira. In a first analysis, only Madeira island samples were used. Subsequently, the Madeira island genotypes were analysed with the continental sample dataset of Gloria-Soria et al. 20 . Twenty independent runs were made for each value of K, which varied from one to 10 for within island analysis, including source population determination, and from one to five at the subspecies/species level. Each run was conducted with a burn-in of 100,000 iterations and 500,000 replicates, assuming an admixture model with correlated allele frequencies. The optimal number of clusters was determined following the guidelines of Pritchard et al. 64 and the delta K statistic of Evanno et al. 65 , using STRUCTURE HARVESTER version 0.6.94 66 . The information from the outputs of each K was aligned by the Greedy method implemented in CLUMPP 67 .
Discriminant Analysis of Principal Components (DAPC), as implemented by ADEGENET 68 , was used to visualize patterns of genetic differentiation among individual mosquitoes belonging to different genetic clusters in a two-dimensional plot. This analysis was performed with the samples from Madeira and a subset of candidate source populations selected from the Bayesian clustering analysis.
Whenever multiple tests were performed, the nominal significance level (α = 0.05) was adjusted by the sequential Bonferroni procedure 69 .
Mitochondrial DNA sequencing. The mitochondrial genes COI and ND4 were analysed by direct sequencing from amplified products, corresponding to 764 bp and 351 bp, respectively, using previously published primers 22,24 and protocols 19 . In addition to Madeira individuals, mtDNA sequences for 21 individuals from Caracas, Venezuela, were also obtained to compensate for the scarcity of sequences available in Genbank for this country. Sequences were aligned and manually corrected using BioEdit v7.0.5 70 . For each gene, haplotype diversity (Hd), nucleotide diversity (π) and the Tajima and Fu and Li neutrality tests were computed by DNAsp v5.10 71 .
In order to infer the relationships between haplotypes in Madeira, haplotype networks were constructed for concatenated COI-ND4 sequences using a median-joining algorithm as implemented in the NETWORK software 72 .
The BEAST v1.8.4 software 73 was used to generate a phylogenetic tree based on the COI-ND4 concatenated haplotypes, obtained from sequences produced in this study and others retrieved from GenBank (Supplementary Table S2). The analysis was run in three separate independent runs with 500 million generations, sampled every 100 000 runs for the concatenated genes. A Bayesian skyline population growth model was used. The substitution model HKY 74 with gamma and invariant sites and three partitions into codon positions was selected. MCMC analysis was run long enough for convergence to be obtained. To analyze convergence and stability, we used Tracer v1.6 software 75 . TreeAnnotator was used to estimate the final Maximum Clade Credibility Tree, summarizing the posterior probability of each clade of the trees, as well as the average and confidence interval for the evolutionary rate of each branch of the tree. The obtained Bayesian trees were visualized and edited with FIGTREE 1.4.3. (http://tree.bio.ed.ac.uk/software/figtree/).