Colonization routes uncovered in a widely introduced Mediterranean gecko, Tarentola mauritanica

In this study, we aimed to understand the contemporary and ancient colonization routes of the Moorish gecko, Tarentola mauritanica, using simple sequence repeats. By analyzing the genetic diversity of populations in different regions, we found that Morocco is the genetic diversity hotspot for the species, followed by the Iberian Peninsula. However, historical gene flow estimates identified the Iberian Peninsula, not Morocco, as the primary contributor of colonizing individuals, along with continental Italy to a lesser extent. Currently, mainland Italy is the main source of introduced individuals, likely due to the plant nursery trade. The study suggests that human-facilitated introductions from various geographical origins, with numerous regions colonized through continental Italy during two distinct periods, are responsible for the recurrent entry of individuals belonging to the European lineage of T. mauritanica into the Mediterranean and Macaronesia. These findings can inform better monitoring surveys and conservation programs by identifying putative current colonization routes of alien species.

The first phylogeographical studies performed on the Moorish gecko detected an extremely high mitochondrial DNA genetic variability, identifying six distinct lineages [22][23][24][25][26]42 , which were later recognized as putative candidate species based on a multilocus species tree study 43 . Moeover, this elevated genetic diversity is particularly evident in North Africa, with Morocco harbouring two endemic lineages and sharing most of the remaining ones with the Iberian Peninsula [23][24][25] .Also, the Iberian Peninsula has an endemic mitochondrial lineage 22 .Most of these mtDNA lineages have very restricted geographic ranges with considerable genetic diversity, contrasting with the pattern of the European lineage 25,26 ; this clade is spread across the entire Mediterranean Basin, with all introduced populations comprised exclusively of individuals from this clade with a practically null mtDNA genetic diversity, preventing the assessment of potential gene flow routes within this group.However, the recent study from Belluardo, et al. 44 uncovered 13 new 16S mitochondrial haplotypes within the Italian populations of the European clade, suggesting this region as a center of genetic diversity for this lineage. Moeover, their results on microsatellite data propose an overall shallow population genetic structure.
Hence, this study aims to uncover the contemporary and ancient colonization/introduction routes among several populations of the Moorish gecko belonging to the European mitochondrial clade.In order to achieve that, a battery of 11 microsatellites especially designed for this species was genotyped for 44 sampling sites distributed across the Mediterranean and Macaronesia regions.The results obtained here will ultimately explain the extant geographic range and genetic diversity composition and distribution within this widespread lineage of Tarentola mauritanica.

Results
Out of the 11 genotyped microsatellite markers, three of them were removed from further analyses (Mt3, Mt14 and Mt29; but see Table S2).Both Mt3 and Mt14 were identified by MICRO-CHECKER as containing null alleles.According to GENEPOP none of the loci were in LD but most of them were not in HWE, since almost every population was also not in HWE (independently of its effective size).Therefore, we discarded the loci with the lowest p values and higher number of populations in disequilibrium (Mt3, Mt14 and Mt29), ending up with a total of 8 microsatellite loci for further analyses.
Regarding the genetic diversity results, Morocco (the highest) and the Iberian Peninsula are clearly the regions harbouring the populations containing the uppermost number of alleles and allelic richness, in contrast to Greece and the Balearic Islands (Table 1 and S3).Indeed, the spatial interpolation of the rarefied allelic richness, identifies Morocco, the Iberian Peninsula and also the west-Mediterranean coast of France as major centres of genetic diversity for T. mauritanica (Fig. 2).
According to Evanno et al. 45 's method, STRU CTU RE identified that the best K was K = 4 (Fig. 3 and Fig. S1), but also both tess3r and DAPC seem to indicate that the genetic diversity of T. mauritanica could be substructured into four clusters (Figs.S2 and S3), since there is a slight elbow/plateau in both cross-validation and BIC values when the number of clusters is 4. Nevertheless, one should always be careful about over-interpreting these graphs and the value of K, since the number of genetic groups detected by ancestry estimation programs does not necessarily correspond to the number of biologically meaningful populations in the sample 46 .Hence, the major identified geographic groups are France and Greece, Morocco, the Iberian Peninsula, and Tunisia, Italy and Madeira.This geographic sub-structuring is also evident with tess3r (Fig. 4).Morocco and the Iberian Peninsula present unique genetic patterns, with very little allele sharing with other geographic regions (Figs. 3, 4, 5 and  S4).The only exception concerns the individuals from the Tui population (Spain) that share most of their alleles with Morocco, instead of the remaining Iberian populations.Additionally, the DAPC evidence that the genetic  21 , is represented in green.Map generated using QGIS software 95 .
Table 1.Genetic diversity statistics calculated from the microsatellite markers for each major geographic region (see "Methods" section).Minimum and maximum values are highlighted in bold.The following parameters are displayed: sample size (N), mean number of alleles (N a ), allelic richness (Ar), rarefied allelic richness (R-Ar), observed heterozygosity (H o ) and expected heterozygosity (H e ).clusters comprising the Iberian Peninsula and Morocco populations are genetically unique and divergent from the other groups (Figs. 5 and S4).On the contrary, clusters 1 and 3 are overlapped.Moreover, the results from both F st and G'' st support most of all defined geographic regions as independent units (Tables S4 and S5).The only exceptions were the Greece-Corsica and Corsica-Italy combinations.
Finally, the results from the gene flow assessment suggest that the direction and intensity of colonization by the Moorish gecko, have changed over time (Fig. 6).Currently, most of the gene flow occurs within each geographic range, and Italy acts as the main source of introduced individuals into new areas.Quite the reverse, in the past there was little intra-population migration and massive inter-population gene flow, with the Iberian Peninsula and at some point, also Italy being the main geographic sources of introduction.However, both these regions were simultaneously hosts of introduced individuals from different origins, and in most geographic territories the incoming and outcoming gene flow were 50/50.
From the comparison of both gene flow diagrams, it is also evident that regions such as Greece, France, Corsica, the Balearic Islands and Tunisia, have been the stage of multiple introductions from Italy in two different moments in time.

Discussion
Accumulated genetic evidence suggests that species introductions are often featured by complex histories, but also underlines that several key and relatively simple components may be part of the process 47 .
The Mediterranean region is a world biodiversity hotspot 48 with one of the longest histories of interaction between humans and biodiversity, with multiple introductions of taxa occurring over millennia 49 .In that sense, humans have been key dispersal drivers of several alien reptiles in this region.However, their distributions are determined by a complex interplay between human activities, geographic factors and species traits 50 .
A common finding in the Mediterranean region is that many introduced populations may originate from multiple geographically distinct sources e.g., 8,51,52 .Alternately, and a common scenario for the origin of multiple populations is a serial founding from a single source e.g., 8,53,54 .Another scenario which is rarely considered is a model concerning multiple introductions from the same source in different points in time.
The Balearic Islands appear as a striking example of a region marked by several faunal introductions across time; this archipelago that once harboured substantial levels of endemicity, now hosts more alien than native reptile and amphibian taxa reviewed in 55,56 .The same occurs nowadays in Madeira Island, where there is a single endemic lizard species (Teira dugesii), surpassed by two introduced geckos, Hemidactylus mabouia, and Tarentola mauritanica 10,34,57 , one snake (Ramphotyphlops braminus; 58 ), and one allochthonous skink (Chioninia fogoensis; 59 ).Hence, it is clear that although largely sedentary, reptiles are frequently introduced by humans, many times during transport of building material, soil or cultivated plants 4 .
Tarentola geckos in particular, which are primarily a North African clade, have naturally reached long distances such as many Macaronesia islands but also Cuba and the Bahamas, most likely by rafting on buoyant vegetation, at least 23 Mya 42 .Nevertheless, the current geographic distribution of T. mauritanica is partly the result of recurrent anthropogenic introductions e.g., [22][23][24][25][26][28][29][30]34,36,37,39,41,42,60 . Their success as human-asssted colonizers is somewhat associated with the synanthropic habits of this species 61 , allied with their relatively small size and cryptic nature.
Our results suggest that Morocco represents the genetic diversity hotspot for the European lineage of the Moorish gecko, followed by the Iberian Peninsula (Table 1 and Fig. 2).This is indeed not a surprise considering the Moroccan origin of T. mauritanica, harbouring most of the mitochondrial DNA diversity as well [23][24][25]43 . Altough, the mtDNA results from Belluardo et al. 44 suggest Italy as the centre of diversification of the European lineage of T. mauritanica, we have to acknowledge that their North African sampling was very limited and, possibly unable to tackle its underlaying genetic diversity.Moreover, all population structure analyses presented here support Morocco and the Iberian Peninsula as two divergent clusters with very little gene flow between them or with the other remaining groups (Figs. 3, 4, 5).The only exception is the Tui population, whose individuals share most of their alleles with the Moroccan populations, evidencing a clear case of significant introduction into this Spanish locality (Fig. 3). Th two other groups comprising France, Italy, Tunisia, Madeira and Greece are undoubtedly admixed.The historical literature in France seems to indicate a maritime transport from North Africa as the source of introduction of T. mauritanica, based on its presence in material freshly landed from www.nature.com/scientificreports/Algeria in the port of Sète 62 .However, an earlier account by Crespon 63 , reports that the species is rarer in south-west France than in the Provence region, which borders Italy.His earlier accounts, therefore, also point to a historically important arrival from Italy, which is supported by the gene flow results obtained here (Fig. 6).
Although the current study does not include specimens from Algeria, the results from Fig. 6A indicate the existence of historical gene flow from both Morocco and Tunisia into France.Hence, it is plausible to assume that if Algeria had been sampled, we might also detect introductions from here to France, as has already been described for two amphibian species, Discoglossus pictus 64 and Pelophylax saharicus 65 .
Considering the distribution of the genetic diversity but mostly of the evolutionary history of this European lineage, a bigger contribution of Morocco as a source of introduction was expected (Fig. 6).Indeed, the historical gene flow estimates identify the Iberian Peninsula as the major contributor of colonizing individuals, and also Italy in a smaller scale.According to Rato et al. 25 , the diversification of the European lineage started around 2.47Mya, most likely in Morocco.This means that the colonization of the Northern Mediterranean from North Africa could be quite ancient, and if so, maybe impossible to be detected using fast-evolving markers, such as microsatellites.In fact, the genus Tarentola and T. mauritanica, or morphologically closely-related taxa, were identified in the fossil records from Spain dated from the Early Pleistocene 66 , supporting the ancient occurrence of this taxon in the Iberian Peninsula.As the high incidence of microsatellite homoplasy increases with evolutionary distance, it might limit the depth of the phylogeny at which it is possible to make inferences 67 .Therefore, the obtained ancient gene flow scenario matches better with the known human history in the Mediterranean during the Classical Era (600 B.C. to A.D. 476), which highlights the importance of the Iberian Peninsula and Italy.This was the time of the Roman Empire (753 B.C. to A.D. 476) when cities were being developed, and communication networks expanded, which have clearly favoured the translocation of highly anthropophilic species 68 .In fact, during this historical time there was a peak of introduced reptile and amphibian species in the Iberian Peninsula 68 , a region known to be one of the most important gold suppliers during the Roman Empire 69 .Hence, these results together with the distribution of the genetic diversity support the Iberian Peninsula as the most likely entering point of the European lineage of T. mauritanica from North Africa, and from there to the remaining geographic regions.
Very importantly, we need to acknowledge Sardinia and Sicily as major sampling gaps in this study.Apart from being the two biggest islands in the Mediterranean, they had many historical connections with North Africa including herpetological species exchange 70,71 .The study from Belluardo et al. 44 identified the presence of a common widespread mitochondrial 16S haplotype in Sardinia, Sicily, Morocco, Algeria and Tunisia.Additionally, fossil remains of T. mauritanica dated from the Late Pleistocene-Holocene were found in Sicily (San Vito lo Capo) 66 .However, the origin of both Sardinian and Sicilian populations remains unknown, and an excellent opportunity for future studies.
Currently, Italy seems to be the main source of Moorish gecko introduced individuals, although most of the gene flow takes place within each geographic region, contrary to the long-distance colonization typical in Tarentola geckos 42 .Accumulating evidence is strongly identifying the "plant nursery trade" (commerce in live plants for ornamental purposes) as one of the main forms of reptile introduction into new territories 4 , since these animals often use plants and trees for refuge and thermoregulation e.g., 72,73 .This kind of human-mediated trade in the Mediterranean has been responsible, for instance, for the introduction of the Italian wall lizard, Podarcis siculus e.g., 8,74 , the brahminy blind snake, Indotyphlops braminus 75 , and the colubrid snakes Hemorrhois hippocrepis, Malpolon monspessulanus, and Zamenis scalaris 53 , outside their native geographic ranges.Curiously, many of these introductions result from the trade of old olive trees transported from Italy [76][77][78] .Indeed, the olive tree trade appears as a modern vector for bioinvasions across the Mediterranean for a wide spectrum of alien species, Tarentola mauritanica included, having been recorded in Catalonia (Spain) around olive trees brought from Italy 77 , but also in Lake Garda, Northern Italy 79 , and Beaugeay, on the Atlantic coast of France 80 .Hence, previous studies together with the results obtained here, point to the plant nursery trade from Italy as a putative modern route responsible for the introduction of T. mauritanica across a wide geographic range.Additionally, the Mediterranean Basin has a very intense maritime traffic (especially since the opening of the Suez Canal connecting the Mediterranean and the Red Sea 81 ), which is thought to be the main driver for the introduction of Italian Moorish gecko individuals in the islands of Corfu 30 and Lesvos 31 in Greece.Overall, the colonization of the European lineage of T. mauritanica in the Mediterranean and Macaronesia seems to result from a combination of several human-mediated introductions from multiple geographic sources, with many regions having been colonized by Italy in at least two different periods far apart in time.Most importantly, this study has identified and highlighted the most likely current colonization routes for this lineage, which will hopefully help authorities in the design of better monitoring and inspection conservation programs.

Study sites and sampling
In this study, 555 individuals of Tarentola mauritanica collected across 44 Mediterranean (some islands included) and Madeira populations were used, with each location represented by at least five individuals.Genotypes from all Italian populations were retrieved from Belluardo et al. 44 .All individuals included in this study belong to the mitochondrial European clade [22][23][24][25][26]44,82,83 . Tissue fom tail tip muscle was collected from each individual and preserved in 96% ethanol.More details on the localities and origin of the samples are presented in Table S1 and Fig. 1.

Data screening and quality assessment
The presence of possible null alleles, allele scoring errors due to stuttering and large allele dropout was evaluated using MICRO-CHECKER v2.2.3 87 .Linkage disequilibrium (LD) and deviations from Hardy-Weinberg equilibrium (HWE) were tested with "GENEPOP on the Web" (dememorization = 1000; batch number = 100; iteractions per batch = 1000) 88,89 .We applied the False Discovery rate 90 to correct p values (p < 0.05) from HWE and LD multiple exact tests, using the package fdrtool 91 in R 92 .
To avoid the computer intensive assessment of such a large number of distinct populations, these were grouped into nine geographical regions (Iberian Peninsula, Corsica, Continental France, Greece, Morocco, Balearic Islands, Tunisia, Madeira and continental Italy), which were considered in some of the posterior analyses (see population structure results that support this).

Genetic variation
Basic microsatellite diversity was evaluated separately for each population based on the number of alleles per locus (Na), expected and observed heterozygosity (H exp and H obs , respectively), using the R diveRsity package and the divBasic function 93 .The rarefied allelic richness (R-Ar) per population was calculated with the R package hierfstat 94 , which accounts for variation in sample size (each sample included a minimum of five loci).To identify geographic changes in genetic diversity, the rarefied allelic richness was spatially interpolated using the Inverse Distance Weighting method (IDW) in QGIS v.3.16.11 "Hannover" 95 .

Population structure
Population differentiation was assessed by means of pairwise F ST 96 and G" ST 97 measures, using the diffCalc function from the R diveRsity package.Respective 95% confidence intervals (CIs) were estimated using 1000 permutations, and pairwise estimates were considered significant when 95% CIs did not overlap zero.These analyses were applied to all 44 populations and also to the nine defined geographic regions.
Genetic structure was assessed using a Bayesian clustering analysis, implemented in STRU CTU RE 2.3.4 98,99 .Ten independent runs were performed for a number of clusters (K) ranging between 2 and 8. Runs consisted of a burn-in period of 10 4 iterations, followed by 10 6 MCMC reps, correlated allele frequencies, admixture model, and no prior information regarding population of origin.The best K was identified using STRU CTU RE HARVESTER Web v0.6.94 45,100 , and deemed as the best K describing the observed genetic data.Graphical plotting of STRU CTU RE results was implemented in the online software CLUMPAK 101 , and geographic plotting of ancestry coefficients using the R package mapplots 102 .
Additionally, two other methods were implemented as alternative and complementary approaches to assess the genetic diversity; a Bayesian Clustering algorithm using tessellations and Markov models for spatial population genetics under the R package tess3r 103 ; and a discriminant analysis of principal components DAPC; 104 .
The tess3r algorithm implements a new version of the program TESS 105 , based on geographically constrained matrix factorization and quadratic programming techniques.The new algorithms are several orders faster than the Monte-Carlo algorithms implemented in previous versions of TESS.
The DAPC summarizes the data to minimize genetic differentiation within previously defined groups, while maximizing it between groups, and it does not consider HWE or linkage equilibrium as necessary conditions.For this analysis, we assumed four genetic clusters based on the Bayesian Information Criterion (BIC) and ΔK (see "Results" section), while the number of retained principal components and discriminant functions were selected following the package's manual guidelines to avoid overfitting.

Gene flow assessment
Historical and contemporary migration rates were estimated among the major nine geographic regions with MIGRATE v3.7.2 106 and BayesAss v3.04 (BA3) (Wilson and Rannala, 2003), respectively.MIGRATE uses the coalescent in a Bayesian or maximum likelihood framework to calculate two parameters from the data, θ and M, where θ represents the effective population size (4Neμ for nuclear DNA), and M the mutation-scaled immigration rate (m/µ).This coalescent-based approach is most suitable for estimating migration rates over thousands of years or approximately 4N e generations in the past 107 .The data were assumed to follow a Brownian motion mutation model.The F ST calculation method was used to generate starting values for both θ and M. Uniform priors were specified for both parameters with a minimum of 0, mean of 50, maximum of 100, and a delta of 10.The Bayesian method was implemented to infer θ and M, specifying two independent runs, static heating with four chains (temperatures: 1.0, 1.5, 3.0, 10,000.0), a sampling increment of 20, 50,000 recorded steps, and a burn-in of 10,000.Convergence was assessed by examination of ESS values with a target of at least 1000.
Unlike MIGRATE, BAYESASS does not assume genetic equilibrium and is therefore, more suitable for inferring contemporary (over the past few generations) processes.The model in BAYESASS assumes linkage equilibrium between loci but allows for deviations in Hardy-Weinberg proportions by introducing an additional inbreeding (F) parameter.Several analyses with different starting seeds were performed and each Markov chain Monte Carlo run involved 10 8 iterations and discarding of the first 10 6 iterations as burn-in.The delta values DA, DF, and DM were set to 0.4, 0.5, and 0.4, respectively.Convergence of the chains was validated using Tracer v1.7.1 108 .

Figure 1 .
Figure 1.Black dots denote the location of all 44 populations of T. mauritanica used in this study.The native geographic distribution of the Moorish gecko modified from21 , is represented in green.Map generated using QGIS software95 .

Figure 2 .
Figure 2. Inverse distance weighted (IDW) interpolation of the rarefied Allelic richness (R-Ar) for the 44 populations of T. mauritanica used in this study.The spatial interpolation is displayed in a red (high) to white (low) gradient.Map generated using QGIS software 95 .

Figure 3 .
Figure 3. On the left is represented STRU CTU RE's bar plot displaying the assignment of individuals for the best K (K = 4; but see Fig. S1).Individuals are grouped by populations displayed on the left of the bar plot.Population codes are denoted in Table S1 and STRU CTU RE analysis within each major geographic group is displayed in Figs.S2-S4.On the right figure, the pie charts display population averages of ancestry proportions according to STRU CTU RE results.Geographic plotting was performed using the R package mapplots 102 .

Figure 4 .
Figure 4. Geographic maps of ancestry coefficients using K = 4 ancestral populations (see STRU CTU RE results), obtained with tess3r.Colours match with STRU CTU RE genetic clusters.Geographic plotting of ancestry coefficients was performed using the R package mapplots 102 .

Figure 5 .
Figure 5. Relative densities of genotyped individuals plotted against the discriminant function 1 from the Discriminant analysis of principal components (DAPC), representing genetic divergence among groups.Colours match with STRU CTU RE genetic clusters.

Figure 6 .
Figure 6.Gene flow diagrams for the Moorish gecko among different geographic regions, corresponding to (A) historical gene flow estimates from Migrate-n, and (B) contemporary gene flow estimates from BayesAss.Grid width represents the total amount of incoming and outgoing gene flow estimated for each population.Arrows indicate the direction of gene flow among the populations while the width of arrows is proportional to the relative amount of gene flow observed among connected populations.Exact gene flow estimates are presented in Supplementary TablesS6 and S7, respectively. https://doi.org/10.1038/s41598-023-43704-8