A comprehensive phylogeography of the Hyles euphorbiae complex (Lepidoptera: Sphingidae) indicates a ‘glacial refuge belt’

We test the morphology based hypothesis that the Western Palaearctic spurge hawkmoths represent two species, the Eurasian H. euphorbiae and Afro-Macaronesian H. tithymali. It has been suggested that these species merged into several hybrid swarm populations, although a mitochondrial phylogeography revealed substructure with local differentiation. We analysed a three-gene mt-dataset (889 individuals) and 12 microsatellite loci (892 individuals). Microsatellite analyses revealed an overall weak differentiation and corroborated the superordinate division into two clusters. The data indicate that the populations studied belong to only one species according to the biological species concept, refuting the opening hypothesis. A future taxonomic revision appears necessary to reflect the division into two subgroups. Ancestral mitochondrial polymorphisms are retained in H. euphorbiae, indicating gene flow within a broad ‘glacial refuge belt’ and ongoing postglacial gene flow. Diverse patterns of extensive mito-nuclear discordance in the Mediterranean and the Middle East presumably evolved by more recent processes. This discordance indicates introgression of H. tithymali-related mitochondrial haplogroups, accompanied (to a lesser degree) by nuclear alleles, into Italian and Aegean H. euphorbiae populations as recently as the late Holocene. The complex mosaic of divergence and reintegration is assumed to have been influenced by locally differing environmental barriers to gene flow.

The discipline of phylogeography has substantially advanced our understanding of the formation of geographically structured diversity within and among closely related species and thus enabled intriguing insights into the origin of species 1,2 . While geographic isolation promotes divergence and speciation 2 , gene flow counteracts this process by merging populations through introgressive hybridisation 3 . Phylogeographic studies have found most species of temperate latitudes to show distinct spatial patterns that are ascribed to vicariance during the severe climatic oscillations of the Pleistocene ice ages 2,4,5 . Distinct entities evolved in isolated glacial refugia, expanded at the beginning of the current interglacial following a few general patterns and generally formed narrow contact and/or hybrid zones upon secondary contact 2,5 . However, the process of divergence and reintegration may also lead to a more complex organisation of diversity that challenges a strictly bifurcating and hierarchically structured classification into discrete entities 6 . Introgression between two taxa can proceed differently depending on locally varying circumstances 2,7,8 and hybridisation may even give rise to independent evolutionary entities and/or species 3,6,9 . Furthermore, the assumption that the patterns of spatially distinct entities have generally remained static since their formation in the early Holocene 2,5 has been challenged recently in favour of a more dynamic evolution 7 , especially for climate-sensitive lepidopterans 8,[10][11][12][13] .
The spurge hawkmoths (Hyles euphorbiae complex: HEC) represent a widespread, polymorphic and taxonomically controversial species complex that features enigmatic biogeographic patterns in an incipient stage of speciation in the Western Palaearctic [14][15][16][17] . We excluded the Sardo-Corso-Balearic endemic H. dahlii and non Euphorbia-feeding H. zygophyllii from this study, although they are formally included in the HEC 18 . They show haplogroup 'enigmatica' also occurs in this species 16 . The Maltese population (* MAL) consists of a mixture of individuals bearing three different mitochondrial haplogroups [14][15][16]18 . Apart from both 'euphorbiae' and 'tithymali' (thus corroborating a hybrid origin 14,18 ), it also contains a more distantly related haplogroup, 'melitensis' , that has been assumed to represent the relict of an ancient endemic entity. Distinct, although 'tithymali'-related haplogroups, 'italica' and 'cretica' , predominate in southern Italy (corresponding to populations * CIT, * SIT, * SIC in this study) and southern Aegean Islands (* CRE) suggesting these areas represent glacial refugia of distinct taxa rather than hybrid zones.
By analysing museum specimens from Italy (corresponding to populations NIT, * EIT, * CIT, * SIT, * SIC in this study) and southern Aegean Islands (* CRE), Mende & Hundsdoerfer 12,13 found that mitochondrial haplogroups attributable to H. euphorbiae ('euphorbiae' and 'enigmatica') were once mixed among the endemic 'tithymali'-related haplogroups, 'italica' and 'cretica' . It was only during the Twentieth Century that 'italica' increased in frequency in Italy from below 60 percent of the total population to near fixation. This was ascribed to genetic drift resulting from habitat destruction and/or to a spread of this haplogroup through the population facilitated by climate warming 12 . This challenges the existence of refugial, long-term isolated and endemic entities and argues for the hypothesis of H. euphorbiae and H. tithymali hybridisation in these areas 12,13 .
The drawbacks of phylogeographic inference from mtDNA alone are known 24 , but in combination with nuclear markers the data can provide valuable understanding of a species' evolutionary course 25 and patterns of introgressive hybridisation 26,27 . Therefore, we used 339 individual sequences of the ~2300 bp Hundsdoerfer et al. 16 COI/II mitochondrial HEC dataset, and supplemented it with 550 individuals. In addition to this expanded mitochondrial dataset, we also analysed 12 microsatellite loci in 892 individuals to provide a comprehensive overview of the evolutionary history of the Western Palaearctic HEC. We discuss the main causes that could account for the complex phylogeographic patterns observed. The hypothesis (H 1 ) to be tested is that the HEC populations marked with a * in Fig. 1

Sampling.
Larval tissue or legs (if specimens were reared to or caught in the adult stage) of 898 specimens were collected from various localities covering virtually the entire Western Palaearctic distribution range of the HEC ( Fig. 1 and Supplementary Table S1) and stored in pure ethanol or dry at − 80 °C. For a statistically robust implementation of some analyses, we had to pool samples from several localities to work with a total of 43 populations (Fig. 1); however, we avoided pooling across morphological transition zones or potential geographic barriers, e.g. sea straits.
Laboratory techniques. In addition to DNA from 339 specimens that had already been isolated for mtDNA sequence analysis 16 , we extracted DNA from further specimens using the NucleoSpin Tissue Kit (Macherey-Nagel) or the InnuPREP DNA Mini Kit (Analytik Jena). Three mitochondrial genes (partial COI/II and interposed tRNA-leu; 2284 bp in total) were amplified using the primers given by Hundsdoerfer et al. 14,15 and established protocols 16 . The PCR products were sequenced on an ABI 3130xl (Applied Biosystems) and sequences aligned and proof-read using BioEdit 7.0.9.0 28 .
An unexpected diversity of distinct mitochondrial haplogroups was detected in H. euphorbiae (see Hundsdoerfer et al. 16 and this study). Hence, we had to confirm that the sequences correspond to those of transcribed mitochondrial genes rather than nuclear pseudogenes ("numts") by performing cDNA transcription of mRNA. RNA was extracted from frozen tissue of specimens bearing a haplotype of the 'euphorbiae' (Supplementary Table S1: #5651, 5699), 'melitensis' (#5658, 5674), 'enigmatica' (# 5663, 5666) and 'italica' (#5649) mtDNA haplogroups with the 'Perfect Pure RNA Cell & Tissue Kit' (5Prime) in a dedicated RNA laboratory. RNA concentration was measured with a NanoDrop ND-1000 (Thermo Scientific). The 'RevertAid First Strand cDNA Synthesis Kit' (Thermo Scientific) was used to synthesise cDNA from 5 μ g RNA according to the manufacturer's manual; however, with an extended DNA digestion at 37 °C for 1.5 h. A short fragment of COI (fragment 'B') and of COII ('L') was amplified for haplogroup assignment (for primers and protocol of fragment amplification, see Mende & Hundsdoerfer 12 ). All fragments matched the respective sequence obtained from mtDNA amplification.
We analysed the six microsatellite loci described by Mende et al. 29 (Hti62, Hti63, Hti65, Hti66, Heu72, Heu76) and another six loci developed by Hundsdoerfer et al. 30 (Hyti13, Hyti14, Hyti22, Hyti41, Hyti49, Hyti50) using protocols adapted for primers with dye labels (see below). Further described loci were discarded because they did not reliably cross-amplify (Hyti1, Hyti5, Hyti6, Hyti7, Hyti28) and/or presented substantial scoring problems due to a high number of barely distinguishable alleles caused by indels in the flanking regions (Hti67, Heu68). PCR was performed in a volume of 20 μ l containing 1 U Taq polymerase (Bioron), PCR reaction buffer with a final concentration of 2.5 mM MgCl 2 (Bioron), 0.2 mM of each dNTP (Fermentas), approximately 10-50 ng DNA template and infrared dye labelled forward and plain reverse primers. In addition, the reactions for loci developed by Hundsdoerfer et al. 30  Previous characterisation of the loci showed that highly variable flanking regions cause allele lengths that were inconsistent with the microsatellite repeat modulus 29,30 . Further alleles that were not present in the populations tested in these primer notes were verified to avoid artefacts by exemplarily sequencing homozygote specimens, if present. Nevertheless, given that allele length variation is in large part not solely attributable to repeat motif variation, we refrained from analyses that are based on a model of microsatellite evolution.
MtDNA sequence analyses. Evolutionary relationships among mitochondrial sequences were evaluated by a median-joining network using Network 4.5.1.0 (Fluxus Technology). Since the analysis is prone to errors in the presence of ambiguities, base positions with > 10 ambiguities in the total alignment were deleted, leaving 2070 bp. The few remaining ambiguities at non-informative sites were overwritten to the respective base at their particular locus and those at parsimony-informative sites to the most common base of the corresponding haplogroup. Average p-distances between haplogroups were calculated with Mega 4.0.2 31 and their geographic distribution was illustrated using ArcGIS version 10 (ESRI). DnaSP 5.0 32 was used to assess haplotype and nucleotide diversity within the entire HEC and the two clusters inferred by the Structure analysis of microsatellite data (see below).
We tested for signals of population size changes for each haplogroup with Arlequin 3.1 33 . Demographic and/or range expansions (or selective sweeps) are expected to generate an excess of recent and thus rare mutations, leading to a unimodal mismatch distribution and negative values of neutrality tests. Instead, a stationary population size (or balancing selection) would result in a multimodal mismatch distribution and a bottleneck in positive neutrality tests. We also inferred past dynamics of effective female population size (N ef ) for each haplogroup by means of a Bayesian coalescent approach with 3 × 10 8 Markov chain Monte Carlo (MCMC) iterations (of which 1.25 × 10 3 were sampled for 'euphorbiae' and 'tithymali' , 2.5 × 10 3 for 'italica' and 5 × 10 3 for the remaining haplogroups) using Beast 1.8.0 34,35 . The simulation (data were partitioned into three codon positions) was based on the following assumptions: the HKY nucleotide substitution model, a strict molecular clock with a substitution rate μ = 3.54 × 10 −8 per year based on a standard COI divergence rate of 3.54% per million years for univoltine insects 36 (i.e. μ = 1.725 × 10 −8 per generation; see Schenekar & Weiss 37 ) and a generalised average of two HEC generations per year 17,21 . We assured that the MCMC chains converged by means of effective sample size values for the parameters, i.e. ESS > 200, and visually. The results were visualised by means of skyline plots using Tracer 1.6 38 .
Microsatellite marker analyses. We tested for linkage equilibrium (LE) in each population containing ≥ 15 individuals using Genepop 'on the web' 4.2 39 with default Markov chain (MC) parameters and for Hardy-Weinberg equilibrium (HWE) using Arlequin with 100,000 MC and 10,000 dememorisation steps. Significance of the results was evaluated after Bonferroni correction. Micro-Checker 2.2.3 40 was used to test for null allele bias. We found no consistent significant deviation from LE between any pair of loci. Excessive deviations from HWE in > 50% of the populations were only found for the loci Hyti49, Hti62 and Heu76 (Supplementary Table S2). All deviations could be explained by the presence of null alleles (Supplementary Table S2), which are commonly encountered in lepidopterans 29 .
Pairwise differentiation (F ST ) between HEC populations was calculated with Arlequin (with 20,000 permutations to test for significance; populations containing < 4 individuals, i.e. GEW, SLO, POR and * LAM, excluded). We tested for correlation between genetic and geographic distance by performing a Mantel test implemented in IBDWS 3.23 41 using the F ST matrix computed by Arlequin, a geographic distance matrix calculated from the average population coordinates (Supplementary Table S1) by means of GDMG 1.2.3 42 , and 30,000 randomisations to test for significance of the correlation. Additionally, we computed a Mantel correlogram with 10,000 permutations and 49.4 ± 0.6 population pairs per distance class using PASSaGE 2 43 .
To provide a robust assessment of population structure in the HEC, we adopted two strategies: (1) A hierarchical AMOVA was performed using Arlequin to test the data support (based on an average over 12 loci) of a priori population structure hypotheses. According to the traditional species distribution ranges and H 1 , "H. euphorbiae" was defined to contain the populations IRA, KAZ, CAU, TUR, BUL, GRE, SGR, * CRE, HUN, SLK, GER, BEL, FRC, ESP, CSP, SSP, * GAL, NIT, * EIT, * CIT, * SIT, * SIC and * PAN and "H. tithymali" STU, * MOR, SMO, cLZ, cFV, cGC, cTF, cLG, cEH, cLP, MAD, SAA, FOG and YEM (Fig. 1). The populations * MAL and * TUN were excluded. It has been postulated for over ten years 14,15 that these populations are of hybrid origin, due to the discrepancy between conserved characters from adult morphology ("H. euphorbiae-look" of moths) and mitochondrial assignment (H. tithymali or a mixture including H. tithymali). The hypothesis for the a priori group definitions is thus based on other sources of characters than the ones tested, to avoid circular argument. (2) Analysis of the data without defining a priori groups. We performed Bayesian cluster analyses using Structure 2.3.2 44 and the R package Geneland 4.0.3 45,46 . Both searched for clusters that best fit the assumption of HWE and LE. We corrected for the presence of null alleles (see above) by coding them as recessive in Structure 47 and/or by means of null alleles frequency estimation along the clustering algorithm in Geneland 48 .
We assumed the admixture model with correlated allele frequencies in Structure 49 and uncorrelated allele frequencies in Geneland so as not to overestimate K 45 . The spatial clustering algorithm in Geneland was further provided with information of coordinates in decimal degrees (Supplementary Table S1) with an uncertainty value of 10 degrees (this large grid was chosen to avoid bias via coordinates, since the moths fly far). Both programs computed ten replicate runs for each value of K = 1-20 with 10 6 MCMC iterations of which the first 2.5 × 10 5 were discarded as burn-in (2.5 × 10 3 of the 10 4 sampled iterations in Geneland). We reran the Geneland computation with the same parameters except for K being fixed for the best two values inferred from the first computation (see below) to obtain more reliable estimates of individual cluster assignment 46 . The optimal number of clusters was inferred using the ΔK method of Evanno et al. 50 [rejecting K = 1 using mean lnP(D) values] for the Structure analysis and from the posterior probability distribution given by Geneland. Individual probabilities of cluster membership of the replicate run with the highest posterior probability were displayed as barplots with Distruct 1.1 51 for the optimal partition K = 2 and the second most probable partition K = 3 (see Results) to assess consistency of cluster boundaries and to reveal potential substructure (see e.g. Fontaine et al. 52 ). Geographic distribution of the K = 2 clusters was also illustrated using ArcGIS 10. In addition, we evaluated the variance of the data using a principal component analysis (PCA) as implemented in the R package Adegenet 1.3-6 53 , a method that is less prone to sample size bias than Bayesian cluster analyses.
Differentiation of Structure and Geneland clusters was inferred using a locus-by-locus AMOVA (as a weighted average over loci) using Arlequin. We assessed genetic diversity within the entire HEC and the Structure clusters by counting the numbers of alleles and computing allelic richness (mean of all loci) with Fstat 2.9.3.2 54 . Private alleles were summarised using Convert 1.3.1 55 and observed and expected heterozygosity (both summed over all loci) were calculated using Arlequin. For these analyses, all samples of a geographically defined population were collectively assigned to a cluster according to the predominant assignment of its individuals. MtDNA

Results
Mitochondrial phylogeography and demographic history. The network analysis of mitochondrial sequences corroborated the previously found division of the HEC into seven haplogroups 16 (Fig. 2). Additionally, a specimen each from Pantelleria (* PAN) and northern Tunisia (* TUN), both showing the local population's morphology, were found to carry a haplotype of the distinct Sardo-Corso-Balearic endemic H. dahlii (Figs 2 and 3a; the haplotype sequence is identical with H. dahlii GB Acc. No. AJ749458).
We detected three haplogroups, 'euphorbiae' , the less frequent 'enigmatica' and the very distinct 'melitensis' , in the entire Eurasian range of 'pure' H. euphorbiae ( 16 and uncorrected p-distances are given between them. Scientific RepoRts | 6:29527 | DOI: 10.1038/srep29527 YEM). Notably, 'enigmatica' is connected to the 'tithymali' haplogroup via 'robertsi' in the network (Fig. 2). The 'robertsi' haplogroup is restricted to Iran, IRA (in the current study; it may extend further east in unsampled populations), where it is mixed with 'euphorbiae' and 'enigmatica' (Fig. 3a) in the species H. robertsi (according to current taxonomy 19 , but H. euphorbiae according to our assumption, see introduction). In contrast, the range of H. tithymali (see above) and continental African putative hybrid populations (* MOR, * TUN) is predominantly occupied by a single haplogroup, 'tithymali' . This is absent from the range of morphologically 'pure' H. euphorbiae; its sole occurrence on the European mainland is in the putative hybrid populations of Central Italy (* CIT, * EIT).
Populations considered being of hybrid origin based on their intermediate larval morphology show remarkably different compositions of mitochondrial haplogroups ( Fig. 3a; populations marked with a * ). The north-western Spanish population (* GAL) was analysed with molecular methods for the first time in this study and was found to contain only 'euphorbiae' haplotypes, in contrast to its taxonomic status as H. t. gallaeci. Malta (* MAL) was corroborated as consisting of a mixture of haplotypes belonging to 'tithymali' and two haplogroups, 'euphorbiae' and 'melitensis' , that also occur in mainland H. euphorbiae. In contrast, the other Mediterranean putative hybrid populations are dominated by haplogroups that are equally very closely related to 'tithymali' and each other (Fig. 2): the Apennine Peninsula (* CIT, * SIT), Sicily (* SIC), and Pantelleria (* PAN) by 'italica' and the southern Aegean Islands (* CRE) by the endemic 'cretica' (Fig. 3a). In addition, 'italica' occurs more frequently and is more widespread in the African populations * MOR and * TUN than was previously found. Additionally, 'italica' were also found in the supposed non-hybrid H. tithymali deserticola of southern Morocco, SMO (Fig. 3a). A few individuals bearing 'italica' haplotypes were also detected in areas of Mediterranean Europe assigned to morphologically 'pure' H. euphorbiae (ESP, FRA, GRE, SGR). In turn, the H. euphorbiae haplogroups 'euphorbiae' , 'enigmatica' and 'melitensis' , show occasional occurrences among the dominant 'italica' in Southern Italy (* CIT/* PAN, * SIT or * SIC respectively).
Unimodal mismatch distributions of mitochondrial haplotypes and significantly negative neutrality test results unequivocally support a scenario of demographic and/or spatial expansion for the 'euphorbiae' , 'italica' , 'robertsi' , and 'tithymali' haplogroups, while support for this scenario is not clear for 'cretica' , 'enigmatica' and 'melitensis' (Table 1 and Supplementary Fig. S1). According to the Bayesian skyline plot, the median estimate of population size of the 'euphorbiae' haplogroup increased substantially at the onset of the Holocene from approximately 11,000 to 8,000 years before present (BP). The increase becomes significant, i.e. when the 95% highest posterior density (HPD) limits loose overlap with older estimates, at about 6,500 years BP (Fig. 4). The following comparably constant plateau appears as a slight decrease in the last ~3,000 years (Fig. 4), but is insignificant and may be an artefact due to pooled sampling from a structured population 56 . The median estimate of the 'tithymali' and 'italica' haplogroups started to increase considerably later at about 5,000 years BP (Fig. 4). The expansion becomes significant for 'tithymali' at about 3,000 years BP. It nearly becomes significant for 'italica' only after 1,000 years BP, but the 95% HPD limit overlap remains at every time, thus 'italica' only shows a trend of expansion. The skyline plot of 'enigmatica' at best reveals low population growth since coalescence at around 4,000 years BP, but due to the high overlap of the 95% HPD limits the population size could also have remained stable. In contrast, the population size of 'robertsi' is estimated to have remained nearly constant throughout the Holocene (Fig. 4).
Microsatellite marker analysis (in comparison with mtDNA). Nuclear genetic and geographic distances between populations are significantly correlated across the HEC's entire range (r m = 0.4276; p < 0.0000; Fig. 5a). Accordingly, 18.3 percent (i.e. r m 2 ) of the genetic divergence is explained by geographic distance. The Mantel correlogram reveals a continuous and approximately linear decrease of Mantel correlation, i.e. a decrease in genetic similarity, with increasing geographic distance (Fig. 5b).
The result of the hierarchical AMOVA (average over 12 loci) is that 9.9% (p < 0.00001) of the total variability in the dataset is found between the two groups of populations defined a priori, supporting the division into H. euphorbiae and H. tithymali (sensu H 1 ). Between populations within these groups we found 6.2% (p < 0.00001) of the total variability, i.e. less than between the two a priori defined groups. However, the greatest part of the variability in the dataset (83.8%) was found between individuals (all values; p < 0.00001).
The Bayesian cluster analyses of the nuclear microsatellite data concordantly reveal a most probable partition of the HEC into two clusters (Fig. 6), which are moderately differentiated (Structure partition: F ST = 0.091; Geneland: F ST = 0.102) and largely congruent between both analyses except for the populations * TUN and * MAL (Fig. 7a; see below). This differentiation is also (albeit not distinctly) reflected in the first component of the PCA (Fig. 7b) and in an increase in pairwise F ST values between populations assigned to the two different clusters (Supplementary Table S3). Remarkably, this differentiation is mainly attributable to only three loci with F ST values above the average (Hti50, Hti63 and Hti65), whereas it is substantially lower for the majority of loci and some values are even insignificant (Supplementary Table S2).
The geographic extents of the two clusters are largely congruent with the traditionally defined 'pure' ranges of the HEC's two main taxa, which is why we refer to them as H. euphorbiae (blue) and H. tithymali (red) (Fig. 3b) (Figs 3 and 7a). Accordingly, the two microsatellite clusters both contain a high number of individuals with private haplotypes (Table 2). Yemeni H. t. himyarensis (YEM) represent an unexpected exception to the congruence between microsatellite clusters and 'pure' taxa ranges, since they are predominantly assigned to the H. euphorbiae microsatellite cluster by both cluster analyses, in conflict with their morphological and mitochondrial affiliations (Figs 3 and 7a). Likewise, the assignment of individuals of the Iranian H. robertsi (IRA) (current taxonomy 19 , assumed to be part of H. euphorbiae in this study; see introduction) to the H. euphorbiae cluster contrasts with their partly 'tithymali'-related 'robertsi' haplogroup mitochondria. Remarkably, the diversity of both nuclear and mitochondrial markers is distinctly higher in the H. euphorbiae than in the H. tithymali cluster ( Table 2). The geographic position of the main divide between the two microsatellite clusters falls within the Mediterranean putative hybrid populations, but is inconsistent between the two cluster analyses (Fig. 7a). According to Structure, the African mainland (* TUN, STU * MOR, SMO) and Maltese (* MAL) populations are substantially admixed but predominantly assigned to H. tithymali. In contrast, Geneland assigns * TUN and * MAL to H. euphorbiae. The putative hybrid populations of Italy (* EIT, * CIT, * SIT, * SIC, * PAN), and the southern Aegean Islands (* CRE) also include an increased number of admixed individuals according to Structure (Fig. 3b). However, they are predominantly assigned to the H. euphorbiae cluster by Geneland. Mitochondrial 'tithymali'-related haplogroups 'italica' and 'cretica' (Figs 2, 3 and 7a) prevail in these populations. They thus show a considerable mito-nuclear discordance. The H. euphorbiae population of southern Spain (SSP) also shows a considerable number of admixed individuals according to the Structure result. In contrast, all individuals from the north-western Spanish H. t. gallaeci population (* GAL) are assigned to the H. euphorbiae cluster in both analyses, in congruence with mtDNA.
The results of the second most probable partition, K = 3, also deserve attention, since they reveal interesting substructure (Figs 5 and 6). The third cluster includes only the majority of individuals from the Mediterranean populations of putative hybrid origin (* CRE, * EIT, * CIT, * SIT, * SIC, * PAN, * LAM, * MAL, * TUN, * MOR). In the K = 2 result above, * TUN, * MAL have been assigned inconsistently to H. tithymali by Structure, and to H. euphorbiae by Geneland. * EIT, * CIT, * SIT, * SIC, * PAN, * CRE fall into H. euphorbiae by both analyses, whereby the latter is assigned to the third cluster only by Structure. A weak differentiation of the third cluster is also indicated in the second component of the PCA (Fig. 7b). Except for Malta (* MAL), the geographic extent of the third cluster in Europe according to Structure is congruent with the distribution areas of 'tithymali' and the related mitochondrial haplogroups 'italica' and 'cretica' (Fig. 7a). Remarkably, the mitochondrial haplogroups 'enigmatica' , 'robertsi' , 'italica' and 'cretica' are almost exclusively restricted to individuals in the microsatellite cluster representing H. euphorbiae (and the distribution range enclosed by these samples), whereas they are more closely connected to the main haplogroup of H. tithymali in the network of mitochondrial sequences ('tithymali'). These discordances demonstrate an incomplete separation of H. euphorbiae and H. tithymali. The underlying causes are probably different for each of the haplogroups. The patchy admixture of 'tithymali'-related 'enigmatica' with 'euphorbiae' indicates incomplete mitochondrial lineage sorting with a retained ancestral polymorphism in H. euphorbiae. This retained mitochondrial diversity in H. euphorbiae is accompanied by a higher nuclear diversity compared to the southern H. tithymali (Table 2), which thus contrasts with the general expectation of higher diversity in southern latitudes due to continuous residence during the glacial cycles 15,57 . In contrast to 'enigmatica' , the (nearly) exclusive prevalence of 'italica' (with a visible signal of recent growth) and 'cretica' in their respective geographically confined areas (* CIT, * SIT, * SIC, * PAN and/or * CRE) and their close relation to the 'tithymali' haplogroup suggest more recent introgressions of these mitochondrial haplogroups into H. euphorbiae (see Toews & Brelsford 27 and Discussion below).
Our data further indicate ongoing nuclear gene flow far beyond the Mediterranean populations of putative hybrid origin. Individuals assessed as admixed by Structure occur throughout nearly the entire Western Palaearctic HEC range (Fig. 3b) and the two clusters are hardly differentiated at the majority of microsatellite loci (Supplementary Table S2). Congruently, the Western Palaearctic HEC taxa lack diagnostic (i.e. fixed) morphological characters 20 and no clear or biologically meaningful phylogeographic structure has been found by means of ISSR markers 15 or nuclear genes 18 . Cumulative evidence points to the HEC constituting only a single gene pool, though structured into two main clusters, refuting H 1 (two species) and confirming H 0 . The populations studied should thus be classified as belonging to a single species according to the biological species concept. Although a division into two clusters corresponding to H. euphorbiae and H. tithymali is well supported by the microsatellite   Accepting the Western Palaearctic HEC as one biological species is not inconceivable, given that HEC moths are considered strong fliers 15,21 and no postzygotic reproductive barriers have been found in hybridisation experiments between different HEC taxa (see Hundsdoerfer et al. 14 ). Since the microsatellite clusters representing the HEC's two main Western Palaearctic entities are fairly well allocated between the African and Eurasian continents (Fig. 3b), the major force that probably caused and maintains this differentiation is the geographical divide posed by the Mediterranean Sea. However, this barrier is not impermeable, as indicated by the high amount of admixture in the nuclear genome on both sides at the narrowest sea straits and particularly by the dominance of 'tithymali'-related haplogroups in parts of Southern Europe.
Moreover, we found evidence for mitochondrial introgression from H. dahlii, the spurge hawkmoth species endemic to Sardinia, Corsica and the Balearic Islands, which challenges the alleged effective reproductive isolation of this morphologically and genetically more distinct taxon 14,17,18,21 . Hyles dahlii individuals have been reported sporadically from the surrounding mainland coasts and interpreted as rare dispersal events 14,17 . However, the detection of individuals carrying a H. dahlii haplotype in the established HEC populations * PAN and * TUN suggests that dispersal across the Mediterranean Sea and subsequent hybridisation may be more common than presumed. Detecting genetic traces of such hybridisation should be rather unlikely, given the small chance that rare migrants' genes can establish themselves in an established population due to the principle of 'high density blocking' 58 . These events probably contribute to the morphological variability in the HEC.
Evolutionary history of the two subgroups. The phylogeography of European biota, including mobile lepidopteran species, generally reveals well preserved spatial genetic patterns resulting from differentiation in isolated glacial refugia on the Mediterranean peninsulas and subsequent postglacial re-colonisation 2,4,5 . The HEC data obtained do not corroborate any of these paradigms of distinct entities expanding at the beginning of the current interglacial and forming narrow contact and/or hybrid zones upon secondary contact in Europe. In contrast, H. euphorbiae has three ('euphorbiae' , 'enigmatica' , 'melitensis') mitochondrial haplogroups (Fig. 3a) occurring throughout its range from the Iberian Peninsula to Central Asia, except for Italy and the Aegean Islands, which are occupied by the two geographically confined haplogroups 'italica' and 'cretica' (see also Discussion below). Massive interglacial gene flow could explain why the known postglacial re-colonization paradigms 4,5 are not observed (see Petit 59 ) and mitochondrial haplogroup diversity is rather maintained throughout the range of 'pure' H. euphorbiae. The data denote a scenario by which the organisms retreated into a 'glacial refuge belt' in the Circum-Mediterranean area with local differentiation and simultaneous ongoing gene exchange. In this scenario, moths would fly between refugia of favourable climatic conditions, maintaining gene flow and thus the geographic distribution of the mitochondrial diversity within this region. The recolonisation of Central Europe would have occurred by the northward expansion of moths carrying this mitochondrial mixture.
Assessments of past population size dynamics clearly indicate an expansion of the 'euphorbiae' haplogroup, roughly dated to the early Holocene (Fig. 4). Thus, this signal likely predominantly reflects the lineage's re-colonisation of more northern European latitudes enabled by postglacial warming 4,60 . In contrast, the Bayesian skyline plot of 'enigmatica' indicates at best much lower population growth, in line with the fact that it is mostly at a lower frequency than the dominant haplogroup in the populations in which it occurs. The 'enigmatica' haplogroup is currently most frequent in some Italian (* EIT) and Central European (SLK) populations, but has no core area. Our augmented dataset further reveals that 'melitensis' , highly divergent from 'euphorbiae' , is not endemic to Malta as previously assumed 16 but also occurs on Sicily (* SIC) and the European mainland (GER, SLK, SLO). Thus, conclusions about the places of origin of these haplogroups cannot be drawn unequivocally. Due to the principle of 'high density blocking' 58 , 'euphorbiae' , 'enigmatica' and 'melitensis' must all have occurred in founder individuals coming from the 'glacial refuge belt' (see above), and simultaneously occupied the new habitats in Central Europe. The larval food plants of the genus Euphorbia need open habitats, which would have increased with the spread of Neolithic culture from 6,500 years B.P. onwards 61 . As postulated before 15 , cultivated steppe would have replaced woodland allowing both foodplants and moths to thrive. Interestingly, the timing  Table 2. Summary statistics of genetic variation in the entire HEC and microsatellite clusters. Samples of populations were collectively assigned to one of the K = 2 Structure clusters (see Fig. 7a  corresponds exactly to the significant increase of the population expansion signal of 'euphorbiae' (about 6,500 years BP; Fig. 4). The currently observed genetic admixture between H. tithymali and H. euphorbiae and the high dispersal potential of the HEC leads us to postulate a broad 'admixture belt' that oscillated north and south with climate changes 16 . Besides Southern Europe, the Maghreb also often served as a source for the postglacial re-colonisation of Europe by numerous other species 57 (also see Discussion below). The high proportions of ancestral assignment to the H. euphorbiae microsatellite cluster of many African samples (Fig. 3b) and the prevalence of intermediate morphotypes in the northern Maghreb 17,20 (Fig. 1) suggest that at least parts of the nuclear identity of the H. euphorbiae entity also introgressed into the African HEC populations. Only the Canary and Cape Verde Islands and possibly unsampled (there are recent records of H. tithymali from southern Algeria, the Tibesti Mountains in northern Chad 62 and Mauritania) and/or now extinct southern peripheral populations of the once 'green' Sahara 60,63 are and/or may be left as 'purest' populations of the H. tithymali entity. Thus, the nuclear data denote an area far smaller than that suggested by the exclusive occupation of the entire African range to the Arabian Peninsula by the 'tithymali' (and closely related 'italica') mitochondrial haplogroup 16 . Patterns of admixture of the two subgroups. Southern Europe. Extensive mito-nuclear discordance as observed in the Italian HEC populations (* CIT, * SIT, * SIC, * PAN) is a common result of introgressive hybridisation 27 . Bayesian estimations of demographic changes indicate an expansion of the 'italica' haplogroup considerably later than the expansion of 'euphorbiae' (Fig. 4). Furthermore, it has to be noted that the more southern HEC populations may produce more than the assumed two generations per year, which would yield an even more recent time for the 'italica' expansion. Consequently, this signal probably reflects a replacement of previously present H. euphorbiae haplogroups ('euphorbiae' , 'enigmatica' , 'melitensis) in Italy by 'italica' . This is corroborated by an analysis of museum specimens, which showed that this replacement was still in progress during the last century 12 . Furthermore, it is not yet complete, as revealed by the sporadic findings of H. euphorbiae haplogroups throughout Italy. Accordingly, 'euphorbiae' on Malta is not as disjunct from the mainland as previously thought 16 . However, we could not resolve the initial timing of the introgression of 'italica' into the European H. euphorbiae or the haplogroup's geographic origin. Its present-day main occurrence would suggest that it initially separated from the 'tithymali' haplogroup only after it became isolated in Italy. However, the more frequent and widespread occurrence of 'italica' in the Maghreb (dominant in SMO, but also in * MOR and * TUN) than previously found 16 , rather indicates an African origin and a more recent arrival and spread in Italy. Given that 'italica' was at a frequency below 60 percent of the total Italian population only a hundred years ago 12 , it can likely be assumed to have been at a much lower frequency still, or even absent from Italy at the beginning of its estimated expansion a few to one thousand years ago (Fig. 4).
Interestingly, the evolutionary history of the Aegean Islands population (* CRE) is suggested to parallel the Italian one. A similar mito-nuclear discordance with a mitochondrial haplogroup closely related to 'tithymali' is observed. Unfortunately, the small sample size of the 'cretica' haplogroup, which exclusively dominates the Aegean Islands today, does not allow for a conclusive assessment of its past demographic evolution. However, an analysis of museum specimens revealed that 'cretica' had once also been admixed with 'euphorbiae' as far south as Crete 13 . Thus, it cannot be ruled out that 'cretica' may not be endemic to the Aegean Islands but possibly also have originated and still be present in Africa, like 'italica' . However, more samples, e.g. from the relict HEC population in western Egypt 17 (including museum material to go back in time), are needed to investigate this idea.
Mitochondrial introgression from H. tithymali-related haplogroups into the Italian (* PAN, * SIC, * SIT, * CIT) and Aegean Islands (* CRE) populations was evidently accompanied by a parallel nuclear introgression. This is indicated by the geographically congruent prevalence of intermediate morphotypes 20 (Fig. 1) and a considerable number of specimens that feature a substantial proportion of ancestral assignment to the H. tithymali microsatellite cluster according to Structure (Figs 3b and 7a). These populations' predominant assignment (* PAN, * SIC, * SIT, * CIT, * CRE, together with individuals from * EIT, * MAL and * TUN) to the separate third Structure cluster at K = 3 may also be attributable to this introgression (Fig. 7a). The assignment of the Aegean Islands population (* CRE) to H. euphorbiae by Geneland conflicts with the assignment to the third cluster by Structure. This is likely explained by Geneland taking geographic coordinates into the calculation into account and thus weighting the result to rather group individuals from closer geographic vicinities. Crete lies nearer to Greece (H. euphorbiae) than to the abovementioned populations in the third cluster.
Cluster algorithms are prone to divide a continuum of gradual differences into distinct clusters 64,65 . Thus, the third partition (and the incongruent assignment of * MAL and * TUN at K = 2) could reflect an artefact caused by allele frequency gradients between the genetically more distinct H. tithymali and H. euphorbiae populations in Macaronesia and Eurasia, respectively. A linear decrease of the Mantel correlation with geographic distance (Fig. 5b) also suggests that spatially directed clines-which could arise from a dynamic introgression as discussed above 7 -primarily account for the Mantel correlation 66 . However, the interpretation of Mantel tests and correlograms is highly controversial 64 . It cannot be ruled out that the individuals in the third cluster may in fact have an incipient separate identity, since a very weak differentiation is also suggested by the PCA (Fig. 7b). Animal hybrid swarm differentiation has been shown to occur more frequently than previously presumed 3,6 , including examples that evolved similarly from the contact between northern and southern taxa in Italy (sparrows 67 , salamanders 9 ). Nevertheless, it yet remains a matter of speculation for the HEC, since the already low genetic separation of the two parental entities with our marker set inevitably cannot provide confirmation of a possible hybrid swarm differentiation.
Middle East. In the Middle East, suitable HEC habitats were probably well spread across the 'green' Sahara and Arabian Peninsula until deserts expanded a few thousand years ago 60,63 and isolated African, Yemeni and Eurasian HEC populations from each other 17,20 . Mito-nuclear discordance with assignment to the H. euphorbiae microsatellite cluster and the 'tithymali' mitochondrial haplogroup is observed in Yemen and accordingly with the 'tithymali'-related 'robertsi' in Iran. However, the lack of an expansion signal for the Iranian 'robertsi' during the entire Holocene (Fig. 4) implies that this pattern did not evolve by a similarly recent dynamic introgression as for 'italica' . As a further difference, Iranian samples are not morphologically intermediate: morphotypes show high similarities with typical H. euphorbiae 17,20 in agreement with the samples' microsatellite cluster assignment. In contrast, the morphology of Yemeni specimens strongly resembles H. tithymali 17,20 , whereas their predominant nuclear genetic assignment is to H. euphorbiae. The adjacent Saudi Arabian population has morphologically been assigned to H. euphorbiae 17 (Fig. 1) but also features 'tithymali' mitochondria (Supplementary Table S1) further indicates complex admixture patterns in this region. However, a considerably larger sample size is needed for a more robust assessment of the population genetic structure and its history in this region.
Northwestern Iberia. Finally, the Galician population (* GAL) had also been considered a hybrid population based on morphology. Especially the frequent occurrence of horizontally connected upper eyespots in the larvae, a character frequently found on the western Canary Islands 20 , suggests that the population could have been established by occasional H. tithymali vagrants or even to represent a relict refuge of H. tithymali 68 . However, we found no traces of H. tithymali in either molecular marker. This may be due to the low sample size and the already low differentiation of the two microsatellite clusters. Alternatively, colour pattern similarities may not always reliably reflect genomic evolutionary relatedness of populations 26,69 and thus potentially misguide taxonomic inference 16 . The specific larval colour pattern may be caused by a coincidental combination of alleles of genes involved in complex, multi-locus pattern coding 16 (also see Harbich & Hundsdoerfer 70 ) that became frequent in Galicia and on the western Canary Islands, independently of one another.
Environmental factors potentially affecting phylogeographic patterns. In accordance with the majority of empirical studies, computer simulations reveal that introgression of selectively neutral markers (and especially uniparentally inherited organelle DNA) almost exclusively occurs from the local to the invading taxon due to demographic disparity 71 . However, the opposite is indicated by the 'italica' skyline plot (Fig. 4) and an increase in this apparently invading haplogroup, shown to have occurred during the last century 12 . This could happen under specific conditions such as disassortative mating, unequal hybrid fitness, or sex-biased dispersal 27 , each in favour of 'italica' bearing females. However, these hypotheses lack empirical evidence and appear implausible. A possible infection with Wolbachia 16 that favoured the spread of a mitochondrial lineage 72 should be analysed in more detail. Alternatively, however, there is growing evidence that mitochondrial variants can be adapted to environmental variables such as temperature gradients 27,73,74 and computer simulations show that this can result in phylogeographic structure 75 . Accordingly, the historic increase of 'italica' and fluctuations of its northernmost occurrences correlate well with warming summer temperatures during the last century 12 . Likewise, the estimated expansions of 'italica' and 'tithymali' since the late Holocene (Fig. 4) coincided with a linear increase of temperatures in the Mediterranean during the last 8,000 years 76 , suggesting a selective advantage at higher temperatures.
In contrast to dispersal of 'tithymali' and related mitochondrial haplogroups across the wide Strait of Sicily and Libyan Sea into Southern Europe, this was not observed across the comparatively narrow Strait of Gibraltar, although southern Spain has a potentially suitable climate for these haplogroups 16 . However, we did record nuclear influx from H. tithymali in SSP (Figs 3b and 7a). The Gibraltar Strait forms a strong barrier to gene flow for some bat species but not others, not correlating with the species' dispersal capabilities 77 . Nevertheless, hawkmoths can be carried by winds > 5 m/s 78 and mitochondrial introgressions into Southern Europe could have happened quite recently, so we hypothesise that current wind regimes could have helped shape the genetic structure. Prevailing Levante and/or Poniente winds, which blow east/west ( Supplementary Fig. S2a), would impede gene flow across the Gibraltar Strait effectively, whereas the Sirocco appears to promote dispersal across the Central Mediterranean Sea ( Supplementary Fig. S2b), as reported for the HEC 79 . Unlike neighbouring populations (* TUN, * PAN, * LAM, * SIC), 'italica' does not occur on Malta (* MAL). The considerably different mitogenetic composition can be ascribed to the island's geographic isolation 10 , especially in respect to prevailing winds ( Supplementary Fig. S2b). However, wind regimes apparently do not influence gene flow in the same way throughout the range of the HEC. Strong north-eastern trade winds blow almost permanently across the direct route between the African mainland and the Canary Islands. They also blow from north-east to south-west between the individual islands of this archipelago, but their HEC populations show very low overall genetic differentiation.

Conclusions
A locally varying interplay of geographic barriers and climate variables, as well as incomplete lineage sorting and introgression, have shaped a truly complex phylogeography of this dispersive hawkmoths species complex. Our study highlights the importance of comprehensive sampling and analyses of multiple markers for a conclusive evaluation of a species' evolutionary history. The data imply that the populations studied constitute a single gene pool, and thus belong to one biological species. Nevertheless, a division into two subgroups is predominant. Our data put forward a new pattern of a 'glacial refuge belt' with both local differentiation and ongoing gene exchange. A subsequent oscillating 'admixture belt' is postulated to follow climate fluctuations. The dominance of a distinct mitochondrial haplogroup in the Italian HEC would generally be considered to have evolved from long-term processes of vicariance during glacial periods and to have been static since the early Holocene 2,5 . However, our data suggest that it could have evolved more recently. The HEC potentially adds to the few examples of mtDNA variant adaptation to climate gradients known so far that challenge the general assumption of neutrality for this most commonly used marker in phylogeography 1 and molecular taxonomy 80 .