Altitude and hillside orientation shapes the population structure of the Leishmania infantum vector Phlebotomus ariasi

Despite their role in Leishmania transmission, little is known about the organization of sand fly populations in their environment. Here, we used 11 previously described microsatellite markers to investigate the population genetic structure of Phlebotomus ariasi, the main vector of Leishmania infantum in the region of Montpellier (South of France). From May to October 2011, we captured 1,253 Ph. ariasi specimens using sticky traps in 17 sites in the North of Montpellier along a 14-km transect, and recorded the relevant environmental data (e.g., altitude and hillside). Among the selected microsatellite markers, we removed five loci because of stutter artifacts, absence of polymorphism, or non-neutral evolution. Multiple regression analyses showed the influence of altitude and hillside (51% and 15%, respectively), and the absence of influence of geographic distance on the genetic data. The observed significant isolation by elevation suggested a population structure of Ph. ariasi organized in altitudinal ecotypes with substantial rates of migration and positive assortative mating. This organization has implications on sand fly ecology and pathogen transmission. Indeed, this structure might favor the global temporal and spatial stability of sand fly populations and the spread and increase of L. infantum cases in France. Our results highlight the necessity to consider sand fly populations at small scales to study their ecology and their impact on pathogens they transmit.

Bayesian clustering. Discriminant analysis of principal components (DAPC) identified 40 clusters with a mean assignment probability P Ass = 0.8568. Twenty individuals were grouped in one strongly differentiated cluster (cluster 21 with P Ass = 1) (Fig. 1). Moreover, two outliers (one from cluster 2 and one from cluster 28) were close to cluster 21. Bayesian Analysis of Population Structure (BAPS) found 28 clusters (probabilities for number of clusters = 0.93641). A cluster of 22 individuals (BAPS cluster 28) included the 20 individuals from DAPC cluster 21 and the two outliers highlighted by DAPC. The optimal number of clusters found by STRU CTU RE HAR-VESTER (used to visualize the results of STRU CTU RE analysis) was two with ΔK = 16.03 (the second biggest ΔK was 2. 55). Surprisingly, the clusters found by STRU CTU RE did not match the DAPC or BAPS results. These two clusters grouped individuals from several stations with no obvious relation with any ecological or geographical parameter, and with a very small average assignment probability of individuals to their cluster (P Ass = 0.53683). The partition found by STRU CTU RE and STRU CTU RE harvester with K = 2 is probably meaningless.
Cytb fragment sequencing of seven individuals from cluster 21 (# 12, 17, 18, 19, 856, 872, and 874) and one of the two outliers (# 23, cluster 2) showed 99-100% similarity with Ph. ariasi (GenBank accession number: KP685539.1, and sequences in Supplementary File 1). Therefore, the taxonomic status of the 22 outliers could not be elucidated. These 22 individuals (Group A in Supplementary Table S1) were all males captured in three stations (ST02, ST11 and ST12). Neither particular environmental condition nor specific morphological character was  Table S1). The dendrogram (NJTree) seems to exclude two subsamples A from all other subsamples, while a third one appeared fully included within group B (Fig. 2). This structuring appears to be robust since the same tree was obtained with 6 (Aria2, Aria3, Aria4, Aria5, Aria13, and Aria14, loci selected after Linkage Disequilibrium (LD) and F-statistics analyses, see below) or 10 loci (excluding Aria1). Except when specified otherwise, the 22 individuals of Group A were removed from the analyses to prevent Wahlund effects.
Locus selection for the analyses of Group B sand flies. The Aria1 locus was almost monomorphic (H S = 0.06) and was removed from the data. Using regression approach we detected a marginally not significant Short Allele Dominance (SAD) for the Aria14 locus (p value = 0.0507). As SAD results from a preferential amplification of the shortest allele in heterozygous individuals 28 , the Aria14 microsatellite profile of each homozygous individual was checked again following recommendation suggested in De Meeûs et al. 29 , corrected, and then analyzed using GENEMAPPER 4.0. After correction, SAD could not be detected on this locus any longer (p value = 0.0731).
The Micro-Checker analysis suggested stutter artifacts at five loci (Aria2, Aria3, Aria10, Aria14, and Aria15). The unilateral exact binomial test indicated that the proportion of significant stuttering was not significantly higher than the expected proportion under the null hypothesis for Aria2 (p value = 0.2078), Aria3 (p value = 0.2078) and Aria14 (p value = 0.5819). Nevertheless, it was marginally not significant for Aria10 (p value = 0.0503) and highly significant for Aria15 (p value = 9.728e−06). Therefore, alleles close in size were pooled, avoiding pooling together only rare alleles (i.e., presence of at least one frequent allele in the pool) for these two loci as recommended 29 . As the Aria10 locus became monomorphic after pooling, this locus was removed from the analysis. For Aria15, alleles 117, 119 and 121 were pooled with allele 123; allele 127 was pooled with allele 125; and allele 131 was pooled with allele 129. However, after correction, the Micro-Checker analysis and the unilateral exact binomial test (p value = 9.728e−06) highlighted that stuttering still affected this locus. Therefore, Aria 15 was also removed from the analysis.
On the remaining 9 loci, the linkage disequilibrium (LD) analysis indicated that 7 locus pairs (out of 11) displayed a significant LD (19.4%). Three of these locus pairs (Aria11 and Aria 12, Aria 11 and Aria 13, Aria 12 and Aria14) remained significant after Benjamini and Yekutieli adjustment. Aria11 and Aria12 displayed outlier profiles with strongly negative F IS (Fig. 3), above average F ST (Fig. 4), large F IS and F ST variance (Figs. 3,4), and strong LD. These results suggested the non-neutrality of these loci. Therefore, Aria11 and Aria12 were also removed from the analysis.
To check the result stability, additional DAPC analyses were performed by successively removing the following loci Aria1, Aria10, Aria11, Aria12, and Aria15. As long as Aria1 was kept, the obtained pattern was the same as in Fig. 2 (data not shown). With the remaining six loci (Aria2, Aria3, Aria4, Aria5, Aria13, and Aria14), DAPC and BAPS provided no clear structure. This suggested that the distinction between group A (the 22 outliers) and group B (the 1,177 remaining individuals) only relied on the Aria1 locus (Supplementary Table S1).    . Deviation from the genotypic proportions expected for panmixia as measured by F IS in Phlebotomus ariasi for each microsatellite locus and averaged across loci. For each locus, the 95% CI values for subsamples obtained with the jackknife over subsamples is represented with dashes. The 95% CI values for the average, with 10 and 6 loci, were obtained by 5,000 bootstraps over loci. The results of tests for panmixia, short allele dominance, and number of subsamples with stuttering for each locus are also indicated.   Table S1). The F IS was bigger in the complete dataset than in Group B alone (F IS = 0.079 and F IS = 0.057, respectively), and the difference was significant (p value = 0.0486 unilateral Wilcoxon signed rank test). This translated into a Wahlund effect when Group A individuals were included in the data. These individuals were thus kept excluded.

Regression approach.
A Principal Component Analysis (PCA) was undertaken with PCAGen (developed by J. Goudet, freely available at https ://www2.unil.ch/popge n/softw ares/pcage n.htm). The first two axes were significant using the broken stick criterion but not by permutation testing (p value = 0.2187 for axis 1 and p value = 0.1045 for axis 2) (see "Methods" section). Axis 1 and axis 2 represented 31% and 23% of the total inertia, respectively. We undertook two generalized linear model (glm) with the coordinates of subsamples at these axes (see "Methods" section). For the first axis, the minimum model, after a stepwise procedure, was: axis1 ~ hillside + latitude + longitude + latitude:hillside. For this axis, none of the included variables played a significant role (p value > 0.05 for all tests). For axis 2, the minimum model was: axis2 ~ hillside + altitude + latitude + longitude. Altitude and hillside played a significant role (p values < 0.001) and represented 51% and 15% of the total deviance, respectively (Table 1). Isolation by altitude distance. As showed above, latitude and longitude had a weak effect on the genetic structure. Geographic parameters were thus removed from the model and isolation by altitudinal distance was tested using the Mantel test for each hillside separately (South-East and North-West, two tests).
When individuals were grouped by altitude levels (see Table 2 and "Methods" section), isolation by altitude distance was significant for the Northern (p value = 0.00605) and marginally non-significant for the Southern hillside (p value = 0.07345). When combined with the generalized binomial procedure 31 , computed with the MultiTest V1 32 , isolation by altitude appeared highly significant (p value = 0.0054).
Effective population sizes and migration. The average effective population size (N e ) was 69 individuals (range: 39 to 98) across subsamples and methods (see "Methods" section). N e was not correlated with altitude (Spearman's ρ = 0.2216, p value = 0.7864). There was no effect of hillside (Kruskal-Wallis p value = 0.6242). Nei's unbiased estimator of genetic diversities 33

Discussion
Locus selection. In this study, using 11 microsatellite markers, we investigated the genetic structure of Ph.
ariasi populations in 1,253 individuals collected in 17 stations in the South of France (Supplementary Table S1). Different analyses (Micro-Checker, LD, F IS and F ST variance) allowed identifying loci with technical problems and/or loci that may not be neutral concerning natural selection. Consequently, we removed five of the eleven loci for various reasons including the presence of incurable stutter artifacts (Aria10 and Aria15), absence of polymorphism (Aria1), or non-neutral evolution (Aria11 and Aria12). One locus with SAD could be corrected.
presence of outlier individuals. The Bayesian analyses (DAPC and BAPS) for all remaining loci (Aria2, Aria3, Aria4, Aria5, Aria13, and Aria14) confirmed the 22 outliers (Group A). These outliers were morphologically similar to Ph. ariasi and displayed 99-100% cytb sequence similarities. A previous study, based on chromatographic analysis of cuticular components, provided evidence that there are two distinct Ph. ariasi populations in our study area: one predominantly sylvatic, and the second one domestic 19 . However, in our study, there was no geographical distribution difference between individuals from Group A and Group B.  40 ). However, the distinction between Group A and Group B mainly depends on a single genetic marker (Aria1), which is fixed for different alleles in each group, while the other markers display a rather weak, though significant, signal. This result is supported by the dendrogram as the same tree was obtained with 6 or 10 loci (excluding Aria1). Additional molecular and biological studies will be necessary to test the cryptic species hypothesis (vector competence, interbreeding, hosts and niche preferences, behavior, etc.). As the taxonomic status of Group A individuals could not be elucidated, they were removed from the analyses.
Genetic structuring in ecotypes. We observed an important and significant heterozygote deficit, instead of the heterozygote excess expected for dioecious populations with random mating 41 . We found no evidence of Wahlund effect with the LD-based method described by Manangwa et al. 42 . Consequently, the heterozygote deficit could (non-exclusively) be explained by (1) null alleles; (2) allelic dropout; or (3) positive assortative mating. In the last case, this would require a mutual attraction of sexual partners based on a sufficient proportion of the genome to allow the hitchhiking of microsatellite markers, which are theoretically non-coding DNA sequences.
The glm approach showed a strong influence of altitude and hillside, and a weak (if any) influence of geographic distances on the genetic data. The proportion of deviance (67%) explained by altitude and hillside suggested the existence of ecotypes in this Ph. ariasi population.
The very strong migration rate estimated in our study (more than 25% of the effective subpopulation size) is hard to reconcile with the emergence of genetically distinct ecotypes. Different scenarios can explain ecotype structuring despite the strong migration rate: (1) the death of most immigrants before they can reproduce (unlikely scenario), and (2) significant assortative mating that can also explain the heterozygote deficit observed in our data (see above). The rate of codominant assortative mating based on genes homogeneously distributed in the genome can be approximated with the same equation as for selfing, as described in Hartl et al. 43 (page 272), with the equation a ≈ 2F IS /(1 + F IS ). With a F IS = 0.057, the assortative mating in our population would be ≈ 0.11 (≈ 11% of zygotes produced).
It is worth noting that previous research on the same species in the same study area highlighted differences in wing phenotypes according to altitude and hillside 25 . It has been demonstrated that wing configuration is associated with wing beating frequency and mate recognition 44,45 . These features would lead to a preferential choice of partners that could explain assortative mating and ecotype structuring. More studies are necessary to investigate the link between phenotypic (wing configuration) and genetic (microsatellite) diversity.   47,48 . However, due to the low flying capacities of sand flies 46,49 , migrants are expected to disperse mainly over short distances. Therefore, the spread and increase of L. infantum cases in France might be mainly due to the movement of infected hosts.
This study demonstrates for the first time that Ph. ariasi presents a genetic structure in ecotypes. These data highlight the necessity to consider sand fly populations at small and specific scales to determine their ecology and its impact on Leishmania transmission. This structure may explain the long-term stability of sand fly populations. Not many papers compare available clustering algorithms to date and it is thus hard to really understand when and why such algorithms will converge or diverge in the best partition they offer. For some attempt comparisons see: Latch et al. 50 42 .
This type of study needs to be extended to other sand fly species. Indeed, the large diversities of sand fly populations worldwide may correlate with different ecological vector capacities and sensitivity to control measures.

Methods
Study area. The field study was performed in the South of France, on the "massif de l'Oiselette" hill situated between the "Hérault" (Ganges, Hérault) and "Arre" (Le Vigan, Gard) valleys. Sand flies were sampled along a 14 km transect from the "Saint Julien de la Nef " to "Le Vigan" villages, including "Roquedur-le-haut" (at 601 m above sea level) ( Fig. 5; Table 2). This region has a Mediterranean sub-humid climate 55 , and is characterized by the presence of plant species typical of scrubland habitats 5 .
The study area was divided in two hillsides with a South-East and North-West orientation, like in previous works performed in this area 3,24,25 . Stations were selected based on the paper of Rioux et al. 3 , first to allow the comparison with the data obtained 30 years ago in terms of species distribution, density and abundance 24 , second because these stations were distributed along the 14 km transect and represented the altitude and hillside diversity necessary for our population genetics study. Station 7 and Station 8 were not present in Rioux et al. 3 . These two stations are transitional between the two hillsides, and were thus added in the present study to obtain information on the consequences of transitional ecosystems. The stations were grouped according to altitude and hillside (see Fig. 5; Table 2). This area is characterized by the presence of various domestic animals, such as chicken, sheep, ducks, geese, horses, rabbits, cats, dogs, and also many different wild animals (wild boars, foxes, rodents, lizards, birds, etc.). These animals represent potential sand fly hosts. Moreover, cases of canine leishmaniasis were observed during the collection period (J.P. personal observation).  Table 2) with a mean of 189 sticky traps per station, in various biotopes, inside and around human dwellings and animal sheds, close to the vegetation, and in wall crevices. Each trap was collected after 2 days. In total, 1,253 sand flies were captured and transferred individually into 1.5 mL Eppendorf tubes with 96% ethanol and labeled. Prior to mounting, the sand fly head, genitalia and wings were removed. Heads and genitalia were cleared in Marc-André solution (chloral hydrate/acetic acid) and mounted in chloral gum 46 . Each individual specimen was identified on the basis of the morphology of the pharynges and/or male genitalia or female spermathecae, using the keys of Abonnenc 46 , Lewis 6 and Killick-Kendrick et al. 58 .
DNA extraction. Sand fly DNA was extracted using the Chelex method 59  Data analysis. Microsatellite raw data were formatted for CREATE 62 that allowed their transformation into the formats needed for the different analyses. Samples with more than 50% of missing data were removed from the analyses.
Bayesian clustering. Several Bayesian clustering analyses were carried out. To validate the sand fly species identification, the first analysis was based on the 11 selected loci and included all sand fly samples. Then, to study the organization of Ph. ariasi populations, data were analyzed at different scales: hillside, altitude, and station. This analysis was performed with the loci selected after Linkage Disequilibrium (LD) and F-statistics analyses (see "Results" section).
As the clustering method accuracy can vary depending on the statistical properties of specific software programs and datasets 29,42,63 , three different Bayesian clustering methods were used. First, a discriminant analysis of principal components (DAPC) 64 was performed using the adegenet 65 package for R 66 . This was followed by a Bayesian Analysis of Population Structure (BAPS; admixture model 67,68 , maximum number of clusters: 35, repetition: 50; that is freely available at https ://www.helsi nki.fi/bsg/softw are/BAPS/), and a STRU CTU RE (version 2.3.4) analysis 69 (burning period: 10,000, number of clusters from 1 to 35, with the admixture model). STRU CTU RE HARVESTER vA.2 70 was used to visualize the STRU CTU RE analysis results, examine the ad hoc ΔK statistic, and determine the optimal number of clusters.
Finally, a dendrogram (NJTree) was built with MEGA 7 71 from a Cavalli-Sforza and Edward's chord distance (D CSE ) 72 matrix as recommended 73 , between subsamples defined as combinations of the group obtained by Bayesian clustering and sampling stations. Because null alleles were suspected to occur, D CSE was computed with the INA correction with FreeNA 36 , after recoding missing data into null homozygotes following authors' recommendation.
Linkage disequilibrium and F-statistics. LD between each locus pair was tested with the G-based permutation test with 10,000 randomizations. This test was performed with F STAT 2.9.4, an updated version of F STAT 2.9.3 74 available at https ://www.t-de-meeus .fr/ProgM eeusG B.html. This procedure is the most powerful for testing LD across different subsamples 32 . There are as many p values as locus pairs. Then, the False Discovery Rate (FDR) correction for multiple non-independent tests described by Benjamini et al. 75 77 . Significant departure from 0, for F statistics, was tested by randomizing alleles between individuals within subsamples (deviation from the local random mating test) or individuals between subsamples within the total sample (population subdivision test). The p value corresponded to the number of times a statistic measured in randomized samples was as big as (or bigger than) the observed one (unilateral tests). For local panmixia, the statistic used was f (Weir and Cockerham's F IS estimator). To test for subdivision, the G-based test 78 was used. According to De Meeûs et al. 32 , the G-based test is the most powerful procedure when combining tests across loci.
The 95% confidence intervals (CI) of F-statistics were computed using the jackknife over populations method for each locus or 5,000 bootstraps over loci for the averages, as described in De Meeûs et al. 79 . Parameter estimates, testing, jackknife and bootstrap computations were done with F STAT 2.9.4.
The determination procedure described by De Meeûs 30 and Manangwa et al. 42 was used to discriminate demographic from technical causes of significant homozygote excess and LD. In the case of null alleles, F IS and F ST artificially increase and a positive correlation is expected between the statistics, F IS standard error is at least twice that of F ST , and a positive correlation is also expected between F IS and the number of missing data (putative null homozygotes). Correlations were tested with the unilateral Spearman's rank correlation test in R 3.5.1 66 . The frequency of null alleles was assessed with Micro-Checker v2.2.3 80 using Brookfield's second method 81 .
The presence of stutter artifacts at each locus in each subsample was evaluated with Micro-Checker v2.2.3 80 . A unilateral exact binomial test was used with R 3.5.1 66 to determine whether the observed proportion of significant stutter artifacts was greater than the expected 5% under the null hypothesis.
Short Allele Dominance (SAD) was assessed with the method described by Manangwa et al. 42 . The correlation between allele size and F IT was tested with the unilateral Spearman's rank correlation test in R 3.5.1 66 . In the case of SAD, a negative correlation is expected between allele size and F IT 42 . In the case of significant stutter artifacts or SAD, the incriminated loci were corrected using the method described by De Meeûs et al. 29 . Stuttering was addressed by pooling alleles close in size. To avoid a spurious increase of heterozygosity, each pooled group contained at least one frequent allele (e.g., with p value ≥ 0.05). SAD was addressed by going back to the chromatograms of homozygous individuals and trying to find a larger size micro-peak that could have been missed in the first reading.
In some instances, F IS values were compared between subsample groups using the Wilcoxon signed rank test for paired data (the locus was the pairing unit).

Role of environmental factors on sand fly structuring.
A principal component analysis (PCA) was done with PCAGen 1.2.1 (developed by J. Goudet, freely available at https ://www2.unil.ch/popge n/softw ares/pcage n.htm). The significance of the first axes was tested using the broken stick criterion 82 and 10,000 permutations of individuals across subsamples. The metric of each axis divided by the total genetic diversity corresponded to Weir & Cockerham's Theta (F ST estimator) (Goudet's personal communication). The coordinates of subsamples for each significant axis were used as the response variable of generalized linear models (glm). General models were as follows: axis i ~ latitude + longitude + altitude + hillside + latitude: hillside + longitude:hillside + altitude:hillside (where i corresponded to the significant axes; latitude and longitude are the latitudinal and longitudinal GPS coordinates in decimal degrees; altitude is the altitude in meters; hillside: south-east or north-west; and "X:Y" represents the interaction between the explanatory variables X and Y). All glm's were done using R version 3.5.1 66 , with the package rcmdr (R-commander) 83,84 . Model selection was performed using a forward stepwise model selection procedure and the Akaike Information Criterion 85 .
The influence of different factors (including interactions) was tested by examining the differences between models (complete, additive, and with one variable) with analyses of variances. As the entry order of explanatory variables matters in R analyses, the mean partial R 2 of each variable was calculated across all possible models.
Isolation by distance. Geographic distances were computed with Genepop version 4.7.0 86 using the Euclidian distance computed with the Universal Transverse Mercator (UTM) coordinates (Table 2). Genetic distances were estimated with the Cavalli-Sforza and Edwards chord distance 72 D CSE , the most powerful method to detect isolation by distance with microsatellite markers in most situations 87 . Due to the presence of missing data, data were converted into the FreeNA format to compute D CSE between each station pair with the ENA correction 36 that provides a very good correction for the isolation by distance regression slope in the case of null alleles 87 . As these station pair ended into a non-squared matrix that could not be handled by Genepop, the relationships between genetic and geographic distances were tested with a Mantel test (10 4 permutations) 88 in F STAT 2.9.4. As the Mantel test in F STAT is bilateral by default, the p value was halved in the case of positive slope, or computed as: "1-(1-bilateral p values)/2" in the case of negative slope, to obtain unilateral p values for a positive slope.
The relationships between genetic and altitudinal distances were also tested in the conditions described above. To avoid any interaction, the analyses for the South and North hillsides were done separately. The two p values obtained were then combined with the generalized binomial test 31 , with MultiTest 32 , taking into account all tests as recommended when the number of combined tests k < 4 89 .
Effective population sizes and migration. Four different methods were used to estimate the effective population sizes (N e ). The results obtained with these methods were averaged and weighted by the number of times a usable value (after removal of "infinity" results) was obtained. To obtain a range of possible N e values, the same approach was performed for the minimum and maximum values obtained with each method.
First, NeEstimator v2 90 was used with the LD method 91,92 that applies a correction for missing data 93 and the molecular co-ancestry method 94 . The LD method uses several threshold allele frequencies (0.05, 0.02, 0.01, and all alleles) to compute N e . The average across the different values obtained with these frequencies was computed.
Finally, the heterozygote-excess method (expected in dioecious populations) described by Balloux 41 was used for each locus that displayed heterozygote excess, as follows: N e = [− 1/(2 F IS )] −F IS /(1 + F IS ).
To determine the number of immigrants, the standardized differentiation index described by Meirmans and Hedrick was computed to correct for polymorphism excess: G ST " = n × (H T − H S )/[(n × H T − H S ) × (1 − H S )] 34 , where H T and H S are Nei's unbiased estimates of genetic diversity in the total sample or within subsamples, respectively 33 , and n is the number of subsamples. Genetic diversities were estimated with F STAT 2.9.4. Assuming an island model, this value was then used to obtain an approximation for the number of immigrants within subpopulations as N e m = (1 − G ST ")/(4 × G ST "). However, according to Wang's criterion 35 if Nei's G ST 33 is negatively correlated with H S , it is wiser to use F ST for this computation.

Data availability
All resources used in this article are provided in the Supporting Information and all the analyses are detailed allowing the assessment or verification of the manuscript's findings.