Diversification, selective sweep, and body size in the invasive Palearctic alfalfa weevil infected with Wolbachia

The alfalfa weevil Hypera postica, native to the Western Palearctic, is an invasive legume pest with two divergent mitochondrial clades in its invading regions, the Western clade and the Eastern/Egyptian clade. However, knowledge regarding the native populations is limited. The Western clade is infected with the endosymbiotic bacteria Wolbachia that cause cytoplasmic incompatibility in host weevils. Our aim was to elucidate the spatial genetic structure of this insect and the effect of Wolbachia on its population diversity. We analyzed two mitochondrial and two nuclear genes of the weevil from its native ranges. The Western clade was distributed in western/central Europe, whereas the Eastern/Egyptian clade was distributed from the Mediterranean basin to central Asia. Intermediate mitotypes were found from the Balkans to central Asia. Most Western clade individuals in western Europe were infected with an identical Wolbachia strain. Mitochondrial genetic diversity of the infected individuals was minimal. The infected clades demonstrated a higher nonsynonymous/synonymous substitution rate ratio than the uninfected clades, suggesting a higher fixation of nonsynonymous mutations due to a selective sweep by Wolbachia. Trans-Mediterranean and within-European dispersal routes were supported. We suggest that the ancestral populations diversified by geographic isolation due to glaciations and that the diversity was reduced in the west by a recent Wolbachia-driven sweep(s). The intermediate clade exhibited a body size and host plant that differed from the other clades. Pros and cons of the possible use of infected-clade males to control uninfected populations are discussed.

Selective neutrality test and positive selection test. For the Western clade, selective neutrality for mitochondrial segments was rejected by all indexes with minus values (D, D*, and F*), while for nuclear genes, selective neutrality was rejected by none of the indices (Table 4), suggesting recent sudden population growth after bottleneck event(s) in the mitochondrial lineage. For the Eastern/Egyptian clade, selective neutrality for both mitochondrial and nuclear segments was rejected by all indexes (Table 4), suggesting recent sudden population growth after bottleneck event(s) in the mitochondrial lineage and nuclear variants. For the intermediate clade, selective neutrality for mitochondrial segments was rejected by none of the indexes (Table 4). Selective neutrality for nuclear segments was untested because of the insufficient sample size.
A model with different ω (dN/dS) assigned for infected and uninfected clades improved the model fit most, compared to models with different ω for two or three different clades, although the improvement was nonsignificant ( Table 5). The ω for infected clade was three times higher than the ω for uninfected clade even though both were < 1 ( Table 5). Phylogeography. Based on available samples, the Balkan/Italian peninsulas and the Middle East are the most likely area of the origin of H. postica, from which the Western clade diversified via France (Fig. 5a). France is the likely area where the ancestral population was first infected with Wolbachia ( Fig. 5a, right). We found two connections between regions that were highly supported with BF > 3: France and western Europe (BF = 7.98) and Balkan/Italy and North Africa (BF = 7.90) (BSSVS analysis, Fig. 5b).

Discussion
This study revealed the large-scale geographic distribution and genetic diversity of H. postica in its native range. Intermediate mitotypes with larger body sizes were found from the Balkans to central Asia. The observed reduced diversity within the Western clade is likely due to a high percentage of Wolbachia infection within this clade, which is known in other species. We also identified a higher substitution rate of nonsynonymous mutations, suggesting promoted (fixation of) nonsynonymous mutations in the infected Western clade. In the Western clade, recent sudden population growth after a bottleneck was suggested only for mitochondrial genes and not for nuclear genes, supporting a recent selective sweep on mitochondria by Wolbachia infection.
Our results demonstrated a clear pattern of geographic distribution of the two divergent mitochondrial clades across the area of study, the Eastern/Egyptian and Western clades. The low genetic variation and star-like haplotype network within the Western clade is a signature of a recent demographic expansion from a few founders 49 . If the populations experienced ancient demographic bottlenecks, mitochondrial and nuclear genes are expected to have a concordant population structure. There was a weak concordance ( Table 2), suggesting that these genes may have shared similar evolutionary trajectories. Two bottleneck events are likely; postglacial recolonization (see next section) and a recent mitochondrial sweep by Wolbachia. The former may also serve as a major driver of IBD that was supported for overall geographic populations (two clades distributed separately in the north and south with intermediate mitotypes in between). The latter likely accelerates the fixation of nonsynonymous mutations in the Western clade 33,35,[50][51][52][53] .
The asymmetric inheritance of maternal mitochondria of an infected host caused by unidirectional CIinducing Wolbachia can eventually lead to a sweep, which likely explains the low mitochondrial genetic variation among infected individuals. The infected clade demonstrated accelerating nonsynonymous mutations or fixation. This result is consistent with a general trend of Wolbachia infected insect groups 38 , suggesting fixation of nonsynonymous mutations in mitochondria promoted by its small effective population size under the CI-inducing  22 .
In the southernmost populations of the Western clade or the geographic contact zone between the two clades, most individuals were uninfected or had lost Wolbachia. The imperfect maternal transmission was observed in the interclade crosses of H. postica 27 ; fitness costs incurred by cytoplasmic incompatibility and stochasticity during the invasion process 45,54 may lower the Wolbachia infection rate. Environmental causes (e.g., extreme temperatures) may also accelerate the endosymbiont loss 55,56 . Resulting uninfected H. postica populations (or with lowered Wolbachia density) must have regained reproductive compatibility between clades and enabled crosses between the diverged clades.
The intermediate variants exhibited a large body size. Larger genitalia of the males with these mitotypes may inhibit mating with the females of other clades and promote reproductive isolation. These mitotypes also were associated with an ecological niche that differed from the niche of other clades. Bulgaria populations used Vicia cracca as a host plant, whereas other populations used Medicago and Trifolium. Vicia cracca has high contents of cyanamide 57 and canavanine 58 that are toxic to insect herbivores 59,60 .
The genetic structure of most European biota has been strongly influenced by glacial oscillations of the Holocene 61,62 , and most temperate species exhibit northward post-glacial recolonization from glacial refugia located in southern Europe through central Asia during the last glacial maxima (southern genetic richness/ northern purity 63 ; in beetles 64 ). In H. postica, we observed mtDNA differentiation for all clades and within the Eastern/Egyptian clade. Based on the estimated ancestral states in mitochondrial phylogeography and mitochondrial/nuclear genetic diversity, the Balkan and Italian peninsulas are a possible candidate for the origin of Table 2. Mitochondrial and nuclear genetic diversity of the two clades (the Western and the Egyptian/ Eastern) and the intermediate clades in Hypera postica. п: mean number of pairwise differences; nucleotide diversity (average over loci) (mean ± SD). The genetic distance was calculated based on pairwise differences. Numbers in parentheses are sample sizes (pooled number of individuals, number of populations). The same letters within each column (gene segment) indicate no significant difference between clades (p > 0.01). For population-wise mitochondrial and nuclear diversities and distances, see Supplementary Tables S1 and S2.  65 , which may also consist of the area of origin of H. postica.
The recent dispersal routes that include the north Mediterranean were highly supported. Anthropogenic factors may allow occasional dispersal of H. postica to Europe and North Africa with alfalfa traded for livestock feed (by 2,600 years ago 66 ). More recent international trade of alfalfa meal and pellets may continue to aid the weevil's opportunistic long-distance dispersal; France, Spain, and Italy are the major alfalfa exporters among H. postica's native ranges 67 .

Conclusion. While geographic isolation assisted continental diversification of the weevil H. postica, recent
Wolbachia infection reduced diversity in a mitochondrial clade in the host weevil in western Europe. Wolbachiainfected males could be used as a control agent for the Incompatible Insect Technique on uninfected populations, however, the risk of heterosis in interclade crosses following accidental cure of Wolbachia must be assessed before application. Table 3. Mitochondrial and nuclear genetic diversity in Hypera postica based on geographic regions. п (mean number of pairwise differences); nucleotide diversity (average over loci) (mean ± SD). Bolded genetic diversity indices indicate the highest diversity for each gene segment. Numbers in parentheses are sample sizes (pooled number of individuals, number of sampled populations). See Table 1 for the country codes. For populationwise mitochondrial and nuclear diversities and distances, see Supplementary Tables S1 and S2. 10.23 ± 4.81 (29,9) 0.0194 ± 0.0102 3.08 ± 1.65 (29,9) 0.0110 ± 0.0065 1.08 ± 0.74 (20,8) Supplementary Table S3.

Wolbachia infection and phylogeny.
We used PCR to screen for possible Wolbachia infections. The Wolbachia ftsZ coding fragment was amplified using the primers fts-Z-f and fts-Z-r 70 . PCRs were performed using preheating as above, followed by 32 cycles at 94 °C for 40 s, 55 °C for 45 s, and 70 °C for 1 min. As a positive control, we used Wolbachia-infected Callosobruchus chinensis 71 . Blurred and extremely weak signals compared with the positive control were considered uninfected, which differs from a previous study 47 . Wolbachia-positive H. postica were further subjected to PCR and sequencing of the genes, coxA and hcpA, in addition to ftsZ for multilocus sequence typing of Wolbachia 72 .
For the phylogenetic reconstruction of Wolbachia, we used sequences from representative supergroups of Wolbachia (nr/nt database, Supplementary Table S1) with sequences of Anaplasma marginale as an outgroup. We used the GTR model, which was selected as the best fit model of nucleotide substitution by MrBayes3.2.6 73 , based on the AICc, using MrAIC.pl 1.3.1 74 . The three gene segments were partitioned. Markov chain Monte Carlo (MCMC) simulations were performed for one million generations, with sampling conducted every 1,000 generations. The convergence of independent parallel runs was checked using Tracer 1.6 75 , and the first 25% of trees were discarded as burn-in. Body size. After collecting specimens, the right elytron lengths of the samples were measured to the precision of 0.01 mm with a microscope (VH-5500, Keyence, Osaka, Japan). The sex of the samples was determined by both external and genital morphology. The effect of sex, clades (Egyptian/Eastern, Western, and intermediate), and Wolbachia infection on elytral lengths were tested by nonparametric Wilcoxon/Kruskal-Wallis signed rank tests. Posthoc multiple comparison was performed on the significant factor using the Steel-Dwass test. JMP 14.2.0 was used for statistical analyses.

Selective neutrality test and positive selection test. Selective neutrality was tested in each clade with
Tajima 76′ 's D, Fu and Li 77′ 's D* and F*, using DNASP 6.12.03 78 . P values were derived by coalescent simulations with 2,000 replications. For the coalescent simulations for nuclear segments, an intermediate recombination rate was assumed. We used all mitochondrial (808 bp) or nuclear sequences (1,193 bp) of two individuals sampled per clade from each population to avoid sample size bias between populations. The equal nonsynonymous/synonymous substitution rate ratio (dN/dS ratio, ω) between infected and uninfected clades and between the two major clades was tested with a phylogenetic analysis using the maximum likelihood method (likelihood ratio test) employed by the codeml program in PAML 4.9i 79 . The models with three or more different ω for each branch were compared with a reference (basal) model with two different ω (one for the root and the other for both the Western and Egyptian/Eastern clades). We concatenated all the open reading frames (protein-coding fragments) and removed potential stop codons (leading to 215 codons) of the mitochondrial sequences of two individuals randomly sampled per clade from each population to avoid sample size bias between populations. Codons for invertebrate mitochondria were used.
Haplotype networks and diversity. All mitochondrial sequences were assembled using Sequencher 5.0 (Gene Codes Corp, Ann Arbor, MI, USA), and we checked for the presence of pseudogenes using commonly employed methods 80,81 . Statistical parsimony networks were reconstructed based on mitochondrial and nuclear fragments using TCS 1.21 82 , in which we allowed connections between haplotypes of 20 steps for mitochondrial genes and 95% for the nuclear genes, to elucidate the maximal divergence observed among haplotypes. Nucleotide diversity (average over loci) and п (mean number of pairwise differences) 83 were estimated for each geographic area, using Arlequin 3.5.2.2 84 . Geographic history. Isolation by distance. We assessed spatial mitochondrial differentiation by testing for isolation by distance (IBD) 85 . With a sweep (e.g., by Wolbachia) followed by rapid spread or frequent anthropogenic long-distance dispersal events, the IBD correlation is predicted to be weak at most. We tested if, as predicted by IBD, pairwise geographic distances and pairwise genetic differences were positively correlated using a one-tailed Mantel test 86 based on 2,000 permutations with the ISOLDE program implemented in Genepop 4.2 87 . For pairwise genetic differences, we employed corrected average pairwise differences between populations X and Y, [π XY − (π X + π Y )/2] 88 and their p values were derived using Arlequin.
Phylogeography. We estimated the historical dispersal patterns of H. postica, using a Bayesian discrete phylogeographic approach 89 with a Bayesian skyride framework implemented in the software package BEAST 1.10.4 90 . We used two mitochondrial segments (808 bp). To avoid sample size bias, we selected only one individual per clade from a given locality but excluded intermediate mitotypes, which reduced the data set to 34 individuals, with H. miles as an outgroup. We used default settings, applied the same molecular evolution model as presented above, and used an uncorrelated relaxed clock model assuming lognormal rate distribution 91 . We assigned each sequence to one of the seven geographic regions, and the symmetric exchanges between the geographic regions throughout the entire phylogeny were modeled with the Bayesian stochastic search variable selection (BSSVS). MCMC runs were performed for 50 million generations, sampling one tree every 25,000 generations. After confirming the stationarity of parameter estimates using Tracer, the first 40% of trees were discarded as burnin, and maximum clade credibility (MCC) tree was built using TreeAnnotatorv1.10.4. As each node in each MCMC sample is annotated with a geographic region and Wolbachia infection status, we assessed the certainty of the geographic reconstruction by looking at the distribution of node states across the MCMC using FigTree v1.4.4 (http:// tree. bio. ed. ac. uk/ softw are/ figtr ee/). Bayes factor (BF) values for exchange rates between each pairwise regions were retrieved from the log file from the BSSVS analysis using SpreaD3 v0.9.7.1rc 92 .