Assessing the origin, genetic structure and demographic history of the common pheasant (Phasianus colchicus) in the introduced European range

The common pheasant, a game species widely introduced throughout the world, can be considered as an ideal model to study the effects of introduction events on local adaptations, biogeographic patterns, and genetic divergence processes. We aimed to assess the origin, spatial patterns of genetic variation, and demographic history of the introduced populations in the contact zone of Central and Southeast Europe, using mitochondrial DNA control region sequences and microsatellite loci. Both types of molecular markers indicated relatively low to moderate levels of genetic variation. The mtDNA analyses revealed that common pheasants across the study area are divided into two distinct clades: B (mongolicus group) and F (colchicus group). Analyses of the microsatellite data consistently suggested a differentiation between Hungary and Serbia, with the pheasant population in Hungary being much more genetically homogeneous, while that of Serbia has much more genetic mixture and admixture. This cryptic differentiation was not detected using a non-spatial Bayesian clustering model. The analyses also provided strong evidence for a recent population expansion. This fundamental information is essential for adequate and effective conservation management of populations of a game species of great economic and ecological importance in the studied geographical region.

. Sampling localities of the common pheasant in the contact zone of Central and Southeast Europe. Green circles and red squares indicate clades B and F, respectively, and the numbers next to these symbols denote the haplotypes found in each sampling locality, according to the numbering assigned to them (see Figs. 2 and 3 for details). Map source: ESRI. The map was generated using ArcGIS 10.4.1 by ESRI (available online at: http:// www. esri. com/). www.nature.com/scientificreports/ area. The second most common haplotype was H2, which was shared by seven individuals and also found over a relatively wide geographical area. The two main mitochondrial lineages exhibited relatively low haplotype and nucleotide diversities in the study area (Table 1).
Population structure. The BAPS analysis assigned sequences of common pheasant from the study area to two clusters, where sequences belonging to each mitochondrial clade (B and F) fell into a distinct cluster ( Figure S1). The AMOVA estimated that 96.07% of the mtDNA variation in the study area is due to the differentiation between the two populations (p < 0.0001) and, accordingly, the F ST fixation index was extremely high (Table 2).
Demographic analyses and neutrality tests. The EBSP indicated an increase in maternal effective population sizes for both detected clusters (B and F) since approximately the last 10,000 years (Fig. 4). The mismatch distribution analysis indicated a unimodal distribution for both clusters B and F (Fig. 4), fitting to the sudden expansion model. The sums of squared differences (SSD) and raggedness indices (H ri ) were not significant (P > 0.05) in any of the cases (Table 1), also supporting population expansion for the population. The values of the Tajima's D and Fu's FS statistics were mostly negative for the clusters B and F and total sequences in the study area (Table 1), indicating, respectively, an excess of rare nucleotide variants and of rare haplotypes relative to those expected under mutation-drift equilibrium. While both tests were significant for clade F, only the Tajima's test was significant for clade B.
Microsatellite analysis. Hardy-Weinberg and linkage disequilibrium tests. Of the 10 loci, we did not include the results for two of them in the data set for analysis because one (PC5) was monomorphic and the other (PC1) had a high proportion of missing data. The global test indicated significant deviations from HWE after Bonferroni correction (P values < 0.0009) and heterozygote deficit in three loci (PC4, PC7, and PC10). However, in separate tests for each of the two countries, these three loci showed significant deviation from HWE and heterozygote deficit only for Serbia, not for Hungary. This suggests that the data for these loci may not be affected by intrinsic problems, such as null alleles or allelic dropout, but that the significant test results in Serbia may be due to underlying genetic structure (Wahlund effect). Accordingly, estimates of the frequency of null alleles at these loci were generally small (< 0.2; Table S2). We therefore retained these loci in the data set. No signs of significant  Genetic structure and population differentiation. In the STRU CTU RE analysis, both LnP(D) and ΔK suggested the presence of two genetic clusters (K = 2) with either the model with prior information on the sampling location of individuals or the non-spatial model (Fig. 5). With the model using information on sampling location and applying a membership probability threshold of 50%, all Hungarian samples grouped in the same cluster, whereas in Serbia the individuals were approximately divided equally into each cluster (53% in the same cluster as the Hungarians, and 47% in the other). When applying a stricter threshold of 70%, the results for the Hungarian samples were the same as for the 50% threshold, while most Serbian samples (64%) could not be assigned to    . STRU CTU RE assignment of common pheasant individuals into two (K = 2) genetic clusters. The bar plot on the left and the pie chart in each country refer to the analyses using information on sampling location and the membership probability threshold of 70%. In the bar plots, each individual is depicted by a column that is partitioned into K segments, which length is proportional to the membership coefficient of the individual for each cluster. Grey in the pie charts represents the percent of individuals that could not be assigned to any cluster. The results of non-spatial model and the membership probability threshold of 50% is given in the attached www.nature.com/scientificreports/ any cluster (of the remaining 36%, 18% were assigned to one cluster and 18% to the other). The analyses using the non-spatial model did not detect any clear geographic structure in the data. Specifically, considering the 50% threshold for cluster membership, 55% of the Hungarian samples and 53% of the Serbian samples were assigned to the same cluster, with the remaining 45% and 47%, respectively, grouping into the other cluster. With a threshold of 70%, 73% of the Hungarian samples and 76% of the Serbian samples had intermediate assignment probabilities to both clusters. As a great number of Serbian samples were not assigned to any of the two clusters in both spatial and non-spatial models, we studied cluster membership probabilities for models with more than two clusters to understand admixture patterns and fine-scale genetic structuring. However, the spatial models with K > 2 (K = 3, 4, and 5) and membership threshold of 70% showed that none of the admixture Serbian samples identified with K = 2 were assigned to a specific cluster, confirming the absence of discrete subpopulations whthin Serbian samples and high degree of admixture ( Figure S2). The Bayesian spatial clustering of individuals performed in BAPS also supported the presence of two clusters (K = 2) as the most likely ( Fig. 6). Most of the samples from Hungary were allocated to one cluster (shown in green) while the majority of samples from Serbia grouped in the other cluster (in red). The MEMGENE analysis identified two significant variables (Fig. 7). The first (MEMGENE1) suggests a significant genetic structure that mostly corresponds to a differentiation between the common pheasant populations in Hungary and Serbia (Fig. 7a). The second (MEMGENE2) supports the pattern of genetic differentiation between Hungary and Serbia suggested by MEMGENE1, and indicates the presence of genetic mixture in Serbia, which, in turn, appears to be essentially absent in Hungary (Fig. 7b). The estimated amount of genetic structure explained by spatial patterns was modest (R 2 adj = 0.072), indicating that much of the spatial genetic pattern can be explained by IBD.
The spatial autocorrelation analysis showed significant coefficients (p < 0.01) for the smallest distance class (i.e. 0-50 km; Fig. 8), suggesting an IBD pattern at this spatial scale. A marginally significant positive correlation was also found in the distance class 140-155 km, and no other significant positive correlation was detected in the remaining distance classes. The Mantel test indicated the presence of a modest global pattern of IBD (r = 0.136; p = 0.0004; upper limit of a 95% confidence interval = 0.219. Contraty to the global pattern of IBD in the whole dataset, we didn't find a significant relation between geographic and genetic distance in the Serbian samples (r = 0.078; p = 0.081).
Genetic diversity and bottleneck tests. Across all individuals, a total of 68 alleles were detected, with the number of alleles per locus ranging from four to 19 (mean 8.5), and the mean number of effective alleles, HO and HE were 3.82, 0.580 and 0.651, respectively ( Table 3). Estimates of the different genetic diversity parameters were generally similar between the two countries. The most notable differences were in the number of private alleles, which were almost double in Serbia (although it also should be noted that more Serbian samples were analyzed), whereas heterozygosity was higher in Hungary (Table 3).
No evidence of a recent genetic bottleneck was detected in any of the two data sets analysed (country-level and cluster-level), and under any of the three tested mutation models, based on the Wilcoxon's signed rank test, except for the cluster 2 (all Hungarian and 53% of Serbian samples) under the IAM (which was significant at α = 0.05). Based on the sign test, the Serbia data set showed evidence of a recent genetic bottleneck only under the SMM (Table S3). Given that for fewer than 20 loci the Wilcoxon's test may be the most appropriate and powerful, and that the TPM appears to be the most appropriate model for most microsatellites 52 , overall we interpret   www.nature.com/scientificreports/ these results as indicating absence of a recent bottleneck. Concordantly, the 'mode-shift' test in the three data sets always found a normal L-shaped allele frequency distribution.

Discussion
Although several studies have investigated the genetic diversity, population structure and demographic history of the common pheasant across its native range in Asia 30,31,[43][44][45] , there is a great lack of this kind of information on the populations introduced into Europe. This study provides the first detailed genetic assessment of common pheasants in their introduced European range, particularly in the contact zone of Central and Southeast Europe (Hungary and Serbia), and the results are essential information for the development of effective conservation and management strategies.
Mitochondrial diversity, phylogeography and demographic history. Mitochondrial diversity of common pheasants in the study area is relatively low (Hd = 0.192-0.417, π = 0.00041-0.00108), particularly when compared to that reported for native populations in Asia (e.g. 37 ); Hd = 0.500-1, π = 0.0009-0.0130). The phylogenetic analysis, including both the new control region sequences generated here and those previously published, inferred the same clades already identified in previous studies (e.g. 30,31 ) (Fig. 2). The mitochondrial sequences found in the study area belong to two well-supported clades: B (corresponding to several subspecies groups such as mongolicus, torquatus, and pallasi) and F (corresponding to the colchicus subspecies group) 30,31,44 .
Notably, clade F is much more frequent and geographically widespread, while clade B is mainly present in Serbia (Fig. 1), where the geographic overlap between the two clades is much more significant. Boev 29 proposed that the first common pheasants introduced into Europe were colchicus (as early as 500 AD), and only much later (eighteenth century) other subspecies (mongolicus and torquatus) would have been introduced. Our results, given the much greater frequency and distribution of clade F in the study area, are consistent with the hypothesis that colchicus was long introduced, so that this lineage had ample time and opportunity to spread throughout the region. Haplotype H1 of clade F and haplotype H2 of clade B were the most frequent and widespread in the study area (Fig. 1). This, and the relatively low number of haplotypes found in the study area, when compared for example to the haplotype richness in clade B (Figs. 2 and 3), suggest introductions involving a small number of individuals, followed by demographic and range expansion. The apparently more recent introduction of clade B seems to have so far spread mainly only across Serbia.
The EBSP results indicated that P. colchicus likely experienced a long-term historical expansion based on detected clusters in the study area (Fig. 4). Morever, mismatch distribution analysis suggested evidence for unimodal distribution of the clusters B and F, which is relevant to the panmictic population undergone the sudden expansion model in its evolutionary history 53,54 . In addition, the results of the Fu's and Tajima's mutation-drift equilibrium tests and the star-shaped pattern in the haplotype network support the hypothesis of demographic expansion of clade F (Table 1; Fig. 3). Given the postulated timings of the introductions of the common pheasant in Europe 29 and the time scale of the mtDNA mutation rate, this expansion signal likely reflects an older process that took place in the native range, but it is possible that some of the less frequent haplotypes differing by a single mutation from H1 are derived from mutations that occurred after introduction. Our result supporting the hypothesis of historical demographic expansion in the colchicus group is in line with results in Kayvanfar et al. 31 and Liu et al. 45 . Also, we find evidence for the hypothesis of historical demographic expansion for clade B. This hypothesis is suggested for the torquatus group by several previous studies 30,31,42,45 . Microsatellite diversity, genetic structure and bottleneck tests. The estimated moderate levels of microsatellite genetic diversity for common pheasants in the study area were similar to those reported for a sample of comparable size containing equal numbers of nine subspecies from China 55 . Therefore, possible genetic signals in the microsatellite diversity of founder effects associated with introductions (indicated by the relatively low nucleotide and haplotype diversity, Table 1 56 ) may have been erased as a result of a combination of factors such as the high mutation rate of microsatellites, rapid population growth, and the genetic enrichment resulting from the mixture of subspecies in the study area (e.g. high number of private alleles in Serbia, Table 3). Accordingly, the patterns of microsatellite variability did not show signs of effects of genetic bottlenecks.
Analyses of the microsatellite data using spatial Bayesian clustering and spatial genetic neighbourhood methods consistently suggested a differentiation between Hungary and Serbia, with the common pheasant population in Hungary being much more genetically homogeneous, while that of Serbia has much more genetic mixture and admixture (Figs. 5, 6, and 7). This cryptic differentiation was not detected using a non-spatial Bayesian clustering model. Therefore, the genetic differentiation between the common pheasant populations of the two countries is shallow. This may be due to gene flow resulting from the mobility of the species, since birds, due to their dispersal abilities, often show less population genetic structure than other vertebrates [57][58][59][60] . A similar pattern of low genetic structure was observed among populations of common pheasant in China 55 and of silver pheasant in southern China 61 . Given that the European range of P. colchicus is larger than the contact zone, it may be worth noting at some point that studying the species at the European range scale may provide deeper insights into the population structure of the species. We found evidence of some influence of isolation-by-distance on the genetic structure. In particular, spatial autocorrelation analyses indicated non-random distribution of genotypes at scales up to 50 km. Further study is needed to investigate the issue of fine-scale genetic structure.
Conservation management implications. An important conclusion from our data, both mitochondrial and nuclear, is the fact that they strongly indicate that releases over time to the present day of captive-bred individuals have apparently not led to the establishment of other lineages/subspecies distinct from clades B and F. Also, the detection of neighborhood size up to a distance of 50 km is of practical relevance for the species' www.nature.com/scientificreports/ management. Thus, wildlife management practice has traditionally treated this distance value as the basis for isolating pheasant populations. However, the known home range of common pheasants is much smaller: on average 0.05-0.1 km 2 (range of 0.008-0.215 km 2 ), and even less in the breeding season (0.02 km 2 ) 62,63 . Our results, however, suggest that the scale of the genetic impact of releases of captive-bred individuals may be much larger than could be predicted by game managers based on the typical home range of the species. It has been observed and reported that the common pheasant populations of the two countries in the study area have been declining for the past 30 years. It is then apparent that the intended population reinforcements with captive-bred individuals have not helped to achieve the impact hoped for by wildlife farmers and could not prevent the decline in population size. However, in the absence of reference areas and populations, it is difficult to judge the exact impact of artificial releases. It may have even helped to strengthen natural populations, but at the same time it could only slowed the rate of population decline. But it may also have been negatively affected the population size trend by outbreeding depression, the introduction and faster spread of diseases and parasites due to birds introduced from foreign sources and the larger herd size, the increase in the number of predators and the neglected natural population due to supplementation.The results of this study provide essential information for conservation management, monitoring, and for comparison with future assessments of the genetic status of common pheasant populations in Central and Southeast Europe.

Material and methods
Sample collection and DNA extraction. We collected 74 fresh tissue samples in the contact zone of Central and Southeast Europe (Hungary and Serbia) between 2014 and 2015 (Fig. 1). Figures S3 and S4 show, respectively, wild-born and captive-bred pheasants from Hungary. The samples were obtained from legally hunted pheasants and stored in 96% ethanol. No animals were shot only for the purpose of this study. An ethics statement was not recquired for this work; collection was done in accordance with the countries' national regulations and under hunting license. Genomic DNA was extracted using Roche High Pure PCR Template Preparation Kit (Roche Life Sciences) following the manufacturer's protocol. Microsatellite genotyping. Ten microsatellite loci (Table S4) (Table S4). Fragment analysis was performed on an ABI 3730XL DNA Analyser (Applied Biosystems). Genotypes were determined using Peak Scanner v1.0 (Applied Biosystems).

Mitochondrial DNA analysis. Genetic diversity and phylogenetic analyses.
New sequences in the present study were combined with 202 GenBank sequences from previous studies (Table S1) to yield a dataset in order to estimate the phylogenetic relationships among common pheasants across the species range. The number of haplotypes, haplotype diversity (Hd), nucleotide diversity (π), and the number of polymorphic sites (P) were calculated using DnaSP 5.10.1 67 . The best-fit models of nucleotide substitution were selected using jModeltest 0. www.nature.com/scientificreports/ Demographic analyses and neutrality tests. We constructed extended Bayesian skyline plots (EBSPs) 75 in BEAST 2.6.4 71 to explore the demographic dynamics of detected population of the common pheasant in the study area (clusters B and F detected by BAPS). The latter analyses were conducted using a HKY substitution model, a strict clock model, and the coalescent extended Bayesian skyline as tree prior. The MCMC chain was set to a total 400 million generations, and the Markov chain was sampled every 10,000 steps. Tracer was used to assess the convergence and stationarity of the EBSP values, and whether the effective sample sizes (ESS) were > 200. We performed multiple replicate runs to ensure MCMC convergence. EBSPAnalyzer 2.5.2 71 was employed to construct the EBSPs. We used R 3.1.2 76 to generate the visualizations. Mismatch distribution analyses were used to examine demographic expansions in detected population clusters. To compare observed distributions with expected distributions under the expansion model, we estimated sums of squared deviations (SSD) and Harpending's raggedness index (RI) in Arlequin. Morever, we used the tests of Fu's FS 77 and Tajima's D 78 , computed in Arlequin to test equilibrium of the population. A negative value of these statistics signify an excess of low frequency polymorphisms, and suggests a population size expansion and/ or purifying selection of a population 78 . When we detect more than one distinct population cluster in the BAPS analysis, we carried out the EBSP analyses, mismatch distribution and neutrality tests for each cluster separately.

Microsatellite analysis.
Hardy-Weinberg and linkage disequilibrium tests. We used exact tests in Genepop 4.2 79 to test loci for Hardy-Weinberg equilibrium (HWE) deviations and linkage disequilibrium (LD) between pairs of loci, based on the whole dataset and separately for each region (Hungary and Serbia). Probability values were assessed for significance using sequential Bonferroni correction for multiple comparisons and a nominal significance value of 0.05 80 .
Genetic structure and population differentiation. We used STRU CTU RE 2.3.4 81 to investigate the likely number of genetic clusters (K) in the study area. STRU CTU RE was run using the admixture ancestry model, correlated allele frequencies, K from 1 to 5, 10 independent simulations for each K, 100,000 iterations as burn-in, and 1,000,000 iterations for sampling. We conducted Bayesian clustering with (i.e. LOCPRIOR model; assuming two geographic populations; (Serbia and Hungary) 82 ) and without prior information on sampling locations. We used STRU CTU RE HARVESTER 83 to process and visualize results from STRU CTU RE, and we determined the most likely value of K based on the log probability of the data (Ln P(D)) and the ΔK statistic 84 . The membership probabilities (Q-values) were used to assign individuals to clusters considering a membership threshold of either 50% or 70% (e.g. 85,86 88 were used to average the membership probabilities over the 10 runs for each K and to plot the results, respectively. The number of genetic clusters present in the study area and the degree of admixture between them was also investigated using the spatial mixture model in BAPS 6 73,89 . We tested K values from 1 to 5, and subsequently performed a separate analysis in the 'fixed K' mode with K = 2. Additionally, we used MEMGENE 1.0.1 90 to complement the analyses with STRU CTU RE and BAPS and to visualize spatial genetic neighbourhoods. MEMGENE performs a spatially explicit ordination technique free of linkage and Hardy-¬Weinberg equilibria assumptions; in particular it uses Moran's Eigenvector Maps (MEM) to detect weak genetic structure 90 . Spatial autocorrelation analyses were used to detect fine-scale patterns of genetic structure across the study area 8,91,92 . We divided the individuals into 22 geographic distance classes of equal sample size. A correlation coefficient r between geographic and genetic distance was calculated among individuals in each class. Significance tests were performed using 10,000 permutations to test the null hypothesis of no spatial autocorrelation. Confidence intervals around the estimates of r were determined using bootstrapping with 10,000 replicates. The results were plotted in a correlogram. Finally, we used the Mantel test within the R package Vegan version 2.5-7 93 to assess the hypothesis of isolation-by-distance (IBD) 94 . For this, we calculated genetic and log-transformed geographic distances among all pairs of individuals in GenAlEx 6.5 95 . A total of 10,000 replicates were used to determine the significance of the test relative to the 95% quantile of the distribution of permutations.
Genetic diversity and bottleneck tests. We quantified genetic diversity using basic genetic variation statistics, including observed (HO) and expected (HE) heterozygosity, mean number of alleles (MNA), allelic richness (AR), effective number of alleles (NE), private allele frequency (AP), and polymorphic information content (PIC). The estimates and 95% confidence intervals were obtained for the total sample and separately for each country (Hungary and Serbia) using GenAlEx, FSTAT v.2.9.3.2 96 , and POPGENE 1.32 97 . The inbreeding coefficient (FIS) was estimated using GENETIX4.05 98 and the significance of values was evaluated based on 10,000 permutations. BOTTLENECK v1.2.0269 52 was used to test for evidence of recent bottlenecks in effective population size at two levels including cluster-level (STRU CTU RE-inferred clusters based on the spatial model and membership probability threshold of 50%) and country-level (Hungary and Serbia). We ran BOTTLE-NECK using 1000 iterations and three different mutation models: the infinite alleles model (IAM), the stepwise mutation model (SMM), and the two-phase model (TPM; variance for TPM = 12% and proportion of SSM in TPM = 95%, following recommendations of Piry et al. 52 for microsatellites). The statistical tests applied in BOT-TLENECK were the Wilcoxon signed-rank test 99 , sign test, and the 'mode-shift' indicator 100 .

Data availability
All haplotypes are available from the Genbank database (accession number (s) KY474094-KY474103 for the mtDNA control region.