Taxonomic status and molecular phylogeography of two sibling species of Polytremis (Lepidoptera: Hesperiidae)

The skipper Polytremis theca species complex is widely distributed in the south of the Qinling Mountains in China. A recent study of the Polytremis genus suggested that this species might encompass two differentiated lineages. We tested this hypothesis, by carrying out a phylogenetic study of this agricultural pest based on nationwide sampling and the evaluation of mitochondrial and nuclear DNA markers. We show that this species is actually an amalgamation of two sibling taxa (P. t. theca and P. t. fukia), which displayed levels of genetic divergence as great as those generally found between sister species in the Polytremis genus, suggesting that they actually correspond to two distinct species. The Divergence time estimates suggest that an active period of speciation within Polytremis occurred within the Pleistocene eras. Based on its distinct phylogenetic placement and geographical isolation, we suggest that the subspecies should be elevated to full species status under the phylogenetic species concept, which has significant management implications.

a very small number of samples and required confirmation in a larger sample. To answer the question, we conducted molecular phylogenetic analyses of P. theca species complex collected from 13 geographic localities in China using a region of mitochondrial genes (COI) and a region of nuclear genes (wingless).

Materials and methods
Samples collection. We collected specimens of P. t. theca (8) from five local regions and P. t. fukia (25) from eight different localities in China from 2008 to 2014 ( Fig. 1; Table 1), and preliminarily identified them based on traditional wing and genitalia morphology characters recommended by Evans 1 , Eliot 15 and Chou 16 . Our sampling of 33 specimens covered the major range of the P. theca (Fig. 1). The outgroups included eight specimens of P. nascens Leech, 1893, five specimens of P. mencia Moore, 1877 (Table 1). All specimens were caught in the field, preserved by dehydration in small envelopes and dried with silica desiccant for further processing.
DNA extraction and sequencing. The DNA was isolated from legs of adult butterfly using a QIAamp DNA Mini kit (Qiagen, Hilden, Germany) essentially following the manufacturer's instructions but with some modification. Briefly, after adding proteinase K and buffer AL (QIAGEN ® ), the mixed homogeneous solution was incubated at 70 °C for 2 h. Subsequently, 200 μ L of 100% ethanol was added and the mixture transferred to a QIAamp spin column. The mixture in the spin column was subjected to 3cycles of centrifugation at full speed (14,000g) for 1 min and the filtrate was returned to the spin column to increase the amount of DNA obtained. Partial sequences of mitochondrial gene COI (487 bp) was amplified and sequenced with primers HCO2198: 5′ -TAA ACT TCA GGG TGA CCA AAA AAT CA-3′ and LCO1490: 5′ -GGT CAA CAA ATC ATA AAG ATA TTG G-3′ 17 . The nuclear DNA fragments of wingless (389) were amplified and sequenced with primers WG1z: 5′ -GAR TGY AAR TGY CAY GG-3′ and WG2a: 5′ -ACT ICG CAR CAC CAR TGG AAT GTR CA-3′ 18 . The PCR for all amplicons were conducted in 20 μ L final volume reactions containing 1 μ L template DNA, 0.4 μ M each of the amplification primers, 200 μ M each dNTP, 1.5 mM MgCl 2 , 50 mM KCl, 10 mM Tris-HCl (pH 8.3) and 2 U Taq polymerase (Takara, Otsu, Shiga, Japan). The thermal profile used was as follows. COI: 95 °C denaturation for 5 min followed by 35 cycles of 94 °C for 45 s, 42 °C for 1 min, 90 °C for 90 s and a final elongation step at 72 °C for 10 min. Wingless: 95 °C denaturation for 5 min followed by 30 cycles of 94 °C for 60 s, annealing at 55 °C for 60 s and extension at 72 °C for 2 min, with a final extension of 72 °C for 10 min.
Extraction blanks were run in all reactions to control for contamination during the extraction and PCR processes. The amplification products were subjected to electrophoresis in a 2% (w/v) agarose gel in TAE buffer (0.04 M Tris-acetate, 0.001 M EDTA) with a DL1000 ladder size marker (Takara, Otsu, Shiga, Japan) to determine whether the amplification reactions were successful. In addition, after electrophoresis in agarose gels, the amplification products were extracted using the Wizard SV Gel and PCR Clean-up System (Promega, Madison, WI, USA) for sequencing. Finally, all the haplotypes obtained were deposited in GenBank (Table 1).

Statistical analyses.
The sequence data of the mitochondrial COI and nuclear wingless were aligned with published homologous sequence from genus Polytremis (e. g., P. eltola Hewitson, 1869, P. discreta Elwes & Edwards, 1897, etc.) as well as Borbo cinnara Wallace, 1866 and Pseudoborbo bevani Moore, 1878 (presented in Table 1) respectively, translated to amino acid sequences to check for nuclear mitochondrial pseudogenes (numts) and pruned to remove redundant sequences with Bioedit v.7.0 19 . The haplotype sequence matrix was used for all subsequent phylogenetic analyses (Table 1). Phylogenetic trees were constructed by the ML (maximum-likelihood) methods with PhyML 20 . Modeltest 3.7 21 was used to select the optimal nucleotide substitution models following the Akaike Information Criterion (AIC). In the ML analysis, a heuristic search was  conducted. The starting tree for branch-swapping was from stepwise addition. Nodal support of the ML tree was estimated by 1000 bootstraps. Network profile of the haplotypes identified in P. t. theca, P. t. fukia, P. nascens and P. mencia was constructed with Network4.5 using the median-joining method 22 . The haplotype diversity (Hd) and nucleotide diversity (π ) for P. t. theca and P. t. fukia were estimated by DnaSP4.90 23 .
For the COI data set, pairwise F ST was also calculated with Arlequin v3.0 24 , which accurately reflects patterns of genetic variation. Correspondingly, gene flow was estimated in Arlequin v3.0 24  of molecular variance (AMOVA) were conducted to evaluate possible population genetic structure of P. t. theca and P. t. fukia using Arlequin v3.0 with 1,000 permutations. We calculated Tajima's D 25 and Fu's F statistic 26 and ran 10,000 coalescent simulations for each statistic to create 95% confidence intervals investigate the historical population demographics and testing whether the sequences conformed to the expectations of neutrality. Pairwise mismatch distribution analyses were performed for all P. theca specimens and two subspeices specimens separately to find the evidence of past demographic expansions using DnaSP4.90 23 . The times to the most recent common ancestor of the major lineages and the whole population were estimated using relax-clock molecular dating estimation implemented in the BEAST 1.5.2 27 . Analyses using the HKY model of nucleotide substitution with gamma distributed rate variation among sites were performed. The Yule speciation method was assumed and the nucleotide substitution rate of 3.54% per million years that has generally been calibrated for COI in insect 28 was used. Chains were run for 50 million generations, with the first 20% discarded as burn-in. The results were summarized through TRACER 1.5 29 .

Result
Genetic Divergences and haplotype networks. All 46 samples yielded high-quality of DNA and were successfully sequenced for the mitochondrial COI and nuclear wingless (accession numbers KR911919-KR911947, Table 1, Fig. 2). The data set of COI alignment contains 487 nucleotide positions, of which 64 positions are variable and 47 are parsimony informative. The mean base composition of the fragment shows a strong bias of A + T (T 39.4%, C 18.2%, A 28.7% and G 13.7%), as found commonly in insect mitochondrial genomes 30 . Nineteen haplotypes were identified in all 46 samples (three in P. t. theca, nine in P. t. fukia, three in P. nascens and four in P. mencia) and the haplotype network was constructed and presented in Fig. 3. There was no shared haplotype among the four taxa. Haplotypes of the same taxon differed from each other by no more than five mutation distance. The five mutation distance existed between the haplotype Ptt I and Ptt III of P. t. theca. The potential ancestral haplotype of P. t. fukia, defined by its central position in the network, was designated as Ptf I, which was found in three samples from West Tianmu Mountains, one from Jinggang Mountains, and one from Lingui. Ptf II was the most common haplotype in P. t. fukia and shared with ten samples. Haplotype Ptf III was found in two samples from Wuyishan. Haplotype Ptf XIII was identified in two sampled from Maoershan and one sampled from Anjiangpin. The remaining haplotypes of P. t. fukia occurred in only one individual. At least 18 nucleotide substitutions were observed between the potential ancestral haplotypes of P. t. theca (Ptt I) and that of P. t. fukia (Ptf I). In addition, the haplotype (Pn I) of P. nascens differed from Ptt I and Ptf I by 29 and 25 nucleotide substitutions while the haplotype (Pm I) of P. mencia differed from Ptt I and Ptf I by 26 and 23 nucleotide substitutions, respectively (Figs 2 and 3). The data set of nuclear wingless contains 390 nucleotide positions without gaps or stop codons, of which 18 positions are variable and 9 are parsimony informative. In total, ten haplotypes were found in all samples, in which two haplotypes were found in P. t. theca, four in P. t. fukia, three in P. nascens and one in P. mencia. All the haplotypes were used for network construction with the software Network 4.5 using the median-joining method (Fig. 3). Haplotypes of the same taxon differed from each other by no more than two mutation distance, in particular, the two haplotypes in 13 samples of P. t. theca differed by only one-mutation distance. Five nucleotide substitutions were observed between the potential ancestral haplotypes of P. t. theca (Ptt I) and that of P. t. fukia (Ptf I). In addition, the haplotype (Pn I) of P. nascens differed from Ptt I and Ptf I by nine and five nucleotide substitutions while the haplotype (Pm I) of P. mencia differed from Ptt I and Ptf I by two and seven nucleotide substitutions, respectively (Figs 2 and 3).  Table 1. The order of mutations (which were scored relative to haplotype Ptt I) on a branch is arbitrary.
Scientific RepoRts | 6:20820 | DOI: 10.1038/srep20820 Overall, P. t. theca had a lower diversity than P. t. fukia according to the result of analysis of both mitochondrial COI and nuclear wingless. The haplotype diversity (Hd) and nucleotide diversity (π ) for P. t. theca and P. t. fukia are given in Table 2. Additionally, they differed from each other by 5.07 ± 0.49% (4.3-5.9% divergence) for the COI sequences and by 1.70 ± 0.27% (1.3-2.1% divergence) for the wingless sequences.  The submodel GTR + I + G was selected for the COI gene sequence data; (B) The submodel GTR + G was selected for the wingless sequence data.  Phylogenetic and population structure analysis. Phylogenetic analyses were performed using ML method. For the COI gene sequence data, the submodel GTR + I + G was selected. For the wingless sequence data, the submodel GTR + G was selected. Figure 4A shows the ML tree based on the data set of COI. On the phylogeny, the haplotypes of P. theca were split into two discrete clades. One clade with three haplotypes consisted exclusively of the P. t. theca (strongly supported by the bootstrap probability of 99%), whereas the other clade with nine haplotypes exclusively consisted of the P. t. fukia (strongly supported with bootstrap probability of 99%). Figure 4B shows the ML tree based on the data set of wingless. Three haplotypes, constituted a monophyletic group, exclusively containing P. t. fukia butterflies whereas two haplotypes of the P. t. theca, P. pellucida Murray, 1875, P. zina Evans, 1932 and P. mencia Moore, 1877 constituted a strongly supported clade (89%). On the other hand, two P. t. theca haplotypes formed a strongly supported clade (86%). Mitochondrial COI and nuclear wingless strongly supports the distinction between P. t. theca and P. t. fukia. The AMOVA for the COI sequences of P. t. theca and P. t. fukia revealed that 88.53% of the genetic variation was among populations and 11.47% was within populations ( Table 3). The average Φ ST value as 0.896 (p < 0.01), suggesting significant genetic variation among the populations. Pairwise estimates of F ST (0.885) and gene flow (Nm = 0.065) between P. t. theca and P. t. fukia suggests that the subspecies in this species are highly differentiated.

Sum of squares
Demographic inference and estimation of divergence times. Demographic history changes were analyzed for P. t. theca and P. t. fukia populations through neutrality tests and mismatch distribution. The neutrality tests reveal that the Mitochondrial COI appear to be not evolving neutrally as Fu's F values in P. t. fukia group are negative significantly ( Table 2). The Tajima's D and Fu's F values were non-significantly positive in Scientific RepoRts | 6:20820 | DOI: 10.1038/srep20820 P. t. theca group and all P. theca samples group. The mismatch analysis yielded a unimodal distribution of pairwise differences for P. t. fukia (Fig. 5B) compared to the multimodal distribution of P. t. theca samples (Fig. 5A) and the pooled samples (Fig. 5C). According to Rogers and Harpending 31 , the observed curves with unimodal representing population expansion and the observed curves with many peaks or resemblance to expected curves mean equilibrium population, which further elucidating the demographic history of P. theca. The results suggest that population expansion in P. t. fukia and population equilibrium in P. t. theca. Divergence time analysis with an uncorrelated lognormal relaxed clock run in BEAST produced a tree with a topology similar to ML tree (Fig. 6). P. t. theca diverged from P. t. fukia around 0.81 (HPD = 0.53-1.28) million years ago (Mya) during the Pleistocene (node a in Fig. 6). P. theca diverged from other congeners included in the analysis about 1.36 (HPD = 1.02-1.53) Mya during the Pleistocene eras (node b in Fig. 6).

Discussion
There is a small region of overlap in central Sichuan province in the distribution of P. t. theca and P. t. fukia, but otherwise they are not sympatric (Fig. 1). P. t. theca inhabits the higher elevations of west-central Sichuang Province and the Qinling Mountains 16 . P. t. fukia occurs in the whole southeastern of China, from northern Zhejiang Province, Jiangxi Province to east-central Yunnan Province 9,32,33 . According to the description on morphological variation between P. t. theca and P. t. fukia 12 , we found a different morphological feature existing in female genitalia except for the different color and spot number in some part of wings (Table 4). The ductus bursae of female P. t. theca is thinner than that of female P. t. fukia. We suspected thetaxonomic status of the subspecies from their geographic separation and the morphological variation.
Mitochondrial haplotypes sampled from P. theca form well supported clades that closely correspond with subspecific boundaries delimited primarily on the basis of wing color and pattern (Fig. 4A). The haplotype clades associated with both subspecies are deeply genetically divergent, differing from each other by 5.07 ± 0.49% (4.3-5.9% divergence). This degree of divergence suggests that evolutionary separation of both subspecies occurred about 0.81 (HPD = 0.53-1.28) Mya, likely sometime during the Pleistocene based on a molecular clock calibration of 3.54% pairwise divergence per million years for a homologous mtDNA fragment in other insect species 28 . (fossil data and geographical calibration event are not available for this group). It is noteworthy for the nuclear wingless sequences that P. t. theca and P. mencia are considered distinct species with a genetic divergence of 0.65 ± 0.15% while the P. t. fukia is currently considered a subspecies of the P. t. theca despite 1.70 ± 0.27%   sequence divergence. The phylogenetic of wingless gene indicates that the P. theca is paraphyly with three species sister to the clade of P. t. theca (Fig. 4B). While 100% of P. t. fukia constitutes one separate clade, the clade consisting of P. t. theca also includes P. pellucid, P. zina and P. mencia. The clade consisting of P. t. theca is not monophyletic, but complex. This suggests that P. t. theca and P. t. fukia differ from each other, as evident from the COI tree where they form two separate clades (Fig. 4A). Concordance between strongly differentiated mtDNA, nuclear haplotype clades and phenotypic variation supports the hypothesis that both subspecies of P. theca deserves recognition at the species-level under the general lineage concept of species 4,34 . Two neutrality tests were chosen for the demographic history analysis, and significantly negative values of neutrality statistics can be indicative of background selection, but are also consistent with either population subdivision or expansion. Fu's Fs is significantly negative in P. t. fukia group reveal that the mtDNA of P. t. fukia appear to be not evolving neutrally. Additionally, we analysed the population size change of P. t. theca, P. t. fukia and all P. theca, respectively by the software DnaSP4.90 23 and got no evidence for population expansion in P. t. theca group 31 (Fig. 5A). However, we got unimodal curves representing population expansion in P. t. fukia 31 (Fig. 5B). The results suggest that population expansion in P. t. fukia. We still confirm the result of the population size change in haplotype network (Fig. 3). Statistical parsimony network reflects genealogical relationships of the mtDNA haplotypes, that is, single mutation steps separate adjacent haplotypes in the network and older haplotypes are placed at internal branching points whereas younger ones occur toward the tip positions 35 . The haplotypes network of P. t. fukia displays a star-like pattern (Fig. 3). Haplotype I, the second most common and geographically widespread in central-west of China, is in the star's centre and derivatives are connected to it by short branches. Based on coalescence theory, the star-like topologies for this cluster strongly suggest the effect of a population expansion 36 . In our study, a higher F ST value indicated a lower level of gene flow (Nm) and higher genetic differentiation among populations. The results of two-level AMOVA show that significant genetic variation exists among the examined populations. These results provide a second line of support to a conclusion that the P. t. fukia is a different species.
In conclusion, our investigations and analyses revealed significant molecular and biogeographical differences between P. t. theca and P. t. fukia. We propose that P. t. fukia should be treated as a distinct species called Polytremis fukia Evans 1940 under the Phylogenetic Species Concept. In fact, it has been recently found that other species previously considered subspecies based on morphology are in fact sibling species that passed unnoticed until the advent of molecular techniques [37][38][39][40] . Results from our study strengthen information about the Polytremis species complex and help in developing appropriate integrated pest management strategies for these insect pests.