Introduction

Animal migration is one of the most remarkable phenomena in nature; animals migrate in pursuit of improved safety, increased foraging opportunities, and higher reproductive output1,2. Birds and whales may be the animals most commonly associated with long-distance migration3, yet some dragonflies are just as capable of long-distance travel. Insects in the order Odonata (phylum Arthropoda, class Insecta, subclass Pterygota) are the most ancient invertebrates capable of flight and are very diverse, with approximately 6000 species worldwide4,5,6. Mass migration of dragonflies has long been known, and there are about 25–50 migratory species6,7. Pantala flavescens is well suited for phylogeographic studies because of its extensive migratory range, which spans mountain ranges, continents and oceans.

Pantala flavescens (Fabricius, 1798) may be the most common dragonfly in the world because it is highly adaptable to different habitats and capable of long-distance dispersal. It occurs in temperate and tropical regions, from lowland to montane, typically in coastal and open areas that span thousands of kilometers8,9,10,11. Isotopic evidence also suggests that its multigenerational journey may total over 18,000 km, with single individuals traveling over 6,000 km during the transoceanic trek from northern India to east Africa12.

Such migratory behavior of odonate insects could homogenize genetic differentiation among populations by the exchange of individuals and genes among populations that are separated by large distances and thus impact the population structure of that species. For example, haplotype diversity in populations of Anax junius (Drury 1773), the common green darner, is relatively high in the absence of any obvious phylogeographic pattern13. Libellula quadrimaculata (Linnaeus, 1758), the four-spotted chaser or skimmer, showed high haplotype interconnection among samples collected within Asia, Europe and North America14.

Previous research on P. flavescens using randomly amplified polymorphic DNA has also shown that genetic diversity is low and gene flow is high among five geographically isolated populations within India15. Sequence data for cytochrome oxidase 1 (CO1) indicate that high rates of gene flow are present among all studied geographic regions and that genes are shared among individuals across the globe3. In addition, no significant genetic differentiation was found among Malaysian populations of P. flavescens, consistent with a high level of gene flow; mismatch distribution and neutrality tests also provided evidence of demographic expansion during the Pleistocene (190,000–260,000 years ago)11.

A less-mobile species may be expected to show some evidence of haplotype clustering according to geographic region. Analysis of a partial mitochondrial CO1 gene sequence of the tiny dragonfly Nannophya pygmaea showed overall low genetic diversity among 68 individuals collected from six habitats in Korea16. Although these geographic populations of N. pygmaea clustered into two groups, genetic isolation by distance was not detected16.

Microsatellites are the most popular genetic marker owing to their high abundance, easily typed locus-specific codominance and Mendelian inheritance. The mitochondrial cytochrome b gene (Cytb) has often been used as a marker for studies of evolution and population structure and to resolve taxonomic conflicts in many animal groups17. In the present study, we used partial sequences of Cytb and microsatellite markers to analyze the population structure, genetic divergence and demographic history among individuals of P. flavescens collected during an intensive sampling in the eastern monsoon region of China. Such investigations can improve our knowledge of the migratory behavior of P. flavescens in China and may also contribute to investigations of the evolution of migration. Additionally, the sampling and analytical methods used here may provide a potential model to study the migration of other insects.

Results

Mitochondrial DNA analysis

From 583 individuals of P. flavescens, 542 sequences of a 477-bp fragment of the mtDNA Cytb gene were obtained. Of the 477 aligned sites, 430 were conserved and 47 were polymorphic, comprising 12 singleton sites (25.5%) and 35 parsimony-informative sites (74.5%). Genetic diversity and the distribution of haplotypes among different populations of P. flavescens based on the Cytb sequences are shown in Table 1. The number of haplotypes (NH) in each population ranged from 7 to 17 (mean 12). CS (see Supplementary Table 1 for code definitions and locations of each population) had the most haplotypes (NH = 17) and PL had the fewest (NH = 7). Among the 77 haplotypes, haplotype H6 was the most frequent and widely distributed, being shared by 206 samples among 19 populations. The second-most frequent haplotype was H1, shared by 145 individuals among 19 populations. Thus, H6 and H1 were the primary haplotypes and were shared by individuals from different populations. Haplotype diversity (HD) ranged from 0.662 (population PL) to 0.926 (HZ) (mean = 0.810). Nucleotide diversity (π) ranged from 0.003 (PL) to 0.009 (CS and QF) (mean = 0.006). The number of transitions was 311 and transversions numbered 48. Overall, all 19 populations had high haplotype diversity and low nucleotide diversity (Table 1). In a median-joining network, these haplotypes clustered into three branches but showed no obvious geographical pattern (Supplementary Fig. 1).

Table 1 Distribution of haplotypes and molecular diversity based on Cytb sequence from 542 individuals from among 19 populations of Pantala flavescens in China.

Population genetic structure and Bayesian clustering based on mitochondrial data

Pairwise Fst values ranged from −0.003 to 0.090, with the highest differentiation observed between CC and LF and the lowest between HEB and QF (Supplementary Table S2). Sixty-six population pairs showed an intermediate level of differentiation (0.05 < Fst < 0.15), and the other 105 pairs showed no differentiation (Fst < 0.05). The Fst matrix based on mitochondrial genes showed that 133 of the 171 population pairs had no significant differentiation (Supplementary Table S2).

The AMOVA of the mtDNA data revealed that 1.91% of the genetic variation was among populations within groups, whereas 98.07% came from variation within populations. Results of the AMOVA test based on mtDNA markers in different populations of P. flavescens are shown in Table 2.

Table 2 Results of analysis of molecular variance (AMOVA) test on microsatellite and Cytb markers in different populations of Pantala flavescens in China.

When the 19 populations were regarded as a whole, Tajima’s D and Fu’s FS were negative but not significant (P > 0.02) (Table 1). At the same time, the bimodal mismatch distribution (Fig. 1) and the three phyletic clusters of the mtDNA haplotype network (Supplementary Fig. S1) also indicated no demographic expansion. For most populations in the eastern monsoon regions of China, Tajima’s D and Fu’s FS were negative (Table 1) but not significant (P > 0.02), indicating that these populations were in a stable state, with no recent bottleneck or rapid population expansion. However, for the other populations (Table 1), Tajima’s D and Fu’s FS were negative and significant (P < 0.02), indicating that these populations had experienced a recent population expansion.

Figure 1

Frequencies of the observed and expected pairwise differences (the mismatch distribution) in the samples of Pantala flavescens from 19 populations in China.

Microsatellite analysis

In a microsatellite analysis of 542 DNA samples from 19 populations using 10 microsatellite markers, Fisher’s test indicated that 142 of 190 locus–population combinations deviated significantly from Hardy–Weinberg equilibrium (HWE). According to a MICRO-CHECKER analysis, the frequency of null alleles was very low, mostly less than 0.01 across the 10 microsatellite markers; therefore, the presence of null alleles had almost no influence on the analysis of Fst. The genetic parameters of the 19 populations based on the 10 microsatellite markers are summarized in Table 3. Across all populations, allelic richness (Ar) ranged from 5.33 in population ZZ to 6.95 in CF (mean 5.94). In total, 140 alleles were obtained, and the mean number of alleles (Na) across microsatellite loci ranged from 6 in XX to 8.9 in HZ (mean 7.35). The mean effective number of alleles (Ne) across microsatellite markers was 3.06, ranging from 3.03 in HEB to 4.41 in GL. The mean value of Shannon’s index (I) across microsatellite markers was 1.46, ranging from 1.34 in GZ to 1.66 in HZ. The observed heterozygosity (Ho) ranged from 0.36 in QF to 0.55 in JX (mean 0.48) and expected heterozygosity (He) ranged from 0.64 in HEB to 0.75 in HZ (mean 0.68). The unbiased expected heterozygosity (uHE) ranged from 0.65 in HEB to 0.77 in CF (mean 0.69). Overall, there was a high level of genetic diversity for all microsatellite loci in the study regions.

Table 3 Genetic diversity indices of 10 microsatellite markers in 19 populations of Pantala flavescens in China (see Table S1 for code definitions and locations of each population).

Population genetic structure and Bayesian clustering based on microsatellite data

The Fst matrix based on microsatellite data showed that 101 of the 171 population pairs had significant differentiation (Supplementary Table S3). Pairwise Fst values based on microsatellite data ranged from 0.001 to 0.09, with the highest differentiation between LF and CC and the lowest between GY and GL (Supplementary Table S3). There was intermediate differentiation between 65 population pairs (0.05 < Fst < 0.15), and the other 106 population pairs had no differentiation (Fst < 0.05). The Bayesian clustering analysis revealed the presence of three distinct clusters, and each population was a mixture including individuals from three clusters (Supplementary Fig. 2, Fig. 2). Figure 2 revealed allelic similarities among these populations and showed differences in the frequencies of common alleles among them.

Figure 2

Bayesian clustering analysis of 19 populations of P. flavescens. Population codes are given in Table S1. Each individual is represented by a vertical bar displaying membership coefficients for each genetic cluster. Blue, green and red represent the three clades.

The AMOVA of the microsatellite data revealed that 4.18% of the genetic variation was among populations within groups, whereas 95.68% was variation within populations.

A high assignment proportion was obtained with the partial Bayesian method in GENECLASS: 95.7% (519/542) of the individuals were assigned to the 19 populations. The remaining 23 individuals might be migrants from other areas to the sampling regions. GENECLASS identified all of these putative migrants as offspring of migrants; none were first-generation migrants.

Mantel test for isolation by distance

Our observation showed no significant correlation between genetic distance and geographic distance of these populations based on the two types of markers (microsatellite genotypes: Z = 8299.1787, r = 0.0854, P = 0.804, and Cytb: Z = 2863.798, r = −0.1002, P = 0.147), which may be reflective of a lack of phylogenetic divergence among the individuals across the study areas. This result also suggested migratory dragonflies could colonize and exchange genes with local populations.

Discussion

Our results showed that high rates of gene flow occurred among the 19 geographic populations. Natural dispersal ability and long-distance migration are the most important factors contributing to a higher level of gene flow and consequent slowing or limitation of geographic differentiation11,18. In our current study, Pantala flavescens had high haplotype interconnection among 19 populations in the eastern monsoon regions of China and a lack of phylogeographic structuring. Large-scale migrations of P. flavescens and the lack of physical barriers to gene flow resulted in high rates of gene flow and lower genetic diversity.

The present study revealed deviations from Hardy–Weinberg equilibrium at 10 microsatellite markers in 19 populations; these deviations were due to low heterozygosity, as further confirmed by the MICRO-CHECKER analysis. The ability of dragonflies to migrate long distances might be the major factor causing the deviation from Hardy–Weinberg equilibrium among the studied populations. This result is consistent with previous reports19. Wright (1978) suggested that if Fst = 0, the two populations lack differentiation; when Fst = 0.05–0.15, the populations are moderately differentiated; and when Fst = 0.15–0.25, the populations are highly differentiated20. In our experiment, different populations had low to moderate genetic differentiation based on Fst values: from −0.003 to 0.09 for mtDNA (Supplementary Table S2) and from 0.001 to 0.09 for microsatellites (Supplementary Table S3). For both the microsatellite (SSR) and mtDNA markers, most of the variation was within populations (>95%), whereas variation among groups and among populations within groups was less than 5% (Table 2). Compared with mitochondrial genes, microsatellite DNA might reveal more information on variation (Supplementary Tables S2 and S3), gene flow, bottlenecks and population divergence. P. flavescens in the eastern monsoon regions of China was classified into three genetic clusters, which was supported by the AMOVA result (Table 2) and STRUCTURE analysis (Supplementary Fig. 2, Fig. 2).
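
To illustrate how the Fst thresholds partition pairwise comparisons, the following minimal Python sketch classifies a single Fst value and counts the unordered pairs among 19 populations. The function name classify_fst is hypothetical. Treating values below 0.05 as "little or none" (the cutoff applied in the Results) and values above 0.25 as "very high" are assumptions made for this sketch; it is not the software used in the study.

```python
from math import comb


def classify_fst(fst):
    """Label a pairwise Fst value using Wright-style thresholds (assumed bins)."""
    if fst < 0.05:
        return "little or none"
    if fst < 0.15:
        return "moderate"
    if fst < 0.25:
        return "high"
    return "very high"


# Number of unordered population pairs among 19 populations
n_pairs = comb(19, 2)
print(n_pairs)

# The extreme pairwise values reported for the Cytb data
for value in (-0.003, 0.090):
    print(value, classify_fst(value))
```

With these bins, both reported extremes fall in the two lowest categories, which matches the description of low to moderate differentiation.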

The median-joining network revealed a close relationship among haplotypes, suggesting that P. flavescens populations share a recent history without long-term genetic isolation. These ancestral haplotypes (H6 and H1) of mtDNA were widely distributed in all populations, and the haplotypes in 19 populations formed three clusters, but had no obvious geographic divisions, which indicates that geographic barriers and climatic factors have little influence on migration of this dragonfly in different regions.

Estimates of Nm are often taken at face value as the approximate number of migrants moving among populations21. In our experiment, different populations had a high number of individuals dispersed (Nm = 4.326), thus avoiding the genetic differentiation that arises from genetic drift and explaining the low inter-subpopulation genetic variation. Our results showed a high rate of gene flow and lack of population differentiation among 19 studied populations. At the same time, genetic differentiation and geographic distance were not correlated, so isolation by distance does not appear to be a barrier for gene flow.
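
The estimator used to obtain Nm is not restated here. One common approximation, sketched below purely for illustration, is Wright's island model, in which Fst ≈ 1/(4Nm + 1) for diploid nuclear markers, so that Nm ≈ (1 − Fst)/(4Fst). The function names are hypothetical, and the relationship assumes an idealized island model at equilibrium, which may not hold for a migratory species.

```python
def nm_from_fst(fst):
    """Migrants per generation under Wright's island model (diploid nuclear loci)."""
    if not 0 < fst < 1:
        raise ValueError("fst must lie strictly between 0 and 1")
    return (1 - fst) / (4 * fst)


def fst_from_nm(nm):
    """Inverse relationship: expected Fst for a given Nm."""
    return 1 / (4 * nm + 1)


print(round(fst_from_nm(4.326), 4))  # Fst implied by Nm = 4.326
print(round(nm_from_fst(0.05), 2))   # Nm implied by Fst = 0.05
```

Under this approximation, Nm = 4.326 corresponds to an Fst of about 0.055, near the boundary between low and moderate differentiation.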

A less-mobile species may be expected to harbor some evidence of haplotype clustering according to geographic region. Nannophya pygmaea is very small and unlikely to migrate over large-scale regions, which likely contributes to its overall low diversity and genetic isolation by distance16. The endangered damselfly Coenagrion mercuriale (Charpentier, 1840) is a weak flier22, and significant genetic differentiation between sites can be prevented if sites are <2 km apart and not separated by a physical barrier. Nevertheless, dispersal by C. mercuriale is sufficiently restricted that genetic structure resulting from isolation by distance develops within 10 km22.

Our results suggested that P. flavescens could successfully colonize and adapt to new habitats. Individuals were able to disperse randomly and exchange genes with local populations, which led to high rates of gene flow among the 19 geographic populations. However, the migration of P. flavescens on a global scale and the potential ecological impacts of their migratory behavior remain unknown and need further research.

Materials and Methods

Insect materials

In total, 583 individuals of P. flavescens were sampled from 19 geographic sites in 18 provinces in China from June 2013 through October 2014 (Fig. 3; Supplementary Table S1). The samples were collected with a sweep net and stored at −20 °C.

Figure 3

Locations of the 19 sampling sites of Pantala flavescens in the eastern monsoon region of China during 2013 and 2014. Population codes are given in Table S1.

DNA extraction

Genomic DNA was extracted from the thorax of individual adults using the TIANamp Genomic DNA Kit (Tiangen Biotech, Beijing), and preserved at −20 °C in the refrigerator.

Mitochondrial DNA sequencing and analysis

Partial regions of the mitochondrial gene Cytb were amplified using published primers23. Each PCR amplification was performed in a 30 μL reaction consisting of 15 μL of 2× Taq Master Mix solution (CoWin Biotech Co., Beijing), 1 μL of DNA template, 1 μL of each primer (forward and reverse, both diluted 10×), and 12 μL of RNase-free water. The PCR reactions were performed in a Techne thermocycler (Germany) with the following program: 5 min initial denaturation step at 94 °C; 35 cycles of 45 s at 94 °C, 45 s at 55 °C, 1 min at 72 °C; and a final extension of 5 min at 72 °C. Amplification products of Cytb were sequenced on an ABI 3730 DNA Sequencer (Applied Biosystems, USA). Mitochondrial DNA sequences were manually checked and aligned with ClustalX 1.8524, using the multiple alignment default parameters. Nucleotide composition, parsimony informative sites, variable sites and conserved sites were calculated with MEGA 625. Molecular diversity indices such as nucleotide diversity and haplotype diversity were analyzed in DnaSP 4.026. Genetic differentiation (fixation index Fst) between populations was calculated from the mtDNA data in Arlequin 3.0 with 10000 permutations27. Analysis of molecular variance (AMOVA) based on mtDNA data, as implemented in Arlequin 3.027, was used to test for hierarchical genetic structure of the populations.
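
For readers unfamiliar with the two diversity indices, the following Python sketch shows how haplotype diversity and nucleotide diversity can be computed from an alignment. It assumes Nei's unbiased haplotype diversity, n/(n − 1) × (1 − Σp²), and nucleotide diversity as the mean proportion of differing sites over all sequence pairs, with no gaps or missing data. The function names and the toy alignment are hypothetical, and DnaSP may differ in details such as gap handling.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / (len(pairs) * length)


# Toy alignment (hypothetical, not the study's data): 4 sequences, 3 haplotypes
toy = ["ACGTAC", "ACGTAC", "ACGTTC", "ACCTTC"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(round(hd, 4), round(pi, 4))
```

A sample can combine high haplotype diversity with low nucleotide diversity when many haplotypes differ from one another at only a few sites.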

Network 2.028 software was used to construct a median-joining network. Demographic history was analyzed for P. flavescens using two neutrality tests, Tajima’s D (1989)29 and Fu’s Fs (1997)30, which can indicate a recent population bottleneck or population expansion. A mismatch distribution counts the number of site differences between each pair of sequences in a sample and presents the counts as a histogram. According to coalescent theory, a population usually shows a unimodal mismatch distribution following a demographic expansion31.
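
The two summaries described above can be sketched in a few lines of Python. The example below builds a mismatch distribution from pairwise site differences and computes Tajima's D with the standard 1989 formulas. It assumes an alignment without gaps or missing data, the function names and toy sequences are hypothetical, and it does not compute significance levels.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pairwise_differences(seqs):
    """Number of differing sites for every pair of aligned sequences."""
    return [sum(a != b for a, b in zip(s1, s2)) for s1, s2 in combinations(seqs, 2)]


def mismatch_distribution(seqs):
    """Histogram mapping number of differences to the fraction of sequence pairs."""
    diffs = pairwise_differences(seqs)
    counts = Counter(diffs)
    return {k: counts[k] / len(diffs) for k in sorted(counts)}


def tajimas_d(seqs):
    """Tajima's D (1989) for an alignment with no gaps or missing data."""
    n = len(seqs)
    # S: number of segregating (polymorphic) sites
    s = sum(len(set(column)) > 1 for column in zip(*seqs))
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")
    diffs = pairwise_differences(seqs)
    pi = sum(diffs) / len(diffs)  # mean number of pairwise differences
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy star-like sample (hypothetical): every variant is a singleton
star = ["AAAA", "TAAA", "ATAA", "AATA"]
print(mismatch_distribution(star))
print(tajimas_d(star) < 0)
```

An excess of singleton variants, as in the toy sample, pulls D below zero, which is the pattern usually read as a signal of expansion when it is statistically significant.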

Microsatellite genotyping and analysis

Total DNA was extracted from 583 individuals for PCR and genotyping for 10 microsatellite loci developed for P. flavescens19. MICRO-CHECKER 2.2.3 was used to detect genotyping errors in microsatellite data due to null alleles, stuttering, or allele dropout using 1000 randomizations32. Deviations from Hardy–Weinberg equilibrium (HWE) in all the loci of each population and linkage disequilibrium between pairs of loci were assessed using Genepop 3.433. To study the genetic diversity of different geographic populations, we used GenAlEx 6.4134 to calculate mean number of alleles (Na), effective number of alleles (Ne), Shannon’s index (I); expected and observed heterozygosity (He and Ho) and unbiased expected heterozygosity (uHE). Allelic richness (Ar), fixation index (Fst), and inbreeding coefficient (Fis) among these sites were analyzed in FSTAT 2.9.335. Genetic differentiation between all pairs of populations was calculated in Arlequin 3.027.
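
As a rough guide to what the frequency-based indices measure, the Python sketch below computes them for a single locus from diploid genotypes. It assumes the usual definitions: Ne = 1/Σp², I = −Σ p ln p, He = 1 − Σp², and uHe = 2N/(2N − 1) × He, where p are allele frequencies and N is the number of genotyped individuals. The function name and the toy genotypes are hypothetical, and missing data are not handled.

```python
from collections import Counter
from math import log


def locus_diversity(genotypes):
    """Diversity indices for one locus from a list of diploid (allele, allele) tuples."""
    n = len(genotypes)
    alleles = [a for g in genotypes for a in g]
    freqs = [c / len(alleles) for c in Counter(alleles).values()]
    homozygosity = sum(p * p for p in freqs)
    he = 1 - homozygosity
    return {
        "Na": len(freqs),                       # number of distinct alleles
        "Ne": 1 / homozygosity,                 # effective number of alleles
        "I": -sum(p * log(p) for p in freqs),   # Shannon's index
        "Ho": sum(a != b for a, b in genotypes) / n,  # observed heterozygosity
        "He": he,                               # expected heterozygosity
        "uHe": 2 * n / (2 * n - 1) * he,        # unbiased expected heterozygosity
    }


# Toy genotypes (hypothetical allele sizes in bp)
toy = [(100, 100), (100, 104), (104, 104), (100, 108)]
stats = locus_diversity(toy)
for name, value in stats.items():
    print(name, round(value, 4))
```

An observed heterozygosity well below the expected value, as reported for these populations, is the kind of gap that produces deviations from Hardy–Weinberg equilibrium.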

Analysis of molecular variance (AMOVA) based on microsatellite data, as implemented in Arlequin 3.027, was used to test for hierarchical genetic structure of the populations. We used STRUCTURE 2.3.336 and its nonspatial algorithm to further assess the degree of population differentiation within and between the 19 populations based on microsatellite data. The allele frequencies of the different populations and the admixture model were used to assign individuals to the corresponding population clusters. The simulation was run 7 times for each value of K, with 10^6 iterations after a burn-in period of 30,000. To determine the optimal number of groups (K), we used both the log likelihood [ln Pr(X|K)] method as recommended by Pritchard et al.37 and the ΔK statistic of Evanno et al.36. First-generation migrants in each population were detected using the L-home likelihood computation in GENECLASS 238.
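
The ΔK criterion can be illustrated with a short Python sketch. It assumes the commonly used form in which the absolute second-order difference of the mean ln Pr(X|K) across replicate runs is divided by the standard deviation of ln Pr(X|K) at that K. The function name and the likelihood values are hypothetical; they are not output from the runs described here.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Map each interior K to |L(K+1) - 2L(K) + L(K-1)| / sd(L(K)).

    lnp_by_k maps K to a list of ln Pr(X|K) values, one per replicate run.
    Delta K is undefined for the smallest and largest K tested, and it
    fails if all replicates at a given K are identical (sd of zero).
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in lnp_by_k and k + 1 in lnp_by_k:
            second_diff = abs(
                mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
            )
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical log likelihoods for three replicate runs per K
toy_runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4800.0, -4803.0, -4797.0],
    3: [-4500.0, -4501.0, -4499.0],
    4: [-4490.0, -4495.0, -4485.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

In the toy values the likelihood rises steeply up to K = 3 and then plateaus, so ΔK peaks at K = 3.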

Mantel test for isolation by distance

Correlation tests were conducted between the geographic distance and the corresponding genetic differentiation matrix of these populations in the Mantel test. The geographical and genetic distances should be positively correlated if the dispersal of P. flavescens is influenced by distance. To determine whether movement patterns were limited by spatial scale, we ran isolation by distance analysis between pairwise linearized genetic and log-geographic distance data using a Mantel test in IBDWS 3.2339.
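
The logic of a Mantel test can be sketched in plain Python as follows. The example correlates the upper triangles of two distance matrices and obtains a one-tailed P value by permuting the population labels of one matrix. The function names, the number of permutations, and the toy matrices are assumptions for illustration; any transformation (for example, linearizing Fst or log-transforming geographic distance) would be applied to the matrices beforehand, and IBDWS may differ in implementation details.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def mantel(genetic, geographic, permutations=999, seed=0):
    """One-tailed Mantel test; returns (r, p) where p is P(r_permuted >= r_observed)."""
    rng = random.Random(seed)
    n = len(genetic)
    geo = upper_triangle(geographic)
    r_obs = pearson(upper_triangle(genetic), geo)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns together so matrix structure is preserved
        permuted = [[genetic[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), geo) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Toy case (hypothetical): 4 populations along a line, genetic distance
# exactly proportional to geographic distance
positions = [0, 1, 2, 3]
geo_km = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.01 * d for d in row] for row in geo_km]
r, p = mantel(gen, geo_km)
print(round(r, 3), p)
```

Rows and columns are permuted together because the pairwise entries of a distance matrix are not independent, which is why an ordinary correlation test is not appropriate here.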