Population structure and genetic differentiation of tea green leafhopper, Empoasca (Matsumurasca) onukii, in China based on microsatellite markers

The tea green leafhopper, Empoasca (Matsumurasca) onukii Matsuda, is one of the dominant pests in major tea production regions of East Asia. Recent morphological studies have revealed variation in the male genitalic structures within and among populations. However, the genetic structure of this pest remains poorly understood. This study explores the genetic diversity and population structure of this pest in nineteen populations from the four main Chinese tea production areas using microsatellite markers, with one Japanese population also examined. The results show low to moderate levels of genetic differentiation with populations grouped into four clusters, i.e. the Jiangbei group, the Southwest group 1, the Southwest group 2 and the South China group. Populations from China have a close phylogenetic relationship but show significant isolation by distance. Lower genetic diversity and genetic differentiation of E. (M.) onukii were found in the Kagoshima population of Japan. Evidence for genetic bottlenecks was detected in the South China and Jiangnan populations. Population expansion was found in the Southwest, Jiangbei and Kagoshima populations. This is the most extensive study of the population genetics of this species and contributes to our understanding of its origin and evolutionary history.

The number of alleles for the 18 microsatellites ranged from 2 to 18 over the entire population. Fifteen markers (all except Eo-42, Eo-1-57 and Eo-E-12) were highly polymorphic, with polymorphic information content (PIC) values ranging from 0.515 to 0.892 39 . At the population level, the mean N a and AR ranged from 8.4 (PE, YT and JJ) to 10.4 (CY) and from 6.52 (JJ) to 8.27 (CY), respectively. The mean number of N e per marker was 5.2, with the lowest value being 4.2 (JJ) and the highest being 5.9 (SX and CY). Only 69 individuals had one or more private alleles across all markers and they were spread across 17 populations. H O had a mean value of 0.679 and was lowest in the ZY population (0.615) and highest in the RZ population (0.737). The mean H E was 0.741, ranging from 0.665 (JJ) to 0.788 (CY) ( Genetic structure. The Bayesian analysis of population structure indicated that 19 populations from China represent four main genetic clusters (K = 4) (Fig. 1). Although ΔK had peaks in K = 2 and K = 4, α was relatively constant at K = 4. Furthermore, individuals were strongly assigned to K = 4. The CX population and PE population remain as a fixed cluster. The numbers of individuals from the CT population assigned to clusters are similar to those of the ZY and CY populations. The XY, RZ, TA and SX populations (all belonging to the Jiangbei tea production area) were assigned to each cluster similarly. The populations from Jiangnan and South China did not show strong population genetic structure. Analysis of molecular variance (AMOVA) indicated that 2.43% of the genetic variation was partitioned among groups (P < 0.001) and 2.14% among populations (P < 0.001) ( Table 2). The majority of the genetic variance originated from variation among individuals within populations (P < 0.001). The F ST ranged from 0.0012 to 0.0981    Table S1). The mean F ST values show moderate differentiation between Southwest group 2 and the other groups. The NJT (Fig. 2) based on Nei's genetic distance clustered the Chinese populations into 4 major groups, consistent with the results of both the Bayesian analysis and AMOVA. The populations from the Southeast tea production area were divided into two groups. The populations from the Jiangnan and South China tea production areas were clustered into one group. Nei's genetic distances between Chinese populations and the Kagoshima population were 0.119-0.255 (Supplementary Table S2).
Similar results were obtained in PCoA for these populations, as shown in Fig. 3. The mean factor scores for the 20 populations are shown in the first two principal component axes, which account for 33.54% and 14.76% of the total variance. The CX population is clearly distant from the PE population. This analysis shows conspicuous divergence of populations in the Jiangnan and South China tea production areas from the other populations.

Isolation by distance. Mantel tests of genetic and geographical distances over all populations revealed that
there was a significant correlation between these two variables (R XY = 0.549, R2 = 0.301, P = 0.000, Supplementary  Fig. S1a). However, it was not significant when only the Jiangbei populations (R XY = 0.495, R 2 = 0.245, P = 0.318) and Southwest populations (R XY = 0.609, R 2 = 0.371, P = 0.062) were analyzed. There was a significant correlation when the South China group (including Jiangnan and South China populations) (R XY = 0.539, R 2 = 0.291, P = 0.0001, Supplementary Fig. S1b) was analyzed.
Bottleneck analysis. Significant

Discussion
Our analysis of microsatellite data reveals that genetic differentiation of E. (M.) onukii within a particular region is mainly reflected in the differences among individuals within a population. This is consistent with previous results from AMOVA based on mtDNA COI and 16S rRNA [21][22][23][24][25] . Analysis of microsatellite markers detected low to moderate levels of genetic differentiation among populations within the main tea growing regions of China, and much more genetic variation within populations (95.43%) than among populations (2.18%). This suggests substantial amounts of gene flow and a more homogeneous gene pool across different geographical populations. Measured genetic and geographical distances are correlated. Within a population, genetic mutation and drift are expected to yield genetic differentiation among individuals, whereas geographic isolation (such as geographic distance and geographic barriers) mainly influences genetic differentiation between populations.
The low to moderate levels of genetic differentiation and subdivision are concordant with geographic distribution. The genetic structure based on 18 microsatellite markers confirms the genetic difference between Southwest groups 1 and 2 in the topographically complex Southwest tea production area, similar to the result obtained previously 27 . Furthermore, different allelic richness and mean expected heterozygosity were observed in the PE and CY populations of the Southwest tea production area (Tables 1 and 2). There was moderate differentiation between populations of Southwest group 2 and the other populations. Genetic diversity differs significantly between the CY and CT populations and other populations, which are isolated by the Sichuan Basin (AR: t-test: t = −2.944, d.f. = 17. P = 0.009; H E : t-test: t = −2.582, d.f. = 17. P = 0.019). These subdivisions in the Southwest tea production area populations are probably due to different combinations of geographic isolation and climatic variation. Substantial climatic differences between the Southwest tea production area and the other tea production areas and the existence of geographic barriers in this area, such as mountain ranges, basins and large rivers, may account for the genetic differentiation among E. (M.) onukii populations that have limited dispersal capacity 22,23 .
Fossil evidence suggests that tea trees originated 60-70 Mya (million years ago) in the Paleogene of the Yunnan-Guizhou plateau 41 . Terpene index analysis has shown that the original tea cultivars are from Yunnan Province 42 . This Southwest tea production area was previously suggested as the area in which tea was first cultivated. The "original" haplotypes of 16S rRNA of E. (M.) onukii (based on a median joining haplotype network) were obtained from Yunnan and Sichuan populations 22 , suggesting that the tea green leafhopper expanded its range from the Southwest tea production area into other areas, following the spread of tea cultivation. Based on this evidence, as well as the genetic diversity and structure inferred in this study, we propose the following scenario for the spread of E. (M.) onukii populations in Chinese tea production areas: (1) the tea green leafhopper spread from Yunnan to Guangxi and Guangdong, and subsequently to Jiangnan and the South China tea production areas; (2) this pest then expanded to the Jiangbei tea production area from Yunnan through Shaanxi; and  46,47 . In these cases, pesticides may have stimulated the reproduction and expansion of pest populations, promoting gene exchange and homogenizing the genetic structure 48 . Integrative pest management incorporating biocontrol or organic production methods may be needed to build ecologically sustainable tea plantations in these areas 49 Microsatellite markers revealed lower genetic differentiation between Sichuan and Yunnan populations than was shown in a previous study on this same species in the same region 27 . This may be because the present study used samples obtained from multiple localities in Sichuan. To provide more robust results, future studies should employ larger numbers of genetic markers and more individuals within the same locations to more fully represent the diversity of local E. (M.) onukii populations. The area of origin of E. (M.) onukii should be further explored by more intensive sampling of populations in the Southwest tea production area, where the level of genetic differentiation is highest. The dispersal routes of E. (M.) onukii among Chinese tea production areas should also be analyzed in more detail using more genetic loci (such as SNPs). Use of SNPs that have a high abundance throughout the genome, predictable mutation modes and low back mutation rates 51,52 could reduce error in estimation of population parameters (such as F ST and migration rates) compared to our use of microsatellite markers.
This study found low to moderate levels of genetic differentiation and segregation in E. (M.) onukii populations among the main Chinese tea production areas. Populations of this pest have differentiated into four groups: the Jiangbei group, the Southwest group 1, the Southwest group 2 and the South China group. Low genetic diversity and moderate genetic differentiation of E. (M.) onukii were found in the Kagoshima population of Japan. The observed patterns may be attributable to geographic isolation, differences in feeding preference among tea germplasms and the use of pesticides in tea production areas. Additional study is needed to elucidate the relationship between selective factors (such as tea germplasm resources and geographical conditions) and genetic structure.

Materials and Methods
Sampling. E. (M.) onukii were collected by sweep net from 19 localities in four tea production areas of China and one location in Japan, representing a total of 20 geographically delimited populations ( Fig. 4 and Table 3). In order to reduce the likelihood of sample contamination from non-tea plants, specimens obtained in each tea garden were collected from five different sites (approximately five square meters per site) selected at random near the middle of each plantation. The net was swept across the top of the tea plants at each site and the collected specimens were placed in five separate vials. All specimens were preserved in absolute alcohol at −20 °C until they were identified and used for genotyping. The rest of the specimens are now deposited in the Entomological Museum, Northwest A&F University, Yangling, China (NWAFU).
Only male specimens representing each population were selected for this study because of the inability to identify females using morphological characters 1 . All were identified in advance in the laboratory by the first author based on morphological characters diagnostic for the species 1 .
DNA extraction and microsatellite genotyping. Total genomic DNA was extracted respectively from 30 male individuals per population (except for genital segments used for species identification) using the DNA Easy Blood and Tissue Kit (Qiagen, Hilden, Germany) or a modified CTAB method 53 . DNA concentration was measured using an ND-1000 spectrophotometer (Bio-Rad, Hercules, CA, USA). Eighteen microsatellite markers (Supplementary Table S3) out of the 21 originally characterized by Zhang et al. 27 were labeled with a fluorescent dye (HEX, TAMRA or FAM). PCR protocol was performed as described by Zhang et al. 27 . PCR products were run by automated capillary electrophoresis on an ABI 3130xl (Applied Biosystems, Foster, CA, USA) genetic analyzer. Allele sizes were scored against a GeneScan TM 500LIZ standard using GeneMapper v4.0 (Applied Biosystems, Foster, CA, USA).  Table 3; the pie chart in each population represents the proportion of individuals from four clusters inferred by Structure analysis. SimpleMappr was used to produce a distribution map based on the geographical coordinates in Table 3

Data Availability
All data generated or analysed during current study are available within the published article (and its Supplementary Information files).