Genetic diversity in the endangered terrestrial orchid Cypripedium japonicum in East Asia: Insights into population history and implications for conservation

Little is known about levels and patterns of genetic diversity for the entire range of endangered orchids native to China, Korea, and Japan. In this study, we focus on Cypripedium japonicum and suggest three hypotheses: 1) that genetic drift has been a primary evolutionary force; 2) that populations in central and western China harbor higher levels of genetic variation relative to those from eastern China; and 3) that C. japonicum in China maintains the highest genetic variation among the three countries. Using ISSR and SCoT markers, we investigated genetic diversity in 17 populations to test the three hypotheses. As anticipated, we found low levels of genetic diversity at the species level with substantially high degree of genetic divergence, which can be mainly attributed to random genetic drift. Chinese populations harbor the highest within-population genetic variation, which tends to increase from east to west. We also found a close relationship between Korean populations and central/western Chinese populations. Historical rarity coupled with limited gene flow seems to be important factors for shaping genetic diversity and structure of C. japonicum. Our results indicate that the mountain areas in central and western China were likely refugia at the Last Glacial Maximum.

polymorphic bands were obtained. At the population level, the mean PPB P was 9.7%, ranging from 1.3% (BT, LS) to 24.4% (FP), whereas the mean N aP , PA, N eP , H EP , and SI were 1.10, 1.88, 1.04, 0.027, and 0.042, respectively (see the Methods section or Table 1 for the definitions of abbreviations of genetic parameters). Population FP exhibited the highest H EP value (0.062), while population BT did the lowest H EP (0.004). At the species level, all these parameters exhibited higher mean values, with PPB S of 57.7%, N aS of 1.58, N eS of 1. 16, H ES of 0.104, and SI of 0.169 (Table 1). With SCoT, a total of 77 bands were observed with an average of 8.6 bands per primer, of which 47 were polymorphic. At the population level, the mean PPB P was 10.9%, ranging from 2.6% (Ka) to 24.7% (FP), and the mean N aP , PA, N eP , H EP , and SI were 1.11, 1.69, 1.05, 0.033, and 0.051, respectively (Table 1). FP exhibited the highest H EP value (0.074) again, but Ka had the lowest H EP (0.007). At the species level, like the ISSR, all those parameters exhibited higher mean values: PPB S = 61.0%, N aS = 1.61, N eS = 1.24, H ES = 0.145, and SI = 0.224 (Table 1).
Mean genetic estimates (mean PPB P , N aP , N eP , H EP , and SI) from SCoT were higher than those from ISSR (Table 1), and these differences were significant (Wilcoxon signed-rank tests; for all genetic parameters P < 0.05; see Supplementary Table S1). However, we found a significant correlation between ISSR and SCoT datasets (Spearman's rank correlation test; for all genetic parameters R 2 , coefficient of determination, ranged from 0.791 for H EP to 0.930 for PPB P and N aP , P < 0.000; Supplementary Table S1), suggesting these two datasets are highly compatible and, thus, that SCoT markers behaved as selectively (or nearly selectively) neutral in C. japonicum. For PA, however, we found neither significant difference nor correlation between the two datasets (Supplementary Table S1).
In China, levels of genetic diversity with the exception of PA and N eP tended to increase from east to west (Table 1); populations in western region harbored significantly higher levels of within-population genetic variation than those in eastern region (with ISSR, mean H EP = 0.049 vs. 0.018, P = 0.008; with SCoT, mean H EP = 0.058 vs. 0.028, P = 0.030). On the other hand, levels of genetic diversity of the six parameters tended to increase from Korea to China (Table 1); 11 Chinese populations exhibited marginally significantly higher levels of within-population genetic variation than two Korean populations (with ISSR, mean H EP = 0.031 vs. 0.008, P = 0.100; with SCoT, mean H EP = 0.040 vs. 0.015, P = 0.054).
Genetic differentiation between populations. Values of G ST showed very small differences between the total set of populations and the set of populations excluding those with small sample sizes (n < 10), and this was applicable for the whole species range (e.g., for ISSR, G ST = 0.749 and 0.737 including and excluding small populations, respectively) as well as within each country separately (e.g., for ISSR in China, G ST = 0.636 vs. 0.599) and with all possible combinations of country pairs (e.g., for ISSR in China vs. Japan, 0.335 vs. 0.362). We consider, Figure 1. Sampled populations of Cypripedium japonicum in this study. The reconstructed Last Glacial Maximum coastlines are represented as thin dotted lines. All the geographical features quoted in the text are also indicated. The areas indicated by ①, ②, and ③ partitioned by thick dotted lines represent the "first", the "second", and "third" steps of the "three-step ladder" of Chinese geographic features 58 . The base map has been generated with ArcGIS 9.3 (ESRI, Redlands, CA, USA) from a 30 arc-sec layer downloaded from WorldClim Version 1 (http://worldclim.org/), and modified using Adobe Illustrator CS5.1 (Adobe Systems Incorporated, San Jose, CA, USA). Layers from WorldClim Version 1 are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-sa/4.0/). therefore, that including populations with small sample size would not significantly alter results and conclusions, and we conducted further data analyses with the total set of populations.
With ISSR, the total genetic diversity for the species (H T ) was 0.106, while the mean heterozygosity within populations (H S ) and between populations (D S ) were 0.027 and 0.080, respectively. The degree of genetic differentiation between populations (G ST ) was 0.749. Similar results were found with the SCoT dataset: H T = 0.152, H S = 0.033, D S = 0.119, and G ST = 0.779. Chinese populations showed higher G ST (0.636 for ISSR and 0.697 for SCoT) than those from Japan (0.321 for ISSR and 0.406 for SCoT) and Korea (0.104 for ISSR and 0.145 for SCoT) (Supplementary Table S2). When populations in each country were combined, the highest G ST value was found for "Korea vs. Japan" (0.672 for ISSR; 0.608 for SCoT), followed by "China vs. Japan" (0.335 for ISSR; 0.300 for SCoT), and "China vs. Korea" (0.120 for ISSR; 0.159 for SCoT) (Supplementary Table S2).
AMOVA analysis with ISSR revealed that variation among regions (China, Japan, and Korea) and among populations within regions contributed 48.1% and 30.9% to the total genetic variance, respectively (Table 2). Similar results were found with SCoT: 43.5% and 38.3% of the total variation among regions and among populations within regions, respectively (Table 2). Consistent with the hierarchical analyses for G ST , separate AMOVA analyses for different subsets of the data showed that the highest genetic divergence between regions was found between Korea and Japan (80.2% for ISSR and 75.4% for SCoT), followed by China vs. Japan (52.6% for ISSR and 43.3% for SCoT; Table 2); in contrast, low degree of divergence was found between China and Korea (22.6% for ISSR and 32.2% for SCoT; Table 2).
We found a significant correlation between pairwise genetic distances and linear geographic distances (km) of populations (Mantel test for IBD, isolation by distance: r = 0.713, P = 0.001 for ISSR; r = 0.515, P = 0.001 for SCoT) when all populations were used ( Supplementary Fig. S1). Except for SCoT in Japan (r = 0.478, P = 0.262, primarily due to low statistical power, only six pairs), results of the hierarchical test for IBD showed significant correlations within countries (China, r = 0.752 for ISSR and r = 0.669 for SCoT; Japan, r = 0.858 for ISSR), and also significant correlations were observed when countries were combined (China vs. Japan, r = 0.782 for ISSR and r = 0.600 for SCoT; Korea vs. Japan, r = 0.687 for ISSR and r = 0.747 for SCoT; China vs. Korea, r = 0.316 for ISSR and r = 0.362 for SCoT) (Supplementary Table S2).
Genetic structure and relationships between three countries. The UPGMA dendrograms constructed from ISSR ( Fig. 2A) and SCoT (Fig. 2B) markers both showed that the populations of C. japonicum could be largely divided into two geographic clusters: the China + Korea cluster and the Japanese cluster. The main difference between the two datasets was that Korean populations clustered with western and central Chinese Principal coordinate analysis (PCoA) results were in good agreement with those from the UPGMA dendrograms (Fig. 3). In the ISSR PCoA (Fig. 3A), the first two components accounted for 76.5% (axis 1 = 47.2%; axis 2 = 29.3%) of the total genetic variance, in a similar way that in the SCoT PCoA (Fig. 3B) in which they accounted for 70.8% (axis 1 = 44.7%; axis 2 = 26.1%) of the total genetic variance. The main differences between the two distribution plots were BT and SN populations, which were placed between the eastern and western Chinese populations in the SCoT analysis but nested within western Chinese populations in the ISSR analysis.
In the combined datasets (ISSR and SCoT), the ΔK statistics indicated K = 2 as the best grouping scheme (K = 5 was the second highest value; Fig. 4A), whereas the estimated log probability of data obtained with Structure reached a plateau when the number of clusters was set to 5 (Fig. 4B). Thus, we chose K = 5 as the most meaningful clustering. However, we also represented K = 2, K = 3, and K = 4 as these results might be informative. When K = 2, the Chinese and Korean populations clustered together and Japanese population had an independent genetic pool (Fig. 4C). However, when K = 5, the Japanese, Korean, eastern Chinese, western Chinese, and central Chinese populations represented independent genetic pools (Fig. 4C). Interestingly, at K = 3 the Korean populations were clustered together with central and western Chinese populations, and their own cluster were not assigned until K = 4 (Fig. 4C). The findings of Bayesian clustering analysis implemented in Structure were in broad agreement with the UPGMA and PCoA results, clearly depicting that 1) populations from China and Korea have a close relationship, and 2) Korean populations are genetically more related to central and western Chinese populations than to eastern Chinese populations.

Discussion
Comparisons between ISSR and SCoT. In this study, two dominant genetic markers, ISSR and SCoT, were employed to assess the levels and distribution of genetic diversity of C. japonicum in East Asia. The amplification products generated from the SCoT markers may be correlated to functional genes and their corresponding traits, while the ISSR markers detect polymorphisms in microsatellite regions that do not necessarily represent functional genes 35 . Each marker system targets different regions of the genome, which leads to different sources of the detected diversity. In the present study, both of them indicated low genetic diversity both at population and species levels, and showed a similar clustering of C. japonicum populations (see below for details). Although the values for the genetic parameters were significantly higher for SCoT compared to ISSR, overall a significant correlation was found between these two datasets. To sum up, using independent data from two different methods (as complementary analyses) can represent a more efficient and accurate way to survey the genetic variability of a given species.
Extremely low levels of within-population and high degree of between population genetic diversity. As Table S3). The much greater estimates at the species level are due to the substantial genetic divergence detected between populations of C. japonicum (G Chung et al. 7 found no allozyme variation in six Korean populations of C. japonicum and argued that the species was historically rare compared to its congener C. macranthos on the Korean Peninsula, suggesting that RGD would be a major evolutionary force in the former. Consistent with this and more recently, Son and Son 37 found low levels of microsatellite-based genetic variation in four Korean populations of C. japonicum (%P P = 35.0%; A P = 1.53; H eP = 0.109, based on 10 polymorphic loci) but a higher among-population divergence (Φ ST = 0.810 including one Japanese population). Our finding of much lower levels of within-population genetic variation when compared to those at the species level, indirectly indicates the importance of RGD for C. japonicum. A recent meta-analysis of allozyme-based population genetic studies in orchids also revealed that "rare" taxa harbor significantly lower levels of genetic diversity than their "common" congeners (at the population level, percentage of polymorphic loci, %P P = 46.7% vs. 23 38 . One of the exceptions was C. macranthos var. rebunense, endemic to the small Rebun Island close to Hokkaido (Japan), which harbors comparable levels of genetic diversity with its common North American congeners. This unexpected genetic result is probably due to the fact that the decline in habitat range and population size of this island's endemic variety has been very recent (since 1950s) 39 ; there have not been sufficient generations for the initial diversity to be substantially eroded by RGD 40 . Other life-history or ecological traits associated with pollen and seed dispersal have a highly significant impact on genetic variability and are strong predictors for the levels and distribution of genetic diversity in plant species 41,42 . The three bumblebees Bombus remotus, B. imitator and B. picipes are the most effective primary pollinators of C. japonicum in China 43 , while B. ardens and B. diversus are the effective pollinators in Japan 44 . It is known that bumblebees have a flight range of less than 15 km 45 . Pollen dispersal is, therefore, likely to be relatively restricted among the highly isolated populations of C. japonicum. Moreover, non-rewarding C. japonicum flowers bloom in May 43 , when precipitation is high in most of its distribution range in China, which could dramatically reduce the foraging activity of bumblebees and indirectly limit pollen dispersal 46 . Pollination also suffers from visits by illegitimate visitors as these insects destroy the flowers 47 . All these factors might contribute to the low pollen dispersal observed in the species 43,44 .
Since orchids produce tiny, "dust-like" numerous seeds, they have the potential for long-distance seed movement once seeds are released and enter the air column 48 . However, direct and indirect (genetic) studies have documented that the majority of seeds are transported to very short distances, in the close vicinity of maternal plants, with events of long-distance dispersal being sporadic [49][50][51] . In addition, like other terrestrial orchids, the seeds of C. japonicum require mycorrhizal fungi for germination and seedling nutrition 8,52,53 . The limitation of pollen and seed flow across highly isolated populations would lead to low gene flow, increasing the genetic differentiation between populations of C. japonicum (see below for further details).

Levels of genetic diversity and population genetic structure: insights into phylogeographic history.
Since historical events leave an "indelible" mark on patterns of genetic diversity found within plant species, understanding their current levels and distribution can provide valuable insights into the recent evolutionary and biogeographical history of these species 16,17,19,26,54 . As hypothesized, levels of genetic diversity (with the exception of PA and N eP ) tend to increase from east to west in China. The finding that the central and western populations harbor relatively higher levels of genetic variation compared to eastern ones are consistent with the hypothesis that the former areas served as large LGM refugia for many species, mainly boreal and temperate plants, in China 25,26 . In particular, population FP, the highest in levels of genetic diversity among all Chinese populations examined, is located in the south Qinling Mts. (Fig. 1); this mountain region is considered one of the main refugia of western China, and was an area not heavily influenced by the LGM 55 . In general, large parts of central and southwestern China, including the Three Gorges Region, the Yungui Plateau, the Qinling Mts., and partially the Hengduan Mts. (Fig. 1), acted as plant refugia 25,26 due to their relatively stable climatic conditions 56 , which allowed the persistence of even thermophilous, relict plant taxa (e.g., Davidia involucrata 57 ). All the studied populations of C. japonicum occurring in these areas (with the only exception of population BT) are among the most genetically variable (Table 1). It should be noted that, although in eastern China the presence of LGM refugia has also been hypothesized, these were generally small ("microrefugia"), such as Tianmu Mts. (Fig. 1) 25,26 . Despite  (Table 1) also suggests the occurrence of glacial refugia, albeit small, there.
The genetic clustering patterns of Chinese populations might, indeed, reflect the different nature of the terrain, which would have deeply influenced the occurrence of C. japonicum refugial areas. The two main genetic clusters detected in the species (the western/central China cluster and the eastern China cluster, clearly depicted by the UPGMA, PCoA, and Structure) roughly correspond to the "second" and "third" steps of the "three-step ladder" of Chinese geographic features 58 , respectively (see ② and ③ in Fig. 1, respectively). The second step approximately extends from the eastern margin of the Qinghai-Tibetan Plateau to Taihang Mts. (Fig. 1), and it is comprised of plateaus and middle-altitude mountains (of generally 1,000-2,000 m), including the Qinling Mts. and other mountain systems of central China (Daba Mts. and the Three Gorges area; Fig. 1), which would have served as the main glacial refugia for C. japonicum and many other species 25,26 ; the third step, containing China's largest plains, would have harbored, instead, populations that endured the LGM in small, favorable pockets in eastern China (in "microrefugia", e.g., Tianmu Mts.; Fig. 1), and perhaps also newly-founded populations (i.e., populations of post-glacial origin).
As also hypothesized, the levels of genetic diversity of C. japonicum are higher for Chinese populations, followed by Japanese populations, and finally Korean ones, which further supports the species' glacial refugial scenario for central and western China. Korean populations exhibit extremely low levels of genetic diversity, despite that the Baekdudaegan (BDDG, the main mountain system of the Korean Peninsula) was a glacial refugium for a large assemblage of boreal and temperate plants during the LGM 21,59 . Plants from the BDDG often harbor moderate to high levels of within-population genetic diversity ( Table 1 in Chung et al. 21 ), because these mountains might provide relatively stable habitats and enabled plants to track the climate changes by altitudinal movements, ensuring relatively large population sizes. For example, C. macranthos (a congener of C. japonicum), largely occurring along the BDDG ridge, harbors moderate levels of within-population genetic variation 7 . Populations of C. japonicum, however, occur on hillsides with elevations of less than 600 m on the Korean Peninsula. Likewise, in Japan, C. japonicum largely occurs on lower hillsides, and our samples were collected from hillsides at altitudes below 400 m. Like in Korea, lowlands and low-elevation hills in Japan hardly harbored large refugia ("macrorefugia"). Instead, contemporary populations would be descendants of populations that endured the LGM in microrefugia, where they suffered the effects of RGD.
The closer relationships between Korean and Chinese populations than between Korean and Japanese populations can be simply due to the fact that the Korean Peninsula is contemporarily connected with NE China, although such connection was even more straightforward at the LGM, when the Yellow Sea was completely exposed (cf. dotted lines for the reconstructed LGM coastlines; Fig. 1). The Japanese Archipelago, instead, was nearly isolated from the continent throughout the Quaternary. Unexpectedly, the two Korean populations are clustered with western and central Chinese populations rather than with the eastern Chinese populations (Figs 2-4), which are geographically closer to the Korean populations. An origin of the Korean populations from the mountains of western China instead of eastern China could explain such pattern of genetic clustering; since C. japonicum occurs in middle-low altitude mountains in China, the most straightforward migration routes were probably through the mountains of north China [Qinling, Taihang Mts., then the north Hebei Mts. (Yan Mts.); see Fig. 1]. Colonization of Korea from eastern Chinese mountains seems unlikely, as both regions are separated by large plains (e.g., the huge North China Plain; Fig. 1) with no suitable habitat for the species. Our ongoing study on phylogeography of C. japonicum in East Asia using cpDNA has revealed that the only haplotype found in Korean populations also occur in SN and ZP, which are located in Daba Mts. (L. X. Han et al. unpubl. data).
The results of AMOVA, UPGMA, PCoA, and Structure (but also the hierarchical analyses of G ST and IBD) show that C. japonicum in East Asia can largely be grouped into two genetic clusters, namely the China + Korea cluster and the Japan cluster. The different chromosome number of C. japonicum in China (2n = 22 in Tianmu Mts. 60 ) and Japan (2n = 20 based on cultivated plants 61,62 ) suggests an ancient separation and a reproductive barrier between Chinese and Japanese populations, in case that these counts are confirmed for other populations within the species' range in these two areas. Our genetic data that indicate a sharp genetic break between continental (China + Korea) and Japanese populations are in agreement with the cytogenetic divergence. Whether both genetic lineages (China + Korea and Japan) represent cryptic species, and merit any taxonomic (as species, subspecies or varieties) or evolutionary recognition (e.g., evolutionarily significant units), need further research including morphological, cytogenetic (e.g., its chromosome number in Korea is not known), and molecular phylogenetic studies.

Implications for conservation.
Cypripedium japonicum is threatened with extinction because of its low levels of genetic variation and its high ornamental and medicinal value, particularly in China. The species is categorized as "Endangered" at global scale following the most recent criteria of IUCN 63 : B2ab(ii,iii,iv,v); C2a(i) 5 . Thus, in situ and ex situ conservation efforts should be of particular importance for this species. Some previously recorded populations in China might have disappeared already due to deforestation and/or illegal collection 4,5 .
For a short-time conservation strategy, C. japonicum should be considered as the priority subject of in situ conservation with an emphasis on maintaining the number of genetically distinct individuals. Designation of priority protected areas for natural habitats of C. japonicum could provide ideal protection for in situ conservation 64 . In Korea, fortunately, all known Korean populations are protected by laws 11 .
For a long-term conservation strategy, the four populations in central and western China with highest genetic diversity (FP, SZ, SN and JF) should be prioritized for protection, because such populations would provide the best source of propagules for ex situ conservation and in situ restoration efforts 65 . Although populations from Japan exhibit low within-population genetic variation, they should also be considered for in situ conservation and ex situ preservation because of their considerable divergence from the Chinese plus Korean populations. Indeed, the extinction of any population may lead to a considerable loss of genetic diversity because of the species' high degree of between-population genetic differentiation, restricted distribution pattern and narrow habitat preferences. Thus, all populations are of great importance for short-term and long-term survival of C. japonicum in East Asia.
Since about 20 conservation actions have already been recommended for Korean, Chinese, and overall populations in East Asia 4,5,7 , we here further provide a few suggestions for the management and conservation of C. japonicum: 1) to use hand-pollination within or among populations in order to enhance the fruit set and even to accelerate the rejuvenation of small populations because of low rate of fruit set (5.2-7.7% in Hubei, China 43 ; 14.9% in Chiba, Japan 44 ); 2) to apply tissue culture techniques for propagation of different genotypes as well as for satisfying the demand of traditional medicine; 3) to identify detailed clonal structure and fine-scale genetic structure of typical or representative populations in the three countries where the species is present (China, Korea, Japan), in order to provide more scientific and efficient sampling for ex situ preservation; and 4) to develop comprehensive conservation strategies through international collaboration among concerned countries.

Conclusions.
To our knowledge, this study is the first detailed examination of any East Asian orchid covering its entire distribution range to evaluate the levels of genetic diversity, to provide hypotheses about the past phylogeographic history, and to recommend conservation strategies. As expected, the endangered terrestrial orchid C. japonicum harbors extremely low genetic diversity at the population level across its whole distribution area. At the species level (samples as a whole), C. japonicum maintains low levels of genetic diversity with high degree of genetic divergence between populations. These genetic results suggest that random genetic drift (RGD) coupled with limited gene flow between populations have played a significant role in shaping levels of genetic diversity SCIeNTIfIC REPORTS | (2018) 8:6467 | DOI:10.1038/s41598-018-24912-z and structure in the species. Genetic data revealed that the populations are genetically arranged into two major groups: Chinese + Korean and Japanese groups. Although in situ conservation and ex situ preservation strategies should be primarily focused on populations with high genetic diversity in central and western China, conservation of Korean and Japanese populations is also desirable through close international cooperation among concerned countries.

Methods
Study species. Cypripedium japonicum grows in moist humus-rich soil in the understory of mature and successional deciduous forests on hillsides from China, Japan, and Korea. The plants can also propagate via buds that grow from the creeping underground rhizomes 2 . Most plants of C. japonicum have two leaves and a 35-55 cm tall stalk bearing a single, large, pendent flower with sac-like labellum (4-5 × 3-3.5 cm). Cypripedium japonicum is self-compatible with a non-rewarding, deceptive pollination system and requires effective pollinators 43,44,47 . Five species of the genus Bombus have been reported as effective pollinators both in China and Japan with a low fruit set of 5.2-14.9% 43,44 . Seeds of C. japonicum have a low germination rate, which could restrict their survival and development 9,53 . Population sampling. The number of samples collected per population was in proportion to the size of that population. In order to avoid clonal interference for estimating genetic variation in populations, we collected only one sample per small patch (as in small patches all individuals may be ramets of the same genet); we did our best to collect at least 20-30% of all shoots per population. A total of 287 samples from 17 natural populations of C. japonicum were collected from China, Japan and Korea during four years (2010-2013), including small populations with only a few individuals (n = 3 in LS; Fig. 1 and Table 1 . We further divided the 11 Chinese populations into three groups according to geographic and physiographic criteria: DB, HS, LS and TM formed the "eastern" group, BT and SN the "central" group, and FP, JF, SZ, WX and ZP the "western" group. Population SZ, although geographically located in Hunan (province often included in central China according to Chinese geographers) was, instead, merged with the group of western populations because it occurs within Wuling Mts. (that actually constitute the eastern tip of the Yungui Plateau; Fig. 1). For each sampled individual, a small piece of fresh leaf was collected from a single plant and kept in a paper bag with silica gel until DNA extraction. DNA isolation. Total genomic DNA was extracted by using the modified CTAB method 66 and dissolved in ultrapure water for future use. DNA quality was checked by electrophoresis in 1% agarose gels. Nanodrop was used to test its concentration and optical density value. For the ISSR experiment, working DNA was diluted to a concentration of 40-100 ng/μL in ultrapure water and stored at −20 °C before subsequent use. For the SCoT experiment, the most appropriate concentration of working DNA was 100-160 ng/μL. The materials used for ISSR analysis (n = 285) were slightly different from SCoT analysis (n = 265) due to technical constraints. In particular, samples of ZP could not be used for the SCoT analysis due to the degradation of materials (Table 1).

ISSR and SCoT amplification and optimum primer selection.
To ensure the reproducibility of both ISSR and SCoT markers, we first conducted many preliminary experiments to optimize PCR amplification (by trying several concentrations of DNA, primers, dNTPs, etc.), and then we chose primers with high polymorphism and good repeatability for subsequent manipulations. About 45% (ISSR) and 35% (SCoT) of the samples were re-run with all the primers, respectively. For ISSR analysis (n = 285), 10 primers were chosen from 100 ISSR primers supplied by University of British Columbia (http://www.biotech.ubc.ca/services/naps/primers.html) for PCR research (Supplementary Table S4). For SCoT analysis (n = 265), nine SCoT primers (Supplementary Table S5) were selected from 80 primers according to Collard and Mackill 30 and Luo et al. 35 . Amplification with these primers was carried out in Takara TP600 and commenced with initial denaturation of 2 min at 94 °C, followed by 35  Statistical analyses. Visible and clear DNA bands obtained from each ISSR and SCoT primers were scored as absent (0) or present (1). Bands length between 200-1,200 bp (ISSR) and 400-1,500 bp (SCoT) were scored. Using the program Popgene 1.32 67 , and assuming that populations were in Hardy-Weinberg equilibrium, we estimated the following genetic diversity parameters: percentage of polymorphic bands (PPB), observed number of alleles (N a ), number of "private" alleles (PA), average effective number of alleles per locus (N e ), Nei's gene diversity index (H E ) 68 , and Shannon's information index (SI).
To assess the correspondence of genetic diversity between the two datasets (ISSR vs. SCoT), we performed Wilcoxon signed-rank tests on 16 population pairs (with the exception of population ZP) for five within-population genetic variation parameters. We further conducted a Spearman's rank correlation analysis between ISSR and SCoT for each genetic measure; the higher correlation between ISSR and SCoT for each genetic measure, the smaller the differences in the measurements between the two datasets. Thus, this correlation analysis SCIeNTIfIC REPORTS | (2018) 8:6467 | DOI:10.1038/s41598-018-24912-z can be viewed as an indirect way to check the suitability of the SCoT markers in surveying the genetic variability in C. japonicum.
Although we found significant differences in the distribution of values of H E and SI (Mann-Whitney U-test or Wilcoxon rank-sum test for both ISSR and SCoT, P = 0.000), the order of values of those two genetic parameters were nearly the same (Spearman's rank correlation analysis for both ISSR and SCoT, R S 2 = 0.995 and 0.985, respectively, P = 0.000), suggesting that using either H E or SI would be appropriate. We used a Mann-Whitney U-test or Wilcoxon rank-sum test to assess the significance of differences in H E (a summary statistic of within-population genetic variation) between the pair of regions with the highest and the lowest estimates in China. In addition, we further conducted Mann-Whitney U-test to determine any significant difference in H E between the pair of countries with the highest and the lowest estimates in East Asia.
To estimate the degree of population genetic differentiation, we further calculated total genetic diversity (H T ), genetic diversity within populations (H S ), and genetic differentiation among populations (G ST ) using Popgene. As the combination of dominant markers and small sample sizes could artificially inflate genetic structure (e.g., G ST ), we estimated G ST values with and without the four populations with small sample sizes [n < 10; LS (3 individuals), DB (5), Ha (7), and JF (8)] to see how much these small populations may be influencing our statistics. To test for the influence of individuals within populations, populations within regions (countries), and regions on the observed genetic variation, we conducted a hierarchical analysis of molecular variance (AMOVA) using GenAlEx 6.5 69 . To determine the relative importance of gene flow and RGD at regional scale 70 , we conducted a correlation analysis between pairwise genetic distances 71 and linear geographic distances (km) and ran a Mantel test for IBD (with 999 permutations) using GenAlEx. A positively significant linear relation suggests that populations are at regional equilibrium between gene flow and RGD 70 . As reproductive barriers would exist between Chinese and Japanese populations due to different chromosome number (2n = 22 in China 60 ; 2n = 20 in Japan 61,62 ) and geographical isolation, we could be dealing with two cryptic species or subspecies. In order to avoid such potential "phylogenetic" biases in our genetic results, we performed further analyses of G ST and IBD in a hierarchical fashion: within each country separately and then with all possible combinations of country pairs (i.e., "China vs. Korea", "China vs. Japan", and "Korea vs. Japan").
We conducted unweighted pair-group method with arithmetic means (UPGMA) based on Nei's unbiased genetic distances 72 between populations using the Tfpga 1.3 program 73 . Bootstrap values for nodes were estimated based on 999 replications. As a complementary analysis, we performed a principal coordinate analysis (PCoA) with GenAlEx 69 , on the basis of Nei's genetic distances 74 , to investigate the relationships among the populations, and the two principal coordinates were used to visualize the dispersion of accessions in a two-dimensional array of eigenvectors. To deeply explore the population structure and unravel genetic admixture, we used Structure 2.3.4 75 to analyze the combined dataset (16 populations with ZP exclusion due to lack of SCoT data) of two markers with a Bayesian approach. Posterior probabilities of the data for each K were obtained for K = 1 to K = 18 clusters using the Admixture Model. Fifteen runs were completed for each K, with a Markov Chain Monte Carlo (MCMC) of 200,000 iterations, following a burn-in period of 1,000,000 iterations. We inferred the most likely value of K by the ΔK statistics 76 , with the aid of Structure Harvester 77 . Since the ΔK method tends to identify K = 2 as the top level of hierarchical structure 78 , we combined it with the method of selecting the smallest K after the log probability of data [ln Pr(X|K)] values reached a plateau 79 . Programs Clumpp 1.1.2 80 and Distruct 1.1 81 were used to combine the results of the 15 replicates of the best K and to visualize the results produced by Clumpp, respectively.