Genetic structure analysis of cultivated and wild chestnut populations reveals gene flow from cultivars to natural stands

Japanese chestnut (Castanea crenata Sieb. et Zucc.), the only fruit tree species domesticated in Japan, has been cultivated alongside natural stands since prehistorical times. Understanding the genetic diversity of this species and the relationships between cultivated and wild chestnut is important for clarifying its breeding history and determining conservation strategies. We assessed 3 chestnut cultivar populations and 29 wild chestnut populations (618 accessions). Genetic distance analysis revealed that wild populations in the Kyushu region are genetically distant from other populations, whereas other wild and cultivar populations are comparatively similar. Assignment tests suggested that cultivars were relatively similar to populations from central to western Honshu. Bayesian structure analyses showed that wild individuals were roughly classified according to geographical distribution along the Japanese archipelago, except that some wild individuals carried the genetic cluster prevalent in cultivars. Parentage analyses between cultivars and wild individuals identified 26 wild individuals presumed to have a parent–offspring relationship with a cultivar. These results suggested that the genetic structure of some wild individuals in natural stands was influenced by gene flow from cultivars. To conserve wild individuals carrying true “wild” genetic clusters, these individuals should be collected and preserved by ex situ conservation programs.


Results
Basic genetic characteristics of chestnut populations. We genotyped chestnuts from 32 populations (618 individuals in total) by using 30 nuclear simple sequence repeat (nSSR) markers (Table 1; Supplementary  Tables S1-S3 Table S3), indicating that the markers are reliable enough to use in population genetic studies. Allelic richness (A r ) of the 32 populations ranged from 4.75 to 6.33 (KYU and TAR, respectively; Table 1). The mean A r value of the cultivar populations (5.46) was similar to that of the wild populations (5.55). Observed and expected heterozygosity of the 32 populations was H o = 0.605-0.752 and H e = 0.596-0.726, respectively (Table 1). Similar to the A r values, mean heterozygosities of cultivar populations (H o = 0.708 and H e = 0.699) were similar to those of wild populations (H o = 0.685 and H e = 0.681). The inbreeding coefficient (F is ) ranged from − 0.034 to 0.077. Cultivar populations originated in regions other than KAN or TAN (designated "other regions"; OR) and six wild populations (SIR, TSU, TOK, YAT, MIK, and NAK) showed significant F is values (Table 1).
Genetic relationship among populations. Chloroplast haplotype frequencies of the 618 individuals were determined using 4 chloroplast SSR markers (cpSSRs). Markers Cmcs1 and Cmcs3 each showed a single fragment size in all individuals, whereas Cmcs2 and Cmcs5 each showed two fragment sizes. Only two chloroplast haplotypes (HAP1 and HAP2; Table 2) were identified among the 618 individuals (Table 1). HAP1 was composed of the combination of 143 bp for Cmcs2 and 151 bp for Cmcs5, whereas HAP2 was composed of 142 bp for Cmcs2 and 149 bp for Cmcs5. The combinations of 143 bp for Cmcs2 and 149 bp for Cmcs5, and of 142 bp for Cmcs2 and 151 bp for Cmcs5, were both completely missing, suggesting that these haplotypes differentiated a long time ago and became geographically separated. All of the cultivars and the 26 wild populations on Honshu carried only HAP1. Within the wild populations in the Kyushu region, population KYU carried only HAP2, whereas MIY and KAG carried both HAP1 and HAP2 (Table 1).
Pairwise Jost's D values among the 32 populations were calculated using the 30 nSSRs (Supplementary Table S4 Table S6). The ∆K values were highest at K = 2 both with and without the LOCP-RIOR model. In both models, one cluster corresponded to wild populations in the Kyushu region (blue genetic cluster; Fig. 2) and the other cluster to cultivar populations and the wild populations on Honshu island (orange genetic cluster). In both models, the second-highest ∆K values occurred at K = 3, whereas the values of |L″(K)| and ∆K were lower at K = 4. The values of |L″(K)| and ∆K at K = 5 were higher than those at K = 4. Because the 3 wild populations in Kyushu were genetically distant from the other 29 populations, we also estimated ∆K values using only the 29 populations from outside Kyushu. The ∆K values for this subset were clearly higher at K = 2 and K = 4 (corresponding to K = 3 and K = 5 using all populations) than at other K values, both with and without the LOCPRIOR model. In addition, we applied hierarchal analyses of molecular variance (AMOVA) to estimate the variance among the clusters identified in STRU CTU RE (Supplementary Table S7). The results of AMOVA based on two clusters (K = 2) across all 32 populations showed that 23.4% of the total variance was among clusters, 6.1% was among populations within clusters, and 70.5% was within populations. Effects both among clusters and among populations within clusters were significant (P < 0.001). The percentage of variance among clusters  Tanba (OR; population number 3) is not shown in this figure because these cultivars are distributed all over Japan. The map was created using MapChart (https ://mapch art.net). www.nature.com/scientificreports/ based on five clusters (K = 5) in all 32 populations and that based on four clusters (K = 4) in the set of 29 populations excluding populations from Kyushu were small (7.8% and 2.8%, respectively) but significant at P < 0.001. Thus, we constructed additional bar plot diagrams for K = 3 and K = 5 using all 32 populations (Fig. 2) and for K = 2 and K = 4 using the set of 29 populations excluding populations from Kyushu ( Supplementary Fig. S2). The results of the clustering at K = 3 and K = 5 using all populations and those at K = 2 and K = 4 using the 29 populations from outside Kyushu were quite similar, so we focused on the clustering results obtained for the full set of populations. The results of the clustering at K = 3 with and without LOCPRIOR were similar. In each Table 1. Grouping and genetic diversity of the 32 chestnut populations investigated in this study. N a number of alleles; A r allelic richness; H o observed heterozygosity; H e expected heterozygosity; F is fixation index. *P < 0.05; **P < 0.01. www.nature.com/scientificreports/ case, the accessions were classified into three clusters corresponding to (1) the cultivar populations and the wild populations on central and western Honshu island ("orange"), (2) wild populations on northeast Honshu island ("yellow"), and (3) wild populations in the Kyushu region ("blue"). When K = 5 with and without LOCPRIOR, we further identified a "red" genetic cluster corresponding to the cultivar populations and a "green" one corresponding to wild populations on western Honshu island. Hereafter, the "red", "yellow", "orange", "green", and "blue" genetic clusters at K = 5 are referred to as the "cultivar", "northeast Honshu", "central Honshu", "western Honshu", and "Kyushu" clusters, respectively. A neighbor-joining tree was constructed based on the net  www.nature.com/scientificreports/ nucleotide distances between pairs of clusters (Fig. 3). The "cultivar" cluster was genetically similar to the "northeast Honshu", "central Honshu", and "western Honshu" clusters, whereas the "Kyushu" cluster was genetically quite distant from the other four clusters. A noticeable contribution of the "cultivar" cluster was observed in some wild individuals in regions ranging from Tohoku to Kyushu. In particular, TSU and TOK (wild populations near Tokyo) carried a high proportion of the "cultivar" cluster (68.3% for TSU and 51.4% for TOK when analyzed without the LOCPRIOR model).
To validate the genetic structures of wild chestnut populations in Japan, 95 wild chestnut individuals collected by the Forest Tree Breeding Center (FTBC) were combined with the 32 populations in a Bayesian structure analysis without the LOCPRIOR model (Supplementary Table S8). The results were quite similar to those based only on the 32 populations for K = 2, 3, or 5 in that the wild populations were differentiated by region from northeast to southwest, with the exception that the "cultivar" cluster was highly represented in individuals from Akita and Okayama (which are not geographically close) when K = 5 ( Supplementary Fig. S3). For K = 5, individuals derived from northeast Honshu (Aomori, Akita, and Miyagi) mainly showed the "northeast Honshu" cluster, whereas those from the western end of Honshu (Tottori, Shimane, Yamaguchi, and Hiroshima), which was not well represented in the analysis of 32 populations, mainly showed the "western Honshu" cluster. The FTBC collection did not include individuals originated in Kyushu, but a noticeable contribution of the "Kyushu" cluster was observed in FTBC individuals derived from Tottori, Shimane, and Yamaguchi, which are geographically close to Kyushu. Parentage analysis. To clarify whether cultivars had influenced the genetic structure of wild individuals, we conducted parentage analysis of 557 wild individuals (Table 1) by using 92 major cultivars (Supplementary  Table S1) as a putative parent database. We identified 26 wild individuals presumed to have a direct offspringparent relationship with a cultivar in the database ( Table 3). The delta score for putative parents ranged from 1.7 to 19.5. Traditional cultivar 'Ginyose' was presumed to be a parent of wild individuals TSU11, NAK13, and NAK17. Recent major cultivar 'Tsukuba' was also found to be a putative parent of HAN01, NOS14, and NOS15. Another major cultivar, 'Ishizuchi' , was found to be a putative parent of KNE02 and TOK3. We also conducted parentage analysis of wild individuals collected by FTBC (Supplementary Table S8) using the database of 92 major cultivars and identified 8 putative parent-offspring pairs between cultivars and wild individuals (Supplementary Table S9). The local cultivars 'Saimyouji 1′ and/or 'Saimyouji 2′ were presumed to be a parent of 3 wild individuals from Akita. In addition, 3 wild individuals from Okayama were presumed to be offspring of the traditional cultivar 'Ginyose' or of 'Hiratsuka 4′, a selection from 'Ginyose' .  Table S10 (KAN, TAN, and  OR). Overall, cultivars showed higher probabilities of origin in central-to-western populations than in northern and Kyushu populations. In particular, cultivars from the Tanba region were likely to be assigned to central and western populations. For most cultivars, the probabilities of originating from wild populations at the northern end of Honshu and from wild populations in the Kyushu region were zero or nearly zero. The populations TOK, KAS, TAR, NOS, SAS, EIT, and NAK showed higher probability scores than the other populations.

Analysis of isolation by distance.
To clarify the correlation between genetic differentiation and geographical distance in wild populations, the 26 wild populations on Honshu were used for isolation-by-distance analysis. A significant correlation between genetic distance and geographic distance was detected by the Mantel test when all possible pairs of the 26 wild populations were included (r 2 = 0.356, P < 0.001; Fig. 5). When the analysis was limited to pairs that included either TSU or TOK, both of which are closely related to cultivars, there was no significant correlation between genetic differentiation and geographical distance (r 2 = 0.002, P < 0.736).

Discussion
In this study, we examined the genetic diversity and genetic relationships among 29 wild populations and 3 cultivated populations. Genetic diversity parameters such as heterozygosity and allelic richness were similar in cultivated and wild populations, and no significant evidence of genetic bottleneck was identified in cultivated populations. In perennial fruit crops, some species maintained high levels of genetic diversity during domestication, such as apple and grape 40,41 , whereas Chinese cherry showed moderate loss of genetic diversity in cultivated accessions 8 . This kind of comparison depends on the geographic diversity of the cultivars and individuals collected. We identified a total of 242 alleles from 61 cultivars and 385 alleles from 557 wild individuals. Although   (Table 1, Fig. 3). In Bayesian structure analysis using K = 3 (Fig. 2), accessions were classified into (1) wild populations in northeast Honshu ("yellow"), (2) cultivars (regardless of location) and wild populations in central and western Honshu ("orange"), and (3) wild populations in the Kyushu region ("blue"). Interestingly, in some individuals from Kyushu populations MIY and KAG, we identified introgressions of genetic clusters from Honshu ("yellow" and "orange") into the genetic cluster from Kyushu ("blue"), suggesting that seeds and scions have been imported into Kyushu from Honshu island. Human-mediated seed and scion dispersal have been common in chestnut 26,28,29,31,32 . The altitude of MIY is 109-212 m above sea level (asl) and MIY is located close to a Satoyama landscape (secondary woodlands and grasslands adjacent to human settlement) 42 , where human activities are presumed to have had an impact on the distribution of trees. A similar result was obtained from chloroplast haplotype analysis, as MIY and KAG carried not only HAP2 (presumed to have originated in Kyushu) but also HAP1 ( Table 1).
The genetic diversity of KYU was lowest among the populations used in this study: this population had only the "Kyushu" genetic cluster and carried only HAP2. The "Kyushu" genetic cluster was rarely observed in cultivars except for 'Obiwase' in Miyazaki Prefecture and 'Nanatate' in Kochi Prefecture (both from population OR). As the altitude of KYU is high (1037-1074 m asl), it might have had less chance to receive chestnut seed or pollen from other regions and cultivated populations. The "Kyushu" genetic cluster was also detected at low levels in some wild chestnut individuals collected by FTBC in Yamaguchi, Shimane, Hiroshima, and Tottori Prefectures, which are located on the western end of Honshu close to Kyushu island ( Supplementary Fig. S2). In a previous study, HAP2 was also identified in Korean cultivar 'Pochun B-1′ 43 . As the Korean peninsula is located not far north of Kyushu island, it may be possible that wild chestnuts in Korea have a genetic structure similar to those on Kyushu island.
With the exception of those in Kyushu, the wild populations were genetically similar and carried only HAP1 ( Fig. 3; Table 1). On Honshu island, isolation-by-distance analysis showed continuous patterns of genetic differentiation from one end to the other. Genetic diversity tends to decrease from south to north in the Northern Hemisphere, as is common in deciduous tree species 44 . On the other hand, Bayesian structure analysis with and without LOCPRIOR showed a clear relationship between genetic structure and geographical distribution (Fig. 2). For K = 3, wild chestnut populations on Honshu island were divided at the middle of central Honshu, roughly corresponding to the boundary of Fossa Magna, a major tectonic depression. Genetic diversification related to Fossa Magna was also reported in F. japonica 35 , Q. mongolica var. crispula 37 , and Q. serrata 36,37 . The genetic boundary of Q. aliena showed a rather different pattern and was located between Chubu and Kinki Districts, about 200 km west of Fossa Magna 36 . Following yet another pattern, F. crenata showed genetic divergence between the Japan Sea lineage and the Pacific lineage 38 . Quercus variabilis and Q. phillyraeoides, which are mainly www.nature.com/scientificreports/ distributed south of Fossa Magna, were presumed to be unaffected by this boundary. Also, Q. acutissima Carruth is not thought to have been present in the main archipelago of Japan during the last glacial maximum 45 and shows monomorphism of cpDNA in Japan 46 , suggesting that it was heavily influenced by human activity rather than by natural migration. To summarize these results, Fossa Magna commonly affected the genetic diversification of deciduous species in family Fagaceae that were naturally and widely distributed across this boundary, except for Q. aliena and F. crenata. The barrier caused by high mountain ranges and large deposits of volcanic rocks might have prevented the migration of seed 37 . For K = 5 in the Bayesian structure analysis, the cultivar populations shared the "red" ("cultivar") genetic cluster with some wild chestnut populations, especially in the Kanto region (TOK and TSU; Fig. 2). We suspect that the "red" cluster in these wild populations is related to gene flow from cultivars to natural stands. So far, there have been no reports that cultivars were selected from wild individuals in the Kanto region (TOK and TSU), whereas several reports suggest that cultivars were selected in the Tanba region of western Honshu 21,22 . Although genetic differentiation and geological distance among wild populations on Honshu are significantly correlated (r 2 = 0.356; Fig. 5), analysis of pairs that included TOK or TSU showed no significant correlation (r 2 = 0.002), suggesting that these populations were affected by non-natural (i.e., human-mediated) transportation of genetic clusters from other regions. Furthermore, in Ibaraki Prefecture (the location of TSU), chestnut production and the presence of orchards dramatically increased from the 1940s to the 1960s, as wild individuals were replaced by cultivars 21 . Trees in these orchards might have hybridized with wild individuals in TSU. The direct parent-offspring relationships between cultivars and wild individuals identified in several regions also suggest gene flow from cultivars to wild populations in these regions (Table 3, Supplementary Table S10). In fact, the "cultivar" ("red") genetic cluster was observed in some wild individuals in regions ranging from Tohoku to Kyushu (TSU, TOK, MKY, KNE, TNN, KUJ, TAR, NOS, SAS, EIT, MIK, NAK, MIY, and others; Fig. 2) when analyzed both with and without the LOCPRIOR model, suggesting that the present wild chestnut populations throughout Japan have been influenced by gene flow from cultivars. Interestingly, all of the wild populations showing significant values in bottleneck analysis under TPM (TNN, TOK, NOS, and NAK) carried considerable portions of the "cultivar" genetic cluster. These sites might have been somewhat influenced by human activities. Large amounts of timber from wild chestnut were used for railway construction 47 in the early twentieth century. Chestnut timber is thought to have been in short supply by the mid-1910s 48 , and it was reported that trees distributed in forests near railway lines and other convenient places had been cut down and exhausted for railway construction 49 .
Gene flow from cultivars to wild populations has been identified in some fruit tree species. In the Japanese archipelago, pear shows extensive introgression of prehistorically naturalized P. pyrifolia into the threatened native species P. ussuriensis, which was identified in northern Tohoku 12 . With rare exceptions, wild pear trees have only been found in human-disturbed secondary forests or pasturelands, so native individuals have likely sustained a genetic influence from cultivated or escaped trees. Therefore, true native P. ussuriensis is distributed in limited areas and is endangered. In apple, some studies have investigated crop-to-wild gene flow from modern cultivars (Malus domestica Borkh.) imported from western countries into native species (Malus sieversii and Malus orientalis) 15 . In grape, a wild population in Georgia had a high proportion of hermaphrodite vines, a character specific to cultivar genotypes, indicating gene flow from cultivated to wild types 50 .
Genetic studies of current chestnut populations in combination with information from preserved nut remnants can reveal clues as to how and where chestnut might have been domesticated. It is obvious that one component of the domestication syndrome in Japanese chestnut is large nut (seed) size, as in other species 51 . Although previous reports clarified that nut size increased during the early to late Jomon (4000-1000 BCE) 17,18 , Motoki suggested that the nuts discovered mainly in ruins in northeast Japan were not directly influenced by native cultivars that originated in the Tanba region, which is geographically far from northeast Japan 23 . In Bayesian structure analysis with K = 3, cultivars were closer to wild populations in central to western Honshu, the latter including the Tanba region, than to those in northeast Honshu, which includes Sannai-Maruyama ruin (Aomori Prefecture), Aota ruin, and Yachi ruin (Niigata Prefecture), where large nuts were excavated. Thus, cultivars from the Tanba region would not have been selected directly from the nuts from these ruins. On the other hand, the results of Bayesian analysis and an assignment test suggested that the candidate origin of the cultivars was central to western Honshu, though the Tanba region (western Honshu)is more likely, based on the literature 21,22 . In any case, it is quite difficult to clarify the detailed origin of cultivars by comparisons to wild populations that were influenced by gene flow from cultivars, because the direction of gene flow is opposite to that typically seen during breeding and domestication processes (i.e., from wild populations into cultivars).
In cereals, it took several 1000 years to consistently obtain the non-shattering spikelet phenotype, a defining characteristic of domesticated crops 45 . Among fruit tree species, apple is a good example of a species whose domestication appears to have taken more than a few 1000 years. The cultivated species M. domestica originated in the Tian Shan mountains from M. sieversii. During the dispersal of apple from Asia to Europe along the Silk Route, hybridization with and introgression from Caucasian and European Malus species contributed to form present-day apple cultivars 52 . Slow domestication of a species suggests that there were periods of pre-domestication cultivation 53 , or "hansaibai", which is suggested by Kitagawa and Yasuda 17 to have occurred in Japanese chestnut at the location of the Sannai-Maruyama ruin. It is possible that the effort to improve genotypes via predomestication cultivation and domestication of chestnut was initiated more than once, and in several regions, because large nut remains were discovered in several ruins from different eras.
In conclusion, this study of both cultivars and wild populations of chestnut indicates the presence of gene flow from cultivars to wild forms and reveals a possible history of cultivar breeding. The genetic relationships between cultivars and wild individuals clarified in this study will be useful for both conservation and breeding. From the perspective of chestnut breeding, wild individuals distributed in Kyushu could be valuable genetic resources because they are genetically quite distant from current cultivars. Climate change is bringing new problems of black spot nut and nut rot to chestnut, and the materials from Kyushu might be helpful for chestnut production www.nature.com/scientificreports/ in hot areas. For the purposes of conservation, it is important to preserve wild individuals from Kyushu that are divergent from cultivars because gene flow from cultivars to natural stands was observed in this study. To achieve this goal, wild individuals carrying true "wild" genetic clusters should be collected and preserved by ex situ conservation programs such as the gene banks of public institutions.

Materials and methods
Plant materials and DNA extraction. Thirty-two populations (618 accessions), including 3 cultivated populations and 29 wild populations, were used in this study (Table 1, Supplementary Tables S1 and S2, and Fig. 1). The three cultivated populations were composed of local cultivars originated in the Kanto region (KAN), Tanba region (TAN), and other regions (OR). All of the sampling sites for wild populations excluded chestnut orchards. Individual trees were selected at least 20 m apart so as to minimize the sampling of close relatives. The heights of most trees were 5 to 15 m, which is typical for wild trees of Castanea crenata because damage from the white-striped longicorn beetle (Batocera lineolata) shortens the life of C. crenata in Japan. Ninety-five wild chestnut individuals collected by the FTBC as part of a forest tree gene bank program were used to validate the genetic structure of wild chestnut populations (Supplementary Table S7). The FTBC program aimed to collect excellent trees for wood products, so the collections would likely be genetically biased, but they included the western part of Japan (Tottori, Shimane, Hiroshima, and Yamaguchi), where sampling of wild chestnut populations had not previously been done. We did not include this collection in the main analysis because of the potential for bias and the low number of individuals.
Genomic DNA was extracted from young leaves and buds with a DNeasy Plant Mini Kit (Qiagen, Germany) according to the manufacturer's instructions.
Genetic markers. The 618 accessions and the wild chestnut individuals collected by FTBC were genotyped for 30 nSSRs [54][55][56] (Supplementary Table S3) and 4 cpSSRs 57 (Cmcs1, Cmcs2, Cmcs3, and Cmcs5; Table 2 Basic genetic characteristics. The observed heterozygosity (H o ) and expected heterozygosity (H e ) were calculated in GenAlEx v. 6.5 software 58 , and inbreeding coefficients (F is ) and allelic richness (A r , n = 12) were calculated using FSTAT Version 2.9.3 59 . Significant deviations from Hardy-Weinberg equilibrium, as indicated by deviations of F is from zero, were tested by randomization using FSTAT. The frequencies of null alleles were estimated using CERVUS version 3.0 software 60 . Pairwise genetic distances based on a Jost's D matrix among the 32 populations were calculated with GenoDive version 3.0 software 61 . To detect signatures of bottlenecks in the populations, we used the program INEST 2.2 62 to assess excess heterozygosity in each population. We ran the model using 100,000 coalescent simulations, and tested significance using the Wilcoxon signed-rank test, calculated based on 1,000,000 permutations under both the stepwise mutation model (SMM) and the two-phase model (TPM).
Analysis of population structure. Bayesian statistical inference of the population structure was performed by use of STRU CTU RE 2.3.4 software 63 with the correlated model for allele frequency using no prior population information (USEPOPINFO = 0), both with and without the LOCPRIOR model. The LOCPRIOR model allows for sample group information to be used to aid in the clustering process 64 . This model has been found to detect genetic structure at lower levels of divergence than the regular mode. The analysis was run 10 times for each value of K (number of inferred ancestral populations) from 2 to 8 for 1,000,000 iterations after a burn-in period of 1,000,000 iterations. Evanno et al.'s criterion of ∆K was used to estimate the appropriate K value 65 . Furthermore, genetic variance among clusters was also used to perform AMOVA 66,67 in GenoDive version 3.0 software 61 . Significance levels were determined using 10,000 permutations. Because the populations from Kyushu were clearly genetically distant from the others, the values of ∆K and variance among clusters were also calculated using the 29 populations from outside of Kyushu. The CLUMPAK online tool 68 was applied to calculate the ∆K values and to graphically display the results produced by STRU CTU RE. A neighbor-joining tree based on net nucleotide distances was generated using POPULATIONS v. 1.2.30 69 . The 618 accessions were used for the main analysis. For validation purposes, the 95 wild chestnut individuals collected by FTBC were coanalyzed with the 618 accessions. The results of the validation are shown in Supplementary Figure S2.

Data analyses.
To identify possible parent cultivars of wild individuals, putative parent-offspring relationships were calculated with CERVUS 60 software, which uses a maximum-likelihood-based approach to infer parentage. The genotype data of the 30 nSSRs for 92 major cultivars (Supplementary Table S1) were used as the cultivar database of candidate parents. The paternity analyses were carried out to infer whether each wild accession had a parent-offspring relationship with each of cultivars. The parameters of the simulated genotypes were the following: "offspring" 100,000; "candidate parents" 100; "prop. sampled" 0.10; "prop. loci typed" 1.00; and "prop. loci mistyped" 0.02. Because the true parents of wild individuals could be outside of our database, we applied strict assignments using a 99% confidence threshold (95% for default parameter) for delta LOD scores to minimize the type-II error 69  www.nature.com/scientificreports/ a cultivar in the database, whereas the true parent is outside the set of sampled individuals). The average probability of non-exclusion of an incorrect father using this approach was 1.5 × 10 −7 . Assignment tests have been used to infer the origin of cultivars in several crops [70][71][72] . Assignment based on Bayesian criteria for computation for the 92 cultivars was carried out in GeneClass 2.0 73 by using the wild populations as reference sets. We applied an exclusion test using the Monte Carlo resampling method 74 . The genotypes were generated by Markov Chain Monte Carlo simulations of 10,000 individuals for each of the wild populations. When the probability that the wild populations in a region were sources of a cultivar was less than a given threshold (α = 0.01), the populations in that region were excluded from candidate sources for the tested cultivar.
For the 26 wild populations on Honshu island, we tested isolation by distance. Mantel tests 75 were used to estimate the correlation between (F ST /(1 − F ST )) and the geographic distances (km) between sampled populations based on 10,000 randomizations using GenAlEx v. 6.5 software. Because wild populations TSU and TOK carried substantial portions of the "cultivar" genetic cluster, correlation of genetic distance and F ST /(1 − F ST ) was also calculated for pairs including either TSU or TOK.