Genetic analyses reveal independent domestication origins of the emerging oil crop Paeonia ostii, a tree peony with a long-term cultivation history

Paeonia ostii, a member of tree peony, is an emerging oil crop with important medical and oil uses and widely cultivated in China. Dissolving the genetic diversity and domestication history of this species is important for further genetic improvements and deployments. We firstly selected 29 simple sequence repeats (SSRs) via transcriptome mining, segregation analyses and polymorphism characterizations; then, 901 individuals from the range-wide samples were genotyped using well-characterized SSR markers. We observed moderate genetic diversity among individuals, and Shaanxi Province was identified as the center of genetic diversity for our cultivated plants. Five well-separated gene pools were detected by STRUCTURE analyses, and the results suggested that multiple independent domestication origins occurred in Shaanxi Province and Tongling City (Anhui Province). Taken together, the genetic evidence and the historical records suggest multiple long-distance introductions after the plant was domesticated in Shandong, Henan and Hunan provinces. The present study provides the first genetic evaluation of the domestication history of P. ostii, and our results provide an important reference for further genetic improvements and deployments of this important crop.

targeted 4 loci (PS0337, PS158, PS004 and PS108) that deviated significantly from a neutral equilibrium model. Additionally, an analysis using Genecap v1.4 52 determined that a pair of samples had an identical genotype; therefore, one member of this pair was removed. Finally, 29 markers (19 in Mendelian inheritance) considered unlinked and neutral with all 901 samples were remained for further analyses.
Genetic diversity and differentiation. A total of 142 alleles were found, and the number of alleles per locus (N) ranged from 2 to 8, with an average of 4.897 alleles per locus. The mean polymorphism level of the loci (PIC) was 0.318, with a range from 0.108 (at the locus PS183) to 0.613 (PS112) ( Table S3). The observed heterozygosity (H O ) ranged from 0.080 (at the locus PS074) to 0.801 (PS101), with an average of 0.343, and the expected heterozygosity (H E ) ranged from 0.081 (at the locus PS183) to 0.625 (PS112), with an average of 0.321 (Table S3). At the population level, 39 private alleles (P A ) were found at 20 loci distributed in the 14 populations (Table S4). The average genetic diversity (H) and allelic richness (A R ) were 0.330 and 2.194, with H ranging from 0.286 for SXSLA to 0.415 for SXFX (two populations from Shaanxi Province) and A R ranging from 1.610 for XXFH (an isolated population in the south) to 2.510 for SXSL (a population from Shaanxi Province). We observed that the genetic diversity and allelic richness were higher in populations from Shaanxi Province ( Table 1). The fixation inbreeding coefficients (F IS ) at the locus and population levels were −0.053 and −0.045 (Table 1 and S3). The negative average fixation index (F) values indicated an excess of heterozygotes in P. ostii, which is consistent with the self-incompatibility system of this species.
The global multi-locus F ST estimated for the 29 loci was 0.106 and ranged from 0.022 at locus PS260 to 0.529 at locus PS187. The pairwise multi-locus F ST estimated for these loci ranged from 0.001 to 0.208, which indicated moderate population differentiation among the populations (Table S5).
Population structure. STRUCTURE analyses indicated that the ln P(D) reached a clear mode at K = 8 before decreasing, and the highest delta K was detected when K = 8 with the second highest values were detected at K = 2, K = 4 and K = 7 (Fig. 2). The species genetic structure is discussed with the results up to K = 8 (except K = 5). Four major well-separated genetic clusters (I, II, III and IV) were identified at K = 4-8, and one minor well-separated genetic cluster (SHX1, composed of the SXTG and SXGQ populations) was observed at K = 6-8. Models assuming different K values helped to detect admixed groups (AHBZ, SD1, SHX2 and MIX). Group (AHTL) of populations from Tongling in Anhui Province almost entirely made up cluster I; similarly, populations (SDHZB, SDHZC, SDHZD) from Shandong Province were always assigned to cluster II; three populations (HNLY, SXSN and HNLY) from Hunan, Shaanxi and Henan Province were assigned to cluster III; three populations (SXSLA, SXSLB, SDHZA) from Shaanxi and Shandong Province were assigned to cluster IV. These four major and one minor well-separated genetic clusters revealed the five gene pools in the domestication of the species, indicating probably five times of independent domestication events. Two populations (AHBZE, AHBZF) from Bozhou City in Anhui Province showed a closer affiliation with cluster I at any K value, which may be attributed to recent migration from Tongling to Bozhou City in Anhui Province. When K = 4, the group (AHBZ) of populations from Bozhou City in Anhui Province, Linfen City in Shanxi Province (SXLFA, SXLFB) and Liaocheng City in Shandong Province (SDLCA, SDLCB) appeared to be admixed groups of clusters I and III, highlighting the important role of clusters I and III in recent breeding activities through hybridization. At K = 8, populations from Shaanxi Province were assigned to four different ancestral gene pools and populations from Shandong Province were assigned to two ancestral gene pools.
A principal coordinate analysis (PCA) partitioned 67.40% and 1.80% of the variance in the experimental data to the first two axes (Fig. 3). Collectively, 69.2% of the variation is explained by the first two components. Although the PCA analysis showed a more continuous and overlapping pattern of genetic structure than the STRUCTURE analyses, the results were similar when the five ancestral gene pools were considered. The analyses of molecular variance (AMOVAs) revealed that most of the genetic diversity (93.77%) was observed within populations, whereas only 6.23% of the genetic variation occurred between the evaluated populations ( Table 2). This observation could be attributed to the breeding system of the cross-pollinating plants.
Sample pooling and clustering. According to the results of the STRUCTURE analyses and the geographical sources of the samples, we divided our samples into nine groups to better identify the regional divergence among populations. Samples from the HNSY, HNLY and SXSN populations shared the same proportion of genetic   ancestry in the STRUCTURE analyses; therefore, these samples were pooled into a single group (HN). Similarly, samples from the SDHZB, SDHZC, SDHZD and SXMXA populations were pooled together to form the SD2 group, samples from the SXTG and SXGQ populations were pooled together to form the SHX1 group, and samples from the SXSLA, SXSLB and SDHZA populations were pooled together to form the SHX3 group. Samples collected from Tongling and Bozhou City in Anhui Province were pooled into the AHTL and AHBZ groups, respectively. Similarly, samples from Shandong and Shaanxi Provinces were pooled into the SD1 and SHX2 groups, respectively. Because a significant structure was not observed for certain samples, samples from SXLFA, SXLFB, XXFH, HBBK and BJJF were pooled into a single group (MIX). In total, nine groups were analyzed: AHTL, AHBZ, SD1, SD2, SHX1, SHX2, MIX, HN and SHX3 (Fig. 2). The global multi-locus F ST for the groups was estimated at 0.076, and the pairwise multi-locus F ST ranged from 0.001 to 0.170, indicating moderate genetic differentiation ( Table 3). The SHX3 group (from Shaanxi Province) showed the highest degree of genetic differentiation from the other groups (average F ST = 0.130). Groups SHX1 and HN (composed of samples from different geographically distant populations) showed moderate genetic differentiation compared with the other groups (average F ST = 0.109-0.105). Group AHTL (the group of putatively original cultivation samples) and SD2 showed low genetic differentiation compared with group AHBZ, SD1, SHX2 and MIX, and this finding was consistent with the STRUCTURE analyses. The AMOVA analyses of the nine cluster groups revealed that 5.83% of the genetic variation occurred among regions, whereas 2.48% of the genetic variation occurred among populations within regions ( Table 2).
The directional relative migration networks as estimated by divMigrate 53 depicted the relative migration rates among the nine groups (Fig. 4a-c). For all three estimators (Jost's D, G ST , and Nm), clusters were grouped in a pattern similar to that of the STRUCTURE analyses and pairwise F ST analyses. Four admixed groups (AHBZ, SD1, SHX2 and MIX) that grouped close together displayed a high degree of gene flow, and two genetically pure groups (AHTL and SD2) exhibited a relatively high gene flow with these four admixed groups. Three genetically distinct groups (SHX1, SHX3 and HN) appeared relatively isolated from the other groups.

Discussion
To establish breeding and management strategies for emerging crops, a thorough understanding of the population genetics of the species of interest has become essential for producing long-term sustainable goals. This work represents the first systematic population genetic study of P. ostii, and we selected unlinked and neutral markers and performed range-wide genotyping and genetic evaluation for this commercially important woody plant with a long history of cultivation.
Genetic diversity and future exploration. P. ostii presents a wide distribution range, reproduces via insect pollination and outcrossing and has a self-incompatible mating system, and these characteristics allow this species to develop a higher genetic diversity. Consistent with other cross-pollinating plants 54,55 , most of the total genetic variation (>90%) in P. ostii occurs among individuals within a population, whereas a small proportion of the genetic variation occurs between populations. Compared with previous studies of wild peony species in the Moutan section, the genetic diversity indices obtained in the present study are lower than those in Paeonia jishanensis 56 (Na = 4.762, Ho = 0.446, He = 0.340) and Paeonia rockii 57 (Na = 3.69, Ho = 0.459, He = 0.492). This could be attributed to long-term domestication of P. ostii involving selecting and modifying its wild progenitor to meet human needs 58 . The elite P. ostii cultivars that growers use today are the result of intensive selection applied to the founding stock and its descendants, which is presumed to have led to additional losses in genetic diversity. The reduction in genetic diversity does not bode well for future genetic gains in productivity and could result in broad susceptibility to newly emerging diseases or insect pests, thereby threatening long-term food and feed security 59,60 . Future germplasm collections for the creation of new cultivars should account for the vast reservoir of genetic diversity to maintain the genetic base of P. ostii. STRUCTURE analyses identified five main ancestral gene pools across the wide range of P. ostii samples. Interestingly, we found that samples in Shaanxi Province were assigned to four ancestral gene pools. In addition, the genetic diversity analyses demonstrated that both the diversity and allelic richness are generally higher in populations from Shaanxi Province. Based on these results, we proposed that Shaanxi Province is the main genetic diversity center of the cultivated materials of the species. The other important group of populations from Tongling (AHTL) in Anhui Province contained a well-separated gene pool and displayed extensive genetic exchanges with admixed groups as revealed by the divMigrate analyses. Therefore, the resources from these two provinces could represent priorities for future germplasm conservation and exploration.
Population structure and domestication history. Cultivation of P. ostii for medicinal use originally occurred in Shaanxi, Henan, Shandong and Anhui (Tongling) provinces 29,61 . The hypothesis that the domestication origin of P. ostii occurred in Shaanxi Province has generally been accepted, although other hypotheses of its domestic origin are still popular 61 . However, the domestic origin hypotheses have not been tested using genetic data. The present study presented for the first time a genetic test for P. ostii based on an analysis of multi-locus data. Five genetically well-separated gene pools were identified in geographically isolated regions of Shaanxi Province, Tongling City in Anhui Province, and in Shandong, Henan and Hunan provinces. The allopatric distribution of distinct gene pools of cultivated P. ostii provided support for the hypothesis of independent domestication origins of this species. Consistent with the cultivated ornamental tree peony 62 , our study provided another case of multiple independent domestication for the cultivated tree peonies of the Moutan section.
In the present study, the STRUCTURE results of five well-separated gene pools indicated that five independent domestication origins may have occurred over the long-term cultivation of P. ostii. Interestingly, four distinct gene pools were found in Shaanxi Province, with two of these pools also found in Shandong. The richest diversity of both the wild populations and species of tree peony was observed in Shaanxi 25,26 , and this province also presents the longest recorded history of cultivation of tree peonies [63][64][65] . Reports have suggested that the tree peony was introduced from Shaanxi to Shandong Province several different times from ancient to modern times 28,63,65 . Given these facts, it is easy to predict that at least four times domestication origins of cultivated P. ostii ever happened in Shaanxi Province. The reconstruction of four well-separated gene pools identified by the STRUCTURE analyses clearly suggested that protection and utilization should be prioritized for the cultivars of different domestication origins that are isolated from each other in areas such as SXSN (Shangnan, Shaanxi Province), SXMXA (Meixian, Shaanxi Province), SXSLA (Shangnan, Shaanxi Province), SXSLB (Shangzhou, Shaanxi Province), SXGQ (Ganquan, Shaanxi Province), and SXTG (Tongguan, Shaanxi Province). Because most of these areas are located in the vicinity of the Qinling mountains, which presents the largest specific diversity of the Moutan section, we suspect that the Qinling Mountains in Shaanxi represents the likely origin of domesticated P. ostii. As the wild plants of P. ostii have been completely eradicated 33 and all populations sampled from Shaanxi were cultivated, further analysis of P. ostii, including models of the domestication history and phylogenetic reconstructions based on the low-copy nuclear genes relative to that of the chloroplast and rDNA sequences, will be needed to validate this hypothesis.
As the world's largest producer of ornamental tree peony, Heze City in Shandong Province cultivates P. ostii resources that are extensively used as rootstock for ornamental cultivated tree peony grafting. Our present study suggested that at least two rounds of independent long-distance P. ostii introductions likely occurred from Shaanxi to Shandong. We observed that one well-separated gene pool was identified in three distinct populations of SXSN (Shangnan, Shaanxi), HNLY (Luoyang, Henan Province) and HNSY (Shaoyang, Hunan Province), reflected probably unique domestication in Shaanxi Province and following long-distance introduction to Henan and Hunan provinces. It is not difficult to see the importance of Shaanxi Province in domestication and deployment of P. ostii.
Another important region in the domestication of P. ostii is Tongling City in Anhui Province, where the long-term cultivation of this species has been recorded 27 . We found a group (AHTL) of populations from Tongling that almost solely consisted of a unique well-separated gene pool (cluster I). This finding indicates at least one independent domestication origin in Tongling City, which is distinct from the origins in Shaanxi Province. It is interesting that all cultivated samples from Tongling City were assigned to one unique gene pool without introduction were found from Shaanxi Province.
Our study suggested that Tongling City has likely played an important role in the domestication and breeding of modern cultivars of P. ostii. Bozhou City, which is located close to Tongling City, has earned a reputation as the "Capital of Traditional Herbs" and has become the central region for the cultivation of P. ostii 66 . The production of P. ostii in Bozhou accounts for more than 70% of the annual total production 29,66 . In our present study, both the STRUCTURE and divMigrate analyses revealed that cultivars in Bozhou City and admixed cultivars popular in other regions are similar in genetic makeup and present the closest relationships with cultivars from Tongling City in Anhui Province, Shaanxi and Shandong provinces. This finding demonstrates the important role of domestication in Tongling City in Anhui Province.
Alongside with those from well-separated gene pools, admixed individuals were identified extensively across most of the sampled regions (Shaanxi, Shandong, Beijing, Shanxi, Anhui, Hubei and Hunan). We hypothesized that these individuals were likely derived from recent hybridization breeding processes, but we failed to find records that corroborated this hypothesis. Genomic data may help elucidate the origin of the admixed populations.

Conclusions
This study was the first comprehensive work to select well-characterized SSR markers for population genetics studies on peony species. Our study settled long-running debates by confirming that (1) moderate genetic diversity and differentiation in P. ostii; (2) Shaanxi Province has the largest genetic diversity; (3) multiple independent domestication origins occurred in Shaanxi Province and Tongling City in Anhui Province, and this species was introduced to Shandong, Henan and Hunan provinces. Our results provide comprehensive information that can be used in genetic assessments, regional genetic improvements and management of P. ostii and insights for the development of breeding strategies for this emerging oil crop.

Materials and Methods
Sample collection for the population genetic study. A total of 902 individuals from 36 populations that cover the vast majority of the distribution area in China were collected (Fig. 5). The geographical distribution of the 36 sampled populations was shown in Supplementary Table S1. Young leaves were collected in the field individually, dried with colored silica gel 67 , and maintained at room temperature for DNA extraction.

SSR development and genetic mapping.
A total of 59,275 unigenes were obtained from the transcriptome assembly of P. suffruticosa 'Luo Yang Hong' 48 , and 4,373 potential SSRs were identified from 3,787 unigene sequences. In total, 2,989 SSR-containing sequences were appropriate for primer design, and 788 primer pairs were further selected for the primer synthesis. Of these, 373 primer pairs generated PCR amplification products of the expected sizes 43 . Those 373 SSR loci were then screened for polymorphism in a genetic mapping population. The mapping population was composed of 197 individuals, including the two parents and 195 F 1 progeny. The parents were P. ostii 'FenDanBai' (female parent) and P. × suffruticosa 'HongQiao' (male parent). The F 1 mapping population plants were grown in the Beijing Guose Peony Garden, Beijing, China. The validity and polymorphism of 373 SSR loci were analyzed by using 6 progeny individuals randomly selected from F 1 progeny, and 74 SSR loci were found to be polymorphic. We kept these 74 SSR loci for genotyping of 195 mapping individuals. All SSR genotypes we generated were combined with SNP genotypes 68 from the same mapping population, and then used for chisquare test (χ 2 ) of segregation and genetic map construction with Joinmap 4.1. Fifty-three SSR loci showed no deviation from Mendelian inheritance ratios (P < 0.05). Then, the optimal arrangement order was determined via regression analyses with the single-linkage clustering algorithm at logarithm of odds (LOD) ≥ 4.0 Scientific RepoRts | 7: 5340 | DOI:10.1038/s41598-017-04744-z and recombination rates (r) ≤ 0.3. Based on the above results, the LOD threshold was relaxed to 3.0 and markers that showed segregation distortion with higher LOD values in the linkage analyses were also mapped to the linkage group without changing the order of marker linearity. The genetic distance between adjacent markers was then estimated. The graph of the genetic map with 4 SNP markers (Marker19470, Marker3373, Marker18459 and Marker36271) and 68 SSR markers was visualized using MapChart v2.3 69 .
To further examine the genetic diversity and structural analyses of the 902 P. ostii individuals, 68 SSR loci on the genetic linkage map (Fig. 1) were used to perform polymorphism screening in random selected 8 P. ostii individuals and 40 polymorphic markers evenly distributed all 5 linkage groups (LGs) were selected. DNA extraction and genotyping. Total genomic DNA was extracted from 30 mg of silica-dried leaf material using the DNeasy Plant Genomic DNA Kit (Qiagen, Beijing, China). The purity and concentration of the extracted genomic DNA was measured using a NanoDrop 2000 spectrophotometer at 260 nm (Thermo Scientific, USA) and adjusted to 50 ng/μL. PCR amplification was performed in a total volume of 10 μL containing 1 μL (50 ng) genomic DNA, 3 μL ddH 2 O, 5 μL 1 × Taq PCR Master Mix (Aide Lai, Beijing, China), and 0.5 μL (10 pmol) each reverse and forward primer with fluorescent labeled 6-FAM, HEX, TAMRA or ROX (Ruiboingke, Beijing, China). PCR amplification was performed with the following amplification protocol: pre-denaturation at 95 °C for 4 min, followed by 35 cycles of denaturation at 94 °C for 35 s, annealing at 47-60 °C for 35 s (different primer annealing temperatures are shown in Supplementary Table S1) and extension at 72 °C for 35 s, with a final extension at 72 °C for 7 min. The PCR products were analyzed separately with an ABI 3730XL capillary sequencer along with an internal size standard (GeneScan-500 LIZ, Applied Biosystems). The SSR allele sizes were scored manually using GeneMarker v2.2 (Soft Genetics, State College, Pennsylvania, USA) for all populations.
Statistical analyses. Suitability of microsatellites for the population genetic analyses. We further checked the suitability of each marker for the population genetic analyses. Micro-Checker was used to identify possible genotyping errors, including stuttering, large allele drop-outs and null alleles. We determined whether selected polymorphic microsatellite markers deviated significantly from a neutral equilibrium model using Ewen-Watterson tests in Arlequin.  v3.25 72 . Hardy-Weinberg equilibrium (HWE) was calculated with the assistance of Genepop (http://genepop. curtin.edu.au/). Samples with identical genotypes were detected by GeneCap.
Genetic structure. The model-based (Bayesian) cluster software STRUCTURE v2.3.4 73 was chosen to estimate the population structure. For each value of K (K = 1-20), ten independent runs were performed with a burn-in period of 200,000 followed by 200,000 Markov Chain Monte Carlo (MCMC) replications. The most likely K value was determined using Structure Harvester 74 based on both the log likelihood and the maximum ΔK. Because clustering algorithms may incorporate stochastic simulations as part of the inference, independent analyses of the same data may result in several distinct outcomes; therefore, we used the computer program CLUMPP v1.1.2 75 to analyze the results from replicate analyses for optimal alignments of replicate clusters. The output from CLUMPP was graphically displayed by the cluster visualization program DISTRUCT 76 .
A PCA based on the pairwise F ST distance matrix was performed using the "adegenet" package 77 in R 78 .
Pairwise genetic distance and gene flow among pooled collections. To identify the genetic differences in the 901 P. ostii samples, sampled populations were pooled according to their geographic origin and genetic composition, and the global multi-locus F ST for the 29 loci was calculated with GenAlEx. The pairwise multi-locus F ST and significance of differences between F ST values was assessed in exact tests conducted with Arlequin, and the gene flow (N m ) based on the F ST values was calculated as N m = 0.25 (1 − F ST )/F ST . A genetic distance matrix of pairwise FST values was also used for AMOVAs 79 in Arlequin. Significance levels were determined using 1000 permutations.