Assessment of genetic diversity and variety identification based on developed retrotransposon-based insertion polymorphism (RBIP) markers in sweet potato (Ipomoea batatas (L.) Lam.)

Sweet potato, a dicotyledonous and perennial plant, is the third tuber/root crop species behind potato and cassava in terms of production. Long terminal repeat (LTR) retrotransposons are highly abundant in sweet potato, contributing to genetic diversity. Retrotransposon-based insertion polymorphism (RBIP) is a high-throughput marker system to study the genetic diversity of plant species. To date, there have been no transposon marker-based genetic diversity analyses of sweet potato. Here, we reported a structure-based analysis of the sweet potato genome, a total of 21555 LTR retrotransposons, which belonged to the main LTR-retrotransposon subfamilies Ty3-gypsy and Ty1-copia were identified. After searching and selecting using Hidden Markov Models (HMMs), 1616 LTR retrotransposon sequences containing at least two models were screened. A total of 48 RBIP primers were synthesized based on the high copy numbers of conserved LTR sequences. Fifty-six amplicons with an average polymorphism of 91.07% were generated in 105 sweet potato germplasm resources based on RBIP markers. A Unweighted Pair Group Method with Arithmatic Mean (UPGMA) dendrogram, a model-based genetic structure and principal component analysis divided the sweet potato germplasms into 3 groups containing 8, 53, and 44 germplasms. All the three analyses produced significant groupwise consensus. However, almost all the germplasms contained only one primary locus. The analysis of molecular variance (AMOVA) among the groups indicated higher intergroup genetic variation (53%) than intrapopulation genetic variation. In addition, long-term self-retention may cause some germplasm resources to exhibit variable segregation. These results suggest that these sweet potato germplasms are not well evolutionarily diversified, although geographic speciation could have occurred at a limited level. This study highlights the utility of RBIP markers for determining the intraspecies variability of sweet potato and have the potential to be used as core primer pairs for variety identification, genetic diversity assessment and linkage map construction. The results could provide a good theoretical reference and guidance for germplasm research and breeding.


Results
Discovery and classification of LTR retrotransposons in the sweet potato genome. A total of 21555 LTR retrotransposons were obtained, making up 1.1% of the sweet potato genome 25 (870 Mb). According to the sequence similarity with the reported retrotransposons, 13002 LTR retrotransposons were assigned to the copia family, 8114 LTR retrotransposons belonged to the gypsy family, and 439 LTR retrotransposons were classified to other families. The copia retrotransposons were further clustered into 12342 subfamilies with a single LTR retrotransposon sequence, 23 subfamilies with two LTR retrotransposon sequences, and 85 subfamilies with three or more LTR retrotransposon sequences ( Table 1). The gypsy retrotransposons were clustered into 7775 subfamilies with a single LTR retrotransposon sequence, 37 subfamilies with two LTR retrotransposon sequences, and 49 subfamilies with three or more LTR retrotransposon sequences (Table 1). After searching HMMs in the 21555 LTR retrotransposons, 1616 LTR-RT sequences containing at least two models were screened and used for subsequent analysis. The 1616 LTR-RT sequences included 1311 copia families and 305 gypsy families.

Development and evaluation of RBIP primers.
According to the principle of primer design, 48 pairs of RBIP primers were finally developed from the 1616 LTR-RT sequences, 6 pairs were from the copia 1 subfamily, 15 pairs were from the copia 2 subfamily, and 27 pairs were from the gypsy 2 subfamily (Supplementary Table 1). The Tm values of the RBIP primers ranged from 51.1 to 59.12 °C, and the GC content ranged from 35 to 55%. The length of the amplified products was 152-993 bp, with an average of 456 bp (Supplementary Table 1).
DNA fingerprinting and characteristics of RBIP markers. In the 64 bands of the 6 pairs of RBIP primers, 51 polymorphic bands were used to generate a DNA fingerprint map of the 105 sweet potato cultivars. For each primer pair, the number of loci ranged from 7 to 14 with an average of 10.7, while the number of polymorphic bands varied from 6 to 11 with an average of 8.5 (Table 3). The polymorphic bands were converted to digital fingerprint data with presence as "1" and absence as "0". A "1", "0" (Supplementary Table 2) digital fingerprint map was constructed by polymorphic loci. The digital fingerprint map was subsequently used to analyze the genetic diversity. POPGENE software 26 was used to further dissect the genetic variation among the 105 sweet potato cultivars using the 6 pairs of RBIP primers. The effective number of alleles (Ne*) ranged from 1.1512 to 1.6464, with an average of 1.3397. Nei's gene diversity (H*) ranged from 0.1253 to 0.3492 among various genomic groups. The maximum gene diversity was in LTR10, followed by LTR11. Shannon's index (I*) for each primer combination is also reported in Table 3. This index was highest in LTR10 (0.5111) but lowest in LTR38 (0.2369). To identify the most highly informative primer combination, the amount of polymorphism information content (PIC) was estimated from 0.1149 for LTR38 to 0.2713 for LTR10, with an average value of 0.1828 (Table 3).
Genetic relationships among sweet potato accessions. Bayesian modeling of the number of homogeneous gene pools (K) in STRU CTU RE 27 was used to estimate the membership fractions of the 105 sweet potato accessions. An evaluation of the optimum value of K following the procedure described by Evanno et al. 28 indicated two clear optimal values for Delta K, at K = 2 and 3 ( Fig. 1), which indicated that a model with two gene pools captured a major split in the data and that substantial additional resolution was provided under a model with K = 3. Barplots of the proportional allocations to each gene pool for K = 2 and 3 were constructed in STRU CTU RE and are shown in Fig. 2. The plots showed that these two models were related to each other hierarchically, such that the red cluster in the two-gene pool model was subdivided into two (blue and red) gene pools in the three-gene pool model.
The primary split in the data (K = 2) divided the accessions among two groups: group 1 and group 2. Group 1 (red in Fig. 2) included 97 sweet potato accessions, one of which was from America, while the remainder were from different provinces in China, and the majority of all the samples were from Zhejiang Province. Group 2 (blue) comprised 8 samples, six of which were from Zhejiang Province, and the remaining 2 were from Jiangsu  www.nature.com/scientificreports/ and Shandong Provinces. The accessions that demonstrated a low level of admixture, except "Xushu 18-1", belonged to Ipomoea batatas (L.) Lam. The model with 3 gene pools was also supported by the STRU CTU RE results. Under this model, group 1 in the K = 2 model was further divided into two gene pools (red and blue), but the other gene pools remained almost the same (Fig. 2). The 3 groups included 53, 44, and 8 sweet potato accessions, respectively. In the K = 3 model, group 1 and group 2 (red and blue) overlapped substantially with one another, and the hierarchical levels in these two clusters could hardly be recognized. All 97 accessions in group 1 and group 2 appeared to be from the two major gene pools. Several accessions from Jiangsu Province showed admixed origins, such as 'Xushu 18-1' , with three major gene pools. A two-dimensional and three-dimensional PCA (principal component analysis) further depicted the relationship among the 105 sweet potato accessions (Fig. 3). In the two-dimensional PCA, Dim-1 and Dim-2 were 1.12 and 0.51, respectively. The Dim-3 was 0.60 in the three-dimensional PCA. From the PCA diagrams, we could  www.nature.com/scientificreports/ see that all the 105 sweet potato accessions were divided into two groups, group 1 and group 2 (green and red, 8 and 97, respectively), or three smaller groups, group 1, group 2, and group 3 (green, red, and blue, 8, 53, 44, respectively). The PCA results were similar to the STRU CTU RE results at K = 2, which classified all sweet potato accessions into two groups (red and blue, 97 and 8, respectively); however, in the K = 3 model, the red group was then divided into two groups (Fig. 2). The dimensionalities of the 3 groups indicated that the accessions in group 1 exhibited a higher genetic diversity than those in groups 2 and 3.
Neighbor-joining cluster analysis clearly divided the 105 sweet potato accessions into 3 groups containing 8, 54, and 43 materials, respectively. This result was highly consistent with the assignments made using STRU CTU RE. (Fig. 4). Group 2 was divided into 4 subgroups, containing 11, 14, 5, 13, and 11 materials. Group 3 was divided into 5 subgroups, containing 9, 6, 5, 4, and 19 materials. Group 1 included all improved varieties, except Hongpibaixin-11, which had large genetic distances from the other accessions. For several accessions, such as 'Jinguahuang' and 'nanguafanshu-2' as well as 'Hongpibaixin-2' and 'Hongpibaixin-3' , the genetic distances between them were 0, which meant that they had the same genotypes based on the 6 RBIP markers. The UPGMA dendrogram also showed that sweet potato accessions from the same regions were not well clustered in the same groups. For example, 22 accessions from the Zhejiang Academy of Agricultural Sciences were scattered. It was obvious that these results coincided with the previous STRU CTU RE and PCA results.
A population differentiation analysis was performed to analyze the genetic variations among and within groups, as revealed by the population structure. In the K = 3 model, AMOVA revealed that 53% genetic differentiation (P < 0.001) of total molecular variance in the germplasm occurred among groups, and 47% (P < 0.001) was attributed to variation within groups ( Table 4). The total genetic variance among individuals within populations was significantly greater than 0 (0.526), indicating that the genetic variation between and within the geographical population of the tested sweet potato resources was extremely significant (Table 4). Genetic distance among the 3 inferred groups revealed that Group 2 (blue) and Group 3 (green) had the highest differentiation with 0.819, and comparatively, Group 1 and Group 2 had a closer relationship with 0.265. The pairwise fixation index (F ST ) values between the three populations were all 0.001 (Table 5).

Discussion
To support sweet potato breeding programs, it is essential to assess the genetic diversity and relationships among cultivars. The ubiquity and abundance of LTR retrotransposons in the plant genome have made them valuable for studying genome-wide variation and diversity. The retrotransposon-based genetic DNA fingerprinting method could provide potentially useful genetic information. The increasing amount of sequence data released by nextgeneration sequencing technology provides a valuable resource for the development of retrotransposon-based markers. These retrotransposon-based markers have been applied successfully to analyze the genetic diversity in various plant species.
In the current study, we confirmed that the LTRs of sweet potato accessions contained the full complement of LTR retrotransposons. Structural analysis revealed that they are transcriptionally active and could be functional. Based on this genome-wide analysis, we found that only 12.5% of RBIP markers generated polymorphic bands, signifying that inter-LTR regions in the research genome of sweet potato accessions were significantly conserved. This implies that the sweet potato genome is still under evolution and that LTRs are not very active in contributing to genome-wide variations. To the best of our knowledge, this is the first study of genetic diversity in sweet potato using RBIP-based fingerprinting.
A previous report indicated that 7.37% of the sweet potato genome (approximately 4.4 Gb) was identified as an LTR 29 , while it was 10.987% in the present study. This small difference in the number of LTRs might be attributed to the different approaches and parameters that were used in the two studies. In our study, only putative full-length LTR retrotransposons with two very similar LTR sequences were isolated. The ratio of Ty3-gypsy to Ty1-copia can reflect the contribution in the sweet potato genome. Our results showed that full-length copia LTR retrotransposons were more common than gypsy retrotransposons ( Table 1). The ratio of Ty3-gypsy to Ty1-copia was (1:1.6), higher than previous study (1:1.15) 29 . The numbers of full-length LTR retrotransposons   www.nature.com/scientificreports/ in the different subfamilies were generally low, gypsy subfamilies had more single sequences (95.8%) than copia subfamilies (94.9%), and only 0.6% (49) contained more than 3 LTR retrotransposons. These findings are consistent with the results reported in other plants with different genome sizes [30][31][32] . New bioinformatics software offers exciting perspectives for the development of new markers based on whole genome sequences. RBIP markers were more ubiquitous than SSR markers in sweet potato 29 . Although many SSR markers have been developed from sweet potato, almost all SSRs (86.1%) have mononucleotide or dinucleotide repeat motifs, and "stutter bands" or increased mutation rates in repeat lengths may create issues for using SSR markers [33][34][35] . RBIP markers amplify a single locus in samples, differing from SSR markers that potentially amplify two or possibly more homologous loci. RBIP markers also can detect the presence or absence of retrotransposons in a locus produced by the integration of an element 36 .
In our 48 developed RBIP markers, 21 and 27 pairs of primers were related to the insertion of copia and gypsy retrotransposons in a particular locus, respectively. Due to the high similarity of LTRs from the same subfamily, primers designed with these LTRs may produce a same left primer sequence. For example, the left primer sequences of LTR10, LTR11, LTR13 and LTR20 were the same, but the downstream sequences were different. This kind of situation does not influence the specific amplification. The insertion of copia and gypsy retrotransposons was extensively detected in most of the cultivars with more than one locus. Diversity analysis showed that copia and gypsy LTR retrotransposons have existed in sweet potato varieties for a long time. These results implied that copia and gypsy retrotransposons replicated many times in the development of cultivated sweet potato and might explain why 10.98% of the genome was LTR retrotransposon in this research. Additionally, compared with previous RBIP markers 1 , the genome based RBIP markers have universal applicability.
According to the STRU CTU RE analysis, the 105 germplasms can be divided into two groups when K = 2 in STRU CTU RE (Fig. 2). Almost all the germplasms had unique backgrounds, except 'Xushu18-1' , 'Jinguafanshu' , and 'Taizhong6' . 'Xushu 18-1' (p330683034) was released by the Xuzhou Regional Agricultural Research Institute in 1972. It was selected from the cross between 'Xindazi' X '52-45' with an inbreeding backcross, and '52-45' was a hybrid offspring of 'Nancy Hall' X 'Okinawa 100' . Previous studies have shown that most of the sweet potato cultivars have a genetic background of Okinawa 100 from Japan and Nancy Hall from the USA 7,10,37,38 . Based on the above research, we inferred that the two gene pools may be Nancy Hall and Okinawa 100. However, in the K = 3 model (Fig. 2), most of the accessions had one major gene pool source and a small minor gene pool, except 'Xushu 18-1' . From these results, we can see that 'Xushu 18-1' has a wider genetic background than other accessions. The genetic background of sweet potato was single in Zhejiang, even among China, so it is necessary to broaden the genetic background of sweet potato varieties, enrich their genetic diversity and protect highquality germplasm resources.
All the accessions can be divided into three groups (group I represented by green, group II by red, and group III by blue) according to the PCA results and UPGMA dendrogram (Figs. 3, 4). In group I, 7 of the 8 accessions were improved varieties, and the other 21 improved varieties were scattered in subgroups II and III. This result indicated that the genetic background of improved varieties had more similarity than other germplasms to some degree. 'Hongpibaixin-11' is a variety selected by local farmers and known for its phenotypic traits such as leaf shape, root tuber skin color, and root tuber flesh color. There were 11 and 9 improved varieties in groups II and III, respectively, and other landraces were scattered in the two groups. The varieties 'Zhe 38' , '71438' , 'Zhe 81' , 'Zhe75' , and 'Zheshu 48' , which were improved by the Institute of Crop and Nuclear Technology Utilization, Zhejiang Academy of Agricultural Sciences, clustered on the third subgroup of group I, illustrating that these varieties had similar genetic backgrounds in the hybrid combination. 'Zhe 255' and 'Xinxiang' , 'Zhezishu 5' , 'Zheshu 6025' and 'Nanshu 88' were the same.
The UPGMA dendrogram, PCA, and STRU CTU RE analysis remained highly consistent regardless of K = 2 or K = 3 (Figs. 2, 3, 4). The PCA results showing that the sweet potato germplasms in the three groups were very concentrated, and the genetic differences of the three groups were obvious. The genetic distance between Group_1 and Group_2 was 0.265, that between Group_1 and Group_3 was 0.727, and that between Group_2 and Group_3 was 0.819 ( Table 5). The above data indicated that the accessions between Group_2 and Group_3 had the widest genetic background, followed by accessions between Group_1 and Group_3, and between Group_1 and Group_2. The pairwise fixation index (F ST ), as a population differentiation index determined by genetic structure, can often be used to assess genome-wide variation. The mean F ST value between the three groups were 0.001, indicating that there was a very high level of differentiation between the three groups. Thus, the germplasms in the three groups could be combined as good hybrid parents.
AMOVA showed that the source of variation among and within populations was 53% and 47%, respectively, indicating that the genetic variance was significant in the tested resources ( Table 4). Most of the sweet potato resources have been produced in Zhejiang Province for a long time, and local environmental conditions have a significant effect on genetic variation. This result was inconsistent with those of previous studies 7, 9 . www.nature.com/scientificreports/ The proportion of the total genetic variance among individuals within populations (PhiPT) value was 0.526, and P<=0.001, showing that the total genetic variance among individuals within populations was extremely significant. We found that several germplasms with similar phenotypes were separated into close subgroups. For example, 'Hongpibaixin-7' , 'Shenglibaihao-1' , 'Fanshu-2' , 'Shenglibaihao-4' , 'shenglibaihao-2' , 'shenglibaihao-3' , and 'Hongpibaixin-8' , with similar phenotypic characters of leaf shape, leaf teeth type, leaf color, stem primary color, root tuber skin color, and root tuber flesh color (Supplementary Table 3), clustered together in the second subgroup of the second group; The germplasms 'Hongpibaixin-4' , 'Hongpibaixin-2' , 'Hongpibaixin-3' , 'Hongpibaixin-10' , ' An'yangbaifanshu' , 'Liushiri-1' , 'Liushiri-2' , 'Hongpibaixin-5' and 'Hongpibaixin-11' , with similar phenotypic characters clustered in the fifth subgroup of the second group. This phenomenon may be attributed to most of the germplasms collected from locals being used for planting for many years. Long-term self-retention may cause a germplasm resource to exhibit segregation of variables. The UPGMA genetic relationship reflects the difference in genetic background between germplasm resources, so selection of genetically distant accessions as hybrid parents in breeding is more likely to generate elite varieties. Our results have demonstrated the high potential of molecular marker-based parental selection in promoting genetic improvement in future sweet potato breeding programs.
However, several germplasm resources, such as 'Shenglibaihao-1' , 'Shenglibaihao-2' , 'Shenglibaihao-3' , and 'Shenglibaihao-4' , collected from different counties and cities of Zhejiang Province, called the same name (Shenglibaihao) by local people, were not similar in terms of the phenotypic characters of leaf vein color and leaf stalk color. Thus, several germplasm resources were numbered and considered synonymous. From the results of the UPGMA dendrogram, however, we could see that those synonymous germplasm resources were not clustered together. They were not a same variety. 'Shenglibaihao' , also named Okinawa 100, was bred in Japan and then introduced to China before the 1970s. Almost 90% of the genetic background of improved varieties in the 1960s was filial generations of 'Shenglibaihao' 7,10,37,38 . The filial generations that have phenotypic traits similar to those of 'Shenglibaihao' may also be called 'Shenglibaihao' by farmer breeders, which could be the reason why there were resources named 'Shenglibaihao-1' , 'Shenglibaihao-2' , 'Shenglibaihao-3' , and 'Shenglibaihao-4' , with different variations but clustered together. The synonymous landraces 'Hongpibaixin' , 'Liushiri' , and 'Fanshu' have similar situations. The landraces 'Hongpibaixin-2' and 'Hongpibaixin-3' , collected from Cangnan County, Wenzhou City, and Jinyun County, Lishui City, Zhejiang Province, China, respectively, showed 100% similarity in the UPGMA dendrogram, STRU CTU RE, and PCA results. The 6 RBIP primer pairs used in this study amplified the same fragments in these two accessions (Supplementary Table 2), so we speculated that these two accessions might be synonyms. Investigation of phenotypic traits (Supplementary Table 3) confirmed our speculation. The same situations existed between 'Jinguahuang' and 'Nanguafanshu-2' as well as 'Zheshu 77' and 'Lianhuaru' . However, we did not observe the same phenotypic characteristics between 'Hongmudan' and 'Hongtou' , 'Chaosheng 5' and 'Jinqing' , 'Zhe 259' and 'Shiniuhongmudan' . The reason may be that these sweet potato germplasm resources had very similar genetic backgrounds, and more markers will be needed to confirm their relationships.
The results in this research expanded the application of molecular markers in sweet potato. Successfully developing RBIP markers, evaluating the capacity and efficiency of 6 RBIP markers for distinguishing genetic diversity in 105 germplasm resources of Zhejiang Province. The clustering results combined with phenotypic characteristics were used to identify several germplasm resources. The importance of molecular markers in variety identification was further confirmed. This is the first RBIP-based and combined with phenotypic characteristics genetic diversity assessment in sweet potato. These results will play a great role in sweet potato genetic research and breeding programs.

Conclusions
In this study, we successfully developed 48 RBIP primer pairs from the sweet potato genome, and successfully analyzed the genetic diversity and constructed a fingerprint of 105 sweet potato germplasm resources based on 6 RBIP primer pairs. These sweet potato germplasm resources exhibited a relative narrow genetic background due to scarce backbone parents and geographical isolation. This study highlights the utility of RBIP markers for determining the intraspecies variability of sweet potato. These highly polymorphic RBIP primer pairs have the potential to be used as core primer pairs for variety identification, genetic diversity assessment and linkage map construction in sweet potato. All these findings could provide a good theoretical reference and guidance for germplasm research and breeding.

Materials and methods
Plant materials and DNA extraction. All 105 sweet potato cultivars used in the present study are donated by the Sweet potato Germplasm Repository, the Institute of Crop and Nuclear Technology Utilization, Zhejiang Academy of Agricultural Sciences, Hangzhou, China (Table 2). They collected these resources from Zhejiang Province according to the <Implementation Plan of the Third National Crop Germplasm Resources Survey and Collection Action> issued by the Ministry of Agriculture and Rural Affairs of China.The samples included 26 improved varieties, 78 landraces from different geographical regions and 1 introduced variety from the United States of America. Total genomic DNA was extracted from fresh young leaves with the modified cetyltrimethylammonium bromide (CTAB) method described by Saghai-Maroof et al. 39 The quality and quantity of DNA were detected using spectrophotometric analyses and 1% (w/v) agarose gel electrophoresis, respectively. The DNA was diluted to a final concentration of 50 ng/μL and then stored at − 80 °C for further use. The phenotypic traits of the cultivars were also investigated for their genetic information assessment, including leaf shape, leaf tooth type, top bud color, tip hair color, leaf vein color, petiole color, stem primary color, stem secondary color, root tuber shape, root tuber skin color, and root tuber flesh color (Supplementary Table 3 www.nature.com/scientificreports/ LTR-RT prediction in the sweet potato genome. The full-length sequences of LTRs were predicted using LTR harvest software based on the genomic data of sweet potato variety 'Taizhong 6' , which was downloaded from the website http:// public-genom es-ngs. molgen. mpg. de/ Sweet potato/. The parameters of LTR harvest were set as follows: (1) the length range of the LTRs was 100-1000 bp; (2) the distance between the starting points of the LTRs was 1000-15000 bp; (3) the similarity threshold was 85%; (4) the repeat sequence length of each target site was 4-20 bp; (5) there were 4-6 bp target site duplications (TSDs) or polypurine tracts (PPTs) and primer binding sites (PBSs) on both sides of two identical LTRs; and (6) the other software parameters were set to the default options. The sequences of predicted LTRs were translated into six codes to obtain the corresponding protein sequences. All the copia and gypsy gene models were downloaded from the PFAM tadabase (gag, pf03732; integrate, pf00665; reverse transcriptase, pf00078 and pf07727) (http:// pfam. xfam. org/), and an HMM (gag, INT, RT) was constructed based on the downloaded data. The functional domain sequences of LTR protein sequences were subsequently analyzed using BLASTN searches (E-value< 1e−10) against the HMMs. By searching the three models in each protein sequence, LTRs containing at least two models were screened for subsequent analysis. The screening criteria were a full-length E-value < 1e−10 and an optimal domain E-value< 1e−10.
Development and evaluation of RBIP primers. The RBIP primers were designed by Primer3 40 . One primer was designed from LTR sequence and another was designed from flanking genome sequence. The design principles were as follows: (1) the primer length was 18-25 bp; (2) the amplified products were 100-1000 bp; (3) the GC content of the primers was 35-55%; (4) the annealing temperature was 50-60 °C; and (5) the annealing temperature difference between upstream and downstream primers was less than 5 °C. The designed primers were synthesized by Beijing Tsingke Biotechnology Co., Ltd.
PCR amplification was carried out in 15 μL reaction solution consisting of 1 μL DNA template, 7.5 μL Tsingke Master Mix (Tsingke, Beijing, China), 1 μL (10 μmol L-1) of each RBIP primer (Tsingke, Beijing, China) and 4.5 μL deionized distilled water. PCR amplification was performed with the following procedure: 94 °C for 5 min followed by 35 cycles at 94 °C for 30 s, 58-60 °C (depending on the RBIP primers) for 30 s, 72 °C for 30 s, a final extension at 72 °C for 5 min, and storage at 4 °C. Amplicons were analyzed by electrophoresis on a 2% (w/v) agarose gel. Amplicons were pooled together with an internal size standard (ABI GeneScanTM 500 LIZ, Applied Biosystems, Foster City, USA) and loaded on an ABI Genetic Analyzer (3730XL, Applied Biosystems, Foster City, USA). Accurate allele points were analyzed by Gene mapper 4.1 software 41 .
Data analysis. Characteristics of the RBIP primer pairs developed for analyzing sweet potato genetic diversity were evaluated in the 105 sweet potato accessions in terms of the effective number of alleles (Ne*), Nei's gene diversity (H*), Shannon's information index (I*) and polymorphism information content (PIC) using POP-GENE version 1.32 26 and the Botstein formula 42 , respectively.
For nonhierarchical genotypic clustering, the number of homogeneous gene pools (K) was modeled using the genotypes obtained from all 105 individuals in the software STRU CTU RE version 2.3.3, which uses the Markov chain Monte Carlo (MCMC) algorithm 43,44 . This revealed the genetic structure by assigning individuals or predefined groups to K clusters. Twenty runs of STRU CTU RE were performed by setting the number of clusters (K) from 2 to 10. Each run consisted of a burn-in period of 100,000 iterations followed by 100,000 MCMC iterations, assuming an admixture model. The results were uploaded to the STRU CTU RE HARVESTER website (http:// taylo r0. biolo gy. ucla. edu/ STRuc tureH arves ter/ 27 ) to estimate the most appropriate K values. The replicate cluster analysis of the same data resulted in several distinct outcomes for estimated assignment coefficients, even though the same starting conditions were used. Therefore, CLUMPP software was employed to average the 20 independent simulations, and the results were illustrated graphically using DISTRUCT 45 .
All the "1" and "0" data were used to calculate Dice's similarity coefficients and genetic distances 46 among the 105 sweet potato accessions by the NTSYS-pc version 2.2 statistical package 47 . A UPGMA dendrogram based on the genetic distance matrix was constructed by MEGA X software 48 to evaluate genetic relationships among the sweet potato varieties. Two-dimensional and three-dimensional PCAs were constructed with the R statistical package 49 and used to indicate the distribution of individual varieties in the scatter diagram.
To investigate the genetic differentiation among the 105 sweet potato accessions, AMOVA was performed based on population inference according to structure analysis by the software Arlequin v3.5 50 , with 1,000 permutations and sum of square size differences as molecular distance. Furthermore, pairwise differentiation levels were estimated by the pairwise F ST , a measure of heterozygosity among populations relative to heterozygosity within populations 51 .