An enriched sugarcane diversity panel for utilization in genetic improvement of sugarcane

Sugarcane crop is important for both sugar and biofuels. A world collection of sugarcane and related grasses (WCSRG) maintained at Miami, FL contains > 1,200 non-redundant clones of different species and genera within the Saccharum complex. However, linkage of undesirable alleles with useful genes in wild species has hindered its efficient utilization in sugarcane breeding. A core collection developed previously with smaller number of clones representing WCSRG did not take into account > 120 wild/exotic clones maintained at the USDA-ARS Sugarcane Research Unit in Houma, Louisiana. Moreover, the genome complexity and sub-tropical to temperate growing climate of Louisiana warrant a region-specific core collection that can be used for base-broadening breeding aimed at efficient introgression of desirable alleles. Genetic diversity of 1,485 clones within WCSRG and Louisiana (commercials, wild/exotic) using 423 SSR alleles showed an average gene diversity (h) at 0.208 among all species groups where Erianthus-like Saccharum species (ELSS), Miscanthus spp., and S. spontaneum each formed a distinct cluster, Saccharum robustum, S. officinarum, hybrid cultivars, and S. edule grouped together in a major cluster, and Saccharum sinense and S. barberi formed distinct grouping. A 309-clone diversity panel (SDP1) was developed that captured the genetic diversity based on the combination of maximum length subtree and manual selection to maximize representation of Louisiana clones and minimize import of clones from Miami. SDP1 shared 324 alleles out of the 423 alleles in the entire population of 1,485 clones and captured the genetic diversity of the entire collection with an average gene diversity (h) at 0.163. The variation within (11–17%) and among (83–89%) the populations in SDP1 were comparable with the entire population of 1,485 clones (9–15% and 85–91%, respectively). The breadth of the genetic variation of SDP1 was exemplified by the intra- and inter-specific diversity of a 190-clone mini-core collection with markers derived from known cold-responsive genes. SDP1 will facilitate genome-wide association studies for identification of trait-specific markers for use in marker-assisted breeding in Louisiana and elsewhere.

Methods plant materials. For the diversity panel, 1,485 clones within the Saccharum complex, including Saccharum, Miscanthus, Coix, Imperata, and Sorghum, were used 36 (Fig. 1). Of these, 1,236 were clones from the WCSRG, 113 clones were elite and historic breeding clones from the Louisiana sugarcane breeding program, 119 were clones of wild/exotic species (not present in WCSRG), and 17 were hybrids from the base-broadening introgression program of the USDA-ARS Sugarcane Research Unit. Saccharum spp. previously classified as Erianthus spp. including S. arundinaceum, S. bengalense, S. ravennae, S. rufipilum, S. brevibarbe, S. kanashiroi, and S. procerum were grouped together as Erianthus-like S. spp. (ELSS) for simplicity of analysis. For diversity analysis with cold responsive genes, a mini-core of 190 clones (Supplementary Table S1 DNA purification and genotyping. Total genomic DNA was extracted from ~ 100 mg leaf tissues using the CTAB miniprep and checked for quality and quantity using a ND-100 spectrophotometer (Nanodrop Technologies Inc, Wilmington, DE), as described previously 36 . Four-hundred-fifty SSR primer pairs, including 277 genomic SSRs 38,39 , 127 eSSRs from sugarcane cold-responsive genes 40 , and 46 eSSRs from brown rust-responsive genes 41 , were initially tested for polymorphism among 113 Louisiana clones. Eleven SSR primer pairs mapped on nine out of 10 sugarcane monoploid homeologous groups 42 and 17 out of 32 pseudochromosomes 43 that had high polymorphism index (Supplementary Table S2) were selected to evaluate genetic diversity of the 1,485 clones.
For genetic diversity analysis of the 190-clone mini-core, 9,974 cold responsive genes reported in Sorghum 44 were searched for the presence of simple sequence repeat (SSR) motifs using BatchPrimer3 v1.0 with criteria set to at least five repeats for dinucleotide motifs and three repeats for tri, tetra, penta and hexa nucleotide motifs. Primers flanking SSR motifs ≥ 16 nt from 52 genes were designed using BatchPrimer3 v1.0. In addition, 48 eSSRs from cold responsive genes of sugarcane 40 were included.
The polymerase chain reaction (PCR) and resolution of PCR products were performed following a method described earlier 21 . Briefly, 50 ng of genomic DNA was used as the template in 10 µl PCR containing 2 µl of 5 × buffer, 1 µl of 25 mM MgCl 2 , 1 µl of 2 mM dNTP mix, 0.1 µl of Taq DNA polymerase (Promega, Madison, WI) and 0.5 μl of 10 µM forward and reverse primer (IDT, Corvallis, OR). A thermal profile of 95 °C for 5 min followed by 35 cycles of 95 °C for 15 s, 58 °C for 15 s, and 72 °C for 30 s, and 72 °C for 10 min was used. The amplification products were resolved in a 13% polyacrylamide gel using a high-efficiency gel electrophoresis system (Nihon Eido, Tokyo, Japan). Amplified fragments (alleles) were manually scored as "1" (present, dominant) and "0" (absent) in a binary matrix.
Genetic diversity analysis. Alleles at SSR loci occurring in less than 1% of the clones were discarded prior to downstream analysis to reduce false similarity between clones due to shared absence of alleles while still capturing rare alleles. The polymorphism information content (PIC) was computed for each SSR marker following 45 . GeneAlEx 6.502 46 was used to compute gene diversity (h), Shannon's information index (I), Nei's genetic distance (D), principal coordinate analysis (PCoA), and an analysis of molecular variance (AMOVA). AMOVA was done for species groups by recorded names: S. spontaneum, S. officinarum, hybrid cultivar, S. robustum, ELSS, S. sinense, S. barberi, Miscanthus spp., and other 46 . AMOVA was also conducted on species groups devised from the neighbor-joining analysis, and on groups devised from the population structure analysis (described below). Private alleles, population differentiation and gene flow were estimated for the mini-core by F ST and N m values, respectively. For the mini-core, h and I for every locus, genetic diversity within a population (H s ), total heterozygosity (H t ), gene flow (N m ), and G st were calculated using PopGene 1.32 47 .  www.nature.com/scientificreports/ Genetic diversity was also analyzed using DARwin 6.0.12 48 using Dice dissimilarity scores to validate the clustering of the clones. Weighted neighbor-joining algorithms were used to construct a phylogenetic tree with 1,000 bootstrap repetitions to evaluate the robustness and significance of each node. Bootstrap analysis was implemented with 1,000 iterations to compute the minimum number of SSR alleles needed to differentiate the species using Bootsie software 49 and allele numbers were plotted against their corresponding CV values in a bi-plot curve.
Structure analysis. Assignment of clones to a specified number of clusters (K) and population structure were determined using Structure ver. 2.3.4 50 . Models were run using Bayesian algorithm for K = 2-10, and K = 8 was selected as per the software's documentation and eight species groups. A standard admixture model was used with an inferred alpha. To accommodate minor alleles, lambda was evaluated at different levels, and a lambda of 0.5 yielded the best models based on the log of the probability of the data. The Markov chain Monte Carlo program converged well before 50,000 iterations, so 50,000 iterations were used for 'burn-in' , and 25,000 subsequent iterations were used for model parameter estimation. To estimate the number of clusters, an admixture model with correlated allele frequencies was run in 10 models, and two non-symmetric modes were found. One mode occurred seven times and the other was less consistent and occurred three times. An average of the seven runs from the first mode was used for the final result.

Results
SSR genotyping. The 11 SSR primer pairs resulted in a total of 423 polymorphic alleles (Supplementary Table S2). The number of alleles per SSR ranged from 13 to 65 with an average of 38. Three sugarcane SSRs on homeologous group (HG) 4 generated 106 alleles. All other HGs were represented by a single SSR producing 18 (HG 9) to 58 (HG 5) alleles with an average major allele frequency of 0.76. The PIC values ranged from 0.17 to 0.38 with an average of 0.25. The average number of alleles per clone-SSR pair was 8.01 with the maximum being 30.16. The maximum average number of alleles per clone for an SSR was 16.28, which is typical of sugarcane where the basic chromosome number of the species in the population ranges from x = 7 to x = 19 5 . Saccharum spontaneum, representing one-third of the population, has a basic chromosome number of 8.
The low frequencies of S. sinense, S. barberi, and Miscanthus spp. necessitated maintaining minor alleles. Thus, alleles with frequencies between 0.990 and 0.010 were retained for analysis. These bounds were equivalent to the frequency of Miscanthus spp. in the population, which was 0.009. All SSR primers produced at least one polymorphic allele with a frequency at or above 0.348, higher than the average allele frequency (0.210).
Gene diversity and allele polymorphism by species. The gene diversity (h) ranged from 0.16 for Miscanthus spp. to 0.24 for S. spontaneum indicating that Miscanthus spp. were the least diverse and S. spontaneum the most diverse. The overall average gene diversity of all species groups was 0.21 (Supplementary Table S2). The number of accessions evaluated for each species apparently influenced the gene diversity. For example, number of Miscanthus accessions (14) were 37-fold less than the S. spontaneum (516).
The entropy measured by Shannon's information index (I) followed the same trend with those species having higher gene diversity also exhibiting higher entropy (Supplementary Table S2). The only exceptions were S. officinarum and S. barberi, where S. officinarum had h = 0.21 but an I = 0.33, and S. barberi had an h = 0.21 but an I = 0.33. The I value ranged from 0.25 (Miscanthus spp.) to 0.39 for (S. spontaneum), with an average of 0.33.
The percentage of polymorphic alleles within a species was apparently directly proportionate to its population size ( Fig. 1, Supplementary Table S3). Again, the polymorphic alleles were highest for S. spontaneum (99.3%) and lowest for Miscanthus spp. (54.4%).
Genetic distance between species. Based on Nei's pairwise genetic distance (D), the greatest distance was between the hybrid cultivars group and Miscanthus spp. at 0.105, while the shortest genetic distance was between S. officinarum and hybrid cultivars group (0.009), followed by the D between S. officinarum and S. robustum (0.011) (Supplementary Table S4). Miscanthus spp. were the farthest from the other species groups. Miscanthus spp. were closest to ELSS and S. spontaneum at D = 0.038 and D = 0.049, respectively, but farthest from S. sinense (D = 0.101). The D value between the rest of the species groups and Miscanthus spp. ranged from 0.07 to 0.105. The next farthest from the others was ELSS (0.024 to 0.086) followed by S. spontaneum (0.032 to 0.060). phylogeny and population structure. Neighbor joining with a Dice dissimilarity matrix was also used to evaluate genetic diversity (Fig. 2). ELSS, Miscanthus spp., and S. spontaneum each showed distinct separation from the other species. Saccharum robustum, S. officinarum, hybrid cultivars, and S. edule grouped together in a major cluster with subclusters concentrated independently with S. robustum and hybrid cultivars. Saccharum sinense and S. barberi also showed distinction, but overall had little intra-species diversity. Miscanthus spp. were most distant from other clones followed by ELSS and S. spontaneum.
The eight sub-populations delineated by the structure analysis had near direct correspondence to species groups (Fig. 3). The first two sub-populations corresponded to both S. officinarum and S. robustum with no clear distinction between them. Other sub-populations corresponded to hybrid cultivars, S. sinense, and S. barberi. ELSS and Miscanthus spp. comprised one sub-population, whereas S. spontaneum was delineated by two sub-populations.
Principal coordinate analysis (PCoA) showed definite variance between species groups (Fig. 4). The coordinates cumulatively accounted for 12.8% of the total variation, with the first three accounting for 8.04, 2.55, and 2.21%, respectively. Saccharum spontaneum grouped by itself with some outliers that grouped with the cluster  Table S5). The first division was based on the clones with nine groups consisting of S. spontaneum, S. officinarum, hybrid cultivars, S. robustum, ELSS, S. sinense, S. barberi,  www.nature.com/scientificreports/ Miscanthus, and others, which included Coix lacryma-jobi, Imperata sp., Sorghum polumosum, Saccharum edule, and unknown species (Fig. 1). The second division was based on the DARwin neighbor-joining on Dice dissimilarities. Here clones grouping together were considered part of the same species groups. This eliminated the 'other' group and combined S. officinarum and S. robustum leaving seven groups. Clones falling in the S. officinarum / S. robustum / hybrid cultivar complex that were not hybrid cultivars were considered part of the S. officinarum / S. robustum group. The third grouping was based on STRU CTU RE model estimates using the group that was estimated to contribute the largest proportion to the genome. All three AMOVA showed significant differences within sub-populations with p values ≤ 0.001 (Supplementary Table S5). Genetic variation within a sub-population was high (85.1-90.7%), whereas 9.3, 14.4, and 14.9% of the variance was attributable to between sub-population variance in the name-based division (groups by species), the DARwin and the STRU CTU RE divisions, respectively.

Sugarcane diversity panel (SDP1) selection.
A 309-clone diversity panel designated as SDP1 was selected by the combination of the molecular markers-based genetic diversity using the maximum length subtree algorithm in DARwin (238 clones) and Louisiana sugarcane breeders selection wherein clones (57 Louisiana commercials and 14 basic F 1 progeny) were picked from different subclusters maximizing the clones existent in Louisiana to minimize the number of clones to be imported from the WCSRG. SDP1 consisted of 284 clones from Louisiana that represent clusters of clones within WCSRG and Louisiana breeding programs. The remaining 25 clones require importation from the WCSRG. Over 100 of the SDP1 clones were Louisiana hybrid cultivars that consisted of both historic and current clones. Louisiana clones exhibited some clustering, but were fairly well dispersed across the entire study population (Fig. 5). Saccharum spontaneum, being a noxious weed, cannot be grown in the field making accurate phenotyping of traits more difficult. Therefore, S. spontaneum clones' representation in the SDP1 were intentionally minimized where hybrids x S. spontaneum F 1 s developed by the base-broadening introgression breeding program of the USDA-ARS replaced 31 S. spontaneum clones. Representation of the ELSS and Miscanthus spp. from the WCSRG were intentionally kept to a minimum. The remainder of SDP1 was proportionally distributed over other species groups (Fig. 1). The gene diversity (h) for different species groups were comparable, ranging from 0.13 for Erianthus-like species to 0.15 for hybrids (Supplementary Table S3). The overall average gene diversity of all species groups was lower (0.16). The polymorphic alleles were highest for S. spontaneum (88%) followed by 77.8% for hybrids. Erianthus-like species had lowest allele polymorphism (40.2%).
Nei's genetic distance (D) showed that Erianthus-like group was the most genetically distant from other species (0.029-0.041) with the highest D (0.041) with S. robustum. Expectedly, D (0.007) was the smallest between hybrid cultivars group and S. officinarum, followed by the equal D (0.016) between S. officinarum and S. robustum and between S. officinarum and S. barberi (Supplementary Table S4). Saccharum sinense and S. barberi were genetically close at D = 0.018.
Analysis of variance showed significant differences within sub-populations (Supplementary Table S5) with high genetic variation among the clones within a sub-population (83.0-89%), whereas 11-17% variation was attributed to sub-populations by species groups, and DARwin and the STRU CTU RE defined divisions.  (22), followed by tetra (24%), hexa (10%), penta (6%), and di (5%) (Supplementary Fig. S1). Of the 100 SSR primer pairs used to genotype the mini-core, 67 were polymorphic producing a total of 1,377 alleles with 37.8 and 27.8 as the average and effective number of alleles, respectively (Supplementary Table S6). Polymorphism information content (Nei's gene diversity, h) of the markers ranged between 0.005 and 0.484 with a mean of 0.211 (Supplementary Table S6). The dendrogram generated through unweighted pair group method of arithmetic mean (UPGMA) exhibited three distinct clusters (Fig. 6). Cluster I contained two subclusters, where subcluster IA was predominated by Saccharum hybrids and close to the subcluster IB containing mostly S. officinarum. Saccharum spontaneum populated subcluster IIA, which was close to Cluster IIB with clones listed as unknowns. The third cluster consisted of two subclusters, where Miscanthus, S. edule, S. kanashiroi, and S. robustum formed subcluster IIIA, and S. arundinaceum, S. erianthus, S. bengalense, S. barberi, S. imperata, S. rufipilum, S. coix, S. ravennae, and S. kanashiroi grouped together as subcluster IIIB (Fig. 6). On the other hand, Bayesian model of structure grouped the clones into five clusters with Louisiana hybrids in a separate cluster (Fig. 7). The principal coordinate analysis (PCoA) based on Euclidean distance between clones was in agreement with the cluster analysis, and coordinates 1, 2, and 3 explained 14.24%, 3.87%, and 1.97% of the variation, respectively (Fig. 8).
Louisiana hybrid clones had the highest number of unique alleles (37 of the population total of 67) (Supplementary Table S7) when the mini-core was analyzed with the alleles generated by the 11 SSR primers used for the entire 1,485 clones. On the other hand, the population group containing Miscanthus and Erianthus had the highest number of unique alleles when analyzed with cold-responsive genes-derived SSR primer pairs (Supplementary Table S7). The mean diversity among the populations with cold responsive genes-derived SSRs (0.218) was higher than that with 11 SSRs (0.182) (Supplementary Table S7). Hierarchical analysis of molecular variance (AMOVA) showed equal amount of variation among and within populations at 19% and 81%, respectively with both cold-responsive genes-derived SSRs and 11 SSRs used for the entire population of 1,485 clones (Supplementary Table S8). Nei's genetic identity between populations was 0.900-0.984 (Supplementary Table S9)

Discussion
The clones within WCSRG are genetically diverse and possess desirable alleles for several agronomic traits 35,51 . However, the daunting task of exploiting the potential of a large number of highly genetically complex clones possessing many undesirable alleles has led to the use of only a few clones in sugarcane breeding programs worldwide. The advent of molecular genomic tools and development/utilization of MAS may allow the tremendous potential value and utility of WCSRG to be realized. A core collection with a smaller number of non-redundant clones, representative of the maximum genetic diversity of the entire collection, will facilitate its utilization in commercial breeding 52 , thus effectively and efficiently managing resources on a smaller but diverse population while maximizing genetic gain. To this end, the present study developed a sugarcane diversity panel (SDP1), a core collection selected from 1,485 clones enriched with clones from Louisiana, based on the diversity analysis and breeders' input ( Figs. 1 and 5). Subtropical and temperate climate specific sugarcane genotypes were not well represented in other core collections previously developed [33][34][35]53 thereby limiting their application for GWAS in Louisiana. For instance, the 300-clone core collection developed 35 includes 228 wild S. spontaneum, other S. spp., other genera, and unknown  www.nature.com/scientificreports/ species, which potentially need to be maintained in pots on concrete. This could hinder the phenotypic characterization of the panel under natural field conditions. SDP1, enriched with Louisiana clones, is expected to circumvent this problem as it contains less than 50 wild-type clones and all of these clones are already maintained at the USDA Sugarcane Research Unit in Houma, Louisiana. While it is challenging to accomplish very good coverage of the complex Saccharum spp. genome with only 11 primers as used in the present study, the random coverage of the alleles over the genome was possible due to the multiallelic nature of the SSR primers 54 distributed over the 10 sugarcane homeologous groups. The 11 SSR primers generated a total of 423 alleles with an average of ~ 38.45 alleles per locus. The range (0.172 to 0.375) of the PIC values of the SSR primers used in the entire collection was similar to the range (0.195 to 0.375) reported earlier 35 . Previous studies have used nearly two times fewer total alleles and average number of alleles per SSR for diversity analysis, such as 205 alleles with 13.67 per locus 55 , 209 alleles with 5.8 per locus 35 , and 261 alleles with 7.35 per locus 56 . Thus, 423 alleles were considered enough to randomly capture a significant proportion of the genetic diversity within the Saccharum genome to effectively discriminate accessions among different species groups. This was evident from the bootstrap results (Supplementary Table S10) that showed that 423 polymorphic alleles reduced the CV to 2.3% and 114 alleles were sufficient enough to result in the recommended 5% coefficient of variation (CV) 57,58 .
The clusters generated by neighbor-joining analysis, structure analysis, and PCoA were similar. The only difference was that, unlike the neighbor-joining analysis, neither the structure analysis nor the PCoA produced a Miscanthus spp. cluster separate from the ELSS. The species groups described by neighbor-joining were S. spontaneum, S. officinarum / robustum, hybrid cultivars, ELSS, S. sinense, S. barberi, and Miscanthus spp. Saccharum experienced one genome duplication shared with Miscanthus, but an additional, more recent lineagespecific duplication leading to autopolyploidization in Saccharum species may explain the distance between Saccharum hybrids and Miscanthus 59 . Principal coordinate analysis showed hybrid cultivars and S. officinarum / robustum clustering together. Additionally, the PCoA most likely could provide visualization for the majority of the variance between populations with 10.59% of the total variance explained by the first two coordinates, which was close to 9.3, 14.4, and 14.9% found by the AMOVA between groups. All this evidence supports the current Saccharum species delineations.
The low robustness of the neighbor-joining tree was due to the combination of varying numbers of clones within a species in the population and the use of minor alleles. Because of the nature of the population, alleles primarily present in S. spontaneum, in particular, or in S. robustum, S. officinarum, or hybrid cultivars required little representation within their primary species to be included at higher frequencies under bootstrapping, while alleles primarily present in species with less clones needed more representation. The cut-off allele frequency of 0.01 used in this study affected 15 clones, represented by 3% of the S. spontaneum, but up to 107% of the Miscanthus spp. clones. Thus, each species did not have equal representation among the alleles used, which resulted in changes in the diversity-based tree when different samples were selected. Nevertheless, the genetic diversity was well represented, as demonstrated by the neighbor-joining, structure and principal coordinate analyses.
Results of the diversity analysis of the WCSRG from the present study and that of Nayak et al. 35 were comparable. In the present study, overall averages were 0.213 and 0.338 for gene diversity and Shannon's information indexes, respectively, as compared to 0.310 and 0.438, respectively, observed by Nayak et al. 35 . The percentage of polymorphic alleles in the present study ranged from 54.37 to 99.29, whereas it ranged between 75.60 and 99.52 www.nature.com/scientificreports/ for Nayak et al. 35 . The genetic distance between groups/species in the present study was lower than previously published reports. For example, the distance between S. spontaneum and S. officinarum was 0.55 as compared to 0.79 reported by Nayak et al. 35 . The low averages in the present study could be due to the retention of minor alleles. Nevertheless, S. spontaneum remained the most genetically diverse species, primarily due to its high variation in chromosome number (2n = 40-128), diversity of habitats, and widest geographic distribution 43,60 . Clustering analysis results of the WCSRG and Louisiana collection in the present study were similar to that observed by Nayak et al. 35 where three main clusters: 1) S. spontaneum; 2) S. officinarum, hybrid cultivars, S. robustum, S. edule, S. barberi, and S. sinense; and 3) other species were described. However, species within each cluster were resolved better in our study, where seven clusters in the neighbor-joining analysis (S. spontaneum, S. officinarum / robustum, hybrid cultivar, ELSS, S. sinense, S. barberi, and Miscanthus spp.) and six clusters with the structure analysis and the PCoA (S. spontaneum, S. officinarum / robustum, hybrid cultivar, ELSS / Miscanthus spp., S. sinense, and S. barberi) were identified. Nayak et al. 35 left Erianthus and Saccharum nomenclatures as found in the WCSRG database. They used non-Saccharum species as a group, grouping Erianthus spp. with Miscanthus spp. Considering that several Erianthus spp. have been reclassified under Saccharum, it is difficult to comprehend if their analysis found the same distinction that was observed in the present study between ELSS and Miscanthus spp. The distinction between Saccharum spp. and Erianthus-like Saccharum spp., as found in our study, was also observed earlier 61 .
Genic SSRs derived from the cold-responsive transcripts with a putative function are useful for functional diversity assay in natural populations or germplasm collections. The low polymorphism information content of the genic SSRs (0.211) we observed among the clones in WCSRG was similar to that with a set of cold-responsive genes-derived markers among the Louisiana clones in our previous study 39 . The low polymorphism of genic SSRs could be explained by the conserved sequence of the transcribed regions within the Saccharum genus 62 or by the low transmission of diverged sequence differences between the Saccharum spp. 51 . Genetic similarity was high among the sugarcane hybrids, which is likely because only a few parental clones were involved in the development of the foundation clones through nobilization in breeding programs 63 . Alleles at these loci are possibly fixed as a result of a limited number of founder parents and subsequent selection. The Louisiana sugarcane hybrids formed a distinct group, close to the group containing S. officinarum clones, which is expected because most of the modern cultivars inherited the major part (~ 80-85%) of the genome of S. officinarum during nobilization events 64 . Besides, Louisiana-bred clones grouped in a subcluster within the hybrids cluster, which may be because they were developed in subtropical conditions under selection pressure from cold temperatures. Similar to the S. spontaneum in the entire collection of 1,485 clones, S. spontaneum clones in subcluster 3 of the minicore were close to unknown S. spp. clones in cluster 4. This supported our assumption that most, if not all, of the unknown clones in the WCSRG are S. spontaneum. The non-correspondence of genetic similarity of clones within a species group within WCSRG and the mini-core is believed to be due to the varied cold stress response of the clones. This is justified by the high percentage (81.2%) of variation within a species group (population) compared to 18.8% among the species groups (Supplementary Table S8) and high average gene diversity (0.218) in the mini-core (Supplementary Table S7) as compared to the 0.208 in the present study (Supplementary  Table S3) and 0.304 35 within the WCSRG. The number of alleles unique to the different species groups (average 17.4; Supplementary Table S7) suggested a wide genetic variation in cold stress response of the mini-core clones. Ongoing studies are focused to elucidate the role of variation in microsatellite loci in discriminating clones with varied cold tolerance and future development of functional markers for their use in MAS in breeding for cold tolerant sugarcane cultivars.
Taken together, the genetic diversity analysis based on genotypic data of 1,485 clones helped Louisiana breeders in the selection of 309 clones (20.8%) as the SDP1 that, with an average h at 0.163 (Supplementary Table S3), captured the diversity found in the population (average h at 0.208), including the Louisiana core collections. SDP1 and mini-core shared 324 and 319 alleles, respectively out of 423 alleles in the entire collection of 1,485 clones (Supplementary Table S2). The percentage of variation among the populations in the entire collection and SDP1 were comparable (Supplementary Table S5). The genetic diversity of the mini-core by cold-responsive genes-derived SSR markers, which potentially affected the function of the cold-responsive proteins, suggested the diversity present for phenotypic traits, in this case cold response, within SDP1. SDP1 is now being used for phenotyping of various traits of agronomic importance for GWAS studies to identify markers associated with those traits. Phenotypic characterization within SDP1 can then be used to identify clones of special interest for particular traits, and these results will allow more effective introgression of useful alleles from wild/exotic clones in the already existing sugarcane base-broadening program of Louisiana to improve commercial sugarcane or energy cane variety development. In addition, another important value of SDP1 will be to serve as the validation platform for trait-markers identified from QTL mapping studies involving biparental populations. This function will allow the realization of the potential for molecular breeding for improved cultivar development in sugarcane.