Taraxacum kok-saghyz (rubber dandelion) genomic microsatellite loci reveal modest genetic diversity and cross-amplify broadly to related species

Nowicki, Marcin; Zhao, Yichen; Boggess, Sarah L.; Fluess, Helge; Payá-Milans, Miriam; Staton, Margaret E.; Houston, Logan C.; Hadziabdic, Denita; Trigiano, Robert N.

doi:10.1038/s41598-019-38532-8

Article
Open access
Published: 13 February 2019

Scientific Reports volume 9, Article number: 1915 (2019)

Abstract

Taraxacum kok-saghyz (TKS) carries great potential as an alternative natural rubber source. To better inform future breeding efforts with TKS and gain a deeper understanding of its genetic diversity, we utilized de novo sequencing to generate novel genomic simple sequence repeat markers (gSSRs). We utilized 25 gSSRs on a collection of genomic DNA (gDNA) samples from a germplasm bank, and two gDNA samples from historical herbarium specimens. PCR coupled with capillary electrophoresis and an array of population genetics tools were employed to analyze the dataset of our study as well as a dataset of the recently published genic SSRs (eSSRs) generated on the same germplasm. Our results using both gSSRs and eSSRs revealed that TKS has low-to-moderate genetic diversity with most of it partitioned to the individuals and individuals within populations, whereas the species lacked population structure. Nineteen of the 25 gSSR markers cross-amplified to other Taraxacum spp. collected from the Southeastern United States and identified as T. officinale by ITS sequencing. We used a subset of 14 gSSRs to estimate the genetic diversity of the T. officinale gDNA collection. In contrast to the obligatory outcrossing TKS, T. officinale presented evidence for population structure and clonal reproduction, which agreed with the species biology. We mapped the molecular marker sequences from this study and several others to the well-annotated sunflower genome. Our gSSRs present a functional tool for the biodiversity analyses in Taraxacum, but also in the related genera, as well as in the closely related tribes of the Asteraceae.

Introduction

The growing human population has generated an increased demand for resources, including rubber, a substrate used for over 40,000 commercial products¹ (Supplementary Fig. S1). Significant progress has been made in the production of synthetic rubber from non-renewable petroleum, and this increased its percentage in the total amount of the rubber supplied¹. Yet, the vast majority of rubber production is still reliant on the same natural source from which it was initially discovered, the rubber tree Hevea brasiliensis Müll. Arg. (rubber palm^2,3). This important crop is threatened by the South American leaf blight pathogen Microcyclus ulei (Henn.) Arx (syn. Pseudocercospora ulei [(Henn.) Hora Junior & Mizubuti, comb. nov. 2014]) and is losing competition for the land and the manpower against the economically favored African oil palm Elaeis guineensis Jacq^2,4.

Natural rubber from plants outperforms that from petroleum in several aspects: the polymer of the natural rubber has much higher molecular weight compared with the synthetic rubber and the sustainable and renewable production of the plant (natural) rubber is considered superior to processing the non-renewable petroleum¹. Several thousand plant species from across the world were screened for laticiferous properties, especially at times of increased rubber demand, e.g., WWI or WWII⁵ (Supplementary Fig. S1). The current body of scientific evidence points towards only a few species bearing the potential as an alternative to H. brasiliensis as a source of usable latex^1,2,3,5,6. These species include guayule (Parthenium argentatum Gray), rubber ficus (Ficus elastica Roxb. ex Hornem.), and Russian dandelion (Taraxacum kok-saghyz Rodin; TKS). The molecular properties of rubber from each of these plants differ from those of the Hevea product^1,2,3,6 and point toward specialty uses on species basis. For instance, the guayule rubber could be used for medical products because of the lower content of allergenic proteins². The TKS rubber is of particular interest to the tire industry due to its high molecular weight (polymer index) and fast generation time (six months in TKS vs. seven years in Hevea), albeit with a comparatively higher content of allergenic proteins^1,2,3. Moreover, each of these species could be grown in areas complementary to the Hevea palm (24°S through 23°N⁷) with latitudes reaching as high as temperate zones (P. argentatum: 21°N through 37°N; F. elastica: 10°S through 35°N; TKS: 35°N through at least 45°N).

TKS is of particular interest for the industry due to the proven success in production of tires⁸. The tire industry reported in their very first uses of TKS rubber that the tires “differed but little, according to their mechanical characteristics, from those made from imported natural rubber⁹” (citing¹⁰). In addition, it offers an accessory gain of inulin used in the manufacturing of numerous commercial products^11,12,13,14. Both biosynthetic pathways are linked interchangeably within the TKS metabolism^13,14,15. Several establishments devoted to TKS rubber production were founded in the United States (US) and Europe (Kultevat Inc., KeyGene Inc., ESKUSA GmbH⁸) emanating from the major research projects (project acronyms: EU-PEARLS; DRIVE4U¹⁶).

TKS is native to Kazakhstan¹⁷ and Western Xinjiang, China¹⁸, and is currently grown in Western Europe and North America alike (Kultevat Inc., KeyGene Inc., ESKUSA GmbH^8,16,19). The plant was a major crop and model plant for rubber studies during the times of the Union of Soviet Socialist Republics (USSR). As hypothesized in other studies, likely due to the governmental pressure on performance, the TKS germplasm was profusely confused with the common dandelion species (T. officinale or T. brevicorniculatum). As a result, the world’s germplasm and botanical gardens collections were annotated as TKS for over 50 years despite being common dandelions^20,21,22,23. Recent United States Department of Agriculture - Agricultural Research Service (USDA-ARS) and European expeditions helped remedy this issue and provided new properly identified germplasm^22,24. TKS is an obligatory outcrossing, self-incompatible, diploid herbaceous plant (2n = 16), and morphologically resembles common dandelions, which exhibit mostly clonal reproduction due to polyploid genomic architecture^21,25.

A recent spike in the TKS research confirmed the USDA-ARS collected species identity using the morphological^12,13,15,19,21, molecular^14,21,25,26, biochemical^11,12,14, physiological^11,13,15, and breeding^14,19,26 approaches. The outcrossing nature of TKS was taken into account when devising the plant genome linkage map²⁷, followed by its genome sequence assembly to contigs⁴ and transcriptome sequencing²⁸. All of this helped elucidate the TKS latex biosynthesis pathways^14,28,29. In addition, a number of physiological and developmental studies on currently available germplasm provided data that were helpful in maximizing the rubber/inulin yield in both years the plants were grown^11,13,15,19. TKS also proved amenable to genetic transformation and tissue culture^14,26, indicating the potential for its breeding engineering and increases in yield of rubber^30,31 or inulin^13,14,15.

Although some progress has been made regarding TKS biology, physiology, and genetics, until recently only limited information was available regarding the species diversity and inheritance/interplay of traits of interest. McAssey et al.³² utilized the USDA-ARS TKS germplasm²⁴ to estimate the genetic diversity of the species using 17 expressed-sequence tags/simple sequence repeats markers (genic SSR; EST-SSR; here dubbed as “eSSR”) mined from the available GenBank EST libraries, across 17 TKS populations from the species native area²⁴. They concluded that the majority of the species diversity is captured within each population³². Similar conclusions were drawn from a study of Russian, American (USDA-ARS), and wild Chinese accessions of TKS using 23 eSSRs³³. None of these studies utilized nuclear genomic short-sequence repeats markers (gSSRs) to infer the population structure and genetic diversity of this economically important plant species. Moreover, the available TKS genome assembled to the contigs level only⁴ is lacking an extensive annotation or higher-level organization, despite providing insights into the TKS rubber/inulin biosynthetic pathways.

The goal of our study was to infer the TKS population structure, information of high value for breeding of this potential industrial crop. We utilized de novo sequencing to generate novel TKS gSSRs and to estimate the genetic diversity and spatial structure of the USDA-ARS TKS germplasm. We hypothesized that the majority of the species diversity would be captured in each examined population, as found in prior studies that utilized eSSRs^32,33. The specific research objectives included the following: (1) identifying and characterizing polymorphic gSSR loci using de novo sequence of the TKS genome and mapping of the useful polymorphic gSSRs and other marker sequences onto the well-annotated genome of the related species Helianthus annuus^4,27,32; (2) estimating the genetic diversity and inferring the population structure of the USDA-ARS TKS germplasm²⁴ and two available historical herbarium TKS samples using gSSRs; and (3) comparing the gSSR data with the published eSSR data of McAssey et al.³² and Yushuang et al.³³ to reach better-informed conclusions on TKS genetic diversity. We then deployed those gSSRs in a cross-amplification study with the local US dandelion samples (T. officinale), including their molecular identification. Information provided here will be useful in advancing future TKS studies, in the current and future breeding efforts of this potential crop for renewable rubber, and in augmenting the currently available resources for analyses of Taraxacum spp. and related plants.

Results

Designing and validating gSSR-markers

TKS SSRs discovery and the marker map

The TKS de novo genome sequencing yielded 45,804,966 paired-end reads of 275 bp. After trimming, 42,367,598 reads with a mean length of 265 bp were masked and used for de novo assembly with ABySS. The resulting de novo assembly contained 8,077,494 unitigs, from which 99,429 SSRs were identified on 95,692 sequences. Of these, 11,259 were compound SSRs, meaning that two SSRs were separated by at most 15 bp. Primers were computed for 22,764 perfect SSRs. The numbers of SSRs with primers were 15,760 for the di-, 4,893 for the tri-, and 2,111 for the tetranucleotides, respectively.
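The SSR classes described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the minimum repeat counts, the function names, and the toy sequence are assumptions made for illustration, and only the 15 bp compound-SSR spacing comes from the text.

```python
import re

# Illustrative minimum repeat counts per motif length (assumed, not from the study).
MIN_REPEATS = {2: 6, 3: 4, 4: 4}
MAX_COMPOUND_GAP = 15  # bp; the compound-SSR spacing quoted in the text


def is_redundant(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AAA', 'ATAT')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_perfect_ssrs(seq):
    """Return sorted (start, end, motif, repeats) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One captured motif followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            if is_redundant(m.group(1)):
                continue
            repeats = (m.end() - m.start()) // motif_len
            hits.append((m.start(), m.end(), m.group(1), repeats))
    return sorted(hits)


def compound_pairs(hits, max_gap=MAX_COMPOUND_GAP):
    """Pair neighbouring SSRs separated by at most max_gap bp."""
    pairs = []
    for left, right in zip(hits, hits[1:]):
        gap = right[0] - left[1]
        if 0 <= gap <= max_gap:
            pairs.append((left, right))
    return pairs


# Toy sequence: an (AT)8 repeat and a (GAA)5 repeat separated by 5 bp.
toy = "GGC" + "AT" * 8 + "CCGTC" + "GAA" * 5 + "TTGC"
ssrs = find_perfect_ssrs(toy)
print(ssrs)
print(len(compound_pairs(ssrs)))
```

On the toy sequence the sketch reports one dinucleotide and one trinucleotide SSR, and flags them as a compound pair because only 5 bp separate them.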

Because of the lack of a well-annotated TKS genome, the ~1 kb sequences pulled from the TKS contigs of Lin et al.⁴ containing the markers used for the construction of TKS linkage map²⁷ and those of the gSSR and eSSR³² population genetics studies were mapped to the sunflower genome. The markers analyzed were located across all eight TKS linkage groups based on the mapping back to the sunflower genome (Fig. 1). In several instances, the marker sequences localized to separate TKS contigs, but the sequences mapped to the same sunflower genome regions (Fig. 1). Only one of our gSSRs (Tara026) co-localized with two other map markers (TC27; TC66) within a single TKS genome scaffold, and also mapped back close to one another on the sunflower genome (Fig. 1).

SSR genotyping and analyses

We chose a pool of 25 di- and 25 tri-nucleotide repeat gSSR markers for the study of TKS germplasm (Table 1 and Supplementary Table S1). After their initial screening on the TKS gDNA, this gSSR pool was reduced as several did not amplify a significant number of gDNA samples of the collection, lacked polymorphic alleles, or amplified a more complex banding pattern. We thus chose the 25 best-performing gSSRs for their specificity (single or double PCR products only) and reproducibility, and used them for TKS population genetics studies (Table 2 and Supplementary Table S1).

Table 1 Population genetics indices of Taraxacum kok-saghyz (TKS) populations.

TKS: Population genetics analyses

gSSRs: Analysis of TKS spatial fixation genetics indices, Multi-locus genotype (MLG) networks, and population structure: Our results suggest no significant deviations from the Hardy-Weinberg equilibrium (HWE) across the 25 gSSR markers used to analyze the TKS populations (Supplementary Fig. S2) despite low sampling. Only the TKS population 35162, and to a lesser extent 35178, show deviations from HWE at six and two loci, respectively. All loci were polymorphic in each population tested, and no clonal MLGs were detected. The Genotype Accumulation Curve (GAC) analyses indicated an MLG saturation (Supplementary Fig. S3) with eight gSSRs in the analyzed TKS germplasm. Analyses of the Index of Association (Ia) confirmed the outcrossing character of the TKS germplasm studied (Supplementary Fig. S4). Only modest linkage disequilibrium was found in the gSSR TKS dataset (Supplementary Fig. S5) and suggested well-dispersed genomic locations of the gSSRs used.
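The idea behind a genotype accumulation curve can be sketched in Python as follows. This is a simplified illustration with made-up genotypes, not the analysis run in the study: it averages the number of distinct multi-locus genotypes over every subset of k loci, which is only feasible for a handful of loci (for a panel of 25 markers, random resampling of loci would be needed instead).

```python
from itertools import combinations


def genotype_accumulation(genotypes):
    """Mean number of distinct multi-locus genotypes (MLGs) per number of loci.

    genotypes: list of samples, each a tuple of per-locus genotypes.
    Returns {k: mean MLG count over all subsets of k loci}.
    """
    n_loci = len(genotypes[0])
    curve = {}
    for k in range(1, n_loci + 1):
        counts = []
        for loci in combinations(range(n_loci), k):
            mlgs = {tuple(sample[i] for i in loci) for sample in genotypes}
            counts.append(len(mlgs))
        curve[k] = sum(counts) / len(counts)
    return curve


# Hypothetical allele sizes (bp) for four diploid samples at three loci.
samples = [
    ((150, 152), (200, 200), (310, 316)),
    ((150, 152), (200, 204), (310, 316)),
    ((150, 150), (200, 204), (310, 310)),
    ((150, 152), (200, 204), (310, 310)),
]
print(genotype_accumulation(samples))  # {1: 2.0, 2: 3.0, 3: 4.0}
```

The curve is considered saturated at the number of loci beyond which adding markers no longer separates further genotypes; in the toy data all four samples are only distinguished once all three loci are used.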

The amplified gSSR markers yielded from 3 to 13 alleles per locus, averaging about 6 across the TKS germplasm pool (Table 2). The 25 gSSRs used indicated a moderate degree of inbreeding within the populations and overall (F_IS = 0.287; Tables 1 and 2). Our results further indicated a moderate TKS population fixation and genetic differentiation across the 25 loci tested (F_ST = 0.094; F’_ST = 0.098; Table 1). This implied a moderate level of gene flow among the TKS populations (inferred N_m = 2.41). Collectively, the data indicated rather low genetic differentiation among the TKS datasets analyzed, despite high allelic diversity of the obligatory outcrossing TKS. In agreement with the spatial fixation indices accrued, AMOVA for the gSSR dataset indicated the majority of the molecular diversity partitioned among the individuals and not among populations (Φ_IT = 66.52%; Φ_IS = 23.63%; Φ_ST = 9.86%).
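The text does not state how N_m was inferred. Assuming Wright's island-model approximation, N_m = (1/F_ST − 1)/4, the reported values can be reproduced from the overall F_ST estimates, as in this short Python sketch (the function name is ours):

```python
def gene_flow_nm(f_st):
    """Island-model estimate of migrants per generation: Nm = (1/F_ST - 1) / 4."""
    if not 0 < f_st < 1:
        raise ValueError("F_ST must lie strictly between 0 and 1")
    return (1.0 / f_st - 1.0) / 4.0


print(round(gene_flow_nm(0.094), 2))  # gSSR dataset, F_ST = 0.094 -> 2.41
print(round(gene_flow_nm(0.11), 2))   # eSSR dataset, F_ST = 0.11  -> 2.02
```

Both results agree with the N_m values quoted for the gSSR and eSSR datasets, which supports, but does not prove, that this approximation was the one applied.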

Table 2 List of the Taraxacum kok-saghyz (TKS) genomic short sequence repeat (gSSR) markers developed in the study and summary statistics across 20 TKS populations.

To compare the relatedness of both TKS datasets (gSSRs and eSSRs), we generated pairwise matrices of population genetic distances for both. Values of the population pairwise distance matrix of F_ST ranged from 0.018 to 0.355 for the gSSR, and from −0.024 to 0.261 for the eSSR datasets, respectively (data not shown). The pairwise population F_ST distance matrices (and D_ST matrices; both unstandardized and standardized; data not shown) for the gSSR and eSSR datasets provided similar results (Supplementary Fig. S6), thus indicating that the TKS diversity information was comparable between them. Sub-population-wise, the distance matrix for the gSSR dataset showed low resolution in the analyzed TKS collection (Fig. 2; Prevosti genetic distance range: 0.004 to 0.244, averaging 0.076 ± 0.056). The Neighbor-Joining dendrogram built on this basis indicated three TKS sub-populations as outliers (Herbarium, 35162, and 35178), and the remaining ones possibly divided into two separate clades. Testing of the geographic distance among TKS populations driving the genetic diversity of the species proved inconclusive (Fig. 3).
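For readers unfamiliar with the Prevosti distance used for the population-wise trees, a minimal Python sketch follows. It assumes the common population-level definition, half the summed absolute differences in allele frequencies averaged over loci; the allele frequencies below are invented for illustration and are not taken from the study.

```python
def prevosti_distance(freqs_a, freqs_b):
    """Prevosti's absolute genetic distance between two populations.

    freqs_a, freqs_b: lists with one dict per locus, mapping allele -> frequency.
    D = (1 / (2 * n_loci)) * sum over loci and alleles of |p_a - p_b|.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need the same loci")
    total = 0.0
    for locus_a, locus_b in zip(freqs_a, freqs_b):
        alleles = set(locus_a) | set(locus_b)
        total += sum(abs(locus_a.get(al, 0.0) - locus_b.get(al, 0.0)) for al in alleles)
    return total / (2 * len(freqs_a))


# Hypothetical allele frequencies at two loci for two populations.
pop_1 = [{150: 0.5, 152: 0.5}, {200: 1.0}]
pop_2 = [{150: 0.25, 152: 0.75}, {200: 1.0}]
print(prevosti_distance(pop_1, pop_2))  # 0.125
```

The distance is 0 for identical allele frequencies and 1 when the two populations share no alleles at any locus, which helps put the reported range of 0.004 to 0.244 into perspective.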

Reticulation analyses of the gSSR MLGs using the Minimum Spanning Networks (MSN) supported the F-statistics conclusions with no evidence of population structure in the TKS germplasm analyzed with gSSRs (Fig. 4). The lack of clustering or population structure visualized this way suggests species-wide gene flow, implying that TKS diversity is well retained at the sub-population level. The results of the Discriminant Analysis of the Principal Components (DAPC; Fig. 5) were in agreement with the F_ST population-wise trees (Fig. 2), with the gSSR populations 35162, 35178, and the Herbarium samples placed at some distance from the majority of the remaining samples.

eSSRs: Comparative analysis of TKS spatial fixation genetics indices, Multi-locus genotype (MLG) networks, and population structure: In the re-analyzed eSSR TKS dataset³², the significant deviations from HWE presented a locus-wise pattern and were much more common in occurrence than in the gSSR dataset (Supplementary Fig. S2). The eSSRs saturated the MLGs detected in the TKS germplasm significantly slower than gSSRs (10 vs. 8 markers, respectively; Supplementary Fig. S3). The eSSR dataset provided congruent results with the gSSR dataset on Ia and pairwise linkage disequilibrium (Supplementary Figs S4 and S5). Regarding the fixation indices, the eSSR dataset harbored an overall F_ST = 0.11³² and F’_ST = 0.068 (data not shown), and an inferred N_m = 2.02. Partitioning of the molecular variance with AMOVA for the eSSR dataset yielded results similar to the gSSR dataset (Φ_IT = 84.61%; Φ_IS = 8.34%; Φ_ST = 7.04%). Differences occurred in the variance partitioned among individuals within populations, and the gSSR dataset showed higher value of this parameter than the eSSR dataset. The F_ST distance matrix for the eSSR dataset of TKS showed different population-wise clustering from the gSSR dataset (Fig. 2). The eSSR study showed marginally higher resolution in the pairwise genetic distances of TKS populations, likely due to a much higher number of samples per population analyzed (Prevosti distance range: 0.003 to 0.149, averaging 0.099 ± 0.082). Similar to the gSSR dataset, the sub-population 35162 was separated with high confidence from the bulk of other sub-populations, as was 35159. The absolute placement of the eSSR sub-populations differed from the gSSR dataset and indicated generally better resolution than the gSSR dataset, but no major clustering. Testing of the geographic distance among TKS populations driving the genetic diversity of the species proved inconclusive, similar to the gSSR dataset (Fig. 3).

MSN reticulation of the eSSR dataset (Fig. 4) provided results similar to the gSSR dataset, confirming that study’s conclusions³² of TKS lacking well-defined population structure. Analysis of networks from both gSSR and eSSR datasets resulted in similar Bruvo’s genetic distance ranges, and congruently implied lack of TKS population structure. Similar to the gSSR dataset, the DAPC analysis of the eSSR dataset (Fig. 5) also confirmed the sub-population-wise tree of genetic distances (Fig. 2). The eSSR population 35162 presented a (diverged) pattern similar to that observed in the gSSR dataset. Overall, our results suggest a lack of a well-defined population structure of the TKS germplasm with little support for the more differentiated population 35162.

Analyses of US Taraxacum officinale

Species genotyping and assignment and ITS phylogeny of the plant materials: Species identity of the samples collected from Tennessee, Georgia, Alabama, and Mississippi (Tables 3 and S1) was confirmed by Internal Transcribed Spacer region (ITS) sequencing (Fig. 6; Supplementary Tables S1 and S2). Samples identified as Taraxacum spp. lacked major differences in their ITS sequences (Fig. 6) and could not be unambiguously classified at species level based on this criterion alone (NCBI BLAST; data not shown). Grouping with the T. officinale and other Taraxacum species sequences for ITS³⁴ and NCBI consensi (Supplementary Table S2) did not resolve our collection into distinct species (Fig. 6 and Supplementary Files S2). Therefore, based on that non-resolution and due to the plants sharing major morphologic similarities, we treated those samples as a presumptive T. officinale collection. ITS sequencing also identified a number of outgroup specimens, morphologically resembling the T. officinale but from distant genera such as Youngia (Y. japonica; GU724281.1; 99% ITS sequence identity over 100% coverage), Hypochaeris spp. (several species hit with 99% and higher identity over 99% and higher coverage), Krigia spp. (L13945.1; 98% identity over 100% coverage), Lactuca (L. canadensis; GU818575.1; 99% identity over 99% coverage), Pyrrhopappus (P. carolinianus; AY218955.1; 99% identity over 90% coverage), and Erigeron (E. annuus; EF107653.1; 99% identity over 100% coverage, E. philadelphicus; AF046989.1; 99% identity over 90% coverage).

Table 3 Population genetics indices of the US dandelion (Taraxacum officinale) populations.

Analysis of Taraxacum officinale spatial fixation genetics indices, Multi-locus genotype (MLG) networks, and population structure: The majority of the TKS-derived gSSRs cross-amplified the gDNA of the related US native dandelions (T. officinale) and of the outgroup specimens (Tables 2, 4 and S1). From the 25 gSSRs tested using the TKS gDNA collection, 21 gSSRs (five di- and 16 tri-nucleotide repeats) cross-amplified to the T. officinale gDNA collection as confirmed on four gDNA samples (Knoxville, TN population). Overall, the cross-amplification was broad and proved effective even in the specimens of related genera and tribe (Supplementary Table S1).

The analyses of the species diversity and population structure included 74 samples of T. officinale collected in several locations in the US using the 14 best-performing gSSRs (five di- and nine trinucleotide repeats) developed for TKS. Our results indicated violations of HWE in both a locus-wise and a population-wise manner (Supplementary Fig. S2). The MLG accumulation in this dataset was comparatively the slowest among all the datasets analyzed as 13 gSSRs saturated the genotype accumulation curve (Supplementary Fig. S3). Moreover, the index of association (Ia) was typical for clonal/asexual organisms (Supplementary Fig. S4; P = 0.051). Linkage disequilibrium range for this dataset was similar to that of the gSSR study of TKS (Supplementary Fig. S5) with the difference of fewer and smaller negative values recorded for the T. officinale dataset. As expected, the ploidy of the apomictic T. officinale samples estimated by the number of detected alleles often reached the tetraploid levels (diploid, n = 4; triploid, n = 17; tetraploid, n = 53; Supplementary Table S1), which limited the scope of the population genetics analyses, in particular the F-statistics (fixation indices analyses). To gain access to that data, we coded the whole dataset as tetraploid with occasionally missing alleles and corrected the ploidy with the R package polysat before analyses.
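The allele-count approach to ploidy and the tetraploid coding can be sketched in Python as below. This is a simplified stand-in for illustration only, not the polysat workflow used in the study; the function names, the locus labels, and the use of None for a missing allele copy are assumptions.

```python
def infer_ploidy(alleles_by_locus):
    """Lower-bound ploidy: the largest number of distinct alleles at any locus.

    A floor of 2 is assumed, because a fully homozygous sample still
    carries at least two chromosome sets.
    """
    return max(2, max(len(set(a)) for a in alleles_by_locus.values()))


def pad_genotype(alleles, ploidy=4, missing=None):
    """Code one locus at a fixed ploidy, filling unobserved copies with `missing`."""
    distinct = sorted(set(alleles))
    if len(distinct) > ploidy:
        raise ValueError("more alleles observed than the assumed ploidy allows")
    return distinct + [missing] * (ploidy - len(distinct))


# Hypothetical sample: three alleles at one locus imply at least a triploid.
sample = {"locus_a": [150, 152, 156], "locus_b": [200, 204]}
print(infer_ploidy(sample))             # 3
print(pad_genotype(sample["locus_b"]))  # [200, 204, None, None]
```

Because allele dosage cannot be read from fragment sizes alone, the allele count only gives a minimum ploidy, which is one reason the fixation indices for this dataset need cautious interpretation.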

The T. officinale dataset displayed between 5 and 16 alleles per locus (averaging about 10; Table 4). The estimated dataset-wide F_ST value was 0.044 and the D’_ST was 0.048. Population-wise F_IS values (Table 3) indicated a considerable degree of homozygote excess in this dataset, further supporting the conclusion of asexual reproduction in this species. The population-wise Prevosti distance tree for T. officinale indicated its genetic distances were lower than TKS using the same markers (range: 0.008 to 0.157, averaging 0.055 ± 0.045; Fig. 2), indicating the lowest resolution in this dataset among those analyzed in the study. Further, the majority of the tree remained unresolved, with the samples from Herbarium (US western coast) and KnoxvilleTN forming an outgroup to the bulk of the dataset yet separated from one another. Similar separation was observed when analyzing the genetic and geographic distance matrices using the Mantel test (Fig. 3). Herbarium samples from the US western coast clustered separately from the remaining individuals based on the geographic spacing (Fig. 3). The majority of the molecular variance was retained among the individuals within the populations, whereas about one quarter of the total variance was partitioned among the populations (AMOVA: Φ_IS = 74.98%; Φ_ST = 25.02%). Several analyses indicated the presence of population structure in this dataset. The MSN analyses (Fig. 4) took into account motif lengths in the gSSRs and grouped individuals of several populations together using the Bruvo distance. In agreement with the population-wise tree of distances (Fig. 2), the DAPC analyses separated the WelcomeRaMS, as well as the Herbarium and KnoxvilleTN samples from the bulk of the remaining ones (Fig. 5). Comparatively larger resolution of this dataset than either of the TKS datasets suggested more pronounced population structure in the T. officinale, as (sub-)populations are more diverged from one another than in TKS. 
Bruvo’s distance-based tree of individuals (Fig. 7; motif lengths considered) was visualized using the Bayesian Information Criterion and grouped individuals from the geographically close populations together yet further implying population structure in the common dandelion species. Collectively, the results for T. officinale indicated the existence of low-diversity populations clonal in character but differentiated geographically.
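The Mantel test mentioned above compares a genetic and a geographic distance matrix by permuting sample labels. A minimal Python sketch is given below; it is a generic one-sided permutation test with invented distances, and the permutation count and seed are arbitrary choices rather than settings from the study.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mantel(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test on two square, symmetric distance matrices.

    Returns (observed r, permutation p-value for r being at least as large).
    """
    n = len(genetic)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo = [geographic[i][j] for i, j in pairs]

    def correlation(order):
        # Relabel rows and columns of the genetic matrix together.
        gen = [genetic[order[i]][order[j]] for i, j in pairs]
        return pearson(gen, geo)

    observed = correlation(list(range(n)))
    rng = random.Random(seed)
    at_least = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        if correlation(order) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (permutations + 1)


# Hypothetical populations on a line; genetic distance grows with geography.
coords = [0, 1, 3, 7, 15]
geo_matrix = [[abs(a - b) for b in coords] for a in coords]
gen_matrix = [[0.01 * d for d in row] for row in geo_matrix]
r, p = mantel(gen_matrix, geo_matrix, permutations=199)
print(round(r, 3), p)
```

In this toy case the correlation is perfect by construction; with real data, a small p-value would indicate isolation by distance, whereas the TKS results reported above were inconclusive.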

Table 4 The US Taraxacum officinale and summary statistics, using the T. kok-saghyz (TKS) gSSR markers.

Discussion

In this study, we aimed to gain a deeper understanding of the genetic diversity of TKS, a potential alternative, sustainable rubber crop^2,13,14. To reach this goal, we developed a set of genomic SSRs (gSSRs) based on our de novo sequencing of TKS and utilized them for evaluating the genetic diversity of TKS germplasm. We then carried out an array of comparative population genetics analyses, re-analyzing the recently published genic SSR (eSSR) dataset generated on the same TKS germplasm³², and an expanded cross-amplification study with the local US dandelions using those gSSRs.

Our de novo gSSRs were distributed across the TKS genome, based on the linkage disequilibrium data, as were the eSSRs³². We mapped both types of SSRs (gSSRs and eSSRs) along with the TKS markers used for the linkage map, back to the related and well-annotated H. annuus genome, based on the TKS contigs⁴. This is very likely to be helpful for the future breeding efforts. We chose not to use the TKS genome assembly⁴ or the closely related Lactuca sativa genome assembly³⁵ because both are more fragmented and have fewer scaffolds anchored to chromosome locations in comparison to the H. annuus genome. To further underscore the need for improved TKS genome resources, the gSSR Tara003 sequence could not be found in the TKS contigs published⁴. Moreover, only 15 markers out of the 65 that constructed the TKS linkage map²⁷ were mapped back together (in pairs or in threes) to six TKS scaffolds of Lin et al.⁴. Also, only one of the SSRs analyzed (gSSR Tara026) co-localized with two other map markers of Arias et al.²⁷ within a single TKS contig of Lin et al.⁴. Several studies independently reported the TKS genome size as ~1,420 Mb based on flow cytometry (1.45 pg/1C^27,31). Other studies estimated the diploid plant genome size at 2,400 Mb^21,28. Comparatively, the draft TKS genome estimates at 1,040 Mb by 19 mer, 1,140 to 1,210 Mb by flow cytometry, or the 1,290 Mb assembly (all in⁴) represent an underestimation, which signifies room for improvement in the TKS genome completeness and assembly. As H. annuus is related, but somewhat distant to TKS, we expected mis-localizations and/or ambiguities in the marker placement due to genome rearrangements and/or sequence diversity. It is noteworthy though, that many chromosome regions in the map (Fig. 1) were enriched in the markers from the same linkage groups of TKS²⁷, with the gSSRs and eSSRs placed among them. This might indicate that despite a tentative character of this placement, the markers may be physically close. 
Thus, the markers found close on the H. annuus may indeed be linked on the TKS genome, extending the linkage information to new markers. gSSRs were slightly more ambiguously placed than eSSRs (excess of the sunflower genome BLAST hits of 2.8-fold vs. 2.2-fold, respectively), which could stem from targeting the parts of genome different in character, duplications of the non-coding regions targeted by gSSRs^36,37, or differences in the genomes of TKS (2n = 16) and H. annuus (2n = 34).
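The genome size figures discussed above can be related to one another with a short calculation. The sketch assumes the widely used conversion of 1 pg of DNA to roughly 978 Mb; that factor is not stated in the text, and the variable names are ours.

```python
MB_PER_PG = 978  # assumed conversion: 1 pg of DNA is roughly 978 Mb


def pg_to_mb(picograms):
    """Convert a DNA mass in picograms to megabase pairs."""
    return picograms * MB_PER_PG


flow_cytometry_mb = pg_to_mb(1.45)  # 1C value quoted in the text
print(round(flow_cytometry_mb))     # 1418, i.e. ~1,420 Mb

# Draft-genome estimates quoted in the text, as fractions of that size.
for label, size_mb in [("19-mer estimate", 1040), ("assembly", 1290)]:
    print(label, round(size_mb / flow_cytometry_mb, 2))
```

Under this assumption the 1,290 Mb assembly covers about 91% of the flow-cytometry estimate and the 19-mer estimate about 73%, consistent with the statement that the draft genome underestimates the TKS genome size.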

Several studies addressed the TKS diversity at various levels; agronomic performance and rubber/inulin production was of primary concern due to the industrial potential of the plant^15,27. Seedling growth characteristics were also studied³⁸. The first attempt at estimating the species genetic diversity using molecular methods was focused on a wide collection of TKS materials and allowed for a genetic distinction of the Russian/Kazakh and Chinese TKS germplasms³³. A milestone in the TKS molecular diversity research was the study of the Kazakhstan-originating USDA-ARS germplasm using a set of eSSRs³² with which we compare the statistics of our gSSR dataset. Despite our sampling scheme being lower in number than in the previous eSSRs study of TKS, our study yielded very similar results and provided significant correlation of the population distances/indices. This result was possibly accrued by employing ~50% more gSSRs at lower population sampling, yet, ensured reliability of our results. This also confirmed the general observation on the TKS diversity formulated before³² that the overall low species diversity resides mainly within populations. This observation is in agreement with our research hypothesis for this outcrossing, self-incompatible dandelion species. Comparison of the HWE violations in the gSSR and eSSR datasets shows much lower occurrence in the former dataset. This could be intrinsically related to the sequences targeted by either SSR type, or variable mutational frequency of the targeted loci^39,40. This is further substantiated by the patterns of HWE violations detected. The (sub-) population violations in gSSR dataset could stem from the limited sampling, whereas locus-wise HWE violations in the eSSR dataset suggest a different underlying reason, with abundant (sub-) population TKS sampling³².

Developing eSSRs is generally achieved faster and easier than the gSSRs due to comparatively more conserved character of the transcriptome^39,40. Owing to the fact of differences in parts of the genome targeted, in their conserved character, and in cross-amplification rates, both types of SSRs provide slightly different but complementary information^39,41,42. Thus, inferences made from both types of SSRs together will provide more substantiated conclusions on the species diversity (or other studies for which they were used). Diversity of several economically important crops was analyzed using both types of SSRs, and in almost all cases led to similar results, which could also be taken as a confirmation study. For instance, deployment of both types of SSRs on the cucumber germplasm provided consistent positioning of most of the accessions analyzed on dendrograms and detected higher polymorphism rates using the gSSRs⁴³. Similarly, high similarity was found between the gSSR and eSSR dendrograms among the tomato germplasms with higher polymorphism rate for the gSSRs, albeit slightly lower polymorphic information content⁴⁴. The authors of that study postulated that combining both marker types in tomato would be effective for the species genetic diversity analyses. In contrast, studies of soybean indicated comparatively lower agreement between the gSSRs and eSSRs^45,46. Authors argued for use of the eSSRs in soybean diversity studies for direct access to the population diversity in genes of agronomic interest but concluded that the species diversity was effectively estimated by both types of SSRs⁴⁶. Analyses of the genetic diversity in wheat repeatedly indicated higher polymorphism of the gSSRs over eSSRs, but the authors of the studies argued that use of the eSSRs allowed for a more accurate delineation of the genetic relationships^47,48,49. 
Studies in other cereal species observed the highest proportion of trimeric eSSRs, especially those encoding for neutral bulky amino acids^42,50. Both studies also stated that the lower level of polymorphism detected by eSSRs compared to gSSRs might be due to the more conserved character of the targeted regions with selection acting against variation, a feature that could drive the relatively higher transferability of the eSSRs and a comparatively superior genotypic identification. Another conclusion emerged from the studies of the Prunus species. Although both types of SSRs resulted in similar dendrograms, combination of both datasets increased the genotypic discrimination⁴⁴ and indicated a higher polymorphism and more effective resolution by the gSSRs than by the eSSRs⁵¹. The emerging conclusion from those and other studies is that similar levels of genetic diversity between populations or species may be recorded by using either SSR type with eSSRs often detecting lower variation, but performing more reliably at species differentiation^52,53,54.

Cross-amplification with the TKS gSSRs proved very successful and our markers transferred to other genera of the Asteraceae (Fig. 6; Supplementary Table S1). Within the Taraxacum genus, the 14 gSSRs tested extensively in this study also cross-amplified to four independent gDNA samples of T. brevicorniculatum (²⁶; Nowicki et al. unpublished data; Fig. 6). The outgroup specimens that cross-amplified with our gSSRs for TKS belonged to distant subtribes (Taraxacum and Youngia are in the subtribe Crepidinae; Hypochaeris in the Hypochaeridinae; Krigia in the Microseridinae; Lactuca in the Lactucinae; and Pyrrhopappus in the Cichoriinae), whereas the Erigeron specimens belong to the distant tribe Astereae. This indicates a possible broad application of our gSSRs in the analyses of Asteraceae crops. The TKS eSSRs also cross-amplified with four gDNA samples of local dandelions³². Thus, our gSSRs present additional resources to the classical (GA/CT)_n gSSRs identified by restriction digest, hybridization, and Sanger sequencing⁵⁵.

Both eSSR and gSSR datasets of TKS confirmed its sexual reproduction as observed in nature^26,32,34,56. In contrast, results of the US dandelions are in agreement with the previous studies^25,54,57,58 that provided evidence of both sexual and asexual modes of reproduction present in T. officinale with a broad cross-amplification to related species. The retrieved ITS sequences remained largely indiscriminate as to the species identity of the local US dandelions, co-localizing with the T. officinale ITS sequence consensus and the historical Herbarium specimen. Yet, previous research indicated predominance of only three Taraxacum species in North America (T. ceratophorum, T. erythrospermum, and T. officinale^25,57,58,59). Including in the phylogenetic analyses the respective ITS consensus sequences of those three species, of the historical T. officinale specimen, and of T. officinale used for previous research²⁶ (and data not shown) suggested the bulk of the US local dandelions could belong to T. officinale, if the microspecies of Taraxacum are disregarded^20,60. Notably, the obligatory sexual diploid TKS was segregated with high confidence from the bulk of the US dandelions, as was the Central Asia-frequent T. brevicorniculatum.

The results of our gSSR analyses of this collection of US dandelions are in agreement with the recent ploidy analyses of the common dandelions of North America²⁵. The majority of our dataset was tri- or tetra-ploid, and it is possible that we used too few markers to capture the higher levels of ploidy of the remaining several local dandelion samples classified as diploid based on the allele counts alone. In contrast to TKS, the US T. officinale presented evidence of population structure. This is in agreement with the biology of both species, especially considering the postulated clonal reproduction of the alloploid apomictic T. officinale in North America^25,57,60. The higher frequency of sampling the outgroup specimens belonging to distant genera in the Southeastern US may be worth investigating in regard to the species range.
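The reasoning about ploidy inferred from allele counts can be illustrated with a minimal Python sketch. It is not the procedure used in this study (which relied on the R package polysat); the function name and the sample genotype are hypothetical. The number of distinct alleles at a locus gives only a lower bound on ploidy, so a polyploid specimen that happens to carry at most two distinct alleles at every scored locus would be classified as diploid.

```python
from typing import Dict, List


def min_ploidy(genotype: Dict[str, List[int]]) -> int:
    """Lower bound on ploidy: the largest number of distinct alleles at any locus."""
    return max((len(set(alleles)) for alleles in genotype.values()), default=0)


# Hypothetical specimen: allele sizes (bp) scored at three loci.
sample = {"locusA": [180, 184], "locusB": [201, 204, 207], "locusC": [150]}
print(min_ploidy(sample))  # 3: at least triploid

# With only locusA and locusC scored, the same specimen would look diploid.
subset = {k: sample[k] for k in ("locusA", "locusC")}
print(min_ploidy(subset))  # 2
```

Scoring more loci can only raise this bound, which is why a small marker set may underestimate the ploidy of some specimens.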

Species of Taraxacum are notorious for hybridization, which often results in genome rearrangements, regional gDNA duplications, and/or polyploidization^21,34,57. Cross-amplification of the TKS gSSRs (this study) and eSSRs (confirmed on four samples³²) could help invigorate the molecular and genomic analyses of the more demanding polyploid dandelions^25,55,57. Our study distinguishes the local US populations of T. officinale from TKS in several aspects. First, the higher frequency of HWE violations set the US dandelions dataset apart. Second, the higher ploidy in this dataset, inferred from the number of alleles detected, indicated the possibility of clonal/asexual reproduction, which was further supported by the Index of association (I_A). Third, several analyses indicated the presence of population structure in this dataset, contrary to the outcrossing diploid TKS. Overall, our gSSRs present a useful analytical tool for Taraxacum spp., due to cross-amplification in related species, even in distant genera.

Conclusions and Outlook

Results on the genetic diversity of TKS obtained in the course of this study may help current and future breeding efforts of this potential crop for renewable rubber. Complementary and congruent data obtained from the gSSR and eSSR studies on the same germplasm provided thorough insights into the species biology. Although a well-annotated TKS genome is still to come, the combined marker map located on the related sunflower genome may help advance future TKS studies. Furthermore, cross-amplification of our gSSRs into related species of dandelions and even other genera augments the currently available resources to analyze their biodiversity and provides a platform for their further research.

Materials and Methods

Plant materials

TKS germplasm

TKS germplasm (seeds) collected in Kazakhstan²⁴ was obtained from USDA-ARS and identified in a previous study¹⁵ (Table 1 and Supplementary Table S1). Plants were grown from seed as described earlier²⁶. Young fresh leaves of 60 individuals from 19 different populations as designated by USDA-ARS²⁴ with their mapped locations of origin³² were used for genomic DNA (gDNA) extraction. We extracted three to five independent plant specimens per population for population diversity study (Tables 1 and S1). In addition, two TKS herbarium specimens, MONT 51683 (H.E. Morris, September 11, 1942) and KE 650 (C. Hobbs, July 02, 1949) submitted to us for destructive sampling, were used for comparison with the freshly collected samples. Plant tissue was subject to gDNA isolation using the DNeasy Plant Mini Kit (Qiagen, Germantown, MD) following the manufacturer’s protocol. The gDNA of the herbarium samples was isolated using the E.Z.N.A. Plant DNA Kit (Omega Bio-Tek, Norcross, GA) according to the manufacturer’s protocol. Isolated gDNA was evaluated for integrity by electrophoresing it in 2% agarose gels stained with ethidium bromide, and purity and concentration were assessed using Nanodrop ND-1000 UV/Vis (Fisher Scientific, Pittsburgh, PA).

United States plant materials and sequencing for species identification

Leaves of wild T. officinale Weber (n = 74) accessions from the Southeastern US and of morphologically very similar plants were collected across different geographical regions (Tennessee, Georgia, Alabama, and Mississippi) and from eight distinct populations, as well as from historical herbarium specimens (Table 3 and Supplementary Table S1). Upon species identification by ITS sequencing (see below), specimens identified as not-Taraxacum spp. (n = 23) were set as a multiple outgroup. Leaf samples were collected in January and February of 2017, before the majority of the plants set bloom. No specific permissions were required for these locations/activities, as the materials are considered common weeds and regarded as neither endangered nor protected. Collected plant tissue was placed in ziplock bags containing silica gel (50 g each; Dri Splendor H&P Sales Inc., Vista, CA). gDNA was isolated from the freshly collected tissues with the DNeasy Plant Mini Kit (Qiagen, Germantown, MD) as per the manufacturer’s protocol. Samples of the historical T. officinale were provided to us by the University of Washington Herbarium (WTU, Seattle, WA, USA; n = 9) and Oregon State University (OSC, Corvallis, OR, USA; OSC 225005; Halse 7823; March 2010) for destructive sampling and analyses (Supplementary Table S1). The gDNA of those samples was isolated using the E.Z.N.A. Plant DNA Kit (Omega Bio-Tek) according to the manufacturer’s protocol. Isolated gDNA was evaluated for integrity by electrophoresing it in 2% agarose gels stained with ethidium bromide, and purity and concentration were assessed using Nanodrop ND-1000 UV/Vis (Fisher Scientific, Pittsburgh, PA).

Genotyping of the Internal Transcribed Spacer (ITS) region and sequence analyses

The genotyping of the TKS and the US dandelions collection was completed using the primers ITS1 (Fw: 5′-TCCGTAGGTGAACCTGCGG-3′) and ITS4 (Rv: 5′-TCCTCCGCTTATTGATATGC-3′)⁶¹. Each PCR of 30 µl was composed of 1 × PCR buffer, 2.5 mM MgCl₂, 0.25 mM dNTP, 10 ng gDNA, 0.5 µM of each primer, and 1 U of AmpliTaq Gold DNA Polymerase (Fisher Scientific, Waltham, MA). The optimized thermal profile included an initial denaturation at 94 °C for 2 min; 40 cycles of 95 °C for 30 s, 60 °C for 1 min, and 72 °C for 90 s; and a final extension at 72 °C for 7 min. For each PCR, 5 µl of products were electrophoresed in 2% agarose-TAE buffered gels stained with ethidium bromide to confirm the amplification, and the rest was purified with ExoSAP-IT (Thermo Fisher Scientific) according to the kit manual. Analytical sequencing was done at McLab (Molecular Cloning Laboratories, South San Francisco, CA) or University of Tennessee – Knoxville Genomics Core (UT; Knoxville, TN). Sequences were assembled using LaserGene SeqMan version 7.0.0 (DNAStar Inc., Madison, WI), manually inspected and corrected, and identified using BLAST at NCBI. The obtained sequence matrix was supplemented with published TKS ITS data³⁴ (GenBank: KF437406 and KF437407) and the ITS consensus sequences of T. ceratophorum (n = 3), T. erythrospermum (n = 12), and T. officinale (n = 53) from NCBI (Supplementary Table S2 and the references within). Sequences were then aligned using MAFFT with default settings^62,63, truncated at the low-quality ends using Mesquite version 2.1⁶⁴, and the uninformative characters removed using the Gblocks function of Seaview (version 4) with all the ‘less stringent selection’ options⁶⁵. This sequence matrix was then submitted for phylogenetic analyses using RAxML GUI version 1.5⁶⁶ for Maximum Likelihood using 100 runs, with thorough bootstrap of 10,000, bootstrap branch lengths activated, and the General Time Reversible (GTR) substitution model⁶⁷.
Multiple outgroup was set by selecting the 23 samples identified as not Taraxacum spp. (Supplementary Table S1 and Supplementary File S1) collected from the Southeastern US along with T. officinale. Phylogenetic relationships among the samples were visualized using FigTree version 1.4.3⁶⁸.

Genome sequencing and gSSR discovery

Genomic DNA from the leaf sample E55/12 (hybrid progeny of the TKS USDA germplasm²⁴; the detailed lineage is proprietary information of ESKUSA GmbH, Parkstetten, Germany; chosen owing to abundant plant growth and thus availability of fresh leaf material) was isolated with the method described by Stein et al.⁶⁹ and submitted to the UT Genomics Core for Illumina MiSeq sequencing at 275 bp, paired-end, on a v3, 600-cycle flow cell. The gDNA library was prepared using the Nextera XT kit (Illumina Inc., San Diego, CA, USA) following the manufacturer’s protocol with minor modifications that included doubled incubation times and omission of the normalization step.

Illumina sequencing adapters, low quality bases (mean quality <30), and short reads (<30 bases) were trimmed off with Skewer version 0.2.2⁷⁰. Read quality control was performed using FastQC⁷¹. De novo assembly was performed with ABySS version 1.9.0⁷² with a k-mer size of 64. Sequence filtering for low complexity repeats was completed using the utility DustMasker⁷³ on the resulting unitigs. gSSRs were identified using a Perl script developed in-house. The minimum and maximum motif frequency definitions on the gSSRs were six to 20 bp for the di- and tri-nucleotide repeats and four to 20 bp for the tetra-nucleotide repeats. A pair of primers flanking each SSR was designed using Primer3⁷⁴. For the primer design, the following parameters were selected: optimum primer size of 21 bp (in the range of 18 to 27); optimum annealing temperature of 60.0 °C (in the range of 55 to 65 °C); primer GC content in the range of 40 to 60%.
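The in-house Perl script is not part of this article, so the SSR search can only be illustrated with a hypothetical sketch. The Python example below assumes that the stated limits (six to 20 for di- and tri-nucleotide motifs, four to 20 for tetra-nucleotide motifs) refer to the number of tandem motif copies and that only perfect repeats are reported; the names `RULES`, `is_primitive`, and `find_ssrs` are illustrative.

```python
import re
from typing import List, Tuple

# (motif length, minimum copies, maximum copies); read here as copy numbers,
# which is an assumption about the thresholds given in the text.
RULES = [(2, 6, 20), (3, 6, 20), (4, 4, 20)]


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself built from a shorter repeated unit."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (0-based start, motif, copy number) for each perfect SSR."""
    seq = seq.upper()
    hits = []
    for k, lo, hi in RULES:
        # One motif copy followed by at least (lo - 1) further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, lo - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            copies = len(m.group(0)) // k
            if is_primitive(motif) and copies <= hi:
                hits.append((m.start(), motif, copies))
    return sorted(hits)


print(find_ssrs("TTC" + "AG" * 7 + "CTT"))  # [(3, 'AG', 7)]
```

Filtering out non-primitive motifs keeps homopolymer runs and, for example, `ATAT` repeats from being reported as separate di- or tetra-nucleotide loci.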

SSR and marker map

The TKS genome sequence⁴ was used in combination with the TKS linkage map information²⁷ to infer the genomic locations of the SSR markers in this study. We used the marker sequences published therein, those obtained from our de novo sequencing gSSR search, as well as the marker information and/or primer sequences of the published TKS eSSRs³² for comparison. The marker sequences were compared to the TKS genome contigs assembly of Lin et al.⁴ using gmap with default scoring settings (except for --allow-close-indels=2 and --nosplicing). For each best sequence match to the TKS genome, a ~1 kb region containing the marker (500 bp on either side) was selected. The resultant contig fragments were used to BLAST the genome of a related species, sunflower (Helianthus annuus L.; HA412-HO bronze assembly⁷⁵). Best-hit sequences were then drawn on a map according to their physical locations on the sunflower chromosomes. If multiple best-hits had the same e-value, all were retained.
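Two bookkeeping steps of this mapping procedure, selecting the flanking window and retaining tied best hits, can be sketched as follows. This is a hypothetical Python illustration and not the scripts used in this study; the `Hit` record, the function names, the 0-based half-open coordinates, and the clipping of windows at contig ends are all assumptions.

```python
from typing import Dict, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """Hypothetical record for one BLAST hit of a marker region."""
    marker: str
    chrom: str
    pos: int
    evalue: float


def flank_window(start: int, end: int, contig_len: int, flank: int = 500) -> Tuple[int, int]:
    """Extend a marker match by `flank` bp on either side, clipped to the contig."""
    return max(0, start - flank), min(contig_len, end + flank)


def best_hits(hits: List[Hit]) -> Dict[str, List[Hit]]:
    """Keep, for each marker, every hit that shares the lowest e-value."""
    by_marker: Dict[str, List[Hit]] = {}
    for h in hits:
        by_marker.setdefault(h.marker, []).append(h)
    kept = {}
    for marker, group in by_marker.items():
        lowest = min(h.evalue for h in group)
        kept[marker] = [h for h in group if h.evalue == lowest]
    return kept


# Invented example data: marker m1 has two tied best hits, so both are kept.
hits = [
    Hit("m1", "chr1", 100, 1e-50),
    Hit("m1", "chr5", 900, 1e-50),
    Hit("m1", "chr2", 5, 1e-10),
    Hit("m2", "chr3", 42, 1e-20),
]
print([h.chrom for h in best_hits(hits)["m1"]])  # ['chr1', 'chr5']
```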

SSR genotyping and analyses

PCR genotyping of the collection of TKS gDNA samples was completed using a set of 25 gSSR primers identified as described above (Tables 2 and S1) with subsequent capillary electrophoresis (QIAxcel Advanced Electrophoresis System, Qiagen). The single gDNA sample E55/12 that served for de novo sequencing was used for an initial genotyping screen with 50 primer pairs (25 di- and 25 tri-nucleotide repeats) with the PCR procedure described below. The results were visualized by capillary electrophoresis using QIAxcel (Qiagen) and analyzed using a 25 to 500 bp DNA size marker and an internal 15/600 bp alignment marker. We screened the results of genotyping with the 50 gSSRs for specificity on this gDNA sample, and the best-performing 25 gSSRs were selected for the analysis of the TKS gDNA collection (see Supplementary Table S1 for primer sequences). Cross-amplification to the US dandelions collection (T. officinale and outgroup specimens, Supplementary Table S1) was first checked on four random gDNA samples isolated from plants local to Knoxville, TN, using the 25 gSSRs that performed best on the TKS gDNA collection. The results were then screened in a fashion similar to the TKS screening procedure.

PCR reactions of 10 µl were composed of 1 × PCR buffer, 2 mM MgCl₂, 0.25 mM dNTP, 5% (v/v) DMSO, 4 ng gDNA, 1 µM of each primer, and 1 U of AmpliTaq Gold DNA Polymerase (Fisher Scientific). The experimentally optimized thermal profile included an initial denaturation at 94 °C for 3 min; 15 touch-down cycles of 95 °C for 40 s, annealing for 40 s starting at 63 °C and decreasing by 0.5 °C per cycle, and 72 °C for 30 s; 25 cycles of 95 °C for 40 s, 55 °C for 40 s, and 72 °C for 30 s; and a final extension at 72 °C for 4 min.
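The annealing temperatures implied by this touch-down profile can be listed with a short Python sketch. It assumes that the first touch-down cycle anneals at 63 °C and that the 0.5 °C decrement is applied from the second cycle onward; the function name and its parameters are illustrative.

```python
from typing import List


def annealing_schedule(
    start: float = 63.0,
    step: float = 0.5,
    touchdown_cycles: int = 15,
    plateau: float = 55.0,
    plateau_cycles: int = 25,
) -> List[float]:
    """Annealing temperature (°C) for every cycle of the touch-down profile."""
    touchdown = [start - step * i for i in range(touchdown_cycles)]
    return touchdown + [plateau] * plateau_cycles


schedule = annealing_schedule()
print(len(schedule))   # 40 cycles in total
print(schedule[0])     # 63.0 in the first cycle
print(schedule[14])    # 56.0 in the last touch-down cycle
print(schedule[15])    # 55.0 in the first plateau cycle
```

Under this reading, the touch-down phase ends at 56 °C, one step above the 55 °C used for the remaining 25 cycles.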

Analysis of population structure

A total of 62 TKS gDNA samples were genotyped using 25 gSSRs and binned using FlexiBin (an MS Excel macro⁷⁶). In addition, the published dataset of the TKS eSSR study was retrieved³² and binned to allow comparison of the datasets. Lastly, the dataset of T. officinale collected in the US (n = 74) and genotyped using 14 gSSRs was also binned, following the same procedure as the two datasets mentioned above. The binned datasets were analyzed separately for an array of population genetics parameters. To estimate the fixation and differentiation indices (F_ST and F’_ST, respectively⁷⁷), we used the packages poppr^78,79, hierfstat^80,81, and polysat^82,83 in R version 3.4.3⁸⁴. Due to the detected variation in ploidy levels in the US dandelions dataset, the data were corrected for ploidy in R version 3.4.3 using the package polysat and then recoded as tetraploid, with missing alleles where samples were actually di- or tri-ploid. The mixed ploidy of that dataset limited the scope of the indices obtained, notably the differentiation index F’_ST⁷⁷; we resorted to GenoType/GenoDive⁸⁵ to calculate the respective T. officinale dataset-wide F_ST and D’_ST indices. As per convention, the F_ST bins considered were low (F_ST < 0.05), moderate (0.05 < F_ST < 0.15), and high (F_ST > 0.15). Deviations from Hardy-Weinberg equilibrium (HWE) were calculated using the package pegas version 0.10⁸⁶ in R version 3.4.3, using the exact test based on Monte Carlo permutations of alleles (B = 1,000) and α = 0.05. The results were depicted as a probabilistic heatmap for HWE deviation in a locus- and subpopulation-wise manner. The multi-locus genotype (MLG) networks were constructed using the Bruvo distances and the minimum-spanning networks (MSN) reticulation algorithm in the package poppr in R version 3.4.3. POPTREE2⁸⁷ was used to calculate the population-wise distance matrices using either F_ST or D_ST indices (both standardized and unstandardized).
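The idea behind a Monte Carlo test for HWE deviation at a single diploid locus can be sketched in Python. This is not the pegas implementation: as a simplified stand-in, the sketch uses a chi-square statistic and reshuffles the observed alleles into random genotypes B times; the function names and the example data are hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple

Genotype = Tuple[int, int]


def chi2_hwe(genotypes: List[Genotype]) -> float:
    """Chi-square distance between observed and HWE-expected genotype counts."""
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    observed = Counter(tuple(sorted(g)) for g in genotypes)
    alleles = sorted(freqs)
    stat = 0.0
    for i, a in enumerate(alleles):
        for b in alleles[i:]:
            expected = n * (freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b])
            stat += (observed.get((a, b), 0) - expected) ** 2 / expected
    return stat


def hwe_permutation_p(genotypes: List[Genotype], B: int = 1000, seed: int = 0) -> float:
    """Share of allele permutations that deviate at least as much as the data."""
    rng = random.Random(seed)
    observed = chi2_hwe(genotypes)
    pool = [a for g in genotypes for a in g]
    extreme = 0
    for _ in range(B):
        rng.shuffle(pool)  # random re-pairing of the observed alleles
        permuted = [(pool[i], pool[i + 1]) for i in range(0, len(pool), 2)]
        if chi2_hwe(permuted) >= observed - 1e-12:
            extreme += 1
    return (extreme + 1) / (B + 1)


# Invented data: a locus with no heterozygotes versus one in exact HWE proportions.
no_hets = [(1, 1)] * 10 + [(2, 2)] * 10
balanced = [(1, 1)] * 5 + [(1, 2)] * 10 + [(2, 2)] * 5
print(hwe_permutation_p(no_hets) < 0.05)  # True: deviation flagged at alpha = 0.05
print(hwe_permutation_p(balanced))        # 1.0: no deviation
```

Running such a test for every locus within every subpopulation yields the grid of p-values that a locus-by-subpopulation heatmap would display.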
Mantel tests were performed in R version 3.4.3 using the package MASS⁸⁸. Analysis of the molecular variance (AMOVA) was performed in R version 3.4.3 using the package poppr, and the resulting Φ indices are reported as [%] values, after 1,000 permutations, at the three levels of each dataset hierarchy: within individuals (Φ_IT), within individuals between subpopulations (Φ_IS), and among subpopulations (Φ_ST). The mixed-ploidy T. officinale dataset did not lend itself to the Φ_IT calculations using AMOVA. Discriminant Analysis of Principal Components (DAPC) was performed in R version 3.4.3 using the package adegenet version 2.1.1^89,90.

Compliance with ethical standards

Research involving Human Participants and/or Animals: This article does not contain any studies with human participants or animals performed by any of the authors.

Data Availability

All data generated or analyzed during this study are included in this published article and its supplementary information files. The Skewer-trimmed MiSeq reads of TKS gDNA are available at NCBI BioSample SAMN10414186, BioProject PRJNA505305.

References

van Beilen, J. B. & Poirier, Y. Guayule and Russian dandelion as alternative sources of natural rubber. Critical Reviews in Biotechnology 27, 217–231 (2007).
Cornish, K. Alternative natural rubber crops: Why should we care? Technology & Innovation 18, 244–255 (2017).
van Beilen, J. B. & Poirier, Y. Establishment of new crops for the production of natural rubber. Trends in Biotechnology 25, 522–529 (2007).
Lin, T. et al. Genome analysis of Taraxacum kok-saghyz Rodin provides new insights into rubber biosynthesis. National Science Review 5, 78–87 (2017).
Bowers, J. E. Natural rubber-producing plants for the United States. (National Agricultural Library, Beltsville, Maryland, USA, 1990).
Venkatachalam, P., Geetha, N., Sangeetha, P. & Thulaseedharan, A. Natural rubber producing plants: An overview. African Journal of Biotechnology 12, 1297–1310 (2013).
Yeang, H. Y. Synchronous flowering of the rubber tree (Hevea brasiliensis) induced by high solar radiation intensity. New Phytologist 175, 283–289 (2007).
Gevers, N. & Kappen, F. The Apollo Vredestein Press Publications and PR. BioRubber for Europe in global perspective, EU-PEARLS Consortium Wageningen, the Netherlands 34 (2012).
Krotkov, G. A review of literature on Taraxacum koksaghyz Rod. The Botanical Review 11, 417–461 (1945).
Rogov, N. A. & Magidov, I. A. Shyny iz kok-sagyza [Automobile tires from kok-saghyz]. Kauchuk i Rezina 10, 50–53 (1939).
Arias, M., Hernández, M. & Ritter, E. How does water supply affect Taraxacum koksaghyz Rod. rubber, inulin and biomass production? Industrial Crops and Products 91, 310–314 (2016).
Arias, M., Herrero, J., Ricobaraza, M., Hernández, M. & Ritter, E. Evaluation of root biomass, rubber and inulin contents in nine Taraxacum koksaghyz Rodin populations. Industrial Crops and Products 83, 316–321 (2016).
Kreuzberger, M., Hahn, T., Zibek, S., Schiemann, J. & Thiele, K. Seasonal pattern of biomass and rubber and inulin of wild Russian dandelion (Taraxacum koksaghyz L. Rodin) under experimental field conditions. European Journal of Agronomy 80, 66–77 (2016).
Stolze, A. et al. Development of rubber‐enriched dandelion varieties by metabolic engineering of the inulin pathway. Plant Biotechnology Journal 15, 740–753 (2017).
Cornish, K. et al. Temporal diversity of Taraxacum kok-saghyz plants reveals high rubber yield phenotypes. Biodiversitas Journal of Biological Diversity 17, 847–856 (2016).
Meer, I. V. D. Rubber dandelions and nickel eating flowers. http://library.wur.nl/WebQuery/wurpubs/516507 (2017).
Rodin, L. Taxonomic description of Taraxacum kok-saghyz. Acta Institua Botanici Academiae Scientarum, Ser. L. Fase 1, 187–189 (1933).
Tropicos database. Missouri Botanical Garden. www.tropicos.org (accessed 9-24-2018).
Moussavi, A., Cici, S., Loucks, C. & Van Acker, R. Establishing field stands of Russian dandelion (Taraxacum kok-saghyz) from seed in southern Ontario, Canada. Canadian Journal of Plant Science 96, 887–894 (2016).
Kirschner, J. et al. Identification of oligoclonal agamospermous microspecies: taxonomic specialists versus microsatellites. Preslia 88, 1–7 (2016).
Kirschner, J., Štěpánek, J., Černý, T., De Heer, P. & van Dijk, P. J. Available ex situ germplasm of the potential rubber crop Taraxacum koksaghyz belongs to a poor rubber producer, T. brevicorniculatum (Compositae–Crepidinae). Genetic Resources and Crop Evolution 60, 455–471 (2013).
van Dijk, P., Kirschner, J., Štěpánek, J., Baitulin, I. O. & Černý, T. Taraxacum koksaghyz Rodin definitely is not an example of overcollecting in the past. A reply to Volis, S. et al. (2009). Journal of Applied Botany and Food Quality 83, 217–219 (2010).
Volis, S., Uteulin, K. & Mills, D. Russian dandelion (Taraxacum kok-saghyz): one more example of overcollecting in the past? Journal of Applied Botany and Food Quality 83, 60–63 (2009).
Hellier, B. C. Collecting in Central Asia and the Caucasus: US national plant germplasm system plant explorations. HortScience 46, 1438–1439 (2011).
Iaffaldano, B. J., Zhang, Y., Cardina, J. & Cornish, K. Genome size variation among common dandelion accessions informs their mode of reproduction and suggests the absence of sexual diploids in North America. Plant Systematics and Evolution, 303, 719–725 (2017).
Chandrasekera, B., Fluess, H., Zhao, Y., Trigiano, R. & Winkelmann, T. In vitro plant regeneration from ovules of Taraxacum officinale and Taraxacum koksaghyz. African Journal of Biotechnology 16, 1764–1775 (2017).
Arias, M. et al. First genetic linkage map of Taraxacum koksaghyz Rodin based on AFLP, SSR, COS and EST-SSR markers. Scientific Reports 6, 3103 (2016).
Luo, Z., Iaffaldano, B. J., Zhuang, X., Fresnedo-Ramírez, J. & Cornish, K. Analysis of the first Taraxacum kok-saghyz transcriptome reveals potential rubber yield related SNPs. Scientific Reports 7, 9939 (2017).
Whalen, M., McMahan, C. & Shintani, D. Development of crops to produce industrially useful natural rubber. In: Isoprenoid Synthesis in Plants and Microorganisms 329–345 (Springer, 2012).
Hodgson-Kratky, K. J., Stoffyn, O. M. & Wolyn, D. J. Recurrent selection for rubber yield in Russian dandelion. Journal of the American Society for Horticultural Science 142, 470–475 (2017).
Luo, Z., Iaffaldano, B. J. & Cornish, K. Colchicine-induced polyploidy has the potential to improve rubber yield in Taraxacum kok-saghyz. Industrial Crops and Products 112, 75–81 (2018).
McAssey, E. V., Gudger, E. G., Zuellig, M. P. & Burke, J. M. Population genetics of the rubber-producing Russian dandelion (Taraxacum kok-saghyz). PLoS One 11, e0146417 (2016).
Yushuang, Y. et al. Genetic diversity analysis of Taraxacum kok-saghyz Rodin germplasm by SSR markers. Chinese Agricultural Science Bulletin 32, 79–85 (2016).
Kirschner, J., Drábková, L. Z., Štěpánek, J. & Uhlemann, I. Towards a better understanding of the Taraxacum evolution (Compositae–Cichorieae) on the basis of nrDNA of sexually reproducing species. Plant Systematics and Evolution 301, 1135–1156 (2015).
Reyes-Chin-Wo, S. et al. Genome assembly with in vitro proximity ligation data and whole-genome triplication in lettuce. Nature Communications 8, 14953 (2017).
Portis, E. et al. Comprehensive characterization of simple sequence repeats in eggplant (Solanum melongena L.) genome and construction of a web resource. Frontiers in Plant Science 9, 401 (2018).
Shi, J. et al. Genome-wide microsatellite characterization and marker development in the sequenced Brassica crop species. DNA Research 21, 53–68 (2013).
Gao, Y., Xu, W. & Liu, S. Correlation analysis and genetic diversity of agronomic traits of Taraxacum kok-saghyz germplasm at seedling stage. Chinese Journal of Tropical Agriculture 36, 21–25 (2016).
Ellis, J. & Burke, J. EST-SSRs as a resource for population genetic analyses. Heredity 99, 125 (2007).
Varshney, R. K., Graner, A. & Sorrells, M. E. Genic microsatellite markers in plants: features and applications. TRENDS in Biotechnology 23, 48–55 (2005).
Gadaleta, A. et al. Comparison of genomic and EST-derived SSR markers in phylogenetic analysis of wheat. Plant Genetic Resources 9, 243–246 (2011).
Song, Y.-P. et al. Differences of EST-SSR and genomic-SSR markers in assessing genetic diversity in poplar. Forestry Studies in China 14, 1–7 (2012).
Hu, J., Wang, L. & Li, J. Comparison of genomic SSR and EST-SSR markers for estimating genetic diversity in cucumber. Biologia Plantarum 55, 577–580 (2011).
Zhou, R., Wu, Z., Jiang, F. & Liang, M. Comparison of gSSR and EST-SSR markers for analyzing genetic variability among tomato cultivars (Solanum lycopersicum L.). Genetics and Molecular Research 14, 13184–13194 (2015).
Chang, W. et al. Development of soybean EST-SSR marker and comparison with genomic-SSR marker. Chinese Journal of Oil Crop Sciences 2, 007 (2009).
Mulato, B. M., Möller, M., Zucchi, M. I., Quecini, V. & Pinheiro, J. B. Genetic diversity in soybean germplasm identified by SSR and EST-SSR markers. Pesquisa Agropecuária Brasileira 45, 276–283 (2010).
Eujayl, I., Sorrells, M., Baum, M., Wolters, P. & Powell, W. Assessment of genotypic variation among cultivated durum wheat based on EST-SSRs and genomic SSRs. Euphytica 119, 39–43 (2001).
Xinquan, Y., Peng, L., Zongfu, H., Zhongfu, N. & Qixin, S. Genetic diversity revealed by genomic-SSR and EST-SSR markers among common wheat, spelt and compactum. Progress in Natural Science 15, 24–33 (2005).
Yang, X.-Q. et al. Comparative analysis of genetic diversity revealed by genomic-SSR, EST-SSR and pedigree in wheat (Triticum aestivum L.). Acta Genet Sin 32, 406–416 (2005).
Varshney, R. K., Thiel, T., Stein, N., Langridge, P. & Graner, A. In silico analysis on frequency and distribution of microsatellites in ESTs of some cereal species. Cellular and Molecular Biology Letters 7, 537–546 (2002).
Rahemi, A. et al. Genetic diversity of some wild almonds and related Prunus species revealed by SSR and EST-SSR molecular markers. Plant Systematics and Evolution 298, 173–192 (2012).
Chen, H. et al. Assessment of genetic diversity and population structure of mung bean (Vigna radiata) germplasm using EST-based and genomic SSR markers. Gene 566, 175–183 (2015).
Lind, J. F. & Gailing, O. Genetic structure of Quercus rubra L. and Quercus ellipsoidalis EJ Hill populations at gene-based EST-SSR and nuclear SSR markers. Tree Genetics & Genomes 9, 707–722 (2013).
Yadong, Z., Chan, P., Zhenfang, L., Yanling, Y. & Xingyi, H. Genetic diversity of genomic-SSR and EST-SSR markers in interspecies of poplar. Journal of Northeast Forestry University 12, 003 (2011).
Falque, M., Keurentjes, J., Bakx-Schotman, J. & Van Dijk, P. Development and characterization of microsatellite markers in the sexual-apomictic complex Taraxacum officinale (dandelion). Theoretical and Applied Genetics 97, 283–292 (1998).
Warmke, H. E. Macrosporogenesis, fertilization, and early embryology of Taraxacum kok-saghyz. Bulletin of the Torrey Botanical Club, 164–173 (1943).
King, L. M. Origins of genotypic variation in North American dandelions inferred from ribosomal DNA and chloroplast DNA restriction enzyme analysis. Evolution 47, 136–151 (1993).
Vasut, R. J., Van Dijk, P. J., Falque, M., Travnicek, B. & de Jong, J. Development and characterization of nine new microsatellite markers in Taraxacum (Asteraceae). Molecular Ecology Resources 4, 645–648 (2004).
Brouillet, L. Taraxacum F. H. Wiggers. Flora of North America North of Mexico 19, 239–252 (2006).
Zeisek, V. Taxonomic principles, reproductive systems, population genetics and relationships within selected groups of genus Taraxacum (Asteraceae). PhD Dissertation. Faculty of Science, Charles University, Prague, Czech Republic (2018).
White, T. J., Bruns, T., Lee, S. & Taylor, J. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. PCR protocols: a guide to methods and applications 18, 315–322 (1990).
Katoh, K., Rozewicki, J. & Yamada, K. D. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Briefings in Bioinformatics, bbx108 (2017).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution 30, 772–780 (2013).
Maddison, W. Mesquite, a modular system for evolutionary analysis, version 2.6 (software). http://mesquiteproject.org (2009).
Gouy, M., Guindon, S. & Gascuel, O. SeaView version 4: a multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Molecular Biology and Evolution 27, 221–224 (2010).
Stamatakis, A., Hoover, P. & Rougemont, J. A rapid bootstrap algorithm for the RAxML web servers. Systematic biology 57, 758–771 (2008).
Tavaré, S. Some probabilistic and statistical problems in the analysis of DNA sequences. Lectures on Mathematics in the Life Sciences 17, 57–86 (1986).
Rambaut, A. FigTree version 1.4.3, a graphical viewer of phylogenetic trees, http://tree.bio.ed.ac.uk/software/figtree/ (2017).
Stein, N., Herren, G. & Keller, B. A new DNA extraction method for high‐throughput marker analysis in a large‐genome species such as Triticum aestivum. Plant Breeding 120, 354–356 (2001).
Jiang, H., Lei, R., Ding, S.-W. & Zhu, S. Skewer: a fast and accurate adapter trimmer for next-generation sequencing paired-end reads. BMC Bioinformatics 15, 182 (2014).
Andrews, S. FastQC: a quality control tool for high throughput sequence data. Babraham Institute, Cambridge, UK, http://www.bioinformatics.babraham.ac.uk/projects/fastqc (2014).
Simpson, J. T. et al. ABySS: a parallel assembler for short read sequence data. Genome Research 19, 1117–1123 (2009).
Morgulis, A., Gertz, E. M., Schäffer, A. A. & Agarwala, R. A fast and symmetric DUST implementation to mask low-complexity DNA sequences. Journal of Computational Biology 13, 1028–1040 (2006).
Koressaar, T. & Remm, M. Enhancements and modifications of primer design program Primer3. Bioinformatics 23, 1289–1291 (2007).
Priyam, A. et al. Sequenceserver: a modern graphical user interface for custom BLAST databases. Biorxiv, 033142 (2015).
Amos, W. et al. Automated binning of microsatellite alleles: problems and solutions. Molecular Ecology Resources 7, 10–14 (2007).
Bird, C. E., Karl, S. A., Smouse, P. E. & Toonen, R. J. Detecting and measuring genetic differentiation. Phylogeography and Population Genetics in Crustacea 19, l–55 (2011).
Kamvar, Z. N., Brooks, J. C. & Grünwald, N. J. Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality. Frontiers in Genetics 6, 208 (2015).
Kamvar, Z. N., Tabima, J. F. & Grünwald, N. J. Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. PeerJ 2, e281 (2014).
Goudet, J. Hierfstat, a package for R to compute and test hierarchical F‐statistics. Molecular Ecology Resources 5, 184–186 (2005).
Goudet, J., Raymond, M., de Meeüs, T. & Rousset, F. Testing differentiation in diploid populations. Genetics 144, 1933–1940 (1996).
Clark, L. V. & Jasieniuk, M. POLYSAT: an R package for polyploid microsatellite analysis. Molecular Ecology Resources 11, 562–566 (2011).
Clark, L. V. & Schreier, A. D. Resolving microsatellite genotype ambiguity in populations of allopolyploid and diploidized autopolyploid organisms using negative correlations between allelic variables. Molecular Ecology Resources 17, 1090–1103 (2017).
The R Core Team. In R Foundation for Statistical Computing (2014).
Meirmans, P. G. & Van Tienderen, P. H. GENOTYPE and GENODIVE: two programs for the analysis of genetic diversity of asexual organisms. Molecular Ecology Notes 4, 792–794 (2004).
Paradis, E. pegas: an R package for population genetics with an integrated–modular approach. Bioinformatics 26, 419–420 (2010).
Takezaki, N., Nei, M. & Tamura, K. POPTREE2: Software for constructing population trees from allele frequency data and computing other population statistics with Windows interface. Molecular Biology and Evolution 27, 747–752 (2009).
Venables, W. & Ripley, B. Modern Applied Statistics with S. 4th Edition (New York, NY, USA: Springer Science + Business Media, 2002).
Jombart, T. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24, 1403–1405 (2008).
Jombart, T. et al. Package ‘adegenet’, https://github.com/thibautjombart/adegenet (2018).
Nei, M. Analysis of gene diversity in subdivided populations. Proceedings of the National Academy of Sciences 70, 3321–3323 (1973).
Nei, M. Genetic distance between populations. The American Naturalist 106, 283–292 (1972).
Prevosti, A. La distancia genética entre poblaciones. Miscellanea Alcobe 68, 109–118 (1974).
Desper, R. & Gascuel, O. Theoretical foundation of the balanced minimum evolution method of phylogenetic inference and its relationship to weighted least-squares tree fitting. Molecular Biology and Evolution 21, 587–598 (2004).

Acknowledgements

This research was funded by the United States Department of Agriculture - Agricultural Research Service grant (NACA 58-6062-6). The authors thank the following herbaria for samples for destructive testing used in this study: Burke Museum of Natural History and Culture (University of Washington; WTU, Seattle, WA), The Tom S. and Miwako K. Cooperrider Herbarium (Kent State University Herbarium; KE, Kent, OH), Montana State University Herbarium (MONT, Bozeman, MT), and Oregon State University Herbarium (OSC, Corvallis, OR). The authors also thank ESKUSA GmbH for providing plant materials used for genome sequencing. Use of trade names is for identification purposes only and does not imply their endorsement by the Authors or the study funding entities.

Author information

Miriam Payá-Milans
Present address: Centro de Biotecnología y Genómica de Plantas, UPM-INIA, 28223, Madrid, Spain

Authors and Affiliations

Department of Entomology and Plant Pathology, The University of Tennessee, Knoxville, TN, USA
Marcin Nowicki, Sarah L. Boggess, Miriam Payá-Milans, Margaret E. Staton, Logan C. Houston, Denita Hadziabdic & Robert N. Trigiano
Guizhou Key Laboratory of Agro-Bioengineering, Guizhou University, Huaxi, Guiyang, P. R. China
Yichen Zhao
Julius Kühn Institute for Breeding Research on Agricultural Crops, Sanitz OT Groß Lüsewitz, Germany
Helge Fluess

Contributions

Project concept, supervision, and securing budget: R.N.T. Laboratory and research resources, supervision: S.L.B. Plant materials: H.F., M.N., D.H.G., R.N.T. Data acquisition: Y.Z., M.N., L.C.H. Transcriptome bio-informatics: M.P.M., M.E.S. Data analysis: M.N., D.H.G., S.L.B. Primary writing: M.N., S.L.B., R.N.T.

Corresponding author

Correspondence to Marcin Nowicki.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Table S1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Nowicki, M., Zhao, Y., Boggess, S.L. et al. Taraxacum kok-saghyz (rubber dandelion) genomic microsatellite loci reveal modest genetic diversity and cross-amplify broadly to related species. Sci Rep 9, 1915 (2019). https://doi.org/10.1038/s41598-019-38532-8

Received: 28 September 2018
Accepted: 19 December 2018
Published: 13 February 2019
DOI: https://doi.org/10.1038/s41598-019-38532-8
