Genetic diversity in Sickleweed (Falcaria vulgaris) and using stepwise regression to identify markers associated with traits

Sickleweed is one of the best-known medicinal plants in the genus Falcaria. Falcaria species exhibit a high degree of genetic variability, and the considerable potential for hybridization and introgression among them complicates the study of their genetic diversity. Combining morphological traits with molecular markers may be a valuable approach for evaluating and exploiting germplasm, given the current obstacles to breeding this medicinal herb. In 2021, fifteen Sickleweed populations were cultivated in pots under field conditions in a randomized complete block design with three replications, to assess genetic diversity and conduct marker-trait association analyses using morpho-physiological characteristics and SSR markers. The Sickleweed populations displayed considerable genetic diversity across all traits. Cluster analysis of the traits, using the UPGMA method on the Gower distance matrix, classified the populations into three distinct clusters. Across all genotypes, 52 polymorphic bands were detected, with an average of 8.67 bands per primer. The average expected heterozygosity across all loci was 0.864, and the average PIC was 0.855. Analysis of the molecular data with the Jaccard similarity index and the UPGMA method divided the Sickleweed populations into two major groups. Furthermore, the analysis of molecular variance indicated that variation within populations exceeded that between populations. Stepwise regression identified thirty-two SSR fragments that were significantly associated with genomic regions controlling the studied traits. Selection based on molecular markers offers a rapid method for breeding programs, with the genetic information obtained from these markers playing a crucial role. 
Therefore, alongside traits, selecting superior genotypes and populations of high value in breeding programs becomes feasible. The findings highlight that certain markers are linked to multiple traits, emphasizing the critical importance of this characteristic in plant breeding for the simultaneous improvement of numerous traits. The study’s insights regarding markers hold potential for application in Sickleweed breeding programs.


Materials and methods
This study was conducted in compliance with the relevant institutional, national, and international guidelines and legislation of Iran. No specific permits were required for the collection of plant materials. The seeds of 15 different populations of Sickleweed were collected from seven Iranian provinces, namely Ardebil, Kurdistan, Kermanshah, Gilan, Hamedan, Qazvin, and Qom (Table 1). The pollination behavior of Sickleweed (Falcaria vulgaris) is an important factor to consider in studies assessing genetic diversity. Sickleweed is primarily a crosspollinated species, relying on the transfer of pollen between individual plants for successful fertilization. The specific pollination behavior, including the involvement of different pollinators and the extent of self-pollination, can have significant implications for the composition and diversity of Sickleweed populations. It is important to consider pollination behavior when collecting samples for genetic diversity studies. To ensure representative sampling and capture the genetic diversity within populations, it is crucial to employ a strategy that accounts for potential differences in pollination patterns among individuals and populations. In our study, we followed a systematic approach for sample collection, taking into consideration the pollination behavior of Sickleweed. We employed a random sampling strategy, ensuring that individuals were selected from different locations within each population. By collecting samples from multiple individuals within a population, we aimed to capture the genetic variation present within and among populations, including potential variations resulting from pollination patterns. Furthermore, we acknowledge the importance of providing detailed information on the sample collection strategy, including the number of plants sampled per population, the collection locations, and any additional considerations specific to pollination behavior. 
We will revise the manuscript to include these details and emphasize the importance of incorporating the pollination behavior in the sampling strategy to obtain a comprehensive assessment of genetic diversity in Sickleweed populations. The species identity was confirmed by Dr. Rahimi, and voucher specimens were deposited in the Graduate University of Advanced Technology Herbarium under the numbers 151 to 165. These specimens are available for botanical studies upon official request. The 15 different populations of Sickleweed were cultivated in pots under field conditions in 2021 at the Graduate University of Advanced Technology in Kerman, Iran. The plants were grown in pots located in the field and field conditions. The growth media used in this study consisted of a well-balanced mixture of organic and inorganic components to provide optimal nutrient availability and support healthy plant growth. The growth media composition consisted of a combination of sterile soil, peat moss, and perlite in a ratio of 3:1:1. This mixture was selected based on previous studies and recommendations for the cultivation of Sickleweed to provide a suitable substrate for plant growth. The sterile soil provided essential minerals and nutrients, while the peat moss and perlite contributed to the media's moisture retention and aeration properties. To ensure the consistency of growth conditions, the growth media were thoroughly mixed and sterilized before planting the Sickleweed seeds. The pots were filled with the growth media, and the seeds were sown at the recommended depth. The experimental design employed was a randomized complete block design with three replications. In our study, we maintained a single Sickleweed plant per pot to minimize competition and facilitate accurate data collection. This allowed us to focus on the individual performance of each plant and avoid potential confounding effects. 
To represent the genetic diversity within Sickleweed populations, we used multiple pots for each population. The number of samples for each pot in each replicate was 10, and their average was used for the data of each replicate. To ensure the reliability of our results, we incorporated 3 replications in our experimental design. The number of replications varied depending on the specific analysis and statistical requirements of each trait. We employed a sufficient number of replications to minimize the impact of random variability and enhance the statistical power of our findings. The specific details regarding the number of plants per pot, the number of pots per population, and the number of replications were carefully considered during the planning and execution of our study. These design considerations aimed to ensure the validity and robustness of our results and provide meaningful insights into the genetic diversity and marker-trait associations in Sickleweed. During the vegetative stage, various morphological traits such as plant height, fresh and dry weight, leaf number, length, and width were measured. After harvesting, the leaves of the plants were collected, wrapped in foil, and transported to the laboratory for DNA extraction and assessment of nutrient content. The nutrient content analysis focused on elements such as zinc, manganese, potassium, iron, sodium, magnesium, and calcium ions. These measurements were carried out using flame and atomic absorption methods. A Varian Spectra AA 220 with a GTA-110 graphite tube atomizer, manufactured in Australia, was used to measure dissolved ions 41. Before sample measurement, the standard solution of each ion was injected into the device, and the corresponding standard curve was generated using the device's software (Spectra AA). The unknown concentrations of the solutions were determined using the software 42.
To extract nutrient content from the plant tissue, the following procedure was followed. A dry plant tissue sample weighing 0.5 g was placed in 10 ml of concentrated nitric acid (65%) and allowed to dissolve in the acid for 24 h. After this period, the sample was heated to release acid vapors. Distilled water was then added to bring the solution volume up to 50 ml. The solution was filtered through filter paper to remove any particulate matter and obtain a clear solution suitable for analysis in an atomic absorption device. In addition to the atomic absorption analysis, the nitrogen percentage in the plant tissue was measured and calculated using the Kjeldahl method. This involved a process of digestion, distillation, and titration using a Kjeldahl apparatus.
Descriptive statistics, including the mean, range, and standard deviation, were computed from the average data of the 15 populations. Additionally, the phenotypic coefficient of variation (PCV) was calculated for the studied traits. The PCV is a measure of variation relative to the mean and is usually expressed as a percentage. For a given trait it is calculated as
$$\mathrm{PCV}\,(\%) = \frac{\sqrt{\sigma^2_p}}{\bar{x}} \times 100$$
where $\sigma^2_p$ is the phenotypic variance of the trait and $\bar{x}$ is its mean. The PCV indicates the relative magnitude of variation for a specific trait compared to its mean value, and it helps assess the extent of phenotypic diversity within the populations under study.
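As a rough illustration of the PCV calculation described above, the following minimal sketch computes the coefficient from a list of population means. It assumes the phenotypic standard deviation is estimated as the sample standard deviation of those means; the source does not state which estimator was used, and the function name `pcv` and the example values are hypothetical.

```python
from statistics import mean, stdev


def pcv(values):
    """Phenotypic coefficient of variation in percent: 100 * SD / mean.

    Assumption: the phenotypic standard deviation is taken as the sample
    standard deviation of the supplied values (e.g. population means).
    """
    m = mean(values)
    if m == 0:
        raise ValueError("PCV is undefined when the trait mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical leaf-width means (cm) for three populations, not study data.
leaf_width = [1.0, 2.0, 3.0]
print(f"PCV = {pcv(leaf_width):.2f}%")
```

Because the PCV is scaled by the mean, traits measured in different units (for example leaf width in cm and zinc content in mg/kg) can be compared directly.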
The analysis of variance (ANOVA) was conducted using the randomized complete block design, and the expected value of the mean square was used to determine the variance components. To analyze the zinc and manganese features, which had small measured data sizes, the data were initially multiplied by 100 before undergoing the analysis of variance. The software used for performing the analysis of variance was SAS 9.4 43 . In the laboratory of the Graduate University of Advanced Technology in Kerman, Iran, DNA extraction was carried out from young leaf samples of Sickleweed ecotypes using the Dellaporta method 44 with some modifications. After DNA extraction, the quality of the samples was assessed through 2% agarose gel electrophoresis, while the quantity of DNA was determined using spectrophotometry. The polymerase chain reaction (PCR) was conducted using six microsatellite primers that had been previously investigated for their suitability in Sickleweed 2 .
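The variance-component step of the ANOVA mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' SAS procedure: it assumes a balanced randomized complete block design with one value per population per block, and it uses the standard expected mean squares E(MS_population) = σ²_e + r·σ²_g and E(MS_error) = σ²_e. The function name and the data are hypothetical.

```python
def rcbd_variance_components(data):
    """Estimate variance components from a balanced RCBD.

    data[i][j] is the value of population i in block j.
    Returns (var_between_populations, var_error).
    """
    g = len(data)        # number of populations
    r = len(data[0])     # number of blocks (replications)
    grand = sum(sum(row) for row in data) / (g * r)
    pop_means = [sum(row) / r for row in data]
    block_means = [sum(data[i][j] for i in range(g)) / g for j in range(r)]

    # Sums of squares for the RCBD partition: total = population + block + error
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_pop = r * sum((m - grand) ** 2 for m in pop_means)
    ss_block = g * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_pop - ss_block

    ms_pop = ss_pop / (g - 1)
    ms_error = ss_error / ((g - 1) * (r - 1))

    # Solve the expected mean squares for the two components;
    # a negative estimate is truncated to zero (a common convention).
    var_error = ms_error
    var_pop = max((ms_pop - ms_error) / r, 0.0)
    return var_pop, var_error


# Hypothetical trait values: 3 populations (rows) x 2 blocks (columns).
example = [[10.0, 13.0], [20.0, 22.0], [30.0, 31.0]]
var_pop, var_error = rcbd_variance_components(example)
share = 100.0 * var_pop / (var_pop + var_error)
print(f"between populations: {share:.1f}% of total variance")
```

Expressing each component as a percentage of their sum gives between- and within-population shares of the kind reported in the variance-component table.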
For the SSR markers, the polymerase chain reaction (PCR) was carried out in a volume of 10 µl. The reaction mixture included 2 µl of template DNA (50 ng), 0.6 µl of dNTP (10 mM), 0.3 µl of MgCl2 (50 mM), 0.12 µl of Taq polymerase enzyme (5U), 1 µl of PCR buffer (10×), 0.4 µl of the forward primer (60 ng), 0.4 µl of the reverse primer (60 ng), and 5 µl of sterile deionized water. The PCR thermal profile for the microsatellite markers consisted of an initial denaturation step at 94 °C for 4 min, followed by 35 cycles of denaturation at 94 °C for 30 s, annealing for 45 s at the temperature indicated by the primer's melting temperature (TM) in Table 4, and extension at 72 °C for 2 min. The cycles were followed by a final extension step at 72 °C for 5 min, after which the reaction mixture was cooled to 4 °C. The PCR products were detected using a Bio-Rad Sequi-Gen vertical electrophoresis machine and a 6% polyacrylamide gel. The staining method described by An et al. 45 was employed with some modifications (Table 4).
The banding patterns obtained from the PCR analysis were scored as either absence (zero) or presence (one) of a band. These scores were organized in a matrix format with populations as rows and bands as columns. This matrix was used to calculate various marker indexes and to group the populations accordingly.
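A minimal sketch of how such a presence/absence matrix can be turned into pairwise Jaccard similarities, the index later used for the molecular clustering. The population labels and band scores below are hypothetical, and returning 0.0 for two profiles with no bands at all is an assumed convention.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity of two 0/1 band profiles of equal length.

    Shared absences (0, 0) are ignored: similarity = shared bands divided
    by bands present in at least one of the two profiles.
    """
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of bands")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0  # assumed convention for empty profiles


def similarity_matrix(profiles):
    """Pairwise Jaccard similarities for a dict of label -> band profile."""
    labels = list(profiles)
    return {
        (p, q): jaccard_similarity(profiles[p], profiles[q])
        for p in labels
        for q in labels
    }


# Hypothetical scores: rows are populations, columns are bands.
bands = {
    "pop_A": [1, 1, 0, 0],
    "pop_B": [1, 0, 1, 0],
    "pop_C": [1, 1, 0, 1],
}
sim = similarity_matrix(bands)
print(sim[("pop_A", "pop_B")], sim[("pop_A", "pop_C")])
```

The corresponding Jaccard distance is one minus the similarity, which is the form a hierarchical clustering routine would normally take as input.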
Marker indexes were calculated using specific formulas and a custom Excel program developed by the first author. These indexes included the number of amplified bands, the number of polymorphic bands, the polymorphic percentage, Polymorphism Information Content (PIC) 46 , Expected Heterozygosity 47 , Marker Index, Effective Multiplex Ratio and Mean Heterozygosity 48 , and Marker Detection Power 49 . Furthermore, the number of effective alleles, Shannon's index 50 , and Nei's genetic diversity 51 were determined using the POPGEN software version 1.3.1 52 . Cluster analysis was performed using different methods and criteria based on either the studied traits or SSR markers to group the ecotypes. The cluster method and criterion that yielded the highest Cophenetic correlation coefficient were selected, and the grouping was performed using the chosen method with the R software. The determination of the number of groups was based on the maximum distance method of merging two groups. The analysis of molecular variance (AMOVA) was conducted using the GenAlex software package version 6.5 53 . This analysis partitioned the total molecular variance among and within all populations based on the number of groups obtained from the cluster analysis. To examine the relationship between the molecular data (independent variables) and the quantitative data (dependent variables), stepwise regression analysis was performed. The stepwise regression utilized a step-by-step method and was conducted using the PAST software 54 .
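Two of the marker indexes listed above, expected heterozygosity and PIC, can be illustrated with a short sketch. The source cites references for these indexes without giving the formulas, so this assumes the commonly used forms H = 1 − Σp_i² and PIC = 1 − Σp_i² − Σ_{i<j} 2p_i²p_j², where p_i is the frequency of allele i at a locus. It is not the authors' Excel program, and the allele frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: H = 1 - sum(p_i^2)."""
    _check(freqs)
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content (assumed standard form):
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    _check(freqs)
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - cross


def _check(freqs):
    """Allele frequencies at a locus must be non-negative and sum to 1."""
    if any(p < 0 for p in freqs) or abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must be >= 0 and sum to 1")


# Hypothetical locus with four equally frequent alleles.
locus = [0.25, 0.25, 0.25, 0.25]
print(expected_heterozygosity(locus), pic(locus))
```

Both indexes grow with the number of alleles and with the evenness of their frequencies, and PIC is never larger than H for the same locus.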
Ethics approval and consent to participate. This study complied with relevant institutional, national, and international guidelines and legislation of Iran, and no specific permits were required to collect the plant materials. The species identity was confirmed by Dr. Rahimi, and voucher specimens were deposited in the Graduate University of Advanced Technology Herbarium (no. 151 to 165), where they are available for botanical studies upon official request.

Results
Descriptive statistics, such as the minimum, maximum, and range of the studied traits, are presented in Table 2. The analysis of the Phenotypic Coefficient of Variation (PCV) for these traits reveals a favorable level of variation, indicating their potential utility in improving the studied populations of Sickleweed (Table 2). Notably, Potassium exhibits the lowest PCV value of 5.75, while Leaf width shows the highest PCV of 38.50, followed by Zinc with a PCV of 34.30. The PCV values of the remaining traits fall between these extremes (Table 2). The analysis of variance results for the studied traits, including leaf characteristics and elements, indicated a significant difference (p < 0.01) among the Sickleweed populations (Table 3). This significant difference highlights the presence of notable diversity among these populations. Table 4 presents the variance components; the share of variation between populations ranged from 28.5 to 99.5%. These findings indicate a substantial amount of variation between the populations, which can be utilized in the selection of superior populations. Additionally, variation within the populations was observed for certain traits, ranging from 0.05 to 71.46%. This internal variation offers the opportunity to select the best individuals within each population, particularly for traits such as Plant height and Fresh weight, which exhibit high levels of diversity.
The UPGMA method with Gower distance yielded the highest Cophenetic correlation coefficient value (0.756) among all the methods using different distance criteria. Consequently, cluster analysis was performed using this method, resulting in the division of the Sickleweed populations into three distinct groups (Fig. 1).
Table 3. The analysis of variance for morphological and element traits in Sickleweed populations based on a randomized complete block design. ns, * and **: non-significant and significant at the 5% and 1% probability levels, respectively.
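The Gower distance used for the trait-based clustering can be sketched for purely quantitative traits as follows. This is an illustration under stated assumptions, not the R routine used in the study: each trait difference is scaled by that trait's range across populations, the scaled differences are averaged over traits, and traits with zero range are dropped (an assumed convention). The trait values are hypothetical.

```python
def gower_distance_matrix(rows):
    """Gower distances between rows of quantitative trait values.

    rows[i][k] is trait k of population i. For each pair the distance is
    the mean over traits of |x_ik - x_jk| / range_k, so it lies in [0, 1].
    """
    n = len(rows)
    p = len(rows[0])
    ranges = [
        max(r[k] for r in rows) - min(r[k] for r in rows) for k in range(p)
    ]
    used = [k for k in range(p) if ranges[k] > 0]  # skip constant traits
    if not used:
        raise ValueError("all traits are constant; distance is undefined")

    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum(
                abs(rows[i][k] - rows[j][k]) / ranges[k] for k in used
            ) / len(used)
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical data: plant height (cm) and leaf width (cm) for 3 populations.
traits = [[10.0, 1.0], [20.0, 2.0], [30.0, 3.0]]
for row in gower_distance_matrix(traits):
    print([round(d, 3) for d in row])
```

Range scaling keeps traits measured on very different scales, such as plant height and ion content, from dominating the distance, which is one reason Gower distance suits mixed morpho-physiological data.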
In this study, six SSR primers were used, resulting in a total of 55 bands. Among these bands, 52 were polymorphic, with an average of 8.67 polymorphic bands per primer (Table 5). The GSSR24 primer exhibited the highest number of bands, with 13 bands, while the ESSR80 primer had the lowest number, with four bands (Table 5). The percentage of polymorphism observed in the Sickleweed populations varied from 81.82 to 100% across the different primers, with an average percentage of polymorphism of 94.59% (Table 5). The polymorphic information content (PIC) ranged from 0.705 to 0.926, with an average PIC of 0.855 (Table 5). The expected heterozygosity (H) for the SSR markers ranged from 0.729 to 0.928, with an average expected heterozygosity of 0.864 (Table 5). The marker index (MI) varied among the SSR markers, with the ESSR80 primer exhibiting the highest MI (0.136) and the GSSR24 primer showing the lowest (0.020) (Table 5). The average heterozygosity (Havp) ranged from 0.0042 to 0.0678 for the SSR markers. The ESSR80 and BSSR53 primers had the highest average heterozygosity, indicating their high efficiency in detecting polymorphism (Table 5). Assessing genetic diversity among cultivars and populations often involves evaluating the genetic diversity index. Nei's gene diversity index ranged from 0.370 to 0.458 among the SSR primers, with an average of 0.414 in the studied population (Table 5). The ESSR80 and GSSR25 primers exhibited the highest genetic diversity. The mean Shannon's coefficient for the SSR markers was 0.601, indicating an average level of diversity within the investigated populations. The ESSR80 and GSSR25 primers had the highest values of Shannon's index, suggesting that these primers captured a greater extent of genetic diversity within the population (Table 5). The number of effective alleles ranged from 1.632 to 1.856, with an average of 1.734 in the studied population (Table 5).

The evaluation of different methods of cluster analysis using various similarity criteria revealed that the UPGMA method with the Jaccard similarity index demonstrated the highest Cophenetic correlation coefficient value (0.62). Figure 2 illustrates the clustering results obtained using this method. The Sickleweed populations were divided into three distinct groups based on the molecular data, which aligned closely with the phenotypic data. The populations of Kur-Seylatan and Ard-Shaban exhibited the lowest genetic distance (0.0625), indicating a high level of genetic similarity. On the other hand, the populations of Ham-Chaharduli and Ard-Shaban displayed the highest genetic distance (0.52). As depicted in Fig. 2, the studied populations were categorized into two groups. The first group comprised Qom-Khaljastan, Kur-Amirabad, Kur-Gerger-e Sofla, Ard-Shaban, Qaz-Ilan, and Gilan-Deylaman. The second group consisted of Kur-Panjeh Ali, Ker-Kivananat, Ker-Sahneh, Kur-Bolbanabad, Kur-Nemat abad auliya, Ker-Bavaleh, Ham-Chaharduli, Kur-Qaleh Gah, and Kur-Seylatan populations. The clustering results highlight the similarities and differences between the populations within each group. The observed genetic distances or similarities may be attributed to differing genetic compositions or to environmental factors influencing the traits studied.
To understand the differentiation of subgroups, an analysis of molecular variance was carried out assuming three groups. The results showed that 18% of the total molecular variation was explained by differences among the populations, while 82% was attributable to variation within populations (Table 6).
In this study, a total of 49 markers (alleles) were found to have a significant relationship with the studied traits and were included in the regression model (Table 7). Some of these markers were found to have an impact on multiple traits, resulting in a final selection of 32 markers that effectively explained the phenotypic variations of these traits. These markers can be valuable in identifying superior genotypes based on the studied traits. However, it should be noted that other markers did not show a significant effect on the model. The number of identified markers ranged from one for Leaf number and Leaf width traits to nine for Potassium. These markers exhibited either positive or negative correlations with the studied traits. Notably, no marker showed a significant relationship with the Fresh weight trait. The proportion of phenotypic variation (R²) explained by each marker for the studied traits is presented in Table 7, with values ranging from 31 to 100% for the different traits. For instance, the marker GSSR24-3 was associated with the Leaf number trait and explained 46% of its phenotypic variation. Similarly, the marker GSSR24-13 showed a significant relationship with Leaf width, explaining 31% of its phenotypic variation. Furthermore, two markers, GSSR24-12 and ESSR80-4, were correlated with the Leaf length trait and collectively accounted for 50% of its variation. By calculating the standardized β values, it was determined that GSSR24-12 had greater importance and exerted a reducing effect on the trait (Table 7).
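The marker-trait regression reported above can be illustrated with a simplified forward stepwise selection. This sketch is not the procedure implemented in PAST: it assumes ordinary least squares with an intercept, adds at each step the marker that raises R² the most, and stops when the gain falls below a threshold (`min_gain`, a hypothetical parameter standing in for a significance test). Marker names and values are invented for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear markers)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(columns, y):
    """R^2 of an OLS fit of y on the given predictor columns plus intercept."""
    n = len(y)
    y_bar = sum(y) / n
    ss_tot = sum((v - y_bar) ** 2 for v in y)
    if ss_tot == 0:
        raise ValueError("trait has no variation")
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_res / ss_tot


def forward_select(markers, y, min_gain=0.05):
    """Greedy forward selection of markers by gain in R^2."""
    selected, best_r2 = [], 0.0
    remaining = list(markers)
    while remaining:
        scores = {}
        for name in remaining:
            cols = [markers[m] for m in selected + [name]]
            try:
                scores[name] = r_squared(cols, y)
            except ValueError:
                continue  # skip markers that make the model singular
        if not scores:
            break
        best = max(scores, key=scores.get)
        if scores[best] - best_r2 < min_gain:
            break
        selected.append(best)
        best_r2 = scores[best]
        remaining.remove(best)
    return selected, best_r2


# Hypothetical band scores for six populations and a trait built from m1 and m2.
markers = {
    "m1": [1, 1, 1, 0, 0, 0],
    "m2": [1, 0, 1, 0, 1, 0],
    "m3": [1, 1, 0, 0, 1, 0],
}
trait = [17.0, 15.0, 17.0, 10.0, 12.0, 10.0]
print(forward_select(markers, trait))
```

A full stepwise procedure would also test whether previously entered markers should be removed and would use F-tests or p-values as entry criteria; with only 15 populations, R² values near 100% should be read with caution because a handful of binary markers can fit few data points almost exactly.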

Discussion
The high phenotypic coefficient of variation (PCV) observed for Leaf width and Zinc traits indicates that these traits have a greater potential for improvement and modification through selective breeding programs. This suggests that there is a higher chance of selecting superior populations among the studied Sickleweed populations for these traits. Conversely, the Potassium trait exhibits the lowest PCV, implying that improving this trait through selection in the studied population may be less successful compared to other traits. The observed high diversity in traits among the Sickleweed populations can be attributed to environmental conditions as well as the genetic variations among populations. The phenotypic evaluations and dispersion indices demonstrate that the studied populations exhibit significant diversity across various traits. This diversity can be valuable for association analysis and breeding programs. Furthermore, the results of variance analysis highlight the existence of inherent genetic diversity among the studied populations across all traits. This emphasizes the possibility of identifying superior populations or genotypes based on desired traits. Similar findings have been reported by researchers studying various medicinal plants, highlighting the potential to utilize phenotypic diversity in populations or genotypes for achieving superior populations or genotypes and managing breeding programs based on them. These traits have proven effective in identifying and characterizing diversity within populations [55][56][57][58][59] .
The variance component analysis revealed that, except for the Plant height trait, the variation between populations was greater than within populations for all other traits. This difference in variation can be attributed to the diverse environmental and genetic conditions among the populations. Morphological traits are known to be influenced by the growth environment, and the existing variations in environmental and growth conditions contribute to the observed diversity in these traits. It is important to note that morphological traits, being polygenic, may not accurately reflect genetic changes at the genomic level. Therefore, grouping populations based on these traits may yield different results and changes under different environmental conditions. This emphasizes the need for careful consideration when using morphological traits alone for grouping or selection purposes 55,56.
Table 7. Stepwise regression analysis of studied traits (dependent variable) and SSR markers (independent variables) in Sickleweed. * and **: Significant at 5% and 1% probability levels, respectively.
To maximize heterosis, it is beneficial to cross genotypes or cultivars that have significant genetic dissimilarity 60. Phenotypic traits can be used as indicators to select parents with a substantial genetic distance between genotypes for such crosses. Multivariate analysis techniques, such as cluster analysis and biplot grouping with principal component analysis, can be employed to achieve this objective. These techniques allow for the grouping of genotypes based on agricultural, biochemical, and physiological traits, thereby facilitating the selection of diverse parents for crossbreeding programs and enhancing genetic diversity 61. By utilizing these multivariate analysis techniques, distant groups of genotypes can be identified based on their traits, enabling their use as parents in crossbreeding programs to increase genetic diversity and potentially improve desirable traits in the offspring 62. The GSSR24 primer exhibited the highest PIC value in this study, indicating its efficacy in differentiating the studied populations of Sickleweed. These findings are consistent with a previous study by Piya et al. 29, which also reported GSSR24 as having the highest PIC value in Sickleweed populations. Similar studies on other plants have shown high average PIC values for SSR markers, highlighting their effectiveness in assessing genetic diversity 23-27. The GSSR24 primer also demonstrated the highest value of H, indicating its efficiency in distinguishing the studied populations. High values of genetic diversity reflect the marker's ability to differentiate genotypes from each other. 
The observed heterozygosity at certain loci may be attributed to gene introgression, microsatellite motif replication during the breeding season, or the evolutionary history of Sickleweed. The extent of genetic diversity is influenced by factors such as the type of molecular markers, characteristics of the SSR repeat unit, the number of SSR markers used, and the genetic relationships within the Sickleweed germplasm 24,25 . The effective polymorphism ratio, which represents the number of polymorphic loci in germplasm, varied between 2 for the ESSR80 marker and 4.8 for GSSR24. The marker's detection power (D), which reflects its ability to distinguish between two individuals, ranged from 0.781 for ESSR80 to 0.995 for GSSR24. These values indicate that the GSSR24 primer has a higher capacity for distinguishing between individuals. The variations in the number of alleles identified across different studies can be attributed to variations in the origin and characteristics of the studied genotypes, as well as differences in the markers and PCR conditions employed in each study 23,26,27 . PIC is an important criterion for comparing different markers in terms of their discriminatory power. Higher PIC values indicate greater polymorphism and the presence of rare alleles or alleles that significantly contribute to distinguishing individuals. Markers with high PIC values are particularly useful for distinguishing closely related genotypes [35][36][37] .
The primary objective of cluster analysis is to assess the degree of relatedness or genetic distance among populations. This approach enables researchers to reduce the time and effort required for random crossbreeding by strategically selecting distant populations from different clusters. By crossing populations that exhibit significant genetic distance, the chances of obtaining desired hybrids or achieving maximum segregation in subsequent generations, such as F 1 , can be enhanced 63 . Genetic diversity in plant species is influenced by various factors, including geographical distribution, population size, and breeding system 64,65 . It is essential to understand the genetic diversity of a plant species to plan and implement effective conservation strategies, irrespective of its geographical range 65 . A species' extensive geographic distribution does not necessarily ensure the preservation of its genetic diversity 64 . Therefore, conservation efforts should consider the genetic characteristics of the species. In many cases, the results obtained from molecular markers do not align perfectly with phenotypic traits. This discrepancy can be attributed to the polygenic control of agronomic traits and their susceptibility to environmental influences 33,36,37,66 . Morphological markers, which are based on agronomic data, may not accurately reflect the genetic differentiation of individuals based on their geographic environment. However, in this study, the agreement between cluster analysis using molecular markers and agronomic traits suggests that the chosen SSR markers provided sufficient coverage throughout the genome. Increasing the number of SSR markers could potentially yield even better separation within the studied germplasm, necessitating the design and utilization of additional SSR markers. Furthermore, since SSR markers are designed from non-coding regions of the genome, they may not target coding genes directly responsible for morphological traits. 
Therefore, the use of EST (Expressed Sequence Tag) markers, which are designed based on coding regions, is recommended for studying morphological traits and capturing a more comprehensive understanding of the genetic basis underlying these traits.
The results of the analysis of molecular variance (AMOVA) indicate that the variation within the sub-populations is greater than the variation between the sub-populations. This finding suggests that there is a wide range of allelic diversity within each population. Considering that the populations within each cluster are derived from different geographical locations, this result is consistent and highlights the significant diversity among the 15 populations studied 35,37 . It is important to note that increasing germplasm diversity leads to the presence of rare alleles within the population. While this can enhance the identification of potential associations, it also introduces challenges such as the increased likelihood of false relationships and a reduction in the statistical power of association mapping 67 . To ensure reliable results in association mapping studies, it is crucial to evaluate populations with high genetic and phenotypic diversity 68 . Despite the advantages of association mapping, it is important to acknowledge that the presence of population structure can potentially lead to false associations between markers and traits 69 .
The utilization of regression analysis to identify markers associated with traits can serve as an initial step for future studies focusing on QTL identification. The markers identified in this research, compared to other markers, may have a higher likelihood of being located in the coding regions of the studied traits. This is because they were included in the regression model and demonstrated a stronger association with the observed changes in those traits. As in this study, other researchers have employed regression analysis to establish the relationship between markers and traits in various plant species, ultimately facilitating plant breeding efforts 70-73. Breeding programs can use genotypes lacking or possessing specific alleles identified by these markers to increase or decrease the expression of the target traits, aligning with the breeder's objectives. It is noteworthy that some markers were associated with multiple traits, indicating a close connection between these traits or potential pleiotropic control. This highlights the interdependence and shared genetic regulation among these traits, as observed in other studies 70,71. The use of molecular markers related to important morphological traits in plant production and breeding, particularly through marker-assisted selection (MAS), allows for the identification of key genes and the introduction of candidate markers for further investigation in populations. While challenges exist, such as the scarcity of divergent populations for mapping and limitations in time and the correlation between morphological traits and molecular markers, regression analysis helps overcome these limitations and offers a promising approach for identifying markers associated with morphological traits. In this study, it was observed that the population structure of Sickleweed was characterized by the presence of seven subgroups. 
However, these subgroups were not completely separated, indicating a significant degree of mixing and suggesting a mixed descent for the studied genotypes. This implies that individuals may inherit portions of their genome from different subgroups within the population. Additionally, the similarity in allele frequencies across different populations could be attributed to migration or shared descent 37,72 . The identification of markers associated with traits, as well as the relationship between certain markers and multiple traits, opens the door for further research in specific genomic regions. Validating these associated markers and converting them into specific SCAR markers or specific DNA sequences for targeted breeding is a practical and applicable solution. These markers can be utilized for important traits identified in this study 36,37 . The use of association analysis to identify markers linked to morphological traits in Sickleweed populations was documented for the first time in this study. The findings, along with previous research, suggest that highly associated and reliable markers for specific traits can be identified and utilized in future studies. However, it is important to employ larger and more diverse populations and incorporate a greater number of markers for more comprehensive investigations. These identified markers should be further examined in segregated populations and larger populations to confirm their correlation with specific traits. Ultimately, these markers can significantly enhance the effectiveness of breeding programs.

Conclusion
The findings of this study highlight the high genetic diversity present in the examined Sickleweed population. The populations, originating from different regions and having distinct genetic backgrounds, exhibit variations in traits, indicating the presence of genetic factors influencing population differences alongside environmental effects. The use of SSR markers proves effective in studying the genetic diversity of the Sickleweed population, thanks to their stable positions in the genome. Selection based on molecular markers offers a rapid approach in breeding programs, with the genetic information obtained from these markers playing a crucial role. Thus, in addition to considering phenotypic traits, selecting superior genotypes and populations with high breeding value can be accomplished using molecular markers. It should be noted that the grouping of Sickleweed populations based on molecular data did not align closely with the grouping based on agricultural traits or the Bayesian method. However, the populations of Dillman-Gilan and Ilan-Qom exhibited the highest values for most nutritional elements traits. Considering the significance of these traits in human nutrition, these populations can be selected as superior populations and recommended for cultivation as vegetable sources for human consumption. Through stepwise regression analysis, 32 loci associated with the studied traits were identified. The results demonstrate that some markers are linked to multiple traits, underscoring their critical importance in plant breeding for simultaneously improving multiple traits.

Data availability
The data used to support the findings of this study are included in the article.