Introduction

Bread wheat (Triticum aestivum L.) is the most important cereal for humans and has an undeniable role in food security. Therefore, increasing grain yield and yield stability has been prioritized by breeding programs to maintain wheat productivity. However, this goal is challenged by the genotype-by-environment interaction (GEI) because a polygenic attribute like grain yield is controlled by numerous major and minor effect genes that interact with each other and the environment1,2. Thus, genotypes usually show a wide range of responses when they are evaluated in a multi-environment trial (MET) before being introduced, leading to changes in their performance rankings and thus more confusion for breeders.

Yield stability may be obtained through a combination of agronomic traits3. Even though wheat grain yield stability has been specifically studied so far, in recent years, some studies have correctly focused on the GEI pattern in yield components4,5,6,7. In cereals, grain yield can be affected by yield components directly or indirectly. However, indirect improvement of yield stability might not be possible through agronomic traits8. This is due to the complex nature of performance stability controlled by genetic factors9 and can be interpreted using genotypic and environmental covariables10,11.

Stability statistics that are relatively simple to calculate have long been the most important methods for assessing the stability of genotypes. Such statistics usually have a clear interpretation and can cover various aspects of stability, including static and dynamic types. Stability in the static concept refers to the constant performance of the genotype in different environments. In contrast, stability in the dynamic concept refers to performance that is consistent with the estimated or predicted level of each environment. These concepts are equivalent to biological and agronomic stability, respectively12,13. However, obtaining a genotype that maintains its yield in all environments is almost impossible, and such a concept of stability is not appropriate for production as genotypes are expected to perform well under favorable environmental conditions. On the other hand, the response of a dynamically stable genotype to different environments is parallel to the average response of all studied genotypes8. Therefore, the use of dynamic stability during breeding programs leads to increased resilience to climate change in new varieties14. So far, about 50 different stability statistics, both parametric and non-parametric, have been used. Woyann et al.15 stated that Wricke’s16 ecovalence (Wi) and additive main effects and multiplicative interaction (AMMI) statistics, namely AMMI stability value (ASV)17 and Modified AMMI stability index (MASI)18, emphasize stability, while the Finlay and Wilkinson19 regression coefficient (bi) measures adaptability. In other words, Wi measures dynamic stability20, while low values of bi are estimates of static stability8. The method of harmonic mean of the relative performance of the genetic values (HMRPGV) provides estimates of adaptability and genotypic stability based on mixed models21.
Several statistics have simultaneously examined performance and stability, including the yield stability index (YSI)22 and weighted average of absolute scores from the singular value decomposition of the matrix of BLUP for the GEI effects generated by an LMM and response variable (WAASBY) index23. In one of the latest statistics, while emphasizing different traits in MET analysis, the multi-trait stability index was introduced24.
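As a rough illustration of how two of these statistics separate the static and dynamic concepts, the following Python sketch computes Wricke's ecovalence (Wi) and the Finlay-Wilkinson regression coefficient (bi) from a complete genotype-by-environment table of means. The function name and the toy data are assumptions made for illustration and are not taken from the study, which used R packages for these calculations.

```python
def ecovalence_and_slope(y):
    """y[i][j] is the mean of genotype i in environment j (no missing cells).

    Returns two lists: Wricke's ecovalence (Wi) and the Finlay-Wilkinson
    regression coefficient (bi) for each genotype.
    """
    g, e = len(y), len(y[0])
    grand = sum(map(sum, y)) / (g * e)
    gen_mean = [sum(row) / e for row in y]
    env_mean = [sum(y[i][j] for i in range(g)) / g for j in range(e)]
    # Environmental index: deviation of each environment mean from the grand mean.
    env_index = [m - grand for m in env_mean]
    ss_index = sum(v * v for v in env_index)

    wi, bi = [], []
    for i in range(g):
        # Wi: sum of squared interaction residuals of genotype i.
        wi.append(sum((y[i][j] - gen_mean[i] - env_mean[j] + grand) ** 2
                      for j in range(e)))
        # bi: slope of genotype i's performance on the environmental index.
        bi.append(sum((y[i][j] - gen_mean[i]) * env_index[j]
                      for j in range(e)) / ss_index)
    return wi, bi


# Hypothetical toy table: 3 genotypes x 3 environments.
toy = [[2, 4, 6],   # responds more strongly than average
       [3, 4, 5],   # follows the average response
       [4, 4, 4]]   # constant in every environment
wi, bi = ecovalence_and_slope(toy)
```

In this toy table the second genotype has Wi = 0 and bi = 1, which corresponds to dynamic stability, whereas the third genotype has bi = 0, which corresponds to static stability even though its Wi is not zero.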

Different chromosomal regions are involved in wheat adaptation25. Determining molecular markers associated with quantitative traits and indices of trait stability and adaptability can help identify regions of the genome that control GEI26. Furthermore, identifying genomic regions that affect stability can facilitate the selection process27. In addition, understanding the interaction of QTL-by-environment is also important because most related QTLs are not stable across environments, and the repeatability of marker-trait associations (MTA) is widely disturbed by the GEI28. MTAs have been identified for the stability index on chromosomes 4B and 7B29. In a genetic architecture study, the grain yield stability of wheat and other traits using the GWAS approach identified several SNPs on different chromosomes that affected their mean traits and stability9. In addition, the role of functional markers, including photoperiod genes, in performance stability has been revealed14. The combination of GWAS and genomic prediction suggested that dissecting the genetic basis of yield stability would be more complex than the one in grain yield29. Other similar studies identified stability-related QTLs in the barley20,26,30, soybean27, and rice31. However, there are a few studies on the dissection of GEI using genome-wide association studies in wheat. The present study investigated the stability of Iranian bread wheat in terms of different traits and diversity indices of SNP markers. Then, to understand the genetic basis of GEI, we used association analysis for stability indices and examined the ontology of the identified genes.

Results

Genotype-by-environment interaction

The effects of genotype, environment, and GEI were significant at different probability levels for the four traits in the total population and subpopulations (Table 1). Due to drought stress in the study and different rainfall patterns in different years (Fig. 1), such a result was not unexpected. Broad sense heritability was low for GY, moderate for SW and GN, but high for PH. To select the desired genotypes in terms of mean traits and stability, we used different statistics, and the results are presented in Fig. 2. Based on these two criteria, genotypes were divided into approximately four classes: (I) high mean and stable, (II) high mean and unstable, (III) low mean and stable, and (IV) low mean and unstable. In terms of GY, 54, 29, 70, and 115 genotypes were present in these classes, respectively. Cultivars included 37.5%, 9.1%, 36.4%, and 17%, and landraces included 11.7%, 11.7%, 21.1% and 55.5% of the members of these classes, respectively (Fig. 2A). In terms of GN, in class I 43 genotypes (38.6% of cultivars and 5% of landraces), in class II 113 genotypes (42% of cultivars and 42.2% of landraces), in class III 99 genotypes (14.8% of cultivars and 47.8% of landraces), and in class IV 13 genotypes (4.5% of cultivars and 5% of landraces) were present (Fig. 2B). According to SW, 135 genotypes (79.5% of cultivars and 36.1% of landraces), 13 genotypes (3.4% of cultivars and 5.5% of landraces), 45 genotypes (6.8% of cultivars and 21.7% of landraces), and 66 genotypes (10.2% of cultivars and 36.7% of landraces) were observed in the mentioned four classes, respectively (Fig. 2C). For plant height, there were 85 genotypes (19.3% of cultivars and 37.8% of landraces) in class I, 40 genotypes (10.2% of cultivars and 17.2% of landraces) in class II, 97 genotypes (62.5% of cultivars and 23.3% of landraces) in class III, and 46 genotypes (8% of cultivars and 21.7% of landraces) in class IV (Fig. 2D).
On the other hand, as expected, some indices, especially HMRPGV and WAASBY, were correlated with the mean of the traits and were in the same group (Fig. 2). Since it is difficult to select genotypes with simultaneous stability for all four traits, we calculated the multi-trait stability index based on yield and yield components (Supplementary Fig. 1). Interestingly, 11 cultivars (12.5%) and 29 landraces (16.1%) were among the genotypes selected based on this index (Supplementary Table 3).
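The four classes can be pictured with a small Python sketch. The text does not state the cut-offs behind the grouping, which was read from the heatmaps in Fig. 2, so the two thresholds below (`mean_cut` and `stability_cut`) and the function name are purely hypothetical stand-ins. The stability score is assumed to be one where lower values mean more stable, as with Wi.

```python
def stability_class(trait_mean, stability_score, mean_cut, stability_cut):
    """Assign one of the four classes used in the text.

    Assumptions (not from the study): a genotype counts as 'high mean' when
    its mean is at or above mean_cut, and as 'stable' when its stability
    score is at or below stability_cut (lower score = more stable).
    """
    high = trait_mean >= mean_cut
    stable = stability_score <= stability_cut
    if high and stable:
        return "I"    # high mean and stable
    if high:
        return "II"   # high mean and unstable
    if stable:
        return "III"  # low mean and stable
    return "IV"       # low mean and unstable


# Hypothetical genotypes: (mean, stability score), with cut-offs 4.0 and 2.0.
classes = [stability_class(m, s, mean_cut=4.0, stability_cut=2.0)
           for m, s in [(5.0, 1.0), (5.0, 3.0), (3.0, 1.0), (3.0, 3.0)]]
```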

Table 1 Mean, standard deviation (SD), broad sense heritability (H2), and combined analysis of variance based on studied traits in 268 Iranian wheat landraces and cultivars and 6 environments.
Figure 1

Average rainfall in different months each year.

Figure 2

Heatmap based on stability indicators for grain yield (A), number of grains (B), spike weight (C), and plant height (D) in Iranian wheat landraces and cultivars. I: high mean and stable, II: high mean and unstable, III: low mean and stable and IV: low mean and unstable. Mean: average trait in all environments, ASV: AMMI stability value, bi: Finlay-Wilkinson regression, Wi: Wricke’s ecovalence, MASI: Modified AMMI stability index, YSI: Yield stability index, HMRPGV: harmonic mean of the relative performance of the genetic values, WAASBY: weighted average of absolute scores from the singular value decomposition of the matrix of BLUP for the GEI effects generated by an LMM and response variable. The heatmaps were created using the "heatmap.2" function of the "gplots" package in R32.

Genetic data and population structure

Based on the results, the distribution of SNP markers showed that genome B alone accounted for 50% of the total markers, while genome D had the lowest number of SNP markers by far. In the A, B, and D genomes, chromosomes 7A, 3B, and 2D, respectively, had the highest number of SNPs in cultivars, landraces, and the sum of the two (Table 2). The density (SNP/Mbp) was similar, with the B genome having the highest density of SNPs, especially for chromosomes 6B and 3B. This is more conveniently illustrated in Fig. 3. The average minor allele frequency (MAF) and gene diversity (GD) in cultivars were slightly higher than the landraces. The amount of heterozygosity (HET) of the landraces in each of the chromosomes and consequently the genomes were higher than the cultivars. The polymorphism information content (PIC) in cultivars ranged from 0.240 (4D) to 0.309 (2A) and in landraces from 0.232 (2D) to 0.292 (4A). The mean PIC in cultivars, landraces, and the sum of these two was equal to 0.280, 0.267, and 0.270, respectively (Table 2). On the other hand, the total number of SNP pairs (TNSP) and the number of significant SNP pairs (NSSP) were higher in the B genome (especially on chromosomes 3B, 2B, and 6B) and lower in the D genome (especially on chromosomes 4D, 5D, and 3D). The percentage of NSSP in cultivars ranged from 25.11% (4D) to 58.26% (4A) and in landraces ranged from 26.16% (4B) to 53.27% (4A). The r2 values of cultivars were higher than landraces, especially in B and D genomes. Such a difference in distance (cM) can also be seen in the D genome (Table 3). The results of genetic population structure analysis indicated the existence of two subpopulations (Fig. 4A). The highest value of ΔK was observed at K = 2 (Fig. 4B), and its average log-likelihood value confirmed it (Fig. 4C). One of these subpopulations consisted mainly of cultivars, and the other contained landraces.
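For readers unfamiliar with these diversity indices, the sketch below computes MAF, GD, PIC, and HET for a single bi-allelic SNP using the standard textbook formulas. Whether the study used exactly these formulas is an assumption, and the genotype coding (0, 1, 2 copies of the alternate allele, `None` for missing) and the function name are illustrative choices.

```python
def snp_diversity(calls):
    """Diversity indices for one bi-allelic SNP.

    calls: genotype calls coded as 0, 1, or 2 copies of the alternate
    allele; None marks a missing call and is ignored.
    """
    observed = [c for c in calls if c is not None]
    n = len(observed)
    p = sum(observed) / (2 * n)      # alternate-allele frequency
    q = 1 - p
    gd = 1 - (p * p + q * q)         # gene diversity (expected heterozygosity)
    pic = gd - 2 * p * p * q * q     # polymorphism information content
    het = sum(1 for c in observed if c == 1) / n  # observed heterozygosity
    return {"MAF": min(p, q), "GD": gd, "PIC": pic, "HET": het}


# Hypothetical SNP scored in five accessions, one of them missing.
example = snp_diversity([0, 0, 1, 2, None])
```

With these formulas, PIC for a bi-allelic marker cannot exceed 0.375, the value reached when both alleles have a frequency of 0.5.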

Table 2 Distribution of SNP markers and indices of genetic diversity by chromosomes.
Figure 3

Density plot by different chromosomes in total Iranian bread wheat cultivars and landraces.

Table 3 A summary of observed LD (r2) among SNP pairs and the number of significant SNP pairs per chromosomes and genomes of Iranian bread wheat cultivars and landraces.
Figure 4

Barplot (A), the average log-likelihood value (B), and delta K for different numbers of sub-populations (C), in the analysis of population structure using 43,446 SNP markers.

MTAs for mean traits and stability indices

An overview and detailed information on the MTA results are provided in Supplementary Tables 4 and 5. A total of 846, 653, and 1023 significant MTAs were identified for the studied traits and stability indices of cultivars, landraces, and total genotypes, respectively (Fig. 5). Circular Manhattan plots show the common regions associated with different traits (Fig. 6). Ten and 12 markers were related to the mean grain yield of cultivars and landraces, respectively, mainly located in genome A. This number was higher, at 55 markers, for all genotypes. ASV and MASI statistics had the highest number of MTAs in the B genome, while Wi had the highest number in the D genome (especially chromosome 7D) and the B genome; in landraces, most markers were identified on the B and A genomes. There were 22 and 14 significant associations for HMRPGV in cultivars and landraces, respectively. Chromosomes 4A and 2A for cultivars and 3D for landraces were important. WAASBY was significantly associated with 24 and 21 SNPs in cultivars and landraces, respectively. These markers were mainly distributed on chromosomes 6B, 2B, and 2D. Although bi for cultivars and landraces had the lowest number of MTAs in the D genome, this genome (especially chromosome 6D) contained the highest number of MTAs considering the total genotypes. Finally, among all the indices, YSI in the cultivars was associated with the highest number of SNPs in the B genome (Fig. 5A).

Figure 5

GWAS results for stability indicators of grain yield (A), number of grains (B), spike weight (C), and plant height (D) in Iranian wheat landraces and cultivars.

Figure 6

Circular Manhattan plots showing common regions associated with grain yield (A), number of grains (B), spike weight (C), and plant height (D) in Iranian wheat landraces and cultivars. Inner to outer circles represent the average trait and the stability indices ASV, bi, HMRPGV, MASI, WAASBY, Wi, and YSI, respectively. The chromosomes are plotted at the outermost circle, where thin dotted blue and red lines indicate the significance level at p value < 0.001 (− log10 (p) > 3) and < 0.00001 (− log10 (p) > 5), respectively. Green and red dots indicate genome-wide significantly associated SNPs at the p value < 0.001 and < 0.00001 probability levels, respectively. The scale between ChrUn and Chr1A indicates − log10 (p) values. Colored boxes outside on the top right side indicate SNP density across the genome, where green to red indicates less dense to dense.

For GN, 14 MTAs for the mean and 209 MTAs for the stability parameters were identified in the cultivars, compared to 24 and 171 MTAs for the landraces, respectively. Like GY, more MTAs were identified based on all genotypes. Chromosomes 6B and 2B in cultivars contained the highest markers associated with ASV and MASI, while the SNPs identified for these two indices were low in landraces and scattered on different chromosomes. Genome B, especially chromosome 2B, had the highest QTLs associated with Wi. In total, 14 and 30 MTAs were determined for HMRPGV in cultivars and landraces, respectively, with chromosomes 1A, 4A, and 5A having the highest SNPs in the landraces. The highest number of SNPs associated with WAASBY in cultivars and landraces were located in B and A genome, respectively. The highest number of bi and YSI-related SNPs belonged to the B genome, and chromosomes 6B and 7A in landraces were important for the YSI index (Fig. 5B).

For mean SW, 31, 19, and 57 MTAs were identified in cultivars, landraces, and total genotypes, respectively, mainly located on chromosomes 6A, 3B, 4B, 6B, and 7D. The B genome, followed by the A genome, had a large number of HMRPGV-related SNPs. For WAASBY in the cultivars, 16 and 3 MTAs were identified in the B and A genomes, respectively. Although no significant MTAs were observed in the D genome of cultivars, this genome, like the A genome, contained SNPs related to the WAASBY index in the landraces. In terms of ASV and MASI indices, 6.4 and 2.6 times more MTAs were detected in the cultivars compared to landraces, respectively. More MTAs were observed on chromosomes 6B and 6D in cultivars and 1B, 3D, 6A, 6B, and 7B in landraces for the Wi index. Most bi-related SNPs were located on chromosomes 1A, 6B, and 7A in the cultivars and on 1D, 3B, 6B, and 7A in the landraces. Finally, for SW, as for GY, the YSI index had the highest number of MTAs within a single genome (Fig. 5C).

Among the traits, the lowest number of MTAs was observed for PH and its stability indices. Thirteen markers on chromosomes 1D, 2B, 3B, 5B, 7A, 7B, and 7D in cultivars and seven markers on chromosomes 1A, 1D, 2A, 5B, and 7B in landraces were associated with the mean trait. The markers identified for ASV and MASI were the same in the cultivars and slightly different in the landraces. Such similarity was also observed when considering all genotypes, with chromosomes 3D and 7A having a larger number of SNPs. Although genome B had the lowest number of MTAs for Wi in cultivars, it showed the highest number of associations in landraces. Chromosomes 3A, 6D, and 7A in cultivars and chromosomes 4A and 6B in landraces were important for this index. For HMRPGV in cultivars, seven markers were identified on chromosomes 6D, 7A, 3D, and 5B. In landraces, 12 markers were identified, distributed on chromosomes 7B, 5D, 5B, 1D, 1A, 6A, and 2B. According to WAASBY, 14 SNPs were identified in cultivars on different chromosomes, including 1A, 1B, 2B, 3B, 3D, 4B, 4D, 5B, 6A, and 7B. In the landraces, 10 SNPs were identified, more than half of which were located on chromosome 1D. The bi, in cultivars on chromosomes 6B and in the landraces on chromosomes 3D and 3D, had the highest number of MTAs. Finally, the number of MTAs detected for YSI in cultivars was three times higher than in landraces (Fig. 5D).

Among the identified markers, 171, 131, and 224 cases in cultivars, landraces, and the sum of these two, respectively, overlapped between different traits and indices (Supplementary Table 5). For example, the marker rs65138 in cultivars and rs51479 in total genotypes were associated with the mean of three traits (GY, GN, and SW) and some of their stability indices and were located on chromosomes 1B and 3B, respectively. One such marker in the landraces was rs58587, which was located on chromosome 7B and was associated only with the stability indices of GY, SW, and PH. Other SNPs with multiple pleiotropic effects were located on chromosomes 6B, 2A, 2B, 4D, 3B, and 4A in the cultivars. In the landraces, such SNPs were located on chromosomes 1A, 2A, 4A, 2D, 6A, 3D, 1B, and 7D. Considering the total genotypes, we found that the SNPs associated with most of the traits and indices were on chromosomes 4B, 4A, 2A, 2B, 7D, 2B, 6A, and 5D (Supplementary Table 5).

Gene ontology

For a closer look, we studied the ontology of highly significant markers (P < 0.0001). Except for PH, some of the identified MTAs were involved in important biological and molecular processes for all traits. These genes were distributed on different chromosomes, including 1A, 1B, 1D, 2D, 3A, 4A, 4B, 6A, 6B, and 7A, with chromosomes 4B, 1B, and 7A having the highest number (Table 4). Genes with MTAs mainly encoded proteins involved in biological and molecular processes associated with adaptation, including drought stress tolerance. Oxidoreductase activity, DNA-binding transcription factor activity, ATPase-coupled transmembrane transporter activity, protein kinase activity, protein binding, and integral component of the membrane were some of the molecular processes. The biological processes included the oxidation–reduction process, regulation of transcription, jasmonic acid biosynthetic process, transmembrane transport, protein phosphorylation, fatty acid biosynthetic process, and DNA repair. The KEGG orthology system was also used to accurately annotate the identified SNPs. The results showed that genes were involved in various pathways such as biosynthesis of secondary metabolites, carotenoid biosynthesis, fatty acid elongation, and ubiquinone and other terpenoid-quinone biosynthesis (Table 5).

Table 4 Description and annotation of identified markers (P < 0.0001).
Table 5 KEGG orthology-based annotation system for significant SNP sequences.

Discussion

The high significance of GEI for the studied traits was expected in this study, which accords with the previous reports9. Similar to this study, different monthly rainfall in MET studies, which also has drought stress, is one of the main reasons for GEI10,33. For some traits, the effect of GEI in cultivars was less than in landraces. This result is due to breeding programs and the small number of samples in the cultivars compared to the landraces, leading to fewer effects of GEI. Severe GEI caused low heritability of traits, especially in GY. In general, heritability and repeatability for complex traits such as GY are low compared to PH9,28,34. High yield and stability of wheat cultivars were expected since new wheat genotypes tolerate adverse environmental conditions such as drought stress35,36. On the other hand, breeding programs have improved wheat adaptation throughout a century25 and continued to provide adapted wheat germplasm37. The genotypes in the fourth group in each of the traits, which included unstable low-yield genotypes, mainly consisted of landraces. Also, landraces had the highest percentage of genotypes selected by the multi-trait stability index. Most likely, the lack of specific selection for high yield and priority of yield stability during wheat domestication has led to such a result38. However, severe genetic heterogeneity in the Iranian wheat landraces and the application of some early breeding processes39 have put landraces in desirable groups in terms of yield and stability. In this context, additional assessments with a large number of locations are needed to fully explain GEI patterns.

The concepts of static and dynamic stability can be clearly distinguished based on bi and Wi indices in GY. The genotypes of group I, i.e., most cultivars, had dynamic stability. In contrast, although unstable in terms of dynamic concept, the fourth group, including the landraces, had static stability due to the low values for bi. The static concept is associated with low GY8. The genotypes of the second group, which included a small number of cultivars and landraces, were unstable in terms of both concepts despite their high yield and had good adaptability according to WAASBY and HMRPGV indices. We found a distinction between the concepts of stability and adaptability in other traits, especially PH. However, the literature paid scant attention to such a distinction between cultivars and landraces in terms of stability.

We found that the studied SNPs covered the wheat genome well. The number of SNPs based on the new wheat reference genome was higher in the B genome and lower in the D genome. There also seemed to be a direct relationship between marker density and chromosome size, and such a frequency of SNPs results from the evolutionary process of wheat. This conclusion was reported by Alipour et al.39 in Chinese Spring and W7984 reference genomes. Other similar results were confirmed by Mourad et al.40 and Edae et al.41. The difference of r2 in cultivars, landraces, and different chromosomes, in addition to the evolutionary process, indicates the effect of breeding programs42. In this regard, comparing landraces and cultivars of wheat in China and Pakistan showed that the distances of LD decays in the landraces were less than cultivars. On the other hand, LD decays in genome A was slower than that of B43. Given that the landraces are genetically heterogeneous and are collected from areas with different climates, we expected that their heterozygosity would be high. Environmental factors affect genetic diversity and the structure pattern of plant populations44. Therefore, the high level of gene diversity in the studied population can be attributed to the geographical diversity of collection sites, differences in growth habit, etc. These factors led us to observe two subpopulations that separated cultivars well from the landraces. Moreover, the breeding programs and improved accessions are the reasons for such a separation. Iranian wheat genotypes have been categorized into two subpopulations in the previous studies40. The mean PIC value for all genotypes was 0.27, which is a good value for the bi-allelic marker39,45, and given their good distribution throughout the genome, they can be used to understand the genetic basis of GEI control.

Genome-wide association studies capture the genetic loci linked to significant variation for traits of interest in a vast collection of wild relative populations, breeding cultivars, and landraces46,47. It is also an important tool for selecting high-yield genotypes in a group of environments33. In the current study, genomic regions controlling GY, GN, SW, and PH traits and stability indices based on these traits were identified on all 21 chromosomes, as well as among markers that were not mapped to any chromosome. The number of MTAs identified in all genotypes was higher for GY and its indices in the D genome compared to the B genome, and for PH, it was higher than in the A and B genomes. In a study, GWAS for GY was run in each environment due to the presence of GEI, and the results showed that the D genome had the highest number of SNPs33. This suggests that the role of the D genome in wheat adaptability should be further addressed48. The greatest number of significant MTAs was identified on chromosome 6B in both the cultivar and landrace datasets, while the lowest numbers were detected on chromosomes 5D and 4D in the cultivar and landrace datasets, respectively. Acuña-Galindo et al.49 also found two meta-QTL for adaptation to drought stress on chromosome 6B in wheat. A recent study also reported a major grain yield QTL on chromosome 6B and fifteen haplotype blocks associated with two stability indices, including Lin and Binns’ superiority index and Eberhart and Russell’s coefficient, on chromosomes 1A, 4A, 4B, 5B, 6B, 7A, 7B, and 7D29. In addition, genomic regions associated with grain yield and yield stability on chromosomes 2B, 3A, 4A, 5B, 7A, and 7B were identified in CIMMYT’s spring bread wheat50. Considering all genotypes, we located about 44% and 24% of the markers associated with the mean GY on chromosomes 6A and 7D, respectively.

Interestingly, GO results showed that one of these markers is in the coding region of proteins that regulate transcription. Previous reports indicate that chromosome 6A contains a GY- and TGW-related locus in MET data that harbored the TaGW2-6A gene and that other genes influence its expression51. Chromosome 7D is of great importance in explaining GY phenotypic variation28. Muhu-Din Ahmed et al.52 identified MTAs for GY on chromosomes 1A, 3A, 4A, 1B, 4B, 6B, 7B, 5D, and 7D under both well-watered and water-deficit conditions. Several studies also demonstrated MTAs for GY in various wheat panels analyzed through GWAS on chromosomes 2B, 3A, 3D, 5B, 7A and 7B53, 1A, 2D, 3A, 7B and 7D1, and 1B54 under different water regimes. The marker locus on 4B for GY under water stress conditions was also associated with this trait in the Pakistani wheat population55. Similarly, in genome-wide association mapping, Edae et al.56 reported MTAs for GY on chromosomes 4A, 1B, 5B, and 2B of a spring wheat association panel under contrasting moisture regimes. Moreover, Lozada et al.57 found MTAs for GY on chromosomes 5A, 1B, 2B, and 4B in a diverse panel of 239 wheat genotypes evaluated across two growing seasons using SNP markers. Tadesse et al.58 reported GY-related MTAs on 1B in 120 elite hexaploid wheat genotypes, which were evaluated under rain-fed and irrigated conditions for a genome-wide study.

The multi-trait loci controlling performance and stability were located on chromosomes 1B, 3B, and 7B. Furthermore, chromosomes 2A and 4A had multi-trait control loci in all three datasets (cultivars, landraces, and the sum of these two). All chromosomes, except for chromosome 3B, were reported in a similar study9. In another study, chromosomes 3B and 2B, 3A, 4A, 5B, 7A, and 7B were associated with wheat yield stability coefficient50. Major QTLs with pleiotropic effects on chromosomes 3B and 7B have also been confirmed59. One study concluded that a specific combination of photoperiod genes increases the yield stability of durum wheat14. Also, the best allelic combination using stepwise regression in markers identified by genome-wide association mapping (GWAM) can lead to increased stability and yield in wheat50. Therefore, it is possible to say that yield stability is controlled by genes with pleiotropic effects. However, as the experiment was performed in the same place and under different conditions, the correlation between grain yield in different environments may be a reason to observe common SNPs. In this regard, the lack of correlation between the environments resulted in no common SNP for the GWAS performed in 9 environments33. Although several common MTAs were identified for GY, GN, SW, and PH traits and different stability indices, these traits are not exclusive and independent. Thus, it is possible to select both traits and stability indices in Iranian wheat cultivars and landraces since most significant MTAs (almost 90%) were not common among the trait values and stability indices. Lozada and Carter9 identified 12 SNP loci linked to both trait value and stability parameters in Pacific Northwest winter wheat. Two major effect SNP markers, Tdurum_contig61410_542 (1B) and BS00022542_51 (7B), were associated with grain yield and yield stability indices. The common MTAs between different traits and yield stability coefficient have already been reported50.
The low number of MTAs identified for PH is probably due to the fact that this trait is controlled by a small number of genes compared to other traits. However, the above results for yield and its components show that they are controlled by several genes that interact with each other and the environment. Stability-associated genes can also be stress-responsive genes50. Therefore, GO results could be well described, given that two of the six environments are under rain-fed conditions. Protein phosphorylation, especially in wheat grains, plays an important role in drought stress60. Jasmonic acid biosynthesis modulates drought stress in wheat61. Markers related to mean GY and SW were annotated with antioxidant activity. Reducing the effects of drought stress by such activity with various enzymes in wheat was demonstrated by previous researchers62. The synthesis of fatty acids is useful in counteracting drought stress in oats63. Transmembrane transport, DNA-binding transcription factor activity, DNA repair, and peptidase activity were other examples that were annotated and possibly involved in response to drought stress. These results are similar to the previous reports64. Earlier efforts have been made to interpret GWAS results and understand GEI using gene annotation33. KOBAS is a useful tool for genome annotation65. It has been shown that ubiquinone and other terpenoid-quinone biosynthesis are metabolic pathways of response to drought stress in plants66. In addition, carotenoid biosynthesis is involved as one of the KEGG pathways in drought stress tolerance67. Such an important role for the biosynthesis of secondary metabolites has been proven68.

Conclusions

In the current study, GWAS was performed for some important agronomic traits and different static and dynamic stability indices based on those traits were calculated in a diverse panel of 268 Iranian wheat cultivars and landraces. The highest number of marker pairs and lowest LD decay distance in both cultivars and landraces was observed on the B genome, whereas the D genome had the least number of marker pairs and most significant LD decay distance. A total of 846, 653, and 1023 significant MTAs were identified for the traits and their related stability indices in cultivars, landraces, and total genotypes datasets, respectively. The chromosomes 6B and 4D had the highest and lowest number of MTAs, respectively. The multi-trait loci controlling mean traits and stability were located on chromosomes 1B, 3B, and 7B, and GO results for highly significant MTAs almost confirmed the accuracy of the identified markers. The identified markers in this study could provide valuable genetic resources to initiate marker-assisted selection, fine mapping, and cloning the underlying genes and QTLs.

Methods

Plant materials and field evaluation

A set of 268 Iranian bread wheat genotypes, including 180 landraces and 88 cultivars, was studied in six environments (Supplementary Table 1). The environments included four well-watered environments during 2014, 2015, 2017, and 2018 and two rain-fed environments in 2017 and 2018 (Supplementary Table 2). Trials were planted in early November and harvested in July of the next year. The experiments were performed on the research farm of the University of Tehran (50.58° E, 35.56° N; 1112.5 m above sea level) in a randomized complete block design with two replications. Each plot consisted of four rows 1 m long (80 × 100 cm), with 20 cm between rows and 5 cm between plants within rows. Plant height (PH, cm), grain number per spike (GN), spike weight (SW, g), and grain yield per plant (GY, g plant−1) were measured on ten randomly selected samples from each plot. Plant height was recorded from ground level to the tip of the spike, excluding awns, at the maturity stage. After harvesting, all spikes were hand-threshed to determine the GY, SW, and GN. Then, stability parameters (Table 6) of each trait were calculated using the ‘agricolae’69, ‘ammistability’18, and ‘metan’70 packages in R and the STABILITYSOFT online program71. Broad sense heritability of traits was calculated using the following equation:

$$ H^{2} = \sigma_{g}^{2} /(\sigma_{g}^{2} + (\sigma_{ge}^{2} /e) + (\sigma_{\varepsilon }^{2} /er)) $$

where \(\sigma_{g}^{2}\) and \(\sigma_{ge}^{2}\) are the genotypic and genotype-by-environment interaction variances, respectively, \(\sigma_{\varepsilon }^{2}\) is the residual variance, and e and r are the numbers of environments and replications, respectively72.
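The heritability equation above can be evaluated directly once the variance components have been estimated. The following is a minimal Python sketch; the function name and the variance component values are hypothetical and serve only to illustrate the arithmetic, while e = 6 and r = 2 follow the trial design described above.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Return H^2 = var_g / (var_g + var_ge / e + var_e / (e * r))."""
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    # Phenotypic variance of a genotype mean over e environments and r replications
    phenotypic = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    return var_g / phenotypic


# Hypothetical variance components (NOT estimates from this study);
# six environments and two replications as in the field trials.
h2 = broad_sense_heritability(var_g=4.0, var_ge=6.0, var_e=12.0, n_env=6, n_rep=2)
print(round(h2, 3))  # 4 / (4 + 1 + 1) = 0.667
```

Because the interaction and residual variances are divided by e and er, respectively, adding environments or replications raises the heritability of genotype means even when the variance components themselves do not change.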

Table 6 Description of the stability statistics studied.
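As an illustration of how a dynamic stability statistic is computed, the sketch below implements Wricke's ecovalence (Wi) discussed in the Introduction, on the assumption that it is among the statistics listed in Table 6. Wi for genotype i is the sum over environments of the squared interaction terms, (Yij - Yi. - Y.j + Y..)^2. The function name and the genotype-by-environment means are hypothetical; the study itself used the R packages and online program cited above.

```python
def wricke_ecovalence(means):
    """Return Wi for each genotype.

    means[i][j] is the mean of genotype i in environment j
    (a complete two-way table without missing cells is assumed).
    """
    n_g = len(means)
    n_e = len(means[0])
    grand = sum(sum(row) for row in means) / (n_g * n_e)
    g_means = [sum(row) / n_e for row in means]
    e_means = [sum(means[i][j] for i in range(n_g)) / n_g for j in range(n_e)]
    return [
        sum((means[i][j] - g_means[i] - e_means[j] + grand) ** 2 for j in range(n_e))
        for i in range(n_g)
    ]


# Hypothetical means for three genotypes in two environments
table = [[4.0, 6.0], [5.0, 7.0], [3.0, 8.0]]
wi = wricke_ecovalence(table)
print(wi)  # [0.5, 0.5, 2.0]
```

A smaller Wi indicates a smaller contribution of the genotype to the GEI sum of squares, which is the dynamic notion of stability; here the third hypothetical genotype is the least stable.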

Genotyping

The plant material and the construction and sequencing of the genotyping-by-sequencing (GBS) library for the Iranian wheat samples were previously described by Alipour et al.39. In brief, sequence reads were first trimmed to 64 bp and grouped into sequence tags. Then, SNPs were identified by internal alignment allowing mismatches of up to 3 bp. The UNEAK (Universal Network-Enabled Analysis Kit) GBS pipeline was used for SNP calling, and reads with a low quality score (< 15) were discarded. Imputation was performed in BEAGLE v3.3.273 using the W7984 reference genome74. Finally, SNPs with heterozygosity < 10% and minor allele frequency > 5% were retained for further analysis.
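The final filtering step can be sketched as follows. This is a minimal Python illustration, not the pipeline used in the study: it assumes genotypes are coded as alternate-allele counts (0, 1, 2, with 1 denoting a heterozygote), and the function name and example markers are hypothetical.

```python
def snp_passes_filters(genotypes, max_het=0.10, min_maf=0.05):
    """Return True if a SNP has heterozygosity < max_het and MAF > min_maf.

    genotypes: alternate-allele counts per individual (0, 1, or 2);
    None marks a missing call and is ignored (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    n = len(called)
    het_rate = sum(1 for g in called if g == 1) / n
    alt_freq = sum(called) / (2 * n)
    maf = min(alt_freq, 1.0 - alt_freq)
    return het_rate < max_het and maf > min_maf


# Hypothetical markers scored on 20 individuals
snp_a = [0] * 16 + [2] * 4            # MAF 0.20, no heterozygotes
snp_b = [0] * 19 + [2] * 1            # MAF 0.05, not above the threshold
snp_c = [0] * 10 + [1] * 5 + [2] * 5  # 25% heterozygotes

print([snp_passes_filters(s) for s in (snp_a, snp_b, snp_c)])  # [True, False, False]
```

Both comparisons are strict, matching the "< 10%" and "> 5%" criteria stated above, so a marker sitting exactly on either threshold is excluded.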

Genome-wide association study

Both the general linear model (GLM) and the mixed linear model (MLM) were employed to obtain unbiased estimates of marker effects using the TASSEL 5.075 software and the GAPIT R package76. The GLM results were adjusted using either the first three principal components (PCA) or population structure (Q), and the MLM was corrected using the kinship matrix together with either the first three principal components (PCA + K) or population structure (Q + K). Results of all approaches from both TASSEL and GAPIT were evaluated based on Q-Q plots and the significance of associated loci using t-tests. In general, the MLM with the first three principal components and the kinship matrix (PCA + K) fitted in GAPIT provided a more robust control of confounding effects; therefore, only the MLM results obtained from GAPIT are reported. In the MLM, individuals are treated as random effects, and relatedness among individuals is modeled through a kinship matrix. A threshold of -log10(p) > 3 was used to declare MTAs statistically significant77,78. Confidence intervals (CIs) for MTAs were calculated for each chromosome using the linkage disequilibrium (LD) decay. Circular Manhattan plots were drawn using the CMplot R package79.
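The significance criterion can be applied to a table of association results as sketched below. This is a minimal Python illustration: the record layout, marker names, positions, and p-values are hypothetical. The interval helper assumes one common convention, marker position plus or minus the chromosome-specific LD decay distance, which may differ from the exact procedure used in the study.

```python
import math


def significant_mtas(results, threshold=3.0):
    """Keep association records whose -log10(p) exceeds the threshold.

    Each record is a dict with at least a 'p' key (0 < p <= 1).
    """
    hits = []
    for rec in results:
        score = -math.log10(rec["p"])
        if score > threshold:
            hits.append({**rec, "neg_log10_p": score})
    return hits


def ld_window(position, ld_decay):
    """Assumed convention: interval = position +/- LD decay distance,
    truncated at zero (both values in the same map unit)."""
    return (max(0, position - ld_decay), position + ld_decay)


# Hypothetical GWAS output (not results from this study)
records = [
    {"snp": "snp_1", "chrom": "6B", "pos": 1200, "p": 2.0e-5},
    {"snp": "snp_2", "chrom": "4D", "pos": 800, "p": 2.0e-3},
    {"snp": "snp_3", "chrom": "1B", "pos": 300, "p": 4.0e-2},
]
hits = significant_mtas(records)
print([h["snp"] for h in hits])  # ['snp_1']
print(ld_window(1200, 500))      # (700, 1700)
```

A threshold of 3 on the -log10 scale corresponds to p < 0.001 per marker, without correction for multiple testing.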

Gene annotation

Sequences surrounding all significantly associated SNPs were queried with the BLAST tool in the EnsemblPlants database (http://plants.ensembl.org/index.html), and gene annotation was assessed using Gramene (http://www.gramene.org/) by aligning the sequences to the IWGSC RefSeq v1.0 annotation (https://wheat-urgi.versailles.inra.fr/Seq-Repository/Annotations). After aligning the SNP sequences to the reference genome, we selected overlapping genes with the highest identity percentage and BLAST score for further processing. The gene ontology of each selected gene, including molecular function and biological process, was extracted from the Ensembl-Gramene database (http://ensembl.gramene.org). In addition, the sequences of significant SNPs were used for GO enrichment analyses using the KOBAS (KEGG Orthology-Based Annotation System) software80 to test for statistically enriched pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG, https://www.genome.jp/kegg/) database.