Genetic analyses and prediction for lodging‑related traits in a diverse Iranian hexaploid wheat collection

Lodging is one of the most important limiting environmental factors for achieving the maximum yield and quality of grains in cereals, including wheat. However, little is known about the genetic foundation underlying lodging resistance (LR) in wheat. In this study, 208 landraces and 90 cultivars were phenotyped in two cropping seasons (2018–2019 and 2019–2020) for 19 LR-related traits. A genome-wide association study (GWAS) and genomics prediction were carried out to dissect the genomic regions of LR. The number of significant marker pairs (MPs) was highest for genome B in both landraces (427,017) and cultivars (37,359). The strongest linkage disequilibrium (LD) between marker pairs was found on chromosome 4A (0.318). For stem lodging-related traits, 465, 497, and 478 marker-trait associations (MTAs) and 45 candidate genes were identified in year 1, year 2, and pooled. Gene ontology exhibited genomic region on Chr. 2B, 6B, and 7B control lodging. Most of these genes have key roles in defense response, calcium ion transmembrane transport, carbohydrate metabolic process, nitrogen compound metabolic process, and some genes harbor unknown functions that, all together may respond to lodging as a complex network. The module associated with starch and sucrose biosynthesis was highlighted. Regarding genomic prediction, the GBLUP model performed better than BRR and RRBLUP. This suggests that GBLUP would be a good tool for wheat genome selection. As a result of these findings, it has been possible to identify pivotal QTLs and genes that could be used to improve stem lodging resistance in Triticum aestivum L.

) is among the widely consumed food crops worldwide and is regarded as one of the most traded commodities on global markets 1,2 .Lodging is one of the most important limiting environmental factors for achieving the maximum yield and quality of grains in cereals, including wheat [3][4][5] .Lodging is when the roots of a crop are dislocated and/or their stems are irreversibly bent downward 4 .There are several difficulties that result from this situation, including higher drying costs, slowed harvest, reduced grain quality, and drastic yield losses of up to 85% 3,5,6 .The main challenge is the lack of global, regional, or local statistics on lodged areas related to various crops 7 .There are three key elements in determining the level of lodging and yield loss-the lodging angle, the spatial extent of lodging, and the stage of crop development (time of lodging incidence) 8 .By definition, CAI refers to the angles formed by stems with respect to vertical planes 9 .During lodging, a crop may experience a sequence of steps (i.e., lodging stages) beginning with CAI∼0° (a low deviation from the vertical situation) and finishing with CAI∼90° (crop bending close to the horizontal situation) 10 .Hence, CAI levels (ranging from moderate to severe) can be used to evaluate the lodging stage and canopy structure of lodged crops 11,12 .Agronomists and crop physiologists study lodging widely, but their efforts are usually limited to two aspects: agronomic practices (which reduce lodging-related risks) and breeding programs (which develop lodging-tolerant cultivars) 11 .There are four main characteristics of wheat ideotypes that demonstrate lodging resistance: larger stem diameters, wide root plates, strong root systems, and moderately short heights 13 .There are many challenges associated with evaluating lodging level since there are no data associated with it, no standard scale is available to present it, and lodging distribution on farms is random, involving complex genetic-environmental interactions 10,14,15 .The physiology of wheat lodging is influenced by a complex genetic architecture 16,17 .A complex trait that is difficult to quantify in the field is lodging 18 .For these reasons, assessing a genetic panel for lodging tolerance is a difficult task for wheat breeders.To make further progress in the development of lodging-tolerant wheat varieties it is crucial to get a better understanding of the molecular basis of lodging tolerance-related traits by using genetic tools, like QTL (quantitative trait loci) mapping 19 .In this context, a handful of QTLs accounting for 2-27% of stem strength and lodging variation [20][21][22] have been reported.The advent of next-generation sequencing (NGS) approaches has enabled cost-efficient genotyping-by sequencing that has been shown to be a useful tool by facilitating genetic dissection of complex traits in non-model organisms.Association mapping overcomes many of the restrictions of classic QTL mapping and can help identify minor genetic factors underlying complex traits 19 .QTLs identified through association mapping can be directly utilized in marker-assisted selection for improving genetic gain.To date, a few genome-wide association studies (GWAS) have been adopted to explore marker-trait associations (MTAs) and candidate genes affecting growth and development for lodging in crop plants including wheat 23 , oat 24 , bean 25 , canola 26,27 , and rice 28 .For example, Singh et al. 23 explored lodging tolerance via GWAS and identified a key genomic region on chromosome 2A, consistent across digital and visual scores of the lodging.
Anatomically lodging resistance is directly related to plant height, all 21 chromosomes carry genes that control plant height in wheat [29][30][31] .Up to now, 24 reduced height (Rht) genes (Rht1-Rht24) are catalogued in wheat 32,33 , where Rht8 on chromosome arm 2DS has been extensively explored 34,35 .We could locate only two QTLs to chromosome 2DL, whereas the ones reported by Borner et al 36 , on chromosome 2DS could not be detected.
Genomic prediction (GP) boosts speed and effectiveness of breeding by shortening breeding cycles and improving selection accuracy as auxiliary tools for GWAS 37,38 .An advantage of this approach is that it provides an opportunity to select a candidate gene by genotyping it before determining its phenotype 39 .Genomic prediction involves training a model that is comprised of all genetic markers within a model.A validation set is used to estimate the accuracy of the model 40 .Several factors affect genomic accuracy, including marker set, population structure, genomic selection method, and trait genetic architecture.Research has shown that GP is highly or moderately accurate for quantitative characteristics in barley 41 , maize 42 , rice 43 , oat 44 , and wheat 20,39 .
To the best of our knowledge, little is known about genomic regions associated with lodging resistance in wheat.Therefore, we uncovered putative candidate genes and evaluated the genomic prediction accuracy of lodging resistance using three methods for building a genomic selection model, namely genomic best linear unbiased prediction (GBLUP), ridge regression-best linear unbiased prediction (RRBLUP), and Bayesian ridge regression (BRR).

Result Phenotypic variation and correlation analysis
According to the analysis of variance, genotypic, and genotype × environmental effects on lodging-related traits were significant (Supplementary Table S1).Grain yield and number of nodes showed the lowest and highest phenotypic coefficient of variation (PCV) and genotypic coefficient of variation (GCV).A low heritability was observed in grain yield (41.32%), and a high heritability was observed in number of nodes (88.25%).Descriptive data on the lodging-related traits of wheat genotypes can be seen in Supplementary Fig S1 and Supplementary Table S2.Cultivars and landraces showed an average crop angle of inclination (CAI), lodging area (LA), and lodging index (LS) of 69.3 and 79.2°, 64.4 and 100%, 0.49 and 0.84, respectively.Because of this, cultivars have lower lodging rates than native landraces.Compared to native populations, cultivars had lower plant height, IL1, IL2, PL, and PeL, and larger stem diameters at nodes.Landraces showed higher DTH, DTF, and DTM while cultivars showed lower DTH, DTF, and DTM.Furthermore, cultivars showed better spike weight, spike area, and grain yield than landraces.
Pearson correlation between lodging-related traits is illustrated in the lower the grain yield (r = − 0.26**).Correlation coefficients between two environments (year 1 and year 2) for the lodging-related traits in Iranian wheat cultivars and landraces are shown in Supplementary Table S3.
Genotypes were divided into four bunches based on heatmap output.The foremost lodging-resistant genotypes were found within bunch A, which had a lodging index of zero or near zero.Genotypes with lodging index between 0 and 0.15% were found within bunch B. In the remaining two groups, genotypes were observed with a high lodging index.The lodging index within bunch D, which incorporates most local populaces, was the most elevated and extended from 0.6 to 1. Traits were separated into four groups: group 1 including SA, SW, and GY; Group 2 including PD, PeD, ID1, and ID2; Group 3 including DTH, DTF, and DTM; Group 4 including LS, CAI, LA, IL1, IL2, PH, NFN, PL, and PeL (Fig. 2).

Marker distribution and linkage disequilibrium (LD)
We had used four different reference genomes including barley, Chinese Spring, W7984, and IWGSC RefSeq v1.0 for imputation and the results indicate that W7984 provides more accurate results than other reference genomes.Therefore, in the current study we only used the result that obtained from W7984 reference genome.An analysis of 566,439,207 unique reads was obtained by genotyping 298 Iranian bread wheat accessions.A total of 133,039 SNPs were identified after alignment and de-duplication, of which 10,938 had a MAF > 5%, heterozygosity < 10%, and missing data < 10%.The 10,938 SNPs were retained and used for imputation.Afterward, 43,525 imputed SNPs were used to conduct the association study.A, B, and D genomes were mapped with 15,951, 21,864, and 5,710 SNPs, respectively, representing 36.7%, 50.2%, and 13.1% of all SNPs.The lowest and highest number of SNPs were on 4D (270 SNPs) and 3A (4034 SNPs), respectively (Supplementary Fig. S2).
According to the LD calculation of 46,525 SNPs, 1,858,425 marker pairs (MPs) were found in the panel of cultivars, of which 37.72% showed significant linkage.A range of 0.138 (Chr.3D) to 0.368 (Chr.4A) LD between marker pairs was observed across the 21 chromosomes.The D genome had the lowest number of MPs (215,600, 11.60%), followed by the A genome (683,825, 36.80%) and the B genome (959,000, 51.60%) (Table S1).A similar test on wheat landraces revealed 1,867,575 MPs, which were lower than those in wheat cultivars with a mean r 2 of 0.182.As expected, landraces showed a higher percentage of significant markers (847,725, 45.39%).The strongest LD was observed between marker pairs in Chr.4A (0.369), followed by Chr.2A (0.289) (Supplementary Table S4).

Population structure and Kinship matrix
According to the analyses of population structure, there are three subpopulations with varying degrees of mixing within them.The population structure matrix also revealed the maximum value of ΔK for K = 3, showing that the Iranian wheat genotypes can be divided into three subpopulations (Fig. 3A,B).A cluster analysis of the kinship matrix showed that the SBP-I subgroup contains 135 genotypes (13 cultivars and 122 landraces), the SBP-II contains 88 genotypes (5 cultivars and 63 landraces), and the SBP-III contains 75 genotypes (72 cultivars and 3 landraces) (Fig. 3C).In the principal component analysis, PC1 and PC2, explained 14.1 and 5.8 of the genotypic variation, respectively (Fig. 3D).Three distinct subpopulations were identified based on the first three PCs, with admixed accessions falling between the three major subpopulations.Additionally, the neighbor-joining tree of all accessions clearly showed that they were clustered into three subgroups (Fig. 3E).

Putative candidate genes for lodging tolerance
In-depth analysis was conducted on the markers with the highest significance (P < 0.0001) and pleiotropy.Gene ontologies based on 45 reliable MTAs indicate that candidate gene harboring these SNPs encode proteins that play a variety of roles in various biological processes, such as defense response, calcium ion transmembrane transport, carbohydrate metabolic process, chloroplast organization, nitrogen compound metabolic process, biosynthetic process, protein neddylation, protein phosphorylation, lipid metabolic process, phosphorelay signal transduction system and response to oxidative stress under lodging stress (Table 1).A total of 45 highly significant, functional MTAs were considered "reliable" MTAs for lodging-related characteristics.In choosing reliable MTAs, a high significance threshold and the molecular function were taken into account.The "major" MTAs were selected from the reliable MTAs that that had R 2 > 10%.A total of 20 major MTAs were detected for lodging-related in two environments (Supplementary Table S5).The following pathways have been discovered   based on the rice reference genome; starch and sucrose metabolism (Supplementary Fig. S4), zeatin biosynthesis (Supplementary Fig. S5), amino sugar and nucleotide sugar metabolism (Supplementary Fig. S6), and carbon metabolism (Supplementary Fig. S7).

Genomic prediction for lodging-related traits
The BRR, GBLUP, and RR-BLUP approaches using whole SNPs led to the identification of the highest prediction accuracies for 1, 13, and 5 phenotypes, respectively (Fig. 6).The highest prediction accuracy was determined  www.nature.com/scientificreports/via the GBLUP model for DTF, DTH, DTM, GY, ID1, IL1, IL2, LA, LS, NFN, PH, PL, SA, via the RR-BLUP for RR-BLUP for CAL, ID2, PeD, PeL, SW, and via the BRR for PD traits.The best prediction accuracy was related to the three traits CAL, LA, and LS, respectively with an average of 0.627, 0.610, and 0.519, while the lowest prediction accuracy was related to trait SW with an average of 0.233 (Fig. 6).GBLUP, RR-BLUP, and BRR approaches using significant SNPs displayed the highest prediction accuracies for the phenotypes 6, 6, and 7, respectively (Fig. 6).The highest prediction accuracy by the GBLUP was obtained for Convexity; by the RR-BLUP for DTH, IL1, ID2, PeD, PeL, SW; as well as by the BRR for PD, DTM, IL2, LA, PH, and PL traits.The best prediction was related to the three traits CAL, LA, and LS, respectively with a mean of 0.737, 0.637, and 0.650, while the lowest accuracy was related to trait SW with a mean of 0.361 (Fig. 6).

Discussion
It appears that Iranian wheat accessions are more diverse than the cultivars, and thus have a higher lodging score index due to the higher DTH, DTF, DTM, IL, and PH, and lower stem diameter and NFN.The stem diameter explains 55% of the variance in the lodging index 45 , so it is viewed as an important parameter for enhancing lodging resistance due to the increased amount of lignin, cellulose, and carbohydrates soluble in water.Increasing internode diameters can lower tiller numbers per unit area and therefore grain yield 46 , thus the association between grain yield and stem structure needs to be further explored.According to the correlation between phenological traits and lodging, an increase in DTH, DTF, and DTM leads to further growth, which leads to lodging 47 .Trait correlations revealed lodging to be directly related to plant high and other stem traits 48 .The properties of stems and their composition contribute significantly to stem bending resistance in crops 13,49 .The results show the lodging index correlates more strongly with ID1 than ID2, suggesting that the first internode contributes more to lodging resistance in wheat.Considering the first internode's nearly twice the material strength of the second internode, this association makes sense 50 .A weakness affecting stem strength, root characteristics, or soil structure can contribute to anchorage failure and lodging susceptibility 51 .The research by Berry et al. 52 found a decrease in lodging risk as a result of an increase in stem strength and root anchorage.Similarly, Tripathi et al. 45 found lodging resistance negatively related to spike area and weight.Thus, the increase in stem diameter of wheat genotypes can reduce lodging risk, making these genotypes ideal for breeding programs aimed at improving lodging resistance.www.nature.com/scientificreports/using the first five principal components as covariates [87][88][89][90] .In the Iranian wheat genotypes, there is a significant population structure with 30.5% of diversity coming from the first five eigenvalues.It was also observed in other studies that the population structure negatively affected GWAS and GP models 90 .Based on our observations, the GBLUP model provided the highest prediction accuracy.A study by Shabannejad et al. 37 investigated classic strategies to exploit GP accuracy in wheat cultivars and landraces under normal and drought conditions.Using the GBLUP and BRR methods, they identified the highest GP accuracy.Singh et al. 23 examined genome prediction models for lodging-related traits and found high predictive accuracy (0.42) across populations and environments.The authors observed that obtaining the highest GP accuracy depends on the genetic variation, the genetic architecture of the trait, the level of LD, and the genomic selection approach.As a result, the GBLUP model can detect genetic impacts in wheat populations better than other genomic prediction models.

Conclusion
Using validated lodging measurements along with association and genomic prediction analyses, we provide evidence in support of a polygenic genetic architecture of lodging in wheat.Our findings have diverse applications in plant breeding and genetics.The results of our research provide new insight into the molecular mechanisms underlying lodging resistance traits in wheat.To develop lodging resistance wheat cultivars, marker-assisted selection can target genes controlling these traits, including LS, PH, NFN, and IL1.Moreover, genomic selection by using our putative genetic markers along with GBLUP-based genomic prediction will help to achieve the above-mentioned goal.

Plant material and phenotyping
The research was conducted in the Alborz province in the Department of Agriculture & Natural Resources Campus (35°48′59ʺN, 51°58′48ʺE, 1321 m elevation) (Fig. 7a).We conducted an alpha-lattice experiment with two replications on 298 wheat accessions (208 native landraces and 90 cultivars) to analyze lodging and related traits in wheat under normal conditions (Supplementary Table S6).The replications each contained 30 incomplete blocks containing 10 genotypes each.Wheat accessions were grown in plots with a plant density of 250 plants/m 2 in four rows (1 m × 2 m) at 0.20 m intervals.The wind rose plot, and the climatic characteristics are presented in Fig. 7B,C, and Supplementary Table S7.Based on evaporation from an evaporation pan, a 40 mm threshold was determined for irrigation to be implemented in irrigated crops.The lodging and related traits of wheat accessions were measured in the pre-physiological stage.The traits measured in this study were as follows: Lodged area (LA, %), Lodging score index (LS), Crop angle of inclination (CAI, degree), Plant height (PH, cm), Internode length 1 and 2 (IL1 and IL2, cm), Penultimate length (Pel, cm), Peduncle length (PL, cm), Internode diameter 1 and 2 (ID1 and ID2, mm), Penultimate diameter (PeD, mm), Peduncle diameter (PD, mm), Number of node (NFN, n), Grain Yield (GY, g per plant), Spike weight (SW, g), Spike area (SA, cm 2 ), Days to maturity (DTM), Days to flowering (DTF), and Days to heading (DTH).
The authors declare that all study complies with relevant institutional, national, and international guidelines and legislation for plant ethics in the "Methods" section.Samples are provided from the Gene Bank of Agronomy and Plant Breeding Group and these samples are available at USDA with USDA PI number (Supplementary Table 2), respectively.The authors declare that all that permissions or licenses were obtained to collect the wheat plant.

LA, CAI, LS
In order to determine whether the wheat plots were healthy (H) or lodged (L) in the field, the CAI was calculated for each plot using the lodged area (LA [0-100%]) and vertical angle (CAI [0-90°]) (Fig. 8A,B) 91 .The CAI was calculated using trigonometric calculations and a plumb bob.In this case, the string of the plumb bob was suspended from the top of the crop, and it was just possible for the plumb bob's tip to touch the soil, ensuring precise vertical height (hv) measurements.The slant height (hsl) of lodged plants was measured using a plumb bob.The CAI was then calculated from the vertical using the Eq. ( 1) 91 : lodged area was also analyzed visually using a quadrant methodology.Using this approach, the LA % was computed in each of the four quadrants starting from the center of the plot, and then the final LA for each plot was calculated.The Fig. 8C and D depict lodged and healthy subplots for healthy plots, each trait was measured in three subplots (0.25 m 2 ), while for lodged plots, four to eight subplots were measured in order to account for spatial heterogeneity within each lodged patch 91 .

Statistical analysis
Analysis of variance and the best linear unbiased prediction (BLUP), were estimated using METAR v2.1 93 .The diversity of Iranian wheat accessions was evaluated and compared using advanced statistical analysis.The box plot was drawn with R 4.1 software using ggplot2, dplyr, and ggpubr packages.Also, correlation diagrams were drawn using the R packages corrplot and RcolorBrewer.Cluster analysis and heatmaps were implemented with the R 4.1 packages gplots, dendextend, and d3heatmap in order to classify wheat accession types.

Genotyping and SNP imputation
By using CTAB, wheat seedling genomic DNA was extracted and RNA contamination was removed with RNase (ribonuclease) 94 .A Thermo Scientific NanoDrop was used to determine DNA concentration, and a 0.8% agarose gel was used to evaluate DNA integrity.Genotyping-by-sequencing (GBS) was employed to genotype all 298 wheat samples 95 .A library of GBS has been developed and sequenced as described by Alipour et al 58 .Each sequencing read was trimmed to 64 bp and grouped into sequence tags.The SNPs were explored using BLAST, which allows for up to 3 bp of mismatch.The UNEAK pipeline in TASSEL 96 was used to call SNPs.To avoid false-positive SNPs originating from sequencing errors, SNPs with a missing rate < 10% across samples, a minor allele frequency (MAF) > 5% and heterozygosity < 10% were excluded.Missing data were imputed using the LD

Fig. 1 .Figure 1 .
Figure 1.Correlation coefficients between the lodging-related traits in Iranian wheat cultivars and landraces.LA lodged area, CAI crop angle of inclination, LS lodging score index, PH plant height, NFN number of nodes, PL peduncle length, PeL penultimate length, IL1 internode length 1, IL2 internode length 2, PD peduncle diameter, PeD penultimate diameter, ID1 internode diameter 1, ID2 internode diameter 2, DTH days to heading, DTF days to flowering, DTM days to maturity, SW spike weight, SA spike area, GY grain yield.

Figure 2 .
Figure 2. Hierarchical clustering and heatmap of Iranian wheat landraces and cultivars based on the wheat lodging-related traits.LA lodged area, CAI crop angle of inclination, LS lodging score index, PH plant height, NFN number of nodes, PL peduncle length, PeL penultimate length, IL1 internode length 1, IL2 internode length 2, PD peduncle diameter, PeD penultimate diameter, ID1 internode diameter 1, ID2 internode diameter 2, DTH days to heading, DTF days to flowering, DTM days to maturity, SW spike weight, SA spike area, GY grain yield.

Figure 3 .
Figure 3. Determination of subpopulations number in wheat genotypes based on ΔK values (A), A structure plot of the 298 wheat genotypes and landraces determined by K = 3 (B).Principle component analysis (PCA) for a total of 298 Iranian bread wheat accessions (C).Cluster analysis using Kinship matrix of imputed data for Iranian wheat accessions (D).The dendrogram of Neighbor-Joining clustering was constructed using 43,525 SNPs and 298 Iranian wheat accessions (E).

Figure 8 .
Figure 8. Measurement of crop angle of inclination (A) and presentation of various lodging stages (B).Presentation of the plot center and the healthy/lodged subplots in the field (C).Division of the plot into four quadrants Q1, Q2, Q3, and Q4 (D).LA1, LA2, LA3, and LA4 are corresponding to the lodged area in each quadrant.In this scenario, H1 and H2 present the healthy subplots while L1 to L6 are the lodged subplots.The CAI is estimated via averaging the CAI and LA calculated in the six lodged subplots and in each quadrant, respectively.

Table 1 .
Description of expected MTAs using imputed SNPs for lodging traits in Iranian wheat accessions.LA lodged area, CAI crop angle of inclination, LS lodging score index, PH plant height, NFN number of nodes, PL peduncle length, PeL penultimate length, IL1 internode length 1, IL2 internode length 2, PD peduncle diameter, PeD penultimate diameter, ID1 internode diameter 1, ID2 internode diameter 2, DTH days to heading, DTF days to flowering, DTM days to maturity, SW spike weight, SA spike area, GY grain yield.