Genome-wide association study and expression of candidate genes for Fe and Zn concentration in sorghum grains

Sorghum germplasm showed grain Fe and Zn genetic variability, but a few varieties were biofortified with these minerals. This work contributes to narrowing this gap. Fe and Zn concentrations along with 55,068 high-quality GBS SNP data from 140 sorghum accessions were used in this study. Both micronutrients exhibited good variability with respective ranges of 22.09–52.55 ppm and 17.92–43.16 ppm. Significant marker-trait associations were identified on chromosomes 1, 3, and 5. Two major effect SNPs (S01_72265728 and S05_58213541) explained 35% and 32% of Fe and Zn phenotypic variance, respectively. The SNP S01_72265728 was identified in the cytochrome P450 gene and showed a positive effect on Fe accumulation in the kernel, while S05_58213541 was intergenic near Sobic.005G134800 (zinc-binding ribosomal protein) and showed negative effect on Zn. Tissue-specific in silico expression analysis resulted in higher levels of Sobic.003G350800 gene product in several tissues such as leaf, root, flower, panicle, and stem. Sobic.005G188300 and Sobic.001G463800 were expressed moderately at grain maturity and anthesis in leaf, root, panicle, and seed tissues. The candidate genes expressed in leaves, stems, and grains will be targeted to improve grain and stover quality. The haplotypes identified will be useful in forward genetics breeding.

and Zn content in our research field was beyond the minimum required levels (4.50 to 6.50 mg kg −1 Fe and 0.6 to 0.9 mg kg −1 Zn) 46 and ample for normal growth and development of plants.Soil type at Parbhani was deep and black whereas, at ICRISAT it was shallow and light black.Irrigation was provided at each important crop stages to grow healthy plants including the grains development stage.The experiments were conducted in the post-rainy season to ensure the good quality of the seed.The experimental design was alpha lattice with two replications.Each germplasm was grown in a single row of 4 m long at both locations with inter and intra-row spacing of 60 × 15 cm.Standard agronomic practices were followed for successful crop development in each season.

Phenotyping and estimation of Fe and Zn concentration in the grains
The panicles were harvested to measure the Fe and Zn content from the grains as they reached the physiological maturity stage.The panicles from five random plants from each plot were selfed before the flowering stage to obtain pure seed.Upon maturity, these panicles were harvested and stored separately in a dry cloth bag to produce clean grain samples for micronutrient analysis.
The harvested panicles underwent a 7-day sun drying to achieve a post-harvest grain moisture level of 10-12%, essential for mitigating fungal infections.Threshing was meticulously conducted in cloth bags to prevent sample contamination by extraneous particles.Subsequently, the seeds underwent cleaning to eliminate glumes, anthers, dust, and other contaminants.A composite seed sample weighing approximately 30 g was then collected for each plot.Analysis of grain Fe and Zn content was performed using X-ray fluorescence spectrometry (XRF) (Make: Hitachi High-Tech, Japan; model: X-Supreme8000), a calibrated, efficient, non-destructive, and cost-effective analytical method.

Statistical analyses
A combined analysis of variance (ANOVA) was performed to assess the main and interaction effects of Environment (E) and Genotype (G), considering E, G, and replications as random effects.The individual variance of environments was modelled using the Residual Maximum Likelihood (REML) procedure using SAS Mixed procedure using SAS v9.4 47 .Best Linear Unbiased Predictors (BLUPs) were estimated using below equation: where Y ijkl , M, E i , R(E) j(i) , G k , GE jk , and e ijkl , respectively, stand for the measurement on plot l in environment i , block j , containing genotype k , the overall mean of all plots in all environments, the effect of environment (trial) i , the effect of replicate j within environment i , the effect of genotype k , the interaction of genotype k with environment i , the plot residual.
The heritability was estimated as repeatability 48 using the formula: where σ 2 g , σ 2 gxe , andσ 2 e are genetic variance component, genotype × environment variance component and residual variance respectively; E and r are the number of environments and replications, respectively.The four best com- binations of environments were selected for phenotypic data processing in this study, based upon the relatively high repeatability (Table 1) from the combined analysis of variance.All combinations of the environments are presented in Supplementary Table S2.

Isolation of DNA and genotyping
Genomic DNA from the single plant of sorghum minicore samples was isolated from 30 days old seedlings using the QIAGEN DNAeasy 96 plant kit.Purity and quantity of the extracted DNA was determined using gel electrophoresis and a Qubit 2.0 Fluorometer (Life Technologies, Carlsbad, CA) respectively and finally diluted to 30 ng/µl.Genotype-by-sequencing (GBS) libraries were prepared using the restriction enzyme ApeK1 according to Elshire et al. 49 .SNPs were called using the TASSEL v5.2 GBS pipeline 50 against the S. bicolor BTx623 reference

Population structure
The high-quality SNPs were used to determine the genetic distance between the sorghum accessions using R "amap" and "labdsv" packages.The phylogeny analysis was performed with the Euclidean method with 1000 bootstrap replications with R "ape" package 55 and the neighbor-joining tree was visualized in iTOL tree viewer 56 .The Principal Coordinates Analysis (PCo) between the accessions was measured with R "labdsv" package (https:// CRAN.R-proje ct.org/ packa ge= labdsv).The admixture analysis was performed with ADMIXTURE 1.3.0 57with expected K in the range of 2 to 15 and the K with the lowest cross-validation error considered as optimal subpopulations.The genome-wide Linkage Disequilibrium (LD) was generated using the r 2 values calculated with TASSEL v5.2 50 .

Association mapping
We performed GWAS with multiple models viz., MLMM, SUPER, BLINK and FarmCPU.Manhattan and Quantile-Quantile (Q-Q) plots were visualized in RStudio 58 using GAPIT 3 package 54 .The spurious associations in GWAS were corrected using Bonferroni Correction (5% level of significance) 59 and significant MTAs, corresponding to putative QTLs for the studied traits, were determined by the P-value less than 0.05/m, with m being the number of markers 60 .In the present investigation, Bonferroni correction value is calculated at 9.07 × 10 -5 .Further, the percentage of phenotypic variance explained (PVE) by all significant SNPs was generated either by GAPIT 3 built-in algorithms or calculated in RStudio as the squared correlation between the phenotype (BLUPs) and genotype of the SNP accounting for the above-described populations of environments.The Linux bash and in-house developed script was used to find the haplotype blocks and to identify the promising germplasms exhibiting high Fe and Zn content from the sorghum population.

Candidate gene identification and in silico gene expression analysis
Single nucleotide polymorphisms (SNPs) explaining more than 7.5% of the phenotypic variance of their associated traits were identified and their genomic regions were further analyzed in the process of functional GWAS and candidate gene identification.To perform functional GWAS, an interval of 51.77 Kb upstream and downstream the SNP position was considered, based on a genome-wide linkage disequilibrium (LD) decay cut-off at r 2 = 0.1.Annotation details for genes and respective Gene Ontologies (GO) within each region were retrieved using the Phytomine interface implemented in Phytozome 61 .JGI Plant Gene Atlas (https:// plant genea tlas.jgi.doe.gov/) 62 was used to check the tissue-specific expression of candidate genes and to access the expression data for 36 tissues at juvenile, vegetative, and reproductive stages.Gene expression was in the unit of Fragments per kilobase of transcripts per million mapped reads (FPKM).The expression heatmap was generated using TBtools II 63 .

Ethics statement
This research does not involve the ethics of human and animal experiments.

Institutional, national, and international guidelines and legislation statement
The seeds of sorghum used in this work were collected from ICRISAT's Genebank (http:// geneb ank.icris at.org/).The authors confirm that all methods applied in this work were performed in accordance with the relevant guidelines/regulations/legislation.
The combined ANOVA showed significant differences among the genotypes and a significant genotype × environment for both Fe and Zn traits (Table 2).As depicted in Table 2 and in Fig. 1, the concentration of Fe and Zn varied over a wide range, and the correlation between the two micronutrients was high (r > 0.80).The frequency distributions of the phenotypic data followed a normal distribution, as suggested by Shapiro-Wilks normality test 64 ; and Ryan-Joiner statistics 65 (Fig. 2).

Genetic diversity, population structure, and linkage disequilibrium (LD)
We retained 55,068 high quality SNPs that were well distributed across the genome (Fig. 3) with a density of ~ 5505 markers per chromosome (Chr) and around 80 markers per Mb region.Genomic regions with high marker density were observed on Chr 01 and Chr 02 with an average magnitude of 91 markers per Mb on both chromosomes.Chr 06 has averaged 86 markers per 1 Mb region with high-density regions between 47 and 52 Mb.Chr 08 and 09 also showed high-density regions between 57-63 Mb and 55-74 Mb, respectively.Linkagedisequilibrium decay (LD decay) is determined using the entire set of markers.The LD decay plot was plotted     www.nature.com/scientificreports/as LD (r 2 ) against the distance in base pairs (bp).The overall LD decay across the genome estimated at 51.77 kb (Fig. 4a).The unweighted unrooted neighbor-joining tree (Fig. 4b) used to depict the phylogenetic diversity showed that the genotypes clustered based on the races, but also revealed a significant level of admixtures (Fig. 4b).A similar clustering was confirmed using the PCA with the first 3 PCs explaining 39.13%, 19.27% and 11.20% genetic variation (GVE), respectively (Fig. 4c), which gives a total of 70% GVE.The optimum population size estimation using the ADMIXTURE model also gave K = 4 as the point with the lowest cross-validation error (CV error) (Fig. 4d).

Marker-trait associations
A genome-wide association analysis was undertaken to use the whole-genome high-quality 55,068 marker information, high-repeatability phenotypic data, and four single-locus (SUPER) and multi-locus (BLINK, MLMM, FarmCPU) algorithms, and evaluate the linkage disequilibrium (MTAs) that existed between genetic variations and nutritional traits viz., grain Fe and Zn concentration in the panel of 140 sorghum genotypes.For both Fe and Zn traits, five MTAs were detected, each, by BLINK and MLMM GWAS models, whereas FarmCPU and SUPER, each, detected two.A threshold of MAF (≥ 0.05) was set to account for the limitations of GWAS 66 and to correct for spurious MTAs.Table 3 displays a comprehensive compilation of significant markers that showed significant effects.The Manhattan plots (Fig. 5) showcased the GWAS output, portraying where these highly promising markers assert their statistical significance amidst the vast complexity of the genome.Additionally, these markers were carefully selected based on their elevated values in the Quantile-Quantile (QQ) plots (Fig. 5), indicating their non-random association with the traits of interest.We identified 5 consistent MTAs for Zn and 4 for Fe that were revealed by at least one model in at least one trial combination (Table 3; Fig. 5).Three of the 5 MTAs for Zn were detected using the statistically stringent MLMM model 54,67 while the remaining 2 were detected using the SUPER model.All Zn MTAs were detected for trial combination 2, except one (S05_58213541), that was detected for trial combinations 1 and 2. The highest PVE reported for Zn was 31.8% (Table 3).The strongest MTA (S01_72265728) reported for Fe was detected by 3 www.nature.com/scientificreports/models including the most stringent, MLMM (Table 3).The same MTA was also detected across trial combinations 2 and 4, and was just 1.5 Mb downstream to a Zn MTA locus detected by marker S01_73777110 (Table 3).
The highest PVE for this Fe MTA was 34.7%.The other MTAs for Fe were detected by BLINK (S03_73164578, S04_43148417, S05_67287071) and FarmCPU (S05_67287071), all of which had the highest PVE at 14.7%.Supplementary Table S3 depicts further information about significant MTAs along with MAF, effects, genes, and the functional annotation.

Candidate genes associated with grain Fe and Zn
Candidate genes for each of the MTAs were identified within the window of LD block (51.77 kb upstream-downstream) as defined by the LD decay estimation (Supplementary Table S4; Fig. 4a).Moreover, Table 4 provides a summary of all significant markers that were identified to be localized within genes, and the corresponding annotations.A total of 5 genes were identified for both traits studied (Table 4).Two significant and genic SNPs associated with Fe were linked with Sobic.001G445900 and Sobic.005G188300genes which play a significant  role in Cytochrome P450, heme binding, iron ion binding, oxidoreductase activity, and monooxygenase activity.
The genes are putatively involved in other processes including acting on paired donors with incorporation or reduction of molecular oxygen in dhurrin biosynthetic process, jasmonic acid mediated signaling pathway, leaf shaping response to brassinosteroid sterol, and metabolic process unidimensional cell growth.Similarly, three significant SNPs associated with Zn were located in Sobic.001G017500,Sobic.001G463800 and Sobic.003G350800genes which are putatively associated several properties including NAD(P)-binding domain, Glucose/ribitol dehydrogenase, NADPH-cytochrome P450 reductase, peptidase S8/S53 domain-containing protein and malate dehydrogenase acting in deoxyribonucleoside triphosphate catabolic process, nucleoside triphosphate catabolic

Haplotype block analysis
In the analysis of trait-associated markers, an adjacent set of five markers was utilized to assess the haplotype block for each marker.This approach resulted in an average block size of 109 kb for iron (Fe) trait markers and 66 kb for zinc (Zn) trait markers.The analysis identified a total of eleven accessions exhibiting promising characteristics for Fe traits, and fifteen accessions demonstrating favorable attributes for Zn traits.Notably, five accessions viz., IS22294, IS29239, IS22239, IS29565, and IS23514 were found to possess haplotypes blocks of markers associated with both Fe and Zn traits (Fig. 6).Additionally, based on the Zn haploblock analysis, three accessions (IS22239, IS21512, and IS19676) demonstrated unique pattern of haplotypes indicating that these are related and have the strong variation in Block 1.

Expression analysis
The tissue-specific expression of the identified genes was analyzed in 36 sorghum tissues at three different growth stages of the plant (Supplementary Table S5, Fig. 7).Based on the FPKM values, an expression heatmap was generated, plotting the expression data of five candidate genes across 36 sorghum tissues (Supplementary Table S5, Fig. 7).Out of five genes, Sobic.003G350800 which is a peptidase S8/S53 domain-containing protein, showed higher level of expression in several tissues such as leaf, root, flower, panicle, and stem.Conversely, Sobic.001G017500 is not expressed in any of the tissues.Sobic.005G188300 and Sobic.001G463800genes are expressed moderately at grain maturity and anthesis stages in leaf, root, panicle, and seed tissues.Sobic.001G445900gene showed high expression at juvenile and grain maturity stage.

Discussion
Iron (Fe) and zinc (Zn) are essential micronutrients for both human and animal nutrition, and their importance in sorghum lies in their role in promoting overall health and preventing nutrient deficiencies, also known as hidden hunger.In addition to preventing micronutrient malnutrition, these micronutrients are essential for cognitive development, immune system function, agricultural productivity, and socio-economic development.Hence, Figure 6.Haplotype blocks for the significant MTAs for traits studied.Five adjacent markers on both sides of the significant SNP were used to find the significant haplotype block.The 'BlockPattern' at the end of the haplotype blocks describes the pattern of the respective hapblock.
biofortified sorghum provides a readily available and affordable source of essential micronutrients, preventing and alleviating micronutrient deficiencies and preserving traditional food culture.It preserves traditional food preferences, promotes economic development, and contributes to climate resilience as sorghum is a C4 crop with high level of resource use efficiency.Few biofortified sorghum varieties were released under the world's drylands and this gap motivated our research work; we aimed to identify the genomic regions associated with grain Fe and Zn for downstream use in sorghum biofortification breeding.Several GWAS studies has identified the genes governing complex traits, revolutionizing agricultural improvement.Research on pulses and grains revealed genes related to the accumulation of micronutrients, including the pathway for the carotenoid production in maize 68 .Additionally, wheat displayed 92 SNP trait correlations linked to 10-grain mineral areas.Twenty genes were functionally annotated to demonstrate their significance in grain mineral accumulation; most of these genes were found in the D-genome, indicating control over wheat grain mineral diversity 69 .
The genetic architecture of seed molybdenum and selenium in wild and cultivated chickpea was investigated using GWAS 70 .After 180 entries were surveyed, 16 SNPs were linked to these characteristics.Similarly, 22 quantitative trait nucleotides (QTNs) for grain nitrogen were found by phenotyping 174 accessions of Croatian common bean land races for the seed concentrations of eight micronutrients 71 .In our study, using sorghum entries from the ICRISAT's minicore provided a guarantee that the uncovered major QTLs and proxy SNPs would benefit wider research communities.To the best of our knowledge, this work is the first to report on the marker traits associations for grain Fe and Zn in sorghum using SNPs and minicore lines.
In this study, the analysis of variance (ANOVA) was carried out for both the characters viz.grain Fe and grain Zn over four combinations of environments.The result indicates highly significant differences among the genotypes for both traits, which revealed the existence of sufficient variation for effective GWAS, statistical inferences, and selection of superior plant ideotypes.Despite the study's small population size, the identified MTAs were significant and agreed with previous results, indicating that the population size did not pose severe data www.nature.com/scientificreports/quality challenges 33,72,73 .This study demonstrated that GWAS is a powerful tool for identifying potential genetic factors that contribute to important traits in sorghum genotypes, even with a small sample size.The repeatability for different combinations was varied, the highest was measured for Fe and Zn (56% and 74%, respectively) in combination 4, whereas low values were registered in combination 2 i.e., 41% and 57% for Fe and Zn, respectively.The genetic basis of traits like grain Fe and Zn content can be complex, involving multiple genes with small effects 24 , and this can explain the observed moderate repeatability.The observed medium to high broad-sense heritability/repeatability and wide variation within population for the evaluated micronutrients are the precondition for a successful GWAS [74][75][76] .The association between grain Fe and Zn showed significant and high positive values and the trend was similar across environments, implying that selection for either mineral can be used as proxy for the other.Such a favorable relationship between these micronutrients were reported earlier in sorghum [77][78][79][80] and other cereals, such as pearl millet [81][82][83][84][85][86] , maize 87,88 , rice 89,90 , and wheat [91][92][93][94] .
Research on sorghum grain Fe and Zn so far could only identify a few small effect QTLs and some putative candidate genes 24 , without meaningful use in sorghum genetic biofortification.In this study we used multiple popular GWAS models i.e., MLMM, BLINK, FarmCPU and SUPER 33,37 ; the latter is a single locus model, while the remaining three are multi-locus.The use of multiple models allowed to detect more reliable QTLs i.e., those that were co-detected by more than one model 95 .On the other hand, although multi-locus GWAS models showed advantage over single-locus GWAS methods 96 , a combination of single-locus methods and multi-locus methods was used in this work as recommended by [97][98][99][100] to improve the detection power and robustness of GWAS.The FarmCPU method offers superior statistical power by dividing Multiple Loci Linear Mixed Models into fixed effects model (FEM) and random effects model (REM), removing confounding, and controlling false positives; the SUPER model addresses computing issues with MLM, while BLINK improves statistical power with using LD information 33,54 .Structure analyses showed that the minicore population was genetically structured with an estimated four subpopulations, and corrective measures were implemented in the GWAS to account for population structure and cryptic relationships to avoid false positive associations 33,101,102 .
With the use of GBS SNP data, we identified nine highly significant MTAs for grain Fe and Zn with P values ranging from 3.88 × 10 -12 to 5.19 × 10 -7 , explaining 7.5 to 34.7% of phenotypic variation (PVE).Four MTAs for Fe were identified on chromosomes 01, 03, 04 and 05 while five MTAs for Zn were identified on chromosomes 01, 02, 03 and 05.Two major SNPs were identified for Fe and Zn: for iron, the SNP S01_72265728 was identified in the cytochrome P450 gene, while the SNP S05_58213541 associated with Zn is intergenic and near Sobic.005G134800(2.8 kb) which codes for a zinc binding ribosomal protein.S01_72265728 was associated with positive effect on Fe accumulation in the kernel, while S05_58213541 was associated with negative effect on Zn accumulation.Many other differentially expressed genes are involved in the uptake of minerals.In the case of the gene for Fe, the Cytochrome P450 gene is involved in mineral uptake for both Fe and Zn.Cytochrome P450 is a pigment with heme-protein properties and participates in several catalytic processes that involve specific heme group 103 .Cytochrome P450s catalyze a wide range of chemical reactions and have different enzymatic mechanisms and complex substrate specificities.The enzyme structures of different P450 possess a heme-binding mode with an unusually long heme-binding loop and a unique I-helix which may involved in Fe uptake from the soil 104 .A study on barley collection identified single-nucleotide polymorphisms for Fe and Zn biofortification in which cytochrome P450 superfamily protein was found to be involved in element transport, iron, and zinc binding 43 .Plant cytochrome P450 (P450) participates in a wide range of biosynthetic reactions and targets a variety of biological molecules 105 .However, there is no specific information on how cytochrome P450 gene is involved in Fe and Zn uptake.Satyavathi et al. 106 identified a cytochrome P450 superfamily protein involved in element transport, iron, and zinc binding, and found out that it was present in the genotypes with high Fe and Zn contents.A study on rice biofortification with zinc and selenium found that the expression pattern of a Cytochrome P450 (CYP) gene followed the mineral accumulation in flag leaves 107 .Such mixed reports suggests that more research is needed to fully understand the function of cytochrome P450 gene in mineral uptake.In pearl millet, it was reported that the cytochrome P450 proteins are up-regulated during panicle initiation 105 .The process of iron uptake by plants is an extremely energy-intensive mechanism 108 .A plant's ability to extract iron from the complex or chelating molecule by reducing Fe+++ to Fe + + is essential for iron absorption 108 .Uptake of Fe from the soil is dependent on other cations in the soil solution such as manganese (Mn) and calcium (Ca).Accumulation of Fe in grain is a very complex process and many factors such as polyphenol content and other stresses play a significant role in Fe concentration in the grain.In another study found that drought stress alters iron accumulation in sorghum seeds, and photosynthesis impaired by drought stress might trigger a disturbance in iron homeostasis 109 .The same authors hypothesized the that increased vacuolar transporters and ferritin might be involved in the regulation of iron accumulation in sorghum seeds under drought stress.
Zinc-binding proteins are involved in abiotic stress tolerance and play a significant role in root hair growth under stress conditions.It was reported that the metal binding proteins facilitates the absorption of Zn and metal ions by procuring the binding sites 110 but in our study, the identified SNP (S05_58213541) showed a negative effect which could mean that there is an association between that SNP and a particular trait, but the presence of a specific allele at that SNP locus is associated with a decrease or reduction in the trait of interest, Zn concentration in this case.This can indicate that either the plant absorbed the Zn ions from the soil and sequester it but unable to translocate it to the grains, or the plant with that particular SNP was unable to uptake enough Zn form the soil.Whether this is a direct effect (modified gene product) or a non-allelic interaction, it is to be investigated.This gene can also be a candidate for genome editing in order to improve Zn concentration in the kernel.If the associated gene is knocked out, there may be chances of significantly increasing the concentration of Zn in grains.Most putative genes identified in this study are zinc-binding or zinc ion-binding proteins.The gene ontology CCHH term indicates the presence of transcripts involved in metal ion binding activity, indicating their role in uptake and transport of Fe and Zn.A zinc finger is a small protein structural motif characterized by the coordination of Based on the results, among the four combinations of locations, combination 2 consisting of three locations identified the most MTAs.The identified candidate genomic regions/candidate genes are likely to have an important role in achieving high Fe and Zn content in sorghum grains.The identified SNPs can therefore be validated and used in developing nutritionally improved, Fe and Zn rich sorghum cultivars, which would help address micronutrient deficiencies and increase food security.The correlation between Fe and Zn concentration in the sorghum grain was high (r > 0.80) but we could not come across pleiotropic QTLs (SNPs) in this study.The lack of common SNPs in this study may indicates that the effects of the genetic factors that influence Fe and Zn concentrations may vary depending on the environmental conditions.Indeed, a correlation of 0.80 implies that 36% of the variance in sorghum grain Fe or Zn concentration cannot be explained by neither of the two metals.
Haplotype-based breeding has recently come to prominence as an effective approach to developing crop varieties that meet particular requirements 111 .To be used in breeding programs, this breeding strategy needs to first determine superior haplotypes.Haplotypes are distinct sets of alleles found on a single chromosome that are inherited together with a limited probability of contemporary recombination 112 .Breeders can increase the accuracy of genomic predictions and breeding strategies by more effectively defining haplotypes using linkage disequilibrium-based techniques and haplotype diversity 113,114 .
Recently, researchers explained the genetic diversity and evolutionary background of sorghum accessions by identifying several haplotypes of genes such as Dry 115 and Sh1 116 in domesticated and wild sorghum lines.Furthermore, Wu et al. 117 conducted a population genetic study that demonstrated the Sh1 and SbTB1 regions were subject to strong selection during the domestication of sorghum.Moreover, it has resulted in the potential role of SbTB1 haplotype in controlling the number of lateral branches in sorghum in domesticated and wild accessions.To the best of our knowledge, there is no specific study that directly addresses haplotype analysis in grain Fe and Zn in sorghum, and this is the first study to report haplotypes for grain Fe and Zn.In the present investigation, the haplotype analysis used sets of five markers to evaluate haplotype blocks, resulting in average block sizes of 109 kb for iron (Fe) trait markers and 66 kb for zinc (Zn) trait markers.It identified eleven accessions with promising Fe traits and fifteen with favorable Zn traits.The identified haploblocks were further used to identify sorghum accessions that inherited iron and/or zinc QTLs stably and possibly with rare crossing-over events.Notably, five accessions possessed haplotype blocks associated with both Fe and Zn traits.Furthermore, three accessions showed unique haplotype patterns related to strong variation in Block 1 based on Zn haploblock analysis.Thus, it will be beneficial to: (1) target the identified haploblock-containing plants as potential QTL donors in sorghum crossing blocks, and (2) tailor plant architecture and enhance grain nutrient content by incorporating these haplotypes through haplotype-based breeding in sorghum breeding programs.
In this work, the baseline expression profiles of the identified genes were shown using heat map.The gene Sobic.003G350800encoding for a protein which contains the peptidase S8/S53 domain, was found to be highly expressed in a variety of tissues, including the leaf, root, flower, panicle, and stem.Nearly all plant species include peptidase S8/S53 domain subtilases, which are subtilisin-like proteases that regulate a variety of biotic and abiotic stressors, and are expressed in a variety of tissues.Sorghum bicolor contains 57 genes belonging to the S8 family 118 .Subtilase family members exhibit remarkable functional versatility, from protein expression to stress responses, and modulating plant growth and development from seed development to senescence 119 .Genes, Sobic.005G188300 and Sobic.001G463800 are expressed moderately at grain maturity and anthesis stages in leaf, root, panicle, and seed tissues.These genes mainly encode for malate dehydrogenase (MDH: EC 1.1.1.82)in C4 plants.The functional annotation of Sobic.001G463800showed its role in iron ion binding activity, which is probably one of the factors determining the success of biofortification, nutrient bioavailability at different plant growth stages, from soil to plant tissues.In nature, the synthesis of malate is catalyzed by the MDH enzyme through the reversible reduction of oxaloacetate to malate.As such, MDH is therefore indirectly but strongly associated with iron and other nutrients uptake in plants.Indeed, Malate is a key product of plant metabolism with diverse functional roles in plants 120,121 , including, but not limited to: respiration and energy generation, photosynthesis (both C3 and C4), fatty acid oxidation, lignin biosynthesis, pulvinal and stomatal function, nitrogen (N2) fixation and amino acid biosynthesis, ion balance, uptake of phosphorus (P) and iron (Fe), and aluminum (Al) tolerance [121][122][123] .Cultivar differences, such as low nutrient mobility and remobilization efficiency, also affect the effectiveness of biofortification, particularly in relation to leaves and edible parts 124 .The identified candidate genes showing high micronutrients expression in leaves, stem, and grains at different growth stages (juvenile, vegetative, and reproductive stages) can be targeted to increase the nutritional value of both grain and stover, providing health-promoting food, feed, and forage.Overall, this study provides invaluable information on the genetic basis of grain Fe and Zn contents in sorghum and identifies candidate genes and genomic regions that can be used in breeding programs to improve these micronutrients.However, more research works are needed to further characterize these genomic regions and candidate genes and to validate their role in grain Fe and Zn concentration in sorghum grain before they are implemented in sorghum breeding programs.

Figure 2 .
Figure 2. Graphical representation of grain iron and zinc frequency distributions for all four combinations.Comb_1, _2, _3, _4, respectively, environments combination 1, 2, 3, and 4. Red color bars indicate the distribution of grain iron and blue color bars indicate the grain zinc distribution.The distribution of population is normal in all four combinations as described in Shapiro-Wilk test for normality in frequentist statistics.The Ryan-Joiner (RJ) statistic also confirms that phenotypic data follow a normal distribution.

Figure 3 .
Figure 3. Distribution of SNP markers on ten chromosomes of the sorghum.Telomeric regions of the chromosomes are highly dense relative to the centromeric regions.

Figure 4 .
Figure 4. Informativeness of the markers and characterization of the structure of the genotypes used for GWAS.a. LD decay distance estimated at 51.77 Kb. b.A dendrogram showing the clustering of the genotypes used.Four clear clusters are observed with the rest appearing as admixtures.Cluster I-Caudatum and associated hybrids; Cluster 2-Kafir and associated hybrids; Cluster III-Caudatum-bicolor hybrids; Cluster IV-Durra genotypes.c.PCA showing informativeness of the markers and further confirming the clustering observed in the dendrogram.d.Optimum population size estimation confirms K = 4 as the point with the lowest CV error.

Figure 5 .
Figure 5. Manhattan plots along with their respective QQ-plots showing the association of sorghum accessions for grain Fe and Zn content.Manhattan (left) and respective QQ-plots (right) (from top to bottom) depicted for grain Fe using BLINK, FarmCPU and MLMM in combination 2; and combination 4 using BLINK; and for grain Zn combination 1 using MLMM; and combination 2 using MLMM and SUPER.Associations were detected using 55,068 high-quality SNPs.The green horizontal line in the Manhattan plot shows the Bonferroni threshold at a 5% level: −log 10 p.value < 6.04, above which solid dots indicated significant MTAs.

Figure 7 .
Figure 7. Tissue-specific expression of candidate genes identified for Fe and Zn contents.The expression of five candidate genes in 36 tissues from three different growth stages (juvenile, vegetative, and reproductive) are plotted in the heatmap.Juvenile, vegetative, and reproductive stages are depicted in green, brown, and black font color, respectively.

Table 3 .
The marker-trait associations (MTAs) or SNPs detected for grain iron (Fe) and grain zinc (Zn) using multiple model algorithms with MAF ≥ 0.05.*Numbers within round brackets refer to respective trial combinations in the rightmost (8th) column.

Table 4 .
A summary of SNP markers located within a gene and the corresponding annotation.