Genetic dissection of grain zinc concentration in spring wheat for mainstreaming biofortification in CIMMYT wheat breeding

Wheat is an important staple that acts as a primary source of dietary energy, protein, and essential micronutrients such as iron (Fe) and zinc (Zn) for the world’s population. Approximately two billion people suffer from micronutrient deficiency, thus breeders have crossed high Zn progenitors such as synthetic hexaploid wheat, T. dicoccum, T. spelta, and landraces to generate wheat varieties with competitive yield and enhanced grain Zn that are being adopted by farmers in South Asia. Here we report a genome-wide association study (GWAS) using the wheat Illumina iSelect 90 K Infinitum SNP array to characterize grain Zn concentrations in 330 bread wheat lines. Grain Zn phenotype of this HarvestPlus Association Mapping (HPAM) panel was evaluated across a range of environments in India and Mexico. GWAS analysis revealed 39 marker-trait associations for grain Zn. Two larger effect QTL regions were found on chromosomes 2 and 7. Candidate genes (among them zinc finger motif of transcription-factors and metal-ion binding genes) were associated with the QTL. The linked markers and associated candidate genes identified in this study are being validated in new biparental mapping populations for marker-assisted breeding.

More than two billion people worldwide do not get enough essential vitamins and minerals in their diets 1 . Inadequate Zn intake affects 17.3% of the world's population, mostly across Asia and Africa 2 , and is responsible for the deaths of 433,000 children under the age of five each year 3 . The impact of micronutrient deficiencies is most serious in women of reproductive age (especially pregnant women) and children under the age of five, due to their increased micronutrient requirements 4,5 .
Wheat is an important staple food crop that provides about 20% of daily dietary energy and protein and is a major source of essential micronutrients such as Fe and Zn 6 . For more than 70 years, the International Maize and Wheat Improvement Center (CIMMYT) has bred wheat for improved grain yield potential and processing quality, resulting billions of dollars of benefits to farmers and consumers over this period 7 . The development and dissemination of high-yielding, nutrient-rich wheat varieties is a cost-effective and sustainable solution for minimizing nutrition gaps in remote rural households 8 . However, breeding for enhanced Zn concentrations is quite challenging due to the complexity of genetic and metabolic networks controlling the Zn homeostasis [9][10][11] . Moreover, differences in wheat plants Zn use efficiency, variability of translocation into the grain, and genotype-dependent source-sink relations can affect internal mobilization patterns of these micronutrients 12 .
Nutritional quality traits (such as Zn and Fe) tend to be multigenic. Breeding strategies are therefore focusing on novel approaches to broaden the genetic base using wild relatives and landraces, and identifying genetic control and their effects 13 . The biofortification breeding program at CIMMYT emphasizes the development of new spring bread wheat germplasm that combines wide adaptability, high yields, high grain Zn concentration, better   end-use quality traits 24 . To our knowledge, this is the first study that applies GWAS to dissect the genetics of grain Zn concentration in CIMMYT germplasm. Our study used the Illumina iSelect 90 K Infinitum SNP array 25 to genotype and analyze a diverse panel of 330 high zinc wheat lines for grain micronutrients across multiple environments. The study aimed to identify molecular markers associated with grain Zn concentrations and to identify associated candidate genes using recently-published wheat reference sequence information. The results will facilitate marker-assisted breeding for enhanced grain Zn in wheat.

Results
The combined analysis of variance across six environments showed significant differences between genotypes and significant genotype × environment interaction effects for grain Zn concentration, which was evident from the moderate level of broad-sense heritability (H 2 = 0.6) for grain Zn. Continuous variation of grain Zn concentration (the primary target trait) was observed in all environments (Fig. 1). Across both seasons in Ciudad Obregon, Mexico, there was much wider variation in grain Zn concentration at CENEB (high Zn environment), compared to intermediate grain Zn levels at IIWBR and lower levels at PAU and BHU (low Zn environments) was observed. Differences in grain Zn between environments were probably caused by different soil Zn levels at each test location. Large

GWAS for grain zinc and agronomic traits. Principal component analysis indicated four groups in the
HPAM panel (Fig. 2). The first principal component separated a group of Veery (Babax)-derived lines (G1) from a second large group of lines with the parents Kauz or Seri in their pedigree (G2). Lines with Attila and Waxwing in their genetic background formed a third group (G3). The fourth group (G4) was formed by lines unrelated to any of the previous three groups. Since the lines included in the panel were selected based on pedigree information, the partitioning of the genetic diversity in subgroups remained a distinct feature of the panel, similar to that observed in other elite wheat collections 26 .
A mixed linear model combining population structure and kinship data was used to identify a total of 39 significant MTA for grain Zn. Corresponding SNPs were assigned to 10 different locations on chromosomes 1A, 2A, 2B, 2D, 5A, 6B, 6D, 7B, 7D, and one uncertain chromosome (Table 1; Fig. 3). The SNPs explained 5-10.5% of phenotypic variation. Among the 39 MTA, seven were identified in at least three environments. A major SNP on chromosome 2A (RAC875_c34757_180) at 60 cM explained approximately 9% of the phenotypic variation while another SNP on chromosome 5A (IAAV1375) explained approximately 6%. For SNP RAC875_c34757_180, 277 lines had the CC genotype and 44 lines had the TT genotype. TT genotypes showed increased grain Zn concentration and thousand kernel weight (TKW). For SNP IAAV1375, 236 lines had genotype CC and 86 lines had genotype TT. The CC genotype displayed increased grain Zn and slightly reduced TKW (Fig. 4). Another major SNP, wsnp_Ex_c5268_9320618, was detected at 120 cM on chromosome 7B and explained approximately 10.5% of the variation.
There was a significant positive correlation between environments for grain Zn, except in BHU-2013 and IIWBR-2013. This suggests that, although there was significant G x E interaction, the ranking order of lines for grain Zn was maintained across locations (Fig. 4). Significant positive correlation was observed between environments for TKW and days to maturity (not shown). There was a strong correlation for grain Zn across both seasons at CENEB, indicating moderately-high heritability for grain Zn, though location or ecological zone may play an important role in determining micronutrient content as well as yield and yield-related traits. Genomic predictions were evaluated for accuracy using cross-validations approaches. The genomic predictions for grain Zn showed good potential, with correlations ranging from 0.5-0.65 for grain protein content, in accordance with the findings of Velu et al. 27 .
We also investigated the relationship between grain Zn and agronomic traits. The 347 markers (not shown) associated with days to heading, days to maturity, plant height, or TKW were not highly significantly associated with grain Zn. However, marker Excalibur_c23723_141 on chromosome 2B was associated with both TKW and grain Zn concentration. Based on these results, we conclude that high grain Zn is not related to agronomic traits or their pleiotropic effects ( Table 2). These results suggest an absence of any confounding effects of Ppd, Vrn, and Rht genes. However, previous studies have shown that the Rht genes that confer semi-dwarf characteristics in CIMMYT wheat germplasm showed a reduction in grain Zn and Fe 6 , whereas a promising marker on chromosome 2B associated with TKW increased grain Zn concentration 19 . Positive associations between TKW and grain Zn concentration have also been reported in CIMMYT germplasm 19,28 . Genes encoding observed MTA. Significant SNPs identified from the GWAS study were used to locate known candidate genes relevant to metal homeostasis and transporters using the recently annotated wheat reference sequence (RefSeq V1.0). Interestingly, the major loci identified on chromosome 2 defined by the markers BS00070798_51 and RAC875_rep_c78518_198 is located in proximity to a heavy metal domain (metal ion transporter) and Glutathione S-transferase gene (TraesCS2A01G072500). Another marker on chromosome 2B (Excalibur_c11392_1193) is associated with a metal ion binding protein (Mg and Mn), which is a paralogue of a rice (Oryza sativa subsp japonica) metal binding protein gene 'Os04g0609600' . This marker was linked with a phosphatase enzyme (Table 2). However, this marker exhibited the presence of SNP [A/G] in non-coding 3′ UTR region and did not exhibit any alteration in corresponding protein when the marker-linked cDNA was computationally translated into a putative protein. Further BLAST-N analysis revealed the higher conservation and 100% identity of this protein with Aegilops taushii, 90% with T. urartu, 92% with Oryza sativa, and more than 80% identity with other dicot species. This analysis revealed higher conservation of this gene from monocots to dicots and this marker is linked with an essential gene phosphatase for catalyzing phosphate ion cleavage on target proteins. It is presumed that higher phosphorous concentration competes with Zn accumulation and this phosphatase enzyme catalyzes phosphate and helps to accumulate more Zn in grain. Moreover, this phosphatase enzyme is an Mg 2+ or Mn 2+ -dependent protein, which is highly conserved even from prokaryotes to eukaryotes. One of the markers on chromosome 7B (wsnp_Ex_c5268_9320618) aligned with the database sequence from plants.ensembl.org. This marker was identified as 'Zn finger motif ' with synonymous mutation. We expect that this transcription factor would not affect its own function and thus should not affect activation of downstream genes. This kind of genes is highly conserved and may sustain the function of other genes. Since it is 'Zn finger motif ' , a certain level of Zn is necessary for gene function, or if additional Zn becomes available the function of this transcription factor will be activated 29 .
Interestingly, another marker on chromosome 6B (Ra_c7673_1788) was associated with ABC transporter G family (28-like). The ABC transporter in wheat is associated with pathogenic defense (for example, Lr34) and is critical for adaptation to stress environments 30 . Subsequent analysis suggests that allelic variation in SNPs (CC vs. TT) had appreciable positive effects on grain Zn and TKW for the SNP RAC875_c34757_180 on chromosome 2B Fig. 5. In contrast, SNP IAAV1375 on chromosome 5A had a positive effect on grain Zn but negative effect on TKW. Nevertheless, there was a positive pleiotropic effect of grain Zn with TKW, which is evident from the positive significant correlation between grain Zn and TKW 19,28 .

Discussion
The diverse panel of 330 wheat lines were phenotyped in a range of environments in India and in Mexico and genotyped using the high-density iselect 90 K SNP assay to disclose the genetic mechanism of Zn accumulation in wheat grain. GWAS identified 39 MTA for grain Zn concentration, and associated candidate genes were localized using the sequence annotation (RefSeq v1.0) for finding the positions of the markers and the annotations of the genes. GWAS study identified two major effect QTL on 2 and 7 chromosome groups. A number of QTL for high grain Fe and Zn have been mapped on 2 and 7 chromosomes of wheat [19][20][21][22] . These results confirm that the group 2 and 7 chromosomes holding genes for nutrient uptake, translocation and sequestration of mineral in wheat plant 21,23 .
This study identified some promising genetic loci associated with grain Zn in wheat. Novel candidate regions, together with genes previously shown to regulate metal homeostasis genes, are among the promising genes for further validation and testing in CIMMYT wheat germplasm. Major QTL regions were identified on chromosomes 2B and 7B, concurring with previous studies and highlighting the potential of this region for fine mapping and gene discovery research. Our results indicate that the Zinc finger motif transcription factor and phosphatase (metal ion binding) play major roles in loading additional Zn to the wheat grain. Functional analysis is now being conducted for the selected markers using metabolomics approaches to give further information about Zn accumulation regulators and associated metal transport pathways. The markers and allelic combinations identified here can be used for marker-assisted selection. Fixing and accumulating these common loci and introducing new rare alleles into high-yielding elite germplasm represents a promising opportunity in breeding Zn biofortified wheat. In addition, genomic prediction based on all SNPs seems to be a more effective strategy than selection based on only a few SNPs. Genomic prediction approaches using dense DNA markers can therefore enable more efficient selection of lines at early stages of wheat breeding programs to reduce the breeding cycle compared to phenotyping.
Our study provides comprehensive use of wild progenitor species such as T. spelta, T. dicoccum, and Ae. tauschii for offering new source of alleles for high grain Zn in wheat. Sequences of significant SNP markers were aligned to contigs on publicly available databases (e.g. https://urgi.versailles.inra.fr/) to identify candidate genes from wheat, barley, Brachypodium, and rice. Furthermore, SNPs and corresponding gene sequences allowed us to anchor the some potential genomic regions (e.g. Zinc finger motif and metal ion binding genes) to recently-released wheat assemblies that are being investigated for genes involved in the biosynthesis and metal transport pathways. These genes, together with other known candidate genes, are being investigated and significant SNPs will be converted into Kompetitive Allele Specific PCR (KASP) markers that can be efficiently employed to transfer rare alleles into elite wheat lines. This study has demonstrated that modern genotyping technologies efficiently detect promising loci for nutritional quality traits. The development of PCR-based markers (such as KASP assays) can accelerate breeding through efficient marker-assisted selection and by pyramiding diverse QTL regions from different genetic resources.
Transcription factor candidate genes known to be associated with Zn accumulation which were mapped on chromosome 7B and probably associated with loci identified in bi-parental QTL analyses 21 . In addition, 18 SNPs identified in this study were co-located in/near regions of QTL for seed size. The co-location of QTL for Zn with seed size indicates that breeding for both higher micronutrients and seed size is highly possible. Marker-assisted selection for high Zn and Fe could simultaneously improve multi-associated target traits of seed size by using the co-associated genetic loci.  The rich genetic diversity for Zn in various wheat wild species and landraces provides novel alleles for genetically enhancing wheat Zn and Fe. Traditional breeding approaches have been used to successfully incorporate several novel alleles for grain Zn into elite germplasm by crossing high yielding elite wheat lines with Ae. tauschii-based synthetic hexaploid wheats or T. spelta accessions. CIMMYT's biofortification breeding program has incorporated several novel alleles for grain Zn in elite, high-yielding germplasm via targeted crosses with increased population sizes and by selecting for agronomic traits in early generations 14 . The result has been the development and release of biofortified wheat varieties such as 'Zinc Shakti (Chitra)' , WB-02, HPBW 01 (PBW 1 Zn), Zincol-2016, and BARI-Gom 33, which possess approximately 30-40% higher grain Zn than local checks 15 .
While grain Zn concentration is under quantitative genetic control and influenced by soil and other associated factors, it can be further improved by pyramiding multiple grain Zn QTL in high yielding wheat. Although grain Zn expression influenced soil and environmental factors, moderately high heritability and high genetic correlations between environments suggests ranking of high Zn genotypes across diverse environments 14 . The genetics of biofortification traits is now better understood and promising genotypes that combines high Zn density and stable performers are being used in the biofortification breeding program.
High-throughput, non-destructive phenotyping for grain Zn and Fe using EDXRF analysis is facilitating selection 31 . Gene discovery and mapping studies leading to the utilization of markers to further improve the breeding efficiency. As proof of concept of germplasm deployment of improved germplasm simultaneously enriching genetic knowledge through high-density genomics analysis could enhance breeding efficiency.  since 2009-10 to enrich the available soil Zn and minimize Zn heterogeneity. Sufficient nutrition and water were supplied throughout the trials to avoid potential nutrient and drought stresses. Fungicide and pesticides were applied to protect the crop from pest and diseases. Field observations were recorded for TKW, plant height, days to heading and days to maturity.

Methods
Grain sampling and trait determination. Plant materials were harvested after physiological maturity when grains were totally dry in the field. Grain samples (approx. 20 g/entry) were carefully cleaned, with broken grains and foreign materials discard, and the residual sample was used for micronutrient analysis. Grain samples from all five environments were analyzed using a 'bench-top' , non-destructive, energy-dispersive X-ray fluorescence spectrometry (EDXRF) instrument (model X-Supreme 8000, Oxford Instruments plc, Abingdon, UK) that was standardized for high-throughput screening of grain Zn and Fe concentration (unit: mg/kg) in wholegrain wheat 31 . A proficiency study was conducted by Flinders University, Adelaide comparing EDXRF instrument results with the original Inductively Coupled Plasma (ICP) data which confirmed that the EDXRF results are reproducible. Soil analysis of the experimental area showed average Zn concentration of 1.3 ppm at 0-30 cm depth and 0.78 ppm at a soil depth of 30-60 cm.
Statistical models. Broad , where σg 2 is the genotypic variance, σge 2 is the genotype × environment variance, and σe 2 is the residual error variance for r replicates and l locations using the R program. Genotypic values (i.e. line means) were estimated as Best Linear Unbiased Estimators as fixed effect, with a random effect for replicates nested within each environment.
Genotypic data and GWAS. DNA was extracted according to standard CTAB procedures and genotyped using the Illumina iSelect 90 K Infinitum SNP array developed in wheat 25 . The array revealed a total of 28,074 SNP markers, which we coded as 0 and 1 for homozygote and 0.5 for heterozygote allele scores for data analysis. Markers were filtered for 70% missing data, 0.10 minor allele frequency, and greater than 10% heterozygosity, giving a remaining total of 14,273 SNP markers for GWAS.
Significant MTAs were identified using the mixed linear model approach in TASSEL5.0, which simultaneously accounted for population structure and kinship. Population structure (summarized by the Q matrix) was inferred by the STRUCTURE program v.2.2, and the kinship matrix (summarized by the K matrix) was calculated using TASSEL5.0. SNPs with p ≤ 0.0001 were considered significantly associated with individual traits. R 2 was used to evaluate the magnitude of MTA effects. The recent version of IWGSC RefSeq v1.0 have been accessed for identifying associated candidate genes.