Elucidating the genetics of grain yield and stress-resilience in bread wheat using a large-scale genome-wide association mapping study with 55,568 lines

Wheat grain yield (GY) improvement using genomic tools is important for achieving yield breakthroughs. To dissect the genetic architecture of wheat GY potential and stress-resilience, we have designed this large-scale genome-wide association study using 100 datasets, comprising 105,000 GY observations from 55,568 wheat lines evaluated between 2003 and 2019 by the International Maize and Wheat Improvement Center and national partners. We report 801 GY-associated genotyping-by-sequencing markers significant in more than one dataset and the highest number of them were on chromosomes 2A, 6B, 6A, 5B, 1B and 7B. We then used the linkage disequilibrium (LD) between the consistently significant markers to designate 214 GY-associated LD-blocks and observed that 84.5% of the 58 GY-associated LD-blocks in severe-drought, 100% of the 48 GY-associated LD-blocks in early-heat and 85.9% of the 71 GY-associated LD-blocks in late-heat, overlapped with the GY-associated LD-blocks in the irrigated-bed planting environment, substantiating that simultaneous improvement for GY potential and stress-resilience is feasible. Furthermore, we generated the GY-associated marker profiles and analyzed the GY favorable allele frequencies for a large panel of 73,142 wheat lines, resulting in 44.5 million datapoints. Overall, the extensive resources presented in this study provide great opportunities to accelerate breeding for high-yielding and stress-resilient wheat varieties.

Genome-wide association mapping for grain yield. We performed GWAS using GY data from 100 datasets and the filtered GBS markers (Table S2). We report the marker p-values from 734,304 tests of significance of associations between the markers and GY observations (Table S3). After Bonferroni correction for multiple testing, we obtained 1,413 markers that were significantly associated with GY in the different datasets (Figs. 2, 3, 4, 5, Fig. S1). This included 801 markers on all chromosomes that were significant in more than one dataset and the highest number of these markers were on chromosomes 2A (130), 6B (89), 6A (84), 5B (78), 1B (60) and 7B (56). For these consistent markers, we also report the genetic positions in the POPSEQ map 23 (Table S4). In addition, we created a reference map aligned to the RefSeq v.1.0, with 159 GY-associated markers that were significant in seven to twenty-four datasets and observed that the highest number of those consistent markers were on chromosomes 5B (37), 7B (29) and 2A (25) (Fig. 6, Fig. S2). Furthermore, we used the LD between markers to designate 214 GY-associated LD-blocks (where each block constitutes markers with a pairwise R 2 values greater than 0.9 and the p-value for the existence of LD equal to zero) and observed that 89.3% of the consistent LD-blocks in the irrigated-FP environment, 86.9% in the moderate-drought environment, 84.5% in the severe-drought environment, 100% in the early-heat environment and 85.9% in the late-heat environment overlapped with the consistent GY-associated LD-blocks identified in the irrigated-BP environment (Fig. S3).
The key GY-associated LD-blocks on different homeologous chromosomes are highlighted below, along with their positions relative to the reported GY-associated and phenology-associated loci. The maximum additive effect (MAE) of a marker in the GY-associated LD-block within environments (not considering the combined datasets) is indicated in parenthesis after the dataset where the MAE was observed.

Discussion
We have performed the largest GWAS for wheat GY using a massive panel of wheat breeding lines evaluated in multiple environments and have reported consistent LD-blocks associated with GY in the irrigated-BP (199 LD-blocks), irrigated-FP (56 LD-blocks), moderate-drought (61 LD-blocks), severe-drought (58 LD-blocks), early-heat (48 LD-blocks) and late-heat (71 LD-blocks) environments and target sites in different countries (97 LD-blocks). These include several novel loci and previously reported loci, that together provide important insights into the quantitative genetic architecture of GY. However, it should be noted that the GY-associated LD-blocks reported in this study are only based on pairwise R 2 values greater than 0.9 and are not independant. Hence, several LD-blocks could be in LD with the same QTL, due to the high LD in wheat 10 . The association of Yld.cim-1BS. 10 with GY in all the environments mentioned above, indicates its exceptional value in breeding for GY in the optimum, drought and heat stressed environments. In addition, Yld.cim-2AS.1 that was associated with GY in the optimum, drought-stressed and target environments and also associated with GY in the highest number of datasets in this study, was in the position of the 2NS translocation from Aegilops ventricosa. This is an invaluable region in wheat breeding, because of its additional favorable effect on resistance to rusts and that were associated with grain yield in 11-24 datasets. The yellow color indicates the favorable allele (allele that has an increasing effect on grain yield), the blue color indicates the non-favorable allele (allele that has a decreasing effect on grain yield), the magenta color indicates the heterozygote and the white color indicates missing data. For 56 of the 68 markers, we observed that greater than 50% of the lines had favorable alleles. www.nature.com/scientificreports/ wheat blast diseases, lodging tolerance, etc. 10 . We also observed that some LD-blocks were in the location of the vernalization genes and the Photoperiod-B1 gene. These have to be considered cautiously as they might not be associated with GY per se, but are significant because of their effect on flowering time that determines GY and stress-adaptation 48 . While accounting for flowering time in GWAS is possible, the large populations used in this study comprised lines with a wide range in their flowering time and we were interested in exploiting this variation to understand the flowering time-linked genes that had an indirect effect on GY.
A remarkable observation in our study was that greater than 84% of the LD-blocks identified in the stressedenvironments overlapped with the consistent LD-blocks identified in the irrigated-BP environment. This substantiates the positively correlated performance of lines in optimum and stressed environments, indicating that simultaneous improvement for GY potential and stress-resilience is feasible 49 . Considering the target sites in different countries, we observed that 93.8% of the consistent LD-blocks identified in them, overlapped with those identified in the irrigated-BP environment in CIMMYT's key selection environment, Obregon, thereby providing compelling evidence to the shared genetic basis of GY in CIMMYT's selection environment and target environments in different countries.
While it is interesting that only 56.7% of all the significant markers identified in this study were significant in more than one dataset, we also observed that the highest number of datasets where a marker was significantly associated with GY was only 24, thereby accentuating the persistent challenge of identifying consistent GYassociated markers in different environments. Furthermore, our observation of a large variation in the additive effect of markers in different environments (average range in the additive effect of a marker across environments was 59.1 kg/ha and the maximum range was 245 kg/ha) clearly indicated the inconsistency of marker effects, which complicates their deployment in marker-assisted selection and genomic selection across years 10,[50][51][52] . We also observed that among the 2,389 significant marker-dataset associations within environments and cycles, the marker additive effects were between 300 and 320 kg/ha in 0.38% (9) of the marker-dataset associations, between 200 and 299 kg/ha in 1.97% (47) of the marker-dataset associations, between 100 and 199 kg/ha in 26.4% (630) of the marker-dataset associations and between 23 and 99 kg/ha in 71.3% (1,703) of the marker-dataset associations. This substantiates that while GY is pre-dominantly controlled by many small-effect loci, there are some moderate to large-effect loci, that can be utilized in selections.
We have also successfully utilized the Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway method 53 for GWAS in the S1 irrigated-BP combined dataset with greater than 50,000 lines. While it involved a significantly low computation time (~ 8 h), we also observed that 42.7% markers identified by this method were also identified in other datasets using the mixed linear model implemented in TASSEL (Trait Analysis by aSSociation Evolution and Linkage). We have also developed a reference map with consistent GYassociated markers aligned to the RefSeq v1.0, which is an exemplary resource to the wheat community for comparing GY-associated markers in different populations and environments. Finally, we have also reported the genomic profiles of GY-associated markers for the largest panel of wheat lines, which indicated that the CIMMYT wheat germplasm is rich in GY FAs and a trove for breeders to choose parents and design strategic crosses based on complementary GY alleles at desired loci. Overall, this study has extended the knowledge on the genetic architecture of wheat GY and the extensive resources presented provide significant opportunities to accelerate breeding for high-yielding and stress-resilient wheat varieties.

Online methods
Grain yield testing trials, environments, and designs. Grain yield (GY) was measured as the harvested grain weight on a plot-basis and the GY testing trials used in this study comprised the following:

Stage 1 GY testing trials
These trials comprised greater than 7,000 lines in the first year of GY testing at Obregon (Mexico), selected each year from about 70,000 head-row derived lines that were developed using the selected-bulk breeding method. In this method, all the early-generations were selected for agronomic type, phenology, rust resistance, grain size and health, spike fertility and tillering capacity and spikes from the selected plants were bulked until the F4, F5, or F6 stage (depending on the cross and the breeding shuttle logistics), from which the head-rows were obtained and then selected again for the aforementioned traits. The Stage 1 GY testing trials used in this study were evaluated in an irrigated-bed planting (S1 irrigated-BP) environment, where the lines were planted during the optimum planting time (the last 8-10 days of November to the 1 st week of December) on raised beds in optimally irrigated environments that received a total of about 500 mm of water in five irrigations.
The GY evaluation plot size was 4.8 m 2 and the lines were sown in three rows over each of the two beds that were 80 cm wide. The lines were replicated twice in an alpha-lattice design, with each of the > 300 trials

Stage 2 GY testing trials
These trials comprised 1,092 lines in the second year of GY testing at Obregon, that were selected from the lines in Stage 1 for high GY, good resistance to stem and stripe rust, acceptable agronomic type, phenology and end-use quality. The lines were sown in three replications in an alpha-lattice design and each of the 39 www.nature.com/scientificreports/ trials comprised 28 lines and two high-yielding check varieties arranged in six blocks, during the 2013-2014 to 2018-2019 crop cycles in six managed environments including: (i) Stage 2 irrigated-bed planting (S2 irrigated-BP) environment-In this environment, GY potential was evaluated for lines planted on raised beds during the optimum planting time (the third week of November) and they received an optimum irrigation of about 500 mm of water in total from five irrigations. The GY evaluation plot size was 4.8 m 2 and the lines were sown in three rows over each of the two beds that were 80 cm wide.
(ii) Stage 2 irrigated-flat planting (S2 irrigated-FP) environment-In this environment, GY potential was evaluated for lines planted on flat seed beds during the optimum planting time (the third week of November) and they received an optimum irrigation of about 500 mm of water in total from five or more irrigations. The GY evaluation plot size was 5.46 m 2 , and the lines were sown in six rows that were 18 cm apart and 4.2 m in length.
(iii) Stage 2 moderate-drought stress (S2 moderate-drought) environment-In this environment, the GY under moderate-drought stress was evaluated for lines planted on raised beds during the optimum planting time (the third week of November) and they received a reduced irrigation of about 200 mm of water in total from two irrigations. The GY evaluation plot size, the number of rows and beds were similar to the S2 irrigated-BP environment.
(iv) Stage 2 severe-drought stress (S2 severe-drought) environment-In this environment, the GY under severe-drought stress was evaluated for lines planted on flat seed beds during the optimum planting time (the third week of November) and they received water through drip irrigation to adjust the total available soil water to about 180 mm. The GY evaluation plot size was 5.85 m 2 , and the lines were sown in six rows that were 18 cm apart and 4.5 m in length.
(v) Stage 2 early-sown heat stress (S2 early-heat) environment-In this environment, the GY under heat stress during the juvenile growth stage was evaluated for lines planted on raised beds about 30 days before the optimum planting time (around the 20th of October) and they received an optimum irrigation of about 500 mm of water in total from five irrigations. The Stage 2 panels were evaluated in the early-heat environ-

Stage 3 GY testing panels
These trials comprised 280 lines in the third year of GY testing in Obregon that were selected from the lines in Stage 2 for high GY in different environments, good resistance to leaf rust, stem rust, stripe rust, Fusarium head blight, Septoria tritici blotch and spot blotch, acceptable agronomic type, phenology and end-use quality. The lines were also sown in three replications in an alpha-lattice design (10 trials, with each trial comprising 28 lines and two high-yielding check varieties in six blocks), during the 2014-2015 to 2017-2018 crop cycles in three managed environments similar to that in Stage 2, including the irrigated-bed planting (S3 irrigated-BP), severe-drought stress (S3 severe-drought) and late-sown heat stress (S3 late-heat) environments.
After  (4)  Best linear unbiased estimates and statistical analysis of the grain yield data. The best linear unbiased estimates (BLUEs) for GY in each of the panels and environments for Stages 1, 2 and 3 of yield testing and the SABWGPYTs were calculated using the ASREML statistical package 54 , using the following mixed model: where y ijkl was the observed GY, μ was the overall mean, g i was the fixed effect of the genotype, t j was the random effect of the trial that was independent and identically distributed (IID) ( t j ∼ N 0, σ 2 t ), r k(j) was the random effect of the replicate within the trial that was IID (r k(j) ∼ N 0, σ 2 r ) , b l(jk) was the random effect of the incomplete block within the trial and the replicate that was IID ( b m(jk) ∼ N 0, σ 2 b ) and ε ijkl was the residual that was IID (ε ijkl ∼ N 0, σ 2 ε ). We also obtained the GY BLUEs for the 1,092 lines that were evaluated in both Stages 1 and 2 using the random effect of the year in Eq. (1), resulting in GY BLUEs for the lines evaluated in the following cycles: S1 1213 and S2 1314, S1 1314 and S2 1415, S1 1415 and S2 1516, S1 1516 and S2 1617, S1 1617 and S2 1718, S1 1718 and S2 1819. In addition, for the combined datasets comprising all the lines in the Stage 1-3 trials, the SABWG-PYTs and the ESWYTs, we obtained the GY BLUEs across years with the random effect of the year included in Eq. (1). The GY BLUEs within each panel and environment were used to calculate the maximum, mean, median, minimum, range, standard deviation, standard error of the mean and variance of GY in the different datasets. We also obtained the percentage increase in GY across years in the Stage 1-3 trials, SABWGPYTs and ESWYTs using the initial year as the base year.
Genotyping. The genotyping-by-sequencing (GBS) method 55 was used to genotype all the lines used in this study and the single nucleotide polymorphisms (SNP) were called using the TASSEL (Trait Analysis by aSSociation Evolution and Linkage) version 5 GBS pipeline 56 .The discovery of SNPs was done using a minor allele frequency of 0.01, resulting in 13,082,477 GBS tags that were aligned to RefSeq v1.0 using Bowtie2 57 with an overall at an alignment rate of 68.98%. The SNPs were then filtered for Fisher's exact test (p < 0.001), inbred coefficient (> 80%) and Chi Square (expected inbreeding of 96%), and we obtained 89,863 SNPs that passed at least one of these filters. These markers were then filtered in each individual panel and in the combined panels and markers with greater than 60% missing data, less than 5% minor allele frequency and greater than 10% heterozygosity were removed. Similarly, the lines with greater than 50% missing data were removed, and the total number of markers and lines used are given in Table S3. Marker imputation was done using Beagle version 4.1 with default settings, except for the 'err' parameter which was set to 0.1 and the 'number of iterations' which was set to 1,055 58 .
For 99 datasets (except the large S1 irrigated-BP combined dataset), we performed GWAS using the mixed linear model 59 in TASSEL version 5, where population structure was used as a fixed effect and the kinship or the degree of relatedness between the lines was used as a random effect. We used the first two principal components 60 to account for the population structure and the pedigree-relationships among the lines to account for kinship. The mixed linear model was run with the optimum level of compression and the population parameters previously determined 61 options. In the S1 irrigated-BP combined dataset that had 50,283 lines with good genotyping data and was computationally demanding, we evaluated the Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK) method, that is proven to result in higher statistical power and computing efficiency in large datasets 53 . We implemented the BLINK algorithm in the 'BLINK-R' package and accounted for population structure using the first two principal components.
To correct for multiple testing and to identify significant markers, we used the Bonferroni correction for multiple testing with an α level of 0.01 for the seven large S1 datasets and a relaxed α level of 0.20 for all the other datasets. The p-values for each marker were obtained in the different datasets and the CMplot 'R' package 62 was (1) y ijkl = µ + g i + t j + r k(j) + b l(jk) + ε ijkl Scientific Reports | (2021) 11:5254 | https://doi.org/10.1038/s41598-021-84308-4 www.nature.com/scientificreports/ used for generating the Manhattan plots. From the markers that were significant after correcting for multiple testing, we obtained the consistent markers that were significant in more than one dataset and their genetic positions on the POPSEQ map 23 . We then obtained 159 markers that were significant in greater than seven datasets, created a reference map aligned to the RefSeq v.1.0 and visualized it using Phenogram (http://visua lizat ion. ritch ielab .org/pheno grams /plot). Since several significant markers were in high LD with nearby markers, we calculated the pairwise R 2 (a measure of LD) between the markers using all the 50,283 lines and used two criteria (pairwise R 2 greater than 0.9 and the p-value for the existence of LD equal to zero) to designate markers into an LD-block. However, it should be noted that several LD-blocks could be in LD with the same QTL. The LDblocks that overlapped between the irrigated-BP environment and the other environments were visualized using the 'R' package, 'VennDiagram' 63 . We also analyzed the additive effects of all the markers from the mixed linear model and obtained the maximum additive effect of a marker in the LD-block, considering only the individual panels, where the additive effect was the effect within an environment. Finally, the position of each significant marker was compared to previously reported markers for GY, GY components and phenology, whose positions were available in the RefSeq v1.0.
Genomic profiling of lines for grain yield favorable alleles. For the 801 consistent GY-associated markers identified in this study, we profiled 73,142 wheat breeding lines, comprising 62,581 lines from Stage 1 of yield testing developed during 2013-2019, 961 lines from Stage 2 evaluated in the 2013-2014 cycle and 9,600 lines from other CIMMYT populations that were genotyped. For each marker, we obtained the favorable allele (FA) or the allele that had an increasing effect on GY in the significant datasets, the non-favorable allele or the allele that had a decreasing effect on GY in the significant datasets and the heterozygote. We then calculated the percentage of GY FAs in each line and also identified lines that had a high percentage of FAs (greater than 80%) at the non-missing markers and also high GY (over 8.2 t/ha). To visualize the GY FA composition of lines, we took a subset of the consistently significant markers (68 markers that were significant in more than 10 datasets), plotted a heatmap of the alleles and clustered the lines based on the presence of FAs using the 'R' package gplots 64 . Furthermore, to understand the percentage of lines with GY FAs at each of the 801 consistent markers, we plotted the percentage of lines with FAs at each marker in the different homeologous chromosomes using the 'R' package 'ggplot2' 65 .

Data availability
The phenotyping data for the lines used in this study are available in Supplementary www.nature.com/scientificreports/