Genome-wide association mapping for wheat blast resistance in CIMMYT’s international screening nurseries evaluated in Bolivia and Bangladesh

Wheat blast caused by the fungus Magnaporthe oryzae pathotype Triticum (MoT) is an emerging threat to wheat production. To identify genomic regions associated with blast resistance against MoT isolates in Bolivia and Bangladesh, we performed a large genome-wide association mapping study using 8607 observations on 1106 lines from the International Maize and Wheat Improvement Centre’s International Bread Wheat Screening Nurseries (IBWSNs) and Semi-Arid Wheat Screening Nurseries (SAWSNs). We identified 36 significant markers on chromosomes 2AS, 3BL, 4AL and 7BL with consistent effects across panels or site-years, including 20 markers that were significant in all the 49 datasets and tagged the 2NS translocation from Aegilops ventricosa. The mean blast index of lines with and without the 2NS translocation was 2.7 ± 4.5 and 53.3 ± 15.9, respectively, that substantiates its strong effect on blast resistance. Furthermore, we fingerprinted a large panel of 4143 lines for the 2NS translocation that provided excellent insights into its frequency over years and indicated its presence in 94.1 and 93.7% of lines in the 2019 IBWSN and SAWSN, respectively. Overall, this study reinforces the effectiveness of the 2NS translocation for blast resistance and emphasizes the urgent need to identify novel non-2NS sources of blast resistance.


Results
Blast phenotyping data distributions and correlations. The datasets used in this study include lines from the following nurseries that were evaluated in two planting dates (first planting, FP and second planting, SP): (i) the 50 IBWSN and 35 SAWSN that were evaluated in Okinawa and Quirusillas during the 2018 cycle and in Jashore and Quirusillas during the 2019 cycle and (ii) the 51 IBWSN and 36 SAWSN that were evaluated in Jashore, Okinawa and Quirusillas during the 2019 cycle and in Quirusillas during the 2020 cycle (Supplementary Table S1). The distributions of blast indices indicated that on average 73.9 ± 13.3% of the lines across all the nurseries had blast indices of 0 and were highly resistant (Supplementary Table S2). While the Quirusillas 2018 SP 50 IBWSN dataset had the highest number of lines (91.1%) with a blast index of 0, the Jashore 2019 FP 35 SAWSN dataset had the least number of such lines (44.9%) ( Supplementary Fig. S1). Overall, across all the panels, 83.7% of lines had a mean blast index less than 10, among which 27.4% of lines had a mean blast index of zero ( Supplementary Fig. S1) . We also observed that the Okinawa 2019 environment had the highest mean blast index (12.8 ± 3.14), followed by Jashore 2019 (12.6 ± 3.5), Quirusillas 2019 (8.03 ± 1.8), Quirusillas 2018 (6 ± 1.1), Okinawa 2018 (4.8 ± 1.6) and Quirusillas 2020 (4.6 ± 0.9) environments. Considering the four nurseries, the 36 SAWSN had the highest mean blast index (10.1 ± 4), followed by the 35 SAWSN (9.2 ± 4.9), 51 IBWSN (8.5 ± 3.8) and the 50 IBWSN (6.9 ± 4.1) (Supplementary Table S2). The blast correlations across the different sites were analyzed (Supplementary Table S3) and we observed that Quirusillas and Okinawa had the highest mean correlations (0.78 ± 0.07), followed by Quirusillas and Jashore (0.61 ± 0.13) and Okinawa and Jashore (0.61 ± 0.12). Within sites, the average correlation across all the planting environments and years of evaluations were high in Okinawa (0.82 ± 0.09) and Quirusillas (0.78 ± 0.06), but moderate in Jashore (0.57 ± 0.1). There were no significant differences between the blast indices in the two plantings (at a p-value threshold of 0.001), except in the Jashore 2019 environment (p-value of 9.98E−6). The average narrow-sense heritability across the two planting times in the different environments was 0.72 ± 0.12, and it ranged between 0.49 and 0.87 (Supplementary Table S3). We also observed low average correlations of the blast indices with days to heading (0.1 ± 0.12, ranged between − 0.12 and 0.27) and height (0.01 ± 0.13, ranged between − 0. 35  www.nature.com/scientificreports/ population structure analysis. Moderate population structure was observed in all the nurseries using the first two principal components (Fig. 1). The percentage variation explained by the first two principal components were: 8.9% and 5.4% in the 50 IBWSN, 7% and 4.3% in the 35 SAWSN, 13.7% and 8.8% in the 51 IBWSN and 10.7% and 7.9% in the 36 SAWSN, respectively. The principal components were also used to partition the lines into clusters using the 'k-means' clustering approach (Supplementary Table S5       www.nature.com/scientificreports/ Markers associated with blast resistance in the 36 SAWSN. In the 36 SAWSN, markers 2A_16823170, 2A_16881506, 2A_18468495, 2A_1872142 and 2A_735436 on chromosome 2AS were the most significant markers (Fig. 5). Across the different datasets, 30 markers were significantly associated with blast resistance, including 28 markers on chromosome 2AS between 718,152 and 24,002,740 bps that were significant in all the datasets with an average AE of 21.1 ± 6.3 and explained on average 31.4 ± 11.4% of the blast variation.

Markers associated with blast resistance in the combined panel.
In the combined panel, the most significant markers in the different datasets included markers 2A_14418760, 2A_15449240, 2A_15962301, 2A_16629866, 2A_18468495, 2A_18495181 and 2A_19902461 on chromosome 2AS (Fig. 6, Supplementary  Fig. S3). Across the different environments, we identified 37 repeatable on chromosomes 2AS, 3AS, 3BL and 4AL. On chromosome 2AS, 28 markers between 718,152 and 24,002,740 bps were significant in all the datasets with an average AE of 19.9 ± 6.2 and explained on average 23.8 ± 7.3% of the blast variation. On chromosome 3AS, marker 3A_38834713 (35 cM) was significant in the Quirusillas FP and SP datasets with an average AE of 1. Considering the allelic effects at the 36 consistent markers and the mean blast indices of the 1,106 lines evaluated in the six environments (Fig. 8), the 2AS 718,152-24,002,740 bps region, markers UN_12522200 and UN_367803425 had large differences in the blast indices for the FA and the unfavorable allele across all the environments. In addition, markers 3B_128050543 and 3B_542493172 also had moderate differences in several environments, but none of the other markers showed consistent allelic effects across the different environments. We then compared the consensus FA at the markers on chromosome 2AS between 718,152 and 24,002,740 bps to the presence or absence of the 2NS translocation scored using the Kompetitive allele specific polymerase chain reaction marker CIMwMAS0004 [50][51][52] . The coincidence of the FAs in 98.8% of the lines (excluding the heterozygotes) with the presence of the 2NS translocation indicated clearly that the most consistent markers in this study are in the 2NS translocation (Supplementary Table S7).
To understand if the presence of FAs for markers other than the 2NS translocation in the lines with the 2NS translocation had an effect on the blast severity in each of the environments, we fitted a linear regression model with the blast severity as the response and the number of  www.nature.com/scientificreports/ 4.6% of the lines had mean blast indices between 10 and 30 and three lines had blast indices between 32.2 and 45.6. The mean blast indices in the lines with and without the 2NS translocation were 2.7 ± 4.5 and 53.3 ± 15.9, respectively. Among the 17 heterozygotes, the mean blast indices were zero in five lines, ranged between 0.2 and 18.7 in five lines and between 23.13 and 63.6 in seven lines. We also analyzed the progenies in 36 crosses with the 2NS translocation and more than five full-sibs per cross and observed that two families had a range greater than 20 in the blast indices including: (i) BECARD/AKURI/3/KINGBIRD #1//INQALAB 91*2/TUKURU/4/ BECARD/AKURI with mean blast indices ranging between 0 and 32.2 (ii) KACHU/BECARD//WBLL1*2/ BRAMBLING/3/ATTILA*2/PBW65//MURGA with mean blast indices ranging between 0.5 and 22.4. We then clustered the 915 lines with the 2NS translocation based on the alleles at all the 28 markers in the translocation (Supplementary Fig. S4). We observed that 761 of the 915 lines had FAs at all the non-missing markers in the translocation, but their blast indices ranged between 0 and 45.6. Furthermore, we also visualized the blast indices, FAs and the unfavorable alleles at the 28 markers tagging the 2NS translocation in a subset of 65 lines that had consensus 2NS FAs, but also had some unfavorable alleles and heterozygotes ( Supplementary Fig. S5). While the blast indices of these lines ranged between 0 and 29.2, there was no clear relationship between the presence of few unfavorable alleles or the heterozygote and the blast index.
Considering the 117 lines without FAs at the 2NS translocation, 62.5% of them had mean blast indices greater than 50, 31.3% of them had mean blast indices between 30 and 50, and seven lines had mean blast indices less than 30 (Supplementary Table S8 Table S9). We observed that in the IBWSNs, the percentage of lines with the 2NS translocation had increased 113.8% in seven years, from 44% in 2012 to 94.1% in 2019. Similarly, in the SAWSNs, the percentage of lines with the 2NS translocation had increased 524.3%, from 15% in 2012 to 93.7% in 2019 (Supplementary Fig. S6).

Discussion
In this study, 1,106 lines in the IBWSNs and SAWSNs were evaluated for wheat blast in Bolivia and Bangladesh, and we observed a high level of resistance, with 83.7% of them having mean blast indices less than 10. In the environments considered in this study, the planting time did not significantly affect the disease indices except in one dataset, in contrast to a previous observation 29 . In addition, traits like days to heading and height had low correlations with disease index which can be partly attributed to the low variability in the blast indices. The high correlations between blast indices in Bangladesh and Bolivia (mean of 0.61) is encouraging, indicating similarities in blast responses across these environments.
Multi-environment GWAS for blast resistance were performed providing valuable insights into the genetic architecture of resistance against different isolates of MoT in Bolivia and Bangladesh. The markers that were significantly associated with blast resistance in all the 49 datasets used were on the 2NS translocation and explained up to 71.8% of the phenotypic variation, clearly indicating the large and consistent effect of this region. The low mean blast indices (less than 10) in 95.03% of lines with the 2NS translocation and the high mean blast indices (greater than 30) in 93.8% of the lines without the 2NS translocation, in addition to the large difference in the mean blast index of lines with and without the 2NS translocation provide strong evidence to the high and consistent effect of the 2NS translocation on blast resistance 38,40 . In addition, we observed that a small percentage of lines (1.6%) were heterozygous at all or several markers at the 2NS translocation, implying that heterozygosity in this translocation is possible, despite reports that the 2A and the 2N chromosomes do not recombine 50 . However, we observed that only 31.3% of the lines with the 2NS translocation had a mean blast index of zero across all the environments, and the remaining lines had mean blast indices ranging up to 45.6. While this affirms the background-dependence and partial nature of the resistance 38 , we have also reported families with the 2NS translocation that clearly had a range in their mean blast indices, indicating the incomplete effect of the translocation in some families and the probable involvement of other genes with small effects additively contributing to high levels of resistance.
Besides the 2NS translocation and the two unaligned markers (UN_12522200 and UN_367803425) that had a high LD with the 2NS translocation and probably were in the same region, we observed only six markers on chromosomes 2AS, 3BL, 4AL and 7BL that were significant in more than one nursery or environment. However, none of them had significant effects across all the six environments in this study, besides marker 3B_542493172 that was 7 cM away from marker 3B_476296464 reported to be associated with blast 39 . In addition, the combined presence of FAs for all the six markers had a low effect on the blast indices in few datasets indicating that the 2NS translocation is the only consistent region associated with blast resistance. The paucity of non-2NS based blast resistance sources is clearly demonstrated in our study by that only one line without the 2NS translocation had mean blast index of zero across all the environments. This can be attributed to the fact that only 10.7% of the lines used in this study lacked the 2NS translocation, among which 93.8% had mean blast indices greater than 30 and were susceptible. www.nature.com/scientificreports/ We also fingerprinted a large panel of 4143 lines for the 2NS translocation that provided excellent insights into its frequency over years and indicated its presence in 94.1% and 93.7% of lines in the most recent IBWSN and SAWSN, respectively. This clearly indicates that future identification of novel non-2NS based resistance in the elite germplasm from CIMMYT might be challenging and evaluation of other genetic resources including synthetics, land races and exotic germplasm is essential to identify new sources of wheat blast resistant lines and genes. The remarkable increase in the proportion of lines with the 2NS translocation in the IBWSNs and SAWSNs (113.8 and 524.3%, respectively) over seven years is due to the association of the translocation with stripe rust (caused by Puccinia striiformis) resistance to races in Mexico and high grain yield 39 . However, the 2NS translocation has also been reported to be associated with lodging tolerance and resistance to stem rust caused by Puccinia graminis, leaf rust caused by Puccinia triticina, eyespot caused by Pseudocercosporella herpotrichoides, cereal cyst caused by Heterodera avenae, root knot caused by Meloidogyne spp. 39,53-57 making it a very valuable region in wheat breeding.
While 'Milan' was the original source of the 2NS translocation in the CIMMYT germplasm, it has increased in frequency through 'Milan' derived lines like Mutus (MILAN/S87230/4/BOW/NAC//VEE/3/BJY/COC) and Kachu (KAUZ//ALTAR 84/AOS/3/MILAN/KAUZ/4/VEE/KOEL) that were used also widely as parents. The presence of the 2NS translocation in a large number of recent IBWSNs and SAWSNs is promising for shortterm blast management and blast resistant varieties like 'BARI Gom 33′ with the 2NS translocation have been released 58 . However, the reliance on only one resistance locus with a large effect is not recommended as it leads to selection pressure on the MoT populations 4,59 and there are reports that the MoT isolates in Brazil 21 and Paraguay 60 have overcome the 2NS resistance. Hence, we would like to emphasize the importance of identifying other minor genes that can additively result in enhanced blast resistance and this strategy has been successfully used in breeding for durable resistance in other pathosystems such as the wheat-rust, where frequent breakdown of major genes occurs 61 . We also conclude that further research on identifying novel non-2NS sources of blast resistance is critical for diversifying and sustaining genetic resistance to wheat blast. In Jashore, Bangladesh, the nurseries were sown in late December 2018, with the same experimental design as in Bolivia. The inoculum was a mixture of six local MoT isolates, BHO17001, MEH17003, GOP17001.2, RAJ17001, CHU16001.3 and JES16001, that had shown high pathogenicity and high capacity of sporulation. Local varieties BARI Gom 26 and BARI Gom 33 were used as the susceptible and resistant check, respectively. The nurseries were surrounded by spreader rows of susceptible line BARI Gom 26, that were inoculated with MoT isolates 4-5 times between the seedling to the heading stage.

Methods
A misting system was set up in each nursery, operating from 8 a.m. to 7 p.m. in Bolivia and 9 a.m. to 5 p.m. in Bangladesh during the period of disease development, with 10-15 min per hour of water misting to keep a humid micro-environment that is conducive for wheat blast development. Blast evaluations were done 20-22 days after the first inoculation, when the susceptible checks had greater than 80% blast severity. The total and infected number of spikelets on ten spikes that were tagged at anthesis were then recorded. Blast incidence was calculated as the proportion of spikes with blast infection and the blast severity was calculated as the average percentage of spikelets that were infected. The incidence and severity values were then used to estimate the blast index using the formula: wheat blast index = incidence × severity. We also calculated a mean blast index across all the environments for each nursery. The blast indices for all the lines are given in Supplementary Table S1 and we also obtained the mean, median, minimum, range, standard deviation, standard error of the mean and variance of the blast indices. In addition, we obtained the Pearson's correlation between the blast indices in the different environments and the narrow-sense heritabilities across the planting times in each environment using the formula: where σ 2 g was the genetic variance among the lines calculated using the markers, σ 2 ε is the error variance, and nplantings is the number of planting dates. We estimated the genetic and residual variances using the average information-restricted maximum likelihood algorithm 62 in the 'R' package 'heritability' 63 . Genotyping data. All the 1106 lines in the four nurseries were genotyped using the genotyping-bysequencing method 64 and the marker polymorphisms were called using the TASSEL (Trait Analysis by aSSociation Evolution and Linkage) version 5 GBS pipeline 65 . Marker polymorphisms were discovered using a minor  Marker-trait association tests were done in TASSEL version 5 using the mixed linear model 69 with the optimum level of compression and the 'population parameters previously determined' option 70 . The first two principal components accounting for the population structure 71 were used as fixed effects and the kinship matrix among individuals estimated using the centered identity-by-state method 72 was used as the covariance matrix for the random genotypic effect in the mixed linear model. The principal components were then used to cluster the lines into five arbitrary clusters based on the k-means clustering approach in the Jmp statistical software (www. jmp.com). We also estimated the individual ancestry coefficients of the lines and the number of ancestral subpopulations using the 'R' package LEA (Landscape and Ecological Association studies) based on the value of the cross-entropy criterion 73 . The p-values, additive effects and percentage variation explained by each marker were obtained and Manhattan plots with the − log 10 p-values of the markers were plotted using the 'R' package CMplot 74 . The Bonferroni method for multiple testing correction at an α level of 0.20 was used to declare significance of the markers. The markers that were consistently associated with blast resistance in more than one dataset were identified and their genetic positions in the Popseq map 75 were obtained.
Genomic fingerprinting and frequencies of favorable alleles. The effects of the consistent markers significant in more than one dataset estimated from the mixed linear model were analyzed and the alleles that had a decreasing effect on the blast index (favorable alleles, FAs), an increasing effect on the blast index (unfavorable allele) and the heterozygote were identified. These alleles were then used to fingerprint the 1106 lines in all the nurseries to understand the allelic composition of lines for blast resistance associated markers. A heatmap with the fingerprinted markers was obtained with the 'R' package 'pheatmap' 76 . We also plotted the marker alleles in the 1106 lines with and without the 2NS translocation against their mean blast indices in Jashore 2019, Okinawa 2018, Quirusillas 2018 and Quirusillas 2019 using the 'R' package ggplot2 77 .

Frequencies of lines with the 2NS translocation in the IBWSNs and SAWSNs. To validate
whether the most consistent regions identified in this study referred to the 2NS translocation, we used the Kompetitive allele specific polymerase chain reaction co-dominant marker CIMwMAS0004 that is specific for the translocation 50

Data availability
The blast phenotyping data of 1106 lines evaluated in different environments is available in Supplementary  Table S1.