Molecular marker dissection of stem rust resistance in Nebraska bread wheat germplasm

Stem rust (caused by Puccinia graminis f. sp. tritici) is a major disease of wheat. To understand the genetic basis of stem rust resistance in Nebraska winter wheat, a set of 330 genotypes representing two nurseries (DUP2015 and TRP2015) were evaluated for resistance to a Nebraska stem rust race (QFCSC) in two replications. The TRP2015 nursery was also evaluated for its resistance to an additional 13 stem rust races. The analysis of variance revealed significant variation among genotypes in both populations for stem rust resistance. Nine stem rust genes, Sr6, Sr31, Sr1RSAmigo, Sr24, Sr36, SrTmp, Sr7b, Sr9b, and Sr38, were expected and genotyped using gene-specific markers. The results of genetic analysis confirmed the presence of seven stem rust resistance genes. One genotype (NE15680) contained target alleles for five stem rust resistance genes and had a high level of stem rust resistance against different races. Single marker analysis indicated that Sr24 and Sr38 were highly significantly associated with stem rust resistance in the DUP2015 and TRP2015 nurseries, respectively. Linkage disequilibrium analysis identified the presence of 17 SNPs in high linkage with the Sr38-specific marker. These SNPs potentially tagging the Sr38 gene could be used in marker-assisted selection after validating them in additional genetic backgrounds.

The first step in pyramiding stem rust resistance genes is to identify the different resistance genes in elite genotypes that may be used as parents to create new pyramids in new cultivars 7 . The gene-for-gene theory is very helpful to postulate the presence of seedling genes. Genes can be postulated based on the infection type (ITs) of the different stem rust resistance genes to known stem rust races. To confirm the presence of the postulated genes in the tested nurseries, DNA markers tightly linked to those genes could be used.
Marker-assisted selection (MAS) provides many benefits in plant breeding 8,9 . One of the most important benefits is that markers are highly heritable and can be screened at the seedling stage. Thus, plant breeders can select resistant plants without phenotypic screening. In addition, MAS is very helpful in identifying the tagged resistance genes which have been pyramided in one genotype 10 .
The objectives of this study were to (1) screen a set of 330 Nebraska winter wheat genotypes for resistance to stem rust, (2) identify the possible genes and gene pyramids in the elite genotypes, and (3) select genotypes with pyramided stem rust resistance genes to be used in breeding programs.

Materials and Methods
plant materials. Two different nurseries were used in this study: DUP2015 nursery which consists of 270 diverse genotypes and TRP2015 nursery which contains 60 diverse genotypes. The structure of each nursery was previously described 11,12 . Basically, lines in the DUP (preliminary yield trial) are advanced to the TRP (advanced yield trial) the following year based upon agronomic performance, disease resistance, and end-use quality. Briefly, both nurseries are part of the Nebraska Wheat Improvement Program where the selection is routinely done using stem rust resistance as one of the criteria.
Stem rust inoculation and screening. The DUP2015 nursery (270 genotypes) was evaluated for the predominant race in North America, QFCSC, in Nebraska in two replications (one at Lincoln, NE and one at USDA-ARS, at Manhattan, KS). The TRP2015 nursery (60 genotypes) was evaluated for resistance to races QFCSC, QTHJC, MCCFC, RCRSC, RKRQC, TPMKC, TTTTF, GFMNC, QCCSM and TTKSK (commonly known as Ug99) at the USDA Cereal Disease Laboratory at St. Paul, MN in one replication. Because race TTKSK is one of the races in the Ug99 race group, the resistant or heterogenous genotypes to this race were selected and evaluated again using three additional variants in the Ug99 race group; TTKST (with added virulence to Sr24), TTTSK (with added virulence to Sr36), and TTKTT (with added virulence to Sr24 and SrTmp) 13 , as well as non-Ug99 races TKTTF and TRTTF that possess significant virulence combinations 14,15 . Testing with the Ug99 races was done to determine if there are resistant lines should those races be found in Nebraska. In addition, the TRP2015 nursery was evaluated against race QFCSC in a second replication at Lincoln, NE. This approach of screening the DUP2015 to the predominant race of stem rust in North America and then to multiple races in the TRP2015 was to save labor and resources. In the DUP2015, our goal was to make sure lines have stem rust resistance. In the TRP2015, there are fewer lines, so it is easier to test with multiple races of stem rust. All the evaluations were performed at the seedling stage using the method of Jin and Singh 16 with some modifications as described in Mourad et al. 11 . Rust reactions were scored using a 0-4 infection type scale developed by Stakman et al. 17 . Genotypes with scores of 0 to 2 were considered as resistant, and 3 to 4 as susceptible.
In order to identify possible genes controlling the resistance in the tested populations, gene postulation was performed. The TRP2015 population was postulated based on the IT of eleven different stem rust races, avirulence/virulence formula for these races and pedigree information. The phenotypic data of the DUP2015 population was compared to the IT of the different stem rust genes at the seedling stage 18 . Pedigree information of this population was used to confirm the possible presence of the postulated genes.
Statistical analysis of stem rust resistance. The 0-4 scale 17 was converted to a linear scale of 0 to 9 7,11,19 for statistical analysis. Heterogeneous genotypes that had resistant plants were grouped with the resistant lines because resistant lines could be selected from them. The analysis of variance for stem rust resistance values was done using R software 20 .
DNA extraction and polymerase chain reaction (PCR) conditions. DNA was extracted for genotyping-by-sequencing (GBS), simple sequence-repeat (SSR), or sequence-tagged site (STS) markers that link to known rust resistance genes 11 . Based on the pedigrees of the tested genotypes, some stem rust resistance genes were expected in some genotypes and SSR and STS markers for these genes were screened for those genotypes. Genotyping for Sr24, Sr38, Sr31, and Sr1RS Amigo was done at the USDA-ARS lab, Manhattan, KS. Polymerase chain reaction (PCR) amplifications for SSR and STS markers for these genes were performed in a Tetrad Peltier DNA Engine (Bio-Rad Lab, Hercules, CA, USA) with a 12 µl PCR mixture containing 1.2 µl 10x PCR buffer (Bioline, Taunton, MA, USA), 2.5 mM MgCl 2 , 200 µM of each dNTP, 50 nM forward tailed primer, 250 nM reverse primer, 200 nM M13 fluorescent-dye-labeled primer, 0.6 U Taq DNA polymerase, and about 60 ng template DNA. A touchdown PCR program was used for PCR amplification. Briefly, the reaction was incubated at 95 °C for 5 min then continued for 5 cycles of 1 min at 96 °C, 5 min at 68 °C with a decrease of 2 °C in each subsequent cycle, and 1 min at 72 °C. For another 5 cycles, the annealing temperature started at 58 °C for 2 min with a decrease of 2 °C for each subsequent cycle. Reactions then went through an additional 25 cycles of 1 min at 96 °C, 1 min at 50 °C, and 1 min at 72 °C with a final extension at 72 °C for 5 min.
For the remaining expected genes (SrTmp and Sr36), genotyping was done at the University of Nebraska-Lincoln. Polymerase chain reaction (PCR) for the available SSR markers (Supplementary Table 1 used to separate the SSR marker products. The differential line of each gene was included in the SSR genotyping to identify the target band size for each primer (Supplementary Table 2).
Gel scoring and statistical analysis. The amplified products of SSRs were scored using a visual score, ABI PRISM 3730 DNA Analyzer (Applied Biosystems, Foster City, CA, USA) and by using Gel Analyzer 2010 software (http://www.gelanalyzer.com/). Converted phenotypic data, (0-9 scale) and the genotypic data were used to perform single marker analysis (SMA). The analysis was done using SAS version 9 21 , following this model: where Y is the trait value, µ is the nursery mean, and f (marker) is a function of the molecular marker 22 . The phenotypic variation explained by each marker was calculated for the markers that showed a significant effect on rust resistance based on the SMA. Box plots showing stem rust scores for each of the alleles of the significant markers were created using R package 'ggplot2' 23 . Genotypes having outlier values in each group were excluded and the SMA was redone in order to confirm the significant effect of each marker.

Genome-wide association analysis (GWAS) and linkage disequilibrium (LD).
In the DUP2015 nursery, the association between markers (35,128 SNPs) and stem rust data from the tested genotypes (270 genotypes) was analyzed using TASSEL 5.0 software 24 after removing heterozygous SNPs. Due to the presence of population structure 11 , two models were tested, mixed linear model + kinship (MLM + K) and mixed linear model + kinship + Q-matrix (MLM + K + Q-matrix). The marker-trait association was tested against Bonferroni corrections and false discovery rate (FDR) at a significance level of 5% in both models as well as compared with α = 0.05 significance. As many genes were expected to exist in these nurseries, we also looked for the association of minor genes controlling the resistance. This was accomplished using Settlement of MLM Under Progressively Exclusive Relationship (SUPER) method employed by GAPIT-R package 25 .
In the TRP2015 nursery, the number of genotypes was too low for GWAS analysis (60 genotypes). To identify SNP markers that are significantly associated with significant resistance genes, linkage disequilibrium (r 2 ) among the SNPs located on the targeted chromosome and gene-specific SSR markers for those significant genes were calculated using TASSEL software 24 . Values of LD (r 2 ) were visualized in a heatmap generated by 'LDheatmap' R package 26 . After detecting SNP markers which had high LD with the resistance-gene-specific marker, SMA between these SNPs and stem rust resistance was done to confirm the association between these SNPs and the resistance gene using SAS software 21 based on the same SMA model described previously. The allele effect and targeted allele of each SNP were calculated using TASSEL 24 . Positive values indicated genes associated with the marker decreased stem rust resistance (increased susceptibility), while negative values indicated genes associated with the marker allele for increased resistance. candidate genes and gene annotation. To further investigate the association of the significant SNPs and stem rust resistance, gene models underlying these SNPs were identified using the reference genome assembly (IWGSC REF Seq v.1.0). The functional annotation of the detected gene models was investigated using the same reference genome. The expression of gene models at different development stages was compared using a wheat expression database (http://www.wheat-expression.com/) to provide additional understanding of these results.
Key message. Identifying genotypes carrying many stem rust resistance genes is helpful to produce a wide range of resistance. We identified five genotypes carrying four genes that could be used in breeding programs.

Results and Discussion
Genetic variation and pyramiding of stem rust resistance genes. The analysis of variance revealed highly significant differences among genotypes, but not between replications in both DUP2015 and TRP2015 nurseries 11 . Most of the DUP2015 nursery was resistant to race QFCSC with a percentage of 80% of the genotypes (Fig. 1), indicating the historic phenotypic screens and gene pyramiding for stem rust resistance in the breeding program was successful.
Many of the genotypes in the TRP2015 nursery were resistant to races QCCSM (82%), GFMNC (75%), RCRSC (72%), QFCSC (68%), MCCFC (68%), QTHJC (58%), TPMKC (50%), RKRQC (48%), TTKSK (43%), and TTTTF (33%) (Fig. 1). As TTKSK is one of the races in the Ug99 race group, we tested the 33 TTKSK-resistant genotypes against other races in the Ug99 race group, including TTTSK, TTKST, and TTKTT and found 23, 19 and seven genotypes were, respectively, resistant to these three races. Of the 39 genotypes that were evaluated using two non-Ug99 races, 17 and 14 genotypes were resistant to TKTTF and TRTTF, respectively. The high percentages of resistant genotypes to race QFCSC (80 and 68% in the DUP2015 and TRP2015, respectively) were expected in both nurseries due to routine evaluation and effective selection of advanced breeding lines for resistance to the different races in collaboration with USDA-ARS. As a result of this process, a high percentage of resistant genotypes was found in the TRP2015 nursery. The 14 genotypes which are resistant to TKTTF, TRTTF, and the Ug99 races could be used as a good source for breeding to stem rust resistance outside the U.S. as well as in the U.S. incase new races appear. prediction and screening of stem rust resistance genes using molecular markers. The DUP2015 and TRP2015 nurseries were expected to contain many stem rust resistance genes such as Sr24, Sr31, Sr6, Sr7b, Sr9b, Sr1RS Amigo , SrTmp, Sr36, and Sr38. This expectation was determined based on the presence of these genes in the genetic background of the tested genotypes. For example, SrTmp was reported in "Goodstreak" in addition to Sr6 27-29 ; Sr7b was reported in "Gage" 19 ; Sr36 and Sr9b were present in some northern USA cultivars 29,30 . The presence of these genes in the tested genotypes seems to be logical as those genes were widely used to improve wheat www.nature.com/scientificreports www.nature.com/scientificreports/ for stem rust resistance in the USA [31][32][33] . Two different SSR markers were used to screen the two nurseries for Sr6 and the target alleles associated with this gene were found in many genotypes of those nurseries 11 . Moreover, Sr6 was a major gene identified in the two populations [11], therefore, we focused on the other eight genes that were apparently less frequent in the current study.
To confirm the presence of the expected genes in the tested nurseries we compared the infection type (IT) of the genotypes with the widely accepted IT of each gene illustrated by McIntosh et al. 34 , which has been reported as a fast and effective approach to determine stem rust resistance genes 35 . However, this may not always give an accurate prediction of a gene due to the epistatic interaction between the different resistance genes. Therefore, it was necessary to use DNA markers to further investigate the presence of the expected resistant genes. Using molecular markers for specific stem rust resistance genes should be done side by side with gene postulation in order to confirm the presence of the resistance genes. For this purpose, the available specific SSR markers for each gene were used (Supplementary Table 1).
Genetic analysis using gene-specific markers. As previously mentioned, nine stem rust genes were expected in both nurseries. Sr6 gene was previously described 11 . Two expected genes (Sr9b, and Sr7b) were not included in this study due to the lack of good DNA markers for them. Markers for the remaining genes (Sr24, Sr31, Sr1RS Amigo , SrTmp, Sr36, and Sr38) were analyzed as follows.
Sr36 gene. Among the four SSR markers used to predict the presence of the Sr36 gene in the tested nurseries (Supplementary Table 1), only two markers (Xwmc477 and Xgwm319) showed clear polymorphisms. Based on the positive control, ISr36, the target allele band size was 176 and 170 bp for Xwmc477 and Xgwm319, respectively. In the DUP2015 nursery, 12 genotypes (4.4%) contained the target band of Xwmc477 while four genotypes (1.5%) had the target allele of Xgwm319. In the TRP2015 nursery, 5.0 and 25% had the target alleles of Xwmc477 and Xgwm319, respectively (Fig. 2a). Two biparental populations were previously tested for Sr36 using the two SSR markers 34 . Xwmc477 co-segregated with Sr36 in both populations while Xgwm319 co-segregated with Sr36  www.nature.com/scientificreports www.nature.com/scientificreports/ in one of the population and tightly linked to this gene at 0.9 cM distance in the other population 36 . In the current study, all the genotypes with Xwmc477 target allele had IT of ;0 that conformed to the expected IT of Sr36 gene. However, some of the genotypes with the target allele of Xgwm319 showed ITs were not in accordance with expected Sr36 IT (higher IT) (Supplementary Tables 3 and 4). Xwmc477 appears to be a better marker than Xgwm319 for detecting the presence of Sr36. However, by comparing the responses of the genotypes in the TRP2015 nursery to the different stem rust races, we found that some genotypes with Sr36 marker alleles did not show the expected ITs for some races. For example, Sr36 should exhibit IT of 0 or 0; to QFCSC and other Q races as well as to TTKSK and TTKTT. However, the genotypes carrying Sr36 markers showed higher ITs to these races (Supplementary Table 5). Therefore, the discrepancy between Sr36 marker data and phenotypic responses to different races remains to be elucidated. Such a discrepancy between the phenotypic response and marker data was found previously in some genotypes which confirms our conclusion 7 .
Sr38 gene. The presence of Sr38 was tested by the STS marker VENTRIUP-LN2 37 . The target allele band size of this STS marker is 259 bp in ISr38. In the DUP2015 nursery, 74 genotypes (27.4%) had the target band associated with the Sr38 gene and 19 genotypes (31.7%) in the TRP2015 had the target band (Fig. 2a). All the genotypes with the Sr38 gene had the expected IT for this gene (; to ;13 Supplementary Tables 3 and 4). In addition, the TRP2015 nursery showed IT data conforming the postulation for the presence of Sr38 predicted by marker analysis, except for one genotype which showed highly susceptible responses to TMPKC and QTHJC races. Based on these results, we concluded that VENTRIUP-LN2 marker is a powerful marker in predicting the presence of this gene. Sr38 is a very important gene as it is located on a translocation and tightly linked with Lr37 resistance gene for resistance to leaf rust (caused by P. triticina Eriks) and Yr17 for resistance to stripe rust (caused by P. striiformis Westend. f. sp. tritici Erikss) 4 . Those genotypes with Sr38 gene can be used in further breeding for resistance to multiple rust diseases.
Sr31 and Sr1RS Amigo genes. SSR marker Xstm120 was used to detect the presence of the rye chromosome arm (1RS) in the tested materials. 1RS contains important resistance genes to stem rust and powdery mildew (incited by Blumeria graminis D.C. f. sp. tritici) 34 in addition to high yield traits 31,32 . In this study, 13.7% of the DUP2015 genotypes (37 genotypes) carry the 1BL.1RS translocation and contains Sr31. In the TRP2015 nursery, 6.7% of the genotypes (four genotypes) showed the presence of 1BL.1RS. Only a few genotypes had the 1AL.1RS translocation with Sr1RS Amigo gene in both nurseries (three genotypes in the DUP2015 and one genotype in the TRP2015) (Fig. 2a). Most of the TRP2015 genotypes with the target band of Xstm120 had ITs that agreed with the presence of Sr31 and Sr1RS Amigo for the different stem rust races, which confirms that this marker is a useful marker for detecting the presence of the two stem rust resistance genes.
Sr24 gene. The STS marker Sr24#12 was used to predict the presence of the Sr24 gene in both nurseries. This marker has been reported to be in a complete linkage with Sr24 and hence it is effective in detecting the presence of this gene 38,39 . Based on the positive control isoline, ISr24, the target allele size of this marker is 500 bp. In the DUP2015 nursery, 36.4% of the genotypes (98 genotypes) contained the Sr24#12 target allele whereas in the TRP2015, 21.7% of the genotypes (13 genotypes) had the Sr24#12 target allele (Fig. 2a). Almost all the tested genotypes had ITs that were consistent with the presence of Sr24 predicted by the marker (Supplementary Table 3 and 4). In addition, all the TRP2015 genotypes which had the target allele of the marker displayed the expected ITs for the presence of Sr24 gene, except for two genotypes which showed a degree of susceptibility against some races (Supplementary Table 5). Based on our results and previous studies, we can conclude that the Sr24#12 STS marker is a useful marker for detecting the Sr24 gene.
SrTmp gene. Three SSR markers were used to detect the presence of SrTmp gene in this study (Supplementary  Table 1), but only marker Xgpwm5182 showed a clear polymorphism. Based on the positive control isoline, ISrTmp, the target allele size of this marker is 174 bp. Using this marker, only 10 (3.7%) and seven (11.7%) genotypes were identified to carry the target band in the DUP2015 and TRP2015 nurseries, respectively (Fig. 2a). Xgpwm5182 has been reported as a possible marker to detect the presence of SrTmp gene 40 . However, due to the large distance between the marker and SrTmp (1.8 cM) and the presence of another resistance gene in the same chromosomal position 41 , Xgpwm5182 may not be predictive for SrTmp as was expected and more diagnostic markers are needed for this gene. Goodstreak in the TRP2015 nursery was expected to carry SrTmp in addition to Sr6 28,29 , however, it did not have the target band of this marker (Supplementary Table 4). In addtition, only two out of the seven genotypes with the tragetted marker band had ITs in agreement with the expected response to the different stem rust races. Therefore, this marker appears to be ineffective in predicting the presence of SrTmp gene. This contradiction confirms that this marker did not have any diagnostic value to detect this gene.
Single marker analysis and gene pyramiding. After screening gene-specific markers for all the expected genes in the tested nurseries, we found that the number of target alleles per genotype ranged from zero to five in the DUP2015 and from zero to four in the TRP2015, respectively (Fig. 2b and Supplementary Tables 3 and 4). Among the resistant genotypes identified phenotypically, 24 and eight genotypes did not have any target allele of the screened markers for the known resistance genes in the DUP2015 and TRP2015 suggesting that these genotypes might carry other stem rust resistance genes than those we screened with molecular markers. Based on the marker data, one, three, 29, and 62 genotypes contained five, four, three, and two stem rust resistance genes in the DUP2015 nursery; and two, five, and 22 genotypes contained four, three, and two stem rust resistance genes in the TRP2015 nursery. The presence of genotypes with more than one stem rust resistance gene was expected because the intermating between resistant genotypes has been a routine in the Nebraska Wheat Improvement Program. As this program has a long history in Nebraska, many genes have been inadvertently pyramided in www.nature.com/scientificreports www.nature.com/scientificreports/ its germplasm. Such genotypes with many stem rust resistance genes could be selected for excellent stem rust resistance, and potentially, both improved stripe rust and leaf rust resistance due to the existence of genes for resistance to multiple rusts including Sr38/Lr37/Yr17, Sr24/Lr24, and Sr31/Lr26/Yr9. Genotype NE15680 from the DUP2015 nursery and genotype NE14575 from the TRP2015 nursery are good examples of such genotypes (Supplementary Tables 3 and 4). Crossing between these two genotypes could be useful in developing wheat cultivars with at least five resistance genes (Sr24, Sr31, Sr36, Sr6, and Sr38) and hence more durable resistance to multiple stem rust races.
Due to the existence of many stem rust resistance genes in the germplasm of the tested nurseries, it is worth to identify genes with the major effect on the resistance using single marker analysis ( Table 1). The analysis was done using the phenotyoic data of QFCSC race only as it is the most common race in Nebraska. In both nurseries, Sr6 was highly significantly associated with the resistance 11 . In addition to this marker, Sr24#12 (Sr24) and VENTRIUP-LN2 (Sr38) had highly significant associations with stem rust resistance in the DUP2015 and TRP2015 nurseries, respectively with p values of 0.003 for both markers (Table 1). These two genes were the most frequent genes after Sr6 in the DUP2015 (98 genotypes) and TRP2015 (19 genotypes). The phenotypic variation explained by marker (PVE) was calculated for the significant markers. Marker Sr24#12 (Sr24) had a PVE of 3% in the DUP2015 nursery, while, marker VENTRIUP-LN2 (Sr38) had a PVE of 15% in the TRP2015 nursery. For each significant gene, the phenotypic differences between the two genotypic groups with contrasting marker alleles (1 vs 0) in both nurseries is presented in Fig. 3. The average rust rating of the genotypes having Sr38 gene in the TRP2015 nursery was in agreement with the linearized IT ("1"). The average rust rating of the genotypes having Sr24 in the DUP2015 was lower than expected linearized IT ("4"). The average rust rating of genotypes without the Sr24 and Sr38 genes in both populations were also lower than expected IT for susceptible genotypes (5 or higher), indicating the presence of other resistance genes in the genotypes without Sr24 and Sr38.
Association mapping for stem rust resistance. Structure analysis showed the presence of structure (k = 2) in the DUP2015 nursery 11 . Therefore, two models, MLM + k and MLM + k + Q-matrix were used in the GWAS. A set of 32 SNPs were previously found to be associated with the resistance on chromosome 2D and linked with Sr6 gene 11 . Based on the SMA, Sr24 (on chromosome 3D) was significantly associated with the resistance in the DUP2015 nursery. Therefore, it was worth to identify the significant SNPs associated with this gene using GWAS. However, the GWAS results did not reveal any significant SNPs located on that chromosome based  Table 1. Single marker analysis for stem rust resistance in the DUP2015 and TRP2015 nurseries. a Phenotypic variation explained by marker. www.nature.com/scientificreports www.nature.com/scientificreports/ on Bonferroni corrections 5%, FDR 5%, or at α = 0.05 significance level. The GWAS was performed also using SUPER software which increases the statistical power of the test 25,42 . However, the same result was obtained. The absence of significant SNPs could be explained by the lower PVE value (3.00%) of this gene in the DUP2015 nursery based on SMA ( Table 1).
Because of the low number of the TRP2015's genotypes (60 genotypes), GWAS was not applied as it was recommended to use a minimum of 100-500 individuals to detect a true marker-trait association 43 . Alternatively, in order to identify SNPs significantly associated with the resistance gene (Sr38), LD analysis was done between all SNPs located on chromosome 2A (1601 SNPs) and the specific STS marker, VENTRIUP-LN2 (Table 2 and Fig. 4). As a result, 17 SNPs were found to be linked to the STS marker with R 2 values ranging from 0.71 for SNP S2A_2708784 to complete LD of 1.00 for SNP S2A_2367215 and S2A_2708760. The high LD between the STS marker and SNPs suggests that these SNPs would be excellent markers to predict the presence of this gene.
To confirm the association between these 17 SNPs and Sr38 gene, SMA was done between these SNPs and stem rust resistance (Table 2). Among the 17 SNPs, 15 SNPs were highly associated with the resistance at a p-values ranging from 0.008 for SNPs S2A_2800711, S2A_3965047, and S2A_3965054 to 0.001 for SNPs S2A_2367215 and S2A_2708784. The remaining two SNPs (S2A_710998 and S2A_2998843) were significantly associated with the resistance at p-values 0.016 and 0.018, respectively. The phenotypic variation explained by those markers (PVE) were high and ranged from 12.02% for S2A_710998 to 30.62% for SNP S2A_2367215 ( Table 2). The allele effect of each SNP marker was evaluated using stem rust linearized 0 to 9 score and it ranged from −1.64 for SNP S2A_710998. to −3.23 for SNP S2A_2708784. This result indicated that selecting for these significant SNPs could reduce the infection by more than 33%.

Genes underlying significant SNPs and their functional annotations.
To further understand the association between the significant SNPs associated with Sr38 and the stem rust resistance in the TRP2015 nursery, genes containing these SNPs were annotated. Among the 17 SNPs, nine were located within four genes; TraesCS2A01G004100.1 (one SNP), TraesCS2A01G005500.1 (two SNPs), TraesCS2A01G003700.1 (four SNPs) and TraesCS2A01G010200.1 (two SNPs). The functional annotation was known for only three models of the identified models and the fourth one has unknown functional annotation ( Table 3).
The expression data of the four identified gene models under control and disease conditions was reported (Fig. 5). Two gene models have more expression under disease conditions compared with controlled conditions at the seedling stage (TRIAE_CS42_2AS_TGACv1_112938_AA0348130.2 = TraesCS2A01G010200.1 and TRIAE_ CS42_2AS_TGACv1_114644_AA0368550.1 = TraesCS2A01G004100.1). The functional annotations of the first gene model was unknown. The second gene model (TRIAE_CS42_2AS_TGACv1_114644_AA0368550.1 = Trae sCS2A01G004100.1) has a functional annotation associated with disease resistance as it produced TIR-NBS-LRR, one of the disease resistance protein families (Table 3). This protein family was reported to stay in the cell membrane and helps the plant to recognize the pathogen that attacks it. Some genes from this family are required for the defense activation of some resistance genes like Pto (bacterial resistance gene in tomato) and Rpg1 (stem rust resistance gene in barley and rice) 44 . In our study, gene model TRIAE_CS42_2AS_TGACv1_114644_ AA0368550.1 was underlying SNPs in a complete LD with the STS marker, had high P.V.E., large allele effect, and  Table 3. Gene annotation of the four gene models underlying the significant SNPs and their probable function.  www.nature.com/scientificreports www.nature.com/scientificreports/ was highly significantly associated with the resistance (SNP_ID; S2A_2367215, S2A_2708760, and S2A_2708784). Based on these results, the identified SNPs could be converted to Kompetitive Allelic Specific PCR (KASP) markers and used for selecting the Sr38 gene. However, more studies are required to further investigate the association between these SNPs and the Sr38 gene. The LD analysis between SNPs and known specific DNA markers was very useful and could be an alternative way to identify candidate SNPs for specific genes.
In conclusion, the high percentage of resistant genotypes from the Nebraska Wheat Breeding Program provided evidence for the success of the program. In this program, many stem rust resistance genes have been incorporated and pyramided at various levels in some genotypes. Resistant genotypes carrying multiple stem rust genes could be selected and used in future breeding programs for resistance to a wide spectrum of pathotypes of stem rust, stripe rust, and leaf rust. SNP markers were not identified for the Sr38 gene previously. The identified 17 SNPs using LD could be a reliable source for marker-assisted selection (MAS) for this gene. However, more studies should be done to confirm the association between these SNPs and the gene.