Introduction

Rice (Oryza sativa) is grown on more than 147 million hectares worldwide, primarily in countries with high human population densities1. Mounting constraints to rice production, which include increasing global temperatures, decreasing land spaces for production, depleting water supplies, as well as other abiotic and biotic stresses threaten the security of food supply. Of the biotic stresses affecting rice production, diseases pose a major threat. Rice blast caused by the fungus Magnaporthe oryzae is a principal disease of rice due to its wide geographical distribution and its destructiveness under favorable conditions2. Breeding for resistant cultivars containing major resistance (R) genes has been a key means to managing blast infection with approximately 100 R genes available for use against multiple races of the pathogen3,4,5. Even with this large number of R genes, loss of resistance, largely due to pathogen adaptation, has been well documented2,6,7.

The development of cultivars with more stable forms of blast resistance is highly desirable, prompting the exploration of additional strategies beyond the accumulation of R genes against specific pathogens. Durable resistance is associated with polygenic partial/quantitative resistance which typically shows no race specificity8,9,10,11. The candidate gene approach, which is based on the principle that genes with known functions in the traits of interest may reside in major loci9, has been used to map known genes to QTL of multiple traits, including disease resistance, in various rice mapping populations10,11,12,13,14. These CGs included resistance gene analogs and a number of defense response genes that code for chitinase (CHI), oxalate oxidase (OXO), germin-like proteins (GLP), peroxidase (POX), phenylalanine ammonia lyase (PAL), superoxide dismutase, 14-3-3 proteins, and thaumatin-like protein (PR5), which are associated with QTL conferring resistance to fungal and bacterial pathogens of rice as well as resistance to brown plant hopper15,16,17,18,19,20,21,22. This frame map has been a useful reference for selecting CGs involved in both pathogen recognition and general plant defense, and has also been important for analysis of mapping populations to improve resistance to rice blast10,13.

Simultaneous detection of major genes and/or QTL for blast resistance and their transfer to elite backgrounds can accelerate the process of varietal development23. Using an advanced backcross method, Moroberekan, a traditional rice variety from the Ivory Coast, was employed in hybridization to improve blast resistance in a popular drought-tolerant upland rice variety Vandana from Eastern India13. Seven QTL were identified when BC2F4 and BC3F4 populations were analyzed: two QTLs from Vandana and four from Moroberekan conferred resistance to leaf blast, and one QTL from Moroberekan reduced panicle blast severity.

In this paper, we developed cultivars with QTL-based resistance to rice blast by pyramiding different combinations of CGs associated with rice blast QTL while simultaneously selecting progenies with good agronomic traits for cultivar release. Our strategy involved intermating previously produced BC3F4 families from Vandana/Moroberekan crosses that showed partial resistance to rice blast13 to produce lines with different combinations of CGs. The intermated progenies were screened for resistance to rice blast at locations with high levels of disease and pathogen-diversity in India (Almora, Hazaribag and Ambikapur) and the Philippines (Cavinti and the IRRI blast nursery). Progenies exhibiting a good level of resistance to rice blast as well as desirable morphological and agronomic traits were advanced to the next generation. The advanced lines containing multiple CGs exhibited effective broad-spectrum resistance to rice blast at multiple locations, showing the potential of pyramiding CGs for disease resistance.

Results and discussion

Disease resistance of F4 lines from intermated BC3F4 of Vandana/Moroberekan

Moroberekan (IRTP 19187), a japonica cultivar with durable resistance to rice blast, was crossed to Vandana (IRTP 932), a popular drought-tolerant but blast-susceptible variety from India, and the population was advanced to BC3F3 and BC3F413. Fifteen lines from the BC3F4 were crossed in all pairwise combinations and then selfed until F6 (Fig. 1). At the F4 stage, lines were screened for field resistance to blast at five sites in India and the Philippines. A comparison of the mean trait value for reaction to rice blast showed that Vandana was consistently more susceptible than Moroberekan across all sites, while intermated lines exhibited a range of responses to seedling blast, from resistant to highly susceptible (2.5–8.5, SES scale). Measurement of Diseased Leaf Area (DLA) 7 weeks after sowing confirmed that Vandana was highly susceptible to blast (65% DLA) compared to Moroberekan (1.5% DLA) in Almora. Among the intermated lines, variations in DLA were also observed with scores reaching up to 70% but with a mean value of just over 12% (Table 1). A wide variation in panicle blast (PB) severity was observed in Almora, with SES ranging from 3.0 to 8.3. While Moroberekan was moderately resistant to PB (SES = 3.15), Vandana was highly susceptible, scoring an average rating of 8.3. The response to PB was more pronounced in Hazaribag, where the PB scores ranged from 0 to 34.5.

Figure 1

Breeding scheme describing the development of intermated lines from BC3F4 Vandana/Moroberekan families generated by Wu et al.13.

Table 1 Trait mean values for Vandana, Moroberekan and F4 intermated BC3F4 lines screened for resistance to rice blast at multi-location trials in India and the Philippines.

Agronomic performance under field blast conditions was recorded only in Almora. Vandana yielded poorly (5 g per plot) under heavy field blast, while intermated lines yielded from 5 to 170 g per plot (Table 1). Most trait means for blast resistance were more similar to Moroberekan than to Vandana. The progenies were moderately to completely resistant to blast and yielded better under blast conditions than either of the parents (Table 1). In Almora, DLA and yield under blast (r = −0.60***) and panicle blast severity and yield under blast (r = −0.81***) were negatively correlated, while the start of disease symptoms was positively correlated with yield under blast (r = 0.62***) (Table S5). Taken together, these data show that although the intermated lines were phenotypically most similar to Vandana, they were more resistant to blast and yielded better under blast conditions than Vandana.

The disease responses of the intermated lines were correlated across locations. DLA at Almora was positively correlated with lesion number (LN) scores in Hazaribag (r = 0.54**; Fig. S1), and LN in Hazaribag and Ambikapur were similarly correlated (r = 0.54**; Fig. S2). Additionally, seedling blast (SB) measurements performed at the IRRI blast nursery were well correlated with panicle blast in Almora (p < 0.01), as well as with seedling blast in Cavinti (p < 0.0001) (Table S5). Blast races at these locations vary, as shown by a coordinated monitoring program of the lineages of M. oryzae and virulence on a set of international differential varieties in the Philippines and in India24,25,26. Thus, the correlation in disease reactions of the BC3F4 intermated lines at multiple locations, i.e., their similar behavior, suggests that the quantitative resistance exhibited by these materials is race-nonspecific.

Molecular analyses of intermated Vandana/Moroberekan lines

Cluster analysis of BC3F4 families using genotypic data showed higher similarity to the Vandana background (> 85%) relative to Moroberekan13. To better understand the contribution of accumulated CGs in the intermated lines under this genetic background, we used a combination of CG-specific markers, CG-associated SSR markers, and SSR markers randomly dispersed throughout the rice genome. The identified CGs included genes associated with disease resistance QTL in rice as well as genes induced by fungal or bacterial pathogens in rice or other species (Table S2). Twenty PCR primer pairs (Table S3) that detected polymorphisms in or near CGs and 31 SSR markers (Table S4) that co-localized with polymorphic CGs were used to identify CGs and introgressed regions from parental lines. In some instances, the SSR markers were more polymorphic than the CG-based PCR markers developed in this study (data not shown). An additional 122 polymorphic SSR markers were used to conduct genome-wide analysis (Table S6, Fig. S3).

Selected BC3F4 parents contained CGs in different numbers and combinations13. Line V4M-14-1-B, the common parent for all crosses (Table S1), contained Moroberekan alleles for the CGs that code for chitinase, oxalate oxidase, HSP90, thaumatin-like protein, peroxidase and probenazole-induced protein. V4M-5-3-B, V4M-6-1-B and V4M-82-2-B contained the Moroberekan allele for GLP. In addition, V4M-82-2-B contained oxalate oxidase and DHAP from Moroberekan, while V4M-6-1-B carried DHAP. CGs found in the parental lines were detected in various combinations in the selected progenies; several lines with no CG alleles from the Moroberekan parent were also identified.

Analysis of disease progress curves of F4 intermated lines demonstrated that higher accumulation of CGs from Moroberekan contributed to a decrease in blast susceptibility in field experiments in Almora (Fig. 2). Interestingly, lines carrying five and six CGs were comparable to the monogenic lines carrying Piz and Pi2, R genes that are effective at all three locations26. Analysis of CG accumulation in intermated F6 progeny showed a similar trend. Genotyping data from 60 SSR markers covering five chromosomes enriched in CGs were used to generate a neighbor-joining tree calculating Gower distances between the genotypes of F6 intermated lines (Fig. 3). The analysis verified the preponderance of the Vandana genotype and showed strong clustering of lines that had accumulated Moroberekan markers, including CG markers as well as CG-linked SSR markers (Fig. 3). Phenotypically, most F6 intermated lines with accumulated Moroberekan regions showed increased resistance to rice blast across screening sites in Almora and Hazaribag, as well as better yield under blast infection in Almora (Figs. 3, 4, Table 1). Genome-wide comparison among lines containing different combinations of CGs showed no significant influence from additional Moroberekan genomic regions (Fig. S4). Overall, while the monogenic lines generally exhibited an ‘all or nothing’ effect with high or low disease indices, the introgressed population had a range of disease intensities that declined progressively with the addition of each CG.
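
The area under the disease progress curve (AUDPC) condenses repeated severity scores into one value per line, so that lines with slower disease development score lower. The following is a minimal sketch, assuming the common trapezoidal formulation (the calculation method is not stated here); the scoring days and DLA values are hypothetical and serve only to illustrate the arithmetic.

```python
def audpc(days, severity):
    """Trapezoidal area under the disease progress curve.

    days     -- scoring dates (e.g. days after sowing), strictly increasing
    severity -- disease severity at each date (e.g. % diseased leaf area)
    """
    if len(days) != len(severity):
        raise ValueError("days and severity must have the same length")
    if len(days) < 2:
        raise ValueError("at least two scoring dates are required")
    total = 0.0
    for i in range(len(days) - 1):
        interval = days[i + 1] - days[i]
        if interval <= 0:
            raise ValueError("days must be strictly increasing")
        # Mean severity over the interval times the interval length.
        total += (severity[i] + severity[i + 1]) / 2 * interval
    return total


# Hypothetical weekly DLA scores (%) for one line, not data from this study.
example_days = [21, 28, 35, 42, 49]
example_dla = [0, 5, 10, 20, 40]
print(audpc(example_days, example_dla))  # 385.0
```

Because the result depends on the number and spacing of scoring dates, AUDPC values are comparable only among lines scored on the same schedule.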

Figure 2

Performance of F4 intermated lines carrying different numbers of candidate genes, measured by area under the disease progress curve (AUDPC) in Almora, India.

Figure 3

Neighbor-joining tree generated by calculating Gower distances between genotypes of F6 intermated lines based on the similarity to the population’s parents (left panel). Genotyping of intermated lines using 60 SSR markers spanning five chromosomes (see Fig. S3 for reference) and ten CG markers (black box highlight) shows greater similarity to Vandana (1—yellow) than Moroberekan (2—blue); some regions were heterozygous (3—light green), while others did not fit the above classes (4—dark green) (center panel). Lines accumulating Moroberekan regions performed better under seedling blast conditions in Almora (right panel). The same lines showed higher yields under heavy blast conditions. Performance under drought at IRRI is consistent with the parent Vandana. Chi (Chitinase), OXO (Oxalate Oxidase), ar (Aldose reductase), thau7 (Thaumatin like), pox (Peroxidase), oxlp (Germin-like protein), HSP90 (Heat shock protein 90), PR1 and PR10 (Pathogenesis-related genes).

Figure 4

(a) Performance of F6 intermated lines with different numbers and combinations of candidate genes (highlighted in red boxes) in Almora and Hazaribag, India. DLA, diseased leaf area; DAS, days after sowing. (b) Leaf blast field phenotypes in Almora. Left: IR78221-19-6-56 (R); center: IR78222-20-1A-7 (S); right: IR78224-22-2-114 (HS). (R) Resistant; (S) Susceptible; (HS) Highly susceptible. Red arrowhead = blast lesions.

Association of candidate genes with QR

Six CGs, coding for chitinase, HSP90, OXO, GLP, PR5, and POX, were correlated with blast resistance at all screening sites in India and the Philippines except Hazaribag, where no measurable contribution to resistance was observed (Table 2). Hazaribag typically has low blast pressure, particularly relative to Almora, so this result was not unexpected. Of the evaluated CGs, the genes coding for aldose reductase, OXO, GLP and POX are involved in oxidative stress, while chitinase is an enzyme that inhibits fungal growth by degrading chitin found in fungal cell walls27,28,29. HSP90 is an important chaperone protein found in many plant species and is involved in protection against multiple biotic as well as abiotic stresses30.

Table 2 Phenotypic contribution (percentages) of candidate gene alleles to rice blast disease resistance in Vandana, Moroberekan and intermated F6 lines screened for resistance to rice blast at multi-location trials in India and the Philippines.

All CGs contributed to some degree to resistance in one or more locations (Table 2). Of the CGs evaluated, only the OXO allele detected by OXO-associated markers contributed significantly (p < 0.001) to all six blast resistance parameters measured in Almora. In the Philippines, OXO contributed significantly to seedling blast resistance in Cavinti (p < 0.01) and the IRRI blast nursery (p < 0.01). Interestingly, not all of the selected markers associated with a CG resistance allele detected the same level of contribution to disease resistance. For example, of the two OXO gene-specific markers tested that were polymorphic between Vandana and Moroberekan, only the OsOXO4 UP marker was significantly associated with resistance conferred by OsOXO4. This result is consistent with our previous study on the expression of OsOXO4 during resistance to rice blast, and supports an association with a missing 26-bp promoter region in the Vandana allele and selected blast-susceptible progeny31. The association with a phenotype and the contribution level of each marker depend on multiple factors, including the polymorphism detected by the marker, the evolutionary history of the polymorphism in a population, and the quantitative nature of the trait. A marker detecting a causative polymorphism may show a different association with a phenotype than a marker detecting a polymorphism linked to the causative one32. In addition, the expression of a quantitative trait varies with the genetic background and the environmental conditions tested, leading to fluctuating contributions from the different elements associated with the trait.

Other CGs also showed significant correlation with blast resistance. POX showed association to resistance against DLA, decreased PB and yield under blast in Almora, while GLP exhibited significant association with start of disease, panicle blast and yield under blast as well as LB (Table 2). In addition, GLP showed high association with leaf blast infections in Ambikapur. Single gene analysis of variance (ANOVA) also correlated POX with yield under blast in Almora (p < 0.001); both POX and OXO appear to be important for panicle blast resistance at this location. In addition, while both OXO and PR5 are significantly associated with seedling blast resistance at the IRRI blast nursery, only PR5 is associated with panicle blast resistance in Cavinti. A two-gene ANOVA (p < 0.0001) revealed association of oxalate oxidase/thaumatin with yield under drought conditions and with seedling blast resistance at IRRI. Moreover, association between HSP90/thaumatin with seedling blast resistance was noted at IRRI nursery, and thaumatin/oxalate oxidase with panicle blast resistance in Cavinti.

The associations found using PCR markers for the CGs were confirmed using 26 SSR markers that are correlated with resistance to rice blast across screening sites in India and the Philippines (Table S7). These include SSR markers which co-localize with CGs. For example, SSR markers which co-localized with OXO (RM168 and RM426), POX (RM5720), and GLP (RM3215, RM25, and RM331) were highly correlated with resistance to blast across screening sites. OXO marker RM168 was previously identified as significantly associated with blast lesion number in Vandana/Moroberekan, as well as other rice varieties13,33. Moreover, comparative mapping of these SSRs using cMAP revealed that they co-localized with previously reported QTLs involved in resistance to bacterial blight, rice blast, sheath blight and/or brown planthopper (Table S8). No other SSR markers showed significant association with resistance to rice blast across the studied screening sites.

Overall, despite the different contribution of individual CGs or their combinations to disease response at the tested locations, results confirm that increasing numbers of CGs in a line correlated to lower susceptibility of F6 intermated progeny. Importantly, the use of CGs as markers was effective in determining the response of each line and demonstrated that accumulation of even a few QTL-based CGs can substantially reduce blast susceptibility.

Accumulation of defense response genes is correlated to resistance

Screening of 60 F4 lines in blast endemic locations in India indicated that the disease progress curve declined progressively with addition of each CG (Fig. 2). Among these were several progenies carrying from 1 to 6 CG combinations which exhibited good morphoagronomic traits (Table S9). The performance of F6 intermated lines with different CG combinations in Almora and Hazaribag similarly demonstrated strong correlation between the level of resistance and the number of defense response genes in the intermated lines, and confirmed the results obtained with F4 generation. Lines carrying multiple CG alleles from Moroberekan showed lower DLA and lower panicle blast severity estimates. The leaf blast (Almora) and panicle blast (Hazaribag) data for these lines were comparable to the donor parent Moroberekan. In contrast, Vandana and lines with one or two CG alleles were more susceptible at both sites, but this depended on the CG present in the line (Figs. 3, 4). For example, the line IR78221-19-6-99-B-B which carried only OXO was less susceptible across screening sites. However, line IR78221-19-3-196-B-B, containing only the chitinase CG, was susceptible across screening sites. The presence of some single CGs conferred resistance while others, when present alone, did not, showing the differences in contribution of each gene. This was also apparent when the genes were combined in lines, that is, the resistance observed was not proportional to the number of CGs present. However, we did observe a significant reduction in leaf blast severity with increasing numbers of CGs in introgressed lines at different locations and years. This suggests that accumulation of CGs, conferring different mechanisms of resistance, contributes to resistance which is effective in multiple environments. Thus, our work confirms that selection for CGs is an effective means to improve quantitative disease resistance in rice10,11,14.

Lines with accumulated CGs show broad spectrum resistance against sheath blight

In rice paddies, multiple biotic stresses often occur together, and these combinations contribute to higher yield losses. Thus, breeding cultivars with increased resistance to multiple pathogens is highly advantageous. Sheath blight, caused by Rhizoctonia solani, is also a devastating disease that affects rice production in many areas around the world. Given the strong response observed against rice blast at the multiple tested locations, and considering the combinations of CGs obtained in the introgressed lines, we also tested several lines for resistance to sheath blight. Greenhouse tests performed with a virulent strain of R. solani (RM01401) demonstrated that lines with favorable CG combinations also showed decreased susceptibility to sheath blight (Fig. 5). Line IR78221-19-3-196, containing a single CG, was more susceptible than even the susceptible parent line Vandana. Overall, these results are consistent with previous findings that accumulation of appropriate combinations of CGs can enhance resistance against pathogens other than the rice blast pathogen.

Figure 5

Performance of F6 intermated lines with different CGs after inoculation with R. solani strain RM01401 in greenhouse tests. Introgressed lines with multiple CGs, particularly those with OXO, are more resistant than lines with few CGs. Line number (CG content): 19-6-7-B-B (OXO, HSP90, POX); 19-6-56-B-B (Chitinase, OXO, POX, GLP); 19-6-99-B-B (OXO); 19-3-196-B-B (Chitinase).

Combined blast resistance and drought tolerance

Because Vandana is a known drought tolerant cultivar from India, the Vandana/Moroberekan progenies were also evaluated for tolerance to drought at the reproductive stage. From all the tested lines, those which performed as well as or better than the Vandana parent during the reproductive stage (less than 10 cm rainfall for more than 30 days at rice reproductive stage) were selected. From 84 Vandana/Moroberekan F6 intermated lines tested, nine progeny were selected based on their resistance to rice blast across all screening sites (Table S9). These lines contained different CG allele combinations from Moroberekan, showed decreased DLA readings in Almora, and also were highly similar to Vandana (Table 3). Moreover, the selected lines also exhibited similar or better morphoagronomic traits than Vandana providing better harvest index and grain yield (Table 3). Thus, these lines are tolerant to drought at the reproductive stage while also showing enhanced resistance response against blast (Figs. 3, 6). Importantly, the introgression of CGs from Moroberekan into Vandana by the advanced backcross method did not compromise the drought tolerance provided by the latter. Thus, the advanced backcross method allowed for more rapid varietal development, combining both effective QTL-based disease resistance, a result of the simultaneous accumulation of CGs, and drought tolerance.

Table 3 Means of agronomic traits evaluated under drought conditions of selected F6 Vandana/Moroberekan intermated lines with resistance to field blast and the candidate genes detected in the genomes.

Figure 6

Yield during DS drought conditions at IRRI against seedling blast infection recorded at Cavinti and Almora for 60 selected F5 progenies from intermated BC3F4 lines. Highlighted in the graphic, filled data points correspond to lines (from left to right): IR78221-19-6-7-B-B (green); IR78221-19-6-3-B-B (yellow); IR78221-19-6-33-B-B (red); IR78222-20-7-128-B-B (gray); IR78221-19-6-90-B-B (orange); IR78221-19-6-99-B-B (blue); IR78221-19-6-56-B-B (purple); Vandana (black).

Conclusions

To achieve effective levels of resistance that are broad-spectrum and durable, identification and deployment of both qualitative and quantitative resistances are desirable. However, it is difficult to accumulate quantitative resistance without knowledge of the underlying genetic control and the corresponding mechanisms involved in resistance expression. In this study of QTL association mapping using advanced backcross populations, we show that relatively few chromosomal regions (5–10) containing CGs can confer effective broad-spectrum resistance without compromising desirable agronomic traits or drought tolerance. To make this approach more practical, the genetic behavior of the QTL containing CGs needs to be investigated in different genetic backgrounds. This research will facilitate the selection of cultivars with broad-spectrum resistance against blast and other pathogen populations in diverse rice-growing environments.

Materials and methods

Plant materials

Initial crosses of Vandana by Moroberekan were produced by Wu et al.13. Fifteen of the BC3F4 parental lines that showed partial resistance to rice blast and carried CG alleles from Moroberekan were selected and crossed in all pairwise combinations. Seeds of 10 F2 families were selected based on partial resistance of the BC3F4 parental lines to seedling blast and neck blast, association of the parental lines with positive alleles, and phenotypic similarity to Vandana. Resistant lines from selected intermated families were advanced until F4. At F4, the top 10% of the lines (60 out of > 600) derived from the progenies of BC3F4 crosses V4M-14-1-B × V4M-5-3-B, V4M-14-1-B × V4M-6-1-B, and V4M-14-1-B × V4M-82-2-B which exhibited acceptable agronomic traits were selected, and were designated as IR78221, IR78222, and IR78224, respectively (Table S1). Selected F5 plants were evaluated for morphoagronomic traits and drought tolerance. Sixteen F5 lines that were still segregating by morphology were advanced separately to F6, from which 84 lines were identified for molecular analyses. At the IRRI nursery, 24 BC3F4 intermated F6 lines selected for drought tolerance were analyzed for CGs from Moroberekan and for resistance to blast.

In silico mapping of candidate genes in the rice genome

Eleven defense response CGs were analyzed in the intermated backcross lines of Vandana/Moroberekan (Table S2). CGs coding for 14-3-3 protein, aldose reductase, chitinase, heat shock protein 90 (HSP90), oxalate oxidase (OXO), germin-like proteins (GLP), PR1, PR5 (thaumatin-like proteins), peroxidases (POX), probenazole-induced protein, and putative 2-dehydro-3-deoxyphosphoheptonate aldolase (DHAP) were identified to be associated with quantitative resistance to fungal pathogens in rice mapping populations10,12,13. Rice orthologs of these CGs were mapped using BLASTn similarity searches against the rice genome database34.

Molecular analysis of plant materials

SSR markers co-localizing with the CGs were identified using the Rice Genome Browser (http://www.gramene.org) of Gramene (Table S3), which maps the physical location of the CGs and displays molecular markers adjacent to the CG loci. PCR primers were designed based on the gene-coding region, the 3ʹ untranslated region, and the 1 kb upstream region of the CG rice sequences (Table S4). SSRs were surveyed for polymorphism between Vandana and Moroberekan, and the polymorphic SSRs were used for genome-wide scans of the advanced backcross Vandana/Moroberekan progenies. Data generated from the genome-wide scans were used in statistical analyses of SSR marker contributions. Comparative mapping using the cMAP tool in Gramene was also used to identify QTL co-localizing with SSRs that were correlated with resistance to blast.

Two fresh leaves from ten 2-week-old plants were bulked, and DNA was extracted using the CTAB method35. DNA was amplified by PCR in a 50 μL volume. Each PCR reaction contained 100 ng of genomic DNA, 0.25 mM of each primer, 2 mM of each dNTP, 1.5 units Taq polymerase and 1× PCR buffer (New England Biolabs), and was run under the following conditions: initial denaturation at 94 °C for 4 min; 35 cycles consisting of 1 min at 94 °C, 1 min at annealing temperature, and 2 min at 72 °C; and a final extension of 7 min at 72 °C. The PCR products from CG primers were resolved in 1.5% agarose gels, while those from SSR markers were resolved in 2% gels of 2:1 MetaPhor:agarose. Gels were stained with ethidium bromide.
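
For planning purposes, the cycling conditions above can be written as a small data structure and the programmed hold time summed. This is only an illustrative sketch: the dictionary layout and the function name are hypothetical, and ramp times between temperatures, which depend on the thermocycler, are ignored.

```python
# Thermocycler program as described in the text (hold times in minutes).
PROGRAM = {
    "initial_denaturation": 4,      # 94 °C
    "cycles": 35,
    "cycle_steps": {
        "denaturation": 1,          # 94 °C
        "annealing": 1,             # primer-specific annealing temperature
        "extension": 2,             # 72 °C
    },
    "final_extension": 7,           # 72 °C
}


def total_hold_minutes(program):
    """Sum of programmed hold times; ramping between steps is not included."""
    per_cycle = sum(program["cycle_steps"].values())
    return (program["initial_denaturation"]
            + program["cycles"] * per_cycle
            + program["final_extension"])


print(total_hold_minutes(PROGRAM))  # 4 + 35 * 4 + 7 = 151
```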

Phenotyping of progenies

Plants were screened for resistance to rice blast in field locations in India (Almora, Hazaribag and Ambikapur) and in the Philippines (Cavinti and the IRRI blast nursery at Los Baños). Fifty seeds per line were sown per 1 m row; this was replicated three times in a randomized complete block design. Border rows consisting of a mixture of susceptible varieties were planted around each plot to enhance the pathogen population. In Almora, India, plants were grown to maturity and yield data (grams per plot) were collected. Scoring of leaf blast and panicle blast was based on the Standard Evaluation System (SES; http://www.knowledgebank.irri.org/images/docs/rice-standard-evaluation-system.pdf) for rice36. The progenies were also characterized for morphoagronomic traits and drought tolerance (http://inger.irri.org/).

Greenhouse sheath blight screenings were performed on 14-day-old plants using the microchamber assay37 and Rhizoctonia solani isolate RM01401. Plants were incubated in microchambers for 14 days post inoculation and were scored for both disease index (DI; lesion height/plant height × 9) and visual index (0–9 scale).
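
The disease index defined above rescales relative lesion height to a 0–9 range. A minimal sketch of that calculation follows; the function name, the centimetre units, and the example measurements are assumptions made for illustration.

```python
def disease_index(lesion_height, plant_height):
    """Sheath blight disease index: lesion height / plant height x 9.

    Both heights must be in the same unit (centimetres assumed here).
    Returns a value between 0 (no lesion) and 9 (lesion spans the plant).
    """
    if plant_height <= 0:
        raise ValueError("plant height must be positive")
    if not 0 <= lesion_height <= plant_height:
        raise ValueError("lesion height must lie between 0 and plant height")
    return lesion_height / plant_height * 9


# Hypothetical plant: a 15 cm lesion on a 30 cm plant.
print(disease_index(15, 30))  # 4.5
```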

Data analysis

A matrix of pairwise distances between varieties was generated by calculating distances between strings of coded genotypes, using the daisy function from the cluster library in R38. A neighbor-joining tree was obtained by using the nj function from the package ‘ape’ on the distance matrix39. The phylogenetic tree, the genotype data and accompanying data were visualized using the ggtree package40.
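
The analysis itself was run in R, as described above. To illustrate the distance step only, the sketch below computes Gower distances in Python for genotypes coded as unordered categories, in which case the Gower distance reduces to the fraction of mismatching markers among those scored in both lines. The category codes follow Fig. 3 (1 = Vandana, 2 = Moroberekan, 3 = heterozygous, 4 = other); the handling of missing calls, the line names, and the genotype strings are assumptions, not details taken from the study.

```python
def gower_nominal(a, b, missing=None):
    """Gower distance between two genotype vectors of nominal codes.

    Markers missing in either line are skipped, so the distance is the
    share of mismatches among markers scored in both lines.
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")
    compared = 0
    mismatched = 0
    for x, y in zip(a, b):
        if x == missing or y == missing:
            continue
        compared += 1
        if x != y:
            mismatched += 1
    if compared == 0:
        raise ValueError("no markers scored in both lines")
    return mismatched / compared


def distance_matrix(genotypes):
    """Return line names and the symmetric matrix of pairwise distances."""
    names = list(genotypes)
    matrix = [[gower_nominal(genotypes[r], genotypes[c]) for c in names]
              for r in names]
    return names, matrix


# Hypothetical four-marker genotypes; None marks a missing call.
GENOTYPES = {
    "Vandana": [1, 1, 1, 1],
    "Moroberekan": [2, 2, 2, 2],
    "line_A": [1, 2, 1, None],
}
names, dist = distance_matrix(GENOTYPES)
for name, row in zip(names, dist):
    print(name, [round(d, 2) for d in row])
```

A matrix of this form is what a neighbor-joining routine takes as input; the tree construction itself is not reproduced here.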

Simple and multiple regression analyses were conducted using the General Linear Model (GLM) procedure in SAS (SAS Institute, 1989) to test for association of CG and SSR markers with phenotypes at the 0.05 and 0.01 levels of significance. Two-way and three-way ANOVA (SAS) were used to determine the best CG combinations per site. The PROC CORR procedure was used to estimate Pearson correlation coefficients. Single-marker regression was performed using QGene41.
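
The analyses were performed in SAS and QGene. As an illustration of the underlying calculations only, the sketch below computes a Pearson correlation coefficient and, for a single marker, the percentage of phenotypic variance explained by a simple linear regression, which equals 100 × r². Reading the "phenotypic contribution (percentages)" of Table 2 as this quantity is an assumption, as are the 0/1 marker coding (0 = Vandana allele, 1 = Moroberekan allele) and the example values.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def percent_variance_explained(marker, phenotype):
    """R^2 x 100 of a simple regression of phenotype on one marker."""
    return 100 * pearson_r(marker, phenotype) ** 2


# Hypothetical data: marker allele per line and % diseased leaf area.
marker = [0, 0, 1, 1]
dla = [60, 40, 10, 20]
r = pearson_r(marker, dla)
print(round(r, 3), round(percent_variance_explained(marker, dla), 1))
```

A negative r in this coding means that lines carrying the Moroberekan allele have lower disease scores; significance testing, as done in the original analyses, is not included in the sketch.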

Declaration

All studies in the manuscript comply with relevant institutional, national and international guidelines and legislation.