Introduction

Fusarium head blight (FHB) severely impacts wheat production worldwide (Buerstmayr et al., 2009). FHB disease is mainly caused by the fungal pathogens Fusarium graminearum and Fusarium culmorum, which attack the spikes of wheat, leading to shriveled and discolored kernels containing mycotoxins (Pestka and Smolinski, 2005). The European Union and other international legislative bodies rigorously limit the tolerated mycotoxin content in wheat to be used as food or feed, making infected wheat often unmarketable (Verstraete, 2008).

Breeding for FHB resistance can reduce the detrimental effects on wheat grain yield and quality. However, reliable phenotypic selection requires labor intensive and thus costly artificial inoculation to guarantee homogeneous disease pressure in field trials (Miedaner et al., 2002). As an alternative approach, marker-assisted selection has been promoted in breeding for decreased FHB susceptibility in wheat varieties (Miedaner et al., 2009). Large effect quantitative trait locus (QTL) such as Fhb1 and Fhb5 were identified in crosses with the exotic Sumai 3 and CM82036 donor lines (Becher et al., 2013). However, despite worldwide efforts, the frequency of favorable Fhb1 and Fhb5 alleles is still low in European and US elite wheat germplasm (Sneller et al., 2010; Bernardo et al., 2012; Miedaner and Korzun, 2012). One reason is the long time need to reduce linkage drag associated with marker-based introgression of Fhb1 or Fhb5 (Miedaner and Korzun, 2012). In contrast to Fhb1 and Fhb5, the effects of FHB resistance QTL that were identified in adapted European or US wheat germplasm and thus are not associated with linkage drag are only rather small (Buerstmayr et al., 2008; Löffler et al., 2009; Sneller et al., 2010). This so far hinders an efficient application of marker-assisted selection for FHB resistance for QTL alleles identified in adapted elite germplasm.

Genomic selection has been suggested as an attractive alternative to marker-assisted selection to predict phenotypes for traits that are controlled by multiple genes with small effects (Meuwissen et al., 2001). In contrast to marker-assisted selection, which predicts genotypic values based only on the markers associated with significant QTL detected in the genome-wide association mapping, genomic selection simultaneously utilizes large numbers of markers distributed across the genome to train a prediction model. The potential of genomic selection for predicting FHB resistance has been studied previously using 170 lines, which were phenotyped at the US cooperative FHB wheat nursery and genotyped with genome-wide diversity array technology as well as with functional markers for FHB resistance (Rutkoski et al., 2012). Despite the small number of genotypes, these findings suggested that US cooperative FHB nursery data can be useful for genomic selection.

Investigating the accuracy of prediction of genomic versus marker-assisted selection in wheat revealed that genomic selection usually increases accuracy, with some dependence on the trait under consideration and the related linkage disequilibrium structure (Guo et al., 2012; Rutkoski et al., 2012; Miedaner et al., 2013; Zhao et al., 2013, 2014). One interesting finding was that for FHB resistance, as for some other complex traits, the differences between the mean accuracies of marker-assisted and genomic selection were surprisingly small (Rutkoski et al., 2012). A genome-wide mapping study in a large hybrid wheat population revealed that the accuracy of marker-assisted selection is not only depending on functional knowledge of the genetic architecture, but also profits from genetic relationships between individuals in the training and the validation population (Gowda et al., 2014). This could provide an explanation for the surprisingly low differences in mean accuracy of genomic and marker-assisted selection. Studies on the influence of relatedness versus functional QTL information as driving forces of the accuracy of prediction of marker-assisted selection, however, are lacking to the best of our knowledge.

Marker systems that have been used for genome-wide fingerprinting in wheat include restriction fragment length polymorphism (Chao et al., 1989), amplified fragment length polymorphism (Barrett and Kidwell, 1998), simple sequence repeat (SSR) markers (Röder et al., 1998), diversity array technology (Akbari et al., 2006), single-nucleotide polymorphism (SNP) markers (Cavanagh et al., 2013), and most recently genotyping-by-sequencing approaches (Poland et al., 2012). When Heslot et al. (2013) compared the accuracy of prediction of genomic selection based on genotyping-by-sequencing versus diversity array technology markers, they found advantages of genomic selection based on genotyping-by-sequencing data, which they interpreted to be mainly because of the higher number of markers. The same study also unraveled a sampling bias for the used diversity array technology markers. Würschum et al. (2013) compared SNP and SSR markers in wheat and found only low congruency among the genetic relatedness matrices estimated with the two marker types. As genetic relatedness is one of the major forces exploited in genomic selection, substantial differences are expected for the prediction models trained with SNP or SSR markers in wheat.

In this study, we draw on published data derived from a diversity panel consisting of 358 recent European winter wheat varieties plus 14 spring varieties, which had been phenotyped for FHB resistance and genotyped with 732 SSR markers (Kollers et al., 2013a, 2013b). The 372 varieties were additionally genotyped with 9k (Cavanagh et al., 2013) and 90k SNP arrays (Wang et al., 2014a). Aiming to determine the genetic architecture of FHB resistance in the European wheat panel, we applied genome-wide association mapping, which detected the highest number of QTL when the analysis was based on the high-density 90k SNP data. However, when tested in fivefold cross-validation, marker-assisted selection using the detected QTL allowed only marginally higher accuracies of prediction than calculation based on random samples of markers not selected for marker–trait associations. Consistent with a simulation approach, this suggested relatedness as a main driver of the accuracy of prediction. We then set out to test the accuracy of prediction of FHB resistance by genomic selection and to identify the factors that determine accuracy. When three genomic selection models were contrasted for the three marker data sets, no significant differences in accuracies among marker platforms and genomic selection models were observed. Marker density impacted the accuracy of prediction only marginally, indicating that genomic selection of FHB resistance can be implemented most cost-efficiently based on low- to medium-density SNP arrays.

Materials and methods

Plant material and phenotypic data analyses

A total of 358 European winter wheat varieties plus 14 spring wheat varieties were used in this study (Kollers et al., 2013a, 2013b). The 372 wheat varieties were evaluated for FHB resistance over 2 years in four environments in Germany in trials with three replications. In all environments, lines were artificially spray inoculated as described in detail by Kollers et al. (2013a). To compensate for different flowering times of the wheat lines, inoculation was carried out at three time points in order to inoculate each genotype at least once at full flowering. As a measure of FHB resistance, the FHB score with a possible range from 0% (most resistant) to 100% (most susceptible) was calculated as FHB score=FHB incidence × FHB severity/100%, with FHB incidence representing the percentage of infected spikes in a test plot and FHB severity representing the mean percentage of infected area on infected spikes.

We performed a one-step phenotypic analysis using a linear mixed-model with effects for genotype, environment, interaction of genotype by environment, and replication. To estimate variance components all effects were assumed to be random. Broad-sense heritability (h2) was estimated on an entry-mean basis as , where E refers to the number of environments, R is the number of replications, σ2GxE refers to the variance of the genotype by environment interactions and σ2e refers to the error variance. A second linear mixed model, in which genotypic effects were assumed to be fixed and all other effects remained random, was used to get the best linear unbiased estimates of FHB scores for each variety.

Genotypic data

The 372 varieties were genotyped with a 9k Infinium SNP array (Cavanagh et al., 2013; Supplementary Table S1), a 90k Infinium SNP array (Wang et al., 2014a) and 732 SSR markers resulting in 782 loci (Kollers et al., 2013a, 2013b, 2014). In total, 78% of the SNP markers from the 9k array were also present on the 90k SNP array.

We performed quality control for SNP markers to exclude those with rates of missing values above 5%, rates of heterozygotes above 5% and allele frequencies <0.05 or >0.95. For SSR markers, we excluded those with rates of missing value above 10% and rates of heterozygotes above 5%. All varieties were additionally genotyped for three predefined functional markers, the dwarfing genes Rht-B1, Rht-D1 and the photoperiodism gene Ppd-D1 (Ellis et al., 2002; Beales et al., 2007), which were used for association mapping.

We estimated Rogers’ distances for each pair of lines using the three different marker sets. Correlations between each pair of distance matrices were calculated and the Mantel test (Mantel, 1967) was applied. Principal component analyses were performed for the three marker sets. To test for differences in the proportion of variance explained by the individual principal components, we implemented a bootstrap approach as outlined in detail elsewhere (Heslot et al., 2013).

Association mapping and marker-assisted selection

We applied for each marker set a two-step association mapping scan. In the first step, the best linear unbiased estimate for each genotype in each environment was estimated assuming fixed genotype and random replication effects using a linear mixed model. Then, in the second step, a standard linear mixed-model approach (Yu et al., 2006) was used to perform a genome-wide association mapping scan:

where y is the vector of best linear unbiased estimates for each genotype in each environment, μ is the vector of common intercept term, m is the effect of the marker being tested, α denotes the vector of scores of the marker, g is the vector of genotype effects, l is the vector of environment effects, X and E are the corresponding design matrices, and e is the residual term. In the model only the marker effect was assumed to be fixed and all other effects are random. The population structure was considered by assuming , where G is twice the kinship matrix estimated as 1 minus the Rogers’ distances and σ2G is the genotypic variance estimated by a restricted maximum likelihood approach. To check whether the model is able to adequately control the population or family structure, a qq-plot was drawn based on the observed P-values and expected P-values of all markers (Yu et al., 2006). Significance of marker–trait associations was tested based on the Wald F statistic. The proportion of the phenotypic variance explained by all QTLs was estimated using the R2 values fitting a multiple regression (Utz et al., 2000), which ignores information of relatedness. In addition, we implemented the method suggested by Sun et al. (2010) and estimated the likelihood-ratio-based R2 values (R2LR), which takes the information of relatedness into account.

Given the same threshold, there would be more marker–trait associations detected in the 90k SNP array than in the 9k SNP array. However, the number of informative markers is not necessarily higher in the 90k array because of possible linkage disequilibrium. We defined a new statistic parameter called effective number of marker–trait associations. To calculate it, we first performed principal component analysis with the significant markers among the 372 lines, and then extracted the minimal number of principal components needed to portray 95% of the total variation.

SNP markers were scored as 2, 1 or 0 as the number of copies of a particular allele. SSR markers were treated differently from SNP markers. For the SSR markers, we avoided treating each allele as an individual locus. Instead, markers were scored as a class (factor) indicating the presence of the different alleles. However, SSR markers that could reliably be mapped to different genomic positions were considered as separate loci.

The accuracy of prediction of marker-assisted selection was evaluated by fivefold cross-validation with a total of 100 cross-validation runs. In each run, four-fifths of the 372 wheat lines used in the study were assigned to the estimation set, whereas the remaining one-fifth was assigned to the test set. We then performed an association mapping scan for the estimation set and recorded the detected QTLs. We applied three different significance thresholds to study their influence on the accuracy of prediction. For the 9k SNP array and SSR markers, we set the thresholds to P<0.01, 0.005 and 0.0001. For the 90k SNP array, the thresholds were P<0.005, 0.001 and 0.0001. In each run of cross-validation, a multiple linear regression model was fitted in the estimation set with fixed effects of significant markers detected in the association mapping scan. The non-cross-validated accuracy of prediction was determined within the estimation set as square root of the coefficient of determination standardized with the square root of the broad-sense heritability on an entry-mean basis (Lande and Thompson, 1990). To determine the cross-validated accuracy of prediction of marker-assisted selection, we estimated the effects for the significant markers using the previous linear regression model fitted with the estimation set and predicted the genotypic value of the lines in the test set. Based on these values, we calculated the cross-validated accuracy of prediction as the Pearson product-moment correlation between predicted and observed genotypic values in the test set standardized with the square root of the broad-sense heritability on an entry-mean basis. The average difference between non-cross-validated and cross-validated accuracies of prediction was considered as bias. To quantify the proportion of variance explained in the test set because of the exploitation of relatedness, we calculated the prediction accuracy via the same training and test sets as before but using randomly chosen markers. More precisely, in each cross-validation run we randomly chose the same number of markers as the number of QTLs observed in the association mapping scan in the estimation set, estimated the effects for these randomly chosen markers using the estimation set and predicted the genotypic value of the individuals in the test set. The prediction accuracy in the test set was defined as above. The procedure was repeated 1000 times.

Simulation study to examine the impact of relatedness on marker-assisted selection

We investigated the influence of relatedness on the accuracy of prediction of the phenotypic performance by conducting computer simulations based on the 9k SNP array data of our study. We selected 30 SNP markers with a linkage disequilibrium measured as r2 values of 0.1 or less to all other markers. We then assumed that each of these 30 SNPs is perfectly linked to one QTL explaining 1.7% of the genotypic variance each, resulting in a total of 50%. We further assumed a heritability of 0.90. The simulated data set was used to conduct association mapping coupled with fivefold cross-validation with 100 runs, with the accuracy of prediction defined as the Pearson product-moment correlation between predicted and observed genotypic values in the test set standardized with the square root of the heritability. The following four scenarios were contrasted in order to separate the contributions of marker effects and relatedness: (1) we assumed the 30 SNP markers to represent predefined marker–trait associations and estimated their effects based on the estimation set. The accuracy of prediction was validated in the test sets. (2) We assumed that the 30 SNP markers were not predefined and performed an association mapping scan in the estimation set to identify the 30 most significant marker–trait associations. Effects of these marker–trait associations were estimated based on the estimation set. The accuracy of prediction was evaluated in the test set. (3) Randomly selected 30 SNP markers excluding those with significant marker–trait associations were sampled and their effects were estimated in the estimation set. The accuracy of prediction was then evaluated in the test set. (4) Randomly selected 30 SNP markers were sampled among all SNPs and their effects were estimated in the estimation set. The accuracy of prediction was then evaluated in the test set.

Genomic selection

Three genomic selection models were applied in evaluating the prediction accuracy. They are ridge regression best linear unbiased prediction (RR-BLUP; Whittaker et al., 2000; Meuwissen et al., 2001), reproducing kernel Hilbert space regression (RKHSR; Gianola and van Kaam, 2008) and Bayes-Cπ (Habier et al., 2011). In the following, we give some descriptions about the three models. For more details, we refer to the above references.

Let n be the number of genotypes, m be the number of markers and l be the number of environments. The RR-BLUP model has the form y=1nμ+Xg+e, where y is the vector of best linear unbiased estimates of FHB scores for all genotypes across environments, 1n denotes the vector of 1’s, μ is the overall mean, g is the vector of marker effects (for SSR markers, allele effects), X is the corresponding design matrix and e is the residual term.

In the model, we assumed that marker and residual effects are random and follow the multivariate normal distribution are the genotypic and residual variance components obtained in the mixed model in the phenotypic data analysis. The penalty parameter is . For SSR markers, we treated as shared by all alleles in each marker, with each allele effect having the same variance within each marker. Hence, the variances of allele effects are different among markers.

The RKHSR model is of the form y=1nμ++e, where y, 1n, μ and e are the same as in the RR-BLUP model, is a vector of random effects and K is the n × n symmetric positive-definite matrix whose entries are defined by

In the above formula, xi and xj are (m × 1) vectors of marker indices for the i-th and j-th genotype respectively, and h is a smoothing parameter. To determine h and estimate , we first chose a grid of values for h. For each value of h, we estimated using a restricted maximum likelihood approach and then calculated the fitted values of the model. Finally we chose the value h optimizing the generalized cross-validation statistic of the model.

The basic model of Bayes-Cπ is the same as RR-BLUP. However, all parameters are treated as random variables in a Bayesian framework. First, we defined the prior distributions as is assumed to be zero with probability π and a scaled inverse chi-squared distribution with probability (1−π). The probability π is a random variable whose prior distribution is uniform on the interval [0,1]. The prior distribution of is also scaled inverse chi-squared. A Gibbs sampler algorithm was then implemented to infer all the parameters in the model. It was run for 10 000 cycles and the first 1000 cycles were discarded as burn in. The samples of g from all later cycles were averaged to obtain estimates of the marker effects.

We evaluated the accuracy of prediction of the three genomic selection models using a fivefold cross-validation across genotype scheme. The individuals were randomly divided into five subsets. Then four of the five subsets were used as the estimation set and the genotypic values of the individuals in the remaining test set were calculated. After the genotypic values of all individuals were calculated, the means and standard errors of the prediction accuracy were calculated using a bootstrap approach (Rutkoski et al., 2012). The sampling of bootstrap was repeated 1000 times. The accuracy of prediction was defined as the correlation between observed and predicted values divided by the square root of the heritability .

Moreover, we studied the effect of the number of markers on the prediction accuracy of genomic breeding values through cross-validation. Therefore, we varied in the cross-validation studies the number of markers from 5 to 90% of markers available for each of the three marker sets (9k and 90k SNP, SSR). All calculations were done using R and Asreml-R (Gilmour et al., 2006).

Results

Population structure observed based on SSR markers, 9k SNP and 90 SNP array data

We observed a wide-range of pairwise Rogers’ distances for all three marker systems used (Table 1). The coefficient of variation (CV) of pairwise Rogers’ distances was similar for the 9k (CV=0.17) and 90k (CV=0.15) SNP array data, both slightly exceeding that for the SSR marker data (CV=0.12). The distance matrices estimated with the three different marker data sets were significantly (P<0.01) associated with Pearson product-moment correlation coefficients of above 0.84.

Table 1 First- and second-degree statistics for RD among 372 wheat lines estimated by three different sets of markers as well as Pearson product-moment correlation coefficients among distance matrices (rRD )

The analysis of the population structure revealed the absence of genetically distinct sub-populations for all three marker data sets (Figure 1 and data not shown). We applied a bootstrapping procedure and found significant (P<0.05) differences among the SSR and SNP markers in terms of the explained proportion of molecular variation for the first three principal components (Figure 2). This suggested that the population structure estimated based on SSR markers is more complex than that examined with SNP markers. The observed differences among SSR and SNP markers may be attributed to the biallelic (SNPs) and multiallelic (SSRs) nature of the two marker types, respectively.

Figure 1
figure 1

Heat map plot of the relationship measured as one minus Rogers’ distance based on the 90k SNP array data of the 372 wheat lines ordered according to a hierarchical cluster analysis. The red segments indicate the positions of the 14 spring varieties in the diversity panel. A full color version of this figure is available at the Heredity journal online.

Figure 2
figure 2

Principal component analyses of the 372 wheat lines based on the three marker sets 9k SNP array, 90k SNP array and SSR markers. Proportions of variance explained by the first 10 principal components are shown with confidence intervals obtained by 1000 bootstrap samples.

Artificial inoculation resulted in high-quality phenotypic data

Disease pressure was high as reflected by a broad range of FHB scores among the panel of 372 European wheat inbred lines with a mean value of 11% (Table 2). The distribution of FHB scores of the 358 winter wheat varieties was similar to that of the 14 spring wheat varieties (data not shown), suggesting that combined analyses are not afflicted by population stratification effects. The genetic variance of FHB scores was significantly (P<0.01) larger than zero. Heritability on the line mean basis was high and amounted to 91%. This clearly underlines the high quality of the phenotyping data.

Table 2 First- and second-degree statistics for 372 wheat lines evaluated for FHB resistance (%) in four environments

FHB resistance is often influenced by flowering time and plant height. For our panel of 372 wheat lines, Kollers et al. (2013a) estimated that flowering time explained only 5% of the variation in FHB resistance. Consequently, we expect only minor relevance of flowering time as masking effect on FHB resistance. Plant height had a slightly higher predictive value explaining 12% of the variation of FHB resistance. Consequently, it is likely that genes with large impact on plant height have a role for genomics-based prediction of FHB resistance.

Marker-assisted selection for FHB resistance

The functional markers for Ppd-D1 and Rht-D1, two major genes in the regulation of flowering time and plant height in wheat, were significantly (P<0.005) associated with FHB resistance (Supplementary Table S2). This result could be expected considering the association between FHB resistance and plant height as well as heading time observed at the phenotypic level (Kollers et al., 2013a). The marker–trait association for Rht-B1 was not significant (P>0.05), which could be explained by the low allele frequency of the mutant allele detected in only 26 of the investigated varieties. Another explanation may be given by the known difference in the effects of Rht-B1 and Rht-D1 on FHB (Srinivasachary et al., 2009). The Rht-D1 gene significantly influences both type 1 resistance (resistance to initial infection) and type 2 resistance (resistance to spread of the fungus within the spike), whereas Rht-B1 has significant influence only on type 1 resistance.

We performed more detailed further analyses for the two functional genes Ppd-D1 and Rht-D1. Excluding the two markers and closely linked ones from our association study led to a drop of cross-validated prediction accuracy of 14–30% (depending on the threshold used). In comparison, the cross-validated accuracy of prediction for the corresponding random scenario avoiding sampling the two genes dropped only by 4.2–17%. Thus, Ppd-D1 and Rht-D1 are functional in determining FHB resistance. When solely Ppd-D1 and Rht-D1 markers were used in prediction, the cross-validated accuracy reached 0.50. However, this result has to be considered with caution, as the correlation between Rogers’ distances among the 372 lines estimated based on all markers with the one based on solely Ppd-D1 and Rht-D1 amounted to 0.26 (significant at P<0.001). Thus, a large part of the accuracy of prediction accuracy was due to relatedness.

In preparation for genome-wide association mapping, qq-plots based on observed P-values and expected P-values were drawn (Supplementary Figure S2). For all three marker data sets, the plots were close to a diagonal line, confirming that the association mapping model used in this study has adequately controlled false positives because of population structure. A total of 4, 114 and 6 significant (P<0.005) marker–trait associations were detected for the 9k SNP array, 90k SNP array and SSR markers, respectively (Supplementary Table S2). It is worth to note that the effective number of significant marker–trait associations defined as the minimal number of principal components needed to portray 95% of the variation of the significant markers was 40 for the 90k SNP array, indicating that we indeed detected more marker–trait associations than in the 9k SNP array. We then compared the importance of different parameters for the selection of potential functional markers for use in breeding or map-based cloning activities. When we first tested the deviation of P-values of marker–trait associations in fivefold cross-validation runs, we observed substantial variation in P-values (Figure 3). P-values of markers with minor allele frequencies were on average less robust than those for markers with intermediate frequencies. Nevertheless, also P-values for markers with intermediate frequencies were sometimes substantially higher in the full data set in contrast to the estimation sets. Thus, inspecting the distribution of the cross-validated P-values in the estimation sets can be used as one criterion in the search of robust candidates such as BS00043676_51 (coded as SNP_2144, see Supplementary Figure S1) for map-based cloning.

Figure 3
figure 3

P-values of 20 most significant markers detected in the genome-wide association mapping scan (red triangles) and the distributions of P-values in the 100 cross-validation runs. The gray boxes highlight markers that are not significant according to the median P-values of the cross-validation runs. The polymorphic information content (PIC) of each marker is indicated in the bracket after the name of the marker. Linkage disequilibrium among markers is given in Supplementary Figure S1. The names of markers in the 90k SNP chip were coded for better visibility. The original names of those markers are listed in Supplementary Figure S1. A full color version of this figure is available at the Heredity journal online.

Second, we investigated the correspondence between the observed P-values with two approaches to estimate the explained proportion of phenotypic variation. The likelihood-ratio-based parameter R2LR was closely associated with the P-values in contrast to the R2 estimated with multiple regressions (Figure 4). The strong deviation of P-values of the association mapping scan from R2 values in a multiple regression is not surprising, because the latter approach ignores information on genetic relatedness. In contrast, the close association between R2LR and P-values of the association mapping scan is expected as both are based on similar biometrical models considering genetic relatedness. The different interpretations of R2 versus R2LR have to be considered while selecting markers for further use in marker-assisted breeding or map-based cloning. If the distance matrix of functional markers such as Rht-D1 is associated (r=0.26; P<0.001) with the overall genetic structure of the mapping population, then R2LR will underestimate the value of true functional markers.

Figure 4
figure 4

The relationship between P-values and the proportion of the phenotypic variance explained by all QTLs estimated using multiple regressions (R2) and a likelihood-ratio-based method (R2LR).

Irrespective of the chosen significance thresholds, the analyses revealed an overestimation of the accuracy of prediction of FHB resistance for all three marker systems as is reflected by higher accuracies observed for non-cross-validated compared with the cross-validated value (Figure 5). This bias was less pronounced for the 9k SNP array compared with the 90k SNP array and SSR marker data analyses. Interestingly, the accuracy of prediction dropped down only slightly when information on marker–trait associations was ignored in cross-validation by using random samples of markers in numbers corresponding to the average number of QTL detected. This suggests that relatedness is the major driving force for the prediction accuracy of marker-assisted selection.

Figure 5
figure 5

Marker-assisted selection for FHB resistance based on 9k SNP array, 90k SNP array and SSR markers. Accuracy of prediction was determined non-cross-validated (white columns), cross-validated as the Pearson product-moment correlation between predicted and observed genotypic values in the test set (black columns) or cross-validated based on a randomly selected set of markers (gray columns). The average number of significant markers used for prediction is indicated in brackets.

To examine our findings on the role of relatedness in more detail, we performed computer simulations in different scenarios. For all, we postulated an involvement of 30 QTLs, each explaining ~1.7% of the genotypic variation, summing up to 50% of the total genotypic variation. In the first simulation scenario, we assumed that all SNPs would have marker–trait associations in order to study how precise QTL effects can be estimated in the underlying mapping population. We observed an accuracy of prediction of 0.64 (Figure 6), which corresponds to 75% of the genotypic variation explained by the 30 QTLs. In the second scenario, we assumed that marker–trait associations were unknown and had to be detected in a genome-wide association mapping scan. Before calculating the prediction accuracy, we checked the false-positive rate in the scan, which was 0% under the threshold of P<0.001. Thus, population structure was adequately controlled in our association mapping model. The accuracy of prediction decreased to 0.39 by approximately 40% compared with the first scenario, which was nevertheless still surprisingly high taking into consideration the population size and the effect size of each QTL. The QTL detection rate per cross-validation run on average reached only 10%. Contrasting the second scenario with a third scenario using 30 SNP markers unlinked to marker–trait associations led to only approximately 20% lower accuracy of prediction of 0.30. This clearly underlines that genetic relatedness is a major driving force for prediction accuracy in marker-assisted selection for complex traits in the absence of QTL with large effect sizes. The fourth scenario was similar to the third one except that we randomly sampled 30 SNPs out of all SNPs, thus potentially selecting also marker–trait associations. This scenario reflected the ‘random’ marker scenario of the experimental data analyses. We observed only a marginal increase of the accuracy of prediction to 0.31. Compared with the third scenario, this suggested that the bias in the ‘random’ scenario of the experimental data analysis by sampling all markers including also putative QTL can be ignored.

Figure 6
figure 6

Simulation of marker-assisted selection based on 9k SNP array data to separate contributions of marker effects and relatedness. The cross-validated accuracy of prediction defined as the Pearson product-moment correlation between predicted and observed genotypic values in the test set. (1) We assumed the 30 SNP markers to represent predefined marker–trait associations and estimated their effects based on the estimation set. (2) We assumed that the 30 SNP markers were not predefined and performed an association mapping scan in the estimation set to identify the 30 most significant marker–trait associations. Effects of these marker–trait associations were estimated based on the estimation set. (3) We randomly selected 30 SNP markers excluding those with significant marker–trait associations, and estimated their effects in the estimation set. (4) We randomly selected 30 SNP markers among all SNPs and estimated their effects in the estimation set.

Genomic selection of FHB resistance

The cross-validated accuracy of prediction of phenotypic performance was at a significance threshold of P<0.005 on average 47% higher for genomic than marker-assisted selection across 9k SNP array, 90k SNP array and SSR marker-based data (Table 3 and Figure 5). Interestingly, we observed no significant differences in the accuracies of prediction between marker platforms. The mean accuracies of prediction also did not differ significantly across the three genomic selection models RR-BLUP, RKHSR and Bayes-Cπ. Re-sampling of reduced sets of markers revealed that the accuracy of prediction was already reaching a plateau at rather low marker densities (Figure 7).

Table 3 Means and s.e. of cross-validated prediction accuracies of genomic selection for Fusarium head blight resistance of 372 wheat lines based on the three different marker sets 9k SNP array, 90k SNP array and SSR markers and the three different biometrical models RR-BLUP, RKHSR and Bayes-Cπ
Figure 7
figure 7

Effect of the number of markers on the prediction accuracy of genomic selection. The total number of markers used is indicated in brackets.

Discussion

Relatedness strongly impacts the cross-validated prediction accuracy in association mapping studies

It has been suggested that high-throughput genome-wide association mapping studies have the potential to simultaneously answer two central questions in quantitative genetics, regarding (I) how many genes influence a particular trait and (II) what the allele distributions at these gene loci are (Wallace et al., 2013). For complex traits such as FHB resistance that are determined by a rather large number of genes, association mapping studies in wheat were often found to explain a surprisingly high amount of the genetic variation (Miedaner et al., 2011; Rutkoski et al., 2012; Kollers et al., 2013a). Our results are in accordance with these previous findings, as the marker-assisted selection approaches based on the three marker systems explained up to 78% of the phenotypic variation for the total population when a significance threshold of P<0.005 was applied. This number dropped in fivefold cross-validation (Figure 5) as was expected from previous results (Guo et al., 2012; Miedaner et al., 2013; Zhao et al., 2013; Gowda et al., 2014; Würschum and Kraft, 2014). Nevertheless, the cross-validated accuracy of prediction of FHB resistance is still high and in most cases higher than using randomly sampled markers (Figure 5), suggesting that genome-wide association mapping studies indeed have the potential to at least partially answer the two central questions on the genetic architecture of FHB resistance outlined above.

In order to maximize the information gained from a genome-wide mapping study, an optimum significance threshold reflecting a balance between false-discovery rate and power to detect QTL of interest needs to be chosen (Moreau et al., 1998). Applying increasingly conservative thresholds, we observed a monotonic decreasing trend in cross-validated values of prediction accuracies, which suggests that within the investigated range of P-values, increasing the power of QTL detection with a more relaxed significance threshold was more relevant than increasing the risk to detect false-positive QTL. However, a further explanation for this is that relatedness also contributes to the accuracy of prediction of marker-assisted selection. Every sequence polymorphism, also the ones defining QTL, will contribute to relatedness. This relatedness is then exploited in marker-assisted selection. This has been observed earlier in a genome-wide mapping study in a vast hybrid wheat population (Gowda et al., 2014) and in two bi-parental rye populations (Wang et al., 2014b). Relatedness can be portrayed more precisely with relaxed significance thresholds that allow a high number of SNPs to be used in the prediction models.

We also sampled a random set of markers corresponding to the number of QTL detected in the QTL mapping scans, and used this random set of markers to predict the phenotypic performance by applying fivefold cross-validation. Our finding that using significant marker–trait associations did lead to only marginal higher accuracy of prediction of FHB resistance in comparison with the random set of markers was surprising (Figure 5). It clearly indicated that the contribution of genetic relatedness to the prediction ability of significant markers is far from negligible. To confirm that, we compared the correlations between Rogers’ distances among the 372 lines estimated using full marker data set and that based on significant markers detected in the association mapping scan. We observed significant correlations that amounted to 0.28 (P<0.001) with only 4 significant SSR markers and 0.24 (P<0.001) with 21 significant SNP markers in the 9k array. Hence, our analysis underlined that genetic relatedness has a key role in the prediction of FHB resistance in marker-assisted selection. We further substantiated our results applying simulations (Figure 6). As a consequence, relatedness must be taken into account when interpreting the results on the genetic architecture of complex traits in genome-wide association mapping studies.

Influence of marker system and biometrical model on the accuracy of prediction in genomic selection

The non-linear, semi-parametric RKHSR has the potential to capture complex interactions better than RR-BLUP and Bayes-Cπ assuming linear relationships between phenotype and genotype (de los Campos et al., 2010). Previous genome-wide association mapping studies suggested presence of epistatic effects for FHB resistance (Miedaner et al., 2012). Thus, it is tempting to speculate that RKHSR outperforms RR-BLUP and Bayes-Cπ as observed in several previous genomic selection studies (Crossa et al., 2010, 2011; de los Campos et al., 2010; Heslot et al., 2012). In contrast to this expectation, however, no significant differences among the accuracies of the three applied genomic selection models were observed (Table 3). The variation of the accuracy of prediction of the applied genomic selection models across the different cross-validation scenarios also did not differ. Consequently, RR-BLUP, the model with the lowest requirement on computational time, seems to be the most promising model for implementation of genomic selection for FHB resistance in wheat breeding programs.

Genomic selection exploits to a large extent the relatedness among the plant material in estimation and test sets to predict the phenotypic performance. In total, 29% of the variation in Rogers’ distances observed with the 90k SNP array cannot be explained with the SSR marker data (Table 1). Thus, the choice of the marker set is expected to impact the accuracy of genomic selection. In addition, the bootstrap analyses of the principal component analyses revealed a more complex population structure for the SSR as compared with the SNP arrays (Figure 2). This slightly different picture is expected because of the different nature of SNP versus SSR markers, with SSR markers possessing a higher resolution to unravel the more recent population history (Würschum et al., 2013). Consequently, it is surprising that we did not observe differences in accuracies of prediction of the different marker systems. One possible explanation for the similar accuracies of prediction of the different marker systems is that the true relationship matrix for the 372 wheat lines lies in between the two matrices estimated with the SSR markers or the two SNP arrays. This would lead to similar accuracies of prediction of SSR and SNP marker data.

Conclusions

Our cross-validation study clearly underlined that the accuracy of prediction of marker-assisted selection is strongly influenced by relatedness. This was confirmed applying computer simulations. Our findings, however, also showed that relatedness can be better exploited by applying genomic selection instead of marker-assisted selection. Marker density had only a marginal impact on the prediction accuracy of genomic selection. Consequently, genomic selection as an attractive complementation to phenotypic selection of FHB resistance can be most efficiently implemented based on low- to medium-density marker panels.

Data Archiving

Relevant data sets have been added as Supplementary Tables S1 and S2.