Introduction

Genome-wide association studies (GWAS) and genomic selection (GS) are promising fields where genomic technologies are well integrated into plant breeding practices. GWAS have made it possible to dissect the genetic architecture of complex traits in more than a dozen plant species (Zhu et al., 2008). However, GWAS are less suitable for quantitative traits influenced by a large number of genes with small effects, so their utility in breeding is limited. GS overcomes this limitation by using all genomic information simultaneously to predict phenotypes, thus avoiding information loss and reducing biases in marker effect estimates (Desta and Ortiz, 2014). Moreover, GS can increase the efficiency of plant breeding because selection can be made early, before phenotypes are measured. GS has been applied to many aspects of breeding, such as inbred performance prediction, parental selection and hybrid prediction (Riedelsheimer et al., 2012a; Crossa et al., 2014; Xu et al., 2014; Wang et al., 2017).

Since Meuwissen et al. (2001) first proposed the concept of GS along with several models, numerous statistical methods, both parametric and nonparametric, have been used to predict quantitative traits. Parametric methods include best linear unbiased prediction (BLUP; Henderson, 1975), least absolute shrinkage and selection operator (LASSO; Tibshirani, 1996), partial least squares (PLS; Geladi and Kowalski, 1986) and Bayesian methods such as BayesA, BayesB and Bayesian LASSO (Yi and Xu, 2008; González-Recio and Forni, 2011); nonparametric methods include random forest (Svetnik et al., 2003), support vector machine (SVM; Maenhout et al., 2007) and reproducing kernel Hilbert spaces regression (RKHS; de los Campos et al., 2010). Recently, many investigators have evaluated the performance of various statistical methods used in GS. de los Campos et al. (2013) gave an overview of the parametric methods and concluded that BLUP performs well for most traits and BayesB yields slightly higher predictive accuracy for traits with large-effect quantitative trait loci (QTL). Riedelsheimer et al. (2012b) compared the predictive performance of five different GS methods for traits measured in maize inbred lines, and found that these methods differ only slightly in their predictive abilities. Heslot et al. (2012) used 10 GS methods to predict 18 traits measured in different species, and found that RKHS was the best performer overall across traits and species. Howard et al. (2014) compared the predictabilities of parametric and nonparametric methods using simulated data, and observed that parametric methods performed slightly better than nonparametric methods for traits whose genetic architecture has a larger additive component. However, all of the above comparisons were based on genomic data. As metabolomic and expression profiling technologies develop, metabolomic and transcriptomic data provide new sources for phenotypic prediction in several species, such as Arabidopsis thaliana, maize and rice (Meyer et al., 2007; Gärtner et al., 2009; Riedelsheimer et al., 2012a). It is still unknown how these parametric and nonparametric methods perform when metabolites and transcripts are used for prediction.

Although GWAS are not designed for detecting QTL for highly polygenic traits, they help us gain insights into the genetic architecture of several important traits in maize, including leaf architecture and disease resistance (Kump et al., 2011; Poland et al., 2011; Tian et al., 2011). Numerous statistical approaches have been proposed to perform GWAS, among which the mixed linear model is one of the most popular, as it is able to correct for population structure and family relatedness (Yu et al., 2006). Under the framework of the mixed linear model, several methods have been developed to reduce the computational demand, such as the efficient mixed model association (EMMA; Kang et al., 2008) and the genome-wide efficient mixed model association (GEMMA; Zhou and Stephens, 2012). These are single-locus methods that test the association between one locus and the trait of interest at a time. However, quantitative traits are influenced by a number of QTL, so models that consider a single locus at a time are misspecified and likely to give biased results (Gupta et al., 2013). In addition, single-locus methods usually require a multiple-testing correction of the P-value threshold, such as the Bonferroni correction, to control the Type 1 error rate. This criterion is too stringent and many true associations may be missed (Zhang et al., 2011). In contrast, multi-locus methods can overcome these problems because they use the genetic information of multiple loci simultaneously, and their multi-locus nature removes the need for multiple-testing corrections (Zhang et al., 2011). Multi-locus methods have been shown to perform better than single-locus methods. In multi-locus association studies, the number of markers is often larger than the sample size. LASSO is a powerful approach to address this problem, but it does not have a default method to calculate P-values for markers.

In this article, we used genomic, transcriptomic and metabolomic data to predict the performance of six agronomic traits measured from 339 diverse maize inbred lines using eight representative methods including BLUP, LASSO, PLS, BayesA and BayesB for the parametric methods and RKHS, support vector machine using the radial basis function kernel (SVM-RBF), support vector machine using the polynomial kernel function (SVM-POLY) for the nonparametric methods, and compared the predictive abilities of three omic data and eight different methods. We also provided a new method based on Bayesian theory to perform a significance test for LASSO estimated marker effects, and we compared the modified LASSO method with GEMMA in terms of their statistical power and Type 1 error through simulations. We also used the LASSO method to detect significant single-nucleotide polymorphisms (SNPs), metabolites and transcripts for the six agronomic traits. Finally, we performed BLUP analysis in conjunction with GWAS to see whether or not using markers selected according to the result of GWAS can improve the predictive abilities.

Materials and methods

Material collection

Three types of omic data (genomic, transcriptomic and metabolomic) collected from 339 maize inbred lines were used for prediction. All lines were genotyped using the Illumina MaizeSNP50 BeadChip (Ganal et al., 2011). RNA sequencing (RNA-seq) was subsequently performed on immature seeds collected 15 days after pollination from these 339 lines using 90-bp paired-end Illumina sequencing (Fu et al., 2013). A total of 100K SNPs and 28 769 gene expression traits (transcriptomic data) were obtained. Metabolic profiling was carried out on mature maize kernels and 748 metabolites were detected using high-throughput liquid chromatography-tandem mass spectrometry analysis (Wen et al., 2014). We analyzed six yield-related traits to evaluate the efficacy of prediction: (1) ear length (EL), (2) ear diameter (ED), (3) ear row number (RN), (4) kernel number per row (KN), (5) ear weight (EW) and (6) cob weight (CW). Each trait was measured in five replicated experiments (three locations in 2009 and another two locations in 2010). In each replicate, five plants from each line were sampled and the average phenotypic value was used for phenotypic analysis (Yang et al., 2014).

Methods of prediction

We used eight representative methods including five parametric methods (BLUP, LASSO, PLS, BayesA and BayesB) and three nonparametric methods (RKHS, SVM-RBF and SVM-POLY). The predictabilities were evaluated using a tenfold cross-validation where samples were randomly partitioned into 10 parts, 9 parts being used to estimate parameters and the remaining part being predicted. Thus, all the parts were predicted once and used nine times to estimate parameters. The predictive ability was defined as the correlation coefficient between the observed and predicted phenotypic values.
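The cross-validation scheme described above can be sketched in a few lines. This is only an illustration in Python with NumPy (the study itself used R packages); the function names, the toy data and the ridge predictor standing in for the eight methods are all assumptions made for the example.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, penalty=1.0):
    """Hypothetical stand-in predictor: ridge regression with an intercept."""
    mu_x, mu_y = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - mu_x
    coef = np.linalg.solve(Xc.T @ Xc + penalty * np.eye(Xc.shape[1]),
                           Xc.T @ (y_train - mu_y))
    return mu_y + (X_test - mu_x) @ coef

def cross_validated_predictions(X, y, fit_predict, n_folds=10, seed=0):
    """Randomly split samples into n_folds parts; each part is predicted once
    from a model fitted on the remaining parts."""
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), n_folds)
    y_hat = np.full(n, np.nan)
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        y_hat[test_idx] = fit_predict(X[train_idx], y[train_idx], X[test_idx])
    return y_hat

# Toy data (not the maize data): 200 samples, 50 predictors.
rng = np.random.default_rng(42)
X_toy = rng.normal(size=(200, 50))
y_toy = X_toy @ rng.normal(size=50) + rng.normal(size=200)

y_hat_toy = cross_validated_predictions(X_toy, y_toy, ridge_fit_predict)
# Predictive ability: correlation between observed and predicted values.
predictive_ability = np.corrcoef(y_toy, y_hat_toy)[0, 1]
print(round(predictive_ability, 3))
```

Any of the eight methods could be plugged in through the `fit_predict` argument, as long as it fits on the training part only and returns predictions for the held-out part.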

BLUP method

Let $y$ be an $n \times 1$ vector of phenotypic values of a quantitative trait for $n$ individuals. The phenotypic vector is described by the following linear mixed model,

$$y = X\beta + \sum_{k=1}^{m} Z_k\gamma_k + \varepsilon \tag{1}$$

where $X$ is an $n \times q$ design matrix, $\beta$ is a $q \times 1$ vector of fixed effects, $m$ is the number of markers, $Z_k=\{Z_{jk}\}$ is an $n \times 1$ vector of genotype indicators with $Z_{jk}=1$ for the homozygote of the major allele, $Z_{jk}=0$ for the heterozygote and $Z_{jk}=-1$ for the homozygote of the minor allele, $\gamma_k$ is a random effect of marker $k$ and $\varepsilon$ is an $n \times 1$ vector of residual errors. Assume that $\varepsilon \sim N(0, I_n\sigma^2)$ and $\gamma_k \sim N(0, \phi^2/m)$, where $\sigma^2$ is the residual variance and $\phi^2$ is a polygenic variance shared by all markers. The expectation of $y$ is $E(y)=X\beta$ and the variance–covariance matrix is

$$\mathrm{var}(y) = V = K\phi^2 + I_n\sigma^2 = (K\lambda + I_n)\sigma^2 \tag{2}$$

where $\lambda=\phi^2/\sigma^2$ is the variance ratio and $K$ is a marker-generated kinship matrix defined as

$$K = \frac{1}{m}\sum_{k=1}^{m} Z_k Z_k^{T} \tag{3}$$

The restricted maximum likelihood method was used to estimate parameters. When the sample size is large, it can be very costly to evaluate the likelihood function. The eigen-decomposition algorithm was therefore used to estimate parameters; details of this algorithm can be found in Xu et al. (2014).

Let us partition the total number of individuals into a training sample and a test sample. Let $Y_1$ be a vector of phenotypic values in the training sample and $Y_2$ be a vector of phenotypic values in the test sample. Accordingly, $X$ can be partitioned into $X_1$ and $X_2$. The kinship matrix and matrix $V$ are partitioned correspondingly, as shown below,

$$K = \begin{bmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{bmatrix}, \qquad V = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix} = \begin{bmatrix} K_{11}\phi^2 + I\sigma^2 & K_{12}\phi^2 \\ K_{21}\phi^2 & K_{22}\phi^2 + I\sigma^2 \end{bmatrix} \tag{4}$$

The BLUP prediction of $Y_2$ is also the conditional expectation of $Y_2$ given $Y_1$,

$$\hat{Y}_2 = E(Y_2 \mid Y_1) = X_2\hat\beta + \hat\phi^2 K_{21}\left(K_{11}\hat\phi^2 + I\hat\sigma^2\right)^{-1}\left(Y_1 - X_1\hat\beta\right) \tag{5}$$

where all the parameters are substituted by the restricted maximum-likelihood estimates from the training sample. The predictability is defined as the Pearson correlation between $y$ (the observed values) and $\hat{y}$ (the predicted values). The BLUP method was implemented in our own R program.
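As a rough illustration of the BLUP prediction step, the sketch below builds a kinship matrix from genotype codes and predicts a test sample from a training sample. It is written in Python rather than the authors' R program, the function names are hypothetical, and the variance ratio (polygenic variance divided by residual variance) is passed in as a known value instead of being estimated by restricted maximum likelihood.

```python
import numpy as np

def kinship(Z):
    """Marker-generated kinship: Z Z^T / m for an n x m matrix of genotype codes."""
    return Z @ Z.T / Z.shape[1]

def blup_predict(K, X, y, train, test, lam):
    """BLUP of the test phenotypes given the training phenotypes.
    lam is the variance ratio phi^2 / sigma^2 (assumed known in this sketch)."""
    K11 = K[np.ix_(train, train)]
    K21 = K[np.ix_(test, train)]
    X1, X2, y1 = X[train], X[test], y[train]
    H = lam * K11 + np.eye(len(train))            # V11 divided by sigma^2
    beta = np.linalg.solve(X1.T @ np.linalg.solve(H, X1),
                           X1.T @ np.linalg.solve(H, y1))   # generalized least squares
    return X2 @ beta + lam * K21 @ np.linalg.solve(H, y1 - X1 @ beta)

# Toy data (not the maize data): phi^2 = sigma^2 = 1, so lam = 1.
rng = np.random.default_rng(0)
n, m = 150, 300
Z = rng.choice([-1.0, 0.0, 1.0], size=(n, m))
gamma = rng.normal(scale=np.sqrt(1.0 / m), size=m)
y = 2.0 + Z @ gamma + rng.normal(size=n)
X = np.ones((n, 1))                               # intercept as the only fixed effect

K = kinship(Z)
idx = rng.permutation(n)
train, test = idx[:120], idx[120:]
y2_hat = blup_predict(K, X, y, train, test, lam=1.0)
print(np.corrcoef(y[test], y2_hat)[0, 1])
```

Setting `lam=0` removes the polygenic term, in which case every test individual is predicted by the fixed-effect part alone (here, the training mean).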

LASSO method

LASSO is a constrained form of ordinary least squares with the sum of the absolute values of the regression coefficients being smaller than a constant (Tibshirani, 1996). LASSO was first proposed as a tool in GS by Usai et al. (2009). In this study, LASSO was implemented in the R/glmnet package (Friedman et al., 2010).
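For reference, the constrained form described above can be written as follows. The symbols $t$ and $\lambda$ are generic tuning constants introduced here for illustration, and the model is shown without an intercept for brevity.

```latex
\hat{b} = \arg\min_{b}\; \sum_{j=1}^{n}\Bigl(y_j - \sum_{k=1}^{m} Z_{jk} b_k\Bigr)^{2}
\quad \text{subject to} \quad \sum_{k=1}^{m} |b_k| \le t ,
\qquad \text{or equivalently} \qquad
\hat{b} = \arg\min_{b}\; \sum_{j=1}^{n}\Bigl(y_j - \sum_{k=1}^{m} Z_{jk} b_k\Bigr)^{2} + \lambda \sum_{k=1}^{m} |b_k| .
```

Each bound $t$ corresponds to some penalty $\lambda \ge 0$; a smaller $t$ (larger $\lambda$) forces more coefficients to exactly zero, which is what makes LASSO usable for variable selection.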

PLS method

The PLS method incorporates principal component analysis into the multiple linear regression model. It transforms the original data into a new set of linearly uncorrelated components that serve as predictors of the phenotype. However, it differs from principal component analysis in that the components are constructed by maximizing the covariance between the response variable and the components. The PLS method was implemented in the R package pls (Mevik and Wehrens, 2007).

BayesA and BayesB

They are two popular Bayesian approaches to genomic prediction. The only difference between these two methods lies in the prior distribution of parameters. BayesA assumes that the prior distribution of variances across markers follows a scaled inverse chi-square distribution, while BayesB assumes that the prior distribution is a two-component mixture with one component being a scaled inverse chi-square distribution and the other being a point mass at 0. All parameters in BayesA and BayesB were sampled using the Gibbs sampling algorithm and the Markov chain Monte Carlo algorithm (Meuwissen et al., 2001). BayesA and BayesB were implemented in an R package called BGLR (Perez and de los Campos, 2014).
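The two priors can be contrasted in symbols. The notation below is introduced here for illustration and is not taken from the article: $\sigma_k^2$ is the variance of marker effect $\gamma_k$, $\nu$ and $S$ are the degrees of freedom and scale of the scaled inverse chi-square distribution, and $\pi$ is the prior probability that a marker has no effect.

```latex
\gamma_k \mid \sigma_k^2 \sim N(0, \sigma_k^2), \qquad
\text{BayesA: } \sigma_k^2 \sim \chi^{-2}(\nu, S), \qquad
\text{BayesB: } \sigma_k^2
\begin{cases}
= 0 & \text{with probability } \pi, \\
\sim \chi^{-2}(\nu, S) & \text{with probability } 1 - \pi .
\end{cases}
```

Under this notation BayesA is the special case of BayesB with $\pi = 0$.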

SVM method

It is a kernel-based learning method for classification and regression. Maenhout et al. (2007) first applied this method to predict maize hybrid performance. SVM implicitly maps the input data into a high-dimensional feature space via a kernel function (for example, polynomial, Gaussian radial basis function, hyperbolic tangent kernel, the linear kernel). We chose the radial basis function (SVM-RBF) and the polynomial kernel functions (SVM-POLY; Karatzoglou et al., 2004), and implemented these two algorithms using an R package called kernlab.

RKHS method

The RKHS method has proven to be an efficient machine learning tool and has been used in many areas, such as spatial statistics and smoothing splines (de los Campos et al., 2010). Gianola et al. (2006) first applied the RKHS method to genomic prediction. The reproducing kernel is a key factor of model specification in RKHS. Both single-kernel models and multi-kernel models can be fitted in RKHS. Perez and de los Campos (2014) showed that the multi-kernel model is very useful for kernel selection. Here, we chose the multi-kernel approach and implemented the method in the R/BGLR package (Perez and de los Campos, 2014).

The websites for all the R software packages of the prediction methods used in this study are listed in Supplementary Table S1.

Integrating multiple omic data

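For the BLUP method, the integration model is defined as

$$y = X\beta + \sum_{k=1}^{m} G_k\alpha_k + \sum_{h=1}^{p} T_h\delta_h + \sum_{i=1}^{q} M_i\gamma_i + \varepsilon \tag{6}$$

where $X\beta$ represents some fixed effects; $G$, $T$ and $M$ are indicator variables of genome, transcriptome and metabolome, respectively; $m$, $p$ and $q$ are the numbers of SNPs, transcripts and metabolites, respectively; $\alpha_k$, $\delta_h$ and $\gamma_i$ are effects of SNPs, transcripts and metabolites with $N(0,\phi_G^2/m)$, $N(0,\phi_T^2/p)$ and $N(0,\phi_M^2/q)$ distributions, respectively; $\varepsilon$ is a vector of residual errors. The expectation of $y$ is $E(y)=X\beta$ and the variance–covariance matrix is $\mathrm{var}(y)=V$, where

$$V = K_G\phi_G^2 + K_T\phi_T^2 + K_M\phi_M^2 + I\sigma^2 \tag{7}$$

and $K_G$, $K_T$ and $K_M$ are kinship matrices computed from the SNPs, transcripts and metabolites, respectively.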
The variance components were estimated using the restricted maximum-likelihood method. The procedure of prediction is the same as the above BLUP method used for a single omic data set. For the other seven methods, we rescaled the predictors and combined three omic data together as the overall predictors for further prediction.
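The two integration strategies can be sketched as follows. This is a hedged illustration only: the article does not state how the predictors were rescaled, so column-wise standardization is an assumption, as are the function names, the toy matrices and the variance values. Building each kernel from standardized predictors is likewise a choice made for the example.

```python
import numpy as np

def standardize(A):
    """Centre each column and scale it to unit variance (constant columns become zero)."""
    A = np.asarray(A, dtype=float)
    sd = A.std(axis=0)
    sd[sd == 0] = 1.0
    return (A - A.mean(axis=0)) / sd

def combined_covariance(kernels, variances, sigma2):
    """Covariance of y when each omic source has its own kernel and variance component."""
    n = kernels[0].shape[0]
    V = sigma2 * np.eye(n)
    for K, phi2 in zip(kernels, variances):
        V = V + phi2 * K
    return V

# Toy matrices standing in for SNPs (G), transcripts (T) and metabolites (M).
rng = np.random.default_rng(0)
n = 20
G = rng.choice([-1.0, 0.0, 1.0], size=(n, 40))
T = rng.normal(size=(n, 30))
M = rng.normal(size=(n, 10))

# Strategy for the non-BLUP methods: one rescaled, concatenated predictor matrix.
combined = np.hstack([standardize(G), standardize(T), standardize(M)])

# Strategy for BLUP: one kernel per source, each with its own (hypothetical) variance.
kernels = [A @ A.T / A.shape[1] for A in (standardize(G), standardize(T), standardize(M))]
V = combined_covariance(kernels, variances=[0.5, 0.3, 0.2], sigma2=1.0)
print(combined.shape, V.shape)
```

The difference between the two strategies is that the kernel form lets each data source carry its own weight, whereas simple concatenation treats all predictors alike.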

The LASSO method for GWAS

LASSO is a popular method for variable selection and we applied it to detect significant markers. The LASSO method was implemented using the R package glmnet. However, the software does not provide a standard error for an estimated effect. Here we adopted the Bayesian method of Xu (2013) to approximate the standard error for each selected marker effect. The LASSO model can be redefined as

$$y = \sum_{k} X_k b_k + \varepsilon \tag{8}$$

where $y$ is an $n \times 1$ vector of the phenotypic values, $X_k$ is an $n \times 1$ design vector for the $k$th selected marker, $b_k$ is the effect of this marker and $\varepsilon$ is an $n \times 1$ vector of residual errors. All markers in this model are selected markers (with non-zero effects). Let $\hat{b}_k$ be the LASSO estimated effect for marker $k$ and $\mathrm{var}(\hat{b}_k)$ be the variance of $\hat{b}_k$, which are interpreted as the Bayesian posterior mean and posterior variance, respectively. Let $\tilde{b}_k$ be the estimated marker effect from the data alone and its variance is defined as

$$\mathrm{var}(\tilde{b}_k) = \left(X_k^{T}X_k\right)^{-1}\hat\sigma^2 \tag{9}$$
where $\hat\sigma^2$ is the estimated residual error variance, which is defined as

where

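is the hat matrix. Here, $X$ denotes the design matrix for all markers with non-zero effects after LASSO variable selection. The above residual error variance is the estimated residual variance from a generalized cross-validation analysis (Golub et al., 1979). This residual variance corrects for the overfitting caused by too many predictors in the model. Let $\tau_k^2$ be the prior variance of $b_k$. The prior variance can be defined as the expectation of $b_k^2$,

$$\tau_k^2 = E\left(b_k^2\right) = \hat{b}_k^2 + \mathrm{var}(\hat{b}_k) \tag{12}$$

The posterior variance can be obtained from the prior variance and the variance from the data, and described as

$$\mathrm{var}(\hat{b}_k) = \left[\frac{1}{\tau_k^2} + \frac{1}{\mathrm{var}(\tilde{b}_k)}\right]^{-1} \tag{13}$$

Substituting equation (13) into equation (12) yields

$$\tau_k^2 = \hat{b}_k^2 + \left[\frac{1}{\tau_k^2} + \frac{1}{\mathrm{var}(\tilde{b}_k)}\right]^{-1} \tag{14}$$

Solving for $\tau_k^2$, we get

$$\tau_k^2 = \frac{1}{2}\left[\hat{b}_k^2 + \sqrt{\hat{b}_k^4 + 4\hat{b}_k^2\,\mathrm{var}(\tilde{b}_k)}\right] \tag{15}$$

Substituting equation (15) into equation (13), we will have an estimated $\mathrm{var}(\hat{b}_k)$. Given the LASSO estimate $\hat{b}_k$, we have a Wald test statistic for $H_0: b_k=0$,

$$W_k = \frac{\hat{b}_k^2}{\mathrm{var}(\hat{b}_k)} \tag{16}$$

Assuming that $W_k$ follows a chi-square distribution with one degree of freedom, the P-value is calculated from

$$p_k = 1 - \Pr\left(\chi_1^2 < W_k\right) \tag{17}$$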
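The chain from a LASSO estimate to a P-value can be sketched numerically. The code below is an interpretation of the procedure described in this section, not the authors' implementation: it assumes the prior variance equals the squared posterior mean plus the posterior variance, and that the posterior precision is the sum of the prior precision and the precision of the data-only estimate. The function and argument names are hypothetical, and the data-only variance is supplied directly rather than computed from a generalized cross-validation residual variance.

```python
import math

def lasso_wald_test(b_hat, v_data):
    """b_hat: non-zero LASSO estimate of a marker effect (taken as the posterior mean).
    v_data: variance of the marker effect estimated from the data alone.
    Returns (prior variance, posterior variance, Wald statistic, P-value)."""
    b2 = b_hat ** 2
    # Positive root of tau2^2 - b2 * tau2 - b2 * v_data = 0.
    tau2 = 0.5 * (b2 + math.sqrt(b2 * b2 + 4.0 * b2 * v_data))
    # Posterior variance: harmonic combination of prior and data variances.
    s2 = 1.0 / (1.0 / tau2 + 1.0 / v_data)
    wald = b2 / s2
    # Upper tail of a chi-square with 1 df: P(|N(0,1)| > sqrt(W)) = erfc(sqrt(W / 2)).
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return tau2, s2, wald, p_value

tau2, s2, wald, p_value = lasso_wald_test(b_hat=0.3, v_data=0.01)
print(tau2, s2, wald, p_value)
```

Because LASSO shrinks estimates toward zero, the posterior variance obtained this way is smaller than the data-only variance, so the Wald statistic is at least as large as the squared estimate divided by the data-only variance.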

Simulation studies for GWAS

To test the power and Type 1 error of the proposed LASSO method for GWAS, we performed simulation experiments based on the genotypic data of 339 maize inbred lines. We assigned a total of 10 QTL distributed on the first eight chromosomes of the maize genome. The last two chromosomes contained no QTL and were used to evaluate the Type 1 error. The proportion of the phenotypic variance contributed by the 10 simulated QTL was 60%. Detailed information about the 10 simulated QTL is shown in Table 1. The polygenic and residual error variances were set at φ2=1 and σ2=1, respectively. We also simulated population structure effects using the first four principal components of the marker data. The population structure explained 10% of the total phenotypic variation. Phenotypes were simulated as the sum of the effects of the 10 QTL, the polygenic effect, the residual error and the population structure effect. We also compared the results of our method with GEMMA (Zhou and Stephens, 2012) in the simulation studies. A total of 100 replications were generated and analyzed by both the LASSO method and the GEMMA method. The statistical power of a QTL was calculated as the proportion of replicates where the P-value of the QTL was less than 0.05 for the LASSO method and 0.05/m or 1/m for the GEMMA method. The Type 1 error was defined as the average proportion of false positives for all markers in the last two chromosomes that contain no QTL.
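The two summary statistics defined above can be computed from a matrix of P-values with replicates in rows and markers in columns. The sketch below uses a tiny made-up matrix purely to show the bookkeeping; the function name and the numbers are illustrative assumptions.

```python
import numpy as np

def power_and_type1(pvals, qtl_idx, null_idx, threshold):
    """pvals: replicates x markers array of P-values.
    Power: per-QTL proportion of replicates with P-value below the threshold.
    Type 1 error: average proportion of false positives over markers known to carry no QTL."""
    hits = pvals < threshold
    power = hits[:, qtl_idx].mean(axis=0)
    type1 = hits[:, null_idx].mean()
    return power, type1

# Made-up P-values: 4 replicates, 2 simulated QTL (columns 0-1), 3 null markers (columns 2-4).
pvals = np.array([
    [0.001, 0.20, 0.60, 0.03, 0.90],
    [0.010, 0.04, 0.70, 0.50, 0.80],
    [0.300, 0.01, 0.02, 0.40, 0.70],
    [0.002, 0.60, 0.90, 0.30, 0.60],
])
power, type1 = power_and_type1(pvals, qtl_idx=[0, 1], null_idx=[2, 3, 4], threshold=0.05)
print(power, type1)
```

For GEMMA the same function would be called with the threshold set to 0.05/m or 1/m instead of 0.05.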

Table 1 True effects and statistical powers of 10 simulated QTL and Type 1 error rates for the modified LASSO method and GEMMA drawn from 100 replicated simulation experiments

Results

Comparison of predictive abilities

The predictive abilities of the six traits in maize obtained from all the eight methods (BLUP, LASSO, PLS, BayesA and BayesB, RKHS, SVM-RBF and SVM-POLY) are presented in Table 2. For genome, traits RN and ED have the highest predictive abilities across all methods, followed by traits EL and EW, with trait KN being the worst predictable trait. The largest differences in predictive ability among the eight methods range from 0.02 to 0.12 for the same trait. For transcriptome, the average predictive abilities of all traits are lower than those obtained from genome, and the predictive abilities are highest for RN (0.55) and lowest for CW (0.33). For CW and EL, the predictive abilities vary greatly across different methods with SVM-POLY being the best and LASSO being the worst, but for the other traits, the eight approaches have similar performances. For metabolome, the predictive abilities for the six traits are lower than those from genomic prediction, and metabolomic predictions for CW and EL are only around half of the genomic predictions. Large differences in predictive ability (>0.2) are observed between LASSO and BayesB for traits CW, ED and EW.

Table 2 Predictive abilities of six traits from three sources of omic data using eight statistical methods

Using the predictive abilities of all 3 × 6 × 8=144 omic-trait-method combinations, we performed analyses of variances under a factorial design. All main effects and two-way interaction effects are significant except the interaction effect of method × trait (Table 3). Results of multiple comparisons for the main effects are illustrated in Figure 1. Predictabilities of the three omic data are significantly different, with genomic prediction being the best followed by transcriptomic and metabolomic predictions (Figure 1a). Among the six traits, RN and ED are the best predictable traits followed by EW and KN, and CW is the worst (Figure 1b). By comparing eight methods, BLUP performs the best and BayesB performs the worst, with other methods ranging between the two (Figure 1c). All two-way interaction effects are given in Supplementary Data S1, from which we find that RKHS is the best for genome prediction and metabolome prediction, whereas it is not efficient for transcriptome prediction. Although BLUP is not the best for each omic prediction, it consistently ranks near the top. BayesB works well in genomic prediction and transcriptomic prediction. However, it performs poorly for metabolomic prediction, which has an enormous negative impact on the overall performance of BayesB.

Table 3 Analyses of variances of predictability from a 3 × 8 × 6 factorial design with three sources of omic data, eight prediction methods and six traits
Figure 1

Multiple comparisons illustrated by boxplots. In each panel, different capital letters above the group labels indicate significant differences between groups. In each box plot, the plus sign represents the mean predictability, the box spans the first and third quartiles, the bold line in the box marks the second quartile (median) and the open circles represent outliers. (a) Compares the predictabilities of the three omic data across six traits and eight methods. (b) Compares the predictabilities for the six traits over three omic data and eight methods. (c) Compares the predictabilities of the eight methods across three omic data and six traits.

Combined prediction

We also combined all three omic data into a single model to perform a combined prediction. Overall, the combined prediction has no obvious advantage over the best single omic prediction (Figure 2). For the BLUP method, combining data from different sources slightly improves the prediction for all traits except KN, whereas for other methods, the combined prediction rarely increases the predictive ability compared with the use of single source of data. For trait EW, metabolomic prediction is better than combined prediction when using LASSO, PLS, RKHS and SVM-RBF.

Figure 2

Comparison of predictability for three sources of omic data and the combined analysis of the three omic data over six traits and eight methods. The three omic data are genomic, transcriptomic and metabolomic data. The six traits are labeled as CW, ED, EL, EW, KN and RN. The eight statistical methods are BLUP, LASSO, PLS, BayesA, BayesB, RKHS, SVM-RBF and SVM-POLY.

Simulation studies for GWAS

We used LASSO and other methods to predict six quantitative traits of maize. LASSO, however, can also be used for genome-wide association studies. We compared our LASSO method with GEMMA for GWAS under two criteria of Bonferroni correction (GEMMA-A and GEMMA-B). The statistical powers and Type 1 errors obtained from 100 replicated simulations for the 10 QTL are given in Table 1. In general, both LASSO and GEMMA are powerful for QTL with large simulated effects that explain more than six percent of the phenotypic variance. The LASSO method has substantially higher power for the four small QTL than the GEMMA method, regardless of which P-value criterion is used. The Type 1 error is well controlled in all cases, with GEMMA-B providing the best control, followed by LASSO and GEMMA-A. Overall, the LASSO method performs better than GEMMA-A in both statistical power and Type 1 error. Although GEMMA-B achieves better control of Type 1 error than the LASSO method, it has much lower power to detect small QTL.

GWAS for six traits of maize using LASSO and GEMMA

Manhattan plots of all six traits of maize using the GEMMA and LASSO methods are shown in Figure 3. When we set the Bonferroni-corrected P-value threshold at 0.05/m=5.0E−7 for the GEMMA method, no SNPs were detected for any of the six traits. This criterion may be too stringent for GEMMA, so we set the threshold at 1/m=1.0E−5. The criterion for LASSO remains at 0.05 because it is a multiple-marker model. A total of eight SNPs for three agronomic traits (CW, EW and RN) were identified by the two GWAS methods, of which four SNPs were detected by LASSO and the other four by GEMMA (Table 4). Neither method detected any significant SNP associated with the other three traits (ED, EL and KN). With GEMMA, two SNPs associated with CW were detected on chromosomes 2 and 7. The LASSO method also detected one SNP for CW on chromosome 2. Two SNPs influencing EW, located on chromosomes 5 and 8, were identified by GEMMA and LASSO, respectively. All three SNPs associated with RN are located on chromosome 1; the one detected by the LASSO method is located 28 kb upstream of a known gene, ZmADF3 (GRMZM2G060702), a key regulator of actin dynamics in plant cells, which has an important role in kernel development (Qiao et al., 2016).

Figure 3

Manhattan plots for six traits obtained from two GWAS methods (GEMMA and LASSO). The six traits are labeled as CW, ED, EL, EW, KN and RN. The dashed blue horizontal line in each Manhattan plot depicts the significance threshold and the red dots indicate significant SNPs. The significance thresholds are 1.002 × 10^−5 (after Bonferroni correction) and 0.05 for GEMMA and LASSO, respectively. A full color version of this figure is available at the Heredity journal online.

Table 4 Significant SNPs identified for six traits using GEMMA and the modified LASSO method

Metabolome-wide association studies using LASSO and GEMMA

We used the LASSO and GEMMA methods to detect significant metabolites associated with the six agronomic traits. Only two metabolites (n499 and n790) were detected for two traits (EN and KW) by GEMMA at the Bonferroni-corrected threshold (0.05/m=6.7E−05). The LASSO method identified a total of 15 significant metabolites for the six traits, which include the two metabolites detected by GEMMA (Supplementary Data S2). Some metabolites are significantly associated with more than one trait. For example, both metabolites n0710 and n0768 control CW and EW, and metabolite n0967 has a significant effect on three traits (EW, EL and KN). These metabolites may have an important role in maize ear development. Several metabolites have been detected in other species, such as n0710, n0075 and n0691. All the significant metabolites detected by LASSO explain a small fraction of phenotypic variation, and the strongest metabolite (n0499) only explains 3% of phenotypic variation for trait KN. However, this is not to say that the detected metabolites are not important. The small proportion of phenotypic variance explained may be due to the shrinkage nature of the LASSO method. It is worth noting that the number of metabolites is far less than the number of SNPs, whereas the number of significant metabolites is greater than the number of significant SNPs.

Transcriptome-wide association studies using LASSO and GEMMA

The LASSO and GEMMA methods were also used to detect significant transcripts associated with the six agronomic traits. No significant transcripts were identified for the six traits by GEMMA at Bonferroni-corrected P-value threshold (0.05/m =1.74E-6). Four significant transcripts for three agronomic traits (ED, KN and EW) were identified by LASSO (Supplementary Table S2). The two transcripts, GRMZM2G045243 and GRMZM2G126128, influencing EW were detected on chromosomes 2 and 4, respectively, both of which are protein-coding genes. Functions of the other two transcripts remain unknown. The strongest transcript (GRMZM2G001648) only explains 1.3% of phenotypic variation for trait ED. This may explain why GEMMA fails to detect any transcripts.
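The thresholds quoted for the three association scans follow from simple arithmetic, reproduced below as a quick check. The metabolite and transcript counts are those given in the Materials and methods; the SNP count of 100 000 is an assumption based on the rounded "100K" figure, so the SNP thresholds are approximate.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for n_tests tests at family-wise level alpha."""
    return alpha / n_tests

# Number of tests per scan; the SNP count is an approximation (see lead-in).
counts = {"SNPs": 100_000, "metabolites": 748, "transcripts": 28_769}

thresholds = {
    name: {"0.05/m": bonferroni_threshold(0.05, m), "1/m": bonferroni_threshold(1.0, m)}
    for name, m in counts.items()
}
for name, t in thresholds.items():
    print(f"{name}: 0.05/m = {t['0.05/m']:.3g}, 1/m = {t['1/m']:.3g}")
```

The 1/m criterion is simply 20 times more lenient than 0.05/m, which is why it recovers associations that the stricter threshold misses.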

Genomic prediction using selected markers from GWAS

In a usual genomic prediction study, genome-wide markers are simultaneously included in a single model to predict the phenotypic values of a trait. However, most people outside the genomic selection community believe that markers with small or no effects on a trait may be detrimental to genomic selection if included in the model. They prefer using only selected markers that are associated with the trait of interest for prediction. In this study, we address the question of whether using selected markers can improve genomic selection. We used markers selected by GWAS with the GEMMA method to predict phenotypes with the BLUP method under two different scenarios. Scenario A: markers were selected from the whole sample and only the selected markers were used in the prediction, where predictabilities were drawn from 10-fold cross-validation. Scenario B: markers were selected within folds, where a GWAS was performed on each training sample and the markers selected from the training sample were used to predict the trait values of the test sample. The markers were selected based on the following sequence of P-value thresholds: 0.01, 0.05, 0.10, 0.2, 0.3, 0.4, 0.5 and 1.0, where a threshold of 1.0 is equivalent to using all markers for prediction. The predictive abilities obtained from these two scenarios are illustrated in Figure 4. Figure 4a (the top panel) shows the result of scenario A, where markers were selected from the whole sample. When the P-value is small, the predictabilities are very high and they continue to increase until they reach a plateau when P≈0.05. After the plateau, the predictabilities start to decline and eventually reach their minimum values when P=1.0. This trend can mislead many investigators because cross-validation using markers selected from the whole sample does not reflect the true prediction: the test samples have already contributed to marker selection, so the predictabilities are seriously biased upward. Using this result to report predictability is a kind of ‘cheating’, though unintentional in many cases. Figure 4b (the bottom panel) represents the actual predictabilities when markers were selected from training samples only. When the P-values are very small, the predictabilities are very low in four of the six traits. As the P-value increases, the predictability starts to increase and then quickly reaches a plateau. A further increase in P-value does not change the predictability very much. Overall, the integration of GWAS and prediction can significantly improve predictive ability in scenario A, but fails to increase predictive ability in scenario B. As scenario A cannot be achieved in actual genomic selection programs, we conclude that using selected markers for genomic selection does not help very much.
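The upward bias of scenario A can be reproduced with a small simulation in which the phenotype is pure noise, so the true predictability is zero. This sketch is only an illustration: ranking markers by absolute correlation stands in for GWAS P-values, ridge regression stands in for BLUP, and all sizes and names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, n_keep, n_folds = 120, 3000, 20, 10
Z = rng.choice([-1.0, 0.0, 1.0], size=(n, m))   # toy genotype codes
y = rng.normal(size=n)                           # phenotype unrelated to every marker

def top_markers(Z_sub, y_sub, k):
    """Indices of the k markers with the largest absolute correlation with y_sub."""
    Zc = Z_sub - Z_sub.mean(axis=0)
    yc = y_sub - y_sub.mean()
    score = np.abs(Zc.T @ yc) / (np.linalg.norm(Zc, axis=0) + 1e-12)
    return np.argsort(score)[-k:]

def ridge_predict(Z_tr, y_tr, Z_te, penalty=1.0):
    """Ridge regression with an intercept, used here in place of BLUP."""
    mu_z, mu_y = Z_tr.mean(axis=0), y_tr.mean()
    A = Z_tr - mu_z
    coef = np.linalg.solve(A.T @ A + penalty * np.eye(A.shape[1]), A.T @ (y_tr - mu_y))
    return mu_y + (Z_te - mu_z) @ coef

def cv_ability(select_within_folds):
    """Cross-validated correlation with markers chosen inside or outside the folds."""
    folds = np.array_split(np.random.default_rng(2).permutation(n), n_folds)
    whole_sample_choice = top_markers(Z, y, n_keep)   # scenario A: uses test samples too
    y_hat = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        if select_within_folds:                       # scenario B: training samples only
            cols = top_markers(Z[train], y[train], n_keep)
        else:
            cols = whole_sample_choice
        y_hat[test] = ridge_predict(Z[train][:, cols], y[train], Z[test][:, cols])
    return np.corrcoef(y, y_hat)[0, 1]

ability_a = cv_ability(select_within_folds=False)
ability_b = cv_ability(select_within_folds=True)
print("scenario A:", ability_a, "scenario B:", ability_b)
```

Even though no marker has any real effect, selecting markers on the whole sample produces a clearly positive cross-validated correlation, while selecting within folds does not.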

Figure 4

Predictive abilities of BLUP for six traits using selected markers obtained via GWAS. Markers included in the prediction model are selected at seven different levels of P-value: 0.01, 0.05, 0.10, 0.2, 0.3, 0.4 and 0.5 (horizontal axis). (a) Markers are selected from the whole sample before cross-validation. (b) Markers are selected within folds of cross-validation.

Discussion

In this study, the average predictive ability across all traits and methods was 0.38 from metabolomic data, 0.43 from transcriptomic data and 0.51 from genomic data. The genome is still the most important predictor for maize. Riedelsheimer et al. (2012a) predicted seven heterotic traits in hybrid maize using 56 110 SNPs and 130 metabolites and found that the average predictive ability across the seven traits was 0.73 from the genome and 0.57 from the metabolome. Gärtner et al. (2009) used 110 genetic markers and 181 metabolic markers to predict heterosis in Arabidopsis thaliana and also found that the predictive ability from the metabolome was slightly lower than that from the genome. Although metabolites have proven to be useful in phenotypic prediction, they have the limitation that they are measured at a specific moment, while some traits change dynamically across developmental stages (Riedelsheimer et al., 2012a). In addition, we performed a combined prediction using the three omic data and found no benefit from the combined analysis across traits and methods. However, Gärtner et al. (2009) reported that combining metabolite and SNP data leads to a substantial improvement in predictive ability. This may be because they used a small number of genetic markers that were not able to capture information from the entire genome.

We also observed that the BLUP method slightly improved the combined prediction for most traits, while other methods slightly decreased the combined prediction for most traits. This may be because we assigned three different variances to three different sources of data in the mixed model analysis and these different variances were eventually used for BLUP prediction, whereas we simply combined the three types of predictors, albeit standardized, and placed them in a single model for other methods. Therefore, if we can give different sources of omic data a different set of weights, we may improve the combined prediction for other methods.

From the comparison of different prediction methods, we found that the BLUP method is the overall best performer, while BayesB is the worst one. Many studies have shown that genetic architecture strongly affects the differences in predictive ability among prediction methods (Coster et al., 2010; Clark et al., 2011). The GWAS performed on this population did not detect any large-effect QTL, which suggests a polygenic genetic architecture for these agronomic traits. In the simulation study of Daetwyler et al. (2010), BLUP was not affected by the number of QTL, whereas BayesB outperformed BLUP when the number of QTL was low but performed poorly compared with BLUP when the number of QTL was high. Coster et al. (2010) also found that the predictive ability of selective shrinkage methods (LASSO and BayesB) decreased as the number of simulated QTL increased, whereas the PLS method was insensitive to the number of QTL. However, some analyses of real data showed only small differences in predictive performance between methods, regardless of the number and effects of QTL. Overall, selective shrinkage methods perform better for traits controlled by a few QTL with relatively large effects, and BLUP is better suited to highly polygenic traits. In addition, we observed that the predictive abilities obtained with the parametric and nonparametric methods were similar. It has been demonstrated that parametric methods have difficulty capturing complex interactions such as epistatic effects, whereas nonparametric methods perform well for traits with epistatic genetic architectures (Gianola et al., 2006; Howard et al., 2014). Therefore, the similar predictive performance of parametric and nonparametric methods in our study suggests that epistatic genetic effects may be negligible for these agronomic traits.

Currently, there is no method that fits all data universally well. However, BLUP is often the best choice because its performance is generally good for all traits with omic data. In addition, BLUP is computationally more efficient than other methods because marker effects do not need to be estimated. The fact that different methods perform differently across traits and across populations (Xu et al., 2014) leads to a new strategy of genomic selection. We should use all available methods to perform genomic selection and report the result from the ‘best’ method. Essentially, we are treating ‘method’ as a parameter, and the best method is the estimate of that parameter that maximizes predictability.
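
Treating ‘method’ as a parameter can be sketched as a search over candidate methods scored by cross-validated predictability. The example below is a minimal illustration under stated assumptions, not the procedure used in this study. The helper names (`ridge_fit_predict`, `cv_predictability`, `best_method`) are hypothetical, the data are simulated, and two ridge regressions with different penalties stand in for the actual candidate methods such as BLUP, LASSO or BayesB.

```python
import numpy as np

def ridge_fit_predict(lam):
    """Return a fit-and-predict function for ridge regression with penalty lam."""
    def run(X_tr, y_tr, X_te):
        mu = y_tr.mean()
        x_mean = X_tr.mean(axis=0)
        Xc = X_tr - x_mean
        # Dual form: an n x n solve, cheaper when markers outnumber individuals.
        alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y_tr)), y_tr - mu)
        return mu + (X_te - x_mean) @ Xc.T @ alpha
    return run

def cv_predictability(method, X, y, k=5, seed=0):
    """Correlation between observed values and k-fold cross-validated predictions."""
    idx = np.random.default_rng(seed).permutation(len(y))
    pred = np.empty(len(y))
    for fold in np.array_split(idx, k):
        tr = np.setdiff1d(idx, fold)
        pred[fold] = method(X[tr], y[tr], X[fold])
    return np.corrcoef(y, pred)[0, 1]

def best_method(methods, X, y):
    """Score every candidate and return the name with the highest predictability."""
    scores = {name: cv_predictability(f, X, y) for name, f in methods.items()}
    return max(scores, key=scores.get), scores

# Simulated data; the two candidates are placeholders for real methods.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 300))
y = X[:, :20] @ rng.normal(size=20) + rng.normal(size=100)
methods = {"ridge_small": ridge_fit_predict(1.0), "ridge_large": ridge_fit_predict(100.0)}
chosen, scores = best_method(methods, X, y)
```

All candidates are scored on the same folds (same `seed`), so the comparison among methods is made on identical training and validation splits.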

In this study, we provided an effective way to calculate the P-value of each marker for GWAS using the LASSO method. Although nonparametric methods, such as the bootstrap, can also be used to calculate the standard error of an estimated marker effect and eventually provide a P-value, they are often computationally costly. Simulation studies based on real genotype data of the maize population showed that the LASSO method performed well, with high power and low Type I error. One advantage of the multi-locus method over a genome-scanning approach is that no multiple-testing correction of the P-values is needed. However, this method has its own limitation in that the number of markers cannot be too large, say >500k, because simultaneous estimation of that many effects in a single model is a real challenge without resorting to a parallel computing scheme. In that case, we can perform multi-locus analysis on individual chromosomes. Recently, several two-step multi-locus methods have been developed to overcome that limitation (Li et al., 2011; Wang et al., 2016). The first step of these methods is to select a small fraction of markers using a less stringent criterion; the second step is to use the selected markers to conduct a multi-locus analysis. One issue with these methods is how to choose the appropriate critical value for marker selection in the first step.
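
The general shape of a two-step multi-locus analysis can be sketched as follows. This is an illustration under stated assumptions and does not reproduce the methods of Li et al. (2011) or Wang et al. (2016). The first step uses a single-marker test with a normal approximation to the test statistic, the second step uses ridge regression as a stand-in for the multi-locus model, and the names (`marginal_pvalues`, `two_step`), the threshold `first_step_alpha` and the simulated genotypes are all hypothetical.

```python
import math
import numpy as np

def marginal_pvalues(X, y):
    """Single-marker P-values from the marker-trait correlation.

    Uses a normal approximation to the t statistic (an assumption
    that is reasonable only for moderately large samples).
    """
    n = len(y)
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = (y - y.mean()) / y.std()
    r = Xc.T @ yc / n                      # Pearson correlation per marker
    t = r * np.sqrt((n - 2) / (1.0 - r**2))
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in t])

def two_step(X, y, first_step_alpha=0.05, lam=1.0):
    """Step 1: lenient marginal screen. Step 2: joint fit on retained markers."""
    p = marginal_pvalues(X, y)
    keep = np.flatnonzero(p < first_step_alpha)
    Xs = X[:, keep] - X[:, keep].mean(axis=0)
    # Ridge regression stands in for the multi-locus model of the second step.
    effects = np.linalg.solve(Xs.T @ Xs + lam * np.eye(len(keep)),
                              Xs.T @ (y - y.mean()))
    return keep, effects

# Simulated genotypes coded 0/1/2; the first three markers are causal.
rng = np.random.default_rng(1)
n, m = 300, 1000
X = rng.binomial(2, 0.5, size=(n, m)).astype(float)
y = 2.0 * X[:, :3].sum(axis=1) + rng.normal(size=n)
keep, effects = two_step(X, y)
```

The value of `first_step_alpha` controls the trade-off noted above: a lenient threshold retains more true signals but passes more markers to the second step, whereas a strict one risks discarding small-effect loci before they can be modelled jointly.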

We have already shown that using selected markers for genomic prediction does not improve predictability. This does not mean that we cannot select markers for genomic selection. Figure 4b shows that when P=0.10 is used to select markers, the predictabilities of most traits have already reached a plateau. The number of markers that pass this criterion is about 9000 on average across traits. When a DNA chip is designed for genomic selection, a chip with 9K markers can be substantially cheaper than a chip with 90K markers. Therefore, selecting markers for genomic selection can be beneficial if genotyping more markers brings a proportional increase in cost.