
This study is the first application of phenotype prediction using whole-genome-sequencing data in a higher eukaryote.


Traditional genome-wide association studies (GWASs) test each variant separately to identify single variants that are significantly associated with a particular phenotype. Genome-based prediction instead uses information from across the whole genome simultaneously to explain the variability in the observed phenotype. Prediction methods may be useful in identifying the genetic causes of complex polygenic traits, although so far the application of genome-based prediction has been restricted by the number of SNPs available and the small proportion of SNPs that are typically included in such studies.

The authors analysed 2.5 million SNPs obtained from sequence data of 157 inbred lines from the recently published 'Drosophila Genetic Reference Panel' in an attempt to predict phenotypes of two complex traits: starvation stress resistance and startle-induced locomotor behaviour. Both traits respond rapidly to artificial selection and are genetically variable in natural populations.

After first creating a genomic relationship matrix, the authors used a genomic best linear unbiased prediction (GBLUP) method to evaluate predictive ability by cross-validation. GBLUP approaches take into account the covariance structure inferred from the genomic data. The authors compared GBLUP with a Bayesian approach and evaluated the effects of SNP density and training set size on predictive accuracy. Although the predictive abilities obtained were moderate, the authors state that this study provides proof of concept for this approach.
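The workflow described above (build a genomic relationship matrix, fit GBLUP, assess predictive ability by cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation: the genotypes and phenotypes are simulated, the relationship matrix uses a common centred-and-scaled construction that the text does not specify, and the heritability `h2` is fixed by assumption rather than estimated from the data, as it typically would be in practice. All function names are hypothetical.

```python
import numpy as np

def genomic_relationship_matrix(genotypes):
    """Build G from an (n_lines, n_snps) matrix coded as allele counts 0/1/2.

    Assumption: centred genotypes scaled by 2 * sum(p * (1 - p)); the
    article does not state which construction the authors used.
    """
    p = genotypes.mean(axis=0) / 2.0          # allele frequency per SNP
    keep = (p > 0) & (p < 1)                  # drop monomorphic SNPs
    Z = genotypes[:, keep] - 2.0 * p[keep]    # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p[keep] * (1.0 - p[keep])))

def gblup_predict(G, y, train, test, h2=0.5):
    """Predict test phenotypes from training lines through G.

    Assumption: h2 is supplied; lam is the residual-to-genetic variance ratio.
    """
    lam = (1.0 - h2) / h2
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]            # covariance among training lines
    G_st = G[np.ix_(test, train)]             # covariance of test with training
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G_st @ alpha

def cross_validate(G, y, k=5, h2=0.5, seed=0):
    """k-fold cross-validation; returns correlation of predicted and observed."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    preds = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)       # every line not in this fold
        preds[fold] = gblup_predict(G, y, train, fold, h2)
    return np.corrcoef(preds, y)[0, 1]

# Simulated stand-in data: 157 inbred lines (homozygous, coded 0 or 2)
# and 2,000 SNPs, far fewer than the 2.5 million in the study.
rng = np.random.default_rng(1)
n_lines, n_snps = 157, 2000
freq = rng.uniform(0.05, 0.95, size=n_snps)
genotypes = 2.0 * rng.binomial(1, freq, size=(n_lines, n_snps))
genetic = genotypes @ rng.normal(size=n_snps)
y = genetic + rng.normal(scale=genetic.std(), size=n_lines)  # roughly h2 = 0.5

G = genomic_relationship_matrix(genotypes)
r = cross_validate(G, y, k=5, h2=0.5)
print(f"cross-validated predictive ability (correlation): {r:.2f}")
```

Varying `n_snps` or restricting the training folds in this sketch mimics, in a simplified way, the effects of SNP density and training set size that the authors evaluated.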