It is widely recognized that a limitation of genome-wide association studies using single-nucleotide polymorphism (SNP) arrays is that only a proportion of genetic variants is assayed, and low-frequency variants are missed. This limitation can be overcome using whole-genome sequencing, and authors from the Cohorts for Heart and Aging Research in Genetic Epidemiology (CHARGE) Consortium have now demonstrated how this approach can be used to examine the genetic architecture of a complex trait: namely, levels of high-density lipoprotein cholesterol (HDL-C).

Morrison et al. sequenced the genomes of 962 individuals to an average depth of sixfold; this sample size and depth give a high probability of identifying variants that have a frequency of >0.5%. Indeed, rare variants (those with a minor allele frequency of <1%) made up ~65% of the ~25 million variants identified in this study.
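The sample-size half of that claim can be checked with simple arithmetic. The sketch below is illustrative only and is not the authors' calculation: it assumes diploid individuals, independent draws of alleles, and it ignores sequencing depth, so it gives the chance that a variant is present in the sample rather than the chance that it is actually called at sixfold coverage. The function name is hypothetical.

```python
def prob_variant_sampled(allele_freq: float, n_individuals: int) -> float:
    """Probability that at least one copy of an allele with the given
    population frequency is carried by a sample of diploid individuals.

    Assumes independent sampling of 2 * n_individuals chromosomes and
    says nothing about whether sequencing reads would detect the allele.
    """
    n_chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


if __name__ == "__main__":
    # 962 individuals, as in the study; 0.5% is the frequency quoted in the text.
    for freq in (0.005, 0.001, 0.0005):
        p = prob_variant_sampled(freq, 962)
        print(f"allele frequency {freq:.4f}: P(present in sample) = {p:.4f}")
```

Under these assumptions, a variant at 0.5% frequency is almost certain to be carried by at least one of the 962 individuals, so in practice the limiting factor for detection is more likely to be read depth than sample size.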

The authors grouped variants as 'rare', 'low-frequency' or 'common', and using a linear mixed model approach they assessed the contribution of variants within each group to the heritability of HDL-C levels. The data in this study suggest that common variants and rare variants account for 61.8% and 7.8%, respectively, of the variance in HDL-C levels. Along with results from previous studies, this finding suggests that many variants with small effects contribute to the genetic architecture of this trait. The authors also found that a few of the individuals with HDL-C levels in the tails of the distribution carried rare variants with large effects that had been previously identified in Mendelian conditions that affect lipoprotein levels.

This paper presents examples that illustrate how whole-genome association data can be analysed using different approaches, such as a sliding window that aggregates the contribution of rare variants, as well as single-SNP analyses. The 'landscape' of associations across the genome can also be compared with other genome annotations: for example, to assess the contribution of rare and common variants in regulatory regions. Thus, this work shows the benefits of taking a whole-genome sequencing approach to complex trait association studies, an approach that could be extended in larger studies and refined as genome annotations continue to improve.
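To make the sliding-window idea concrete, here is a minimal sketch of one common way to aggregate rare variants: a per-individual count of rare alleles within each window, which could then be tested against a trait. The source does not describe the authors' exact procedure, so the window definition (base pairs), the 1% cutoff applied here, and all function and parameter names are assumptions made for illustration.

```python
from typing import List, Tuple


def rare_variant_burden(
    positions: List[int],
    mafs: List[float],
    genotypes: List[List[int]],
    window_bp: int,
    step_bp: int,
    maf_cutoff: float = 0.01,
) -> List[Tuple[int, int, List[int]]]:
    """Slide a window along one chromosome and, for each window that
    contains at least one rare variant, return
    (window_start, window_end, per-individual rare-allele counts).

    positions[j] and mafs[j] describe variant j; genotypes[i][j] is the
    number of minor alleles (0, 1 or 2) carried by individual i.
    """
    if not positions:
        return []
    results = []
    start, last = min(positions), max(positions)
    while start <= last:
        end = start + window_bp  # window covers [start, end)
        rare_idx = [
            j for j, (pos, maf) in enumerate(zip(positions, mafs))
            if start <= pos < end and maf < maf_cutoff
        ]
        if rare_idx:
            burden = [sum(row[j] for j in rare_idx) for row in genotypes]
            results.append((start, end, burden))
        start += step_bp
    return results


if __name__ == "__main__":
    # Toy data: three variants, two individuals (values are made up).
    windows = rare_variant_burden(
        positions=[100, 150, 400],
        mafs=[0.005, 0.2, 0.002],
        genotypes=[[1, 2, 0], [0, 1, 1]],
        window_bp=200,
        step_bp=100,
    )
    for start, end, burden in windows:
        print(start, end, burden)
```

Pooling rare alleles in this way trades resolution for power: any single rare variant is carried by too few individuals to test on its own, whereas the summed burden in a window can be compared across individuals with high and low trait values.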