Genome-wide efficient mixed-model analysis for association studies

Journal name: Nature Genetics
Volume: 44
Pages: 821–824
DOI: 10.1038/ng.2310

Abstract

Linear mixed models have attracted considerable attention recently as a powerful and effective tool for accounting for population stratification and relatedness in genetic association tests. However, existing methods for exact computation of standard test statistics are computationally impractical for even moderate-sized genome-wide association studies. To address this issue, several approximate methods have been proposed. Here, we present an efficient exact method, which we refer to as genome-wide efficient mixed-model association (GEMMA), that makes approximations unnecessary in many contexts. This method is approximately n times faster than the widely used exact method known as efficient mixed-model association (EMMA), where n is the sample size, making exact genome-wide association analysis computationally practical for large numbers of individuals.
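The abstract does not describe the algorithm itself. As a rough illustration of the kind of computation involved, the following is a minimal sketch of a mixed-model association test in which the relatedness matrix is eigendecomposed once and the result is reused for every SNP. It is a generic device, not GEMMA's or EMMA's actual implementation. The function names (`rotate`, `profile_loglik`, `snp_lrt`), the grid search over the variance ratio `lam`, and the use of a likelihood ratio statistic are all assumptions made for illustration.

```python
import numpy as np


def rotate(K, y, W):
    """Eigendecompose the relatedness matrix K once; rotate phenotype and covariates."""
    d, U = np.linalg.eigh(K)           # K = U diag(d) U^T
    d = np.clip(d, 0.0, None)          # guard against tiny negative eigenvalues
    return d, U, U.T @ y, U.T @ W


def profile_loglik(lam, y_rot, X_rot, d):
    """Maximized log-likelihood of y ~ N(X b, s2 * (lam * K + I)) at a fixed ratio lam.

    In the eigenbasis of K the covariance is diagonal, so each evaluation
    needs only weighted least squares with weights 1 / (lam * d + 1).
    """
    n = y_rot.shape[0]
    w = 1.0 / (lam * d + 1.0)
    XtWX = X_rot.T @ (w[:, None] * X_rot)
    XtWy = X_rot.T @ (w * y_rot)
    beta = np.linalg.solve(XtWX, XtWy)
    resid = y_rot - X_rot @ beta
    s2 = np.sum(w * resid ** 2) / n    # ML estimate of the residual variance
    ll = -0.5 * (n * np.log(2.0 * np.pi * s2) + np.sum(np.log(lam * d + 1.0)) + n)
    return ll, beta


def snp_lrt(x, d, U, y_rot, W_rot, lam_grid=np.logspace(-5, 5, 101)):
    """Likelihood ratio statistic and effect estimate for a single SNP x.

    Assumption: lam is chosen by a coarse grid search, not a proper optimizer.
    The null fit does not depend on x and could be cached across SNPs.
    """
    X_alt = np.column_stack([W_rot, U.T @ x])
    ll_null = max(profile_loglik(lam, y_rot, W_rot, d)[0] for lam in lam_grid)
    ll_alt, beta = max(
        (profile_loglik(lam, y_rot, X_alt, d) for lam in lam_grid),
        key=lambda pair: pair[0],
    )
    return 2.0 * (ll_alt - ll_null), beta[-1]
```

In a genome-wide scan under this sketch, `rotate` would be called once and `snp_lrt` once per SNP, so the cubic-cost eigendecomposition is not repeated for each marker. Here `W` is an n-by-c covariate matrix (for example a column of ones for the intercept) and `x` is a length-n genotype vector.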

Author information

Affiliations

  1. Department of Human Genetics, University of Chicago, Chicago, Illinois, USA.

    • Xiang Zhou &
    • Matthew Stephens
  2. Department of Statistics, University of Chicago, Chicago, Illinois, USA.

    • Matthew Stephens

Contributions

X.Z. and M.S. designed the study, developed methods and wrote the manuscript. X.Z. implemented software and analyzed data.

Competing financial interests

The authors declare no competing financial interests.

Supplementary information

PDF files

  1. Supplementary Text and Figures (377 KB)

    Supplementary Figures 1 and 2, Supplementary Table 1 and Supplementary Note
