  • Technical Report
Genome-wide efficient mixed-model analysis for association studies

Abstract

Linear mixed models have attracted considerable attention recently as a powerful and effective tool for accounting for population stratification and relatedness in genetic association tests. However, existing methods for exact computation of standard test statistics are computationally impractical for even moderate-sized genome-wide association studies. To address this issue, several approximate methods have been proposed. Here, we present an efficient exact method, which we refer to as genome-wide efficient mixed-model association (GEMMA), that makes approximations unnecessary in many contexts. This method is approximately n times faster than the widely used exact method known as efficient mixed-model association (EMMA), where n is the sample size, making exact genome-wide association analysis computationally practical for large numbers of individuals.
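The speed-up described above rests on a general property of this class of models: once the relatedness matrix is eigendecomposed, the likelihood can be evaluated cheaply for any value of the variance ratio. The following is a minimal sketch of that idea only, not the GEMMA algorithm itself. It assumes a model of the form y = Xb + u + e with covariance proportional to H = lam*K + I, and it replaces a proper numerical optimizer with a simple grid search over lam. The function names (`rotate`, `profile_loglik`, `lrt_snp`) and the grid are hypothetical and chosen for illustration.

```python
import math
import numpy as np

def rotate(K, y, W):
    """Eigendecompose the relatedness matrix K once; rotate y and covariates W."""
    d, U = np.linalg.eigh(K)          # K = U diag(d) U^T
    d = np.clip(d, 0.0, None)         # guard against tiny negative eigenvalues
    return d, U, U.T @ y, U.T @ W

def profile_loglik(lam, d, y_rot, X_rot):
    """ML log-likelihood at variance ratio lam, maximized over effects and scale.

    In rotated coordinates H = lam*K + I is diagonal with entries lam*d + 1,
    so no n-by-n inverse or determinant is needed for each new lam.
    """
    n = y_rot.shape[0]
    v = lam * d + 1.0                              # eigenvalues of H
    Xw = X_rot / v[:, None]                        # H^{-1} X (rotated)
    beta = np.linalg.solve(X_rot.T @ Xw, Xw.T @ y_rot)  # generalized least squares
    r = y_rot - X_rot @ beta
    rss = np.sum(r * r / v)                        # r^T H^{-1} r
    ll = -0.5 * (n * np.log(2.0 * np.pi * rss / n) + n + np.sum(np.log(v)))
    return ll, beta

def lrt_snp(d, y_rot, W_rot, x_rot, lams):
    """Likelihood ratio test for one SNP, using a grid search over lam (assumption)."""
    X1 = np.column_stack([W_rot, x_rot])
    ll0 = max(profile_loglik(l, d, y_rot, W_rot)[0] for l in lams)
    ll1 = max(profile_loglik(l, d, y_rot, X1)[0] for l in lams)
    stat = 2.0 * (ll1 - ll0)
    # Chi-square (1 d.f.) tail probability; only approximate under a grid search.
    pval = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, pval

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 100, 500
    G = rng.standard_normal((n, p))                # simulated, standardized genotypes
    K = G @ G.T / p                                # simple relatedness estimate
    y = rng.standard_normal(n)                     # simulated phenotype
    W = np.ones((n, 1))                            # intercept only
    d, U, y_rot, W_rot = rotate(K, y, W)
    lams = np.logspace(-3, 2, 50)
    print(lrt_snp(d, y_rot, W_rot, U.T @ G[:, 0], lams))
```

The eigendecomposition is computed once and reused for every SNP; only the rotation of each SNP vector and the per-SNP search over lam are repeated. An exact method would replace the grid search with an optimizer that converges to the maximum.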

Figure 1: Comparison of GEMMA with EMMA, EMMAX and GRAMMAR on HMDP HDL-C data and WTCCC Crohn's disease data.

References

  1. Kang, H.M. et al. Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42, 348–354 (2010).

  2. Kang, H.M., Ye, C. & Eskin, E. Accurate discovery of expression quantitative trait loci under confounding from spurious and genuine regulatory hotspots. Genetics 180, 1909–1925 (2008).

  3. Kang, H.M. et al. Efficient control of population structure in model organism association mapping. Genetics 178, 1709–1723 (2008).

  4. Listgarten, J., Kadie, C., Schadt, E.E. & Heckerman, D. Correction for hidden confounders in the genetic analysis of gene expression. Proc. Natl. Acad. Sci. USA 107, 16465–16470 (2010).

  5. Price, A.L., Zaitlen, N.A., Reich, D. & Patterson, N. New approaches to population stratification in genome-wide association studies. Nat. Rev. Genet. 11, 459–463 (2010).

  6. Yu, J. et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat. Genet. 38, 203–208 (2006).

  7. Zhang, Z. et al. Mixed linear model approach adapted for genome-wide association studies. Nat. Genet. 42, 355–360 (2010).

  8. Lippert, C. et al. FaST linear mixed models for genome-wide association studies. Nat. Methods 8, 833–835 (2011).

  9. Aulchenko, Y.S., Ripke, S., Isaacs, A. & van Duijn, C.M. GenABEL: an R library for genome-wide association analysis. Bioinformatics 23, 1294–1296 (2007).

  10. Aulchenko, Y.S., de Koning, D.J. & Haley, C. Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis. Genetics 177, 577–585 (2007).

  11. Abney, M., Ober, C. & McPeek, M.S. Quantitative-trait homozygosity and association mapping and empirical genomewide significance in large, complex pedigrees: fasting serum-insulin level in the Hutterites. Am. J. Hum. Genet. 70, 920–934 (2002).

  12. Guan, Y. & Stephens, M. Practical issues in imputation-based association mapping. PLoS Genet. 4, e1000279 (2008).

  13. Howie, B.N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).

  14. Knuth, D.E. Big Omicron and big Omega and big Theta. ACM SIGACT News. 8, 18–24 (1976).

  15. Bennett, B.J. et al. A high-resolution association mapping panel for the dissection of complex traits in mice. Genome Res. 20, 281–290 (2010).

  16. The Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).

  17. Lee, S.H., van der Werf, J.H., Hayes, B.J., Goddard, M.E. & Visscher, P.M. Predicting unobserved phenotypes for complex traits from whole-genome SNP data. PLoS Genet. 4, e1000231 (2008).

  18. Meyer, K. Estimating variances and covariances for multivariate animal models by restricted maximum likelihood. Genet. Sel. Evol. 23, 67–83 (1991).

  19. Searle, S.R., Casella, G. & McCulloch, C.E. Variance Components. (Wiley, New York, 2006).

  20. Henderson, C.R. Applications of Linear Models in Animal Breeding (University of Guelph, Guelph, Canada, 1984).

Acknowledgements

This research is supported in part by grants from the US National Institutes of Health (NIH) (HL092206 to Y. Gilad and HG02585 to M.S.). We thank A.J. Lusis for making the mouse genotype and phenotype data available. This study also makes use of data generated by the WTCCC (ref. 15). A full list of the investigators who contributed to the generation of the data is available from the WTCCC website. Funding for the WTCCC project was provided by the Wellcome Trust (award 085475).

Author information

Contributions

X.Z. and M.S. designed the study, developed methods and wrote the manuscript. X.Z. implemented software and analyzed data.

Corresponding authors

Correspondence to Xiang Zhou or Matthew Stephens.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1 and 2, Supplementary Table 1 and Supplementary Note (PDF 365 kb)

About this article

Cite this article

Zhou, X., Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat Genet 44, 821–824 (2012). https://doi.org/10.1038/ng.2310

  • DOI: https://doi.org/10.1038/ng.2310
