  • Technical Report

Rapid variance components–based method for whole-genome association analysis

Abstract

Variance component tests used in genome-wide association studies (GWAS) with large sample sizes become computationally demanding when the number of genetic markers exceeds a few hundred thousand. We present an extremely fast variance components–based two-step method, GRAMMAR-Gamma, developed as an analytical approximation within the framework of the score test approach. Using simulated and real human GWAS data sets, we show that this method provides unbiased estimates of the SNP effect and has power close to that of the likelihood ratio test–based method. The computational complexity of our method is close to its theoretical minimum, that is, to the complexity of an analysis that ignores genetic structure. The running time of our method depends linearly on sample size, whereas this dependency is quadratic for other existing methods. Simulations suggest that GRAMMAR-Gamma may be used for association testing in whole-genome resequencing studies of large human cohorts.
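The linear versus quadratic running-time claim can be illustrated with a small numerical sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the per-SNP statistic is taken to be a score-type ratio, the correction factor `gamma` is taken to be an average of g'V⁻¹g / g'g over a subset of SNPs, the phenotype is centered by its plain mean, and the relationship matrix `K` is random toy data. The function names are hypothetical and are not the authors' implementation.

```python
import numpy as np

def estimate_gamma(G, V_inv):
    """Step 1 (done once): average of g'V^{-1}g / g'g over the supplied SNPs."""
    num = np.einsum("ij,ij->j", G, V_inv @ G)  # g_j' V^{-1} g_j for each column j
    den = np.einsum("ij,ij->j", G, G)          # g_j' g_j for each column j
    return float(np.mean(num / den))

def approx_stats(G, y_t, gamma):
    """Step 2: per-SNP statistic using only length-n vectors, O(n) per SNP."""
    num = (G.T @ y_t) ** 2
    den = gamma * np.einsum("ij,ij->j", G, G)
    return num / den

def exact_stats(G, y_t, V_inv):
    """Reference statistic that needs V^{-1} g for every SNP, O(n^2) per SNP."""
    num = (G.T @ y_t) ** 2
    den = np.einsum("ij,ij->j", G, V_inv @ G)
    return num / den

# Toy data (all values hypothetical).
rng = np.random.default_rng(0)
n, m = 200, 500
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
G -= G.mean(axis=0)                  # center genotypes per SNP

A = rng.normal(size=(n, n))
K = A @ A.T / n                      # stand-in for a relationship matrix
V = 0.5 * K + 0.5 * np.eye(n)        # assumed phenotypic covariance
V_inv = np.linalg.inv(V)

y = rng.normal(size=n)
y_t = V_inv @ (y - y.mean())         # transformed residuals, computed once

gamma = estimate_gamma(G[:, :100], V_inv)  # estimated from a subset of SNPs
t_fast = approx_stats(G, y_t, gamma)       # no n x n matrix in the SNP loop
t_full = exact_stats(G, y_t, V_inv)        # n x n matrix used for every SNP
```

In this sketch the only operations involving the n × n matrix are performed once, before the genome scan, which is why the per-SNP cost of `approx_stats` grows linearly with sample size while that of `exact_stats` grows quadratically.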

Figure 1: Comparison of mixed model–based methods.
Figure 2: Correspondence between GRAMMAR-Gamma and FASTA test statistics.
Figure 3: Run time on a single processor for different two-step methods, namely, EMMAX, mmscore realization of FASTA, FaST-LMM and GRAMMAR-Gamma.

Acknowledgements

We thank A. Kirichenko, D. Fabregat Traver and P. Bientinesi for technical support and advice and M. Axenovich, D. Balding, P. Borodin and W. Astle for discussion. This work was supported by grants from the Russian Foundation for Basic Research (RFBR), Programs of the Russian Academy of Sciences and the RFBR-Helmholtz Joint Research Groups program (research project 12-04-91322-).

Author information

Contributions

G.R.S. developed the GRAMMAR-Gamma statistical test, ran the simulations and analyzed the simulated data. N.M.B. analyzed human and A. thaliana data and designed figures and tables. C.M.v.D. provided the human data and supervised its analyses. T.I.A. and Y.S.A. jointly designed and supervised the project and wrote the paper. All authors contributed to critical review of the manuscript during its preparation.

Corresponding author

Correspondence to Yurii S Aulchenko.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Tables 1–6, Supplementary Figures 1 and 2 and Supplementary Note (PDF 797 kb)

About this article

Cite this article

Svishcheva, G., Axenovich, T., Belonogova, N. et al. Rapid variance components–based method for whole-genome association analysis. Nat Genet 44, 1166–1170 (2012). https://doi.org/10.1038/ng.2410

  • DOI: https://doi.org/10.1038/ng.2410
