Evaluating and improving power in whole-genome association studies using fixed marker sets

Abstract

Emerging technologies make it possible for the first time to genotype hundreds of thousands of SNPs simultaneously, enabling whole-genome association studies. Using empirical genotype data from the International HapMap Project, we evaluate the extent to which the sets of SNPs contained on three whole-genome genotyping arrays capture common SNPs across the genome, and we find that the majority of common SNPs are well captured by these products either directly or through linkage disequilibrium. We explore analytical strategies that use HapMap data to improve power of association studies conducted with these fixed sets of markers and show that limited inclusion of specific haplotype tests in association analysis can increase the fraction of common variants captured by 25–100%. Finally, we introduce a Bayesian approach to association analysis by weighting the likelihood of each statistical test to reflect the number of putative causal alleles to which it is correlated.
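To make the notion of "capture" concrete, the following is a minimal sketch, not the authors' pipeline. It assumes phased haplotypes coded as 0/1 alleles, and it treats a common SNP as captured when it is on the array itself or when its best pairwise r² with any array SNP meets the cutoff. The function names (`r_squared`, `fraction_captured`), the toy haplotypes and the SNP labels are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence, Set


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Pairwise r^2 between two biallelic SNPs given phased 0/1 haplotypes."""
    n = len(a)
    p_a = sum(a) / n                                  # allele-1 frequency at SNP a
    p_b = sum(b) / n                                  # allele-1 frequency at SNP b
    p_ab = sum(x & y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:                                    # a monomorphic SNP carries no LD signal
        return 0.0
    d = p_ab - p_a * p_b                              # linkage disequilibrium coefficient D
    return d * d / denom


def fraction_captured(
    haps: Dict[str, Sequence[int]],
    array_snps: Set[str],
    r2_cutoff: float = 0.8,
    min_maf: float = 0.05,
) -> float:
    """Fraction of common SNPs captured directly or by a pairwise proxy on the array."""
    def maf(h: Sequence[int]) -> float:
        f = sum(h) / len(h)
        return min(f, 1 - f)

    common = [s for s, h in haps.items() if maf(h) >= min_maf]
    captured = 0
    for s in common:
        if s in array_snps:
            best = 1.0                                # typed directly on the array
        else:
            best = max(
                (r_squared(haps[s], haps[t]) for t in array_snps),
                default=0.0,
            )
        if best >= r2_cutoff:
            captured += 1
    return captured / len(common) if common else 0.0


# Toy reference panel: 8 haplotypes, 4 SNPs (hypothetical data).
toy_haps = {
    "snpA": [1, 1, 1, 1, 0, 0, 0, 0],   # on the array
    "snpB": [1, 1, 1, 1, 0, 0, 0, 0],   # perfect proxy for snpA
    "snpC": [1, 0, 1, 0, 1, 0, 1, 0],   # uncorrelated with snpA
    "snpD": [1, 1, 1, 0, 0, 0, 0, 0],   # partially correlated with snpA
}
toy_array = {"snpA"}

coverage_strict = fraction_captured(toy_haps, toy_array, r2_cutoff=0.8)
coverage_loose = fraction_captured(toy_haps, toy_array, r2_cutoff=0.5)
```

Sweeping `r2_cutoff` over a range of values and recording the returned fraction gives a coverage curve for a fixed marker set. Multimarker (haplotype) predictors would need an additional step that is not sketched here.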

Figure 1: Fraction of common (MAF ≥ 5%) Phase II HapMap SNPs (y-axis) captured by array SNPs as a function of the r² cutoff (x-axis).
Figure 2: Fraction of SNPs (y-axis) captured by SNPs on GeneChip 100K and 500K arrays at r² ≥ 0.8 in the three HapMap panels: YRI, CEU and CHB+JPT.
Figure 3: Fraction of common SNPs (y-axis) captured by single-array SNPs versus multimarker predictors in three HapMap panels (YRI, CEU and CHB+JPT).

Acknowledgements

We acknowledge Affymetrix, Inc. and Illumina, Inc. for sharing product data. We also thank Affymetrix, Inc. for making public genotype data of the HapMap samples generated by the GeneChip Mapping 500K Array.

Author information

Corresponding authors

Correspondence to David Altshuler or Mark J Daly.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Fig. 1

Power of a Bayesian approach versus the existing frequentist approach. (PDF 21 kb)

Supplementary Fig. 2

Genotype relative risk as a function of the frequency of the causal variant. (PDF 20 kb)

About this article

Cite this article

Pe'er, I., de Bakker, P., Maller, J. et al. Evaluating and improving power in whole-genome association studies using fixed marker sets. Nat Genet 38, 663–667 (2006). https://doi.org/10.1038/ng1816

  • DOI: https://doi.org/10.1038/ng1816
