Abstract
Emerging technologies make it possible for the first time to genotype hundreds of thousands of SNPs simultaneously, enabling whole-genome association studies. Using empirical genotype data from the International HapMap Project, we evaluate the extent to which the sets of SNPs contained on three whole-genome genotyping arrays capture common SNPs across the genome, and we find that the majority of common SNPs are well captured by these products, either directly or through linkage disequilibrium. We explore analytical strategies that use HapMap data to improve the power of association studies conducted with these fixed sets of markers, and we show that limited inclusion of specific haplotype tests in association analysis can increase the fraction of common variants captured by 25–100%. Finally, we introduce a Bayesian approach to association analysis that weights the likelihood of each statistical test to reflect the number of putative causal alleles with which it is correlated.
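The notion of a common SNP being "captured" through linkage disequilibrium can be illustrated with a minimal sketch. It assumes phased haplotypes coded as 0/1 alleles and treats a target SNP as captured when its pairwise r² with at least one array SNP reaches a threshold. The function names and the 0.8 default threshold are illustrative assumptions, not values taken from the abstract, and the sketch covers only single-marker tests, not the haplotype tests or Bayesian weighting described above.

```python
from typing import Sequence


def r_squared(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Pairwise r^2 between two biallelic SNPs from phased haplotypes (alleles coded 0/1)."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and of equal length")
    p_a = sum(hap_a) / n  # frequency of allele 1 at the first SNP
    p_b = sum(hap_b) / n  # frequency of allele 1 at the second SNP
    # frequency of haplotypes carrying allele 1 at both SNPs
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


def fraction_captured(
    targets: Sequence[Sequence[int]],
    array_snps: Sequence[Sequence[int]],
    threshold: float = 0.8,  # assumed cutoff, chosen for illustration only
) -> float:
    """Fraction of target SNPs whose best r^2 with any array SNP meets the threshold."""
    if not targets:
        raise ValueError("at least one target SNP is required")
    captured = sum(
        1
        for t in targets
        if any(r_squared(t, s) >= threshold for s in array_snps)
    )
    return captured / len(targets)


# Toy data: one array SNP and three target SNPs over eight haplotypes.
array_snp = [0, 0, 1, 1, 0, 1, 0, 1]
targets = [
    [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the array SNP
    [1, 1, 0, 0, 1, 0, 1, 0],  # allele labels flipped, still perfectly correlated
    [0, 1, 0, 1, 0, 1, 0, 1],  # only weakly correlated with the array SNP
]
coverage = fraction_captured(targets, [array_snp])
```

In this toy example, a SNP typed directly on the array is the special case in which the target and the array SNP are the same marker, so r² equals 1.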
Acknowledgements
We acknowledge Affymetrix, Inc. and Illumina, Inc. for sharing product data. We also thank Affymetrix, Inc. for making public genotype data of the HapMap samples generated by the GeneChip Mapping 500K Array.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Fig. 1
Power of a Bayesian approach versus the existing frequentist approach. (PDF 21 kb)
Supplementary Fig. 2
Genotype relative risk as a function of the frequency of the causal variant. (PDF 20 kb)
Cite this article
Pe'er, I., de Bakker, P., Maller, J. et al. Evaluating and improving power in whole-genome association studies using fixed marker sets. Nat Genet 38, 663–667 (2006). https://doi.org/10.1038/ng1816
DOI: https://doi.org/10.1038/ng1816