Efficiency and power in genetic association studies

de Bakker, Paul I W; Yelensky, Roman; Pe'er, Itsik; Gabriel, Stacey B; Daly, Mark J; Altshuler, David

doi:10.1038/ng1669

Article
Published: 23 October 2005

Paul I W de Bakker^1,2,3,4^,
Roman Yelensky^1,2,5^,
Itsik Pe'er^1,4^,
Stacey B Gabriel^4^,
Mark J Daly^1,4,6^ &
David Altshuler^1,2,3,4,6,7^

Nature Genetics volume 37, pages 1217–1223 (2005)

Abstract

We investigated selection and analysis of tag SNPs for genome-wide association studies by specifically examining the relationship between investment in genotyping and statistical power. Do pairwise or multimarker methods maximize efficiency and power? To what extent is power compromised when tags are selected from an incomplete resource such as HapMap? We addressed these questions using genotype data from the HapMap ENCODE project, association studies simulated under a realistic disease model, and empirical correction for multiple hypothesis testing. We demonstrate a haplotype-based tagging method that uniformly outperforms single-marker tests and methods for prioritization that markedly increase tagging efficiency. Examining all observed haplotypes for association, rather than just those that are proxies for known SNPs, increases power to detect rare causal alleles, at the cost of reduced power to detect common causal alleles. Power is robust to the completeness of the reference panel from which tags are selected. These findings have implications for prioritizing tag SNPs and interpreting association studies.
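
As a rough illustration of the pairwise tagging idea mentioned in the abstract, the sketch below greedily picks tag SNPs so that every SNP reaches a chosen r² threshold with at least one tag. It is not the authors' method or the Tagger implementation. The function names, the 0.8 threshold, and the toy haplotypes are assumptions made for illustration only.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared correlation (r^2) between two biallelic SNPs.

    a, b: equal-length lists of 0/1 alleles across phased haplotypes.
    """
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(x & y for x, y in zip(a, b)) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # monomorphic SNP: r^2 is undefined, treat as uncorrelated
    d = pab - pa * pb  # linkage disequilibrium coefficient D
    return d * d / denom


def greedy_pairwise_tags(haplotypes, threshold=0.8):
    """Greedy pairwise tagging (illustrative, hypothetical helper).

    haplotypes: dict mapping SNP name -> list of 0/1 alleles.
    Returns a dict mapping each chosen tag -> sorted list of SNPs it captures.
    """
    names = list(haplotypes)
    # proxies[s] holds every SNP that s captures at r^2 >= threshold (itself included)
    proxies = {s: {s} for s in names}
    for s, t in combinations(names, 2):
        if r_squared(haplotypes[s], haplotypes[t]) >= threshold:
            proxies[s].add(t)
            proxies[t].add(s)

    untagged = set(names)
    tags = {}
    while untagged:
        # pick the SNP that captures the most still-untagged SNPs
        best = max(names, key=lambda s: len(proxies[s] & untagged))
        captured = proxies[best] & untagged
        tags[best] = sorted(captured)
        untagged -= captured
    return tags


# Toy data (made up): eight phased haplotypes at four SNPs.
toy_haplotypes = {
    "snpA": [0, 0, 1, 1, 0, 1, 0, 1],
    "snpB": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to snpA
    "snpC": [1, 1, 0, 0, 1, 0, 1, 0],  # complement of snpA
    "snpD": [0, 1, 0, 1, 0, 1, 0, 1],  # weakly correlated with the others
}

print(greedy_pairwise_tags(toy_haplotypes))
```

On the toy data, snpA is selected as the tag for snpA, snpB and snpC, and snpD tags itself, so two genotyped SNPs stand in for four. A multimarker approach of the kind the abstract compares against would additionally allow haplotypes of several tags to serve as proxies, which this sketch does not attempt.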

**Figure 1: Distributions of the test statistic in a typical ENCODE region.**

**Figure 2: Efficiency afforded by a tagging approach.**

**Figure 3: Efficiency and power for various tagging strategies.**

**Figure 4: Effect of tagging from an incomplete reference panel on testing burden and power.**

**Figure 5: Effect of exhaustive haplotype tests on statistical power.**

Acknowledgements

We thank N. Patterson, E. Lander, J. Hirschhorn and S. Schaffner for discussions; J. Barrett and J. Maller for their implementation of Tagger in Haploview; the Broad Systems Group for technical assistance; and members of the Analysis group of the International HapMap Project for many useful interactions. D.A. is a Charles E. Culpeper Scholar of the Rockefeller Brothers Fund and a Burroughs Wellcome Fund Clinical Scholar in Translational Research. This work was supported by grants from the US National Institutes of Health.

Author information

Paul I W de Bakker and Roman Yelensky: These authors contributed equally to this work.

Authors and Affiliations

1. Center for Human Genetic Research, Massachusetts General Hospital, 185 Cambridge Street, CPZN-6818, Boston, Massachusetts 02114-2790, USA
Paul I W de Bakker, Roman Yelensky, Itsik Pe'er, Mark J Daly & David Altshuler
2. Department of Molecular Biology, Massachusetts General Hospital, 185 Cambridge Street, CPZN-6818, Boston, Massachusetts 02114-2790, USA
Paul I W de Bakker, Roman Yelensky & David Altshuler
3. Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
Paul I W de Bakker & David Altshuler
4. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
Paul I W de Bakker, Itsik Pe'er, Stacey B Gabriel, Mark J Daly & David Altshuler
5. Harvard-MIT Division of Health Sciences and Technology, Cambridge, Massachusetts, USA
Roman Yelensky
6. Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
Mark J Daly & David Altshuler
7. Diabetes Unit, Massachusetts General Hospital, 185 Cambridge Street, CPZN-6818, Boston, Massachusetts 02114-2790, USA
David Altshuler

Corresponding authors

Correspondence to Mark J Daly or David Altshuler.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

de Bakker, P., Yelensky, R., Pe'er, I. et al. Efficiency and power in genetic association studies. Nat Genet 37, 1217–1223 (2005). https://doi.org/10.1038/ng1669

Received: 14 July 2005
Accepted: 27 September 2005
Published: 23 October 2005
Issue Date: 01 November 2005
DOI: https://doi.org/10.1038/ng1669

Supplementary information

Supplementary Fig. 1

Supplementary Fig. 2

Supplementary Fig. 3

Supplementary Note
