  • Letter

Evaluating coverage of genome-wide association studies

Abstract

Genome-wide association studies involving hundreds of thousands of SNPs in thousands of cases and controls are now underway. The first of many analytical challenges in these studies involves the choice of SNPs to genotype. It is not practical to construct a different panel of tag SNPs for each study, so the first generation of genome-wide scans will use predefined, commercially available marker panels, which will in part dictate their success or failure. We compare different approaches in use today, and show that although many of them provide substantial coverage of common variation in non-African populations, the precise extent is strongly dependent on the frequencies of alleles of interest and on specific considerations of study design. Overall, despite substantial differences in genotyping technologies, marker selection strategies and number of markers assayed, the first-generation high-throughput platforms all offer similar levels of genome coverage.
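
As a rough illustration of what "coverage" means here, the sketch below computes the fraction of common SNPs that are either on a fixed marker panel or correlated with a panel SNP above an r² cutoff. It is not the authors' method: it uses squared genotype correlation as a stand-in for haplotype-based r², it compares each SNP against every panel marker rather than only nearby ones, and the 0.8 cutoff, the 5% minor allele frequency threshold, the function names and the SNP identifiers are all assumptions made for the example.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors
    (minor-allele counts 0/1/2). An approximation to haplotype-based r^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def minor_allele_freq(genotype):
    """Minor allele frequency from diploid allele counts."""
    p = sum(genotype) / (2 * len(genotype))
    return min(p, 1 - p)


def coverage(genotypes, panel, r2_cutoff=0.8, min_maf=0.05):
    """Fraction of common SNPs captured by the panel.

    genotypes: dict mapping SNP id -> list of allele counts per individual
    panel:     set of SNP ids assumed to be on the genotyping product
    r2_cutoff, min_maf: illustrative defaults, not values taken from the text
    """
    common = [s for s, g in genotypes.items() if minor_allele_freq(g) >= min_maf]
    if not common:
        return 0.0
    covered = 0
    for snp in common:
        if snp in panel:
            best = 1.0  # a SNP on the panel captures itself
        else:
            best = max(
                (r_squared(genotypes[snp], genotypes[t])
                 for t in panel if t in genotypes),
                default=0.0,
            )
        if best >= r2_cutoff:
            covered += 1
    return covered / len(common)


# Made-up genotypes for eight individuals; identifiers are hypothetical.
example_genotypes = {
    "snpA": [0, 1, 2, 0, 1, 2, 0, 1],
    "snpB": [0, 1, 2, 0, 1, 2, 0, 1],  # perfectly correlated with snpA
    "snpC": [2, 0, 1, 1, 0, 2, 1, 0],  # nearly uncorrelated with snpA
    "snpD": [0, 0, 0, 0, 0, 0, 0, 0],  # monomorphic, excluded as not common
}
example_coverage = coverage(example_genotypes, panel={"snpA"})
```

With only snpA on the panel, two of the three common SNPs in this toy data are captured. Raising `min_maf` or `r2_cutoff` changes which SNPs count and which are considered tagged, which is the kind of dependence on allele frequency and study design that the abstract describes.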

Figure 1: Genomic coverage by maximally efficient (pairwise) tag sets for three HapMap panels and three r² cutoffs.
Figure 2: Minor allele frequency in CEU for 900,000 polymorphic Phase I HapMap SNPs, 2 million distinct polymorphic SNPs added during the second phase of HapMap and 10,000 polymorphic ENCODE SNPs (which approximate the underlying frequency distribution in the genome).
Figure 3: Coverage of common variation in the Phase II HapMap by the Affymetrix 500K and Illumina HumanHap300 products plotted as a function of random genotype failure rate.
Figure 4

Acknowledgements

We wish to thank M. Daly, I. Pe'er, L. Palmer, M. Barnes and the WTCCC analysis group, particularly D. Clayton and P. Donnelly, for discussions on many of these topics. We thank D. Evans for comments on the manuscript. We also thank the investigators and participants in the International HapMap project for generating the unique data set and making it available to the scientific community. The authors are supported by the Wellcome Trust, the US National Institutes of Health and a grant from the European Union (MolPAGE).

Author information

Corresponding author

Correspondence to Lon R Cardon.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Figure 1

Coverage by r² threshold. (PDF 29 kb)

Supplementary Figure 2

HapMap and ENCODE allele frequencies. (PDF 27 kb)

Supplementary Methods (PDF 33 kb)

About this article

Cite this article

Barrett, J., Cardon, L. Evaluating coverage of genome-wide association studies. Nat Genet 38, 659–662 (2006). https://doi.org/10.1038/ng1801

  • DOI: https://doi.org/10.1038/ng1801
