Transferability of tag SNPs in genetic association studies in multiple populations


A general question for linkage disequilibrium–based association studies is how power to detect an association is compromised when tag SNPs are chosen from data in one population sample and then deployed in another sample. Specifically, it is important to know how well tags picked from the HapMap DNA samples capture the variation in other samples. To address this, we collected dense data uniformly across the four HapMap population samples and eleven other population samples. We picked tag SNPs using genotype data we collected in the HapMap samples and then evaluated the effective coverage of these tags in comparison to the entire set of common variants observed in the other samples. We simulated case-control association studies in the non-HapMap samples under a disease model of modest risk, and we observed little loss in power. These results demonstrate that the HapMap DNA samples can be used to select tags for genome-wide association studies in many samples around the world.

Figure 1: Performance of tags evaluated in multiple population samples, expressed as the percentage of common SNPs (excluding the tags) captured within a given maximum r² range (three bins: 0 < r² < 0.5, 0.5 ≤ r² < 0.8 and 0.8 ≤ r² ≤ 1.0).
Figure 2: The relationship between the allele frequency observed in the HapMap reference panel (from which tags are picked) and the maximum r² between the tags and all 'untyped' SNPs with ≥5% frequency in HGDP-YRI, CEPH-EXT, HGDP-CHB and HGDP-JPT.
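
The maximum r² summary used in the figure captions can be illustrated with a minimal sketch. It assumes phased haplotypes coded as 0/1 lists and the standard pairwise definition r² = D² / (p_A(1 − p_A) p_B(1 − p_B)); the function names, the 5% frequency cutoff argument and the toy data are hypothetical and are not taken from the article, whose actual analysis pipeline is not described on this page. SNPs with a maximum r² of exactly 0 are counted in the lowest bin here, which is also an assumption.

```python
from typing import Dict, List, Sequence


def allele_freq(hap: Sequence[int]) -> float:
    """Frequency of the allele coded 1 across haplotypes."""
    return sum(hap) / len(hap)


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Pairwise r^2 between two SNPs given phased 0/1 haplotypes."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:  # at least one SNP is monomorphic in this sample
        return 0.0
    p_ab = sum(x * y for x, y in zip(a, b)) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


def max_r2_bins(
    tags: List[Sequence[int]],
    untyped: List[Sequence[int]],
    min_freq: float = 0.05,
) -> Dict[str, float]:
    """Percentage of common untyped SNPs falling in each maximum-r^2 bin."""
    counts = {"r2<0.5": 0, "0.5<=r2<0.8": 0, "r2>=0.8": 0}
    for snp in untyped:
        f = allele_freq(snp)
        if min(f, 1 - f) < min_freq:  # keep common SNPs only
            continue
        best = max((r_squared(t, snp) for t in tags), default=0.0)
        if best >= 0.8:
            counts["r2>=0.8"] += 1
        elif best >= 0.5:
            counts["0.5<=r2<0.8"] += 1
        else:
            counts["r2<0.5"] += 1
    total = sum(counts.values())
    if total == 0:
        return {k: 0.0 for k in counts}
    return {k: 100.0 * v / total for k, v in counts.items()}


# Toy data: one tag and four untyped SNPs over eight haplotypes.
tags = [[0, 0, 1, 1, 0, 1, 0, 1]]
untyped = [
    [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the tag
    [1, 1, 0, 0, 1, 0, 1, 0],  # complement of the tag
    [0, 1, 0, 1, 0, 1, 0, 1],  # weakly correlated with the tag
    [0, 0, 0, 0, 0, 0, 0, 0],  # monomorphic, excluded as not common
]
coverage = max_r2_bins(tags, untyped)
```

In a transferability setting, the tags would be chosen from one sample (for example a HapMap panel) and `untyped` would hold the haplotypes observed in a different sample, so the bin percentages describe how well the tags carry over.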




We thank M. Egyud for sharing unpublished results and all members of the collaborative Multiethnic Cohort Study and the Analysis group of the International HapMap Consortium for useful discussions. We acknowledge the support of NIH grants CA63464 and CA098758 (to B.E.H.), HL074166 (to X.Z.), CA54281 (to L.N.K.) and DK067288 (to H.N.L.); a March of Dimes grant (6-FY04-61, to J.N.H.) and a Charles E. Culpeper Scholarship of the Rockefeller Brothers Fund and a Burroughs Wellcome Fund Clinical Scholarship in Translational Research (both to D.A.).

Author information

H.N.L., X.Z., R.C., L.G., C.A.H., L.N.K. and B.E.H. provided DNA samples; N.P.B. coordinated resequencing with R.C.O. and S.Y.; N.P.B., R.R.G., C.G., J.B. and K.L.P. prepared DNA samples and designed and performed genotyping experiments; P.d.B., N.P.B., R.R.G., R.Y., J.A.D. and T.B. performed the analyses; P.d.B. wrote the paper, with contributions from N.P.B. and R.R.G.; M.L.F., C.A.H., D.O.S. and H.N.L. gave feedback and helped with revisions; and M.J.D., J.N.H. and D.A. jointly directed the project.

Corresponding author

Correspondence to David Altshuler.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Table 1

Summary of SNP discovery through resequencing. (PDF 25 kb)

Supplementary Table 2

Genotyping summary of all attempted SNP assays. (PDF 540 kb)

Supplementary Table 3

Genotyping summary by population. (PDF 23 kb)

Supplementary Table 4

Genotyping summary of the final data set. (PDF 24 kb)

About this article

Cite this article

de Bakker, P., Burtt, N., Graham, R. et al. Transferability of tag SNPs in genetic association studies in multiple populations. Nat Genet 38, 1298–1303 (2006).
