Transferability of tag SNPs in genetic association studies in multiple populations

Abstract

A general question for linkage disequilibrium–based association studies is how power to detect an association is compromised when tag SNPs are chosen from data in one population sample and then deployed in another sample. Specifically, it is important to know how well tags picked from the HapMap DNA samples capture the variation in other samples. To address this, we collected dense data uniformly across the four HapMap population samples and eleven other population samples. We picked tag SNPs using genotype data we collected in the HapMap samples and then evaluated the effective coverage of these tags in comparison to the entire set of common variants observed in the other samples. We simulated case-control association studies in the non-HapMap samples under a disease model of modest risk, and we observed little loss in power. These results demonstrate that the HapMap DNA samples can be used to select tags for genome-wide association studies in many samples around the world.

Figure 1: Performance of tags evaluated in multiple population samples, expressed as the percentage of common SNPs (excluding the tags) captured within a given maximum r² range (three bins: 0 < r² < 0.5, 0.5 ≤ r² < 0.8 and 0.8 ≤ r² ≤ 1.0).

Figure 2: The relationship between the allele frequency observed in the HapMap reference panel (from which tags are picked) and the maximum r² between the tags and all 'untyped' SNPs with ≥5% frequency in HGDP-YRI, CEPH-EXT, HGDP-CHB and HGDP-JPT.
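
The coverage measure summarized in Figure 1 can be illustrated with a minimal sketch: for each common SNP that is not itself a tag, take the maximum pairwise r² to any tag in the evaluation sample, then report the fraction of SNPs falling in each of the three bins. This is not the authors' pipeline. It assumes phased haplotypes coded 0/1, pairwise (single-marker) tagging only, and a 5% minor allele frequency threshold for "common"; the function names, bin labels and toy data are hypothetical, and SNPs with a maximum r² of exactly 0 are placed in the lowest bin.

```python
def r_squared(a, b):
    """Pairwise r^2 between two biallelic SNPs from phased 0/1 haplotype alleles."""
    n = len(a)
    pa = sum(a) / n                      # frequency of allele 1 at SNP a
    pb = sum(b) / n                      # frequency of allele 1 at SNP b
    pab = sum(x * y for x, y in zip(a, b)) / n  # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0                       # monomorphic SNP: r^2 undefined, treated as 0
    d = pab - pa * pb                    # linkage disequilibrium coefficient D
    return d * d / denom


def coverage_by_bin(haplotypes, tags, min_maf=0.05):
    """Fraction of common non-tag SNPs per maximum-r^2 bin.

    haplotypes: dict mapping SNP id -> list of 0/1 alleles in the evaluation sample
    tags: SNP ids picked in the reference panel (assumed present in `haplotypes`)
    """
    counts = {"low": 0, "mid": 0, "high": 0}
    tag_set = set(tags)
    for snp, alleles in haplotypes.items():
        if snp in tag_set:
            continue                     # tags themselves are excluded
        freq = sum(alleles) / len(alleles)
        if min(freq, 1 - freq) < min_maf:
            continue                     # keep common variants only
        best = max(r_squared(alleles, haplotypes[t]) for t in tags)
        if best >= 0.8:
            counts["high"] += 1          # 0.8 <= r^2 <= 1.0
        elif best >= 0.5:
            counts["mid"] += 1           # 0.5 <= r^2 < 0.8
        else:
            counts["low"] += 1           # r^2 < 0.5 (includes 0 in this sketch)
    total = sum(counts.values())
    if total == 0:
        return {k: 0.0 for k in counts}
    return {k: v / total for k, v in counts.items()}


# Toy evaluation sample: 8 chromosomes, one tag, four other SNPs (made-up data).
toy = {
    "tag1": [0, 0, 0, 0, 1, 1, 1, 1],
    "snpA": [0, 0, 0, 0, 1, 1, 1, 1],   # perfectly correlated with the tag
    "snpB": [0, 0, 0, 1, 1, 1, 1, 1],   # partially correlated
    "snpC": [0, 1, 0, 1, 0, 1, 0, 1],   # uncorrelated
    "snpD": [0, 0, 0, 0, 0, 0, 0, 0],   # monomorphic, excluded as not common
}
result = coverage_by_bin(toy, ["tag1"])
print(result)
```

In a real transferability analysis the tags would be picked in one sample (for example a HapMap panel) and `haplotypes` would come from a different sample, so that the fractions reflect how well the tags carry over.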

Acknowledgements

We thank M. Egyud for sharing unpublished results and all members of the collaborative Multiethnic Cohort Study and the Analysis group of the International HapMap Consortium for useful discussions. We acknowledge the support of NIH grants CA63464 and CA098758 (to B.E.H.), HL074166 (to X.Z.), CA54281 (to L.N.K.) and DK067288 (to H.N.L.); a March of Dimes grant (6-FY04-61, to J.N.H.); and a Charles E. Culpeper Scholarship of the Rockefeller Brothers Fund and a Burroughs Wellcome Fund Clinical Scholarship in Translational Research (both to D.A.).

Author information

H.N.L., X.Z., R.C., L.G., C.A.H., L.N.K. and B.E.H. provided DNA samples; N.P.B. coordinated resequencing with R.C.O. and S.Y.; N.P.B., R.R.G., C.G., J.B. and K.L.P. prepared DNA samples and designed and performed genotyping experiments; P.d.B., N.P.B., R.R.G., R.Y., J.A.D. and T.B. performed the analyses; P.d.B. wrote the paper, with contributions from N.P.B. and R.R.G.; M.L.F., C.A.H., D.O.S. and H.N.L. gave feedback and helped with revisions; and M.J.D., J.N.H. and D.A. jointly directed the project.

Correspondence to David Altshuler.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Table 1

Summary of SNP discovery through resequencing. (PDF 25 kb)

Supplementary Table 2

Genotyping summary of all attempted SNP assays. (PDF 540 kb)

Supplementary Table 3

Genotyping summary by population. (PDF 23 kb)

Supplementary Table 4

Genotyping summary of the final data set. (PDF 24 kb)
