A worldwide survey of haplotype variation and linkage disequilibrium in the human genome

Abstract

Recent genomic surveys have produced high-resolution haplotype information, but only in a small number of human populations. We report haplotype structure across 12 Mb of DNA sequence in 927 individuals representing 52 populations. The geographic distribution of haplotypes reflects human history, with a loss of haplotype diversity as distance increases from Africa. Although the extent of linkage disequilibrium (LD) varies markedly across populations, considerable sharing of haplotype structure exists, and inferred recombination hotspot locations generally match across groups. The four samples in the International HapMap Project contain the majority of common haplotypes found in most populations: averaging across populations, 83% of common 20-kb haplotypes in a population are also common in the most similar HapMap sample. Consequently, although the portability of tag SNPs based on the HapMap is reduced in low-LD Africans, the HapMap will be helpful for the design of genome-wide association mapping studies in nearly all human populations.
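The haplotype-sharing statistic quoted in the abstract (the fraction of a population's common haplotypes that are also common in a reference sample) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy haplotype strings, and the 5% default frequency threshold for "common" are assumptions made here for illustration, since the abstract does not state the threshold or how the most similar HapMap sample is chosen.

```python
from collections import Counter


def common_haplotypes(haplotypes, min_freq=0.05):
    """Return the set of haplotypes whose sample frequency is >= min_freq.

    `haplotypes` is a list of phased haplotype strings for one window
    (for example one 20-kb region); `min_freq` is an assumed threshold.
    """
    counts = Counter(haplotypes)
    total = len(haplotypes)
    return {hap for hap, n in counts.items() if n / total >= min_freq}


def fraction_shared(target, reference, min_freq=0.05):
    """Fraction of the target's common haplotypes that are also common
    in the reference sample. Returns NaN if the target has none."""
    target_common = common_haplotypes(target, min_freq)
    if not target_common:
        return float("nan")
    reference_common = common_haplotypes(reference, min_freq)
    return len(target_common & reference_common) / len(target_common)


# Toy data (hypothetical): 10 target and 10 reference chromosomes.
target = ["ACGT"] * 6 + ["ACGA"] * 3 + ["TCGA"] * 1
reference = ["ACGT"] * 5 + ["TTTT"] * 5

# With a 20% threshold, the target's common haplotypes are ACGT and ACGA;
# only ACGT is also common in the reference.
shared = fraction_shared(target, reference, min_freq=0.2)
```

In a full analysis this fraction would presumably be computed per window against each reference sample and then averaged across windows and populations; that aggregation is left out of the sketch.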

Figure 1: Haplotype structure in diverse populations for two genomic regions of size 330 kb.
Figure 2: Schematic world map of haplotype diversity.
Figure 3: Effect of ascertainment bias on haplotype diversity.
Figure 4: Population recombination rates (ρ) across genomic regions.
Figure 5
Figure 6: The fraction of common haplotypes in individual populations that are also common in the HapMap.
Figure 7: Portability of tag SNPs chosen using the HapMap.
Figure 8: The determinants of portability of HapMap tag SNPs.

Acknowledgements

We thank J. DeYoung and the Southern California Genotyping Consortium for genotyping, P. Scheet for providing prepublication access to fastPHASE, C. Shtir for assistance and M. Przeworski for comments. This work was supported by grants from the Burroughs Wellcome Fund (J.K.P., N.A.R.), the Sloan Foundation (J.D.W., J.K.P., N.A.R.), the Packard Foundation (J.K.P.) and the US National Science Foundation (J.D.W., N.A.R.).

Author information

Contributions

J.D.W., N.A.R., and J.K.P. conceived and jointly supervised the study, D.F.C. performed the SNP design, N.A.R. and J.K.P. cleaned the data, and J.K.P. performed the phasing. All authors analyzed the data, with the following primary contributions: haplotype visualization, X.W.; haplotype diversity statistics and haplotype sharing with the HapMap, M.J.; recombination rate estimation, G.C.; tag SNP portability, D.F.C.; determinants of portability, M.J. and D.F.C. The supplementary information was written by D.F.C., G.C., M.J., N.A.R. and J.K.P., and the paper was written primarily by J.K.P. and N.A.R., with assistance from all other authors.

Corresponding author

Correspondence to Noah A Rosenberg.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Fig. 1

Portability of tag-SNPs from seven different panels. (PDF 172 kb)

Supplementary Fig. 2

Difference in PVT between Patil and non-Patil regions (CEU tags). (PDF 32 kb)

Supplementary Fig. 3

Difference in PVT between Patil and non-Patil regions (YRI tags). (PDF 31 kb)

Supplementary Fig. 4

Relationships between tag portability and the distance at which the r² measure of linkage disequilibrium decays below 0.5, and between tag portability and FST genetic distance to the HapMap population that produces the highest tag portability. (PDF 33 kb)

Supplementary Fig. 5

Relationships between tag portability and distance at which the r² measure of linkage disequilibrium decays below 0.5, and tag portability and FST genetic distance to the HapMap population that produces the highest tag portability. (PDF 44 kb)

Supplementary Table 1

Details of SNPs used. (XLS 367 kb)

Supplementary Methods (PDF 434 kb)

Supplementary Note (PDF 1390 kb)

About this article

Cite this article

Conrad, D., Jakobsson, M., Coop, G. et al. A worldwide survey of haplotype variation and linkage disequilibrium in the human genome. Nat Genet 38, 1251–1260 (2006). https://doi.org/10.1038/ng1911

  • DOI: https://doi.org/10.1038/ng1911
