Review Article

Haplotype blocks and linkage disequilibrium in the human genome

Nature Reviews Genetics volume 4, pages 587–597 (2003)

Abstract

There is great interest in the patterns and extent of linkage disequilibrium (LD) in humans and other species. Characterizing LD is of central importance for gene-mapping studies and can provide insights into the biology of recombination and human demographic history. Here, we review recent developments in this field, including the recently proposed 'haplotype-block' model of LD. We describe some of the recent data in detail and compare the observed patterns to those seen in simulations.

Key points

  • Linkage disequilibrium (LD) is the nonrandom association of alleles at different sites.

  • Recent studies have proposed that patterns of LD in the human genome can be summarized by a series of discrete haplotype blocks: regions of high LD that are separated from other haplotype blocks by many historical recombination events.

  • Patterns of LD and the fit of the haplotype-block model vary tremendously from region to region: some show extensive well-defined haplotype blocks, while others contain essentially no haplotype blocks.

  • This variability across regions is probably the result of several factors, which include large-scale variation in recombination rates (apparent from genetic maps), fine-scale variation in recombination rates (for example, hotspots) and the inherent stochasticity of LD.

  • Simulations indicate that although recombination hotspots generally create haplotype-block boundaries, the converse is not true: most haplotype-block boundaries do not occur at hotspots.

  • The identification of haplotype blocks will be of some use for future association studies, but there will be a substantial fraction of the genome (not covered by large haplotype blocks) for which other approaches will be useful.

Acknowledgements

We thank D. Nickerson, S. Gabriel, M. Daly, D. Altshuler and S. Schaffner for help in accessing and interpreting their data, and A. DiRienzo and S. Zoellner for discussions. We also thank M. Przeworski and the anonymous reviewers for comments on an earlier version of this manuscript. This work was supported by a National Institutes of Health grant to J.K.P.

Author information

Author notes

    • Jeffrey D. Wall

    Current address: Program in Molecular and Computational Biology, The University of Southern California, 1042 West 36th Place, DRB 289, Los Angeles, California 90089-1113, USA. pritch@uchicago.edu

Affiliations

  1. Department of Human Genetics, The University of Chicago, 920 East 58th Street, CLSC 507, Chicago, Illinois 60637, USA.  jwall@genetics.bsd.uchicago.edu

    • Jeffrey D. Wall
    • Jonathan K. Pritchard

Glossary

BOTTLENECK

A temporary reduction in population size that causes the loss of genetic variation.

ADMIXTURE

The mixture of two or more genetically distinct populations.

PAIRWISE LINKAGE DISEQUILIBRIUM

(Pairwise LD). The strength of association between alleles at two different markers.
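
As an illustration of how this strength of association is commonly quantified, the following minimal sketch computes three standard pairwise statistics (D, |D'| and r²) from phased two-site haplotypes. The function name `pairwise_ld` and the 0/1 allele coding are assumptions made for this example; they are not taken from the article.

```python
from collections import Counter


def pairwise_ld(haplotypes):
    """Return (D, |D'|, r^2) for two biallelic sites.

    haplotypes: list of (allele_at_site_1, allele_at_site_2) tuples,
    with alleles coded 0 or 1 (a hypothetical encoding for this sketch).
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)

    # Frequency of the '1' allele at each site and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = counts[(1, 1)] / n

    # D: departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between alleles at the two sites.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0

    return d, d_prime, r2


# Two sites in complete association: only 1-1 and 0-0 haplotypes occur.
print(pairwise_ld([(1, 1)] * 5 + [(0, 0)] * 5))  # → (0.25, 1.0, 1.0)
```

The sketch assumes phased haplotypes; with unphased diploid data, haplotype frequencies would first have to be estimated.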

PRE-ASCERTAINED SINGLE NUCLEOTIDE POLYMORPHISMS

(Pre-ascertained SNPs). SNPs that have already been detected in previous studies, usually from an extremely small sample of chromosomes.

UNPHASED DIPLOID DATA

Sequence data in which the phase of double heterozygotes was not determined.

BAYESIAN APPROACH

A statistical approach that, given a set of assumptions about the underlying model, can provide a rigorous assessment of uncertainty.

COALESCENT SIMULATION

A method of simulating data under a population genetic model.
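
To make the idea concrete, here is a minimal sketch of the simplest coalescent simulation: drawing the time to the most recent common ancestor (TMRCA) of a sample under the standard neutral model. It assumes a constant population size and no recombination, so it is far simpler than the simulations with recombination that the article discusses; the function name `simulate_tmrca` is hypothetical.

```python
import random


def simulate_tmrca(n_samples, rng):
    """Draw one TMRCA for n_samples chromosomes under the standard coalescent.

    Time is measured in coalescent units (2N generations for a diploid
    population of constant size N). Assumes neutrality and no recombination.
    """
    t = 0.0
    k = n_samples
    while k > 1:
        # With k lineages, each of the k*(k-1)/2 pairs coalesces at rate 1,
        # so the waiting time to the next coalescence is exponential.
        rate = k * (k - 1) / 2.0
        t += rng.expovariate(rate)
        k -= 1  # one coalescence merges two lineages into one
    return t


rng = random.Random(1)  # fixed seed so that runs are repeatable
draws = [simulate_tmrca(10, rng) for _ in range(20000)]
mean_tmrca = sum(draws) / len(draws)
# Theory gives an expected TMRCA of 2 * (1 - 1/n), i.e. 1.8 for n = 10.
print(round(mean_tmrca, 2))
```

Comparing the simulated mean with the theoretical expectation is a quick sanity check on the sampling scheme.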

ASCERTAINMENT BIAS

The bias in patterns of variation that results from using pre-ascertained SNPs.

GENE CONVERSION

Recombination that involves the nonreciprocal transfer of information from one homologous chromatid to another.

About this article

DOI

https://doi.org/10.1038/nrg1123
