Review Article | Published:

Linkage disequilibrium — understanding the evolutionary past and mapping the medical future

Nature Reviews Genetics volume 9, pages 477–485 (2008)

Abstract

Linkage disequilibrium — the nonrandom association of alleles at different loci — is a sensitive indicator of the population genetic forces that structure a genome. Because of the explosive growth of methods for assessing genetic variation at a fine scale, evolutionary biologists and human geneticists are increasingly exploiting linkage disequilibrium in order to understand past evolutionary and demographic events, to map genes that are associated with quantitative characters and inherited diseases, and to understand the joint evolution of linked sets of genes. This article introduces linkage disequilibrium, reviews the population genetic processes that affect it and describes some of its uses. At present, linkage disequilibrium is used much more extensively in the study of humans than in non-humans, but that is changing as technological advances make extensive genomic studies feasible in other species.

Key points

  • Linkage disequilibrium (LD) is the nonrandom association of alleles at different loci. No single statistic best quantifies the extent of LD; several have been proposed, each useful for different purposes.

  • Recombination interacts in a complex way with selection, mutation and genetic drift to determine levels of LD. As a consequence, local and genome-wide patterns of LD can provide insight into patterns of natural selection and the past history of population growth and dispersal.

  • In humans and other model organisms, LD between marker alleles and traits of interest allows fine-scale gene mapping. Many recent genome-wide association studies have successfully mapped SNPs associated with complex inherited diseases in humans.

  • Unusually high local LD can indicate an allele that has recently increased to high frequency under strong selection. Several methods have been developed to detect selected loci and to estimate the age of alleles using patterns of LD.

  • In humans, the analysis of LD is well underway. The pace is slower in other species, although some model organisms, including mice, dogs, Drosophila and Arabidopsis thaliana, are catching up fast. Extensive analysis of LD in non-model species will be undertaken soon.
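The key points above refer to several LD statistics without naming them. As a minimal sketch, the Python below computes three commonly used two-locus measures for biallelic loci (the coefficient D, its normalized absolute value |D'|, and the squared correlation r²), along with the textbook decay of D under recombination alone. The function names, argument names, and the choice of these particular statistics are assumptions made for illustration and are not taken from the text above; the decay formula ignores selection, mutation, and drift, which the key points note also shape LD.

```python
def ld_statistics(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    (Hypothetical helper for illustration only.)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")

    # D: excess of the AB haplotype over its expectation under independence.
    d = p_ab - p_a * p_b

    # The largest |D| the allele frequencies permit depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = abs(d) / d_max
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


def expected_d_after(d0, c, generations):
    """Expected D after some generations of random mating with recombination
    fraction c, assuming no selection, mutation, or drift."""
    return d0 * (1 - c) ** generations


if __name__ == "__main__":
    d, d_prime, r2 = ld_statistics(p_ab=0.1, p_a=0.5, p_b=0.5)
    print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r^2 = {r2:.3f}")
    print(f"D after 10 generations at c = 0.1: {expected_d_after(d, 0.1, 10):.4f}")
```

Reporting |D'| and r² side by side illustrates why no single statistic suffices: |D'| reaches 1 whenever at least one of the four haplotypes is absent, whereas r² reaches 1 only when the alleles at the two loci predict each other perfectly.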

Acknowledgements

The writing of this paper was supported in part by a grant from the US National Institutes of Health, R01-GM40282. I thank J. Felsenstein for discussions of this topic and the translation of Weinberg's paper, and M. Kirkpatrick and the referees for comments on an earlier version of this paper.

Author information

Affiliations

  1. Department of Integrative Biology, University of California, Berkeley, California 94720-3140, USA.  slatkin@berkeley.edu

    • Montgomery Slatkin

About this article

DOI

https://doi.org/10.1038/nrg2361
