Human genome sequence variation and the influence of gene history, mutation and recombination

Abstract

Variation in the human genome sequence is key to understanding susceptibility to disease in modern populations and the history of ancestral populations. Unlocking this information requires knowledge of the patterns and underlying causes of human sequence diversity. By applying a new population-genetic framework to two genome-wide polymorphism surveys, we find that the human genome contains sizeable regions (stretching over tens of thousands of base pairs) that have intrinsically high and low rates of sequence variation. We show that the primary determinant of these patterns is shared genealogical history. Only a fraction of the variation (at most 25%) is due to the local mutation rate. By measuring the average distance over which genealogical histories are typically preserved, these data provide the first genome-wide estimate of the average extent of correlation among variants (linkage disequilibrium). The results are best explained by extreme variability in the recombination rate at a fine scale, and provide the first empirical evidence that such recombination 'hot spots' are a general feature of the human genome and have a principal role in shaping genetic variation in the human population.

Figure 1: Correlation in heterozygosity.
Figure 2: Cis versus trans comparisons.
Figure 3: Impact of gene history on the correlation in heterozygosity.
Figure 4: Correlation in mutation rate (inferred from sequence divergence).
Figure 5: Correlation in gene history.
Figure 6: Comparison of the observed and simulated correlation in gene history under a range of models of human demographic history and recombination.

Acknowledgements

We thank T. Lavery, A. Rachupka and J. Platko for assistance with great ape sequencing; B. Gilman for computer support; D. Cutler, P. Donnelly, J. Hirschhorn, L. Kruglyak, S. Myers, J. Pritchard and J. Wakeley for discussions and advice; and the laboratory of E. Green and the Baylor Sequencing Center for depositing large-insert chimpanzee sequences into GenBank. D.E.R. was supported in part by a National Defense Science and Engineering fellowship. D.A. is a Charles E. Culpeper Scholar of the Rockefeller Brothers Fund and a Burroughs Wellcome Fund Clinical Scholar in Translational Research. This work was supported by grants from The SNP Consortium to E.S.L. and D.A., the Massachusetts General Hospital to D.A. and the National Institutes of Health to E.S.L.

Author information

Correspondence to David Altshuler.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Web Note A
