Detecting recent positive selection in the human genome from haplotype structure


The ability to detect recent natural selection in the human population would have profound implications for the study of human history and for medicine. Here, we introduce a framework for detecting the genetic imprint of recent positive selection by analysing long-range haplotypes in human populations. We first identify haplotypes at a locus of interest (core haplotypes). We then assess the age of each core haplotype by the decay of its association to alleles at various distances from the locus, as measured by extended haplotype homozygosity (EHH). Core haplotypes that have unusually high EHH and a high population frequency indicate the presence of a mutation that rose to prominence in the human gene pool faster than expected under neutral evolution. We applied this approach to investigate selection at two genes carrying common variants implicated in resistance to malaria: G6PD (ref. 1) and CD40 ligand (ref. 2). At both loci, the core haplotypes carrying the proposed protective mutation stand out and show significant evidence of selection. More generally, the method could be used to scan the entire genome for evidence of recent positive selection.
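
The abstract describes EHH only in words, so a small worked example may help. The sketch below assumes the commonly used definition of EHH: among the chromosomes carrying a given core haplotype, the fraction of chromosome pairs that are identical at every marker from the core out to a chosen distance. The function name `ehh`, its parameters, and the toy data are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_sites, core_alleles, extended_sites):
    """Extended haplotype homozygosity for one core haplotype.

    haplotypes     : sequence of phased chromosomes (strings or tuples of alleles)
    core_sites     : marker indices that define the core
    core_alleles   : alleles at those indices for the core haplotype of interest
    extended_sites : marker indices between the core and the distance being tested
    """
    # Keep only chromosomes that carry the core haplotype of interest.
    carriers = [h for h in haplotypes
                if tuple(h[i] for i in core_sites) == tuple(core_alleles)]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two chromosomes carrying the core haplotype")

    # Group carriers by their alleles over the core plus the extended markers.
    sites = list(core_sites) + list(extended_sites)
    groups = Counter(tuple(h[i] for i in sites) for h in carriers)

    # Fraction of carrier pairs that are identical across all those markers.
    identical_pairs = sum(comb(k, 2) for k in groups.values())
    return identical_pairs / comb(n, 2)


# Toy data (hypothetical): six chromosomes, four biallelic markers each.
# Markers 0 and 1 form the core; markers 2 and 3 lie progressively further away.
chromosomes = ["0110", "0110", "0111", "0100", "1010", "1011"]

for extended in ([], [2], [2, 3]):
    print(extended, ehh(chromosomes, (0, 1), "01", extended))
```

In this toy data set, EHH for the core haplotype `01` starts at 1 at the core itself and falls as more distant markers are included, which is the decay with distance that the method uses as a proxy for haplotype age.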


Figure 1: Experimental design of core and long-range SNPs for G6PD and TNFSF5.
Figure 2: Core haplotype frequency and relative EHH of G6PD and TNFSF5.
Figure 3: Control regions: core haplotype frequency against relative EHH.


  1. Ruwende, C. & Hill, A. Glucose-6-phosphate dehydrogenase deficiency and malaria. J. Mol. Med. 76, 581–588 (1998)

  2. Sabeti, P. et al. CD40L association with protection from severe malaria. Genes Immun. 3, 286–291 (2002)

  3. Cavalli-Sforza, L. L., Menozzi, P. & Piazza, A. The History and Geography of Human Genes (Princeton Univ. Press, Princeton, 1994)

  4. Kimura, M. The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, Cambridge/New York, 1983)

  5. Stephens, J. C. et al. Dating the origin of the CCR5-Delta32 AIDS-resistance allele by the coalescence of haplotypes. Am. J. Hum. Genet. 62, 1507–1515 (1998)

  6. Hudson, R. R. & Kaplan, N. L. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111, 147–164 (1985)

  7. Lewontin, R. The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49, 49–67 (1964)

  8. Nei, M. Molecular Evolutionary Genetics Eqn. 8.4 (Columbia Univ. Press, New York, 1987)

  9. Luzzatto, L., Mehta, A. & Vulliamy, T. The Metabolic & Molecular Bases of Inherited Disease 4517–4553 (McGraw-Hill, New York, 2001)

  10. Raymond, M. & Rousset, F. An exact test for population differentiation. Evolution 49, 1280–1283 (1995)

  11. Hudson, R. R. Properties of a neutral allele model with intragenic recombination. Theor. Popul. Biol. 23, 183–201 (1983)

  12. Reich, D. E. & Goldstein, D. B. Microsatellites: Evolution and Applications 128–138 (Oxford Univ. Press, Oxford/New York, 1999)

  13. Tishkoff, S. A. et al. Haplotype diversity and linkage disequilibrium at human G6PD: recent origin of alleles that confer malarial resistance. Science 293, 455–462 (2001)

  14. Rozas, J. & Rozas, R. DnaSP version 3: an integrated program for molecular population genetics and molecular evolution analysis. Bioinformatics 15, 174–175 (1999)

  15. Tajima, F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123, 585–595 (1989)

  16. Fu, Y. X. & Li, W. H. Statistical tests of neutrality of mutations. Genetics 133, 693–709 (1993)

  17. Fay, J. C. & Wu, C. I. Hitchhiking under positive Darwinian selection. Genetics 155, 1405–1413 (2000)

  18. Hughes, A. L. & Nei, M. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335, 167–170 (1988)

  19. McDonald, J. H. & Kreitman, M. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652–654 (1991)

  20. Hudson, R. R., Kreitman, M. & Aguade, M. A test of neutral molecular evolution based on nucleotide data. Genetics 116, 153–159 (1987)

  21. Gabriel, S. B. et al. The structure of haplotype blocks in the human genome. Science 296, 2225–2229 (2002)

  22. Wootton, J. C. et al. Genetic diversity and chloroquine selective sweeps in Plasmodium falciparum. Nature 418, 320–323 (2002)

  23. Tang, K. et al. Chip-based genotyping by mass spectrometry. Proc. Natl Acad. Sci. USA 96, 10016–10020 (1999)

  24. Vulliamy, T. J. et al. Linkage disequilibrium of polymorphic sites in the G6PD gene in African populations and the origin of G6PD A. Gene Geogr. 5, 13–21 (1991)

  25. Sachidanandam, R. et al. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409, 928–933 (2001)

  26. Reich, D. E. et al. Linkage disequilibrium in the human genome. Nature 411, 199–204 (2001)



We thank B. Blumenstiel, M. DeFelice, A. Lochner, J. Moore, H. Nguyen and J. Roy for assistance in genotyping the 17 control regions. We also thank L. Gaffney, S. Radhakrishna, T. DiCesare and T. Lavery for graphics and technical support, B. Ferrell for the Beni samples, and A. Adeyemo and C. Rotimi for helping to collect the Yoruba and Shona samples. Finally, we thank M. Daly, E. Cosman, B. Gray, V. Koduri, T. Herrington and L. Peterson for comments on the manuscript. P.C.S. was supported by grants from the Rhodes Trust, the Harvard Office of Enrichment, and by a Soros Fellowship. This work was supported by grants from the National Institutes of Health.

Author information

Correspondence to Eric S. Lander.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

Supplementary information

Supplementary Methods (DOC 28 kb)

Supplemental Table 1: SNP allele frequencies (DOC 37 kb)

Supplemental Table 2: P-values for a range of demographies and distances (DOC 29 kb)

Supplemental Table 3: Sequencing results and tests of selection (DOC 27 kb)

Supplemental Figure 1 legend (DOC 21 kb)

Supplemental Figure 1a (PDF 18 kb)

Supplemental Figure 1b (PDF 15 kb)
