Opinion | Published:

The Human Genome Diversity Project: past, present and future


The Human Genome Project, in accomplishing its goal of sequencing one human genome, heralded a new era of research, a component of which is the systematic study of human genetic variation. Despite delays, the Human Genome Diversity Project has started to make progress in understanding the patterns of this variation and its causes, and also promises to provide important information for biomedical studies.



This work has been made possible by donors of blood samples and cell lines to the Human Genome Diversity Project (HGDP) and the Center for the Study of Human Polymorphism (CEPH). The collaboration with CEPH has been a decisive contribution. Support for preparing the first African cell lines in the Stanford laboratory in 1984–1985 came initially from the Lucille P. Markey Trust, with later additions from a programme of the National Institute of General Medical Sciences (National Institutes of Health) and the HGDP–CEPH initiative from the Ellison Medical Foundation. H. Cann, M. Feldman, H. Greely and M.-C. King are thanked for suggesting improvements to the manuscript.

Ethics declarations

Competing interests

The author declares no competing financial interests.

Glossary

The mixture of two or more genetically distinct populations.
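The definition above can be made concrete with the standard linear mixing model, in which the expected allele frequency in an admixed population is the proportion-weighted average of the source frequencies. The following is a minimal sketch under that assumption; the function name and arguments are hypothetical and are not taken from the article.

```python
from typing import Sequence


def admixed_allele_frequency(source_freqs: Sequence[float],
                             proportions: Sequence[float]) -> float:
    """Expected allele frequency after admixture (illustrative helper).

    source_freqs: frequency of the same allele in each source population.
    proportions:  ancestry contribution of each source; must sum to 1.
    """
    if len(source_freqs) != len(proportions):
        raise ValueError("one proportion is needed per source population")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("ancestry proportions must sum to 1")
    # Weighted average of the source allele frequencies.
    return sum(m * p for m, p in zip(proportions, source_freqs))
```

For example, `admixed_allele_frequency([0.5, 0.0], [0.5, 0.5])` models an equal mixture of two sources in which the allele is at frequency 0.5 in one and absent in the other.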


A method for localizing genes that are responsible for specific diseases by comparing the DNA of a selected set of patients who are believed to carry the same mutation/s because of their ancestral origin, with that of unrelated healthy controls from the same population.


An increase in the breadth to length ratio of the skull.


Processes of substantial demographical growth causing geographical expansions of a population. These are made possible by innovations that affect production of food, such as agro-pastoral economies and/or other improved technologies (for example, transportation, hunting and other weapons).


A set of genetic markers that show complete or nearly complete linkage disequilibrium; that is, they are inherited through generations without being changed by crossing-over or other recombination mechanisms.


A classical mathematical principle in population genetics used for testing random mating. It gives the expected frequencies of genotypes for a gene after one generation of random mating if the parental allele frequencies are known.
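The principle described above is commonly known as the Hardy–Weinberg principle. For a locus with two alleles at frequencies p and q = 1 − p, the expected genotype frequencies under random mating are p², 2pq and q². The sketch below computes these expectations and a simple chi-square statistic for comparing them with observed genotype counts; the function names are hypothetical, and the chi-square comparison is one common way of testing for departure from random mating, not a method specified in the article.

```python
from typing import Dict


def hardy_weinberg_expected(p: float) -> Dict[str, float]:
    """Expected genotype frequencies for allele A at frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2.0 * p * q, "aa": q * q}


def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int) -> float:
    """Chi-square statistic comparing observed counts with expectations."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    # Estimate the frequency of allele A from the genotype counts.
    p = (2 * n_AA + n_Aa) / (2 * n)
    expected = hardy_weinberg_expected(p)
    observed = {"AA": n_AA, "Aa": n_Aa, "aa": n_aa}
    chi2 = 0.0
    for genotype, obs in observed.items():
        exp_count = n * expected[genotype]
        if exp_count > 0:  # skip genotypes with zero expected count
            chi2 += (obs - exp_count) ** 2 / exp_count
    return chi2
```

A sample that matches the expected proportions exactly gives a statistic of zero, whereas a sample with no heterozygotes at intermediate allele frequencies gives a large value.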


The tendency for markers that are physically close to each other on the same chromosome to be transmitted to the progeny together, as there is a low probability that they will be split through recombination.
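The tendency described above is usually quantified, for two biallelic markers, by the coefficient D = p_AB − p_A·p_B and its normalized form r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)). The following minimal sketch computes both from haplotype and allele frequencies; the function name and parameter names are hypothetical and are used here only for illustration.

```python
from typing import Tuple


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (D, r_squared) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a:  frequency of allele A at the first marker.
    p_b:  frequency of allele B at the second marker.
    """
    # D is the excess of the AB haplotype over its expectation
    # if the two markers were statistically independent.
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    # r_squared is undefined when either marker is monomorphic.
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared
```

Under this definition, D = 0 means the markers are statistically independent, and r² = 1 means that each allele at one marker always occurs with the same allele at the other.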


Mapping genes by typing genetic markers in families to identify regions that are associated with disease or trait values that occur within pedigrees more often than is expected by chance. Such linked regions are more likely to contain a causal genetic variant.


Lymphoblastoid cell lines are obtained from B lymphocytes, a fraction of white cells from blood that can be grown indefinitely in the laboratory after special treatment of the cells with Epstein–Barr virus.


Microsatellites are tandem repeats of short nucleotide sequences (2–6 bases). They have a large number of alleles compared with SNPs, owing to a much higher mutation rate.


Figure 1: Populations that are included in the Human Genome Diversity Project collection.