The Human Genome Diversity Project: past, present and future


The Human Genome Project, in accomplishing its goal of sequencing one human genome, heralded a new era of research, a component of which is the systematic study of human genetic variation. Despite delays, the Human Genome Diversity Project has started to make progress in understanding the patterns of this variation and its causes, and also promises to provide important information for biomedical studies.


Figure 1: Populations that are included in the Human Genome Diversity Project collection.




This work has been made possible by donors of blood samples and cell lines to the Human Genome Diversity Project (HGDP) and the Center for the Study of Human Polymorphism (CEPH). The collaboration with CEPH has been a decisive contribution. Support for preparing the first African cell lines in the Stanford laboratory in 1984–1985 came initially from the Lucille P. Markey Trust, with later additions from a National Institutes of Health Institute for General Medical Science programme and the HGDP–CEPH initiative from the Ellison Medical Foundation. H. Cann, M. Feldman, H. Greely and M.-C. King are thanked for suggesting improvements to the manuscript.



Ethics declarations

Competing interests

The author declares no competing financial interests.

Related links


Fondation Jean Dausset — CEPH

International HapMap Project

Human Genome Diversity Project

Human Genome Organisation

Human Genome Project

Marcus Feldman's laboratory

National Research Council

Noah Rosenberg's web site

Stanford Human Population Genetics Laboratory



The mixture of two or more genetically distinct populations.


A method for localizing genes that are responsible for specific diseases by comparing the DNA of a selected set of patients, who are believed to carry the same mutation or mutations because of their ancestral origin, with that of unrelated healthy controls from the same population.


An increase in the breadth to length ratio of the skull.


Processes of substantial demographic growth that cause geographical expansions of a population. These are made possible by innovations that affect food production, such as agro-pastoral economies, and/or by other improved technologies (for example, transportation, hunting and other weapons).


A set of genetic markers that show complete or nearly complete linkage disequilibrium; that is, they are inherited through generations without being changed by crossing-over or other recombination mechanisms.


A classical mathematical principle in population genetics used for testing random mating. It gives the expected frequencies of genotypes for a gene after one generation of random mating if the parental allele frequencies are known.
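
The principle defined above is commonly known as the Hardy–Weinberg law. As a minimal sketch, assuming a single gene with two alleles (A at frequency p, a at frequency q = 1 − p), the expected genotype frequencies are p², 2pq and q². The function names below are illustrative and do not come from the article; the chi-square statistic is one common way of comparing observed genotype counts with these expectations.

```python
def hardy_weinberg_expected(p):
    """Expected genotype frequencies (AA, Aa, aa) under random mating,
    given the frequency p of allele A (two-allele assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed genotype counts with the
    counts expected from the allele frequency estimated in the same sample."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one individual is required")
    # Each AA individual carries two copies of A, each Aa individual one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    expected = [freq * n for freq in hardy_weinberg_expected(p)]
    observed = [n_AA, n_Aa, n_aa]
    # Skip classes with zero expectation (a monomorphic sample).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


if __name__ == "__main__":
    print(hardy_weinberg_expected(0.5))   # (0.25, 0.5, 0.25)
    print(hwe_chi_square(25, 50, 25))     # 0.0, the counts match the expectation
```

A large statistic suggests a departure from random mating or from one of the other assumptions of the model. The sketch reports only the statistic and leaves the choice of significance threshold to the reader.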


The tendency for markers that are physically close to each other on the same chromosome to be transmitted to the progeny together, as there is a low probability that they will be split through recombination.
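
The tendency described above is usually quantified from haplotype frequencies. As a hedged sketch, assuming two markers with two alleles each, the coefficient D = p_AB − p_A·p_B measures how far the frequency of the A–B haplotype departs from independence, and r² = D² / (p_A(1 − p_A)·p_B(1 − p_B)) rescales it to the range 0 to 1. The function and parameter names are illustrative and do not come from the article.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic markers.

    p_ab: frequency of haplotypes carrying allele A at the first marker
          and allele B at the second marker
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic")
    return d, d * d / denom


if __name__ == "__main__":
    # A and B always occur together: complete disequilibrium.
    print(linkage_disequilibrium(0.5, 0.5, 0.5))    # (0.25, 1.0)
    # A and B occur together exactly as often as chance predicts.
    print(linkage_disequilibrium(0.25, 0.5, 0.5))   # (0.0, 0.0)
```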


Mapping genes by typing genetic markers in families to identify regions that are associated with disease or trait values that occur within pedigrees more often than is expected by chance. Such linked regions are more likely to contain a causal genetic variant.


Lymphoblastoid cell lines are obtained from B lymphocytes, a fraction of white cells from blood that can be grown indefinitely in the laboratory after special treatment of the cells with Epstein–Barr virus.


Microsatellites are tandem repeats of short nucleotide sequences (2–6 bases). They have a large number of alleles compared with SNPs, owing to a much higher mutation rate.
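
To make the definition concrete, here is a minimal sketch that scans a DNA string for tandem repeats of a 2–6 base motif. The function name and the `min_repeats` threshold are assumptions chosen for illustration. This is not the genotyping method used for the HGDP–CEPH panel, which typed known marker loci in the laboratory.

```python
import re


def find_microsatellites(sequence, min_repeats=4):
    """Return (start, motif, copies) for each run of a 2-6 base motif
    repeated in tandem at least min_repeats times.

    Simplification: a run of a single base (for example AAAAAAAA) is
    reported with a two-base motif, because motifs are matched as text.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # Shortest motif first (lazy quantifier), then the same motif repeated.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    example = "TTG" + "AC" * 5 + "AGG"
    print(find_microsatellites(example))   # [(3, 'AC', 5)]
```

Alleles at such a locus differ in the number of copies of the motif, which is why a single microsatellite can have many more alleles than a SNP.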


About this article

Cite this article

Cavalli-Sforza, L. The Human Genome Diversity Project: past, present and future. Nat Rev Genet 6, 333–340 (2005).
