The discovery of the structure of DNA1, and the realization that the chemical basis of mutations is changes in the nucleotide sequence of the DNA, meant that the history of a piece of DNA could be traced by studying variation in its nucleotide sequence found in different individuals and in different species. But it was not until rapid and inexpensive methods became available for probing DNA sequence variation in many individuals that the efficient study of molecular evolution in general — and of human evolution in particular — became feasible. Thus, the development in the 1980s of techniques for efficiently scoring polymorphisms with restriction enzymes and amplifying DNA2,3 enabled the study of molecular evolution to become a truly booming enterprise.

What follows is a personal and, by necessity, selective attempt to consider what the accelerating pace of exploration of human genetic variation over the past two decades has taught us about ourselves as a species, as well as some suggestions for what may be fruitful areas for future studies.

Primate relations

The first insight of fundamental importance for our understanding of our origins came from comparisons of DNA sequences between humans and the great apes. These analyses showed that the African apes, especially the chimpanzees and the bonobos, but also the gorillas, are more closely related to humans than are the orangutans in Asia4. Thus, from a genetic standpoint, humans are essentially African apes (Fig. 1). Although there had been hints of this from molecular comparisons of proteins5,6, it was a marked shift from the earlier common belief that humans represented their own branch separate from the great apes.

Figure 1: Tree showing the divergence of human and ape species.

Approximate dates of divergences are given for, from left to right, orangutan, gorilla, human, bonobo and chimpanzee.

Our sense of uniqueness as a species was further rocked by the revelation that human DNA sequences differ by, on average, only 1.2 per cent from those of the chimpanzees7, as a consequence of humans and apes sharing a recent common ancestry. It should be noted that the dating of molecular divergences has uncertainties of unknown magnitude attached, not least because of calibration based on palaeontological data. Nevertheless, it seems clear that the human evolutionary lineage diverged from that of chimpanzees about 4–6 million years ago, from that of gorillas about 6–8 million years ago, and from that of the orangutans about 12–16 million years ago7. Before the advent of molecular data, the human–chimpanzee divergence was widely believed to be about 30 million years old.
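
As a rough illustration of what these figures imply, the sketch below converts the 1.2 per cent divergence and the 4–6 million year range into an average substitution rate per lineage. It assumes that differences accumulate at the same constant rate along both lineages and ignores variation already present in the ancestral population; the function name `rate_per_site_per_year` is hypothetical, and the calculation is not taken from ref. 7.

```python
def rate_per_site_per_year(divergence: float, split_years: float) -> float:
    """Average substitution rate per site per year along one lineage.

    Differences accumulate on both branches after the split, so the
    total branch length separating the two species is 2 * split_years.
    """
    if split_years <= 0:
        raise ValueError("split_years must be positive")
    return divergence / (2.0 * split_years)


HUMAN_CHIMP_DIVERGENCE = 0.012  # 1.2 per cent, as quoted in the text

for split in (4e6, 6e6):  # range of split dates quoted in the text
    rate = rate_per_site_per_year(HUMAN_CHIMP_DIVERGENCE, split)
    print(f"split {split:.0e} years ago: {rate:.2e} substitutions per site per year")
```

Under these simplifying assumptions the implied rate lies between roughly 1.0 and 1.5 substitutions per thousand million sites per year, and it inherits all the uncertainty of the palaeontological calibration.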

In fact, we have recently come to realize that the relationship between humans and the African apes is so close as to be entangled. Although the majority of regions in our genome are most closely related to chimpanzees and bonobos, a non-trivial fraction is more closely related to gorillas7. In yet other regions, the apes are more closely related to each other than to us (Fig. 2). This is because the speciation events that separated these lineages occurred so closely in time that genetic variation in the first ancestral species, from which the gorilla lineage diverged, survived until the second speciation event between the human and chimpanzee lineages8. Thus, there is not one history with which we can describe the relationship of our genome to the genomes of the African apes, but instead different histories for different segments of our genome. In this respect, our genome is a mosaic, where each segment has its own relationship to that of the African apes.
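
The effect of closely spaced speciation events can be illustrated with a standard result from coalescent theory, which is not part of the analyses cited above. If the human and chimpanzee lineages share an ancestral species for a time t (measured in units of 2N generations, where N is the effective size of that ancestral population), they fail to find a common ancestor there with probability e^(-t); in that case, two of the three possible pairings in the deeper ancestor give a gene tree that disagrees with the species tree. The minimal sketch below compares this expectation, (2/3)e^(-t), with a simple simulation. The function names and the value of t are illustrative assumptions, not estimates for humans and apes.

```python
import math
import random


def expected_discordance(t: float) -> float:
    """Probability that a gene tree disagrees with the species tree.

    t is the time between the two speciation events in units of 2N
    generations (N = effective size of the ancestral population).
    """
    return (2.0 / 3.0) * math.exp(-t)


def simulate_discordance(t: float, n_regions: int, seed: int = 1) -> float:
    """Fraction of simulated genome regions with a discordant gene tree."""
    rng = random.Random(seed)
    discordant = 0
    for _ in range(n_regions):
        # Waiting time until the human and chimpanzee lineages coalesce,
        # exponential with mean 1 in coalescent units.
        if rng.expovariate(1.0) < t:
            continue  # coalesced within the human-chimpanzee ancestor: concordant
        # Otherwise three lineages remain and any pair may coalesce first.
        first_pair = rng.choice(["human-chimp", "human-gorilla", "chimp-gorilla"])
        if first_pair != "human-chimp":
            discordant += 1
    return discordant / n_regions


if __name__ == "__main__":
    t = 1.0  # hypothetical internode time, chosen only for illustration
    print(f"expected:  {expected_discordance(t):.3f}")
    print(f"simulated: {simulate_discordance(t, 100_000):.3f}")
```

The shorter the interval between the two speciation events relative to the ancestral population size, the larger the share of the genome expected to show one of the minority relationships.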

Figure 2: Within- and between-species variation along a single chromosome.

a, The interspecies relationships of five chromosome regions to corresponding DNA sequences in a chimpanzee and a gorilla. Most regions show humans to be most closely related to chimpanzees (red) whereas a few regions show other relationships (green and blue). b, The among-human relationships of the same regions are illustrated schematically for five individual chromosomes. Most DNA variants are found in people from all three continents, namely Africa (Af), Asia (As) and Europe (Eu). But a few variants are found on only one continent, most of which are in Africa. Note that each human chromosome is a mosaic of different relationships. For example, a chromosome carried by a person of European descent may be most closely related to a chromosome from Asia in one of its regions, to a chromosome from Africa in another region, and to a chromosome from Europe in a third region. For one region (red), the extent of sequence variation within humans is low relative to what is observed between species. The relationship of this sequence among humans is illustrated as star-shaped owing to a high frequency of nucleotide variations that are unique to single chromosomes. Such regions may contain genes that contribute to traits that set humans apart from the apes.

Modern humans

The mosaic nature of our genome is even more striking when we consider differences in DNA sequence between currently living humans. Our genome sequences are about 99.9 per cent identical to each other. The variation found along a chromosome is structured in 'blocks' where the nucleotide substitutions are associated in so-called haplotypes (Figs 2b and 3). These 'haplotype blocks' are likely to result from the fact that recombination, that is, the re-shuffling of chromosome segments that occurs during formation of sex cells (meiosis), tends to occur in certain areas of the chromosomes more often than in others9,10,11. In addition, the chance occurrence of recombination events at certain spots and not at others in the genealogy of human chromosomes will influence the structure of these blocks. Thus, any single human chromosome is a mosaic of different haplotype blocks, where each block has its own pattern of variation. Although the delineation of such blocks depends on the methods used to define them, they are typically 5,000–200,000 base pairs in length, and as few as four to five common haplotypes account for most of the variation in each block (Fig. 3).

Figure 3: The mosaic structure of human genetic variation.

a, Each human chromosome is made up of regions, called 'haplotype blocks', which are stretches of DNA sequence where three to seven variants (at frequencies above 5 per cent in the human population) account for most of the variation found among humans. Each such haplotype found in a block is illustrated here as a bar of different colour. The catalogue of haplotypes for every block makes up the 'haplotype map' of the human genome. b, The chromosomes of two hypothetical individuals are shown. Each individual carries two copies of each block (as humans carry two sets of chromosomes). As the chance that the two haplotypes carried at a block are identical is about 20 per cent, each of us carries an average of about 1.8 different haplotypes per block. Since there are on average 5.5 haplotypes for every block, each individual carries about 30 per cent of the total haplotype diversity of the entire human species. Haplotype blocks tend to be shorter in Africa than elsewhere; as a result, African variation will probably have to be used to define the species-wide block lengths, which may be an average of around 10,000 base pairs. Note that not all of the human genome may have a clearly definable haplotype-block structure.
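
The arithmetic behind these figures can be written out as a short sketch. It uses the 20 per cent chance of carrying two identical haplotypes and the average of 5.5 haplotypes per block given in this legend; the function names are hypothetical.

```python
def haplotypes_carried_per_block(p_identical: float) -> float:
    """Expected number of distinct haplotypes one person carries at a block.

    A person has two copies of each block: one distinct haplotype if the
    copies are identical, two otherwise.
    """
    return 1.0 * p_identical + 2.0 * (1.0 - p_identical)


def share_of_species_diversity(p_identical: float, haplotypes_per_block: float) -> float:
    """Fraction of a block's common haplotypes carried by one individual."""
    return haplotypes_carried_per_block(p_identical) / haplotypes_per_block


carried = haplotypes_carried_per_block(0.20)   # value from the legend
share = share_of_species_diversity(0.20, 5.5)  # values from the legend
print(f"{carried:.1f} haplotypes per block, {share:.0%} of the diversity")
```

With these inputs the ratio is 1.8/5.5, or roughly one third, which the legend rounds to about 30 per cent.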

Of the haplotypes in 928 such blocks recently studied in humans from Africa, Asia and Europe12, 51 per cent were found on all three continents, 72 per cent on at least two continents and only 28 per cent on a single continent. Of those haplotypes that were on one continent only, 90 per cent were found in Africa, and African DNA sequences differ on average more among themselves than they differ from Asian or European DNA sequences13. Therefore, within the human gene pool, most variation is found in Africa and what is seen outside Africa is a subset of the variation found within Africa.

Two parts of the human genome can be regarded as haplotype blocks where the history is particularly straightforward to reconstruct, as no recombination occurs at all. The first of these is the genome of the mitochondrion (the cellular organelle that produces energy and has its own genetic material), which is passed on to the next generation from the mother's side; the second is the Y chromosome, which is passed on from the father's side. Variation in DNA sequences from both the mitochondrial genome14,15,16 and the Y chromosome17, as well as from many sections of the nuclear genome13,18,19,20, has its geographical origin in Africa. Because other evidence suggests that humans expanded some 50,000 to 200,000 years ago21 from a population of about 10,000 individuals, it appears that we expanded from a rather small African population. Thus, from a genomic perspective, we are all Africans, either living in Africa or in quite recent exile outside Africa.

Ancient humans

What happened to the other hominids that existed in the Old World from about 2 million years ago until about 30,000 years ago? For instance, the Neanderthals are abundant in the fossil record and persisted in western Europe until less than 30,000 years ago. Analysis of Neanderthal mitochondrial DNA has shown that, at least with respect to the mitochondrial genome, there is no evidence that Neanderthals contributed to the gene pool of current humans22,23,24,25. It is possible, however, that some as yet undetected interbreeding took place between modern humans and archaic hominids, such as Homo erectus in Asia or Neanderthals in Europe22,26,27.

But any interbreeding would not have significantly changed our genome, as we know that the variation found in many haplotype blocks in the nuclear genome of contemporary humans is older than the divergence between Neanderthals and humans. Thus, the divergence of modern humans and Neanderthals was so recent that, in many regions of the genome, Neanderthal nuclear DNA sequences were probably more closely related to some current human DNA sequences than to those of other Neanderthals. In other words, the overlapping genetic variation that is likely to have existed between different ancient hominid forms makes it difficult to resolve the extent to which any interbreeding occurred.

Nevertheless, the limited variation among humans outside Africa, as well as palaeontological evidence28, suggests that any contribution cannot have been particularly extensive. Thus, it seems most likely that modern humans replaced archaic humans without extensive interbreeding and that the past 30,000 years of human history are unique in that we lack the company of the closely related yet distinct hominids with which we used to share the planet.

Human variation and 'race'

Comparisons of the within-species variation among humans and among the great apes have shown that humans have less genetic variation than the great apes29,30. Furthermore, the early finding that only about 10 per cent of the genetic variation in humans exists between so-called 'races'31 is borne out by DNA sequences, which show that races are not characterized by fixed genetic differences. Rather, for any given haplotype block in the genome, a person from, for example, Europe is often more closely related to a person from Africa or from Asia than to another person from Europe who shares his or her complexion (for example, see ref. 32; Fig. 2).

Claims about fixed genetic differences between races (see ref. 33 for example) have proved to be due to insufficient sampling34. Furthermore, because the main pattern of genetic variation across the globe is one of gene-frequency gradients35, the contention that significant differences between races can be seen in frequencies of various genetic markers36 is very likely due to sampling of populations separated by vast geographical distances. In this context it is worth noting that the colonization history of the United States has resulted in a sampling of the human population made up largely of people from western Europe, western Africa and southeast Asia. Thus, the fact that 'racial groups' in the United States differ in gene frequencies cannot be taken as evidence that such differences represent any true subdivision of the human gene pool on a worldwide scale.

Rather than thinking about 'populations', 'ethnicities' or 'races', a more constructive way to think about human genetic variation is to consider the genome of any particular individual as a mosaic of haplotype blocks. A rough calculation (Fig. 3) reveals that each individual carries on the order of 30 per cent of the entire haplotype variation of the human gene pool. Although not all of our genome may show a typical haplotype-block structure and more research is needed to fully understand the haplotype landscape of our genome, this perspective clearly indicates that each of us contains a vast proportion of the genetic variation found in our species. In the future, we therefore need to focus on individuals rather than populations when exploring genetic variation in our species.

Tracking human traits

What are the frontiers ahead of us in human evolutionary studies? One of them, to my mind, is to identify gene variants that have been selected and fixed in all humans during the past few hundred thousand years. These will include genes involved in phenotypic traits that set humans apart from the apes and at least some archaic human forms (for example, genes involved in complex cognitive abilities, language and longevity). However, an important obstacle in this respect is that there is little detailed knowledge of many of the relevant traits in the great apes. For example, only recently has the extent to which apes possess the capability for language37 and culture38 begun to be comprehensively described. As a consequence, we have come to realize that almost all features that set humans apart from apes may turn out to be differences in grade rather than absolute differences.

Many such differences are likely to be quantitative traits rather than single-gene traits. To have a chance to unravel the genetic basis of such traits, we will need to rigorously define the differences between apes and humans — for instance, how we learn, how we communicate and how we age. In the next few years, geneticists will therefore need to consider insights from primatology and psychology, and more studies will be required that directly compare humans to apes.

There are, however, ways in which we can contribute towards the future unravelling of functionally important genetic differences between humans and apes. For example, we can identify regions of the human genome where the patterns of variation suggest the recent occurrence of a mutation that was positively selected and swept through the entire human population. The sequencing of the chimpanzee genome, as well as the haplotype-map project, will greatly help in this. Further prerequisites include the capability to determine the DNA sequence of many human genomes and the development of tools and methods to analyse the resulting data; in particular, a more realistic model of human demographic history is required.

Collectively these studies will allow us to identify regions in the human genome that have recently been acted upon by selection and thus are likely to contain genes contributing to human-specific traits (Fig. 2). Other interesting candidate genes for human-specific traits are genes duplicated or deleted in humans39, genes that have changed their expression in humans40, and genes responsible for disorders affecting traits unique to humans, such as language41 and a large brain size42.
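
One simple way to picture such a scan is to compare, region by region, the diversity among human sequences with their divergence from a chimpanzee sequence, and to flag regions where the ratio is unusually low. The sketch below only illustrates that idea under strong assumptions (pre-aligned sequences of equal length, no correction for demographic history or for variation in mutation rate). The function names, the toy sequences and the threshold are all hypothetical, and this is not the method of any study cited here.

```python
from __future__ import annotations

from itertools import combinations


def proportion_different(a: str, b: str) -> float:
    """Fraction of aligned sites at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise difference among sequences from one species."""
    if len(seqs) < 2:
        raise ValueError("at least two sequences are needed")
    pairs = list(combinations(seqs, 2))
    return sum(proportion_different(a, b) for a, b in pairs) / len(pairs)


def mean_divergence(seqs: list[str], outgroup: str) -> float:
    """Mean difference between each sequence and an outgroup sequence."""
    return sum(proportion_different(s, outgroup) for s in seqs) / len(seqs)


def diversity_to_divergence(seqs: list[str], outgroup: str) -> float:
    """Ratio of within-species diversity to between-species divergence."""
    divergence = mean_divergence(seqs, outgroup)
    if divergence == 0:
        raise ValueError("no divergence from the outgroup in this region")
    return nucleotide_diversity(seqs) / divergence


# Toy data, invented for illustration only.
regions = {
    "region_A": (["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC"], "ACGTACGGAT"),
    "region_B": (["GGCATTCAGA", "GGCATTCAGA", "GGCATTCAGA"], "GACATTTAGC"),
}
THRESHOLD = 0.2  # arbitrary cut-off, not an empirically derived value

for name, (humans, chimp) in regions.items():
    ratio = diversity_to_divergence(humans, chimp)
    flag = "candidate" if ratio < THRESHOLD else "unremarkable"
    print(f"{name}: diversity/divergence = {ratio:.2f} ({flag})")
```

A low ratio on its own is not evidence of selection, because population bottlenecks and expansions also reduce diversity; this is one reason a more realistic model of human demographic history is needed before such signals can be interpreted.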

A problem inherent in studying genes that are involved in traits unique to humans, such as language, is that functional experiments cannot be performed, as no animal model exists, and transgenic humans or chimpanzees cannot be constructed. A further difficulty is that many genes that enable humans to perform tasks of interest may exert their effects during early development, when our ability to study their expression in both apes and humans is extremely limited.

A challenge for the future is therefore to design ways around these difficulties. This will involve in vitro as well as in silico approaches that study how genes interact with each other to influence developmental and physiological systems. As these goals are achieved, we will be able to determine the order and approximate times of genetic changes during the emergence of modern humans that led to the traits that set us apart among animals.