The idea of fitness is central to evolutionary biology. The British philosopher Herbert Spencer characterized Darwin's theory as “the survival of the fittest” — the survival of the individuals that best fit their environment. In the 1930s, J. B. S. Haldane, Sewall Wright and Ronald Fisher quantified 'fitness' to express the strength of natural selection. Thus, if a mutant genotype suffers a 10% selective disadvantage relative to the wild type, it has a fitness of 90%.
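
The arithmetic in that example can be written out as a small worked equation. The symbols $w$ (relative fitness) and $s$ (selective disadvantage, or selection coefficient) are conventional labels assumed here; the essay itself does not name them.

```latex
w_{\text{wild type}} = 1, \qquad
w_{\text{mutant}} = 1 - s = 1 - 0.10 = 0.90
```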

This concept of fitness has prompted sterile debate along the lines that, because the theory of natural selection states that the fittest survive, and 'the fittest' are defined as those that survive, the whole concept of natural selection is tautological. This misses the point: what matters is that a quantifiable, measurable description of a genotype is extremely useful.

Fitnesses of genotypes tend, empirically, to be roughly constant. Calculations based on this assumption give good predictions of the time course of the spread of advantageous alleles in populations. Fisher showed, in his fundamental theorem of natural selection, that when the fitness of each genotype is constant with time, the mean fitness of a population increases. Fitness describes the present success of a genotype, not the probability of its survival in the long term, which might depend on its capacity to adapt to new environments.
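
As a rough illustration of the kind of calculation meant here, the following sketch traces an advantageous allele under constant fitnesses. It is a minimal model under stated assumptions, not the calculation any particular author used: a haploid population with two alleles, relative fitnesses 1 + s and 1, infinite population size and no mutation. The function names and parameter values are hypothetical.

```python
def allele_trajectory(p0, s, generations):
    """Frequency of an advantageous allele each generation (haploid model).

    Assumptions (not from the essay): two alleles, relative fitnesses
    1 + s and 1 held constant, an infinite population, no mutation.
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        mean_fitness = 1.0 + p * s          # population mean fitness
        p = p * (1.0 + s) / mean_fitness    # standard selection recursion
        freqs.append(p)
    return freqs


def mean_fitnesses(freqs, s):
    """Mean fitness of the population at each recorded allele frequency."""
    return [1.0 + p * s for p in freqs]


# Hypothetical example: a 10% advantage, starting at a frequency of 1%.
freqs = allele_trajectory(p0=0.01, s=0.1, generations=100)
wbar = mean_fitnesses(freqs, s=0.1)
```

In this simple model the allele frequency rises every generation and the mean fitness never decreases, which is consistent with the behaviour described above when genotype fitnesses stay constant.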

One fundamental misunderstanding is that any differences in survival or reproduction between individuals reflect differences in fitness. This is not what fitness means. Fitness represents an expected outcome. What actually happens in small populations differs from expectation because each generation's genotypes are a sample, with an attendant sampling error, of the gametes produced by the previous generation; this is the basis of the phenomenon known as 'genetic drift'. The fitness of a genotype is therefore related only probabilistically to real events: even weakly advantageous mutations are usually lost by chance. Weakly deleterious mutations are much less likely to spread than advantageous alleles, but may arise much more often. Indeed, it is probable that most evolution of amino-acid sequences has occurred by fixation of weakly deleterious changes.
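
The sampling argument can be sketched with a small simulation. This is an illustrative model only, assuming a haploid Wright-Fisher population of constant size in which a single new mutant copy has relative fitness 1 + s; the function name, population size and selection coefficient are hypothetical choices, not values from the essay.

```python
import random


def fixation_fraction(pop_size, s, replicates, seed=0):
    """Fraction of replicates in which a single new mutant copy fixes.

    Assumptions (not from the essay): haploid Wright-Fisher model, constant
    population size, mutant relative fitness 1 + s, no further mutation.
    """
    rng = random.Random(seed)
    fixed = 0
    for _ in range(replicates):
        count = 1  # one new mutant copy
        while 0 < count < pop_size:
            p = count / pop_size
            # Expected mutant frequency among gametes after selection.
            p_sel = p * (1.0 + s) / (1.0 + p * s)
            # Each of the pop_size offspring is an independent draw; this
            # binomial sampling step is the source of genetic drift.
            count = sum(rng.random() < p_sel for _ in range(pop_size))
        if count == pop_size:
            fixed += 1
    return fixed / replicates


# Hypothetical example: a 5% advantage in a population of 100.
fraction_fixed = fixation_fraction(pop_size=100, s=0.05, replicates=300, seed=1)
fraction_lost = 1.0 - fraction_fixed
```

Under these assumptions most replicates end with the mutant lost, even though its expected success is higher than that of the resident type. Setting `s=0.0` gives the purely neutral case for comparison.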

Fitness is hard to measure, particularly in wild populations, because it summarizes expected survival and reproduction. In particular, measuring 'lifetime reproductive success' by painstakingly tracking cohorts of individuals throughout their lives gives data that are difficult to interpret. Each individual in a sexual wild population has a genotype that, as an entity, is unique. An individual's genotype will have a fitness, which will thus be the individual's fitness. But random events cause the lives of individuals to differ, even those with identical fitnesses, and variation in the lifetime reproductive success of individuals does not represent variation in fitness between their genotypes. The only way to measure differences in fitness is to measure differences in mean survivorship and in mean reproductive rate between classes of individuals (groups of individuals that differ biologically). If one is interested in evolution, the only interesting classes are those that differ in their genotypes. A genotypic class, for example, might include all individuals that share the same genotype at a single genetic locus.
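
The distinction between individual outcomes and class means can be illustrated with a toy simulation. It is a sketch under simplifying assumptions: only reproduction is modelled (survival is ignored), every individual makes the same number of breeding attempts, and the two genotype classes and their expected values are hypothetical.

```python
import random
from statistics import mean, pvariance


def simulate_class(expected_success, attempts, n_individuals, rng):
    """Realized offspring counts for individuals sharing one genotype.

    Assumptions (not from the essay): every individual makes the same number
    of breeding attempts, each succeeding independently with probability
    expected_success / attempts, so all share the same expected success.
    """
    q = expected_success / attempts
    return [
        sum(rng.random() < q for _ in range(attempts))
        for _ in range(n_individuals)
    ]


rng = random.Random(42)
class_a = simulate_class(2.0, 4, 5000, rng)  # hypothetical genotype A
class_b = simulate_class(1.8, 4, 5000, rng)  # hypothetical genotype B

# Individuals within a class differ by chance alone ...
variance_within_a = pvariance(class_a)
# ... so fitness is estimated from class means, not individual records.
relative_fitness_b = mean(class_b) / mean(class_a)
```

Every simulated individual in class A has the same expected success, yet realized offspring counts vary widely within the class. Only the ratio of class means recovers the built-in fitness difference between the two genotypes.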

Now, if one is looking at the mean survival and reproduction of genetically defined classes of individuals, there is no point in looking at lifetime reproductive success. It is not worth measuring the fertility of young adults, for example, and then monitoring the fertility of the same individuals year after year as they grow older. You can find out all you need to know by looking at the fertility of different age classes in the same year — the fact that the individuals are different has no effect on the expected mean fitness.

Many view natural selection as an environmental force that acts on the phenotypes of populations, by analogy with artificial selection in animal or plant breeding. Although differences in genotypic fitness are caused by differences in phenotype, the widespread occurrence of pleiotropy — whereby a single genetic change has multiple phenotypic effects — means that it is very difficult to identify the true ways in which fitness differences arise. The most obvious phenotypic differences may not be the most important.

Does Fisher's theorem predict whether organisms become better adapted to their environment with time? In principle, yes, but there are important caveats. First, environments change. It is futile to try to explain human behaviour in adaptive terms, as the environments in which the genes responsible evolved were so different. For example, it may be that bad drivers crash cars more often, and so there is natural selection against individuals who are poor at driving. But this has no causal connection with the fact that humans can drive cars. The genes for this skill were not created by selection against individuals who crashed. Of course, nobody would suggest that they were, yet there are consistent, misguided attempts to explain other aspects of contemporary human behaviour in terms of the consequences, in effects on fitness, of those behaviours in modern environments.

[Figure: Show-off: genes to woo females can boost fitness. Credit: Richard Hamilton Smith/Corbis]

Second, although genes that improve survival tend to increase fitness, so do genes that increase sexual attractiveness, such as those that create the peacock's tail. A population of peacocks that did not evolve a spectacular tail might have been more successful in terms of population density or the probability of survival. Equally, in a population in which random genetic changes made all males slightly less attractive to females, the females would settle for second-best, and the species would get along fine. There is no necessary agreement between mean fitness and ecological variables.

I have used the term 'fitness' to describe differences between genotypes within species. What about the 'fitnesses' of different entities? Some species spread at the expense of others, and some ideas (memes) become better known by imitation. Can the spread of religion, for example, be explained in terms of 'memes' with high 'fitness', as some believe? Logically, 'fitness' could be used for these other entities, in which case its use would remain as circular and non-explanatory as it is for genotypes. But here, fitness is not constant over time, so there is no pragmatic justification for it as a predictive mathematical tool.