  • Review Article

Digital genetics: unravelling the genetic basis of evolution

Key Points

  • Quantitative experiments in evolutionary biology have ushered in a new era. Theories can now be tested experimentally, as they are in the physical sciences. Digital organisms (self-replicating computer programs that evolve and adapt) are the newest tool in this revolution.

  • Digital genetics experiments reveal that genes that evolve with high mutation rates are subject to a new selective force: the pressure to develop robustness with respect to mutations. At sufficiently high mutation rates, robust replicators out-compete faster replicators.

  • Digital genetics also allows the complete storage and characterization of a phylogeny, and provides tools for functional analysis. These tools can be used to follow the evolution of complex genes, which indicates that complicated features can evolve in a step-by-step process.

  • Genes are not just passive repositories for information; they are active participants in the preservation of this information. Environmental regimes exert different pressures on how the information in a gene should be organized, leading to different levels of gene overlap and linkage.

  • Sexual recombination also changes the way in which information is stored in genes, leading to a more modular 'coding style' than is found in asexually reproducing clones. Because modularity affects the degree and form of the interaction of mutations (epistasis), such an adaptation can lead to synergistic epistasis — a prerequisite for sex according to Kondrashov's theory.


Digital genetics, or the genetics of digital organisms, is a new field of research that has become possible as a result of the remarkable power of evolution experiments that use computers. Self-replicating strands of computer code that inhabit specially prepared computers can mutate, evolve and adapt to their environment. Digital organisms make it easy to conduct repeatable, controlled experiments, which have a perfect genetic 'fossil record'. This allows researchers to address fundamental questions about the genetic basis of the evolution of complexity, genome organization, robustness and evolvability, and to test the consequences of mutations, including their interaction and recombination, on the fate of populations and lineages.

Figure 1: Avidian genomes.
Figure 2: Use of a phylogeny tool for reconstructing the evolution of digital organisms.

I would like to thank R.E. Lenski, D. Misevic, C. Ofria, and C.O. Wilke for extensive discussions. I am further indebted to D. Misevic for kindly providing me with figure 1b, and D. Weise for the illustration in box 3.

Genetic algorithm

A computational method that uses Darwinian methods to search for a rare solution (encoded into a symbolic string) within a large search space. Typically, the best strings in a population are mutated and recombined to form the next generation, whereas inferior strings are removed.
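The loop described above (rank the strings, remove the inferior ones, recombine and mutate the rest) can be sketched as follows. This is a minimal illustration, not the method of any particular study: the `one_max` objective, the function names, the parameter values, and the choice to carry the best string over unchanged are all assumptions made for the example.

```python
import random

def one_max(bits):
    """Toy fitness (assumed for illustration): the number of 1s in the string."""
    return sum(bits)

def evolve(length=20, pop_size=30, generations=40, mut_rate=0.02, seed=0):
    """Run a small genetic algorithm and return the best fitness per generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=one_max, reverse=True)
        best_history.append(one_max(pop[0]))
        parents = pop[: pop_size // 2]       # the inferior half is removed
        children = [parents[0][:]]           # assumption: keep the best string unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)   # one-point recombination
            child = a[:cut] + b[cut:]
            # Point mutation: flip each bit with probability mut_rate.
            child = [bit ^ 1 if rng.random() < mut_rate else bit for bit in child]
            children.append(child)
        pop = children
    return best_history

history = evolve()
```

Because the best string is copied into the next generation, the values in `history` never decrease; dropping that copy would give a plainer but noisier variant.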


Sympatry

The condition in which the distributions of two species overlap and hybridization between taxa would be possible if they were not reproductively isolated by factors other than spatial separation.


Mutational robustness

The ability of an organism's genome to withstand point mutations (substitutions) without losing fitness. Numerically, this is expressed as the fraction of all possible single substitutions that do not change the organism's fitness. At high mutation rates, an increase in mutational robustness is selected for.
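The numerical definition above can be computed by brute force for a short genome. The sketch below assumes a genome written as a string over a small alphabet and a hypothetical `toy_fitness` function in which only the first three sites matter; neither is taken from Avida.

```python
def mutational_robustness(genome, alphabet, fitness):
    """Fraction of all single substitutions that leave fitness unchanged."""
    base = fitness(genome)
    neutral = 0
    total = 0
    for i, original in enumerate(genome):
        for symbol in alphabet:
            if symbol == original:
                continue  # not a substitution
            mutant = genome[:i] + symbol + genome[i + 1:]
            total += 1
            if fitness(mutant) == base:
                neutral += 1
    return neutral / total

def toy_fitness(genome):
    """Hypothetical fitness: only the first three sites are functional."""
    return 1.0 if genome[:3] == "abc" else 0.0

robustness = mutational_robustness("abcaaa", "abc", toy_fitness)
```

Here half of the sites are functional, so half of the 12 possible single substitutions are neutral and `robustness` is 0.5.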

Central processing unit

(CPU). The 'brain' of a computer, where the information is processed. In Avida, a simulated CPU translates the genetic information into actions that represent the phenotype of the information.

Carrying capacity

In ecology, the maximum number of organisms of a particular species an ecosystem can support. In the single-niche or multi-niche digital system Avida, this is simply the total number of organisms, which can be set by the user.

Poisson distribution

The distribution of the number of discrete events that occur during a fixed amount of time. For an average rate of occurrence λ, the probability of observing k events during time τ is $P(k) = e^{-\lambda\tau}(\lambda\tau)^{k}/k!$
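As a small worked example, the Poisson probability (with its standard decaying factor e^(-λτ)) can be evaluated directly. The function and parameter names are chosen for this illustration only.

```python
import math

def poisson_pmf(k, rate, duration):
    """Probability of exactly k events in a window of length `duration`,
    given an average event rate `rate` per unit time."""
    mean = rate * duration  # expected number of events, lambda * tau
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Example: a rate of 0.5 events per unit time, observed over 2 time units.
p_none = poisson_pmf(0, 0.5, 2.0)  # probability that no event occurs
```

With λτ = 1, the probability of seeing no event at all is e^(-1), and the probabilities over all k sum to 1.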

Implicit mutation

A genetic change that appears as a consequence of a faulty replication process rather than owing to an explicit mutation agent such as substitution, insertion, deletion or recombination. Examples include repetitions, as well as excisions of whole segments of code.

Replication loop

In digital organisms, the segment of code responsible for the duplication of genetic information. In most cases, this segment is 'looped over' many times to effect replication.


Update

An arbitrary unit of time in digital life experiments, during which every member of the population executes a finite number of instructions (usually set to 30). The number of updates that elapse during one generation is not fixed because the time to produce an offspring changes during evolution.


Fixation

In population genetics, the fixation of an allele or trait is defined as the moment at which every member of the population carries that allele.

Clonal interference

The competition between beneficial mutations or alleles before fixation. In clonal populations within a single niche, only one of several competing (interfering) mutations can go to fixation.

Quasispecies model

A theoretical description of a population of self-replicators at high mutation rate, which is characterized by a 'cloud' of mutationally interdependent types rather than a single, dominant wild type. 'Species' here refers to the species concept in chemistry rather than biology.
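One way to see why a mutationally robust type can win at high mutation rate is a back-of-the-envelope calculation. The sketch below rests on simplifying assumptions that are not part of the definition above: mutations per genome are Poisson distributed with mean `u`, a fraction `neutral` of them has no effect, and every other mutation is lethal. All names and numbers are hypothetical.

```python
import math

def effective_growth(rate, neutral, u):
    """Replication rate multiplied by the probability that an offspring
    carries no non-neutral mutation (Poisson, mean u * (1 - neutral))."""
    return rate * math.exp(-u * (1.0 - neutral))

# Two hypothetical types: a fast but fragile one and a slower, robust one.
fast = {"rate": 2.0, "neutral": 0.2}
flat = {"rate": 1.5, "neutral": 0.6}

low_u = (effective_growth(u=0.1, **fast), effective_growth(u=0.1, **flat))
high_u = (effective_growth(u=2.0, **fast), effective_growth(u=2.0, **flat))
```

Under these assumptions the fast type has the larger effective growth at `u = 0.1`, and the robust type overtakes it at `u = 2.0`; the crossover lies where the ratio of replication rates equals the ratio of the exponential factors.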

Phylogenetic depth

The total number of generations in which an offspring organism has differed from its parent, or the cumulative number of genetic changes that separate an organism from the ancestor of the lineage.
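The first reading of this definition (counting the generations in which an offspring differs from its parent) is easy to compute from a stored line of descent. The sketch assumes the lineage is available as a list of genome strings ordered from ancestor to descendant; the example data are made up.

```python
def phylogenetic_depth(lineage):
    """Count parent-to-offspring steps along a lineage in which the genome changed."""
    return sum(1 for parent, child in zip(lineage, lineage[1:]) if child != parent)

# Hypothetical line of descent: four generations after the ancestor, two with changes.
lineage = ["abc", "abc", "abd", "abd", "xbd"]
depth = phylogenetic_depth(lineage)
```

For this lineage `depth` is 2, although four generations have elapsed.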

Kondrashov's mutational deterministic hypothesis

The hypothesis that sex will evolve and be maintained in populations at high mutation rates if mutations interact in a synergistic (that is, aggravating) fashion, but not if mutations interact antagonistically.
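The difference between synergistic and antagonistic interaction can be made concrete with a commonly used parametrization, assumed here purely for illustration: fitness after n deleterious mutations is w(n) = exp(-α n^β), with β > 1 synergistic, β < 1 antagonistic and β = 1 independent (multiplicative) effects. The parameter values are arbitrary.

```python
import math

def fitness(mutations, alpha=0.1, beta=1.0):
    """Assumed fitness model: w(n) = exp(-alpha * n ** beta)."""
    return math.exp(-alpha * mutations ** beta)

# Compare the fitness of a double mutant with the product of two single mutants.
synergistic = fitness(2, beta=2.0) < fitness(1, beta=2.0) ** 2    # worse than expected
antagonistic = fitness(2, beta=0.5) > fitness(1, beta=0.5) ** 2   # better than expected
```

Both comparisons evaluate to `True`: with β = 2 the second mutation hurts more than the first, and with β = 0.5 it hurts less.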

Mutational load

The fitness reduction of a population owing to mutations in the gene pool.

Genetic drift

In evolutionary genetics, a random process that can lead to the fixation of neutral and even deleterious alleles.
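Drift can be illustrated with a neutral resampling simulation in the style of the Wright-Fisher model, which is an assumption of this sketch rather than something described above. Each generation, every individual draws its allele at random from the previous generation, and no allele has any fitness advantage. Function and parameter names are hypothetical.

```python
import random

def drift_to_fixation(pop_size=50, start_copies=25, seed=1, max_generations=100_000):
    """Resample a neutral allele each generation until it is fixed or lost.

    Returns the final number of copies and the number of generations simulated.
    """
    rng = random.Random(seed)
    copies = start_copies
    generations = 0
    while 0 < copies < pop_size and generations < max_generations:
        freq = copies / pop_size
        # Each of the pop_size offspring carries the allele with probability freq.
        copies = sum(rng.random() < freq for _ in range(pop_size))
        generations += 1
    return copies, generations

final_copies, elapsed = drift_to_fixation()
```

Even without selection, the allele ends up either fixed (`final_copies == pop_size`) or lost (`final_copies == 0`); which of the two happens depends only on chance.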

Fitness landscape

A visualization of the relationship between genotypes (providing the domain) and phenotype (the fitness) in evolutionary dynamics, for which the fitness can be characterized by a single real value, such as the rate of replication.


Modularity

In computer science, the degree to which a program is structured in independent components (modules) that can be moved around or modified without having an effect on other components. In genetics, the degree to which a function is carried out by independent genes.

Twofold cost of sex

In a sexual population with two sexes, the twofold growth disadvantage of a population that has a 1:1 sex ratio, which is due to the fact that only females give birth. The opposite occurs when a population comprises only self-fertilizing females.
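The arithmetic behind the twofold disadvantage can be written out in a few lines. The sketch assumes non-overlapping generations and the same number of offspring per mother in both populations; the function name and the numbers are illustrative only.

```python
def next_generation_size(population, offspring_per_mother, female_fraction):
    """Population size after one generation when only females give birth."""
    return population * female_fraction * offspring_per_mother

# Hypothetical comparison with 100 individuals and 2 offspring per mother.
sexual = next_generation_size(100, 2, female_fraction=0.5)   # 1:1 sex ratio
asexual = next_generation_size(100, 2, female_fraction=1.0)  # every individual reproduces
```

The all-female population doubles to 200 while the population with a 1:1 sex ratio stays at 100, a factor of two per generation.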

Muller's ratchet

In population genetics, the irreversible loss of alleles due to chance in small populations that reproduce asexually.

Red-Queen effect

The theoretical result of continuing evolutionary competition between host and parasite genes (or sometimes between competing mutations in clonal interference) that requires continued evolutionary innovation to survive.

