# Evolution of genetic redundancy

## Abstract

Genetic redundancy means that two or more genes are performing the same function and that inactivation of one of these genes has little or no effect on the biological phenotype. Redundancy seems to be widespread in genomes of higher organisms1,2,3,4,5,6,7,8,9. Examples of apparently redundant genes come from numerous studies of developmental biology10,11,12,13,14,15, immunology16,17, neurobiology18,19 and the cell cycle20,21. Yet there is a problem: genes encoding functional proteins must be under selection pressure. If a gene was truly redundant then it would not be protected against the accumulation of deleterious mutations. A widespread view is therefore that such redundancy cannot be evolutionarily stable. Here we develop a simple genetic model to analyse selection pressures acting on redundant genes. We present four cases that can explain why genetic redundancy is common. In three cases, redundancy is even evolutionarily stable. Our theory provides a framework for exploring the evolution of genetic organization.

## Main

There are an increasing number of observations demonstrating that experimental inactivation of certain genes has no apparent effect on the phenotype or fitness of an animal. In specific cases, it seems that the natural function of a gene can be taken over by another gene. Such a redundant genetic organization is sensible from an engineer's point of view: important functions require back-up devices that can take over in case of failure. But can natural selection favour the emergence and stability of redundant genes?

Consider a population of animals in which some essential function can be performed by genes at either of two loci, A and B. (We use the word ‘function’ to refer to an effect of a gene during development; thus two genes coding for different proteins can have the same function.) Non-functional alleles, a and b, arise by mutation at rates $u_a$ and $u_b$ per generation; reverse mutations are ignored. For simplicity, we consider a haploid population, but the models can be extended to diploid populations and the conclusions remain essentially unchanged. There are four genotypes: AB, Ab, aB and ab. In each generation, random mating is followed by mutation and selection. Natural selection can maintain both genes if redundancy is only apparent, that is, if the AB genotype is fitter than the other genotypes. Less obvious is the question of whether natural selection can maintain true redundancy in the sense that an individual with one of the two genes is as fit as an individual with both. Models 1–3 will address this question. Model 4 studies the consequence of developmental errors.

In model 1, we assume that both genes are equally effective, and that each can function perfectly on its own (Fig. 1a). The fitness of AB, Ab and aB is one, while the fitness of ab is zero. Let us first consider the case where the mutation rates in both genes are the same: $u_a = u_b = u$. The system admits a line of equilibria. All trajectories converge to this line. For small mutation rates, the maximum equilibrium frequency of AB is approximately $1 - 2\sqrt{u/r}$, where $r$ is the recombination rate between the two loci. Thus a large proportion of individuals can carry functional alleles for both genes.

There is, however, an important caveat. We have assumed that the mutation rates $u_a$ and $u_b$ are equal. But any small deviation from $u_a = u_b$ destroys the equilibrium line. If $u_a \neq u_b$ then model 1 does not admit any interior equilibrium, and redundancy does not survive22. A simple way of understanding this result is as follows. At equilibrium, the rate at which deleterious genes arise by mutation must equal the rate at which they are removed by selection. Because only ab individuals are removed selectively, the rates of removal of the two genes are equal: therefore the rates at which they arise by mutation must be equal.

If $u_a > u_b$, then A will become extinct while B will be fixed. But if mutation rates are very similar, this may take a long time. In model 1, A declines as $\exp[-(u_a - u_b)T]$, where $T$ is the number of generations22. (This represents an upper limit.) For a mutation rate of $10^{-6}$ per gene per generation, and a 10% difference between $u_a$ and $u_b$, the average lifetime of redundancy is about $10^{7}$ generations. Therefore, a certain amount of redundancy in our genomes could be the consequence of recent gene duplication events. We note, however, that redundancy cannot only be a consequence of gene duplication because very different genes can also show overlapping redundancy.
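The exponential can be read off the model 1 recursion given in Methods. The following is a sketch only, writing $p_A$ and $p_B$ for the frequencies of the functional alleles (notation introduced here for illustration). Mating leaves allele frequencies unchanged, mutation multiplies $p_A$ by $(1-u_a)$ and $p_B$ by $(1-u_b)$, and selection against ab divides both by the same mean fitness. Taking the 10% difference to mean $u_a = 1.1 \times 10^{-6}$ and $u_b = 1.0 \times 10^{-6}$:

$$
\frac{p_A(T)}{p_B(T)} = \left(\frac{1-u_a}{1-u_b}\right)^{T} \frac{p_A(0)}{p_B(0)} \approx e^{-(u_a-u_b)T}\,\frac{p_A(0)}{p_B(0)},
\qquad
\frac{1}{u_a-u_b} = \frac{1}{1.1\times10^{-6} - 1.0\times10^{-6}} = 10^{7}\ \text{generations}.
$$

Because $p_B \le 1$, the right-hand side bounds the frequency of A from above when both alleles start near fixation.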

Several authors have studied stochastic versions of model 1 and computed the time it takes for random drift to eliminate one of the two genes even if mutation rates are exactly equal23,24,25,26,27,28.

In model 2, we assume that the genes A and B perform the same function, but with slightly different efficacies (Fig. 1b). Suppose A performs the function with an efficacy of one, while B does it with a reduced efficacy, h. If both genes are present, the function is performed with the higher of the two efficacies; this is essentially a definition of redundancy. Thus the fitness of genotypes AB and Ab is one, while the fitness of genotype aB is h. The fitness of ab is zero. Unexpectedly, this can lead to a stable equilibrium with both genes A and B maintained in the population, provided that the mutation rate in A is higher than in B. Redundancy is maintained because gene B, with the lower efficacy, also has a lower mutation rate, and is maintained by selection in genotypes that carry the inactive allele a. In this case, B is fully redundant in the sense that its inactivation has no effect on fitness, whereas deletion of A causes a small reduction in fitness. A stable equilibrium is also possible if the fitness of Ab is higher than the fitness of AB, which in turn has a higher fitness than aB. In this case, redundancy is even maintained at a cost.

Model 3 relates pleiotropy to redundancy (Fig. 1c). Pleiotropy implies that genes perform more than one specific function (for example, by being expressed at more than one time and place in the developing organism29). The idea is that redundancy between two genes occurs only with respect to a given function, while the genes are maintained by selection because of another, independent function. Redundancy arises as a consequence of ‘functional overlap’ between genes.

In the simplest case, there are two functions, $F_1$ and $F_2$, and two genes, A and B. Suppose A performs $F_1$ with an efficacy of one, while B performs $F_1$ with a slightly reduced efficacy, $h$, and $F_2$ with an efficacy of one. Mutations in A lead to the inactive variant a; the mutation rate is $u_a$. In the second locus, we consider two types of mutants: $b_1$ has lost the ability to perform $F_1$, but still performs $F_2$; $b_2$ is completely inactive. The mutation rate from B to $b_1$ is $u_{b1}$.

The fitness of each variant is evaluated by assuming that each function is performed with the efficacy of the most efficient gene, and the overall fitness is the product of the efficacies at which the two functions are performed. Therefore genotypes AB and A$b_1$ have a fitness of one, genotype aB has fitness $h$, and all other genotypes have a fitness of zero.
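The fitness rule can be tabulated directly. The sketch below is a minimal illustration, not the authors' code: the value `H = 0.99` is an assumed efficacy chosen for illustration, and the names `EFFICACY`, `fitness` and `table` are hypothetical. Gene A is taken to contribute nothing to the second function, which is what the zero fitness of genotypes lacking a working B-locus allele implies.

```python
from itertools import product

H = 0.99  # assumed illustrative efficacy of B for the first function (the text's h)

# Efficacy of each allele for the two functions (F1, F2), following the text.
EFFICACY = {
    "A":  (1.0, 0.0),  # A performs F1 only (assumption: no contribution to F2)
    "a":  (0.0, 0.0),  # inactive variant of A
    "B":  (H,   1.0),  # B performs F1 with efficacy h and F2 with efficacy one
    "b1": (0.0, 1.0),  # has lost F1 but still performs F2
    "b2": (0.0, 0.0),  # completely inactive
}

def fitness(allele_a, allele_b):
    """Product over both functions of the best efficacy present in the genotype."""
    f1 = max(EFFICACY[allele_a][0], EFFICACY[allele_b][0])
    f2 = max(EFFICACY[allele_a][1], EFFICACY[allele_b][1])
    return f1 * f2

# All six haploid genotypes: one allele at the A locus, one at the B locus.
table = {a + b: fitness(a, b) for a, b in product(("A", "a"), ("B", "b1", "b2"))}

for genotype, w in table.items():
    print(f"{genotype:4s} {w}")
```

Only AB, Ab1 and aB come out with non-zero fitness, so selection removes every genotype carrying b2 as well as ab1.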

Using the same framework as above, we find that a stable equilibrium with AB is possible provided the mutation rate $u_{b1}$ is smaller than $u_a$. Functional overlap (and therefore redundancy in performing function $F_1$) is stable if the mutation rate at which mutants are produced that have lost the functional overlap but still maintain the original function is lower than the mutation rate of producing inactive mutants at the other locus ($u_{b1} < u_a$). This is plausible, as mutations that destroy all functions of a gene are likely to be more common than mutations destroying one function but leaving another unaffected. This is especially true if the two functions are very similar.

Models 2 and 3 show that true redundancy can be evolutionarily stable, but in both cases the relevant selection pressures are weak if the mutation rates are low. For weak selection pressures to counteract random drift the population size has to be large. In our case the population size has to exceed 1/u for redundancy to be maintained.

Models 1–3 explore the maintenance of redundancy. We would also like to understand the origin of redundancy and extend the idea to a larger number of genes and functions. Figure 2 shows computer simulations based on a stochastic version of our model. The starting configuration is neither redundant nor pleiotropic: each function is performed by one gene, and each gene performs one function. During the simulation, mutations that lead to functional overlap arise spontaneously and can be favoured by selection. Genes evolve to perform additional functions. Redundancy is selected and can be fixed in the population if the mutation rate for complete inactivation of a gene is higher than the mutation rate for inactivating only a specific function of a gene. Final configurations often consist of complex redundancy–pleiotropy networks in which each function is performed by several genes and each gene performs several functions.

Finally, model 4 considers ‘developmental errors’ (Fig. 3). The transmission of information from the egg to the adult organism is subject to errors3. Let us therefore consider the possibility that a gene is intact in the germ line but fails to perform its function during development. We suggest that such developmental failure may arise either by somatic mutation, by errors in the origin and maintenance of cell differentiation (for example, in copying DNA methylation patterns), or through errors in cell-to-cell signalling30.

In terms of our model, we assume that two genes, A and B, perform the same function, but that in the course of development gene A fails to perform it with probability $\delta_a$ and gene B fails with probability $\delta_b$. If both genes fail, the function is not performed and the animal does not survive.

In the absence of developmental error, the genotypes AB, Ab and aB have a fitness of one, but taking into account developmental error, the average fitnesses of AB, Ab and aB are respectively $1 - \delta_a\delta_b$, $1 - \delta_a$ and $1 - \delta_b$. As before, the mutation rates to produce inactive genes a and b are $u_a$ and $u_b$. We find that the redundant genotype, AB, is stable if $u_a < \delta_b$ and $u_b < \delta_a$. Thus the mutation rate in each gene has to be smaller than the developmental error rate in the other gene. It is plausible that such developmental failures are more frequent than germline mutations: repeated rounds of cell replication provide increased probabilities for somatic mutation; errors can occur in the DNA methylation pattern and may result in incorrect cell differentiation; and interactions with other cells and response to signals can fail. Therefore we expect the error rate for normal expression of developmental genes, per individual ontogeny, to be higher than germline mutation rates. This is also supported by observations that routine examination of large numbers of embryos in each phase of development reveals spontaneous cases of obvious morphogenetic failure much in excess of germline mutation rates.

Model 4 suggests that redundancy should be more common in developmental genes that are expressed in specific spatio-temporal patterns in the body than in genes encoding for ‘housekeeping’ functions that are required in all cells (for example, essential metabolic enzymes). Somatic mutations or failures of gene expression that simply kill the cell in which they occur may have much less phenotypic effect than similar events that misguide subsequent developmental signals. Therefore the developmental error rate should be higher in genes that are not required in every cell.

The model can be extended to more than two genes per function (Fig. 3b). An elegant result is obtained if we consider genes with similar mutation rate, $u$, and similar developmental error rate, $\delta$. The number of redundant genes that can be maintained per function is the largest integer less than $1 + (\log u)/(\log \delta)$. For example, if the mutation rate is $u = 10^{-6}$ and the developmental error rate is $\delta = 10^{-3}$, selection can maintain up to three genes for a given function.

Model 1 shows that in situations where true redundancy is not evolutionarily stable, it may nevertheless take a long time until it is eliminated from the population, provided that mutation rates are small. Models 2 and 3 describe situations in which true redundancy can be maintained indefinitely. Model 3 can lead to complex redundancy networks in which each function is performed by several genes and each gene performs more than one function. Such networks are evolutionarily stable provided that random mutations are more likely to destroy all functions of a gene, rather than destroy just one function while leaving other functions unaffected. Model 4 introduces the concept of developmental errors, and shows that redundancy is evolutionarily stable provided that developmental error rates are larger than mutation rates in the germ line. According to model 4, redundancy should occur for those genes (or functions) that are under a high developmental error rate. The four models are not mutually exclusive; together they explain how mutation and selection can lead to redundant genetic organization.

## Methods

Model 1. Consider a haploid population with genes at two loci, A and B. Non-functional alleles, a and b, arise at mutation rates $u_a$ and $u_b$. There are four genotypes, AB, Ab, aB and ab. The frequencies are $x_1$, $x_2$, $x_3$ and $x_4$, and the fitnesses are $f_1$, $f_2$, $f_3$ and $f_4$, respectively. In each generation there is mating (with recombination), followed by mutation and selection. Mating is described by the difference equations $x_1' = x_1 + D$, $x_2' = x_2 - D$, $x_3' = x_3 - D$ and $x_4' = x_4 + D$. Here, $D = r(x_2 x_3 - x_1 x_4)$, where $r$ is the recombination rate between the A and B loci, a number between 0 and 0.5. Mutation is described by $x_1' = x_1(1 - u_a)(1 - u_b)$, $x_2' = x_1(1 - u_a)u_b + x_2(1 - u_a)$, $x_3' = x_1 u_a(1 - u_b) + x_3(1 - u_b)$ and $x_4' = x_1 u_a u_b + x_2 u_a + x_3 u_b + x_4$. Selection is described by $x_i' = f_i x_i/\bar{f}$, where $\bar{f} = \sum_i x_i f_i$ denotes the average fitness of the population. Suppose both genes perform function F with equal efficacy. We have $f_1 = f_2 = f_3 = 1$ and $f_4 = 0$. For exactly equal mutation rates, $u_a = u_b = u$, there is a line of equilibria given by $x_1 = x_2 x_3\, r(1 - u)/u$. For unequal mutation rates, the gene with higher mutation rate will become extinct.
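The model 1 recursion is short enough to iterate numerically. The following is a minimal sketch, assuming the usual reading of the mating step with $D = r(x_2 x_3 - x_1 x_4)$ and using mutation rates of $10^{-4}$ (far higher than realistic values, chosen only so the iteration converges quickly). The function names `generation` and `iterate` are illustrative, not from the paper.

```python
import math

def generation(x, ua, ub, r, f=(1.0, 1.0, 1.0, 0.0)):
    """One generation: mating, then mutation, then selection.

    x holds the frequencies of (AB, Ab, aB, ab); f holds their fitnesses.
    """
    x1, x2, x3, x4 = x
    # Mating with recombination at rate r.
    d = r * (x2 * x3 - x1 * x4)
    x1, x2, x3, x4 = x1 + d, x2 - d, x3 - d, x4 + d
    # Mutation A -> a at rate ua and B -> b at rate ub (no back mutation).
    y1 = x1 * (1 - ua) * (1 - ub)
    y2 = x1 * (1 - ua) * ub + x2 * (1 - ua)
    y3 = x1 * ua * (1 - ub) + x3 * (1 - ub)
    y4 = x1 * ua * ub + x2 * ua + x3 * ub + x4
    # Selection: weight by fitness and renormalise by the mean fitness.
    w = [fi * yi for fi, yi in zip(f, (y1, y2, y3, y4))]
    mean_fitness = sum(w)
    return tuple(wi / mean_fitness for wi in w)

def iterate(x, generations, ua, ub, r):
    for _ in range(generations):
        x = generation(x, ua, ub, r)
    return x

# Equal mutation rates: the population settles on the line of equilibria.
u, r = 1e-4, 0.5
x1, x2, x3, x4 = iterate((1.0, 0.0, 0.0, 0.0), 20_000, u, u, r)
line_residual = x1 - x2 * x3 * r * (1 - u) / u
print("x1 =", x1, " 1 - 2*sqrt(u/r) =", 1 - 2 * math.sqrt(u / r))
print("distance from equilibrium line:", line_residual)

# Unequal rates: the ratio freq(A)/freq(B) shrinks geometrically.
ua, ub, T = 1.1e-4, 1.0e-4, 5_000
y1, y2, y3, y4 = iterate((1.0, 0.0, 0.0, 0.0), T, ua, ub, r)
ratio = (y1 + y2) / (y1 + y3)
predicted = ((1 - ua) / (1 - ub)) ** T
print("freq(A)/freq(B) =", ratio, " predicted =", predicted,
      " exp form =", math.exp(-(ua - ub) * T))
```

Starting from a population fixed for AB keeps the two loci symmetric, so the equal-rate run ends near the point of the line with the highest AB frequency.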

Model 2. This has the same framework as model 1, but genes A and B perform function F with different efficacies, $h_a$ and $h_b$. Let $h_a > h_b$. The genotype fitnesses are $f_1 = f_2 = h_a$, $f_3 = h_b$ and $f_4 = 0$. Redundancy can be evolutionarily stable if B has a lower mutation rate than A, $u_b < u_a$. If $1 - (h_b/h_a) > u_a > u_b[1 + (1/r)(h_a - h_b)/h_b]$ the equilibrium is $x_1^* = (1 - x_2^*)\,[h_a(1 - u_a) - h_b(1 - u_b)] / [(h_a - h_b)(1 - u_a)]$, $x_2^* = (1/r)\,[u_b/(1 - u_b)]\,[h_a(1 - u_a) - h_b(1 - u_b)] / [h_b(u_a - u_b)]$, $x_3^* = 1 - x_1^* - x_2^*$ and $x_4^* = 0$. For low mutation rates, the equilibrium frequency of the redundant AB genotype is approximately $x_1^* \approx 1 - (1/r)\,[u_b/(u_a - u_b)]\,(h_a - h_b)/h_b$. For example, if $h_a = 1$, $h_b = 0.99$, $u_a = 1.1 \times 10^{-6}$, $u_b = 10^{-6}$ and $r = 0.5$, then the equilibrium frequency of AB is about 0.8.
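The numerical example can be checked by evaluating the closed-form equilibrium and confirming that one generation of the model 2 recursion leaves it unchanged. This is a hedged sketch: the functions `model2_equilibrium` and `generation` are illustrative names, and the minus signs in the formulas are read as $h_a - h_b$ and $u_a - u_b$.

```python
def model2_equilibrium(ha, hb, ua, ub, r):
    """Closed-form equilibrium (x1*, x2*, x3*) for genotypes AB, Ab, aB."""
    k = ha * (1 - ua) - hb * (1 - ub)
    x2 = (1 / r) * (ub / (1 - ub)) * k / (hb * (ua - ub))
    x1 = (1 - x2) * k / ((ha - hb) * (1 - ua))
    return x1, x2, 1 - x1 - x2

def generation(x, ha, hb, ua, ub, r):
    """Mating, mutation and selection for (AB, Ab, aB); ab has fitness zero."""
    x1, x2, x3 = x
    # Mating: with no ab individuals present, D = r * x2 * x3.
    d = r * x2 * x3
    x1, x2, x3 = x1 + d, x2 - d, x3 - d  # the ab class created here is lethal
    # Mutation to the inactive alleles.
    y1 = x1 * (1 - ua) * (1 - ub)
    y2 = x1 * (1 - ua) * ub + x2 * (1 - ua)
    y3 = x1 * ua * (1 - ub) + x3 * (1 - ub)
    # Selection with fitnesses ha, ha, hb, then renormalisation.
    w1, w2, w3 = ha * y1, ha * y2, hb * y3
    total = w1 + w2 + w3
    return w1 / total, w2 / total, w3 / total

# Parameter values quoted in the text.
ha, hb, ua, ub, r = 1.0, 0.99, 1.1e-6, 1.0e-6, 0.5

eq = model2_equilibrium(ha, hb, ua, ub, r)
approx = 1 - (1 / r) * (ub / (ua - ub)) * (ha - hb) / hb
after_one_generation = generation(eq, ha, hb, ua, ub, r)
drift = max(abs(p - q) for p, q in zip(eq, after_one_generation))

print("equilibrium (AB, Ab, aB):", eq)
print("low-mutation approximation for AB:", approx)
print("largest change after one generation:", drift)
```

Evaluating the formula rather than simulating to convergence is a deliberate choice here, because with mutation rates near $10^{-6}$ the approach to equilibrium takes millions of generations.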

This model can be expanded to n genes with different mutation rates and different efficacies. The fitness of a particular genotype is given by the efficacy of the most efficient gene. If less efficient genes have lower mutation rates then stability of several redundant genes is possible. For a large number of genes, however, the conditions on efficacies and mutation rates become very restrictive.

Model 3. Consider two genes, A and B, and two functions, $F_1$ and $F_2$. Gene A performs function $F_1$ with efficacy $h_a$, and gene B performs function $F_1$ with a lower efficacy $h_b$ and function $F_2$ with an efficacy of one. Mutations in A lead to the inactive variant a; the mutation rate is $u_a$. Mutations in B can either lead to variant $b_1$, which has lost the ability to perform function $F_1$ but still performs $F_2$, or to variant $b_2$, which is completely inactive; mutation rates are $u_{b1}$ and $u_{b2}$, respectively. Variant $b_2$ can also arise from $b_1$ at a mutation rate $u_{b3}$. The redundant organization for performing function $F_1$ is evolutionarily stable if $u_{b1} < u_a$. The analysis is similar to model 2 if $u_{b2} \approx u_{b3}$: for low mutation rates, the equilibrium frequency of AB is approximately $x_1^* \approx 1 - (1/r)\,[u_{b1}/(u_a - u_{b1})]\,(h_a - h_b)/h_a$. For the same numerical values as model 2, and assuming that $u_{b1}$ is 10 times smaller than $u_a$, we find that the equilibrium frequency of AB is 0.998. Pleiotropy facilitates redundancy.
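As a worked check of the quoted value, a sketch assuming $h_a = 1$, $h_b = 0.99$ and $r = 0.5$ as in the model 2 example, with the rate of losing only the overlapping function set to one tenth of $u_a$ so that the ratio of mutation rates reduces to $1/9$:

$$
x_1^* \approx 1 - \frac{1}{r}\cdot\frac{u_{b1}}{u_a - u_{b1}}\cdot\frac{h_a - h_b}{h_a}
= 1 - 2 \cdot \frac{1}{9} \cdot 0.01 \approx 0.998 .
$$

The result does not depend on the absolute size of $u_a$, only on the ratio of the two mutation rates.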

Model 4. Consider two genes A and B with mutation rates $u_a$ and $u_b$ and developmental error rates $\delta_a$ and $\delta_b$. Mutation and selection are described by the difference equations $x_1' = (1 - \delta_a\delta_b)(1 - u_a)(1 - u_b)\,x_1/\bar{f}$, $x_2' = (1 - \delta_a)(1 - u_a)(x_1 u_b + x_2)/\bar{f}$, $x_3' = (1 - \delta_b)(1 - u_b)(x_1 u_a + x_3)/\bar{f}$ and $x_4' = 0$, where $\bar{f}$ is such that $x_1' + x_2' + x_3' = 1$. In contrast to models 1–3, recombination is not essential here. The equilibrium frequency of AB is $x_1 = 1/\{1 + [u_a(1 - \delta_b)]/[\delta_b(1 - \delta_a) - u_a(1 - \delta_a\delta_b)] + [u_b(1 - \delta_a)]/[\delta_a(1 - \delta_b) - u_b(1 - \delta_a\delta_b)]\}$. For small values of $u$ and $\delta$, we obtain $x_1 \approx 1/\{1 + [u_a/(\delta_b - u_a)] + [u_b/(\delta_a - u_b)]\}$. Thus necessary conditions for a large $x_1$ are $u_a < \delta_b$ and $u_b < \delta_a$.
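The model 4 recursion can be iterated and compared with the closed-form equilibrium. The sketch below makes two assumptions that should be read as such: the Ab update carries a factor $(1 - u_a)$, taken by symmetry with the aB update (this is the form that reproduces the quoted equilibrium), and the rates $u = 10^{-4}$, $\delta = 10^{-2}$ are illustrative values chosen for fast convergence rather than estimates from the paper. Function names are hypothetical.

```python
def model4_equilibrium(ua, ub, da, db):
    """Closed-form equilibrium frequency of the redundant genotype AB."""
    term_a = ua * (1 - db) / (db * (1 - da) - ua * (1 - da * db))
    term_b = ub * (1 - da) / (da * (1 - db) - ub * (1 - da * db))
    return 1 / (1 + term_a + term_b)

def generation(x, ua, ub, da, db):
    """Mutation followed by selection for (AB, Ab, aB); ab never survives."""
    x1, x2, x3 = x
    # AB dies only if both genes fail during development.
    w1 = (1 - da * db) * (1 - ua) * (1 - ub) * x1
    # Ab relies on gene A alone, which fails with probability da.
    w2 = (1 - da) * (1 - ua) * (x1 * ub + x2)
    # aB relies on gene B alone, which fails with probability db.
    w3 = (1 - db) * (1 - ub) * (x1 * ua + x3)
    total = w1 + w2 + w3
    return w1 / total, w2 / total, w3 / total

# Illustrative (assumed) rates satisfying ua < db and ub < da.
ua = ub = 1e-4
da = db = 1e-2

x = (1.0, 0.0, 0.0)
for _ in range(20_000):
    x = generation(x, ua, ub, da, db)

exact = model4_equilibrium(ua, ub, da, db)
approx = 1 / (1 + ua / (db - ua) + ub / (da - ub))
print("simulated AB frequency:", x[0])
print("closed form:", exact, " small-rate approximation:", approx)
```

With these rates about 98% of the population carries both functional genes at equilibrium.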

The model can be extended to $n$ genes. Suppose all genes have mutation rate $u$ and developmental error rate $\delta$. Let $x_i$ denote genotypes with $i$ genes ($i = 0, \ldots, n$). The population dynamics are $x_{n-k}' = (f_{n-k}/\bar{f}) \sum_{i=0}^{k} \binom{n-i}{k-i}\, u^{k-i}\, x_{n-i}$, where $f_j = (1 - \delta^j)(1 - u)^j$ and $\bar{f}$ is such that all frequencies add to one. The equilibrium can be solved recursively. An equilibrium with the genotype containing all $n$ redundant genes is possible if $f_n > f_{n-1}$. This leads to $n < 1 + (\log u)/(\log \delta)$.
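The criterion $f_n > f_{n-1}$ can be evaluated directly. Dividing both sides by $(1-u)^{n-1}$ and rearranging gives the equivalent form $\delta^{n-1}(1-\delta) > u(1-\delta^{n})$, which avoids subtracting nearly equal numbers. The sketch below uses assumed rates $u = 10^{-7}$ and $\delta = 10^{-2}$, chosen for illustration so that the bound is not an integer; `fitness` and `max_redundant_genes` are hypothetical names.

```python
import math

def fitness(j, u, delta):
    """f_j = (1 - delta**j) * (1 - u)**j for a genotype with j working genes."""
    return (1 - delta ** j) * (1 - u) ** j

def max_redundant_genes(u, delta, n_limit=100):
    """Largest n with f_n > f_(n-1), using the rearranged inequality."""
    n = 0
    # Adding gene number n + 1 pays off while delta**n * (1 - delta)
    # exceeds u * (1 - delta**(n + 1)).
    while n < n_limit and delta ** n * (1 - delta) > u * (1 - delta ** (n + 1)):
        n += 1
    return n

u, delta = 1e-7, 1e-2  # assumed illustrative rates
n_max = max_redundant_genes(u, delta)
bound = 1 + math.log(u) / math.log(delta)

print("largest n with f_n > f_(n-1):", n_max)
print("1 + log(u)/log(delta) =", bound)
```

Parameter pairs for which $1 + (\log u)/(\log \delta)$ is itself an integer sit exactly on the boundary of the inequality, so there the exact comparison of $f_n$ with $f_{n-1}$ decides the outcome rather than the logarithmic approximation.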

Diploid models. Our results for haploid models also apply to diploid models. In diploid models, we distinguish four gametes, AB, Ab, aB and ab, which form nine zygotes: AB/AB, AB/Ab, Ab/Ab, AB/aB, aB/aB, AB/ab, Ab/ab, aB/ab and ab/ab. For each generation we assume that mutation acts on gamete frequency, then zygotes are formed, selection acts on zygotes, and finally new gametes are formed, including the possibility of recombination. In agreement with haploid model 1, we find that the case where all zygotes have high fitness except ab/ab which has low fitness, does not lead to stable redundancy. Cases similar to models 2 and 3 give stable redundancy. Diploid models with developmental errors also give stable redundancy.

There are some additional cases that can lead to redundancy in diploid models. One such case was discovered by Brookfield: it assumes that the double heterozygote, AB/ab, is as fit as the wild type, AB/AB, but Ab/ab, aB/ab and ab/ab have low fitness1. In addition, stable redundancy is also possible for partial dominance where all homozygotes have high fitness, the double heterozygote has a lower fitness, the single heterozygotes have still lower fitness, and ab/ab has lowest fitness.

Classification of redundancy. It is helpful to distinguish three types of genetic redundancy. (1) True redundancy1 denotes the situation where an individual with a redundant genotype, AB, is not fitter than one in which one of the redundant genes has been knocked out, Ab. In model 2, B is truly redundant, but A is not. In cases with pleiotropy, ‘true redundancy’ implies that the fully redundant genotype is not fitter than a genotype where the pleiotropic function of one gene has been eliminated. (2) ‘Generic redundancy’ is the case when an AB individual is only occasionally fitter than an Ab individual. This can be the consequence of rare developmental errors. Another possibility is that AB is only fitter than Ab in some environments. (3) ‘Almost redundancy’ means that the redundant genotype AB is always slightly fitter than any genotype where one of the redundant genes has been knocked out. Of course, the fitness difference should be small if the situation is to be regarded as one of redundancy. Several such examples have been discussed previously5.

## References

1. Brookfield, J. F. Y. Genetic redundancy. Adv. Genet. 36, 137–155 (1997).

2. Brookfield, J. F. Y. Can genes be truly redundant? Curr. Biol. 2, 553–554 (1992).

3. Tautz, D. Redundancies, development and the flow of information. BioEssays 14, 263–266 (1992).

4. Goldstein, D. B. & Holsinger, K. E. Maintenance of polygenic variation in spatially structured populations. Evolution 46, 412–429 (1992).

5. Thomas, J. H. Thinking about genetic redundancy. Trends Genet. 9, 395–399 (1993).

6. Dover, G. A. Evolution of genetic redundancy for advanced players. Curr. Opin. Genet. Dev. 3, 902–910 (1993).

7. Pickett, F. B. & Meeks-Wagner, D. R. Seeing double: appreciating genetic redundancy. Plant Cell 7, 1347–1356 (1995).

8. Bird, A. P. Gene number, noise reduction and biological complexity. Trends Genet. 11, 94–100 (1995).

9. O'Brien, S. J. On estimating functional gene number in eukaryotes. Nature New Biol. 242, 52–54 (1973).

10. Kastner, P. et al. Nonsteroid nuclear receptors: what are genetic studies telling us about their role in real life? Cell 83, 859–869 (1995).

11. Rudnicki, M. A. et al. Inactivation of MyoD in mice leads to up-regulation of the myogenic HLH gene Myf-5 and results in apparently normal muscle development. Cell 71, 383–390 (1992).

12. Saga, Y. et al. Mice develop normally without tenascin. Genes Dev. 6, 1821–1831 (1992).

13. Yang, Y. et al. Functional redundancy of the muscle-specific transcription factors Myf5 and myogenin. Nature 379, 823–825 (1996).

14. Joyner, A. L. et al. Subtle cerebellar phenotype in mice homozygous for a targeted deletion of the En-2 homeobox. Science 251, 1239–1243 (1991).

15. Laney, J. D. & Biggin, M. D. Redundant control of Ultrabithorax by zeste involves functional levels of zeste protein binding at the Ultrabithorax promoter. Development 122, 2303–2311 (1996).

16. Schorle, H. et al. Development and function of T cells in mice rendered interleukin-2 deficient by gene targeting. Nature 352, 621–624 (1991).

17. Taniguchi, T. Cytokine signaling through nonreceptor protein tyrosine kinases. Science 268, 251–255 (1995).

18. Steindler, D. A. et al. Tenascin Knockout Mice: Barrels, Boundary Molecules, and Glial Scars. J. Neurosci. 15, 1971–1983 (1995).

19. Pekny, M. et al. Mice lacking glial fibrillary acidic protein display astrocytes devoid of intermediate filaments but develop and reproduce normally. EMBO J. 14, 1590–1598 (1995).

20. Reed, S. I. G1-specific cyclins in search of an S-phase promoting factor. Trends Genet. 7, 95–99 (1991).

21. Roche, S. et al. Requirement for Src family protein tyrosine kinases in G-2 for fibroblast cell division. Science 269, 1567–1569 (1995).

22. Fisher, R. A. The sheltering of lethals. Am. Nat. 69, 446–455 (1935).

23. Christiansen, F. B. & Frydenberg, O. Selection-mutation balance for two nonallelic recessives producing an inferior double homozygote. Am. J. Hum. Genet. 29, 195–207 (1977).

24. Bailey, G. S. et al. Gene duplication in tetraploid fish: model for gene silencing at unlinked loci. Proc. Natl Acad. Sci. USA 75, 5575–5579 (1978).

25. Allendorf, F. W. Protein polymorphism and the rate of loss of duplicate gene expression. Nature 272, 76–78 (1978).

26. Kimura, M. & King, J. L. Fixation of a deleterious allele at one of two duplicate loci by mutation pressure and random drift. Proc. Natl Acad. Sci. USA 76, 2858–2861 (1979).

27. Li, W.-H. Rate of gene silencing at duplicate loci: a theoretical study and interpretation of data from tetraploid fish. Genetics 95, 237–258 (1980).

28. Ohta, T. Time for spreading of compensatory mutations under gene duplication. Genetics 123, 579–584 (1989).

29. Li, X. & Noll, M. Evolution of distinct developmental functions of three Drosophila genes by acquisition of different cis-regulatory regions. Nature 367, 83–87 (1994).

30. Krakauer, D. C. & Pagel, M. Selection by somatic signals. Phil. Trans. R. Soc. Lond. B 351, 647–658 (1996).

## Acknowledgements

We thank D. Krakauer and K. Sigmund for discussion. This work was supported by the Wellcome Trust (M.A.N.) and the European Community (M.C.B.).

## Author information

### Corresponding author

Correspondence to Martin A. Nowak.

Nowak, M., Boerlijst, M., Cooke, J. et al. Evolution of genetic redundancy. Nature 388, 167–171 (1997). https://doi.org/10.1038/40618
