Is there a limit to the number of genes carried by an organism? Two reasons have been proposed. First, as most mutations are deleterious, for a given per locus mutation rate there must exist an upper limit to the number of genes that is consistent with individual survival. Second, the imprecision of the mechanisms governing gene expression might also restrict genomic complexity. As gene expression errors are probably much more common than mutations, it is expression errors that might seem more likely to impose a limit. However, these errors are not heritable and therefore cannot accumulate in populations. Which of the two sorts of effect is more likely to impose a limit? We address this issue in two ways. First, we ask about the load imposed by each sort of error. We show that the harmful effect of non-heritable failures is higher than that of heritable mutations if p × δ > μ, where p is the rate of non-heritable failures, δ measures the harmful effect of these failures and μ is the rate of heritable mutations. Therefore, although the rate of non-heritable errors might be very high, this does not demonstrate that they are more important than mutations, as their impact must be discounted by the strength of their effects. Further, we note that both theory and evidence suggest that the most common errors are of the least importance. Second, we discuss the population genetics of a new gene duplication. Previous attempts to make a connection between error rates and limits on gene number are based on group selection arguments. These fail to show a direct limitation on the spread of gene duplications. We note that empirical evidence indicates that duplication per se tends to result in expression errors that may be heritable. We therefore argue that a hybrid model, one invoking heritable expression errors, is likely to be the most realistic.
Although progress is an ambiguous concept in evolutionary biology, it is hard to deny that some lineages have become more complex during evolution. One possible measure of complexity is the number of functionally distinct genes. There exists considerable variation between organisms in gene number. Although there are certain exceptions, it is generally true that eukaryotes have more genes than prokaryotes, and multicellular organisms have more genes than unicellulars (Miklos & Rubin, 1996). How can we understand the evolution of this apparent increase in genomic complexity over time?
Such an increase is by no means inevitable. Evolution by natural selection does not predict a general increase of complexity: adaptation to local environmental conditions can be fulfilled by either the loss or the gain of genes, functions or morphological structures. The reduced genomes of numerous parasitic and endosymbiotic species are a case in point (e.g. Charles & Ishikawa, 1999; Fukuda et al., 1999).
According to the most widely accepted scenario, increased gene number arises as a by-product of local adaptation. For example, eukaryotes have a chromosome segregation mechanism that enables the replication of DNA to start at many points simultaneously, compared with the single origin in prokaryotes. As a by-product, these changes also enabled the increase of genome size (Cavalier-Smith, 1985; Maynard Smith & Szathmáry, 1995). Similarly, the acquisition of mitochondria in early eukaryotes may have also reduced the energetic limits on genome size (Vellai & Vida, 1999).
While these sorts of forces might affect genome size, they do not necessarily impose limits on gene number. Further, neither of the above forces can explain the variation in gene number within eukaryotes (Cavalier-Smith, 1985). Two hypotheses have been presented to suggest that gene number might be limited. Limits to the amount of functional DNA may be imposed by the accumulation of harmful mutations or by inappropriate gene expression.
There are a large number of related arguments suggesting that harmful mutations might impose a strict limit on the maximum number of genes. One type of argument emphasizes the impossibility of preserving the non-mutant (master) sequence with growing information content. Some theoretical models (Eigen, 1971; Maynard Smith, 1983; Higgs, 1994) show that if the genomic mutation rate exceeds a critical value, then selection cannot prevent the accumulation of harmful mutations, even under infinite population size. The consequence of this error threshold is the limit it sets on gene number. However, these models assume very specific interactions between mutations. All mutant variants are considered to have lower fitness, which is the same for all of them, irrespective of the number of mutations. It has also been shown that the conclusions of these models cannot be generalized to arbitrary interactions between mutations (Charlesworth, 1990): the error threshold phenomenon arises only under diminishing epistasis, which is rarely observed in extant organisms. Hence, there is no obvious biological justification for the error threshold concept.
Another argument concentrates on the loss of mutation-free variants under the combination of mutation pressure and genetic drift. With a given accuracy of replication and a growing number of genes, the genomic mutation rate also increases. Under finite population size, the enhanced genomic mutation rate may accelerate the accumulation of harmful mutations (Muller, 1964). This effect is even stronger if the decline in population fitness decreases the population size, which in turn facilitates the spread of harmful mutations (Lynch et al., 1993). It is important to note that the accumulation of harmful mutations is irreversible in asexual populations, while recombination may recreate mutation-free genotypes. Hence, it is possible that sexual reproduction may enable the maintenance of a higher number of genes (Hurst, 1995).
If an error threshold places an upper limit on genomic complexity, we should expect to find an inverse relationship between gene number (or more precisely functional DNA content) and the per base pair mutation rate. Although this analysis has not been done precisely, Drake (1991) found that genome size and the per base pair mutation rate were negatively correlated in a wide range of unicellular organisms. If we suppose that for the organisms investigated there exists a positive correlation between gene number and genome size (as seems likely given that these organisms have little junk DNA), then the prediction holds. These data support the idea that gene number cannot increase indefinitely without a compensatory reduction in the per base pair mutation rate. If such compensation cannot occur indefinitely then mutations will impose a limit on gene number.
However, genetic errors alone are unlikely to explain all of the variation in gene number. The effective genomic mutation rate, which measures the total mutation rate in coding regions, is usually several orders of magnitude higher in plants and animals than in microbial eukaryotes (Drake et al., 1998). But, contrary to obvious expectations, it is the latter that have fewer genes. One possible way to reconcile theory and data is to assume that sexual processes occur much more frequently in higher organisms (Hurst, 1995). Unfortunately, much more data would be necessary to resolve this issue.
Furthermore, vertebrates have extensive DNA-methylation, which is potently mutagenic, while also having more genes (Holliday & Grigg, 1993; Smith & Hurst, 1999). That vertebrates have extensive methylation fits much better with Bird’s hypothesis (Bird, 1995) that the sustainable number of genes might be intrinsically hampered by the imprecision of biochemical mechanisms governing gene expression. He considered the fact that these failures occur at a much higher rate than genetic mutations as support for his theory. Such errors are, however, not heritable and therefore cannot accumulate in populations.
Are these two sources of error equally likely to affect the evolution of gene number or is one intrinsically a much stronger force than the other? In this review we address this question by examining two issues. First, does the fact that gene expression failures cannot accumulate across generations mean that they are less important than heritable mutations? Second, when one asks about limits to genomic complexity, all previous analyses have been group selective. For example, they point out that a population with more genes can have a higher chance of extinction if the mutation rate is not adjusted accordingly. But, to understand limits on genomic complexity, we should address the issue by asking about limitations to the spread of a new gene as it enters the population. Therefore we shall ask whether mutations or gene expression errors have an immediate impact on the probability of the spread of a new gene. First, however, we shall discuss in more detail the forms of non-heritable errors.
The plethora of non-heritable errors
According to Bird (1995), the source of imprecision is that the gene is activated or silenced at the wrong time or place. However, even if genes are activated and silenced at the appropriate time, it still does not guarantee successful function: wrong gene products may arise during protein biosynthesis or protein folding (Wickner et al., 1999). Furthermore, although we do not have any clear estimates, it is reasonable to suppose that translocation of proteins across membranes, transport processes within cells (Ellgaard et al., 1999) and cell–cell signalling mechanisms are also error-prone (Krakauer & Pagel, 1996). We can approximately classify non-heritable errors as failures during gene regulation and protein biosynthesis.
Gene regulatory failures
Besides errors during biosynthesis, inappropriate gene expression might arise due to failures of gene silencing. The activities of vast numbers of genes are restricted to particular tissues and/or particular times within the cell cycle. Thus many of the genes are activated only in given cell types at given times.
Inappropriate transcription seems to be especially dangerous to complex multicellular organisms. Estimates of gene number in vertebrates are between 70 000 and 100 000, while no non-vertebrate organism has been found to exceed 25 000 genes. Because the number of genes coding for housekeeping functions is very similar in invertebrates and vertebrates (Bird & Tweedie, 1995), most of the surplus genes in vertebrates are likely to be cell or tissue specific. Accordingly, the ratio of cell specific and housekeeping genes has changed dramatically during the evolution of vertebrates. Therefore, only a small fraction of the total number of genes is expressed in any vertebrate cell type, whereas in invertebrates most of the genes are actively expressed (Bird & Tweedie, 1995).
While detailed estimates of rates of inappropriate expression are hard to come by, it is very reasonable to suspect that gene silencing mechanisms are error-prone. Gene expression seems to have intrinsically stochastic components (Ko, 1992; McAdams & Arkin, 1999). Many regulatory molecules act at very low concentrations, resulting in fluctuations (noise) in the reaction rates. It is possibly this sort of stochasticity that explains why, even under uniform experimental conditions, clonal populations of bacteria show a considerable degree of variation. Although gene expression failures are not heritable in the sense that they are not usually transmitted from parent to offspring, some can be transmitted through cell cycle events. During development, genes are turned on and off, and the expression patterns of differentiated cells are passed on during cell division. DNA-methylation is one means by which this control is exercised and seems to be especially important in some (but not all) complex multicellular organisms (Regev et al., 1998). Owing to the huge number of CpG sites in vertebrate genomes, each either methylated or unmethylated, the number of possible combinatorial patterns of DNA-methylation is enormous. Even if only a small fraction of these can occur by chance events, and only a tiny part of them can actually influence gene regulation, this might still provide substantial variation within an individual (Jones & Laird, 1999).
Failures during protein biosynthesis
Even if the gene is activated and suppressed at the right time and place, plenty of errors can occur during transcription and translation. Expression errors can occur any time between transcription and activity of the final protein product. For example, there might be substitution of one amino acid for another, failure to complete a full-length version of mRNA or protein, frameshift errors and readthrough of a stop codon (Parker, 1989; Ninio, 1991). All are known to occur at appreciable rates. Moreover, even if we have the right protein sequence at the right time, it is still uncertain whether it can reach its active conformation. Large multidomain proteins have much difficulty in reaching their native structure. Aggregate-prone intermediate structures may arise during synthesis, at points of translocation across membranes and during times of stress (Ellgaard et al., 1999). Molecular chaperones play a pivotal part in shifting the balance away from aggregation to proper folding (Nathan et al., 1997).
Which are more important: heritable or non-heritable errors?
There seem to be many different sorts of non-heritable errors, some of which may be quite frequent. Some errors can also be propagated through cell division. For these reasons one might suppose that non-heritable errors are of great importance. However, the changes that are heritable through the cell cycle are mostly reset in the germ line, and, like the gene regulatory errors, cannot accumulate over generations. Does the fact that they cannot accumulate across generations mean that they are less important, or does the fact that such errors occur at a high rate make them more important? In the following we compare the harmful effects of heritable and non-heritable failures. The population genetic consequences of harmful mutations are well-known in the literature (e.g. Crow & Kimura, 1970).
Consider a haploid locus with two different alleles: the wild-type allele will be denoted by (A), which may mutate to a defective allele (a) at a small rate μ (we neglect the probability of back mutations). The relative fitnesses are $W_A = 1$ and $W_a = 1 - s$, respectively (μ < s). Using the standard population genetic framework, it is straightforward to see that at equilibrium the frequency of the deleterious allele is μ/s and the mean fitness of the population is 1 − μ.
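The single-locus result can be checked numerically. The sketch below iterates one generation of selection followed by mutation; the order of events within a generation, the parameter values and the function name are assumptions made for illustration, not details taken from the text.

```python
def equilibrium_frequency(mu, s, generations=20000, q0=0.0):
    """Iterate selection then mutation at a haploid locus with alleles A and a.

    q is the frequency of the defective allele a, censused after mutation.
    """
    q = q0
    for _ in range(generations):
        mean_w = 1.0 - s * q              # mean fitness: W_A = 1, W_a = 1 - s
        q_sel = q * (1.0 - s) / mean_w    # frequency of a after selection
        q = q_sel + mu * (1.0 - q_sel)    # A -> a mutation, no back mutation
    return q


# Hypothetical parameter values satisfying mu < s.
mu, s = 1e-5, 0.01
q_hat = equilibrium_frequency(mu, s)
mean_fitness = 1.0 - s * q_hat

print(q_hat, mu / s)            # iterated frequency against mu/s
print(mean_fitness, 1.0 - mu)   # mean fitness against 1 - mu
```

Under these assumptions the iterated frequency settles at μ/s and the mean fitness at 1 − μ, whatever value of s is chosen (provided μ < s).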
Consider a genome of L loci. Each mutant gene reduces the fitness of the genome by a factor of 1 − s. We shall assume that the fitness of a genotype with i mutations is $W_i = (1 - s)^i$. Under mutation–selection equilibrium, the frequency of a variant with i mutations is
$$x_i = \binom{L}{i}\left(\frac{\mu}{s}\right)^{i}\left(1 - \frac{\mu}{s}\right)^{L-i},$$
and the mean fitness of the population is
$$\bar{W} = (1 - \mu)^{L}. \tag{1}$$
These results are simple manifestations of the Haldane–Muller principle: under multiplicative fitness the mean fitness at mutation–selection equilibrium is independent of the strength of selection against mutations (Crow & Kimura, 1970). The higher fitness reduction they cause, the lower frequency they are present in the population, and these two factors exactly offset each other.
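The independence from s can be illustrated with a short calculation. The sketch assumes that loci are independent at equilibrium, so that the number of mutant loci per genome is binomially distributed with per-locus frequency μ/s; that distributional form, the function name and the parameter values are assumptions made for illustration.

```python
from math import comb


def mean_fitness_at_equilibrium(L, mu, s):
    """Mean fitness with W_i = (1 - s)**i, assuming a binomial number of mutant loci."""
    q = mu / s  # per-locus equilibrium frequency of the defective allele
    total = 0.0
    for i in range(L + 1):
        freq_i = comb(L, i) * q**i * (1.0 - q) ** (L - i)  # frequency of genomes with i mutations
        total += freq_i * (1.0 - s) ** i                   # weighted by their fitness
    return total


# Hypothetical values: 200 loci, per-locus mutation rate 1e-4.
L, mu = 200, 1e-4
results = {s: mean_fitness_at_equilibrium(L, mu, s) for s in (0.01, 0.1, 0.5)}
expected = (1.0 - mu) ** L

for s, w in results.items():
    print(f"s = {s}: mean fitness = {w:.6f}")
print(f"(1 - mu)^L = {expected:.6f}")
```

Stronger selection lowers the frequency of mutant genomes by exactly the amount needed to cancel their larger fitness cost, so every value of s gives the same mean fitness.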
Consider now the effects of non-heritable gene expression defects. Notably, theoretical work on somatic mutations (see, e.g. Orr, 1995) is applicable equally to expression errors, as neither can accumulate across organism generations. Let each locus have a non-heritable error probability p, and an effect on fitness δ. Assuming that errors occur independently at different loci, the mean fitness of the population subjected to non-heritable errors is
$$\bar{W} = (1 - p\delta)^{L}. \tag{2}$$
In this case, the population fitness depends not only on the frequency of failures (pL), but also on the intensity of selection (δ). This is because non-heritable failures cannot be passed on to offspring and therefore cannot cause damage over multiple generations. Using (1) and (2), it is straightforward to see that the non-heritable failures have a higher impact than deleterious mutations if
$$p\,\delta > \mu. \tag{3}$$
If the rate of non-lethal non-heritable errors and heritable mutations is the same (p ≅ μ, δ < 1), then slightly deleterious mutations will have a more serious effect due to their accumulation in the population. Even if the rate of non-heritable errors is much higher than the rate of heritable mutations (p >> μ), it still does not guarantee a higher load on the population: the harmful effect of non-heritable failures must also be high. The issue is hence an empirical one, and we review what is known in the following section.
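The comparison can be made concrete with a few lines of code. The sketch takes the mean fitness under heritable mutations as (1 − μ)^L and, assuming independent errors with multiplicative effects, the mean fitness under non-heritable errors as (1 − pδ)^L. The function names and all numerical values are hypothetical and chosen only to illustrate the condition pδ > μ.

```python
def mean_fitness_heritable(L, mu):
    """Mean fitness at mutation-selection equilibrium (independent of s)."""
    return (1.0 - mu) ** L


def mean_fitness_nonheritable(L, p, delta):
    """Mean fitness when each locus fails with probability p at a fitness cost delta."""
    return (1.0 - p * delta) ** L


def nonheritable_load_is_larger(p, delta, mu):
    """Condition under which non-heritable failures reduce fitness more than mutations."""
    return p * delta > mu


# Hypothetical values: errors are 1000 times more frequent than mutations.
L, mu, p = 10_000, 1e-6, 1e-3

for delta in (1e-4, 1e-2):
    w_mut = mean_fitness_heritable(L, mu)
    w_err = mean_fitness_nonheritable(L, p, delta)
    print(f"delta = {delta}: W(mutations) = {w_mut:.4f}, W(errors) = {w_err:.4f}, "
          f"errors dominate: {nonheritable_load_is_larger(p, delta, mu)}")
```

With these illustrative numbers, errors that are a thousand times more frequent than mutations still impose the smaller load when δ = 10⁻⁴, and the larger load only once δ exceeds μ/p = 10⁻³.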
The rate and fitness effect of non-heritable errors
In contrast to the fairly good estimates of mutation rates (Drake et al., 1998), we have few clear data about the per locus error rate (p) and the harmful effects (δ) of non-heritable failures.
Advocates of non-heritable errors could point to the observation in wild-type bacteria that 30% of the proteins of a given gene are truncated (Kurland, 1992). Alternatively, one might note the high rates of transcriptional and translational failures in bacteria (Parker, 1989). Transcriptional errors occur at a rate of 10⁻⁵ per codon. The mis-sense frequency of translational errors in bacteria ranges between 5 × 10⁻⁵ and 5 × 10⁻³ per codon (Parker, 1989). Some new findings support the idea that no inactive genes are truly silent and that illegitimate transcription occurs in large numbers of somatic cells (Chelly et al., 1989). There are inappropriate transcripts of a given gene in every 10⁴ mammalian cells. Even such an effective gene silencing method as DNA-methylation is error-prone. Some recent work (e.g. Rougier et al., 1998) indicates that chromosome demethylation in mammalian somatic cells occurs with each DNA replication. Methylated cytosines are passively demethylated during cell divisions, leading to the expression of unwanted genes (Rougier et al., 1998).
But what are the effects on fitness? As regards the harmful effects of non-heritable errors, we might make one generalization, namely the more downstream the effect the less damaging it will be. For example, the potential impact of transcriptional error is higher than translational error because usually a large number of proteins are produced from the same transcript. Likewise, gene silencing failure is likely to be more important than mistranscription, because gene silencing failures can propagate through somatic cell division. Any errors specific to the protein once formed are likely to be of less importance as only one protein molecule will be affected. Therefore, the fact that mistranslation goes on at such a high rate (compared with mutation) is not itself enough to indicate that such errors are of importance. The most damaging errors are then likely to be gene silencing failures that are propagated through cell divisions.
We can also note that mutations are expected to cause a much greater loss of fitness than non-heritable failures. This is due to the fact that no error-free gene product can arise from a mutant gene. Take the above-mentioned example that 30% of the proteins of a given gene are truncated. It is reasonable to suppose that for normal functioning, the cell needs a critical number of error-free gene products, and the surplus number can be non-functional variants without serious fitness consequences. As support for this idea, eukaryotes are generally less sensitive to decreased gene dosage. In diploid organisms, most of the deleterious mutations are recessive, probably as a physiological consequence of metabolic pathways (Kacser & Burns, 1981). Similarly, Orr (1991) found that artificially constructed diploids from typically haploid organisms tend to display dominance. Therefore, in most cases even a 50% decrease in the amount of a protein causes only a small reduction in fitness.
However, we do not wish to suggest that all non-heritable errors are of minimal importance. Proof-reading (Parker, 1989; Ibba & Söll, 1999) and surveillance (Culbertson, 1999) mechanisms are used throughout protein biosynthesis. A variety of quality control mechanisms (e.g. chaperones, proteases) are known to operate to remove and to repair misfolded proteins. The presence of these energetically costly mechanisms suggests that intense selection has acted against non-heritable errors. Furthermore, the failure of proper functioning of these systems can lead to protein aggregation associated with prion and amyloid diseases (Prusiner, 1998).
The evidence for a relationship between methylation defects and cancer also suggests that gene silencing failures can have serious fitness consequences (see Holliday, 1990; Jones & Gonzalgo, 1997). Numerous promoters of tumour-suppressor genes are hypermethylated (reviewed in Jones & Laird, 1999), and some transformed mammalian cells show global hypomethylation. Further, some tumours have variable phenotypes. In some cases they can switch to a near-normal state, a finding that is very hard to explain with the mutation theory. Recently, it has also been demonstrated that a large number of human cancer cells (like Wilms’ tumour) show heterogeneous expression of imprinted genes (Cui et al., 1997). The theory that gene silencing failures may initiate cancer progression is consistent with these experimental findings.
Non-heritable errors might also explain some of the decline of performance associated with ageing. For example, the inactive X chromosome frequently becomes reactivated in ageing mice (Brown & Rastan, 1988). More generally, spontaneous loss of methylation with age seems to be a general property of somatic cells (Catania & Fairweather, 1991).
To summarize, while some non-heritable errors might be very damaging and some might be very common, it appears that common ones are of little effect and damaging ones are rare. Simply noting that some errors are very common is not then enough to substantiate a strong effect. At the moment, parameter estimates are too crude (even to within orders of magnitude) to know whether pδ > μ.
Evolution of error rate
The fact that some downstream errors are common is not convincing evidence that non-heritable errors are of great importance. A priori, if an error type were very damaging, then we should expect selection to favour modifiers that reduce the rate of such errors. Modifier genes that reduce error rates may be favoured by selection, even if there is a physiological cost associated with enhanced fidelity. A simple mathematical model can clarify the favourable conditions for invasion.
Consider a population subjected to non-heritable errors with parameters p and δ. Consider a rare modifier that reduces the error rate to p′ (p′ < p). Individuals with the modifier suffer an increased physiological cost. We assume that the expected fitness of the modifier is given by,
where the constant c measures the physiological cost associated with the modifier. The rare modifier can invade the population, if
Using (2) and (4), invasion occurs if,
Thus, we can conclude that the modifier can spread more easily if the net effect of errors on fitness (pδ) is high. Therefore, we might expect that the least damaging errors are those that occur most frequently. In support of this trend are the data suggesting that gene silencing errors, while more serious than errors during protein biosynthesis, occur at a lower rate (Holliday, 1987; Jablonka & Lamb, 1995).
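The invasion argument can be sketched explicitly under one additional assumption, which is ours and is not confirmed by the text: that the cost c acts multiplicatively on the fitness of the modifier. Writing W′ for the fitness of the modifier (a symbol introduced here for illustration) and taking the resident mean fitness as (1 − pδ)^L, the form implied by independent, multiplicative errors, one possible version of the condition is:

```latex
\begin{align*}
W' &= (1 - c)\,(1 - p'\delta)^{L}, \\
W' > \bar{W} \;&\Longleftrightarrow\; (1 - c)\,(1 - p'\delta)^{L} > (1 - p\delta)^{L} \\
&\Longleftrightarrow\; c < 1 - \left(\frac{1 - p\delta}{1 - p'\delta}\right)^{L}
\approx L\,\delta\,(p - p') \quad \text{for } L\,p\,\delta \ll 1 .
\end{align*}
```

Under this assumed form, a modifier that removes a fixed fraction of errors (p′ = kp with 0 < k < 1, where k is a hypothetical parameter) tolerates a cost of up to roughly (1 − k)Lpδ. The tolerable cost therefore grows with pδ, in line with the conclusion that modifiers spread more easily when the net effect of errors is high. An additive cost would change the exact expression but not this qualitative dependence.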
That selection has acted on the rates of such errors is supported by the finding that organisms appear to trade off translational accuracy and replication speed: hyperaccurate ribosome mutants show slower growth rates (reviewed in Kurland, 1992). Note, however, that mutants with increased mis-sense substitution rate display highly decreased growth rates (Kurland, 1992).
In organisms with a limited number of somatic cell divisions, the danger of gene regulatory defects is expected to be much lower (Regev et al., 1998) than the danger in organisms with extensive somatic cell turnover. Therefore, in organisms with a short life span and little cell turnover, the maintenance of gene activity pattern during repeated rounds of cell divisions does not rely so much on DNA-methylation (Jablonka & Lamb, 1995). Accordingly, selection has reduced or even eliminated the genes responsible for DNA-methylation. The fact that the presence of DNA-methylation in metazoa correlates with the amount of somatic cell turnover (Regev et al., 1998) supports the idea that investment into the control of gene expression is a function of the net costs of illegitimate expression.
Besides increasing the fidelity of protein biosynthesis, there is another special way to achieve a reduced frequency of gene expression errors. In contrast to the elimination of mutant genes, the degradation of erroneous gene products can lead to lowered net effects of errors. For example, RNA-surveillance mechanisms prevent the translation of truncated mRNA sequences, with the result that they are rapidly degraded (Culbertson, 1999). Further, a variety of control mechanisms operate during the transportation of proteins across membranes. Only proteins with an adequate folding structure can pass the stringent sorting process; misfolded variants are eliminated by proteases (Wickner et al., 1999).
We would also like to point out that some of the mechanisms against non-heritable errors have presumably evolved even at the cost of an enhanced rate of mutation. As mentioned in the introduction, although DNA-methylation is an efficient gene silencing mechanism, it is also known to be mutagenic. However, there is also evidence that functionally intact mutant proteins with minor structural deficiencies are sometimes retained in the endoplasmic reticulum (ER). A number of inherited disorders are known to be due to mutant proteins aggregating in the ER (for references see Ellgaard et al., 1999).
Constraints on genomic complexity
Population-based arguments are not sufficient
There are a large number of related arguments suggesting that harmful mutations might impose a limit on the maximum number of genes (see Introduction). All of these are based on group selective arguments. They emphasize either the impossibility of preserving the non-mutant sequence or the decline of mean fitness of the population with growing information content. However, these arguments neglect the question of whether any direct limitation exists on the spread of functionally new genes.
It is generally believed that functionally new genes arise by gene (or genome) duplication events followed by divergence. Most of the models assume that the duplicates have originally fully overlapping functions (but see Lewis & Wolpert, 1979). If we accept this, mutations or gene expression errors might theoretically pose a limit either on the spread of functionally equivalent gene duplicates, or on the fixation of diverged copies. It is currently unclear whether divergence is driven by rare advantageous (Ohno, 1970), or by complementary degenerative mutations (Force et al., 1999). Therefore, we concentrate on the possible limits on the fixation of equivalent gene duplicates imposed by deleterious mutations and gene expression failures.
Gene duplications may induce gene expression failures
The evolutionary mechanisms for the spread of gene duplicates are largely unclear. It has often been argued that duplicates are favoured by selection because they can mask harmful mutations. However, mathematical models (e.g. Clark, 1994) have shown that the selection pressure for the preservation of gene duplicates is extremely weak and a duplicate can be favoured by selection only if it provides some direct advantage. Others propose (Nowak et al., 1997) that genetic redundancy is an adaptation against non-heritable developmental errors.
However, the very act of copying is likely to interfere with the gene expression pattern. Gene duplications mostly do not copy all the cis-regulatory regions. Therefore, even if duplicated genes are clustered in a specific genomic region, it still does not guarantee co-ordinated gene expression of the new copy. At the most extreme, a retroposed gene is very unlikely to insert near an appropriate 5′ promoter sequence. Any movement within the genome has the potential to disrupt promoter elements, alter chromatin, etc. Several experimental findings suggest that for correct regulation a given gene must integrate at sites favourable for transcription (e.g. Kioussis & Festenstein, 1997).
There is also direct evidence for the harmful effects of gene duplications. There are some data suggesting that gene duplicates, whether clustered or tandem, can perturb the chromatin structure, leading to uncontrolled gene expression in somatic cells. It has long been known that the increased copy number of both transgenes (Henikoff, 1998) and endogenous sequences (Selker, 1999) can lead to heterochromatin formation and to reduced gene expression. This phenomenon (often called repeat-induced gene silencing) seems to be general: it has been detected in fungi (Selker, 1997), plants (Kooter et al., 1999), invertebrates and mammals (Henikoff, 1998). Sometimes silencing occurs only when the gene is locally repeated. However, in a large number of cases long-range interactions between dispersed copies can also induce heterochromatin formation.
Sometimes, duplicates seem to increase the rate of expression errors rather than disrupting gene expression in all cells of the organism. For example, transgene arrays in vertebrates and Drosophila often show unpredictable levels of expression: the phenotype consists of a random mixture of fully expressed and silenced somatic cell clones (Martin & Whitelaw, 1996). A proportion of cells stochastically silences the transgene, similarly to position effect variegation (Kioussis & Festenstein, 1997). Remarkably, these changes can also propagate through somatic cell divisions. It appears, therefore, that expression errors are likely to be associated with gene duplications, by whatever mechanisms.
It should be recognized that some expression errors are likely to be heritable errors. Indeed, one might consider that the strongest hypothesis for the limitation on gene number is a hybrid of the two usually posed. Heritable expression errors are not only likely to occur as a consequence of duplication, but can accumulate.
How can gene number affect the fate of new duplicates?
The arguments above are not enough to explain any potential limits to gene number. More specifically, one needs to show that the spread of gene duplication becomes harder and harder as gene number increases. We can therefore complete our scenario if we assume that genomic complexity increases either the frequency or the harmful effects of errors at the new gene. We know that duplicated genes mostly do not code housekeeping functions, but rather specific transcription factors or tissue-specific functions. Therefore, it is reasonable to assume that the organisms with the higher number of these genes have more somatic cell types, hence an increased number of target cells for inappropriate gene expression (Bird, 1995).
It is also conceivable that genomic complexity increases not only the frequency but also the fitness effect of gene expression errors. Interactions between transcription factors become more and more complicated as a gene regulatory network is supplemented by ever more elements (e.g. Lenski et al., 1999). In a complex network with a large number of interactions, misexpressions of transcription factors do not remain localized, but rather have numerous downstream consequences. Therefore, the effects of misexpression are expected to become more and more harmful as gene number increases.
In his influential paper Bird (1995) pointed out that inappropriate gene silencing can have a high impact on fitness compared with mutations. He also stated that this would limit the maximum number of sustainable genes. The argument was based on the reasonable assumption that gene regulatory failures occur at a much higher rate than mutations. We have argued that even if the rate of non-heritable failures is much higher than the mutation rate ( p >> μ), it still does not guarantee a higher impact on the population. Slightly deleterious mutations cause a higher (cumulative) fitness reduction due to their persistence over generations under mutation-selection equilibrium. However, the same rule does not apply to non-heritable gene expression failures as they cannot accumulate. Therefore, if the rate of heritable and non-heritable failures were the same, the effect of the latter would be less. Against Bird’s argument is the theory that predicts that the common types of errors will be the least deleterious. Therefore, even establishing that some errors occur at high rates is also not decisive. None the less, the relative importance of heritable and non-heritable errors is still ambiguous.
Even if we accept that the non-heritable failures cause more harm than heritable mutations, it is still unclear how load arguments relate to any limit on gene number. All previous attempts to make a connection between error rates and limits on gene number are based on group selection arguments. However, these arguments fail to show a direct limitation on the spread of gene duplications. We argued that there could be such a direct cost if we consider the effects of gene duplicates on gene expression. For example, a duplicated gene might often be expressed in the wrong tissue or at the wrong time. Therefore, gene number is likely to be limited because gene duplications may disrupt normal gene expression, and not because of the effects of heritable and non-heritable errors on the mean fitness of the population. The fact that some lineages have more functionally distinct genes may be explained if we assume that these lineages can mitigate the harmful effects of errors on gene expression.
Although we cannot estimate the net effect of non-heritable errors, we suggest that they may be important in evolutionary features, other than limitations on gene number, where deleterious mutations have been suggested to be of importance. Several authors (Otto & Orive, 1995; Michod, 1996) have argued that somatic cell variants during development can threaten the integrity of multicellular organisms. These variants are assumed to arise through somatic mutations. We propose that the model can also work if one considers gene expression failures that can propagate through cell divisions, as it is known that they can have serious fitness consequences (e.g. cancer).
Bird, A. (1995). Gene number, noise reduction and biological complexity. Trends Genet, 11: 94–99.
Bird, A. and Tweedie, S. (1995). Transcriptional noise and the evolution of gene number. Phil Trans R Soc B, 349: 249–253.
Brown, S. and Rastan, S. (1988). Age-related reactivation of an X-linked gene close to the inactivation centre in the mouse. Genet Res, 52: 151–154.
Catania, J. and Fairweather, D. S. (1991). DNA methylation and cellular ageing. Mutat Res, 256: 283–293.
Cavalier-Smith, T. (ed.) (1985) The Evolution of Genome Size. John Wiley and Sons Ltd, Chichester.
Charles, H. and Ishikawa, H. (1999). Physical and genetic map of Buchnera, the primary endosymbiont of the pea aphid Acyrthosiphon pisum. J Mol Evol, 48: 142–150.
Charlesworth, B. (1990). Mutation-selection balance and the evolutionary advantage of sex and recombination. Genet Res, 55: 199–221.
Chelly, J., Concordet, J. P., Kaplan, J. C. and Kahn, A. (1989). Illegitimate transcription: transcription of any gene in any cell type. Proc Natl Acad Sci USA, 86: 2617–2621.
Clark, A. G. (1994). Invasion and maintenance of a gene duplication. Proc Natl Acad Sci USA, 91: 2950–2954.
Crow, J. F. and Kimura, M. (1970) An Introduction to Population Genetics Theory. Harper & Row, New York.
Cui, H., Hedborg, F., He, L., Nordenskjold, A., Sandstedt, B., Pfeifer-Ohlsson, S. and Ohlsson, R. (1997). Inactivation of H19, an imprinted and putative tumor repressor gene, is a preneoplastic event during Wilms’ tumorigenesis. Cancer Res, 57: 4469–4473.
Culbertson, M. (1999). RNA surveillance: unforeseen consequences for gene expression, inherited genetic disorders and cancer. Trends Genet, 15: 74–80.
Drake, J. W. (1991). A constant rate of spontaneous mutation in DNA-based microbes. Proc Natl Acad Sci USA, 88: 7160–7164.
Drake, J. W., Charlesworth, B., Charlesworth, D. and Crow, J. F. (1998). Rates of spontaneous mutations. Genetics, 148: 1667–1686.
Eigen, M. (1971). Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften, 58: 465–523.
Ellgaard, L., Molinari, M. and Helenius, A. (1999). Setting the standards: quality control in the secretory pathway. Science, 286: 1882–1888.
Force, A., Lynch, M., Pickett, B. F., Amores, A., Yan, Y. -L. and Postlethwait, J. (1999). Preservation of duplicate genes by complementary, degenerative mutations. Genetics, 151: 1531–1545.
Fukuda, Y., Washio, T. and Tomita, M. (1999). Comparative study of overlapping genes in the genomes of Mycoplasma genitalium and Mycoplasma pneumoniae. Nucleic Acids Res, 27: 1847–1853.
Henikoff, S. (1998). Conspiracy of silence among repeated transgenes. Bioessays, 20: 532–535.
Higgs, P. G. (1994). Error thresholds and stationary mutant distribution in multilocus diploid genetics models. Genet Res, 63: 63–78.
Holliday, R. (1987). The inheritance of epigenetic defects. Science, 238: 163–170.
Holliday, R. (1990). Mechanisms for the control of gene activity during development. Biol Rev, 65: 431–471.
Holliday, R. and Grigg, G. W. (1993). DNA methylation and mutation. Mutat Res, 285: 61–73.
Hurst, L. D. (1995). The silence of the genes. Curr Biol, 5: 459–461.
Ibba, M. and Söll, D. (1999). Quality control mechanisms during translation. Science, 286: 1893–1898.
Jablonka, E. and Lamb, M. J. (1995) Epigenetic Inheritance and Evolution. Oxford University Press, Oxford.
Jones, P. A. and Gonzalgo, M. L. (1997). Altered DNA methylation and genome instability: a new pathway to cancer? Proc Natl Acad Sci USA, 94: 2103–2105.
Jones, P. A. and Laird, P. W. (1999). Cancer epigenetics comes of age. Nature Genet, 21: 163–167.
Kacser, H. and Burns, J. A. (1981). The molecular basis of dominance. Genetics, 97: 639–666.
Kioussis, D. and Festenstein, R. (1997). Locus control regions: overcoming heterochromatin-induced gene inactivation in mammals. Curr Opin Genet Dev, 7: 614–619.
Ko, M. S. (1992). Induction mechanism of a single gene molecule: stochastic or deterministic? Bioessays, 14: 341–346.
Kooter, J. M., Matzke, M. A. and Meyer, P. (1999). Listening to the silent genes: transgene silencing, gene regulation and pathogen control. Trends Plant Sci, 4: 340–347.
Krakauer, D. C. and Pagel, M. (1996). Selection by somatic signals: the advertisement of phenotypic state through costly intercellular signals. Phil Trans R Soc B, 351: 647–658.
Kurland, C. G. (1992). Translational accuracy and the fitness of bacteria. Annu Rev Genet, 26: 29–50.
Lenski, R. E., Ofria, C., Collier, T. C. and Adams, C. (1999). Genome complexity, robustness and genetic interactions in digital organisms. Nature, 400: 661–664.
Lewis, J. and Wolpert, L. (1979). Diploidy, evolution and sex. J Theor Biol, 78: 425–438.
Lynch, M., Burger, R., Butcher, D. and Gabriel, W. (1993). The mutational meltdown in asexual populations. J Heredity, 84: 339–344.
Martin, D. I. K. and Whitelaw, E. (1996). The vagaries of variegating transgenes. Bioessays, 18: 918–923.
Maynard Smith, J. (1983). Models of evolution. Proc R Soc B, 219: 315–325.
Maynard Smith, J. and Szathmáry, E. (1995) The Major Transitions in Evolution. Freeman, Oxford.
McAdams, H. and Arkin, A. (1999). It’s a noisy business! Genetic regulation at the nanomolar scale. Trends Genet, 15: 65–69.
Michod, R. E. (1996). Cooperation and conflict in the evolution of individuality. II. Conflict mediation. Proc R Soc B, 263: 813–822.
Miklos, G. L. and Rubin, G. M. (1996). The role of the genome project in determining gene function: insights from model organisms. Cell, 86: 521–529.
Muller, H. J. (1964). The relation of recombination to mutational advance. Mutat Res, 1: 2–9.
Nathan, D. F., Vos, M. H. and Lindquist, S. (1997). In vivo functions of the Saccharomyces cerevisiae Hsp90 chaperone. Proc Natl Acad Sci USA, 94: 12949–12956.
Ninio, J. (1991). Connections between translation, transcription and replication error-rates. Biochimie, 73: 1517–1523.
Nowak, M. A., Boerlijst, M. C., Cooke, J. and Maynard Smith, J. (1997). Evolution of genetic redundancy. Nature, 388: 167–171.
Ohno, S. (1970) Evolution by Gene Duplication. Springer-Verlag, Heidelberg.
Orr, H. A. (1991). A test of Fisher’s theory of dominance. Proc Natl Acad Sci USA, 88: 11413–11415.
Orr, H. A. (1995). Somatic mutations favor the evolution of diploidy. Genetics, 139: 1441–1447.
Otto, S. P. and Orive, M. E. (1995). Evolutionary consequences of mutation and selection within an individual. Genetics, 141: 1173–1187.
Parker, J. (1989). Errors and alternatives in reading the universal genetic code. Microbiol Rev, 53: 273–298.
Prusiner, S. B. (1998). Prions. Proc Natl Acad Sci USA, 95: 13363–13383.
Regev, A., Lamb, M. J. and Jablonka, E. (1998). The role of DNA methylation in invertebrates: developmental regulation or genome defense? Mol Biol Evol, 15: 880–891.
Rougier, N., Bourc’his, D., Gomes, D. M., Niveleau, A., Plachot, M., Paldi, A. and Viegas-Pequignot, E. (1998). Chromosome methylation patterns during mammalian preimplantation development. Genes Dev, 12: 2108–2113.
Selker, E. U. (1997). Epigenetic phenomena in filamentous fungi: useful paradigms or repeat-induced confusion? Trends Genet, 13: 296–301.
Selker, E. U. (1999). Gene silencing: repeats that count. Cell, 97: 157–160.
Smith, N. G. and Hurst, L. D. (1999). The causes of synonymous rate variation in the rodent genome. Can substitution rates be used to estimate the sex bias in mutation rate? Genetics, 152: 661–673.
Vellai, T. and Vida, G. (1999). The origin of eukaryotes: the difference between prokaryotic and eukaryotic cells. Proc R Soc B, 266: 1571–1577.
Wickner, S., Maurizi, M. R. and Gottesman, S. (1999). Posttranslational quality control: folding, refolding and degrading proteins. Science, 286: 1888–1893.
We would like to thank John Maynard Smith for helpful discussions on the subject. C.P. is also grateful to István Molnár for pointing out the importance of non-heritable errors. We are also grateful for comments from two anonymous reviewers. C.P. was supported by the European Science Foundation, while L.D.H. is funded by the Royal Society.
Cite this article
Pál, C., Hurst, L. The evolution of gene number: are heritable and non-heritable errors equally important? Heredity 84, 393–400 (2000). https://doi.org/10.1046/j.1365-2540.2000.00725.x
- gene duplication
- gene number