Estimation of spontaneous genome-wide mutation rate parameters: whither beneficial mutations?

Abstract

Empirical estimates of genome-wide mutation rates and of the distribution of mutational effects are needed to illuminate various topics ranging from evolutionary biology to conservation. Methods for inferring genome-wide mutation parameters are presented, and results stemming from these studies are reviewed. It is argued that, although most if not all mutations detected in mutation accumulation experiments are deleterious, the question of the rate of favourable mutations (and their effects) is still a matter for debate.

Mutations, why do we care?

Mutation is the ultimate source of heritable variation. As such it conditions the response to selection and adaptation through natural selection. Most researchers agree that mutations with phenotypic effects are usually deleterious. Indeed, when considering a population that has evolved for a long time in a constant environment, one can postulate that the population is composed of genotypes finely tuned with respect to a myriad of biotic and abiotic conditions and that a random mutation will probably disrupt such fine tuning (Fisher, 1999, pp. 38–42). This has been empirically shown for bacteria evolving in a constant glucose-limited environment for about 10 000 generations (Elena et al., 1996). Although the result depends largely on the ecology and history of the population considered, this assertion is used as a working hypothesis in numerous evolutionary genetics models. These models address different issues that include: (1) the evolution of genetic systems such as sex, recombination, selfing rates and ploidy levels (Otto & Marks, 1996; Barton & Charlesworth, 1998; Charlesworth & Charlesworth, 1998); (2) the maintenance of genetic variability at both the phenotypic (Barton & Turelli, 1989) and DNA levels (Charlesworth et al., 1993); (3) the fate of small natural or managed populations (Kondrashov, 1995; Lynch et al., 1995; Lande, 1998; Schoen et al., 1998); (4) sexual selection (Burt, 1995); and (5) evolutionary explanations for ageing (Partridge & Barton, 1993).

The number of these models has grown exponentially in the last decade, but relatively few studies have provided empirical estimates of the rates and effects of spontaneous mutations affecting traits related to fitness. Recent reviews have focused either broadly on spontaneous mutations at both the molecular and genome-wide level (Drake et al., 1998) or exclusively on deleterious mutations (Kondrashov, 1998; Keightley & Eyre-Walker, 1999; Lynch et al., 1999). The aim of this review is to (1) present the methods currently available for inferring genome-wide mutation parameters; (2) assess our current ability to detect beneficial mutations; and (3) propose some alternative experimental designs that will allow us to quantify the flux and distribution of beneficial mutational effects.

I define U as the sum of the haploid mutation rates across the (unknown) set of loci affecting either fitness or a fitness component. Mutational effects have then to be defined by assuming a relationship between the number of mutations carried by an individual and its genotypic value. Most studies assume that mutations act in a multiplicative or additive fashion (which are equivalent when mutations are assumed to have sufficiently small effects). The mutation effect attributed to a mutation in the homozygous (heterozygous) state is denoted by s (hs) and represents the shift in the expected genotypic value of the individual carrying the mutation relative to a value of 1 for wild type. Estimating U, h and especially s is the primary goal of empirical studies. I will concentrate on two approaches for estimating mutational parameters. One, the so-called mutation accumulation (MA) approach, uses controlled designs where mutations are allowed to accumulate de novo in a quasi neutral fashion while the other is based on the comparative analysis of molecular data (hereafter called the DNA-based method). Other methods based on characterizing levels of inbreeding depression in natural populations have been carried out in several plant species and Daphnia (Charlesworth et al., 1990, 1994; Johnston & Schoen, 1995; Deng & Lynch, 1996, 1997). However, these methods provide only indirect estimates of mutational parameters and make two limiting assumptions. First, the size of the population sampled will impose a threshold selection coefficient below which mutation will not be detected (Bataillon & Kirkpatrick, 2000). Second, inbreeding depression must be solely due to recessive deleterious alleles produced by mutation, and not by maladapted migrant alleles or by overdominant alleles.

Mutation accumulation experiments

The MA approach lets mutations randomly accumulate under benign conditions in a series of sublines derived from an inbred base population (ideally a single completely homozygous individual). The sublines are maintained by close inbreeding, ideally by selfing which ensures an effective population size (Ne) of 1, or by brother–sister mating. In such a design, drift will dominate selection within each subline and all but very detrimental mutations will be fixed at random (Keightley & Caballero, 1997). After several generations of mutation accumulation, the genotypic value of ancestral lines is compared with that of derived lines (Fig. 1). Based on the observed distribution of line values, inferences can be made on the amount of heritable variation produced by mutation and on the type of mutations causing it.

Figure 1 Estimation of genome-wide mutation parameters using MA experiments.

The traditional method for analysing MA experiments is the Bateman-Mukai technique (hereafter BM), which is based on regressing mean genetic value and observed between-line variance on generation number (Bateman, 1959; Mukai et al., 1972). By assuming that mutations arise at a rate U per haploid genome, are fixed neutrally within each subline, and that each new mutation acts additively and shifts the genotypic value of a line by a fixed increment s, estimators of U and s are obtained as simple functions of the change in mean genetic value per generation ΔM and the mutational variance Vm (Fig. 1). If mutation effects are variable, then the BM method is biased by a factor equal to 1 plus the squared coefficient of variation of the distribution of effects (e.g. 2 for an exponential distribution): U is underestimated and s overestimated by that factor. BM estimators are undefined if no shift in mean genetic value of MA lines is observed, despite between-line divergence.
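The BM logic can be sketched numerically. The following is a minimal illustration rather than the published estimators in full. It assumes that each fixed mutation shifts the line value by the same amount s, so that the per-generation change in mean is ΔM = U·s and the per-generation increase in between-line variance is Vm = U·s². Scaling conventions (haploid vs. diploid, factors of 2) differ between studies and are ignored here, and the function names are hypothetical.

```python
def bm_estimates(delta_m, v_m):
    """Bateman-Mukai style estimates from per-generation delta_m and v_m.

    Assumes delta_m = U*s and v_m = U*s**2 (equal effects), which gives
    U = delta_m**2 / v_m and s = v_m / |delta_m| (s is returned as a magnitude).
    """
    if delta_m == 0:
        raise ValueError("BM estimators are undefined when the mean does not shift")
    if v_m <= 0:
        raise ValueError("mutational variance must be positive")
    u_hat = delta_m ** 2 / v_m
    s_hat = v_m / abs(delta_m)
    return u_hat, s_hat


def expected_bm_under_variable_effects(u, mean_s, cv):
    """What BM returns in expectation when effects vary with coefficient of variation cv."""
    delta_m = u * mean_s
    v_m = u * mean_s ** 2 * (1 + cv ** 2)  # U * E[s^2]
    return bm_estimates(delta_m, v_m)


# Equal effects (cv = 0): the input values of U and s are recovered.
print(expected_bm_under_variable_effects(u=1.0, mean_s=0.05, cv=0.0))
# Exponentially distributed effects (cv = 1): U is halved and s is doubled.
print(expected_bm_under_variable_effects(u=1.0, mean_s=0.05, cv=1.0))
```

The guard on `delta_m == 0` mirrors the point made above: with between-line divergence but no shift in the mean, the ratio-based estimators cannot be computed.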

Alternatively, methods based either on minimum distance (Garcia-Dorado, 1997) or maximum likelihood (ML) (Keightley, 1994, 1998) have been developed which seek to extract more information from the distribution of MA line values. These methods can be either based on the same assumptions as BM (in which case they are biased if mutation effects are variable) or assume a parametric distribution for mutational effects, φ(s) (e.g. an exponential or Gamma distribution). In the latter case, estimates are provided for both U and the parameters describing φ. Simulation work and reanalysis of two recent MA experiments using Caenorhabditis elegans show that, even when analysing data under the assumption of a constant effect of mutations, ML estimators of U and s yield estimates with lower sampling variances than traditional BM estimators (Keightley & Bataillon, 2000).

At any rate, all methods of analysis require a substantial level of divergence between lines and/or a high level of replication in order to estimate line values for the trait of interest accurately. The statistical power of an MA experiment can be summarized by the heritability achieved at the line level,

h² = VL/(VL + Vr),

where Vr represents the error variance (across the replicates used to estimate MA line value) and VL the between-line variance. A value of 1/2 is the minimum for reliable and independent estimation of U and s.
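As a rough illustration of how this quantity might be computed from assay data, the sketch below estimates VL and Vr with a balanced one-way analysis of variance, taking the line-level heritability to be VL/(VL + Vr). That definition, the ANOVA estimator, the truncation of negative variance estimates at zero and the function name are all assumptions made for the example, not details given in the text.

```python
from statistics import mean


def line_heritability(lines):
    """Estimate VL, Vr and VL / (VL + Vr) from a balanced design.

    `lines` is a list of lists: one inner list of replicate measurements
    per MA line, all of the same length n >= 2.
    """
    k = len(lines)
    n = len(lines[0])
    if k < 2 or n < 2 or any(len(reps) != n for reps in lines):
        raise ValueError("need at least 2 lines and a balanced design with >= 2 replicates")
    line_means = [mean(reps) for reps in lines]
    grand_mean = mean(line_means)
    # Mean squares of a one-way ANOVA with line as the grouping factor.
    ms_between = n * sum((m - grand_mean) ** 2 for m in line_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for reps, m in zip(lines, line_means) for x in reps
    ) / (k * (n - 1))
    v_r = ms_within
    v_l = max((ms_between - ms_within) / n, 0.0)  # truncate negative estimates at 0
    total = v_l + v_r
    h2 = v_l / total if total > 0 else 0.0
    return v_l, v_r, h2


# Hypothetical assay: three MA lines, two replicate measurements each.
v_l, v_r, h2 = line_heritability([[1.0, 1.2], [0.8, 1.0], [0.5, 0.7]])
print(v_l, v_r, h2)
```

Increasing the number of replicates per line does not change VL or Vr themselves, but it reduces the error with which each line value is estimated, which is why replication can partly compensate for limited divergence.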

Historically, MA experiments have been carried out on Drosophila melanogaster, mostly by Mukai and collaborators (reviewed in Simmons & Crow, 1977; see Keightley & Eyre-Walker, 1999 for a recent historical account). MA experiments used a marked chromosomal inversion and exploited the lack of male recombination in D. melanogaster to keep the entire chromosome II free of recombination. Control populations consisted of large outbred populations. The MA lines were assayed by monitoring viability. MA experiments have recently been performed on several different species (Table 1).

Table 1 Genome-wide haploid mutation rates (Umin) and the mean homozygous effect of mutation (smax) estimates from MA experiments

In all the studies, the mean fitness or mean of fitness-related traits declined over time, suggesting that the net effect of spontaneous mutation is indeed deleterious (an exception is Shaw et al., 2000). Mean decline of the fitness components of MA lines ranged from 0.1% to 1–2% per generation. Although the reliability of control populations used to assess erosion of components of fitness has been questioned (especially for Mukai’s experiments: Keightley, 1996), fitness erosion seems to be the rule over a broad range of organisms.

Among the factors contributing to observed variation in U and s estimates are: (1) the quality of the control population used; (2) the activity of transposable elements; and (3) naturally varying levels of mutation rates. Control populations may consist of: (1) large populations that are supposed not to evolve significantly during the course of the experiment (but where a small amount of adaptive evolution can nevertheless occur), or (2) frozen controls (seeds, worms or bacteria) where any evolution is halted. We may expect that a greater number of genes in the organism studied or a longer generation time will cause greater U. But despite similar gene number, D. melanogaster and C. elegans U estimates differ at least by a factor of 10 (Keightley & Bataillon, 2000). A convincing correlation was found between Vm and generation time (Lynch et al., 1999), but it is hard to know whether it is caused by differences in U or different distributions of mutational effects.

A fundamental problem with such phenotypic methods is that, even in instances where the major changes in the distribution of line values were caused by a few mutations with large effect, the presence of a large class of mildly deleterious mutations can never be ruled out. A mutagenesis experiment on the N2 strain of C. elegans (Davies et al., 1999) is particularly revealing. The number of mutations induced by ethyl methanesulphonate (EMS) at the genomic level could be estimated directly from rates of mutations scored at a known set of genes. The authors estimated that about 50 new amino acid altering mutations had been induced (80% of which are predicted to be deleterious). In parallel, the EMS lines were assayed for productivity and a ML estimator of the induced U was used. However, the ML estimator based on productivity data gave U = 1! When reanalysing the phenotypic data by assuming a U equal to 45 mutations/line, the best fitting distribution for φ(s) was bimodal with the vast majority of induced mutations (43.4 out of 45) having a very small effect (s = 0.0007) on productivity. Even competition experiments in models such as Escherichia coli will fail to detect fitness differences between MA lines that are below 0.001. Potentially, one would like to detect mutations with effects as small as 1/Ne, where Ne for the species under consideration can be as large as 10⁶.

The DNA-based method

This method (Kondrashov & Crow, 1993) is based on the neutral theory of molecular evolution (Kimura, 1983, pp. 43–46 and chapter 5) and the assumption that mutations are either neutral or deleterious. In neutral regions of the genome, mutations are substituted at the rate of mutation μt, while in selectively constrained regions substitutions occur at a rate fμt, where f represents the proportion of mutations that are selectively neutral (the fraction 1 − f that are deleterious will not be substituted). The method uses sequence data from a pair of species with known levels of divergence and generation times (Fig. 2). A random sample of orthologous genes is used to estimate Kc, the average sequence divergence in selectively constrained regions of the genome. The divergence for this set of functional sequences is then compared with the level of divergence Kn for non-functional (presumably neutral) sequences such as pseudogenes. This allows the fraction 1 − f of mutations that are deleterious to be calculated. By extrapolating to the whole genome, one can then derive an estimate of U (Fig. 2). Eyre-Walker & Keightley (1999) recently applied a modified version of this technique to a sample of 46 human–chimp orthologous proteins and used synonymous substitutions in the sequences for estimating Kn and inferring the total mutation rate. They found that U = 0.8. Their estimate did not include mutations arising in non-coding sequences.

Figure 2 The method uses a pair of recently derived species (A, B) with known divergence times (T). For the species of interest (say B), a direct estimate of the total mutation rate Ut is either known or is inferred from levels of substitution (Kn) at ‘neutral’ sequences (represented as empty boxes), the generation time (t) and the total genome size (G). A set of orthologous gene sequences, representing the constrained fraction of the genome (hatched boxes), is used to estimate the rate of substitution for selectively non-neutral regions (Kc) and the fraction f of neutral mutations throughout the genome. An estimate of U is then obtained as U = (1 − f)Ut.
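The arithmetic of the DNA-based method can be sketched as follows. This is a simplified illustration, not the procedure of any particular study. It assumes that f is estimated as the ratio Kc/Kn and, when Ut has to be inferred, that neutral substitutions accumulate along both lineages since the split, so that Kn = 2μ(T/t) per site. The function names and all numerical inputs are hypothetical and chosen only to make the example run.

```python
def dna_based_u(k_c, k_n, u_total):
    """Return (U, f) with U = (1 - f) * Ut and f estimated as Kc / Kn."""
    if k_n <= 0:
        raise ValueError("Kn must be positive")
    f = k_c / k_n  # fraction of mutations that behave as neutral
    return (1.0 - f) * u_total, f


def total_rate_from_divergence(k_n, divergence_years, generation_years, genome_size):
    """Ut per haploid genome per generation inferred from neutral divergence.

    Assumption: both lineages accumulate substitutions since the split,
    so Kn = 2 * mu * (T / t), where mu is the per-site rate per generation.
    """
    generations = divergence_years / generation_years
    mu_per_site = k_n / (2.0 * generations)
    return mu_per_site * genome_size


# Illustrative inputs only (not estimates for any real pair of species).
u_t = total_rate_from_divergence(
    k_n=0.012, divergence_years=5e6, generation_years=20, genome_size=1e8
)
u_del, f = dna_based_u(k_c=0.003, k_n=0.012, u_total=u_t)
print(u_t, f, u_del)
```

Because U scales linearly with Ut, any error in the divergence time, the generation time or the neutrality of the reference sequences carries straight through to the final estimate.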

There are several caveats with respect to this method. First, the method requires an independent estimate of the total mutation rate or must rely upon indirect estimation of the total mutation rate through levels of divergence at ‘neutral’ sequences Kn. Second, it ignores the existence of favourable mutations, which will bias the estimate of U downward by inflating Kc. More importantly, this method yields no information about the effects of deleterious mutations other than the fact that their effects are greater than the reciprocal of the effective population size, which may be very large.

On detecting favourable mutations...

Recent experiments involving RNA viruses show that despite their elevated genomic mutation rates (Drake & Holland, 1999), adaptive evolution can occur even in small populations by means of beneficial or compensatory mutations (Burch & Chao, 1999). Such mutations may have crucial consequences for models seeking to predict the persistence of small populations (Whitlock & Otto, 1999). Yet most experiments looking at multicellular organisms have so far failed to produce any information on such mutations.

This issue has been largely overlooked in MA experiments. The BM technique ignores beneficial mutations, but minimum distance or ML techniques are versatile enough to incorporate a non-null probability of favourable mutations. Studies that have looked for favourable mutations include the reanalysis of three Drosophila experiments (Garcia-Dorado, 1997) and one C. elegans experiment (Keightley & Caballero, 1997). Of these, one MA experiment fitted a model where 10% of mutations were beneficial (Garcia-Dorado, 1997). The question remains: if 10% of mutations are favourable, has the MA method any power to detect them? Although the properties of ML estimators have been explored in detail (Keightley, 1998; Keightley & Bataillon, 2000), the situation where beneficial mutations may be fixed in MA lines has never been studied.

Simulations of MA experiments were performed where a small proportion of beneficial mutations (0%, 1% or 10% of mutations) are fixed in the MA lines, which were then analysed using the constant effect of mutation model or its ML version (see Keightley & Bataillon, 2000, for details of the simulation protocol). First results indicated that line value distributions, and BM or ML estimators traditionally used, are barely affected by the presence of beneficial mutations. Although the analysis is very crude, it indicates that the power of MA designs to detect such mutations is probably quite low. A full analysis of the behaviour of ML estimators incorporating mutations of variable effects (both positive and negative), although computationally cumbersome, would be interesting in that regard.
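A toy version of such a simulation is sketched below. It is deliberately cruder than, and may differ from, the published protocol: every mutation has the same absolute effect s, a fraction p of mutations is beneficial (+s) and the rest deleterious (−s), all mutations are fixed, and line values are measured without error. All names and parameter values are hypothetical. Under these assumptions the expected BM estimate of U is U(1 − 2p)² and that of s is s/(1 − 2p); how such a shift compares with the sampling error of a realistically sized experiment is the question of power raised above.

```python
import math
import random
from statistics import mean, variance


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_ma(n_lines, generations, u, s, p_beneficial, seed=1):
    """Return final line values (ancestor = 0) after mutation accumulation."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_lines):
        n_mut = poisson(rng, u * generations)
        # Each fixed mutation is beneficial (+s) with probability p_beneficial.
        shift = sum(s if rng.random() < p_beneficial else -s for _ in range(n_mut))
        values.append(shift)
    return values


def bm_from_lines(values, generations):
    """BM estimates of U and s from the mean and variance of line values."""
    delta_m = mean(values) / generations
    v_m = variance(values) / generations
    return delta_m ** 2 / v_m, v_m / abs(delta_m)


for p in (0.0, 0.01, 0.10):
    lines = simulate_ma(n_lines=2000, generations=100, u=0.1, s=0.02, p_beneficial=p)
    u_hat, s_hat = bm_from_lines(lines, generations=100)
    expected_u = 0.1 * (1 - 2 * p) ** 2
    print(f"p={p:.2f}  U_BM={u_hat:.3f} (expectation {expected_u:.3f})  s_BM={s_hat:.4f}")
```

The number of lines used here (2000) is far larger than in real MA experiments and was chosen only to make the expectation visible; rerunning with a few dozen lines gives a feel for how much sampling noise surrounds these estimates.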

An alternative design, potentially useful for the detection of beneficial mutations, would be to practise directional selection in an initially homogeneous population. Such designs have traditionally been used as a way to estimate mutational variance for quantitative traits (Hill & Caballero, 1992). Here selection should be practised on fitness-related traits such as those typically eroded in MA experiments. Replicating such selection experiments with widely varying population sizes would provide some information on the distribution of favourable mutations. In such a design, population size should sieve beneficial/deleterious mutations as a function of their selective effect. Monitoring the evolution of the distribution of fitness should provide some information on the relative rates and sizes of effects of beneficial vs. deleterious mutations.
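The sieving effect of population size can be illustrated with Kimura's diffusion approximation for the fixation probability of a new mutation. This formula is not part of the design proposed above and is brought in here only as an illustration. It assumes genic selection, a single new copy at frequency 1/(2N) and an effective size equal to the census size N; the function name is hypothetical.

```python
import math


def fixation_probability(s, n):
    """Kimura's diffusion approximation for a new mutation at frequency 1/(2N).

    s is the selective advantage (negative if deleterious) under genic
    selection; n is the population size, assumed equal to Ne.
    """
    if s == 0:
        return 1.0 / (2 * n)
    try:
        # Equals (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy.
        return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)
    except OverflowError:
        return 0.0  # strongly deleterious mutation in a large population


# Fixation probability relative to a neutral mutation, for |s| = 0.001.
for n in (10, 1000, 100000):
    neutral = 1.0 / (2 * n)
    ratios = [fixation_probability(s, n) / neutral for s in (-0.001, 0.001)]
    print(n, [f"{r:.3g}" for r in ratios])
```

In small populations both ratios stay close to 1, so mutations of either sign behave almost neutrally, whereas in large populations the two ratios diverge sharply. Comparing replicate populations across a range of sizes therefore probes different portions of the distribution of effects.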

References

  1. Barton, N. H. and Charlesworth, B. (1998). Why sex and recombination? Science 281: 1986–1990.

  2. Barton, N. H. and Turelli, M. (1989). Evolutionary quantitative genetics: how little do we know? Annu Rev Genet 23: 337–370.

  3. Bataillon, T. and Kirkpatrick, M. (2000). Inbreeding depression due to slightly deleterious mutations in finite populations: size does matter. Genet Res 75: 75–81.

  4. Bateman, A. (1959). The viability of near normal irradiated chromosomes. Int J Radiat Biol 2: 170–180.

  5. Burch, C. L. and Chao, L. (1999). Evolution by small steps and rugged landscapes in the RNA virus φ6. Genetics 151: 921–927.

  6. Burt, A. (1995). Perspective: the evolution of fitness. Evolution 49: 1–8.

  7. Charlesworth, B. and Charlesworth, D. (1998). Some evolutionary consequences of deleterious mutations. Genetica 102: 3–19.

  8. Charlesworth, B., Charlesworth, D. and Morgan, M. T. (1990). Genetic loads and estimate of mutation rates in highly inbred plant populations. Nature 347: 380–382.

  9. Charlesworth, B., Morgan, M. T. and Charlesworth, D. (1993). The effect of deleterious mutations on neutral molecular variation. Genetics 134: 1289–1303.

  10. Charlesworth, D., Lyons, E. and Litchfield, L. (1994). Inbreeding depression in two highly inbreeding populations of Leavenworthia. Proc R Soc B, 1353: 209–214.

  11. Davies, E. K., Peters, A. D. and Keightley, P. D. (1999). High frequency of cryptic deleterious mutations in Caenorhabditis elegans. Science 285: 1748–1751.

  12. Deng, H. W. and Lynch, M. (1996). Estimation of deleterious-mutation parameters in natural populations. Genetics 144: 349–360.

  13. Deng, H. W. and Lynch, M. (1997). Inbreeding depression and inferred deleterious-mutation parameters in Daphnia. Genetics 147: 147–155.

  14. Drake, J. W., Charlesworth, B., Charlesworth, D. and Crow, J. F. (1998). Rates of spontaneous mutation. Genetics 148: 1667–1686.

  15. Drake, J. W. and Holland, J. J. (1999). Mutation rates among RNA viruses. Proc Natl Acad Sci USA 96: 13910–13913.

  16. Elena, S. F., Cooper, V. S. and Lenski, R. E. (1996). Punctuated evolution caused by selection of rare beneficial mutations. Science 272: 1802–1804.

  17. Eyre-Walker, A. and Keightley, P. D. (1999). High genomic deleterious mutation rates in hominids. Nature 397: 344–347.

  18. Fisher, R. A. (1999). The Genetical Theory of Natural Selection — a Complete Variorum Edition. Oxford University Press, Oxford.

  19. Fry, J. D., Keightley, P. D., Heinsohn, S. L. and Nuzhdin, S. V. (1999). New estimates of the rates and effects of mildly deleterious mutation in Drosophila melanogaster. Proc Natl Acad Sci USA, 96: 574–579.

  20. Garcia-Dorado, A. (1997). The rate and effects distribution of viability mutation in Drosophila: Minimum distance estimation. Evolution 51: 1130–1139.

  21. Garcia-Dorado, A., Lopez-Fanjul, C. and Caballero, A. (1999). Properties of spontaneous mutations affecting quantitative traits. Genet Res, 74: 341–350.

  22. Hill, W. G. and Caballero, A. (1992). Artificial selection experiments. Annu Rev Ecol Syst, 23: 287–310.

  23. Johnston, M. O. and Schoen, D. J. (1995). Mutation rates and dominance levels of genes affecting total fitness in two angiosperm species. Science 267: 226–229.

  24. Keightley, P. D. (1994). The distribution of mutation effects on viability in Drosophila melanogaster. Genetics 138: 1315–1322.

  25. Keightley, P. D. (1996). Nature of deleterious mutation load in Drosophila. Genetics 144: 1993–1999.

  26. Keightley, P. D. (1998). Inference of genome-wide mutation rates and distributions of mutation effects for fitness traits: a simulation study. Genetics 150: 1283–1293.

  27. Keightley, P. D. and Bataillon, T. M. (2000). Multi-generation maximum likelihood analysis applied to mutation accumulation experiments in Caenorhabditis elegans. Genetics 154: 1193–1201.

  28. Keightley, P. D. and Caballero, A. (1997). Genomic mutation rates for lifetime reproductive output with lifespan in Caenorhabditis elegans. Proc Natl Acad Sci USA, 94: 3823–3827.

  29. Keightley, P. D. and Eyre-Walker, A. (1999). Terumi Mukai and the riddle of deleterious mutation rates. Genetics 153: 515–523.

  30. Kibota, T. T. and Lynch, M. (1996). Estimate of the genomic mutation rate deleterious to overall fitness in Escherichia coli. Nature 381: 694–696.

  31. Kimura, M. (1983). The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge.

  32. Kondrashov, A. (1995). Contamination of the genome by very slightly deleterious mutations — why have we not died 100 times over. J Theor Biol, 175: 583–594.

  33. Kondrashov, A. S. (1998). Measuring spontaneous deleterious mutation process. Genetica 102: 183–197.

  34. Kondrashov, A. S. and Crow, J. F. (1993). A Molecular Approach to Estimating the Human Deleterious Mutation Rate. Hum Mutat, 2: 229–234.

  35. Lande, R. (1998). Risk of population extinction from fixation of deleterious and reverse mutations. Genetica 102: 21–27.

  36. Lynch, M., Conery, J. and Bürger, R. (1995). Mutational meltdowns in sexual populations. Evolution 49: 1067–1080.

  37. Lynch, M. et al (1999). Perspective: spontaneous deleterious mutations. Evolution 53: 645–663.

  38. Mukai, T. (1963). The genetic structure of natural population of Drosophila melanogaster I. Spontaneous mutation rate of polygenes controlling viability. Genetics 30: 1–19.

  39. Mukai, T., Chigusa, S. I., Mettler, L. E. and Crow, J. F. (1972). Mutation rate and dominance of genes affecting viability in Drosophila melanogaster. Genetics 72: 335–355.

  40. Ohnishi, O. (1977). Spontaneous and ethyl methanesulfonate-induced mutations controlling viability in Drosophila melanogaster II. Homozygous effect of polygenic mutations. Genetics 87: 529–545.

  41. Otto, S. P. and Marks, J. C. (1996). Mating systems and the evolutionary transition between haploidy and diploidy. Biol J Linn Soc, 57: 197–218.

  42. Partridge, L. and Barton, N. H. (1993). Optimality, mutation and the evolution of ageing. Nature 362: 305–311.

  43. Schoen, D. J., David, J. L. and Bataillon, T. M. (1998). Deleterious mutation accumulation and the regeneration of genetic resources. Proc Natl Acad Sci USA, 95: 394–399.

  44. Schultz, S. T., Lynch, M. and Willis, J. H. (1999). Spontaneous deleterious mutation in Arabidopsis thaliana. Proc Natl Acad Sci USA 96: 11393–11398.

  45. Shaw, R. G., Byers, D. L. and Darno, E. (2000). Spontaneous mutational effect on reproductive traits of Arabidopsis thaliana. Genetics, submitted.

  46. Simmons, M. and Crow, J. F. (1977). Mutations affecting fitness in Drosophila populations. Annu Rev Genetics, 11: 49–78.

  47. Vassilieva, L. L. and Lynch, M. (1999). The rate of spontaneous mutation for life-history traits in Caenorhabditis elegans. Genetics 151: 119–129.

  48. Whitlock, M. C. and Otto, S. P. (1999). The panda and the phage: compensatory mutations and the persistence of small populations. Trends Ecol Evol, 14: 295–296.

Acknowledgements

I thank Ruth Shaw for sharing unpublished results and Patrice David, Martin Morgan, Peter Keightley, Andrew Pomiankowski, Joëlle Ronfort and an anonymous reviewer for constructive comments on earlier versions of this paper.

Correspondence to Thomas Bataillon.

Bataillon, T. Estimation of spontaneous genome-wide mutation rate parameters: whither beneficial mutations?. Heredity 84, 497–501 (2000). https://doi.org/10.1046/j.1365-2540.2000.00727.x

Keywords

  • adaptation
  • compensatory mutations
  • deleterious mutations
  • fitness
  • genetic load
  • mutational meltdown
