Mutation bias and GC content shape antimutator invasions


Mutators represent a successful strategy in rapidly adapting asexual populations, but theory predicts their eventual extinction due to their unsustainably large deleterious load. While antimutator invasions have been documented experimentally, important discrepancies among studies remain currently unexplained. Here we show that a largely neglected factor, the mutational idiosyncrasy displayed by different mutators, can play a major role in this process. Analysing phylogenetically diverse bacteria, we find marked and systematic differences in the protein-disruptive effects of mutations caused by different mutators in species with different GC compositions. Computer simulations show that these differences can account for order-of-magnitude changes in antimutator fitness for a realistic range of parameters. Overall, our results suggest that antimutator dynamics may be highly dependent on the specific genetic, ecological and evolutionary history of a given population. This context-dependency further complicates our understanding of mutators in clinical settings, as well as their role in shaping bacterial genome size and composition.


The idea that the mutation rate is evolvable has captivated the interest of evolutionary biologists for decades1. It was early recognised that, since the vast majority of mutations with phenotypic effects are deleterious, selection should primarily act to reduce the deleterious load, pushing mutation rates to be as low as physiologically affordable2,3,4,5,6. However, strains with highly-elevated mutation rates (i.e., mutators) are readily selected in clinical and laboratory populations of bacteria7,8 and yeast9,10, as well as in certain cancers11. Theory and experiments have explained this phenomenon in terms of selection pressures operating at different timescales: linkage with strong beneficial mutations enables mutators to rapidly reach fixation before their increased deleterious load becomes fully manifest, which requires the accumulation of multiple secondary deleterious mutations12. Due to this reliance on rapid hitchhiking, mutators are most likely to thrive whenever populations face strong selection pressures13 and under conditions in which both recombination14 and genetic drift15 are unimportant.

But, eventually, populations fixed for a mutator phenotype are expected to re-evolve low mutation rates once selective pressure subsides—provided that restorative alleles are available16,17. Given the longer timescales involved, the evolution of reduced mutation rates has proven much more difficult to observe than its reverse process, the selection for mutator alleles. Indirect evidence comes from the fact that DNA repair genes seem to undergo frequent horizontal transfer18,19,20, and the observation of marked mutation rate polymorphisms within single-patient bacterial populations21. Direct, empirical evidence of the evolution of reduced mutation rates is limited to a handful of experimental evolution studies17,22,23,24,25. The provisional picture that emerges from these studies is rather heterogeneous, with different experiments reporting contrasting findings in terms of timescales, mechanisms and magnitude of mutation rate reduction. Recent theoretical work has begun to provide a framework to account for these contrasting patterns, emphasising the role of several factors in determining the fixation probability of antimutator alleles. These factors include differences in population size, beneficial and deleterious mutation rates, mutator strength, and the availability of secondary mutations compensating the cost of deleterious mutations12,26,27.

An additional, yet unexplored factor is the well-known mutational idiosyncrasy exhibited by different mutators28. This idiosyncrasy arises from the particular molecular details of the mutation-avoidance mechanism that is impaired in each mutator genotype. In Escherichia coli, for instance, impairment of any of the enzymes removing oxidised guanine from the DNA (e.g., MutM, MutY) results in substantial elevations of G:C → T:A mutations, while disruption of the enzyme preventing its incorporation from the free nucleotide pool (e.g., MutT) leads to a marked increase in A:T → C:G mutations29. These sorts of mutational biases shape the tendency of different mutators to generate mutations with different fitness effects, which can have dramatic consequences for mutator success when adaptation involves just a few strongly beneficial mutations30. In analogy to this phenomenon, an intriguing hypothesis is that mutators that tend to generate stronger deleterious mutations may be more easily out-competed by an invading, low-mutation rate genotype. Similarly, mutators producing on average milder deleterious mutations than the wild-type may resist the invasion of antimutator alleles for longer. Whether these possibilities are plausible under realistic scenarios remains largely unknown.

At first glance, at least two considerations argue against the idea that mutational spectrum differences can play any significant role in the evolution of reduced mutation rates. The first one comes from the classic Haldane-Muller principle31,32, which states that the reduction in fitness caused by recurring deleterious mutations is roughly on the order of the deleterious mutation rate (ud), irrespective of the actual fitness cost of each individual mutation (sd). Such independence from sd should preclude any spectrum-driven differences in mutational load among mutators. It is well-known, however, that this principle only holds as long as sd>ud33, a condition that may readily be violated in well-adapted, mutator populations of microbes. Second, different biases in the production of mutations are likely to translate into substantial fitness differences when just a small number of sites have a huge impact on fitness, as in the case of strongly beneficial antibiotic resistance mutations30. It is unclear, however, to what extent this kind of spectrum-driven difference may balance out when considering a larger number of sites. Relevant to this issue is the observation that some amino acid substitutions tend to be much more disruptive to proteins than others, a well-established fact that forms the basis of many protein alignment tools34. This fact affords speculation that systematic patterns may emerge at the genome-wide scale, so that different mutational spectra may produce, on average, deleterious mutations with characteristically different fitness effects.
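The Haldane-Muller expectation can be illustrated numerically. The following is a minimal sketch, not the model used in this study: it assumes an infinite asexual population, multiplicative fitness (1 − s)^n for a genotype carrying n deleterious mutations, and Poisson-distributed new mutations with genomic rate U. The function names and parameter values are hypothetical. Under these assumptions the equilibrium mean fitness approaches exp(−U) whatever the value of s, although convergence takes on the order of 1/s generations.

```python
import math


def poisson_pmf(k_max, lam):
    """Poisson probabilities P(0), ..., P(k_max) for mean lam."""
    pmf = [math.exp(-lam)]
    for k in range(1, k_max + 1):
        pmf.append(pmf[-1] * lam / k)
    return pmf


def equilibrium_mean_fitness(U, s, generations=2000, n_classes=40):
    """Iterate selection then mutation on mutation-class frequencies.

    Assumptions (not taken from the source): infinite population,
    fitness (1 - s)**n, Poisson mutation with genomic rate U, and a
    truncated number of mutation classes.
    """
    fitness = [(1.0 - s) ** n for n in range(n_classes)]
    new_mut = poisson_pmf(n_classes - 1, U)
    freqs = [1.0] + [0.0] * (n_classes - 1)  # start mutation-free
    for _ in range(generations):
        mean_w = sum(f * w for f, w in zip(freqs, fitness))
        selected = [f * w / mean_w for f, w in zip(freqs, fitness)]
        # Mutation moves individuals from class j to class n >= j.
        freqs = [
            sum(selected[j] * new_mut[n - j] for j in range(n + 1))
            for n in range(n_classes)
        ]
    return sum(f * w for f, w in zip(freqs, fitness))


if __name__ == "__main__":
    U = 0.05
    for s in (0.1, 0.02):
        print(s, equilibrium_mean_fitness(U, s), math.exp(-U))
```

The sketch describes the equilibrium only. It says nothing about finite populations or about the transient dynamics of an invading lineage, which are the situations examined by the simulations in this study.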

Here, we use computer simulation to explore the extent to which the advantage of an antimutator allele deviates from the Haldane-Muller expectations under the relevant range of parameters. In addition, we estimate the genome-wide average disruptive effect on proteins of mutations caused by different mutational spectra. Importantly, since different codon usage patterns might alter the probability that a particular spectrum generates strong-effect amino acid changes, we also test whether systematic differences are to be expected among mutators in species with widely-divergent genomic GC compositions. Overall, our results suggest that mutational spectrum differences (understood as differences in the distribution of deleterious effects produced by different mutators) may play an unsuspectedly important role in the selection against high mutation rates in bacteria.


Broad conditions allow biases to shape antimutator invasions

To test whether mutational spectrum differences can alter the evolution of reduced mutation rates, we built a computer model that simulates the evolutionary dynamics of antimutator alleles invading a mutator population. The model was designed to capture the basic properties of Lenski’s influential Long-Term Evolution Experiment (LTEE), in which 12 Escherichia coli populations have been serially propagated in the same glucose-limited medium for more than 60,000 generations35. Crucially, one of these bacterial populations was observed to re-evolve reduced mutation rates after being dominated by a mutator phenotype for more than 10,000 generations17. Inspired by this experiment, we considered the simple scenario of an asexual mutator population being serially propagated in a constant environment to which it is already well-adapted (see Methods). At the start of each simulation, a single antimutator allele, restoring the mutation rate to wild-type levels, is introduced. The trajectory of this allele is tracked until it either reaches fixation or is lost by drift. Multiple frequency trajectories are then used to estimate the average effective selection coefficient (seff) of the antimutator allele, computed empirically as the log change of the antimutator-to-mutator ratio per generation (see Methods and Supplementary Fig. 1).

Our first aim was to test whether the Haldane-Muller principle can be violated over the range of parameters typically reported in experiments with mutator bacteria. In particular, the two most important parameters for this matter are the mutation rate of mutators (m) and the average selection coefficient of deleterious mutations (sd). Most estimates of m are based on a few reporter genes, and so caution should be exercised when using them as a proxy for genome-wide rates36. However, while more than a dozen genes are known to increase bacterial mutation rates when inactivated28,37, only those causing order-of-magnitude elevations of m are the ones typically observed in clinical and experimental evolution studies37,38. Therefore, the relevant range spans from slightly over a 10-fold increase (e.g., mutY)39 to a 1000-fold increase of m (e.g., dnaQ)40, although all mutators observed in the LTEE fall in the several 100-fold range (e.g., mutT, mutL)8,17.

The estimates of sd also display a certain degree of uncertainty. Attempts to estimate sd classically relied on mutation accumulation experiments, in which populations are serially passaged through single-cell bottlenecks to restrain selection from purging deleterious mutations41. However, since populations need to recover sufficiently after the single-cell bottleneck for the experiment to continue, there exists an upper limit on how deleterious a mutation can be and still be detected, which may lead to an underestimation of sd depending on the growth conditions employed42. Despite this limitation, different experiments have provided similar values for both the upper and the lower bounds of sd. Using an early isolate from the LTEE, Kibota & Lynch43 estimated an upper bound for sd of 0.012. Two later studies, also using E. coli, reported slightly higher values (sd ~0.03)42,44. Of note, both studies pointed to differences in mutational spectrum as a possible explanation for their higher estimates (they examined a transposon-based insertion library, and a mutS mutator strain, respectively). More recently, a few studies have leveraged the resolution afforded by next-generation sequencing to provide a lower bound for sd. These studies reported remarkably close values for this lower bound (sd ~0.0015 to 0.0017), even though they involved three different bacterial species (Salmonella typhimurium45, Pseudomonas aeruginosa46 and Burkholderia cenocepacia47). In addition, one of these studies found that sd can vary noticeably across environments47.

Figure 1 provides a general overview of the invasion dynamics observed in the computer simulation model. In line with the Haldane-Muller expectations, we observed that the mutation rate of the resident mutator (m) strongly determines the speed of the antimutator invasion (Fig. 1a). However, in contrast with the Haldane-Muller principle, we found that the fitness cost of deleterious mutations (sd) can also exert a substantial, albeit less dramatic effect on invasion speed (Fig. 1b). While this dependence on sd is most pronounced when mutation rates are the highest and fitness costs the smallest, our results show that invasion dynamics can indeed be affected by sd over a large fraction of the relevant range of parameters (Supplementary Fig. 2). Therefore, there are grounds to speculate whether spectrum-driven differences in sd may alter the propensity of different mutators to evolve reduced mutation rates. To examine this possibility, we expanded the computer simulation model to allow consideration of general biases in the production of deleterious mutations. We modelled these biases as a multiplicative factor (κ) that modifies the selection coefficient of deleterious mutations in the mutator background, such that when κ < 1 mutators produce milder deleterious mutations than antimutators, when κ = 1 there is no difference between backgrounds, and when κ > 1 mutations are more harmful in mutators (see Methods).

Fig. 1

Frequency trajectories of antimutator alleles invading well-adapted, mutator populations. Lines represent 100 independent simulations for each condition. a Invasion dynamics under various values of the mutation rate of the resident mutator (grey, m= 1000; magenta, m= 300; blue, m= 100; green, m= 30; light grey, m= 1), and a fixed fitness cost of deleterious mutations (sd= 0.064). b Invasion dynamics under various values of the fitness cost of deleterious mutations (from left to right, sd equals: 0.064, 0.032, 0.016, 0.008, 0.004, 0.002, 0.001), and a fixed mutation rate of the resident mutator (m= 1000). Other parameters as described in Methods

Figure 2 captures how the interplay between sd and m controls the degree to which mutational spectra differences (κ) impact on the success of antimutator alleles. Two patterns can readily be appreciated by observing the overall shape of the curves in Fig. 2. First, the slopes become steeper with mutation rate (m) (which increases from panel a to d). In turn, the slopes become flatter with fitness cost (sd) (which increases within each panel from bottom to top). In line with the discussion in the previous paragraph, these general patterns can be interpreted in terms of deviations from the Haldane-Muller principle. Thus, the impact of κ is the greatest when mutation rate is maximal and fitness cost is minimal (Fig. 2d, lowest line)—exactly the same conditions under which the dependence of invasion speed on sd is most pronounced (Fig. 1b and Supplementary Fig. 2). Conversely, when populations approach the regime in which the Haldane-Muller principle holds (sd>ud), the impact of κ becomes rather modest, which visually translates into comparatively flatter slopes (Fig. 2a, upper lines).
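As a rough guide to where the tested parameter combinations sit relative to the Haldane-Muller regime, the sketch below tabulates the condition sd > ud over the grid of values used in Fig. 2. It rests on an assumption that is not stated explicitly in the text: that the rate relevant to a resident mutator is the product m × ud. The function and constant names are hypothetical.

```python
def haldane_muller_holds(s_d, u_d, m=1.0):
    """True when the cost per mutation exceeds the deleterious mutation rate.

    ASSUMPTION: the rate compared against s_d is m * u_d, i.e. the basal
    rate scaled by the mutator strength.
    """
    return s_d > m * u_d


U_D = 2e-4  # basal deleterious mutation rate used for Fig. 2
M_VALUES = (30, 100, 300, 1000)
S_VALUES = (0.064, 0.032, 0.016, 0.008, 0.004, 0.002, 0.001)


def regime_table(u_d=U_D):
    """Map each (m, s_d) pair of the grid to whether the condition holds."""
    return {
        (m, s): haldane_muller_holds(s, u_d, m)
        for m in M_VALUES
        for s in S_VALUES
    }


if __name__ == "__main__":
    for (m, s), holds in regime_table().items():
        print(f"m={m:5d}  sd={s:.3f}  m*ud={m * U_D:.3f}  sd > m*ud: {holds}")
```

Under this assumption, none of the tested sd values exceeds m × ud = 0.2 for the strongest mutator (m = 1000), whereas for m = 30 the condition is met by every sd of 0.008 or larger.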

Fig. 2

Mutational spectrum effects on the invasion speed of antimutator alleles. Points represent the effective selection coefficient (seff) of the invading antimutator alleles, averaged from 200 independent simulations. Panels correspond to different values of the mutation rate of the resident mutator (a, m= 30; b, m= 100; c, m= 300; d, m= 1000). Within each panel, lines depict different values for the fitness cost of deleterious mutations (from top to bottom, sd equals: 0.064, 0.032, 0.016, 0.008, 0.004, 0.002, 0.001). Mutational spectrum effects refer to the differential propensity of mutators to produce deleterious mutations with different fitness costs. We modelled this effect as a multiplicative factor (κ) that modifies sd in the mutator background, such that when κ < 1 mutators produce milder deleterious mutations than antimutators, when κ = 1 there is no difference between backgrounds, and when κ > 1 mutations are more harmful in mutators. The basal deleterious mutation rate (ud) is set to 2 × 10⁻⁴ (other parameters as described in Methods)

The previous analysis shows that the importance of mutational spectrum ultimately depends on how large the mutation rate is compared with the fitness cost of deleterious mutations. Therefore, a further natural parameter to consider is the basal deleterious mutation rate (ud), that is, the absolute rate at which deleterious mutations are produced in the non-mutator background. Estimates of this quantity have classically been obtained through mutation accumulation experiments, and consequently suffer from the same uncertainties discussed for sd. Throughout the previous simulations we set ud = 2 × 10⁻⁴, as originally estimated in E. coli43. However, while reports in other bacteria have provided similar or slightly lower values (ud = 1.8–0.7 × 10⁻⁴)44,47, estimates in yeast differ by more than an order of magnitude, depending on whether the strain is haploid (ud = 1.1 × 10⁻³)48 or diploid (ud = 0.6–0.5 × 10⁻⁴)49,50. On top of this, an additional layer of variability comes from the fact that the overall mutation rate can vary across growth conditions51,52,53,54. In Fig. 3a–c we explored how changes in ud within the empirically relevant range can alter the previously discussed results from Fig. 2. A prominent pattern emerging from Fig. 3 is that the slopes become steeper with larger values of ud (Fig. 3c). This result mimics the pattern found for increasing m in Fig. 2, and can be understood in terms of populations moving gradually away from the Haldane-Muller regime. A more remarkable observation is that even for the lowest values tested, despite the relatively flatter slopes, the mutational spectrum is still capable of exerting a moderate but sizeable impact on the performance of invading antimutator alleles.

Fig. 3

Impact of the deleterious and lethal mutation rate on antimutator dynamics. Panels a and b show antimutator fitness under two extreme values of the basal deleterious mutation rate (ud= 0.5 × 10⁻⁴ and ud= 8 × 10⁻⁴, respectively). Points and colours follow the same convention as in Fig. 2. Panel c shows the change in the antimutator’s effective selection coefficient (seff) for various values of the basal deleterious mutation rate (from top to bottom, ud equals: 1.6 × 10⁻³, 8 × 10⁻⁴, 4 × 10⁻⁴, 2 × 10⁻⁴, 1 × 10⁻⁴, 0.5 × 10⁻⁴). Fold change refers to the change in seff from κ = 0.25 to κ= 4. Panels d and e show the effects on antimutator dynamics of spectrum-driven differences in the propensity to produce lethal mutations, under two extreme values of the basal lethal mutation rate (ul= 0.2 × 10⁻⁵ and ul= 6.4 × 10⁻⁵, respectively). Points and colours as in Fig. 2. Panel f shows the change in seff for various values of the basal lethal mutation rate (from top to bottom, ul equals: 6.4 × 10⁻⁵, 3.2 × 10⁻⁵, 1.6 × 10⁻⁵, 0.8 × 10⁻⁵, 0.4 × 10⁻⁵, 0.2 × 10⁻⁵). Fold change is defined as in c. In all cases, the mutation rate of the resident mutator was fixed to a single value (m= 300). Other parameters as described in Methods

Another issue worth considering is the lethal mutation rate (ul). Lethal mutations typically occur at a much lower rate than deleterious mutations55,56, and so as a first approximation we have neglected their influence. Lethal mutations, however, can be seen as a distinct subclass of deleterious mutations, namely, as large-effect deleterious mutations affecting essential genes. It seems possible, therefore, that mutators producing more harmful mutations may also produce a greater proportion of lethal mutations. Such spectrum-driven elevations in ul, if strong enough, may alter the results discussed in Fig. 2. A further consideration is that ul is expected to be even more environmentally dependent than ud, since not only the overall mutation rate varies across conditions, but also the fraction of the genome that is essential57,58. As a lower bound, we set ul = 2 × 10⁻⁶ from estimates in E. coli that ~7% of the genome is unconditionally essential58 and that ~13% of mutations within a protein are inactivating59. On the other hand, direct estimates in yeast have produced a value roughly an order of magnitude larger (ul = 3.2 × 10⁻⁵)60. Figure 3d–f confirms the intuition that lethal mutations generally have a modest effect on antimutator dynamics, except for the largest values of κ and ul considered. How often such extreme conditions are met in natural scenarios is a matter of empirical investigation, but overall our results show that spectrum-driven variations in ul within the relevant range can play a significant, yet typically secondary role in the invasion dynamics of antimutator alleles.

We also wanted to explore to what extent the previous results can be applicable to adaptive scenarios other than the LTEE setting. In particular, we extended the simulation analyses to study the consequences of changing two key demographic parameters: the bottleneck and the maximum population size. We found that these parameters have a minor effect on antimutator dynamics even in their lower value range, in which the influence of random genetic drift begins to be noticeable (Supplementary Fig. 3). Taken together, our results support the notion that the impact of mutational spectrum on antimutator evolution can be substantial under a wide and relevant range of parameters and experimental conditions.

As a final note, it is worth highlighting that the variation in the slopes in Figs. 2 and 3 results in a large area of overlap among the curves obtained for different mutators under various combinations of parameters, especially for the smaller values of κ. This overlap represents the range of conditions under which a stronger mutator will actually be more robust to antimutator invasions than a weaker one. Since sd, ud and ul can vary appreciably across species and environments, such a counterintuitive outcome illustrates the importance of considering the mutational spectrum when investigating the evolution of reduced mutation rates.

Mutational biases cause distinct protein-disrupting patterns

The previous results show that spectrum-driven differences in sd can greatly influence the evolution of reduced mutation rates in bacteria. It remains to be explored, nonetheless, whether spectrum-driven differences in sd are actually likely to occur among bacterial mutators. While differences in fitness have indeed been observed in the case of beneficial mutations involving a few genomic sites30, the very large mutational target size for deleterious mutations may cause local, spectrum-driven differences to balance out at the genome-wide scale. However, a first look at the properties of the genetic code affords reasonable grounds for expecting the emergence of some general trends. Certainly, it has long been known that transversions are overall more detrimental than transitions, due to the fact that transversions underlie a larger fraction of non-synonymous substitutions and, among these, tend to produce changes that are less conservative of the physicochemical properties of amino acids61,62,63. Besides these trends, a closer examination reveals that the 6 types of point mutations display fairly broad distributions of disruptive effects (see Supplementary Fig. 4). Such breadth raises the possibility that, ultimately, the average disruptive effect of a given mutational spectrum may actually be determined by the highly-diverse codon usage preferences observed among bacterial species64.

To explore these possibilities, we set out to quantify the average protein-disrupting effect of the specific point mutations elevated in 3 prominent types of mutators: mutY (G:C → T:A), mutT (A:T → C:G) and Mismatch Repair (G:C → A:T, A:T → G:C) mutators28,37 (see Methods). Briefly, we systematically computed all of the possible substitutions per codon associated with each mutational spectrum across a panel of bacterial genomes spanning a wide range of GC compositions. We then estimated the protein-disrupting effects of all these spectrum-specific substitutions by applying the well-known Grantham’s matrix of physicochemical distance65. This amino-acid substitution matrix was previously shown to provide the best predictions of empirical fitness effects among standard distance-based matrices59. As validation, we also applied an alignment-based substitution matrix (BLOSUM100)66, which provided comparable results. Moreover, for the specific case of the LTEE, we have shown that the use of Grantham’s matrix provides an efficient alternative to more sophisticated and computationally intensive methods, such as Direct Coupling Analysis57 (see Supplementary Fig. 5). Finally, seeking to increase the likelihood of non-synonymous mutations being predominantly harmful, we initially conducted these analyses for genes belonging to the COG categories most commonly enriched in essential genes (H: Coenzyme metabolism, J: Translation and M: Cell wall/membrane/envelope biogenesis)67, although the overall patterns remained similar when considering whole genomes (see Supplementary Fig. 6).
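The enumeration of spectrum-specific substitutions per codon can be sketched as follows. This is an illustrative reconstruction rather than the scripts used in this study: the function names are hypothetical, the standard genetic code is assumed, and the scoring function is left as a user-supplied argument because the Grantham and BLOSUM100 matrices are not reproduced here. How synonymous and nonsense changes are scored, and the equal weighting of the two Mismatch Repair transitions, are likewise assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Base changes elevated in each mutator, read on the coding strand.
SPECTRA = {
    "mutY": {"G": "T", "C": "A"},                      # G:C -> T:A
    "mutT": {"A": "C", "T": "G"},                      # A:T -> C:G
    "MMR": {"G": "A", "C": "T", "A": "G", "T": "C"},   # both transitions
}


def spectrum_substitutions(cds, spectrum):
    """Yield (original, mutant) amino-acid pairs for every single-base
    change allowed by the spectrum; '*' denotes a stop codon."""
    changes = SPECTRA[spectrum]
    usable = len(cds) - len(cds) % 3
    for i in range(0, usable, 3):
        codon = cds[i:i + 3].upper()
        if codon not in CODON_TABLE:  # skip incompletely specified bases
            continue
        for pos, base in enumerate(codon):
            if base in changes:
                mutant = codon[:pos] + changes[base] + codon[pos + 1:]
                yield CODON_TABLE[codon], CODON_TABLE[mutant]


def mean_disruption(cds, spectrum, distance):
    """Average of distance(aa_from, aa_to) over all substitutions.

    `distance` is a user-supplied callable, e.g. a lookup into a
    substitution matrix loaded elsewhere (hypothetical interface).
    """
    scores = [distance(a, b) for a, b in spectrum_substitutions(cds, spectrum)]
    return sum(scores) / len(scores) if scores else float("nan")


def is_nonsynonymous(aa_from, aa_to):
    """Placeholder score: 1 for any amino-acid change, 0 otherwise."""
    return float(aa_from != aa_to)
```

For example, `mean_disruption("ATGGGG", "mutY", is_nonsynonymous)` averages over four possible G → T changes, three of which alter the encoded amino acid. Replacing `is_nonsynonymous` with a matrix lookup would give an average physicochemical distance instead of a non-synonymous fraction.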

Figure 4 shows that there are indeed marked differences in the protein-disrupting effects of mutations caused by the different mutational spectra. The Mismatch Repair spectrum displays the weakest disruptive effects in all tested backgrounds (Fig. 4, green), which makes sense since this spectrum comprises the two transitions, well-known to be the most conservative among all possible point mutation types61,62,63. Although interesting, this result is probably an underestimate, since Mismatch Repair mutators, apart from point mutations, also exhibit an elevated occurrence of indels and large recombination events68. More remarkable is the fact that the disruptive effects associated with the mutY and mutT spectra exhibit a strong and opposite dependence on the GC content of the genetic background. In particular, we observe that the mutY spectrum is highly detrimental in AT-rich backgrounds (Fig. 4, red), while the mutT spectrum inflicts its greatest disruption in GC-rich backgrounds (Fig. 4, blue). This contrasting behaviour is amenable to a straightforward explanation: whatever the processes causing the base composition bias may be, the last codons to be changed to conform to this bias should be the ones for which the change will produce the most harmful effects. These last codons are exactly the ones being predominantly altered by mutY-specific mutations (G:C → T:A) and mutT-specific mutations (A:T → C:G) in AT-rich and GC-rich backgrounds, respectively.

Fig. 4

Protein-disrupting effects of mutations caused by different mutators in different genomes. Colours correspond to predictions for mutY (red), mutT (blue) and Mismatch Repair (green) mutators. For comparison, the effects of an unbiased spectrum are highlighted with a grey background. a Grantham scores in five different bacterial species for which hypermutability is of particular interest. These species are arranged, from left to right, according to increasing GC content. b Average Grantham scores across a panel of species with genomes spanning a wide range of GC compositions. Boxplots as defined by default. c Average BLOSUM100 scores across the same panel. Details about these genomes are shown in Supplementary Table 1. Source data are provided as a Source Data file

In addition, we should expect the fitness cost of altering these last, non-conforming codons to be the greatest in conditions where selection is weak compared to other evolutionary forces, since under such conditions selection can only prevent the most essential amino-acid sites from changing. This phenomenon would help explain why the most disruptive effects are found for the mutY spectrum in the most AT-biased genomes, which are generally seen as reflective of highly-relaxed selective conditions69,70,71. This effect is better appreciated in the analyses with the distance-based matrix than with the alignment-based matrix (Fig. 4b versus Fig. 4c), perhaps because physicochemical distance is a purer proxy for protein-disrupting effects than evolutionary conservation, which integrates the effects of several other factors (e.g., epistasis, basal mutational bias)34.


Our analyses reveal that different mutators can be expected to produce deleterious mutations with distinctive fitness effects, and that such idiosyncrasy can greatly impact antimutator invasion dynamics. At least three points regarding these findings merit brief discussion. First, the simulations purposely focused on the effects of mutational spectra on deleterious mutations, leaving aside the complications of considering either compensatory or generally-beneficial mutations. While previous research has already studied the importance of these types of mutations on antimutator dynamics12,26,27, a full treatment of this problem should include the fact that spectrum-driven differences can also bias mutator access to both compensatory and generally-beneficial mutations. Second, the dynamics can be further complicated by considering two phenomena well-known to limit the evolution of mutation rates: recombination and the cost of fidelity1. Recombination disrupts mutator hitchhiking by separating the mutator allele from its linked mutations4,14. Its relevance to the dynamics studied here, therefore, is probably confined to scenarios in which recombination rate is either very low or fluctuating, so as not to impede mutator fixation in the first place. Regarding the cost of fidelity, it is plausible that antimutator alleles can exhibit differences in their direct physiological cost based on their particular genetic underpinnings (e.g., true reversion72 versus gain-of-function mutations16). Characterising the complexities introduced by these factors will be reserved for future research. Third, it is worth noting the breadth of conditions under which spectrum effects are noticeable, as well as the magnitude that these effects can reach, including the paradoxical situation of weak mutators exhibiting a larger deleterious load than strong mutators. The breadth and magnitude of these effects lead us to conclude that, even if taken only as a first approximation, our analyses strongly support the notion that mutational spectrum differences can greatly influence antimutator evolution in many biologically-relevant scenarios.

Finally, it is worth pointing out that the general finding of our study is that antimutator success depends not only on the extent of mutation rate elevation, but also on the mutational spectrum, the genetic background and the environmental conditions. Since the exact contribution of these factors is essentially an empirical question, it is possible that the likelihood of antimutator invasions in real-world scenarios may have to be evaluated on a case-by-case basis. Such dependence on the particulars of each case has at least two important consequences. In clinical settings, it can complicate predictions about the long-term persistence and transmissibility of mutators, thus being relevant to interventions aimed at curbing the contribution of mutators to antibiotic resistance evolution37. More broadly, it has implications for our views on how mutators shape the evolution of bacterial genomes. Episodes of hypermutability can be common along the evolutionary history of bacterial lineages, inflicting rapid changes in genome size and composition that can blur the signature of selection57. Our results suggest that the length of these pulses of hypermutability, and therefore their potential impact, may be highly dependent on the specific genetic, ecological and evolutionary history of a given lineage—a possibility further complicating the interpretation of present-day patterns of bacterial genome diversity.


Computer simulation

The computer model simulates the serial passage of a bacterial population in a laboratory environment to which it is already well-adapted. Since we focused on strictly asexual populations, we used a class-based model in which individuals are grouped according to their genotype13,30. Mimicking the serial passage protocol from the LTEE, the algorithm recreates two stages: population growth and the 1/100 bottleneck73. In the first stage, cells reproduce deterministically and accumulate mutations stochastically while populations expand from 10⁷ to at least 10⁹ individuals. Reproduction is formulated in terms of discrete, non-overlapping generations74. Every generation, individuals reproduce deterministically according to their multiplicative growth rate, defined as r=2+nsd, where n represents the number of accumulated deleterious mutations and sd is the average deleterious selection coefficient. Mutation is implemented by using a Poisson-distributed pseudorandom number generator (the function rpois in R). Every generation, individuals acquire deleterious mutations stochastically with a probability depending on the basal deleterious mutation rate (ud) and the mutator strength (m) (note that for antimutator alleles m=1). The second part of the algorithm is executed when population size exceeds the limit of 10⁹ individuals, and consists of taking a random sample of 10⁷ individuals, after which growth is resumed. To recover from this daily bottleneck, populations require ≥ 7 generations (owing to discrete generation time and the accumulation of deleterious mutations).
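The two stages of a daily cycle can be sketched for a single lineage as follows. This is a simplified illustration and not the R code used in this study. It rests on several assumptions that are flagged in the comments: each deleterious mutation is taken to subtract sd from a baseline growth rate of 2, at most one new mutation arises per cell per generation, and the bottleneck is approximated by independent Poisson draws per class, so the post-dilution size is only approximately 10⁷. All function names are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw: Knuth's method for small means, rounded normal
    approximation for means above 50."""
    if lam <= 0:
        return 0
    if lam > 50:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def grow(classes, s_d, u_d, m, rng):
    """One discrete generation; classes[n] is the number of cells
    carrying n deleterious mutations."""
    nxt = [0] * (len(classes) + 1)
    for n, count in enumerate(classes):
        # ASSUMPTION: each mutation lowers the growth rate of 2 by s_d.
        rate = max(0.0, 2.0 - n * s_d)
        offspring = round(count * rate)
        # ASSUMPTION: at most one new mutation per cell per generation.
        mutants = min(offspring, poisson(offspring * u_d * m, rng))
        nxt[n] += offspring - mutants
        nxt[n + 1] += mutants
    return nxt


def bottleneck(classes, n_keep, rng):
    """Approximate random sample of about n_keep cells
    (ASSUMPTION: independent Poisson draws per class)."""
    total = sum(classes)
    return [min(c, poisson(c * n_keep / total, rng)) for c in classes]


def daily_cycle(classes, s_d, u_d, m, rng, n_min=10**7, n_max=10**9):
    """Grow until the population reaches n_max, then dilute to ~n_min.
    Returns the diluted classes and the number of generations elapsed."""
    generations = 0
    while sum(classes) < n_max:
        classes = grow(classes, s_d, u_d, m, rng)
        generations += 1
    return bottleneck(classes, n_min, rng), generations


if __name__ == "__main__":
    rng = random.Random(1)
    pop, gens = daily_cycle([10**7], s_d=0.03, u_d=2e-4, m=300, rng=rng)
    print(gens, sum(pop), pop[:4])
```

Tracking an invasion would require running two such class vectors side by side, one for the mutator (with its own m and, where relevant, κ) and one for the antimutator (m = 1), and applying the bottleneck to the pooled population.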

Simulations start with a single antimutator allele entering a population of $10^7$ mutator individuals, and terminate when this allele either reaches fixation or is lost by random drift. The average effective selection coefficient of the antimutator allele is calculated empirically as $s_{\mathrm{eff}} = \log\big((p_g/q_g)/(p_0/q_0)\big)/g$, where $p$ and $q$ represent the frequency of the antimutator and mutator allele, respectively, and $g$ is the number of generations74 (see Supplementary Fig. 1). To implement the differential access of mutators to deleterious mutations with different fitness costs, we introduced a multiplicative factor ($\kappa$) that modifies $s_d$ in the mutator background as $r = 2 + \kappa n s_d$. Note that when $\kappa < 1$ mutators produce milder deleterious mutations than antimutators, when $\kappa = 1$ there are no differences between backgrounds, and when $\kappa > 1$ mutations are more harmful in the mutator background. To implement the differential propensity of mutators to produce lethal mutations, we allowed $\kappa$ to modify the basal lethal mutation rate ($u_l$) in the mutator background, such that lethal mutations represent a smaller ($\kappa < 1$), equal ($\kappa = 1$) or larger ($\kappa > 1$) proportion of the total deleterious mutations than expected. For all tested parameter combinations, reported values of $s_{\mathrm{eff}}$ were computed from 200 independent replicates. All programming was performed in R version 3.2.3 (ref. 75), and basic codes are freely available on
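As a worked illustration of the two formulas in this paragraph, the sketch below computes the effective selection coefficient from allele frequencies at two time points and applies the κ factor to the mutator growth rate. It assumes that the logarithm is natural (the default of log in R) and that $s_d$ is entered as a negative number; the function names are hypothetical and the code is not taken from the authors' scripts.

```python
import math


def effective_selection(p0, pg, g):
    """s_eff = log((p_g/q_g) / (p_0/q_0)) / g, with q = 1 - p at each time point."""
    q0, qg = 1.0 - p0, 1.0 - pg
    return math.log((pg / qg) / (p0 / q0)) / g


def mutator_growth_rate(n, s_d, kappa=1.0):
    """r = 2 + kappa*n*s_d in the mutator background (s_d assumed negative)."""
    return max(0.0, 2.0 + kappa * n * s_d)
```

The estimator is undefined when either allele has a frequency of exactly 0 or 1, so it applies to time points at which both alleles are still present. If the antimutator-to-mutator ratio grows by a constant factor $e^{s}$ per generation, `effective_selection` returns $s$ regardless of the interval chosen.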

Genome analyses

To conduct the bioinformatic analyses we developed a series of scripts in Python (version 2.7.12). These codes were applied to a panel of 25 bacterial genomes, including relevant pathogens, chosen to span a wide range of GC compositions. A summary of the main features of these genomes is presented in Supplementary Table 1. For all strains, the predicted coding sequences (CDSs) and their functional classification (COG) were retrieved from the Microscope platform from Genoscope. After formatting and parsing, we estimated the average protein-disrupting effect of different mutations for all CDSs across the panel of genomes. We achieved this by computing the Grantham and BLOSUM100 scores for all of the possible substitutions per codon associated with each mutational spectrum. The Grantham and BLOSUM100 matrices were obtained from the AAindex database and the NCBI FTP server, respectively. Codons harbouring incompletely specified bases (e.g., N, R, Y) were excluded from the analyses. Basic codes are freely available on
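The per-codon scoring step can be illustrated with a short Python 3 sketch (the original scripts were written for Python 2.7.12 and are not reproduced here). It enumerates every single-base substitution allowed by a mutational spectrum, translates the mutated codon with the standard genetic code, and averages a user-supplied amino acid score. Several points are assumptions made for illustration only: the function names, the representation of a spectrum as (from, to) base pairs on the coding strand, the decision to skip substitutions involving stop codons, and the inclusion of synonymous changes in the average. The scoring function is passed in as an argument, so a Grantham or BLOSUM100 lookup loaded from the published matrices can be plugged in.

```python
import math
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(cds):
    """Yield in-frame codons, skipping any with incompletely specified bases."""
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3].upper()
        if codon in CODON_TABLE:
            yield codon


def mean_substitution_score(cds, spectrum, score):
    """Average score(ref_aa, alt_aa) over all substitutions allowed by the spectrum."""
    total, count = 0.0, 0
    for codon in codons(cds):
        ref = CODON_TABLE[codon]
        for pos, base in enumerate(codon):
            for src, dst in spectrum:
                if base != src:
                    continue
                alt = CODON_TABLE[codon[:pos] + dst + codon[pos + 1:]]
                if "*" in (ref, alt):
                    continue  # assumption: stop codons are left out of the average
                total += score(ref, alt)
                count += 1
    return total / count if count else math.nan
```

For instance, an A:T→C:G transversion bias can be written as `[("A", "C"), ("T", "G")]`, and a placeholder score such as `lambda a, b: 0 if a == b else 1` turns the function into an estimate of the fraction of amino acid-changing substitutions.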

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

Genome sequences were retrieved from Genoscope and are publicly accessible. The source data underlying Fig. 4 and Supplementary Figs. 4, 5 and 6 are provided as a Source Data file.

Code availability

Basic codes to reproduce the results here presented are publicly available at


  1. Sniegowski, P. D., Gerrish, P. J., Johnson, T. & Shaver, A. The evolution of mutation rates: separating causes from consequences. Bioessays 22, 1057–1066 (2000).
  2. Sturtevant, A. H. Essays on evolution. I. On the effects of selection on mutation rate. Q. Rev. Biol. 12, 464–467 (1937).
  3. Kimura, M. Optimum mutation rate and degree of dominance as determined by the principle of minimum genetic load. J. Genet. 57, 21–34 (1960).
  4. Leigh, E. G. The evolution of mutation rates. Genetics 73(Suppl 73), 1–18 (1973).
  5. Eshel, I. Clone-selection and optimal rates of mutation. J. Appl. Probab. 10, 728–738 (1973).
  6. Liberman, U. & Feldman, M. W. Modifiers of mutation rate: a general reduction principle. Theor. Popul. Biol. 30, 125–142 (1986).
  7. LeClerc, J. E., Li, B., Payne, W. L. & Cebula, T. A. High mutation frequencies among Escherichia coli and Salmonella pathogens. Science 274, 1208–1211 (1996).
  8. Sniegowski, P. D., Gerrish, P. J. & Lenski, R. E. Evolution of high mutation rates in experimental populations of E. coli. Nature 387, 703–705 (1997).
  9. Healey, K. R., Jimenez Ortigosa, C., Shor, E. & Perlin, D. S. Genetic drivers of multidrug resistance in Candida glabrata. Front. Microbiol. 7, 1995 (2016).
  10. Voordeckers, K. et al. Adaptation to high ethanol reveals complex evolutionary pathways. PLoS Genet. 11, e1005635 (2015).
  11. Loeb, L. A. Human cancers express mutator phenotypes: origin, consequences and targeting. Nat. Rev. Cancer 11, 450–457 (2011).
  12. Good, B. H. & Desai, M. M. Evolution of mutation rates in rapidly adapting asexual populations. Genetics 204, 1249–1266 (2016).
  13. Tenaillon, O., Toupance, B., Nagard, H. L., Taddei, F. & Godelle, B. Mutators, population size, adaptive landscape and the adaptation of asexual populations of bacteria. Genetics 152, 485–493 (1999).
  14. Tenaillon, O., Nagard, H. L., Godelle, B. & Taddei, F. Mutators and sex in bacteria: conflict between adaptive strategies. Proc. Natl Acad. Sci. USA 97, 10465–10470 (2000).
  15. Raynes, Y., Wylie, C. S., Sniegowski, P. D. & Weinreich, D. M. Sign of selection on mutation rate modifiers depends on population size. Proc. Natl Acad. Sci. USA 115, 3422–3427 (2018).
  16. Drake, J. W. General antimutators are improbable. J. Mol. Biol. 229, 8–13 (1993).
  17. Wielgoss, S. et al. Mutation rate dynamics in a bacterial population reflect tension between adaptation and genetic load. Proc. Natl Acad. Sci. USA 110, 222–227 (2013).
  18. Brown, E. W., LeClerc, J. E., Li, B., Payne, W. L. & Cebula, T. A. Phylogenetic evidence for horizontal transfer of mutS alleles among naturally occurring Escherichia coli strains. J. Bacteriol. 183, 1631–1644 (2001).
  19. Denamur, E. et al. Evolutionary implications of the frequent horizontal transfer of mismatch repair genes. Cell 103, 711–721 (2000).
  20. Elena, S. F., Whittam, T. S., Winkworth, C. L., Riley, M. A. & Lenski, R. E. Genomic divergence of Escherichia coli strains: evidence for horizontal transfer and variation in mutation rates. Int. Microbiol. 8, 271–278 (2005).
  21. Couce, A., Alonso-Rodriguez, N., Costas, C., Oliver, A. & Blázquez, J. Intrapopulation variability in mutator prevalence among urinary tract infection isolates of Escherichia coli. Clin. Microbiol. Infect. 22, 566.e1–7 (2016).
  22. McDonald, M. J., Hsieh, Y.-Y., Yu, Y.-H., Chang, S.-L. & Leu, J.-Y. The evolution of low mutation rates in experimental mutator populations of Saccharomyces cerevisiae. Curr. Biol. 22, 1235–1240 (2012).
  23. Singh, T., Hyun, M. & Sniegowski, P. Evolution of mutation rates in hypermutable populations of Escherichia coli propagated at very small effective population size. Biol. Lett. 13, 20160849 (2017).
  24. Tröbner, W. & Piechocki, R. Selection against hypermutability in Escherichia coli during long term evolution. Mol. Gen. Genet. 198, 177–178 (1984).
  25. Turrientes, M.-C. et al. Normal mutation rate variants arise in a mutator (Mut S) Escherichia coli population. PLoS ONE 8, e72963 (2013).
  26. Jain, K. & James, A. Fixation probability of a nonmutator in a large population of asexual mutators. J. Theor. Biol. 433, 85–93 (2017).
  27. James, A. & Jain, K. Fixation probability of rare nonmutator and evolution of mutation rates. Ecol. Evol. 6, 755–764 (2016).
  28. Miller, J. H. Spontaneous mutators in bacteria: insights into pathways of mutagenesis and repair. Annu. Rev. Microbiol. 50, 625–643 (1996).
  29. Michaels, M. L. & Miller, J. H. The GO system protects organisms from the mutagenic effect of the spontaneous lesion 8-hydroxyguanine (7,8-dihydro-8-oxoguanine). J. Bacteriol. 174, 6321–6325 (1992).
  30. Couce, A., Guelfo, J. R. & Blázquez, J. Mutational spectrum drives the rise of mutator bacteria. PLoS Genet. 9, e1003167 (2013).
  31. Haldane, J. B. S. The effect of variation of fitness. Am. Nat. 71, 337–349 (1937).
  32. Muller, H. J. Our load of mutations. Am. J. Hum. Genet. 2, 111–176 (1950).
  33. Crow, J. F. Genetic loads and the cost of natural selection. In Mathematical Topics in Population Genetics (ed. Kojima, K.) 128–177 (Springer, Berlin Heidelberg, 1970).
  34. Yampolsky, L. Y. & Stoltzfus, A. The exchangeability of amino acids in proteins. Genetics 170, 1459–1472 (2005).
  35. Good, B. H., McDonald, M. J., Barrick, J. E., Lenski, R. E. & Desai, M. M. The dynamics of molecular evolution over 60,000 generations. Nature 551, 45–50 (2017).
  36. Foster, P. L. Methods for determining spontaneous mutation rates. Methods Enzymol. 409, 195–213 (2006).
  37. Oliver, A. & Mena, A. Bacterial hypermutation in cystic fibrosis, not only for antibiotic resistance. Clin. Microbiol. Infect. 16, 798–808 (2010).
  38. Jolivet-Gougeon, A. et al. Bacterial hypermutation: clinical implications. J. Med. Microbiol. 60, 563–573 (2011).
  39. Notley-McRobb, L., Seeto, S. & Ferenci, T. Enrichment and elimination of mutY mutators in Escherichia coli populations. Genetics 162, 1055–1062 (2002).
  40. Ragheb, M. N. et al. Inhibiting the evolution of antibiotic resistance. Mol. Cell 73, 157–165.e5 (2019).
  41. Eyre-Walker, A. & Keightley, P. D. The distribution of fitness effects of new mutations. Nat. Rev. Genet. 8, 610–618 (2007).
  42. Elena, S. F. & Lenski, R. E. Test of synergistic interactions among deleterious mutations in bacteria. Nature 390, 395–398 (1997).
  43. Kibota, T. T. & Lynch, M. Estimate of the genomic mutation rate deleterious to overall fitness in E. coli. Nature 381, 694–696 (1996).
  44. Trindade, S., Perfeito, L. & Gordo, I. Rate and effects of spontaneous mutations that affect fitness in mutator Escherichia coli. Philos. Trans. R. Soc. Lond. B Biol. Sci. 365, 1177–1186 (2010).
  45. Lind, P. A. & Andersson, D. I. Whole-genome mutational biases in bacteria. Proc. Natl Acad. Sci. USA 105, 17878–17883 (2008).
  46. Heilbron, K., Toll-Riera, M., Kojadinovic, M. & MacLean, R. C. Fitness is strongly influenced by rare mutations of large effect in a microbial mutation accumulation experiment. Genetics 197, 981–990 (2014).
  47. Dillon, M. M. & Cooper, V. S. The fitness effects of spontaneous mutations nearly unseen by selection in a bacterium with multiple chromosomes. Genetics 204, 1225–1238 (2016).
  48. Wloch, D. M., Szafraniec, K., Borts, R. H. & Korona, R. Direct estimate of the mutation rate and the distribution of fitness effects in the yeast Saccharomyces cerevisiae. Genetics 159, 441–452 (2001).
  49. Joseph, S. B. & Hall, D. W. Spontaneous mutations in diploid Saccharomyces cerevisiae: more beneficial than expected. Genetics 168, 1817–1825 (2004).
  50. Zeyl, C. & DeVisser, J. A. Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics 157, 53–61 (2001).
  51. Degnen, G. E. & Cox, E. C. Conditional mutator gene in Escherichia coli: isolation, mapping, and effector studies. J. Bacteriol. 117, 477–487 (1974).
  52. Krašovec, R. et al. Mutation rate plasticity in rifampicin resistance depends on Escherichia coli cell-cell interactions. Nat. Commun. 5, 3742 (2014).
  53. Chu, X.-L. et al. Temperature responses of mutation rate and mutational spectrum in an Escherichia coli strain and the correlation with metabolic rate. BMC Evol. Biol. 18, 126 (2018).
  54. Shewaramani, S. et al. Anaerobically grown Escherichia coli has an enhanced mutation rate and distinct mutational spectra. PLoS Genet. 13, e1006570 (2017).
  55. Bull, J. J. & Wilke, C. O. Lethal mutagenesis of bacteria. Genetics 180, 1061–1070 (2008).
  56. Robert, L. et al. Mutation dynamics and fitness effects followed in single cells. Science 359, 1283–1286 (2018).
  57. Couce, A. et al. Mutator genomes decay, despite sustained fitness gains, in a long-term experiment with bacteria. Proc. Natl Acad. Sci. USA 114, E9026–E9035 (2017).
  58. Baba, T. et al. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol. Syst. Biol. 2, 0008 (2006).
  59. Jacquier, H. et al. Capturing the mutational landscape of the beta-lactamase TEM-1. Proc. Natl Acad. Sci. USA 110, 13067–13072 (2013).
  60. Zhu, Y. O., Siegal, M. L., Hall, D. W. & Petrov, D. A. Precise estimates of mutation rate and spectrum in yeast. Proc. Natl Acad. Sci. USA 111, E2310–E2318 (2014).
  61. Lyons, D. M. & Lauring, A. S. Evidence for the selective basis of transition-to-transversion substitution bias in two RNA viruses. Mol. Biol. Evol. 34, 3205–3215 (2017).
  62. Wakeley, J. The excess of transitions among nucleotide substitutions: new methods of estimating transition bias underscore its significance. Trends Ecol. Evol. 11, 158–162 (1996).
  63. Zhang, J. Rates of conservative and radical nonsynonymous nucleotide substitutions in mammalian nuclear genes. J. Mol. Evol. 50, 56–68 (2000).
  64. Wan, X.-F., Xu, D., Kleinhofs, A. & Zhou, J. Quantitative relationship between synonymous codon usage bias and GC composition across unicellular genomes. BMC Evol. Biol. 4, 19 (2004).
  65. Grantham, R. Amino acid difference formula to help explain protein evolution. Science 185, 862–864 (1974).
  66. Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices from protein blocks. Proc. Natl Acad. Sci. USA 89, 10915–10919 (1992).
  67. Mandal, R. K., Jiang, T. & Kwon, Y. M. Essential genome of Campylobacter jejuni. BMC Genom. 18, 616 (2017).
  68. Elez, M., Radman, M. & Matic, I. The frequency and structure of recombinant products is determined by the cellular level of MutL. Proc. Natl Acad. Sci. USA 104, 8935–8940 (2007).
  69. Hershberg, R. & Petrov, D. A. Evidence that mutation is universally biased towards AT in bacteria. PLoS Genet. 6, e1001115 (2010).
  70. Hildebrand, F., Meyer, A. & Eyre-Walker, A. Evidence of selection upon genomic GC-content in bacteria. PLoS Genet. 6, e1001107 (2010).
  71. McCutcheon, J. P. & Moran, N. A. Extreme genome reduction in symbiotic bacteria. Nat. Rev. Microbiol. 10, 13–26 (2011).
  72. Shaver, A. C. & Sniegowski, P. D. Spontaneously arising mutL mutators in evolving Escherichia coli populations are the result of changes in repeat length. J. Bacteriol. 185, 6076–6082 (2003).
  73. Lenski, R. E., Rose, M. R., Simpson, S. C. & Tadler, S. C. Long-term experimental evolution in Escherichia coli. I. Adaptation and divergence during 2,000 generations. Am. Nat. 138, 1315–1341 (1991).
  74. Chevin, L.-M. On measuring selection in experimental evolution. Biol. Lett. 7, 210–213 (2011).
  75. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna, Austria, 2015).
  76. Vallenet, D. et al. MaGe: a microbial genome annotation system supported by synteny results. Nucleic Acids Res. 34, 53–65 (2006).
  77. Kawashima, S. & Kanehisa, M. AAindex: amino acid index database. Nucleic Acids Res. 28, 374 (2000).



We thank Dr. Harry Kemble for critical comments on an earlier version of the paper. This work was supported by the European Commission under the 7th Framework Program (ERC Grant 310944 to O.T.) and under the Horizon 2020 Framework Programme (MSCA-IF 750129 to A.C.).

Author information




A.C. conceived the project, designed, conducted and interpreted simulations and genome analyses and wrote the paper. O.T. contributed to data analysis and interpretation and critically revised the paper.

Corresponding author

Correspondence to Alejandro Couce.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit


Cite this article

Couce, A., Tenaillon, O. Mutation bias and GC content shape antimutator invasions. Nat Commun 10, 3114 (2019).
