The role of mutational spectrum in the selection against mutator alleles

Rapidly adapting microbe and cancer cell populations often evolve high mutation rates. Yet, once adaptive opportunity declines, antimutator alleles are expected to take over as a result of indirect selection against deleterious mutations. Theory indicates that the most important determinant of antimutator invasions is the extent of mutation rate reduction. However, inconsistent results from evolution experiments suggest that additional factors may also play a major role in antimutator dynamics. Here we show that the idiosyncratic mutation bias exhibited by different mutators – a previously unrecognized factor – can greatly alter mutator susceptibility to antimutator invasions. Using a simulation model calibrated to mimic a well-known long-term evolution experiment with bacteria, we show that differences in average deleterious load can account for order-of-magnitude changes in antimutator fitness for a realistic range of parameters. Since these parameters are known to vary with the environment, our results reveal an unanticipated source of variability in antimutator dynamics. Finally, we estimated the genome-wide average disruptive effect on proteins of mutations caused by different mutators, and found marked and systematic differences emerging across mutators and species with different genomic GC compositions. Taken together, our results suggest that antimutator dynamics may be highly dependent on the specific genetic, ecological and evolutionary history of a given population. Such dependence reveals a more complex picture than anticipated, being relevant for understanding mutators in clinical settings, as well as how hypermutability shapes the evolution of bacterial genome size and composition.


Introduction
The idea that the mutation rate is evolvable has captivated the interest of evolutionary biologists for decades (Sniegowski et al., 2000). It was early recognized that, since the vast majority of mutations with phenotypic effects are deleterious, selection should primarily act to reduce the deleterious load, pushing mutation rates to be as low as physiologically affordable (Eshel, 1973;Kimura, 1960;Leigh, 1973;Liberman and Feldman, 1986;Sturtevant, 1937). However, strains with highlyelevated mutation rates (i.e. mutators) are readily selected in clinical and laboratory populations of bacteria (LeClerc et al., 1996;Sniegowski et al., 1997) and yeast (Healey et al., 2016;Voordeckers et al., 2015), as well as in certain cancers (Loeb, 2011). Theory and experiments have explained this phenomenon in terms of selection pressures operating at different timescales: linkage with strong beneficial mutations enables mutators to rapidly reach fixation before their increased deleterious load becomes fully manifest, which requires the accumulation of multiple secondary deleterious mutations (Good and Desai, 2016). Due to this reliance on rapid hitchhiking, mutators are most likely to thrive whenever populations face strong selection pressures (Tenaillon et al., 1999) and under conditions in which both recombination  and genetic drift (Raynes et al., 2018) are unimportant.
But, eventually, populations fixed for a mutator phenotype are expected to re-evolve low mutation rates -provided that selective pressure subsides and restorative alleles are available. Given the longer timescales involved, the evolution of reduced mutation rates has proven much more difficult to observe than its reverse process, the selection for mutator alleles. Indirect evidence comes from the fact that DNA repair genes seem to undergo frequent horizontal transfer (Brown et al., 2001;Denamur et al., 2000;Elena et al., 2005), and the observation of marked mutation rate polymorphisms within single-patient bacterial populations (Couce et al., 2016). Direct Wielgoss et al., 2013). The provisional picture that emerges from these studies is rather heterogeneous, with different experiments reporting contrasting findings in terms of timescales, mechanisms and magnitude of mutation rate reduction. Recent theoretical work has begun to provide a framework to account for these contrasting patterns, emphasising the role of several factors in determining the fixation probability of antimutator alleles. These factors include differences in population size, beneficial and deleterious mutation rates, mutator strength, and the availability of secondary mutations compensating the cost of deleterious mutations (Good and Desai, 2016; Jain and James, 2017; James and Jain, 2016).
An additional, yet unexplored factor is the well-known mutational idiosyncrasy exhibited by different mutators (Miller, 1996). This idiosyncrasy arises from the particular molecular details of the mutation-avoidance mechanism that are impaired in each mutator genotype. In Escherichia coli, for instance, impairment of any of the enzymes removing oxidized guanine from the DNA (e.g. MutM, MutY) results into substantial elevations of G:C→T:A mutations, while disruption of the enzyme preventing its incorporation from the free nucleotide pool (e.g., MutT) leads to a marked increase in A:T→C:G mutations (Michaels and Miller, 1992). These sort of mutational biases shape the tendency of different mutators to generate mutations with different fitness effects, which can have dramatic consequences on mutator success when adaptation involves just a few strongly beneficial mutations (Couce et al., 2013). In analogy to this phenomenon, an intriguing hypothesis is that mutators that tend to generate stronger deleterious mutations may be more easily outcompeted by an invading, low-mutation rate genotype. Similarly, mutators producing on average milder deleterious mutations than the wild-type may resist the invasion of antimutator alleles for longer. Whether these possibilities are plausible or not under realistic scenarios remains largely unknown.
In a first approach, at least two considerations argue against the idea that mutational spectrum differences can play any significant role in the evolution of reduced mutation rates. The first one comes from the classic Haldane-Muller principle (Haldane, 1937;Muller, 1950), which states that the reduction in fitness caused by recurring deleterious mutations is roughly on the order of the deleterious mutation rate (u d ), irrespective of the actual fitness cost of each individual mutation (s d ).
Such independence from s d should preclude any spectrum-driven differences in mutational load among mutators. It is well-known, however, that this principle only holds as long as s d > u d (Crow, 1970), a condition that may readily be violated in well-adapted, mutator populations of microbes.
Second, different biases in the production of mutations are likely to translate into substantial fitness differences when just a few number of sites have a huge impact on fitness, as in the case of strongly beneficial antibiotic resistance mutations (Couce et al., 2013). It is unclear, however, to what extent these kind of spectrum-driven differences may balance out when considering a larger number of sites. Relevant to this issue is the observation that some amino acid substitutions tend to be much more disruptive to proteins than others, a well-established fact that forms the basis of many protein alignment tools (Yampolsky and Stoltzfus, 2005). This fact affords speculation that systematic patterns may emerge at the genome-wide scale, so that different mutational spectra may produce, on average, deleterious mutations with characteristically different fitness effects.
Here, we used computer simulation to explore the extent to which the advantage of an antimutator allele deviates from the Haldane-Muller expectations under the relevant range of parameters. In addition, we estimated the genome-wide average disruptive effect on proteins of mutations caused by different mutational spectra. Importantly, since different codon usage patterns might alter the probability that a particular spectrum generates strong-effect amino acid changes, we also tested whether systematic differences are to be expected among mutators in species with widely-divergent genomic GC compositions. Overall, our results suggest that mutational spectrum differences may play an unsuspectedly important role in the selection against high mutation rates in bacteria.

Broad conditions allow mutational spectrum to impact antimutator evolution
To test whether mutational spectrum differences can alter the evolution of reduced mutation rates, we built a computer model that simulates the evolutionary dynamics of antimutator alleles invading a mutator population. The model was designed to capture the basic properties of the influential Lenski's Long-Term Evolution Experiment (LTEE), in which 12 Escherichia coli populations have been serially propagated in the same glucose-limited medium for more than 60,000 generations (Good et al., 2017). Crucially, one of these bacterial populations was observed to re-evolve reduced mutation rates after being dominated by a mutator phenotype for more than 10,000 generations (Wielgoss et al., 2013). Inspired by this experiment, we considered the simple scenario of an asexual mutator population being serially propagated in a constant environment to which is already well-adapted (see Material and Methods). At the start of each simulation, a single antimutator allele, restoring the mutation rate to wild-type levels, is introduced. The trajectory of this allele is tracked important parameters for this matter are the mutation rate of mutators (m) and the average selection coefficient of deleterious mutations (s d ). Most estimates of m are based on a few reporter genes, and so caution should be exercised when using them as a proxy for genome-wide rates (Foster, 2006). However, while more than a dozen genes are known to increase bacterial mutation rates when inactivated (Miller, 1996 Figure S1). Therefore, there are grounds to speculate whether spectrum-driven differences in s d may alter the propensity of different mutators to evolve reduced mutation rates. To examine this possibility, we expanded the computer simulation model to allow consideration of general biases in the production of deleterious mutations. We modelled these biases as a multiplicative factor (κ) that modifies the selection coefficient of deleterious mutations in the mutator background, such that when κ= 1 there are no differences between mutators and antimutators (see Material and Methods). The previous analysis shows that the importance of mutational spectrum ultimately depends on how large the mutation rate is compared with the fitness cost of deleterious mutations. Therefore, a further natural parameter to consider is the basal deleterious mutation rate (u d ), that is, the absolute  Figure 3 is that the slopes become steeper with larger values of u d (Figure 3c). This result mimics the pattern found for increasing m in Figure 2, and can be understood in terms of populations moving gradually away from the Haldane-Muller regime. A more remarkable observation is that even for the lowest values tested, despite the relatively flatter slopes, the mutational spectrum is still capable of exerting a moderate but sizeable impact on the performance of invading antimutator alleles.
Overall, the results presented here support the notion that the impact of mutational spectrum on antimutator evolution can be substantial under a wide and relevant range of conditions. Finally, it is worth noting that the variation in the slopes in Figures 2 and 3 results in a large area of overlap among the curves obtained for different mutators under various combinations of parameters, especially for the smaller values of κ. This overlap represents the range of conditions under which a stronger mutator will actually be more robust to antimutator invasions than a weaker one. Since both s d and U d can vary appreciably across species and environments, such a counterintuitive outcome illustrates the importance of considering the mutational spectrum when investigating the evolution of reduced mutation rates.

Different mutational spectra cause distinct genome-wide protein-disrupting patterns
The previous results show that spectrum-driven differences in s d can greatly influence the evolution of reduced mutation rates in bacteria. It remains to be explored, nonetheless, whether spectrumdriven differences in s d are actually likely to occur among bacterial mutators. While differences in fitness have indeed been observed in the case of beneficial mutations involving a few genomic sites (Couce et al., 2013), the very large mutational target size for deleterious mutations may cause local, spectrum-driven differences to balance out at the genome-wide scale. However, a first look at the properties of the genetic code affords reasonable grounds for expecting the emergence of some general trends. Certainly, it has long been known that transversions are overall more detrimental than transitions, due to the fact that transversions underlie a larger fraction of non-synonymous substitutions and, among these, tend to produce changes that are less conservative of the physicochemical properties of amino acids (Lyons and Lauring, 2017;Wakeley, 1996;Zhang, 2000). Besides these trends, a closer examination reveals that the 6 types of point mutations display fairly broad distributions of disruptive effects (see Supplementary Figure S2). Such breadth raises the possibility that, ultimately, the average disruptive effect of a given mutational spectrum may actually be determined by the highly-diverse codon usage preferences observed among bacterial species (Wan et al., 2004).
To explore these possibilities, we set out to quantify the average protein-disrupting effect of the  (Grantham, 1974). This amino-acid substitution matrix was previously shown to provide the best predictions of empirical fitness effects among standard distance-based matrices (Jacquier et al., 2013). As validation, we also conducted the analyses applying an alignment-based substitution matrix (BLOSUM100, Henikoff and Henikoff, 1992), which provided comparable results. Moreover, using the specific case of the LTEE experiment, we have shown that   (Figure 4, green), which makes sense since this spectrum comprises the two transitions, well-known to be the most conservative among all possible point mutation types (Lyons and Lauring, 2017;Wakeley, 1996;Zhang, 2000). While interesting, we shall note that this result is probably an underestimation since Mistmatch Repair mutators, apart from point mutations, also exhibit an elevated occurrence of indels and large recombination events (Elez et al., 2007). More remarkable is the fact that the disruptive effects associated with the mutYand mutTspectra exhibit a strong and opposite dependence on the GC content of the genetic background. In particular, we observe that the mutYspectrum is highly detrimental in AT-rich backgrounds (Figure 4, red), while the mutTspectrum inflicts its greatest disruption in GT-rich backgrounds (Figure 4, blue). This contrasting behavior is amenable to a straightforward explanation: whatever the processes causing the base composition bias may be, the last codons to be changed to conform to this bias should be the ones for which the change will produce the most harmful effects. These last codons are exactly the ones being predominantly altered by mutY -specific mutations (G:C→T:A) and mutT --specific mutations (A:T→C:G) in AT-rich and GT-rich backgrounds, respectively.
In addition, we should expect the fitness cost of altering these last, non-conforming codons to be the greatest in conditions where selection is weak compared to other evolutionary forces, since under such conditions selection can only prevent the most essential amino-acid sites from changing. This phenomenon would help explain why the most disruptive effects are found for the mutYspectrum in the most AT-biased genomes -generally seen as reflective of highly-relaxed selective conditions

Concluding Remarks
Our analyses reveal that different mutators can be expected to produce deleterious mutations with distinctive fitness effects, and that such idiosyncrasy can greatly impact antimutator invasion dynamics. At least two points regarding these findings merit brief consideration. First, the simulations purposely focused on the effects of mutational spectra on deleterious mutations, leaving aside the complications of considering either compensatory or generally-beneficial mutations.
While previous research has already studied the importance of these types of mutations on antimutator dynamics (Good and Desai, 2016; Jain and James, 2017; James and Jain, 2016), a full treatment of this problem should take into account the fact that spectrum-driven differences can also bias mutator access to both compensatory and generally-beneficial mutations -a non-trivial degree of complexity that will be reserved for future research. Second, it is worth noting the breadth of conditions under which spectrum effects are noticeable, as well as the magnitude that these effects can reach -including the paradoxical situation of weak mutators exhibiting larger deleterious load that strong mutators. The breadth and magnitude of these effects lead us to conclude that, even if taken only as a first approximation, our analyses strongly support the notion that mutational spectrum differences can drive antimutator evolution in many biologically-relevant scenarios.
Finally, it is worth pointing out that the general finding of our study is that antimutator success depends not only on the extent of mutation rate elevation, but also on the mutational spectrum, the genetic background and the environmental conditions. Since the exact contribution of these factors is essentially an empirical question, it is possible that the likelihood of antimutator invasions in realworld scenarios may have to be evaluated on a case-by-case basis. Such dependence on the particulars of each case has at least two important consequences. In clinical settings, it can complicate predictions about the long-term persistence and transmissibility of mutators, thus being relevant to interventions aimed at curbing the contribution of mutators to antibiotic resistance evolution (Oliver and Mena, 2010). More broadly, it has implications for our views on how mutators shape the evolution of bacterial genomes. Episodes of hypermutability can be common along the evolutionary history of bacterial lineages, inflicting rapid changes in genome size and composition that can blur the signature of selection (Couce et al., 2017). Our results suggest that the length of these pulses of hypermutability, and therefore their potential impact, may be highly dependent on the specific genetic, ecological and evolutionary history of a given lineage -a possibility further complicating the interpretation of present-day patterns of bacterial genome diversity.

Computer simulation
The computer model simulates the serial passage of a bacterial population in a laboratory environment to which is already well-adapted. Mimicking serial passage, the algorithm recreates two events: population growth and bottleneck. In the first one, cells reproduce and accumulate mutations for ~ 6.7 generations (population sizes oscillate daily between 10 7 and 10 9 ) (Lenski et al., defined as r = 2 + ns d , where n represents the number of accumulated deleterious mutations and s d is the average deleterious selection coefficient. Mutation is implemented by using a Poissondistributed pseudorandom number generator (the function rpois in R). Every generation, individuals acquire deleterious mutations with a probability depending on the basal deleterious mutation rate (u d ) and the mutator strength (m) (note that for antimutator alleles m=1). Lethal mutations are implemented similarly, with a fixed value of u l = 1 x 10 -5 (Robert et al., 2018;Tenaillon et al., 1999). The second part of the algorithm is executed when population size exceeds the limit of 10 9 individuals, and consist of taking a random sample of 10 7 individuals, after which growth is resumed.
Simulations start with a single antimutator allele entering a population of 10 7 mutator individuals, and terminate when the frequency this allele reaches either zero or increases by a 1,000-fold factor.
The average effective selection coefficient of the antimutator allele is calculated empirically as s eff = log((p g /q g )/(p 0 /q 0 ))/g where p and q represent the frequency of the antimutator and mutator allele, respectively, and g is the number of generations (Chevin, 2011). To implement the differential access of mutators to deleterious mutations with different fitness costs, we introduced a multiplicative factor (κ) that modifies s d in the mutator background as r = 2 + κns d . Note that when

Genome analyses
To conduct the bioinformatic analyses we developed a series of scripts in Python (version 2.7.12) (www.python.org), freely available upon request. These codes were applied to a panel of 25 bacterial genomes, including relevant pathogens, and chosen to span a wide range of GC compositions. A summary of the main features of these genomes is presented in Supplementary   Table S1. For all strains, the predicted coding sequences (CDSs) and their functional classification (COG) were retrieved from the Microscope platform from Genoscope (www.genoscope.cns.fr) (Vallenet et al., 2006). After formatting and parsing, we estimated the average protein-disrupting effect of different mutations for all CDSs across the panel of genomes. We achieved this by computing the Grantham and BLOSUM100 scores for all of the possible substitutions per codon associated with each mutational spectrum. The Grantham and BLOSUM100 matrices were obtained from the AAindex database (www.genome.jp/aaindex) (Kawashima and Kanehisa, 2000) and the NCBI FTP server (ftp.ncbi.nih.gov/blast/matrices), respectively. Codons harbouring incompletely specified bases (e.g. N, R, Y) were excluded from the analyses.  0.008, 0.004, 0.002, 0.001). Mutational spectrum effects refers to the differential propensity of mutators to produce deleterious mutations with different fitness cost. We modelled this effect as a multiplicative factor (κ) that modifies s d in the mutator background, such that when κ=1 mutators and antimutators generate deleterious mutations of similar effect.