Phenotypic Switching Can Speed up Microbial Evolution

Stochastic phenotype switching has been suggested to play a beneficial role in microbial populations by leading to the division of labour among cells, or ensuring that at least some of the population survives an unexpected change in environmental conditions. Here we use a computational model to investigate an alternative possible function of stochastic phenotype switching: as a way to adapt more quickly even in a static environment. We show that when a genetic mutation causes a population to become less fit, switching to an alternative phenotype with higher fitness (growth rate) may give the population enough time to develop compensatory mutations that increase the fitness again. The possibility of switching phenotypes can reduce the time to adaptation by orders of magnitude if the “fitness valley” caused by the deleterious mutation is deep enough. Our work has important implications for the emergence of antibiotic-resistant bacteria. In line with recent experimental findings, we hypothesise that switching to a slower growing — but less sensitive — phenotype helps bacteria to develop resistance by providing alternative, faster evolutionary routes to resistance.

In Fig. 2A we compared the mean adaptation time T with and without phenotypic switching for various parameters. In all cases switching decreased T . However, if the mutation probability µ is large enough, the pathway that goes directly through the valley genotype 2A may take less time than any pathway involving the alternative phenotype B. Figure S1 shows T as a function of mutation probability µ for the case with and without switching. For large enough µ, switching makes no observable difference to the mean adaptation time as successful cells do not use an alternative pathway. This is the same behaviour observed in region 1 of Fig. 3B.  Figure S2 shows the mean adaptation time when the switching rates between the two phenotypes are asymmetric, α (A→B) and β (B→A), but identical for all genotypes. To create the plot we fixed the ratio α/β to 10 −2 , . . . , 10 2 and varied α. For all but the largest ratio α/β = 10 2 the mean adaptation time is minimized for a certain range of α.

III. MEAN ADAPTATION TIME DECREASES MONOTONOUSLY WITH α IN THE ABSENCE OF SWITCHING BETWEEN STATES 2A AND 2B
We identified that region 3 in Fig. 3B appears due to the existence of the step between states 2A and 2B (see Fig. 1A). For large α values a trajectory via phenotype B is likely to involve switching to the deleterious state 2A. This reduces the average fitness of the population and causes the mean adaptation time to increase with α (Fig. 2B). To demonstrate this consider the model without switching between states 2A and 2B (Fig. S3A). The mean adaptation time T as a function of switching rate α decreases monotonically in this modified model (Fig. S3B).

IV. PROBABILITY OF INDIVIDUAL STEPS FOR SUCCESSFUL TRAJECTORIES
In Fig. 3B we showed the most common trajectory classes for successful cells as a function of (µ, α). A more detailed picture can be obtained by looking at the probability that each step occurs in a successful cell's trajectory. For each pair of (µ, α) we create a diagram of transitions between states (i.e. 1A,1B,...,3A,3B) in which the thickness of each link corresponds to the probability that that step is taken (Fig. S4A). Figure S4B shows a graph in which diagrams for different pairs of (µ, α) have been put together. This graph is very similar to Fig. 3B. This confirms that the most-common trajectories plotted in Fig. 3B are good representatives of the "typical" trajectories of successful cells.

V. MEAN ADAPTATION TIME AS FUNCTION OF CARRYING CAPACITY
We can plot differently the same simulation data used in Fig. 4B to explore how the mean adaptation time T depends on the carrying capacity K. Figure S5 shows T for different values of the switching rate α. A nonmonotonic behaviour can be seen: T is maximal for intermediate K and falls off for both small and large carrying capacities.

VI. THE EQUILIBRIUM DISTRIBUTION BETWEEN PHENOTYPES AT GENOTYPE 2
In the main text we posited that when α becomes too large the beneficial effect that state 2B has on the evolution of the population reduces, through its connection with the deleterious valley state 2A. To demonstrate this we consider here the equilibrium distribution that a population confined to genotype 2 will have between the states 2A and 2B. Such confinement at genotype 2 can be likened to the theoretical situation in which the mutation rate is much smaller (i.e. by orders of magnitude) than the rate of phenotype switching α. Therefore a population at genotype 2 can be assumed to reach its A . . .

N Simulation runs
Individual trajectories collected Combined trajectories Figure S4: The probability of each step being taken by successful cells. (A) Constructing images that display the probability that each step is taken. By collecting many successful trajectory classes we can compile them into a single image that consists of links of different thicknesses, where the thickness of each link is proportional to the probability that that step is taken by a successful cell. (B) The probability each step is taken as a function of µ and α. Remaining parameters are K = 100, δ = 0.4 and d = 0.1.
equilibrium distribution before the time it takes for the next mutation to occur. In this case the population at states 2A and 2B can be described by the following coupled differential equationṡ where N 2A and N 2B are the population number at states 2A and 2B respectively and ξ = [1 − (N T /K)] ≈ d is the logistic factor. These equations can be solved analytically to find the steady-state values of N 2A and N 2B as a function of α. The solution for the steady state distribution at state 2B, expressed as a frequency n 2B such that n 2B = N 2B /(N 2A + N 2B ), is given by (S3) A plot of both n 2B from Eq. S3 and n 2A = (1 − n 2B ) are shown in Fig. S6. We can see that n 2B reduces monotonically from 1 to the limiting value of 0.5 with increasing α. As this happens the average fitness of the entire population at genotype 2, defined as r 2 = n 2A r 2A + n 2B r 2B , will decrease and it is straightforward to show that Therefore in the limit of large α the equilibrium distribution of the population is an equal distribution between states 2A and 2B. In this limit the selective differences between states 2A and 2B are not experienced by the population and the two phenotype system maps to a single phenotype problem in which the fitness of genotype 2 is equal to the r 2 limit given in Eq. S4.

VII. DIFFERENT MUTATION PROBABILITIES FOR CELLS IN PHENOTYPE A AND B
So far we have considered all cells in our model to have the same mutation probability µ regardless of which phenotype they are in. However, it has been experimentally observed that cells in secondary phenotypic states often have increased rates of mutation [1,12]. Figure S7 shows curves of the mean adaptation time T (α) when cells in phenotype B have mutation probabilities (µ B ) greater than that of cells in phenotype A (µ A ). We observe that for low values of α the three curves converge. This is due to α being small enough that successful cells are quicker to evolve by phenotype A than to await the switching events that make phenotype B accessible, irrespective of the value of µ B . At intermediate α values successful cells travel to the target state primarily by phenotype B, which takes less time the larger µ B is. Therefore we see lower minimum values of T (α) for larger values of µ B . Finally, at the largest α values the two phenotype system maps to that of a single phenotype (as discussed later in Section IX C). In this limit, larger mutation probabilities in either phenotype will result in lower adaptation times.    Figure S7 shows that even when µ B ≫ µ A we can still observe a significant valley in the curve of T (α). For example, when µ B = 10µ A (i.e. the red curve) evolution of cells via phenotype B (i.e. at intermediate α values) remains at least an order of magnitude quicker than in the limits of low and high α. We previously observed in Fig. 2B that the depth of the valley in T (α) increases as the (universal in that case) mutation probability µ decreases. Therefore at the mutation values observed in nature (on average 10 −10 to 10 −9 per replication in E. coli [2,7]), we expect the valley in T (α) to be significant enough to withstand cells in phenotype B alone having an increased mutation rate (by orders of magnitude).

VIII. FIXED INCREMENT EVOLUTION OF THE SWITCHING RATE
In the main text, Figure 6 showed simulation results in which newly created cells could mutate their switching rate α by drawing a new value from a discrete set of allowed switching rates. We observed successful cell trajectories to consist largely of those that evolved switching rates to within the optimal range identified in Fig. 2B.
Here we consider the same set of allowed α values but, upon mutation of α, the new value is chosen as one of the two closest values to the switching rate of the ancestor cell. This corresponds to α evolving in fixed multiplicative steps, with increase or decrease of α being equally likely. The rest of the algorithm is the same as before.
We again collected trajectories of successful cells in the expanded state space, including states with different α, and calculated transition probabilities between any two states. Fig. S8B shows these probabilities as links of different thickness between the states in state space. We can clearly see some new behaviour here. First, successful trajectories are more likely to feature genotype mutations in phenotype A (more red lines) than in Fig. 6B. Second, the probability of crossing the genotype space at large α (in either phenotype) is much smaller due to the increased number of steps it takes a cell to evolve such large α values. Otherwise, Fig. S8 is similar to Fig. 6. This shows that the details of how mutations of α are implemented do not have a significant effect upon the results of our model.

IX. RANDOM EVOLUTION OF THE SWITCHING RATE
In the main text, Figure 6 showed simulation results when cells could mutate the switching rate α of their offspring by drawing a new value from a discrete set of allowed values. Specifically, Fig. 6C showed a bar chart of the probabilities that genotype space was crossed (combining the mutations from genotype 1 to 2 and those from genotype 2 to 3) for each of the possible switching rates α. In Fig. S9 we consider this situation in more detail, looking at what values of α are most likely to be used by successful cells for the two genetic mutations individually. Figures S9A and S9B show bar charts of the probability that successful cells underwent the genotype mutation 1 to 2 and 2 to 3 respectively for each possible α value. The bars are coloured both red and green to indicate the proportion of the probability that is due to the mutations occurring when the cell is in phenotype A or B respectively. Superimposed on both bar charts is a curve of the mean adaptation time T (α) when alpha is constant, for which the optimal α range can be seen.
We can see in both Figs. S9A and S9B that successful cells are most likely to undergo genetic mutations while in phenotype B (due to the bars in both plots being predominantly green). However, successful cells are more likely to undergo genetic mutations in phenotype A during the mutation from genotype 2 to 3 than from genotype 1 to 2 (due to the larger amount of red seen in the bars in Fig. S9B compared with Fig. S9A). This will be due to the mutation from genotype 2 to 3 being the one that reaches the target genotype (i.e. 3), while the mutation from genotype 1 to 2 results in a cell in phenotype A having to exist at the deleterious valley state. Figure 6C showed that the most likely α values for successful cells to cross genotype space with were those corresponding to the optimal α range in the curve of the mean adaptation time T (α). In Figs. S9A and S9B we can see that this is also the case when the genetic mutations are considered individually (the peaks of these bar charts correspond with the minimum range in the curve of T (α)). This is clearest in Fig. S9B where cells with lower α would take too long to switch back to the correct phenotype and will be outcompeted by those in the optimal α range. It is interesting to note that, as we observed in Fig. 6C, there is almost zero probability that cells with very large α values will be successful in the evolutionary process when α is allowed to evolve. This is due to the weakening of the beneficial effect brought to the process by the existence of the second phenotype, as illustrated mathematically in an earlier section.

X. MATHEMATICAL ANALYSIS
In this section we want to show mathematically the existence of the optimal range for the phenotype switching rate α and find the expression for the mean adaptation time T (α). Due to the complexity of the model, the exact calculation of T (α) is not possible. However, we can find an approximate formula that still agrees very well with numerical simulations while at the same time gives insight into the role of various parameters of the model on the optimal range of α and the dip in T (α).
We first note that, as indicated by computer simulations from Fig. 3B, two trajectories are particularly important in the evolution of a cell of type 3A: that which evolves 3A directly via 1A→2A→3A (represented by the symbol in Fig. 3B) and that which avoids state 2A entirely (represented by the symbol ). We will refer to these as trajectories A and B respectively (see Fig. S10 for the networks of states these trajectories occur over), with the corresponding mean adaptation times T A and T B .
At low values of α, if the valley depth δ is large enough for the optimal α range to exist, successful cells will travel to 3A almost exclusively by trajectories A and B (Fig.  3B). This significantly simplifies the calculation of T : if we treat these two "reaction channels" as occurring independently with rates proportional to 1/T A and 1/T B , the mean adaptation time can be approximated by A

Combined trajectories
Phenotype G e n o t y p e S P S r a t e

B
Step in phenotype B Step switching phenotypes Step in phenotype A Initial state Step evolving α   Increasing α results in an increase in the variability of trajectories taken by successful cells as indicated by Fig. 3B. However, in the limit of large α we can approximate the dynamics of the system as occurring in a one-dimensional space of states, with A and B phenotypes mixed together as described in Section VI above. This will lead to another formula T combined for the mean adaptation time.
Mathematical descriptions of how asexual populations acquire a sequence of as few as two mutations are already surprisingly complex and exhibit different dynamical regimes [10] depending on parameters such as mutation probability µ and carrying capacity K, as well as the particular fitness landscape being considered. Here we encounter the same difficulty, with the additional complication caused by phenotypic switching. In general, formulas for T A and T B will be different for different regimes. Here we shall discuss only those regimes that apply to trajectories from Fig. 3B.
In what follows we shall first show how to calculate T A , T B , and T combined . Next, we show how to combine them to obtain a formula for T (α). Finally, we use these formulas to predict the range of optimal α, the minimum T at optimal α, and the phase diagram from Fig. 3B.

A. Low α -stochastic tunnelling via the A trajectory
We assume here that the population size (which is proportional to K) and mutation rate are small enough that the dynamics are stochastic. This is the case for the simulation results presented in Fig. 2B and 3B on which we will focus here. If α is small, the population must evolve via trajectory A. Two regimes are possible in this case: sequential fixation [10,11] and stochastic tunnelling [4,5,9,10].
Sequential fixation (SF) occurs when the total rate of mutant production in the population is very small: µdK ≪ 1. If a mutant lineage is to exist long enough to produce the next mutant in the sequence, it is likely that it has become significant in the population (i.e. it has approximately fixed). Otherwise we expect it to be lost due to random drift, particularly if the mutant state is deleterious. In this regime, the mean adaptation time T is dominated by the waiting time for the population to produce mutants that are likely to fix. Therefore we can ignore the intermediate dynamics of the cells, such as drift times (being the time it takes for the mutant population to reach fixation). In this regime, the population remains highly localized and -over large timescaleshas the appearance of moving collectively from one state to another upon the event of a mutant fixing.
In the stochastic tunnelling (ST) regime, mutants are produced at greater rates than in the SF regime. Therefore cells are likely to be produced further along a sequence of mutations before cells in the earlier states reach fixation. In general, the mean adaptation time in this regime varies inversely with population size due to the increased rate of mutant production at larger K. The mean adaptation times are smaller than in the SF regime and we find that the drift dynamics of the mutant populations must now be incorporated.
We can show that for the parameters we are interested in here (K ∼ 100, d = 0.1, δ = 0.4) the population is unlikely to be in the SF regime. The rate with which mutants are generated at state 2A is approx- is the approximate mean number of organisms at state 1A, d is the replication rate (equal to the death rate in the steady state) and µ is the mutation probability. If the probability P 2A of fixation at 2A is sufficiently small, T A can be approximated by The probability that a mutant fixes in a state with relative fitness disadvantage δ is [3,6] which gives the following formula for P 2A : Inserting the values K = 100, d = 0.1, δ = 0.4, µ = 10 −5 into Eq. (S6) combined with Eq. (S8) we obtain T A ∼ 10 20 . This is much longer than any T from Fig. 2B, and we shall see below that the alternative regime -stochastic tunnelling -gives a much shorter T A .
We proceed now with calculating T A in the stochastic tunnelling (ST) regime. We follow the same approach as in [10] for the general two-step process shown in Fig. S11. The population begins in state i − 1, at time t = 0, and evolves by tunnelling through state i, to produce a cell in the target state i + 1. We want to calculate the expected time it takes for the first cell to be produced at i + 1. A mutant at state i that is termed "successful" is one that will go on to produce offspring in state i+1. Each mutant at i produces an independent lineage of cells that evolves by a continuous time branching process. All mutants therefore have the same probability of being successful Figure S11: A general two step process to acquire a double mutant. Cells in state i − 1 can produce mutants in state i, which can produce mutants in state i + 1. ηj is the rate that a cell in state j produces a mutant in state j + 1.
(note this would not be the case if states had frequencydependent fitnesses). We also exclude the possibility of competing mutant lineages at the same state. For the two-step process in Fig. S11, the expected mean adaptation time T in the stochastic tunnelling regime is Here T i−1→i|suc is the expected time it takes the population at i − 1 to produce the first successful mutant at i, which will dominate T . τ i is the expected time a successful mutant lineage at state i will drift before producing a cell at state i + 1. To determine whether a mutant produced at i is successful we need to calculate P i (t) -the probability that a mutant lineage at i, created at t = 0, is successful (i.e. produced a cell at i + 1) by time t.
The probability of a mutant at i being at all successful is therefore P i = lim t→∞ P i (t). Furthermore if we know P i (t), the drift time τ i can be calculated by Here P i − P i (t) is the probability of the mutant not being successful by time t, and the division by P i ensures that we include only events in which the mutant eventually succeeded. To calculate P i (t) consider a mutant produced at state i. Let n(t) be the number of mutant cells in this lineage at time t, with initial condition n(t = 0) = 1. The size of the lineage fluctuates in time until it dies out (n(t) = 0) or produces a cell in the next mutant state i + 1. In a single realisation of the stochastic process, i.e. one particular trajectory n(t) out of the set of possible trajectories {n(t)}, the probability that the lineage produces the mutant i + 1 in time interval t → t + dt is η i n(t)dt, where η i is the rate with which a cell of type i produces a cell of type i + 1 (see Fig. S11). The waiting time for a cell to be created at i + 1 can be described by an inhomogeneous Poisson process. The probability that a mutant at i + 1 is created by time t is therefore where W (t) = t 0 n(t ′ )dt ′ . If we average (S11) over all possible lineage trajectories {n(t)} we get the following expression for P i (t) Here P i (n, w, t) is the probability density function, such that P i (n, w, t)dt equals the probability that W (t) = w and n(t) = n during the time interval (t, t + dt).
Let us now introduce Φ i (x, y, t) as the Laplace transform of the probability density function P i (n, w, t): (S13) If we can determine Φ i (x, y, t), then Eq. (S12) allows us to calculate P i (t) = 1 − Φ i (x = 0, y = η i , t). The time evolution of P i (n, w, t) obeys the following equation (S14) Here b i and d i are the rates that cells of type i reproduce (i.e. produce offspring of their own type -not mutants) and die respectively. In the limit dt → 0 we obtain the following differential equation for P i (n, w, t): ∂P i (n, w, t) ∂t = (n + 1)d i P i (n + 1, w, t) (S15) If we multiply this by exp(−xn−yw), then integrate over w and sum over n we get This equation can be solved by the method of characteristics to give us where We can now calculate P i (t) from (S17): while the probability of a mutant at i being successful is All terms in Eq. (S9) can now be expressed using Eqs. (S19,S20), giving the mean adaptation time T to be Here we have defined the population size at the initial state i − 1 to be N i−1 . All other quantities in (S21), such as σ ± i depend only on the replication, death, and mutation rates, and not the population size.
We will now apply this method to calculate the mean adaptation time T A for trajectory A in Fig. S10. The mean adaptation time T A is given by Eq. (S21), in which i − 1 corresponds to state 1A and i to state 2A. Therefore we have that in which (identifying the values of the variables in our model to use in Eq. (S21)): where we have used that r 1A = 1, r 2A = (1 − δ) and that most of the population is expected to remain in state 1A in this regime, so N T ≈ N 1A which is approximately given by Eq. (S23). This approximation means that the logistic factor controlling growth is Substituting these values into Eq. (S21) gives us , (S27) and from (S18) we know that . (S29) Since in the limit we are interested here (Kµ ≪ 1), T 1A→2A|suc is much larger than τ 2A we can neglect the last term in Eq. (S22) and write the expresion for T A in the stochastic tunnelling regime: where we have replaced σ − 2A by 1 − µ + µ/δ which is valid in the limit of small µ considered here. As expected, T A from Eq. (S30) is inversely proportional to µ 2 since it requires two mutation steps to reach 3A from 1A. Inserting K = 100, d = 0.1, δ = 0.4, µ = 10 −5 into Eq. (S30) we obtain T A ≈ 7.4 × 10 8 , which is orders of magnitude smaller than the value predicted via the sequential fixation formula (S6).

B. Intermediate α -sequential fixation via the B trajectory
As we increase α, the probability of switching to phenotype B increases and the dynamics of the system is well described by neutral evolution along trajectory B. As before, we consider the small Kµ and Kα limit. This corresponds to the sequential fixation regime mentioned earlier and enables us to approximate the dynamics by the whole population jumping between the sites of trajectory B.
In what follows we neglect population extinction which is very unlikely for K ≫ 10 considered here. The population performs a continuous time, 1D asymmetric random walk, with site-dependent transition rates. The mean time it takes a walker to go from site 0 to M (in which 0 is a reflecting boundary) is [8] Here p i and q i are the rates that in each timestep the walker moves from site i to i + 1 and i − 1 respectively. We can use Eq. (S31) to calculate T B by identifying states i = 0, 1, 2, 3, 4 (M = 4) with our states 1A, 1B, 2B, 3B, and 3A in Fig. S10. We therefore have that where the rate p(0) corresponds to the transition from 1A to 1B, rate p(1) to the transition 1B→2B, etc., while the rate q(1) corresponds to the transition 1B→1A, rate q(2) to 2B→1B, and so on. These rates are given by the rate with which a mutant/opposite phenotype is generated Inserting the above rates into Eq. (S32) we obtain a simple expression for T B in the sequential fixation regime, where we have neglected the last term as being much smaller than the two other terms. Figure S12 compares T B from Eq. (S36) and T A from Eq. (S30). While it takes much longer for the population to travel along trajectory B than A for very small α, the mean time to adaptation T B decreases inversely proportional to α and becomes smaller than T A at some α * . To find α * , we equate T A and T B from Eqs. (S30,S36): from which we obtain that C. Large α -the combined trajectory In the limit of very fast phenotypic switching (large α) we have the situation described in Section VI: the 6-state system reduces to 3 states which now combine phenotypes A and B. Two of the states (1 and 3) have the same growth rates, r 1,comb = r 3,comb = 1 (as we are only interested only in the creation of the first cell at state 3A in the 6 state system). The growth rate of the middle state is where we have used Eq. (S3) for n 2B . Since the combined system has the same structure as trajectory A from Section X A, the system is again in the stochastic tunnelling regime and we can use Eq. (S30) to calculate the mean adaptation time. We only need to replace the depth δ of the valley by the new effective depth 1 − r 2,comb . This gives . (S40) Equation (S40) is valid only if the fitness valley is sufficiently deep. For shallow valleys (1 − r 2,comb 1/K) the dynamics becomes neutral and is better approximated by the sequential fixation formula. The characteristic α + when this happens can be determined from the equation 1 − r 2,comb (α + ) = 1/K which gives Comparison to simulations Figure 3C shows the predicted mean time to adaptation as a function of α. The small and intermediate α formulas have been combined together by means of Eq. (S5) whereas the large α solution (S40) has been plotted separately. The agreement between the analytics and computer simulations is very good, especially for low α values. The position of the optimal α region is also predicted with good accuracy despite the many approximations that we have made.
E. The optimal range of α, the minimum T (α), and the phase diagram To find the small-α boundary of the optimal α regime, we notice in Eq. (S36) that T B reaches a plateau when α ≫ α − , for which Therefore α − is the lower boundary of the dip in T (α). The upper boundary α + is defined by Eq. (S41). The minimum of the mean adaptation time is given by the minimum of T B,ST from Eq. (S36): T min ≈ 5/(dµ). (S43) These mathematical results can be used to understand the three regions seen in Fig. 3B from the main text: • For α ≪ α * = Kdµ 2 , the population evolves along trajectory A and does not switch phenotypes.
• For dµ ≪ α ≪ α + = d/K, the system is in the optimal switching regime. The population evolves almost exclusively along trajectory B, avoiding fitness valley at state 2A.
• For α ≫ d/K phenotypic switching mixes phenotypes A and B. The population evolves along a complicated trajectory that includes both phenotypes.
The boundaries of different regions predicted by the above analytic calculation are in good numerical agreement with the diagram in Fig. 3B for K = 100, δ = 0.4, d = 0.1. In particular, the optimal switching region vanishes for µ ≫ 0.01 as predicted by the equality α * = α + which predicts that µ critical ∝ 1/K.
We note that all calculations presented in this section are not valid when the population size K or the mutation probability µ are very large. In this case the population may be in the deterministic or semi-deterministic regime, and different formulas will apply.