The rich phase structure of a mutator model

Saakian, David B.; Yakushkina, Tatiana; Hu, Chin-Kun

doi:10.1038/srep34840

Download PDF

Article
Open access
Published: 10 October 2016

The rich phase structure of a mutator model

David B. Saakian^1,2,
Tatiana Yakushkina³ &
Chin-Kun Hu^1,4,5

Scientific Reports volume 6, Article number: 34840 (2016) Cite this article

987 Accesses
13 Citations
Metrics details

Subjects

Biological physics

Abstract

We propose a modification of the Crow-Kimura and Eigen models of biological molecular evolution to include a mutator gene that causes both an increase in the mutation rate and a change in the fitness landscape. This mutator effect relates to a wide range of biomedical problems. There are three possible phases: mutator phase, mixed phase and non-selective phase. We calculate the phase structure, the mean fitness and the fraction of the mutator allele in the population, which can be applied to describe cancer development and RNA viruses. We find that depending on the genome length, either the normal or the mutator allele dominates in the mixed phase. We analytically solve the model for a general fitness function. We conclude that the random fitness landscape is an appropriate choice for describing the observed mutator phenomenon in the case of a small fraction of mutators. It is shown that the increase in the mutation rates in the regular and the mutator parts of the genome should be set independently; only some combinations of these increases can push the complex biomedical system to the non-selective phase, potentially related to the eradication of tumors.

Elucidating the correlations between cancer initiation times and lifetime cancer risks

Article Open access 12 December 2019

Hamid Teimouri, Maria P. Kochugaeva & Anatoly B. Kolomeisky

Survival of the simplest in microbial evolution

Article Open access 06 June 2019

Torsten Held, Daniel Klemmer & Michael Lässig

Evolvability-enhancing mutations in the fitness landscapes of an RNA and a protein

Article Open access 19 June 2023

Andreas Wagner

Introduction

The concepts of genome instability^{1,2,3,4,5,6,7,8,9,10,11} and clonal evolutionary dynamics^12,13,14 are key ideas in understanding the causes and behavior of cancer. Genome instability is a hallmark of cancer¹¹, plays a significant role in bacterial evolution^{15,16,17,18,19}, and has been a focus of physics-community research on oncology²⁰. We need more complex nonlinear models^12,13,14,21 than those for the viral evolution^22,23,24,25 for the cancer analysis.

Models with relatively simple evolutionary dynamics can be applied to cancers that arise from one or two driver mutations, e.g., inherited retinoblastoma; however, most cancers require three or more. It has been proposed^1,26 that some cancers may be associated with a substantial increase in the mutation rate of the genome. The question here is how the dynamics and phase structure of the evolving clonal population.

There have been some attempts to construct and solve evolutionary dynamics systems with mutator genes^16,17. The earlier work¹⁶ on this subject, influenced by experimental results on E. coli²⁷, discusses the problem of how selection in natural populations favors more rapid mutation. The authors suggest a simple model with a linear fitness function to identify the conditions that lead to selection for or against mutator states. Some approximate estimates were made for the steady state distribution and the dynamics of the mutator fraction in the population. The later paper¹⁷ used a quasispecies model with a multiplicative fitness function.

Classical quasispecies models^22,23 describe the evolutionary process of an infinite population of binary sequences caused by two forces: mutation and selection; the former is controlled by the mutation rate and the latter by the fitness function. One of the most impressive results for these models is the existence of a phase transition^23,28, known as the “error threshold” in some versions of the fitness landscape. When the mutation rate is small, a selective phase is observed, where sequences organize into a “quasispecies” near the sequence with the maximum fitness value. The non-selective phase observed with higher mutation rates shows dilution throughout the sequence space.

Returning to mutator dynamics, Nagar and Jain¹⁷ first studied the possible phases for these systems. Here, the fraction of mutator genome sequences in the population defines different phases: a normal phase with only normal sequences (a degenerate case when there are no back-mutations in a mutator gene), a mixed phase in which both mutators and non-mutators coexist, and a pure mutator phase with only mutators.

Among other findings, it has been shown that the small mutation rates for the mutator gene lead to the mixed phase with a small fraction of the mutator allele, but an increase in these rates turns the system to the mutator phase. The infinite population model with a constant mutation rate has been solved in ref. 29 (see also the recent review³⁰).

When we map the evolutionary dynamics to statistical physics, the genome length plays the role of lattice size. In statistical physics, different phases arise as different analytical expressions for free energy at a large lattice size. Similarly, we need the large genome length to define the evolutionary dynamics phases. In this paper, we derive these different expressions for the mean fitness in mixed and mutator phases. The most realistic case, when back mutation for the mutator gene can be neglected due to its small impact during the time of the experiment, gives a very simple situation: in a mutator phase in a steady state, there is a zero fraction of normal sequences, even for small genome length, but there is a finite fraction of normal sequences in the mixed phase.

Several different classes of evolution models form the basis of mutator phenomenon analysis:

Class I Phenomenological models^5,6. Such models consider deterministic or stochastic equations for a set of several genotypes. Models in this class are relatively simple but useful for describing the specific types of experiments, e.g., investigations that show how the mutator mechanism competes with the ordinary evolution scheme with a high mutation rate^5,6.

In ref. 5, a simple infinite population model without selection has been exposed, and it has been assumed that cancer arises after a given number of mutations in oncogenes. The probability of the start of cancer development at a fixed time has been calculated for the mutator mechanism as well as without mutators. The accuracy of the derivations has been well controlled using a small parameter: the ratio of the ordinary mutation rate to the mutation rate due to mutators. A different phenomenological model has been proposed without derivation details⁶, which discusses three types of fitness landscapes. This concept is of interest, and we consider one of the fitness landscapes in our work. However, the phenomenological models as a methodology cannot describe the different phases of the evolutionary dynamics accurately.

Class II Infinite population microscopic models. Models in this class are defined in the binary sequence space with a finite genome length^16,17. The Crow-Kimura^18,22,31 and Eigen^23,24,25,32 models, which are the quasispecies models mentioned above, have been widely investigated over recent decades, especially in the context of virus evolution. Mathematically, each of these models is described via a system of nonlinear master equations, which can be mapped to the chain of linear ordinary differential equations with some nonlinear transformation. In the Crow-Kimura setting, mutation and selection are parallel processes, while in the Eigen model, mutation is connected to selection. In smooth fitness landscapes, the steady-state distributions of the Crow-Kimura and Eigen models can be mapped to each other. In the large genome limit, the Crow-Kimura model is equivalent to a discrete-time Eigen model³³ and is very close to branching processes. Moreover, the Wright-Fisher model^34,35 for large populations can be mapped to a discrete-time Eigen model³³.

Class III Finite population microscopic models³⁶. In this case, models are also defined in the binary sequence space³⁶. In ref. 4, numerical simulations have been performed for the Moran model by considering the Muller ratchet in the presence of mutators. In ref. 7, the growth population version of the Wright-Fisher model^34,35 for the four-dimensional landscape has been formulated; there are neutral, deleterious and advantageous mutations, as well as a sufficient number of mutations in the part of the genome that brings about the mutator effect. According to these results, the mutator phenotype evolves only at a moderate level of selection for the advantageous mutations.

Class IV Infinite population and infinite number of genotypes models without backward mutations^29,30. These models are useful to describe the virus evolutionary dynamics for a short period of time. Compared with the microscopic models, which represent different possible phases and collective effects in evolution, class IV models are less fundamental.

In the current paper, we consider the microscopic infinite population models (class II). The discrete-time version of the Eigen model^23,24,25 was originally introduced to describe the self-replication of macromolecules at the origin of life (see also^37,38). In this model, the selection and mutation happen together (mutation directly follows replication), while in the Crow-Kimura model, originally proposed to describe biological speciation, mutation and selection happen in parallel. In both models, the genome is represented by a chain of L letters (genes), which take the values ±1, similar to the Ising spin^39,40. There is a mutation rate of μ per letter. We take the sequence with all “+” letters as the reference sequence. All of the genomes with the same total number l of “−” letters (l point mutations from the reference sequence) have fitness r_l ≡ f(1 − 2l/L), where f(x) as a fitness function, x = 1 − 2l/L. For the initial symmetric distribution, the model can be described via differential equations of L + 1 variables P_l, corresponding to the total probabilities of the sequences with l mutations in the genome. The mean fitness of the Crow-Kimura model has been calculated using algebraic methods⁴¹ and the Hamilton-Jacobi equation (HJE) method^42,43. The exact dynamics had been derived using the HJE⁴⁴. Here, we construct a mutator gene model on the basis of the Crow-Kimura model^18,22,31 and solve it for the general fitness case, by providing also the solution for the Eigen model version of the mutator phenomena.

Our solutions for both models are consistent with each other.

We find the analytical solution for the mean fitness for the general fitness function, including the multidimensional case, and clarify and calculate the order parameters and the phases of the model, which was an open problem until now. Previous work^5,6 has been focused on the comparison between the efficiencies of cancer development via ordinary evolutionary dynamics and the evolutionary dynamics using the mutator mechanism. Our quantitative results of the phase structure (shown in Fig. 1) depict the borders of different phases. The procedure to obtain Fig. 1 is described below.

The phase structure may have biomedical implications. In ref. 9, it was proposed that the transition from the mixed phase to the non-selective phase is associated with the possible eradication of the tumor. We provide the solution for two types of fitness landscapes; a symmetric case, where the fitness landscape depends on the total number of mutations from the reference sequences, and a random fitness landscape.

Key technical terms and notations are listed in Tables 1 and 2, respectively.

Table 1 Terms.

Full size table

Table 2 Notations.

Full size table

Results

The model with a symmetric fitness landscape

In this section, we consider a modification of the Crow-Kimura model^18,22,31 for evolutionary processes under the influence of a mutator gene. The chain of (L + 1) genes defines the genome, which can be formally subdivided into two parts; a regular part with L genes and a mutator gene, whose mutations change the fitness and the mutation rate in the regular part of the genome. We represent each gene with two alleles by ξ_θ = ±1, θ = 0, 1, …, L. The regular set of genes is denoted by the sequences S_i = (ξ₁, …, ξ_L), i = 1, …, 2^L. The mutator gene in the state ξ₀ = +1 determines the normal dynamics of the regular part with a mutation rate μ₁. This wild-type form of the gene corresponds to the most common phenotype in the population. If the state of the mutator gene changes to ξ₀ = −1, then the evolutionary process switches to a different regime with a mutation rate μ₂. The mutant-type gene defines an altered phenotype, which is called a mutator phenotype. We use α₁ and α₂ to designate forward (ξ₀: +1 → −1) and backward (ξ₀: −1 → +1) mutation rates of the mutator gene.

We assume that the fitness landscape is symmetric, which means it depends solely on the number of mutations, l, from the reference sequence (without loss of generality, we take the reference sequence to be S = (+1, …, +1)). For this purpose, it is natural to use the notion of Hamming distance between S and because all sequences in the same Hamming class have the same value of fitness. To facilitate further analysis, we introduce the variable for the mean value, . The state of the system is represented by two probability distributions. We use to denote the relative frequency of normal genome sequences (wild-types) with l mutations (in the l—th Hamming class) at the time moment, t, and for the relative frequency of sequences with an increased mutation rate (mutator-types) with l mutations, and we have a probability balance condition for any time moment, . All admissible transitions between system states can be seen as arrows in Fig. 2 ^45,46.

To describe the evolution of the probability distribution under consideration, we use the following system of 2(L + 1) differential equations:

Here, f(x_l) is a fitness function for the normal genome sequence, and g(x_l) is a fitness function for the mutator sequence. As long as changes in the mutator gene significantly affect both the fitness and mutation rate in the regular part, fitness functions f(x_l) and g(x_l) can be different, and μ₂ is commonly 10–100 times larger than μ₁^2,10. Note that the rates α₁ and μ₁ also differ because that α₁ is proportional to the number of possible mutagenic loci. The coefficients (L − l + 1) and (l + 1) appear in the Eq. (1) transition terms according to the combinatorial formulae for Hamming class probabilities^41,47. To investigate Eq.(1), we consider a nonlinear transformation⁴⁸:

It is easy to show that P_l and Q_l are the solutions of the following system of linear equations:

In Eqs (1) and (2), and are the probabilities under the normalization constraint, while P_l and Q_l of Eq. (3) are not normalized.

In this paper, we focus on the following characteristics of the model in a steady state: the mean fitness R, the total surplus s (the expected value of x_l), the surplus for the wild-type part of the population s₁ and for the mutator-type part s₂, and the fraction of the mutator sub-population q:

The surplus s shows the average number of mutations in the population, The maximum of the distributions P_l and Q_l is attained at points L(1 − s₁)/2 and L(1 − s₂)/2⁴⁴. Therefore, the value of the mean fitness for wild-type sequences is equal to f(s₁), and that for mutator-type sequences is equal to g(s₂), as the distribution is narrow due to the large constant L in the exponent in Eq. (5) below. Assuming a smooth distribution, we obtain the following from Eqs (1) and (2):

Calculating the mean fitness in growing populations is crucial for understanding evolutionary dynamics and is the primary concern of this investigation. Depending on the values of the main model parameters, we can obtain different analytical expressions for the mean fitness, which correspond to different phases of the system. The key problem is to first solve the large L limit, and clarify the following items:

A. Is it possible to have selective and non-selective phases in the system with a phase transition between them?
B. Does the average number of mutations depend on the state of the mutator gene or is it the same in both cases (s₁ = s₂)?
C. Is it possible to have a significant fraction of normal genomes in the population: 1 − q > 0?
D. Are there large finite L corrections for q at L ~ 10000?

These four questions are discussed in this paper; we show that both negative and positive answers exist for each. The second question (B) leads to the most considerable distinction in the analysis. In the Results section, we first investigate the case s₁ = s₂; then, in the Methods section, we describe a more complex situation with s₁ ≠ s₂. Moreover, contrary to the classical evolution models for quasispecies, here, the evolutionary picture depends on the genome length.

The case of forward and backward mutations of the mutator gene

Let us consider a smooth solution of Eq. (3) at the limit L → ∞ after the simplification mentioned in the previous section. We assume the following ansatz^42,44,49:

where we denoted x = 1 — 2l/L for large values of L. This form of expressions for P_l and Q_l allows the application of the HJE method to the system of equations (5) (see Methods). Putting our ansatz (5) into Eq. (3), we get a system of two equations for the variables v₁, v₂, du/dx and du/dt. Looking at these equations as a system of two linear algebraic equations for v₁ and v₂, we get a zero determinant condition for the non-zero solution of v₁ and v₂, so the final equation contains only du/dt and du/dx. The fact that L disappears in the Hamilton-Jacobi equation in the large L limit shows the correctness of our ansatz. We derive the mean fitness investigating the Hamiltonian without solving it and calculating the function u(x) in the steady state.

We obtain the following formulae for the mean fitness R using the potential function V_±⁴⁴:

Here, we use μ₁ = 1, μ₂ = μ, and . The numerical simulation for the system (1) supports the analytical result (6) well, as seen from Table 3.

Table 3 The comparison of the results for f(x) = g(x) = 3x²/2, μ₁ = 1, μ₂ = μ, α₁ = α₂ = 1, L = 400.

Full size table

Having the expression for the mean fitness R, we can calculate the surplus of distribution s from the following equation (see Methods):

At the time limit t → ∞, we denote v_k(x) ≡ v_k(x, t), k = 1,2.

It is important to note that Eq. (7) is valid whether s₁ = s₂ or not. However, for α₁ · α₂ ≠ 0, we have that s₁ = s₂. In this case, substituting the solution of R and s, we get the following system of equations for the surplus s and ratio v₂(s)/v₁(s) in the steady state (see Methods) :

In the limit of large t and L, we have q = v₂(s)/(v₁(s) + v₂(s)). For the case f(x) ≡ g(x), we get a simple relationship from (8):

The model with uni-directional mutations of the mutator gene

The probability of back mutation (from mutant to wild-type) is small and can be neglected in finite populations. Let us generalize this idea, and consider the infinite population model with α₂ = 0, α₁ = a (μ₁ = 1, μ₂ = μ). In this case, the expressions obtained for potential functions V_±(x) are still valid, but it is necessary to take into account both branches of the potential, V₊ and V₋, and define the mean fitness as their maxima.

Taking V₋(x) in Eq. (6), we get the mutator phase (here and below, we use terminology from¹⁷):

The latter equation for the surplus has the same form as for the ordinary Crow-Kimura model⁴¹. It can be derived directly from system (1) if we assume that the vast majority of the population has the mutator allele and omit the contribution of P_l in the second equation of Eq. (1). In the mutator phase, the mean fitness is defined by the fitness landscape for the mutator genome. Further, we will consider the case when two fitness landscapes are identical, e.g., f(x) = g(x).

Taking V₊(x) in Eq. (6) under the condition α₂ = 0, α₁ = a, μ₁ = 1, μ₂ = μ leads to the following expression for the mean fitness of the mixed phase (see Methods):

In the mixed phase, the mean fitness is defined by the wild-type fitness function and the mutation rate. However, numerical calculations for the linear fitness landscapes f(x) = kx or simple form quadratic fitness landscapes f(x) = kx²/2 show that the vast majority of the population has the mutator allele for a large enough genome length. It is worth pointing out that this result (the majority of population has a mutator-type) is correct for any smooth fitness landscapes for a long enough genome length. We verify that for the single peak fitness function, there is a finite fraction of wild-type sub-population at the large genome limit. Such results will be presented below.

Two phases have been found for the discrete-time Eigen model in ref. 17, where the mean fitness has been calculated for f(x) = k(x − 1), k ≫ 1. This model can be exactly mapped into the Crow-Kimura model at the large L limit, when the mutation rate per gene is fixed. Figure 3 gives a comparison of our analytical results for the mean fitness R with the numerics. Figure 3 shows that our analytical results are quite reliable.

The single peak model with uni-directional mutations of the mutator gene

Let us consider a single peak fitness function, f(x = 1) = g(x = 1) = J and f(x ≠ 1) = g(x ≠ 1) = 0. We have three possible phases in a steady state (Fig. 1). We consider either the steady state of Eq. (1), or the asymptotic solution of Eq. (3), replacing ²⁵. We provide the following expressions for the mean fitness in the mutator phase R_mu, mixed phase R_mix and non-selective phase R_ns; see Fig. 1:

We first calculate P₀ considering the equation for dP₀/dt in Eq. (1) and ignore P₁ term. The fraction of the wild-type sequences in the l-th Hamming class in the population is P_l = P₀(1/J)^l, which can be derived from Eq. (55) of ref. 50; therefore, for the fraction of the population with a normal mutator gene, we have . Considering analogously the equation for dQ₀/dt and ignoring the Q₁ term, we obtain . Conversely, the definition of R gives P₀ + Q₀ = R_mix/J ≡ (J − a − 1)/J. Thus, we derive for P₀ the following equation:

This expression allows us to calculate the mutator allele probability q using the equivalence :

At L = 5000, the accuracy of our analytical result is approximately 0.1%, as seen in Fig. 4. Figure 4 and Eq. (14) show that the fraction of the wild allele does not approach to 0 even for a large L.

The model with uni-directional mutations of the mutator gene and a smooth fitness landscape

As previously mentioned, we have the mixed and mutator phases. In the mutator phase, the numerics give that the mutator sub-population dominates with q = 1, even for finite L, so we can ignore the first chain of transitions and reduce the system to the standard Crow-Kimura model with mutation rate μ₂ and fitness function g(x).

Let us now consider the mixed phase. If α₁ = a, α₂ → 0, then Eq. (8) gives q → 1 for this case. Assuming an ansatz , we obtain:

The smooth curves in Figs 5 and 6 obtained by numerical calculations illustrate that the 1 − q strongly depends on the genome length, L. For the small genome length L, Fig. 7 obtained by numerical calculations support the behavior

As has been derived in ref. 29, for the large L, we get 1 − q ≪ 1. In the Methods section, we calculate (1 − q) for the system with a general smooth symmetric fitness function and parameters a, μ. For linear fitness functions f(x) = kx and small values of a and μ₁/μ₂ ≡ 1/μ, we obtain a simple expression:

This result is well supported by the numerics, as seen in Fig. 5 and Table 4 for 1 − q ≪ 1, when the exponent is a large negative number. The transition between two regimes (16) and (17) is for the values of L where the expression of the exponent in Eq. (17) is ~1.

Table 4 The results for f(x) = g(x) = kx, μ₁ = 1, μ₂ = μ, α₁ = a, α₂ = 0.

Full size table

In population genetics, the fitness difference for one mutation is denoted by S and corresponds to the linear coefficient k; U is the total number of mutations per generation. We use the equalities U = εμ₁, S = ε2k/L and h = εα₁, where ε is the discrete time step of one generation and h is the transition probability to the mutator allele during one generation. Applying the discrete-time Eigen model to Crow-Kimura model mapping, we get a condition for the exponent:

Otherwise, the numerics give q ≪ 1 instead of Eq. (17).

The linear fitness case can be solved exactly. We introduce the generation functions Q(z) = ∑_lQ_lz^l and P(z) = ∑_lP_lz^l, so rewriting Eq. (1) with these variables gives a system of equations for the generation functions:

It is easy to calculate R from the first equation with the constraint that P(z) is an L-th order polynomial. Another constraint is the probability balance condition, P(1) + Q(1) = 1. Solving this system, we can calculate P_l, Q_l with relative accuracy . The generation function method has been applied before for investigation of the Crow-Kimura model⁵¹.

The Eigen model with a mutator gene and random fitness landscape

The probability distribution for the Eigen model is defined for genome sequences with the wild and mutator type (p_i and q_i, respectively, 0 ≤ i < 2^L). Consider now the following system of equations

where r_i is the fitness function, Q_ij and are probabilities of mutation from S_j to S_i, Q_ij = w^{L−d(i, j)}(1 − w)^{d(i, j)}, w is the probability of errorless replication per nucleotide for sequences with normal mutator gene and 1 − e^−h ≈ h is the transition probability from the wild type to the mutator type. The diagonal terms of the mutation matrix are Q_ii = w^L ≡ Q ≡ e^−γ, where γ = −Lln(w) ≈ L(1 − w) is the mutation parameter in the Eigen model, , is the probability of errorless replication per nucleotide for sequences with a mutator allele. The mutation parameter for the mutator type is . Here, d(i, j) is the Hamming distance between two sequences, S_i and S_j.

For the single peak fitness function with r₀ = A and r_i = 1, i ≥ 1 the mean fitness is similar to the one obtained for the Crow-Kimura model: QAe^−h for the mixed phase, for the mutator phase and 1 for the non-selective phase. The random fitness landscape is a reasonable approximation for real biological data⁵². However, it was shown that it is almost equivalent to the single peak fitness landscape³³. If we assume a log-normal distribution of Wrightian fitnesses, r_i:

then, we have the maximal value of fitness ³³, while for the majority of the population, fitness is equal to 1. Using the formulae for the single peak fitness case from the Methods section, we obtain:

where we dropped the small term, h/μ. The steady state distribution is the same for the continuous-time Eigen model (20) and for the discrete-time Eigen model. For small values of h ≪ 1, U ≡ γ = L(1 − q) ≪ 1 and (A − 1) ≡ S ≪ 1, we get:

Equation (23) coincides with the Crow-Kimura model result in Eq. (14) after mapping τJ = S, U = τ, h = aτ, where τ is the growing time scale of the replicator, and we identified the Crow-Kimura model dynamics after the time τ with the dynamics of the discrete time Eigen model³³.

It is interesting that there is no μ dependence in our expression. Taking the case of strong selection, s ≫ U, we obtain the situation when q ≪ h/U, as in the case of refs 53 and 54. According to ref. 53 (Table 1), we have U = 0.003, q = 0.00001 and h/U = 0.0001, while for ref. 54 (see abstract), we have q = 3/10⁵, h = 5/10⁶. The case of large q = 1/408 = 0.245%⁵⁵, p. 109, perhaps is related to the smooth fitness landscape.

Discussion

In this work, we have investigated the mutator phenomenon in the evolutionary process in a stable environment for infinite population size. Previous studies have obtained some results for a microscopic model with uni-directional mutation (wild-type → mutator-type) and linear Malthusian fitness in a 1-dimensional landscape. The mutator and mixed phases have been considered theoretically in the literature, so we continued to focus on the phase structure in this research. We have calculated exactly, for the first time, the phase structure of the model for the general fitness function, the analytical expression for the mean fitness for any scheme of mutations between normal and mutator states in the d-dimensional fitness landscape (see Methods). Moreover, we have discussed the important case of the random fitness landscape. We have derived the formula for the mutator-type probability, q, for the large genome length. This mathematical problem is highly non-trivial, as it is non-perturbative via the value of the back mutation rate: see Eqs (8) and (17). Equation (17) shows that the mutator-type takes over the entire population for the model with a linear fitness function. This effect has been observed experimentally^27,56.

In ref. 17, an expression for the mutator allele probability has been derived when the mutator-type is the minority and the fitness function is linear. Our results confirm these findings¹⁷; furthermore, our method provides an analytical solution for the model in this case. When condition (18) is broken, the wild-type takes over the population, which has also been proven experimentally⁵⁴. The value of q depends on the genome length (q ≪ 1 or 1 − q ≪ 1; see Fig. 5), and for the surplus, both alternatives hold (s₁ = s₂ or s₁ ≠ s₂). Thus, the finite size corrections could be large, even for the large genome length L ~ 10⁴.

A careful investigation of the evolutionary models is essential for biological applications. Looking at possible phases with order parameters is needed to provide an appropriate level of accuracy. As long as cancer is assumed to be a collective phenomenon, even with some collective intellect⁵⁷, it is necessary to use analytical methods for complex systems. Let us compare our results with those by Desai and Fisher²⁹ who solved a phenomenological model, while we rigorously solved the microscopic model. For “small” genome sizes, the mutator fraction is proportional to the small parameter α, as has been found in ref. 29. What we claim is q → 1 for sufficiently large genome length for any smooth fitness function. The latter case is a non-perturbative phenomenon, and it is not represented in the approach of²⁹, as they neglected the backward mutation between different classes of wild-types. The approach by Desai and Fisher²⁹ cannot be applied for the steady state of our model, as it is completely wrong for that case; however, Desai and Fisher obtained²⁹ a useful (especially for the cancer case) analytical estimate for the short-time dynamics for the involved case of general fitness function.

Theoretical results concerning the error threshold⁵⁸ have a practical impact for viruses and cancer, and it has been recently suggested to use an error catastrophe as a new therapeutic strategy for the solid cancer treatment¹⁰. A phase structure shown in Fig. 1 qualitatively describes the different stages of cancer; the mixed phase s ≠ 0 characterizes the early non-aggressive version of tumor, and the mutator phase is an aggressive stage of cancer, possibly with metastasis. To eradicate the tumor according to the strategy suggested in ref. 9, it should be transferred to the non-selective phase s = 0. As seen in Fig. 1, there are four different situations. In the mixed II subphase, we can push the tumor to the non-selective phase (the eventual goal of error-catastrophe therapy) to increase the forward mutation rate for the mutator gene α₁, which is responsible for the genome stability. In the mutator I subphase, we should increase the mutation rate μ to push the tumor to the non-selective phase. In the mutator II and mixed I subphases, we need to increase both versions of mutation rates. The phase structure has the same form for the random fitness case and a similar form for other fitness landscapes.

Further research should focus on the finite population version of the mutator model, as the population size can drastically affect the evolutionary dynamics⁸. The problem has already attracted attention¹⁹, but it is reasonable first to solve a simpler case of infinite population limit.

To summarize, let us directly reply to the four questions stated in the beginning of the Results section.

A. There is a phase transition between two phases, selective and non-selective, for the random fitness landscape and for the last two types of fitness landscapes considered in ref. 6 (the fitness landscape is flat until some critical number of mutations, then jumps and later is flat again; the fitness decreases linearly until some number of mutations and then jumps). For the linear fitness landscape, there is no phase transition.
B. For small enough genome length s₁ ≠ s₂, see the Methods section for a detailed explanation.
C. For a small genome length, it is possible to have some non-zero fraction of wild-types (see the Methods section).
D. There are large finite L corrections for q; see Figs 5 and 6.

Methods

To perform the analytical investigation Eq. (3), we use the HJE approach. Ignoring O(1/L) correction terms and using Eq. (5) and the formulae P_l±1 = v₁e^Lu±2u′ and Q_l±1 = v₂e^Lu±2u′ in Eq. (3), we get:

Here, we denoted and . We consider Eq. (24) as a homogeneous system of equations for v₁ and v₂, so its determinant is equal to 0 for the non-zero solution of v₁ and v₂. First, we find from the latter condition and write an equation, , where H_± is a Hamiltonian. Then, we derive the potential function as

For the general case we have the following Hamiltonian:

Deriving V_±(x) according to Eqs (25) and (26), we get Eq. (6) to calculate the mean fitness R.

At the maximum point x = s of the distribution, we take in Eq. (26) to obtain Eq. (7) for the surplus s. Putting to Eq. (24) the expression for the R, and looking the point x = s, we derive Eq. (8).

If f(x) = g(x), α₁ = a, α₂ = 0 (μ₁ = 1, μ₂ = μ), then, taking into account the ansatz P_l = e^Lu(x,t), we derive an equation:

and the second equation for the mixed phase in Eq. (11): f(s) = R.

The mutator-type probability

Consider the general case s₁ ≠ s₂ and parameters α₁ = a, α₂ = 0, μ₁ = 1, μ₂ = μ and f(x) = g(x). For Q_l ~ P_l, we consider the second equation in (1) with the substitution P_l = e^Lu(x,t) in a steady state. We derive an equation for Q_l:

We can write an approximate solution:

If P_l ≪ Q_l, then, substituting , we get the following equation for :

The expression for s₂ can be derived from R = f(s₂). We took the solution (27) at the interval [s₃, 1] (where s₃ is a switching point between s₁ and s₂) and the solution (30) at the interval [s₁, s₃]. We use the concatenation of these two solutions, assuming the smoothness of the derivative , thus we obtain the following equation for s₃:

Putting a condition u(s₁) = 0 (which is equivalent to ∑_lP_l = 1) and , l₀ = (1 + s₂)L/2, we get:

Here, u′ and are calculated using Eqs (27) and (30), respectively. The expression in the exponent in Eq. (32) is exact at the limit L → ∞.

These analytical results allow us to predict, which allele takes over the population. The analysis of Table 4 reveals, that if , then the mutator-type dominates. If 1 − K ≪ 1, then the wild-type takes over the population.

The case of small mutation rate for mutator allele a

Let us suppose that forward mutation rate a is sufficiently small. This assumption leads to the following equations:

From (31), we obtain

We are looking the case of large μ, therefore, we can replace s₃ by s₁:

Thus

For the linear fitness f(x) = kx, we get Eq. (17):

This solution is correct when the following condition is met:

The mean fitness for the model with d-dimensional fitness landscape and mutator gene

The multidimensional model has been discussed in ref. 19, where two parts of genome were considered; the first part is with lethal mutations and the second part is with deleterious and advantageous mutations. The genome is formally fractured into d parts with the lengths Ly_i. For the wild-type, the fitness is equal to ∑_if(x_i), and for the mutator-type, the fitness is ∑_ig(x_i), where x_i = 1−2l_i/(Ly_i) and l_i is the number of “−” spins (alleles) in the i-th part of genome. We denote mutation rates as μ_i for the wild-type and ν_i for the mutator-type, see ref. 49 for the multidimensional fitness model without a mutator gene. For these modified expressions, we calculate the mean fitness R as a maximum of the potential function V_±, defined as:

Here α₁, α₂ have the same meaning as in Eq. (1). The model considered in ref. 19 corresponds to the values: d = 2, f₁(1) = g₁(1) = 0, f₁(x) = g₁(x) = −∞ for −1 ≤ x < 1 and f₂(x) = g₂(x) = k₂(1 − x), for the case of finite population.

The Eigen setting of the mutator model

Consider the first equation in the system of Eq. (20) as an equation for the eigenvalue R. Compared with the ordinary Eigen model, the matrix of the linear system is multiplied by e^−h. Thus, we can directly use the results from⁵⁸ and write the expression for the mean fitness of the mixed phase for the fitness function r_i = f(x):

For the single peak fitness, we obtain for the selective phase⁵⁸:

For the non-selective phase, we simply have R = 1.

In the mutator phase, we ignore the first chain of equations in Eq. (20) and get R as the mean fitness of the Eigen model with the value:

Consider now the distribution of population in the single peak fitness case. Ignoring the back mutations for the equation for q₀ (because they give O(1/L) terms), we obtain:

As all the sequences, besides p₀, q₀, have a fitness value 1, we have an equation (q₀ + p₀)A + (1 − q₀ − p₀) = R. Therefore

We derive eventually:

In the ordinary Eigen model

This equation is valid in our case as all the equations in the first chain of Eq. (20) are derived from the ordinary model⁵⁸ by multiplying on e^−h, moreover, we have that Re^h = QA. These two equations give together Eq. (22) of the text.

For the discrete-time Eigen model with a time step n, we have the probabilities p_i(n), q_i(n) and the following iteration equations:

Additional Information

How to cite this article: Saakian, D. B. et al. The rich phase structure of a mutator model. Sci. Rep. 6, 34840; doi: 10.1038/srep34840 (2016).

References

Loeb, L. A., Springgate, C. F. & Battula, N. Errors in DNA replication as a basis of malignant change. Cancer Res. 34, 2311–2321 (1974).
CAS PubMed Google Scholar
Loeb, L. A., Loeb, K. R. & Anderson, J. P. Multiple mutations and cancer. Proc. Natl. Acad. Sci. USA 100, 776–781 (2003).
ADS CAS PubMed PubMed Central Google Scholar
Lynch, M. The cellular, developmental and population-genetic determinants of mutation-rate evolution. Genetics 180, 933 (2008).
PubMed PubMed Central Google Scholar
Soderberg, R. J. & Berg, O. G. Kick-starting the ratchet: the fate of mutators in asexual populations. Genetics 187, 1129 (2011).
PubMed PubMed Central Google Scholar
Beckman, R. A. & Loeb, L. A. Efficiency of carcinogenesis with and without a mutator mutation. Proc. Natl. Acad. Sci. USA 103, 14140–14145 (2006).
ADS CAS PubMed PubMed Central Google Scholar
Beckman, R. A. Mutator Mutations Enhance Tumorigenic Efficiency across Fitness Landscapes. PLoS One 4, e5860 (2009).
ADS PubMed PubMed Central Google Scholar
Datta, R. S. et al. Modelling the evolution of genetic instability during tumour progression. Evolutionary Applications 6, 20–33 (2013).
CAS Google Scholar
Raynes, Y., Gazzara, M. R. & Sniegowski, P. D. Contrasting dynamics of a mutator allele in asexual populations of different size. Evolution 66, 2329–2334 (2012).
PubMed PubMed Central Google Scholar
Fox, E. J. & Loeb, L. A. Lethal mutagenesis: targeting the mutator phenotype in cancer. Semin Cancer Biol. 20, 353–359 (2010).
CAS PubMed PubMed Central Google Scholar
Loeb, L. A. Human cancers express mutator phenotypes: origin, consequences and targeting. Nature Review Cancers 11, 450–457 (2011).
CAS Google Scholar
Hanahan, D. & Weinberg, R. A. The hallmarks of cancer. Cell 100, 57 (2000).
CAS PubMed Google Scholar
Nowell, C. P. The clonal evolution of tumor cell populations. Science 194, 23–28 (1976).
Article ADS CAS PubMed Google Scholar
Merlo, L. M. F., Pepper, J. W., Reid B. J. & Maley, C. C. Cancer as an evolutionary and ecological process. Nature Review Cancers 6, 924–935 (2006).
CAS Google Scholar
Greaves, M. & Maley, C. C. Clonal evolution in cancer. Nature 481, 306–311 (2012).
ADS CAS PubMed PubMed Central Google Scholar
Taddei, F. et al. Role of mutator alleles in adaptive evolution. Nature 387, 700–702 (1997).
ADS CAS PubMed Google Scholar
Kessler, D. & Levine, H. Mutator dynamics on a smooth evolutionary landscape. Phys. Rev. Lett. 80, 2012 (1998).
ADS CAS Google Scholar
Nagar A. & Jain, K. Exact phase diagram of quasispecies model with a mutator rate modifier. Phys. Rev. Lett. 102, 038101 (2009).
ADS PubMed Google Scholar
Baake, E., Baake, M. & Wagner, H. Quantum Chain is Equivalent to a Model of Biological Evolution. Phys. Rev. Lett. 78, 559 (1997).
ADS CAS Google Scholar
Wylie, C. S., Ghim, C. M., Kessler, D. & Levine, H. The Fixation Probability of Rare Mutators in Finite Asexual Populations. Genetics 181, 1595–1612 (2009).
CAS PubMed PubMed Central Google Scholar
Michor, F., Liphardt, J., Ferrari, M. & Widom, J. What does physics have to do with cancer? Nature Reviews Cancer 11, 657–670 (2011).
CAS PubMed PubMed Central Google Scholar
Sole, R. V. Phase transitions in cancer. in Onofrio, d’. et al. eds New Challenges for Cancer Systems Biomedicine (Springer-Verlag, Berlin, 2013).
Crow, J. F. & Kimura, M. An Introduction to Population Genetics Theory (Harper Row, NY, 1970).
Eigen, M. Selforganization of matter and the evolution biological macromolecules. Naturwissenschaften 5, 465–523 (1971).
ADS Google Scholar
Eigen, M. & Shuster, P. The hypercycle: A principal of natural selforganization (Springer, Berlin-Heidelberg, 1979).
Eigen, M., McCascill, J. & Schuster, P. The Molecular Quasi-Species. Advances in Chemical Physics 75, 149–263 (1989).
CAS Google Scholar
Ao, P. Orders of magnitude change in phenotype rate caused by mutation, Cellurar Oncology 29, 67–69 (2007).
CAS Google Scholar
Sniegowski, P. D., Gerrish, P. J. & Lenski, R. E. Evolution of high mutation rates in experimental populations of E. coli. Nature 387, 703–705 (1997).
ADS CAS PubMed Google Scholar
Biebricher, C. K. & Eigen, M. The error threshold. Virus research 107(2), 117–127 (2005).
CAS PubMed Google Scholar
Desai, M. M. & Fisher, D. S. The balance between mutators and nonmutators in asexual populations. Genetics 188, 997–1014 (2011).
PubMed PubMed Central Google Scholar
Kussell, E. & Vucelja, M. Non-equilibrium physics and evolution-adaptation, extinction, and ecology: a Key Issues review. Rep. Prog. Phys. 77, 102602 (2014).
ADS CAS PubMed Google Scholar
Saakian, D. B. & Hu, C.-K. Solvable biological evolution model with parallel mutation-selection scheme, Phys. Rev. E 69, 046121 (2004).
ADS Google Scholar
Saakian, D. B. & Hu, C.-K. Eigen model as a quantum spin chain: exact dynamics. Phys. Rev. E 69, 021913 (2004).
ADS Google Scholar
Kirakosyan, Z., Saakian, D. B. & Hu, C.-K. Evolution models with lethal mutations on symmetric or random fitness landscapes. Phys. Rev. E 82, 011904 (2010).
ADS Google Scholar
Fisher, R. A. The genetical theory of natural selection (Claredon press, Oxford, 1930).
Wright, S. Evolution in Mendelian populations. Genetics 16, 97 (1931).
CAS PubMed PubMed Central Google Scholar
Davies, P. C. W. & Lineweaver, C. H. Cancer tumors as Metazoa 1.0: tapping genes of ancient ancestors. Phys. Biol. 8, 015001 (2011).
ADS CAS PubMed PubMed Central Google Scholar
Saakian, D. B., Biebricher, C. K. & Hu, C.-K. Lethal Mutants and Truncated Selection Together Solve a Paradox of the Origin of Life. PLoS One 6, e21904 (2011).
ADS CAS PubMed PubMed Central Google Scholar
Kirakosyan, Z., Saakian, D. B. & Hu, C.-K. Eigen model with correlated multiple mutations and solution of error catastrophe paradox in the origin of life. J. Phys. Soc. Jpn. 81, 114801 (2012).
ADS Google Scholar
Brush, S. G. History of the Lenz-Ising Model. Rev. Mod. Phys. 39, 883–893 (1967).
ADS CAS Google Scholar
Hu, C.-K. Historical Review on Analytic, Monte Carlo, and Renormalization Group Approaches to Critical Phenomena of Some Lattice Models. Chin. J. Phys. 52, 1–76 (2014).
MathSciNet Google Scholar
Baake E. & Wagner, H. Mutation-selection models solved exactly with methods of statistical mechanics. Genet. Res. 78, 93–117 (2001).
CAS PubMed Google Scholar
Saakian, D. B. A new method for the solution of models of biological evolution: Derivation of exact steady-state distributions. J. Stat. Phys. 128, 781 (2007).
ADS MathSciNet MATH Google Scholar
Sato, K. & Kaneko, K. Evolution equation of phenotype distribution: General formulation and application to error catastrophe. Phys. Rev. E 75, 061909 (2007).
ADS MathSciNet Google Scholar
Saakian, D. B., Rozanova, O. & Akmetzhanov, A. Exactly solvable dynamics of the Eigen and the Crow-Kimura model. Phys. Rev. E 78, 041908 (2008).
ADS MathSciNet Google Scholar
Yakushkina, T., Saakian, D. B., Bratus, A. & Hu, C.-K. Evolutionary games with randomly changing payoff matrices. J. Phys. Soc. Japan 84, 064802 (2015).
ADS Google Scholar
Yakushkina, T., Saakian, D. B. & Hu, C.-K. Exact Dynamics for a Mutator Gene Model. Chin. J. Phys. 53, 100904 (2015).
MathSciNet Google Scholar
Woodcock, G. & Higgs, P. G. Population evolution on a multiplicative single-peak fitness. J. Theor. Biol. 179, 61–73 (1996).
CAS PubMed Google Scholar
Thompson, C. J. & McBride, J. L. On The Eigen’s theory of selforganization of matter. Math. Biosci. 21, 127 (1974).
MathSciNet MATH Google Scholar
Saakian, D. B. & Galstyan, V. Dynamics of the Chemical Master Equation, a strip of chains of equations in d-dimensional space. Phys. Rev. E 86, 011125 (2012).
ADS Google Scholar
Saakian, D. B., Hu, C.-K. & Khachatryan, H. Solvable biological evolution models with general fitness functions and multiple mutations in parallel mutation-selection scheme. Phys. Rev. E 70, 041908 (2004).
ADS MathSciNet Google Scholar
Bratus, A. S., Novozhilov, A. S. & Semenov, Y. S. Linear algebra of the permutation invariant Crow-Kimura model of prebiotic evolution. Math. Biosciences 256, 42–57 (2014).
MathSciNet MATH Google Scholar
Sanjuan, R., Moya, A. & Elena, S. F. The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus. Proc. Natl. Acad. Sci. USA 101, 8396 (2004).
ADS CAS PubMed PubMed Central Google Scholar
Ninio, J. Transient Mutators: A Semiquantitative Analysis of the Influence of Translation and Transcription Errors on Mutation Rates. Genetics 129, 957 (1991).
CAS PubMed PubMed Central Google Scholar
Boe, L., Danielsen, M., Knudsen, S., Petersen, J. B., Maymann, J. & Jensen P. R. The frequency of mutators in populations of Escherichia coli. Mutation Research 448, 47–55 (2000).
CAS PubMed Google Scholar
Gross, M. D. & Siegel, E. C. Incidence of mutator strains in Escherichia coli and coliforms in nature, Mutat. Res. 91, 107–110 (1981).
CAS PubMed Google Scholar
Chao, L. & Cox, E. C. Competition between high and low mutating strains of Escherichia coli. Evolution 37, 125–134(1983).
PubMed Google Scholar
Tarabichi, M. et al. Systems biology of cancer: entropy, disorder, and selection-driven evolution to independence, invasion and swarm intelligence. Cancer Metastasis Rev. 32, 4031 (2013).
Google Scholar
Saakian D. B. & Hu, C.-K. Exact solution of the Eigen model with general fitness functions and degradation rates. Proc. Natl. Acad. Sci. USA 103, 4935 (2006).
ADS CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

D.B.S. thanks R. Sole for the discussion. We thank Jonathan Dushoff for a critical reading of this paper. D.B.S. and C.-K.H. were thankful to Taiwan-Russia collaborative research grant with Grant Number 101-2923-M-001 -003 -MY3. DBS was supported by Grant MOST 104-2811-M-001-102 and MOST 105-2811-M-001 -078. C.-K.H. was supported by MOST 104-2112-M-001 -002 and MOST 105-2112-M-001 -004.

Author information

Authors and Affiliations

Institute of Physics, Academia Sinica, Nankang, 11529, Taipei, Taiwan
David B. Saakian & Chin-Kun Hu
A.I. Alikhanyan National Science Laboratory (Yerevan Physics Institute) Foundation, Alikhanian Brothers St. 2, Yerevan, 375036, Armenia
David B. Saakian
National Research University Higher School of Economics, Myasnitskaya Street 20, Moscow, 101000, Russia
Tatiana Yakushkina
National Center for Theoretical Sciences, National Tsing Hua University, Hsinchu, 30013, Taiwan
Chin-Kun Hu
Business School, University of Shanghai for Science and Technology, Shanghai, 200093, China
Chin-Kun Hu

Authors

David B. Saakian
View author publications
You can also search for this author in PubMed Google Scholar
Tatiana Yakushkina
View author publications
You can also search for this author in PubMed Google Scholar
Chin-Kun Hu
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

D.B.S. and C.-K.H. designed and performed the research. T.Y. did numerical calculations and participated in the research implementation. All authors wrote and reviewed the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Reprints and permissions

About this article

Cite this article

Saakian, D., Yakushkina, T. & Hu, CK. The rich phase structure of a mutator model. Sci Rep 6, 34840 (2016). https://doi.org/10.1038/srep34840

Download citation

Received: 12 November 2015
Accepted: 15 September 2016
Published: 10 October 2016
DOI: https://doi.org/10.1038/srep34840

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

The rich phase structure of a mutator model

Subjects

Abstract

Similar content being viewed by others

Elucidating the correlations between cancer initiation times and lifetime cancer risks

Survival of the simplest in microbial evolution

Evolvability-enhancing mutations in the fitness landscapes of an RNA and a protein

Introduction

Results

The model with a symmetric fitness landscape

The case of forward and backward mutations of the mutator gene

The model with uni-directional mutations of the mutator gene

The single peak model with uni-directional mutations of the mutator gene

The model with uni-directional mutations of the mutator gene and a smooth fitness landscape

The Eigen model with a mutator gene and random fitness landscape

Discussion

Methods

The mutator-type probability

The case of small mutation rate for mutator allele a

The mean fitness for the model with d-dimensional fitness landscape and mutator gene

The Eigen setting of the mutator model

Additional Information

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Ethics declarations

Competing interests

Rights and permissions

About this article

Cite this article

Comments

Search

Quick links

Subjects

Abstract

Similar content being viewed by others

Elucidating the correlations between cancer initiation times and lifetime cancer risks

Survival of the simplest in microbial evolution

Evolvability-enhancing mutations in the fitness landscapes of an RNA and a protein

Introduction

Results

The model with a symmetric fitness landscape

The case of forward and backward mutations of the mutator gene

The model with uni-directional mutations of the mutator gene

The single peak model with uni-directional mutations of the mutator gene

The model with uni-directional mutations of the mutator gene and a smooth fitness landscape

The Eigen model with a mutator gene and random fitness landscape

Discussion

Methods

The mutator-type probability

The case of small mutation rate for mutator allele a

The mean fitness for the model with d-dimensional fitness landscape and mutator gene

The Eigen setting of the mutator model

Additional Information

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Ethics declarations

Competing interests

Rights and permissions

About this article

Cite this article

Share this article

Comments

Search

Quick links