Optimal Down Regulation of mRNA Translation

Down regulation of mRNA translation is an important problem in various bio-medical domains ranging from developing effective medicines for tumors and for viral diseases to developing attenuated virus strains that can be used for vaccination. Here, we study the problem of down regulation of mRNA translation using a mathematical model called the ribosome flow model (RFM). In the RFM, the mRNA molecule is modeled as a chain of $n$ sites. The flow of ribosomes between consecutive sites is regulated by $n+1$ transition rates. Given a set of feasible transition rates, that models the outcome of all possible mutations, we consider the problem of maximally down regulating the translation rate by altering the rates within this set of feasible rates. Under certain conditions on the feasible set, we show that an optimal solution can be determined efficiently. We also rigorously analyze two special cases of the down regulation optimization problem. Our results suggest that one must focus on the position along the mRNA molecule where the transition rate has the strongest effect on the protein production rate. However, this rate is not necessarily the slowest transition rate along the mRNA molecule. We discuss some of the biological implications of these results.


INTRODUCTION
Gene expression is the process by which the genetic code inscribed in the DNA is transformed into proteins. The process consists of four main steps: transcription of a DNA gene into an mRNA molecule, translation of the mRNA molecule to a protein, degradation of mRNA molecules, and degradation of proteins. During mRNA translation, macromolecules called ribosomes move unidirectionally along the mRNA molecule, decoding it codon by codon into a corresponding chain of amino acids that is folded to become a functional protein. Translation is a fundamental biological process, and understanding and re-engineering this process is important in many scientific disciplines including medicine, evolutionary biology, and synthetic biology [1].
New methods that measure gene-specific translation activity at the whole-genome scale, like polysome profiling [2] and ribosome profiling [3], have led to a growing interest in mathematical models for translation. Such models can be used to integrate and explain the rapidly accumulating biological data as well as to predict the outcome of various manipulations of the genetic machinery. Recent methods that allow real-time imaging of translation on a single mRNA transcript in vivo (see, e.g. [4], [5], [6], [7]) are expected to provide even more motivation for developing and analyzing powerful dynamical models of translation.
Down-regulation of translation is important in cell biology, medicine, and biotechnology. For example, in many organisms small RNA genes, such as microRNAs, hybridize to the mRNA in specific locations [8], [9] in order to down-regulate translation initiation or elongation [10], [11] and/or promote mRNA degradation. Alterations in the expression of microRNA genes contribute to the pathogenesis of most, if not all, human malignancies [12], and many times cancer cells are targeted via generating tumor specific RNA interference (RNAi) genes that down-regulate the oncogenes [13], [14], [15]. Furthermore, many viral therapeutic treatments and viral vaccines are based on the attenuation of mRNA translation in the viral genes [16], [17], [18], [19], [20]. Down regulation of mRNA translation in an optimal manner is also [...]
Fig. 1. The RFM models unidirectional flow along a chain of $n$ sites. The state variable $x_i(t) \in [0, 1]$ represents the density of site $i$ at time $t$. The parameter $\lambda_i > 0$ controls the transition rate from site $i$ to site $i + 1$, with $\lambda_0 > 0$ [$\lambda_n > 0$] controlling the initiation [exit] rate. The output rate at time $t$ is $R(t) = \lambda_n x_n(t)$.
Here we study for the first time optimal down regulation of translation in a dynamical model of translation. A standard model for translation is the totally asymmetric simple exclusion process (TASEP) [24], [25]. In this model, particles hop randomly along an ordered lattice of sites. Simple exclusion means that a particle cannot hop into a site that is occupied by another particle. This models hard exclusion between the particles, and creates an indirect coupling between the particles. Indeed, if a particle remains in the same site for a long time then all the particles preceding this site cannot move forward leading to a "traffic jam".
In the context of translation, the lattice is the mRNA molecule; the particles are the ribosomes; and hard exclusion means that a ribosome cannot move forward if the codon in front of it is covered by another ribosome. In the homogeneous TASEP (HTASEP) all the transition rates within the lattice are assumed to be equal and normalized to 1, and thus the model is specified by an input rate α, an exit rate β, and an order N denoting the number of sites in the lattice. TASEP is a fundamental model in non-equilibrium statistical mechanics that has been used to model numerous natural and artificial processes including traffic flow, surface growth, communication networks, evacuation dynamics and more [26], [27].
The ribosome flow model (RFM) [28] is a nonlinear, continuous-time compartmental model for the unidirectional flow of "material" along a chain of $n$ consecutive compartments (or sites). It can be derived via a mean-field approximation of TASEP [26], [29]. In the RFM, the state variable $x_i : \mathbb{R}_+ \to [0, 1]$, $i = 1, \dots, n$, describes the normalized amount (or density) of "material" in site $i$ at time $t$, where $x_i(t) = 1$ [$x_i(t) = 0$] indicates that site $i$ is completely full [completely empty] at time $t$. Thus, the vector $x(t) := [x_1(t) \ \dots \ x_n(t)]'$ describes the density profile along the chain at time $t$. A parameter $\lambda_i > 0$, $i = 0, \dots, n$, controls the transition rate from site $i$ to site $i + 1$, where $\lambda_0$ [$\lambda_n$] is the initiation [exit] rate (see Fig. 1). The output rate at time $t$ is $R(t) = \lambda_n x_n(t)$. In the context of translation, the "material" is the moving ribosomes, and each site represents a group of codons, i.e. the mRNA is coarse-grained into $n$ consecutive sites of codons. Thus, $R(t)$, the output flow of ribosomes at time $t$, is the protein production rate at time $t$. It is known that the RFM admits a unique steady-state production rate denoted by $R = R(\lambda)$ [30], where $\lambda := [\lambda_0 \ \dots \ \lambda_n]'$.
Here, we use the RFM to analyze how to maximally down-regulate mRNA translation. To do this, we formulate the following general optimization problem. Given an mRNA molecule with $n$ sites, and a convex and compact region of feasible transition rates $\Omega^{n+1}$, find a vector $\lambda^* \in \Omega^{n+1}$ such that $R(\lambda^*) = \min_{\lambda \in \Omega^{n+1}} R(\lambda)$. In other words, the problem is how to select transition rates, within a feasible region, such that the production rate is minimized (see Fig. 2). To the best of our knowledge, this is the first time that such a problem is analyzed in a dynamical model of mRNA translation.
Fig. 2. The problem we consider is how to efficiently select transition rates along the mRNA molecule, within a given set of possible rates, such that the protein production rate is minimized. In practice, translation rate modification can be done by introducing mutations into the gene or by designing a corresponding RNAi molecule.
As a concrete example, consider an RFM with dimension $n$ and rates $\bar\lambda_0, \dots, \bar\lambda_n$. Given a "total reduction budget" $b \in [0, \min_i\{\bar\lambda_i\}]$, define the feasible set $\Omega^{n+1} \subset \mathbb{R}^{n+1}$ by
$$\Omega^{n+1} := \Big\{\lambda \in \mathbb{R}^{n+1} : \lambda_i \le \bar\lambda_i,\ i = 0, \dots, n,\ \ \sum_{i=0}^{n} (\bar\lambda_i - \lambda_i) = b\Big\}.$$
In other words, the feasible set is the set of all the rates obtained by applying a "total reduction budget" $b$ to the rates of the given mRNA molecule. The question is how to distribute the total reduction budget over the rates so as to obtain the minimal possible protein production rate. We prove that:
• If some rate $\bar\lambda_k$ is a "bottleneck" rate, in a sense that will be made precise below, then an optimal reduction in protein production rate is obtained by using all the reduction budget $b$ to further decrease $\bar\lambda_k$;
• If all the given rates are equal, i.e. $\bar\lambda_0 = \cdots = \bar\lambda_n$, then the transition rate at the middle of the mRNA molecule is the bottleneck rate, and thus an optimal reduction in protein production rate is obtained by using all the reduction budget to reduce this transition rate.
Thus, in this case there exists a single site such that mutating it yields the maximal inhibition of translation. Our results allow us to determine where this site is located.
The remainder of this paper is organized as follows. We first briefly review some known results on the RFM that are needed for our purposes. The following section poses the problem of down-regulating the steady-state protein production rate in the RFM in an optimal manner, and then describes our main results. Analysis of the RFM is non-trivial, as this is a nonlinear dynamical model. In particular, the mapping from λ to R(λ) is nonlinear and does not admit a closed-form expression. Nevertheless, by combining tools from convex optimization and eigenvalue sensitivity theory, we show that this optimization problem is tractable in some cases, and rigorously prove several results that have interesting biological implications. The final section summarizes and describes several directions for further research. To increase the readability of this paper, all the proofs are placed in the Appendix.

RIBOSOME FLOW MODEL
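The dynamics of the RFM with $n$ sites is given by $n$ nonlinear first-order ordinary differential equations:
$$\begin{aligned} \dot x_1 &= \lambda_0 (1 - x_1) - \lambda_1 x_1 (1 - x_2), \\ \dot x_2 &= \lambda_1 x_1 (1 - x_2) - \lambda_2 x_2 (1 - x_3), \\ &\ \,\vdots \\ \dot x_n &= \lambda_{n-1} x_{n-1} (1 - x_n) - \lambda_n x_n. \end{aligned} \tag{1}$$
If we define $x_0(t) := 1$ and $x_{n+1}(t) := 0$ then (1) can be written more succinctly as
$$\dot x_i = \lambda_{i-1} x_{i-1} (1 - x_i) - \lambda_i x_i (1 - x_{i+1}), \quad i = 1, \dots, n. \tag{2}$$
Eq. (2) can be explained as follows. The flow of material from site $i$ to site $i + 1$ is $\lambda_i x_i(t) (1 - x_{i+1}(t))$. This flow is proportional to $x_i(t)$, i.e. it increases with the density at site $i$, and to $(1 - x_{i+1}(t))$, i.e. it decreases as site $i + 1$ becomes fuller. This corresponds to a "soft" version of a simple exclusion principle. Note that the maximal possible flow from site $i$ to site $i + 1$ is the transition rate $\lambda_i$. Let $x(t, a)$ denote the solution of (1) at time $t \ge 0$ for the initial condition $x(0) = a$. Since the state variables correspond to normalized density levels, with $x_i(t) = 0$ [$x_i(t) = 1$] representing that site $i$ is completely empty [full] at time $t$, we always assume that $a$ belongs to the closed $n$-dimensional unit cube: $C^n := \{x \in \mathbb{R}^n : x_i \in [0, 1],\ i = 1, \dots, n\}$. It is straightforward to verify that $\partial C^n$ is repelling, i.e. if $a \in \partial C^n$ then $x(t, a) \in \operatorname{int}(C^n)$ for all $t > 0$, so $C^n$ and also $\operatorname{int}(C^n)$ are invariant sets for the dynamics.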
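The flow description above translates directly into a short simulation. The following is a minimal sketch, not the authors' code: it integrates the RFM for $n = 2$ and unit rates with a classical fourth-order Runge-Kutta scheme from two different initial conditions in $C^n$. The function names, the step size, and the time horizon are illustrative assumptions.

```python
def rfm_rhs(x, rates):
    """Right-hand side of the RFM: net flow into each of the n sites."""
    n = len(x)
    padded = [1.0] + list(x) + [0.0]      # x_0 := 1 and x_{n+1} := 0
    # flow[i] = lambda_i * x_i * (1 - x_{i+1}) for i = 0..n
    flow = [rates[i] * padded[i] * (1.0 - padded[i + 1]) for i in range(n + 1)]
    return [flow[i] - flow[i + 1] for i in range(n)]


def simulate(rates, x0, t_end=50.0, dt=0.01):
    """Integrate the RFM with the classical fourth-order Runge-Kutta scheme."""
    x = list(x0)
    for _ in range(int(round(t_end / dt))):
        k1 = rfm_rhs(x, rates)
        k2 = rfm_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k1)], rates)
        k3 = rfm_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k2)], rates)
        k4 = rfm_rhs([xi + dt * k for xi, k in zip(x, k3)], rates)
        x = [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x


rates = [1.0, 1.0, 1.0]                       # lambda_0, lambda_1, lambda_2
x_from_empty = simulate(rates, [0.0, 0.0])    # start with an empty chain
x_from_full = simulate(rates, [1.0, 1.0])     # start with a full chain
output_rate = rates[-1] * x_from_empty[-1]    # R = lambda_n * x_n
print(x_from_empty, x_from_full, output_rate)
```

Both runs should end at the same density profile, which illustrates that the long-run behavior does not depend on the initial condition.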
An important property of the RFM is the symmetry between the "particles" (i.e. ribosomes) moving from left to right and "holes" (i.e. "lack" of ribosomes) moving from right to left (in the TASEP literature, this property is sometimes referred to as the "particle-hole" symmetry). Indeed, let $q_j(t) := 1 - x_{n+1-j}(t)$, $j = 1, \dots, n$. Then
$$\dot q_j = \lambda_{n+1-j}\, q_{j-1} (1 - q_j) - \lambda_{n-j}\, q_j (1 - q_{j+1}), \quad j = 1, \dots, n,$$
with $q_0 := 1$ and $q_{n+1} := 0$. This is another RFM, but now with rates $\lambda_n, \dots, \lambda_0$.
The RFM has been used to model and analyze the flow of ribosomes along the mRNA molecule during the process of mRNA translation. The (soft) simple exclusion principle corresponds to the fact that ribosomes have volume and cannot overtake one another.
It is important to mention that it has been shown in [28] that the correlation between the production rate based on modeling using RFM and using TASEP over all S. cerevisiae endogenous genes is 0.96. In addition, it has also been shown there that the RFM model agrees well with biological measurements of ribosome densities. Furthermore, it was also shown that the RFM model predictions correlate well (correlations up to 0.6) with protein levels in various organisms (e.g. E. coli, S. pombe, S. cerevisiae). Given the high levels of bias and noise in measurements related to gene expression and the inherent stochasticity of intracellular biological processes (see e.g. [31], [32]), these correlation values demonstrate the relevance of the RFM in this context.

A. Steady-State Spectral Representation
Ref. [30] has shown that the RFM is a tridiagonal cooperative dynamical system [33], and that (1) admits a unique steady-state point $e = e(\lambda_0, \dots, \lambda_n) \in \operatorname{int}(C^n)$ that is globally asymptotically stable, that is, $\lim_{t \to \infty} x(t, a) = e$ for all $a \in C^n$ (see also [34]). This means that the ribosomal density profile always converges to a steady-state profile that depends on the rates, but not on the initial condition. In particular, the output rate $R(t) = \lambda_n x_n(t)$ converges to a steady-state value $R := \lambda_n e_n$.
At steady-state (i.e., for $x = e$), the left-hand side of all the equations in (1) is zero, so
$$\lambda_i e_i (1 - e_{i+1}) = R, \quad i = 0, \dots, n, \tag{3}$$
where $e_0 := 1$ and $e_{n+1} := 0$. Ref. [35] used these expressions to provide a spectral representation of the mapping from the set of rates $\lambda$ to the steady-state output rate $R$. Let $\mathbb{R}^n_+ := \{y \in \mathbb{R}^n : y_i \ge 0,\ i = 1, \dots, n\}$ and $\mathbb{R}^n_{++} := \{y \in \mathbb{R}^n : y_i > 0,\ i = 1, \dots, n\}$.
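The steady-state equations can also be solved numerically without the spectral representation. The sketch below is an illustrative alternative, not the method of Ref. [35]: for a candidate value of $R$ it back-substitutes $e_n, e_{n-1}, \dots, e_1$ from $\lambda_i e_i (1 - e_{i+1}) = R$, and then uses bisection on $R$ to satisfy the remaining equation $\lambda_0 (1 - e_1) = R$. The function name and iteration count are assumptions.

```python
def steady_state(rates, iters=200):
    """Return (R, [e_1, ..., e_n]) for rates lambda_0..lambda_n by bisection on R."""
    n = len(rates) - 1

    def densities(R):
        # Back-substitute e_n, ..., e_1 from lambda_i * e_i * (1 - e_{i+1}) = R.
        e = [0.0] * (n + 2)               # e[n+1] = 0 by definition
        for i in range(n, 0, -1):
            e[i] = R / (rates[i] * (1.0 - e[i + 1]))
            if e[i] >= 1.0:               # density left (0, 1): R is too large
                return None
        e[0] = 1.0
        return e

    lo, hi = 0.0, min(rates)              # R is smaller than every rate
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        e = densities(mid)
        # Remaining equation: lambda_0 * (1 - e_1) = R.
        if e is not None and rates[0] * (1.0 - e[1]) > mid:
            lo = mid                      # candidate R is too small
        else:
            hi = mid                      # candidate R is too large
    return lo, densities(lo)[1:n + 1]


R, e = steady_state([1.0, 1.0, 1.0])
print(R, e)
```

The bisection is valid because each $e_i$ obtained by back-substitution increases with the candidate $R$, so the residual $\lambda_0 (1 - e_1) - R$ is decreasing in $R$.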

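Theorem 1 [35] Given an RFM with rates $\lambda_0, \dots, \lambda_n$, define the $(n + 2) \times (n + 2)$ symmetric tridiagonal matrix
$$A := \begin{bmatrix} 0 & \lambda_0^{-1/2} & 0 & \cdots & 0 \\ \lambda_0^{-1/2} & 0 & \lambda_1^{-1/2} & \cdots & 0 \\ 0 & \lambda_1^{-1/2} & 0 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & \lambda_n^{-1/2} \\ 0 & \cdots & 0 & \lambda_n^{-1/2} & 0 \end{bmatrix}. \tag{5}$$
Then:
1) The eigenvalues of $A$ are real and distinct, and if we order them as $\zeta_1 < \cdots < \zeta_{n+2}$ then $\zeta_{n+2} > 0$ and $R = (\zeta_{n+2})^{-2}$.
2) Let $v \in \mathbb{R}^{n+2}_{++}$ denote an eigenvector of $A$ corresponding to the eigenvalue $\zeta_{n+2}$. Then
$$\frac{\partial}{\partial \lambda_i} R = \frac{2 R^{3/2}}{\lambda_i^{3/2}\, v'v}\, v_{i+1} v_{i+2}, \quad i = 0, \dots, n. \tag{6}$$
This means that the steady-state production rate, and its sensitivity with respect to the transition rates, can be computed efficiently using numerical algorithms for computing the eigenvalues and eigenvectors of tridiagonal matrices. Theorem 1 also implies that $R(c\lambda_0, \dots, c\lambda_n) = cR(\lambda_0, \dots, \lambda_n)$, for all $c > 0$, i.e. $R(\lambda)$ is homogeneous of degree one. Another important implication of Theorem 1 is that $R$ is a strictly concave function of the transition rates $\{\lambda_0, \dots, \lambda_n\}$ over $\mathbb{R}^{n+1}_{++}$ [35]. Also, it implies that $\frac{\partial}{\partial \lambda_i} R > 0$ for all $i$, that is, an increase in any of the rates yields an increase in the steady-state production rate.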

MAIN RESULTS
We begin by posing a general minimization problem for the steady-state production rate in the RFM.

Problem 1 Given a convex and compact feasible set of transition rates $\Omega^{n+1} \subset \mathbb{R}^{n+1}_{++}$, find a vector $\lambda^* \in \Omega^{n+1}$ such that $R(\lambda^*) = \min_{\lambda \in \Omega^{n+1}} R(\lambda)$.
From the biological point of view, the feasible set of transition rates $\Omega^{n+1}$ depends on all the biophysical constraints on the transition rates along the coding sequence. Examples include the maximal/minimal decoding rate of a codon (e.g. via its adaptation to the tRNA pool) [42], the maximal possible effect of mRNA folding (after codon substitution) on each codon [43], the maximal possible effect (after amino acid substitution) of the interaction of the ribosome with amino acids of the nascent peptide [44], and the maximal elongation slow down due to interaction with microRNAs [8], [9].
Below we explain how to pose various interesting biological problems in the framework of Problem 1. Examples include finding the minimal number of mutations that down regulate translation of a gene/mRNA under a certain "total reduction budget". This is practically important when we use costly (in terms of time and money) gene editing approaches. Another related question is how to down regulate translation of a gene/mRNA with a maximal number of mutations. This is important when attenuating viral replication rate for generating a safe live attenuated vaccine. A large number of mutations reduces the probability of reverting. One may also define the feasible set in Problem 1 in such a way that some rates cannot be changed. This is relevant for example when some codons along the mRNA cannot be modified. Indeed, various positions along the mRNA affect regulatory mechanisms that we may not want to alter (e.g. co-translational folding, splicing, translation).
It is well-known (see, e.g. [45, Thm. 7.42]) that if $f : \Omega^{n+1} \to \mathbb{R}$ is a continuous and strictly convex function defined over a convex and compact set $\Omega^{n+1}$ then all the maximizers of $f$ over $\Omega^{n+1}$ are extreme points of $\Omega^{n+1}$ (for more on the problem of maximizing a convex function, or equivalently, minimizing a concave function, see e.g. [46]). Combining this with the fact that $R$ is a strictly concave function of the transition rates over $\mathbb{R}^{n+1}_{++}$ implies the following.

Proposition 1 Every solution of Problem 1 is an extreme point of $\Omega^{n+1}$.
In particular, if the set of extreme points of $\Omega^{n+1}$ is finite then one can always solve Problem 1 by simply calculating $R(\lambda)$ for all $\lambda$ that are extreme points of $\Omega^{n+1}$, and then finding the minimum of these values. For example, if $\Omega^{n+1}$ is a convex polytope then the extreme points are just the vertices of $\Omega^{n+1}$. Thus, when the biophysical constraints lead to a feasible set of rates that is a convex polytope, it is computationally straightforward to determine how to modify the rates so as to obtain the largest decrease in translation rate under reasonable biophysical constraints.
In the remainder of this section, we consider three special cases of Problem 1 for which it is also possible to obtain analytic results.
In other words, $\Omega^{n+1}$ is the set of all the rates that can be obtained by applying a total reduction $b$ to the given rates $\bar\lambda_i$. From a mathematical point of view, $b$ provides a bound on the total possible rate reduction. It also couples the reduction in different rates, as a larger reduction in one rate must be compensated by smaller reductions in other rates so that the total reduction will not exceed $b$. From a synthetic biology point of view, $b$ can be used to capture the idea of maximally inhibiting the production rate while minimizing the side-effects of this down regulation. For example, a very small value of $b$ forces a solution with small modifications in all the rates. This is of course expected to minimize the effect of the mutations on the fitness of the cell/organism. For example, since co-translational folding [47], [48], [49] is related to the ribosome transition rates along the mRNA, smaller changes in the rates are expected to have a smaller effect on protein folding (and thus on the functionality of the protein and the overall organismal fitness). Smaller changes in the transition rates are also related to a "simpler" biological solution in the sense of fewer mutations, fewer miRNAs, etc. The next example demonstrates Problem 2.
[...] with $R(\lambda^*) = 0.2140$. Note that this corresponds to subtracting $b$ from the rate $\bar\lambda_3$, which is the minimum of all the rates $\bar\lambda_i$, leaving all the other rates unchanged.
Let $d^i \in \mathbb{R}^{n+1}$ denote the $(i + 1)$'th column of the $(n + 1) \times (n + 1)$ identity matrix. The set $\Omega^{n+1}(\bar\lambda, b)$ is a convex polytope with vertices: $v^i := \bar\lambda - b\, d^i$, $i = 0, \dots, n$. If there exists an index $i$ such that $\bar\lambda_i = b$ then it is clear that an optimal solution is to reduce $\bar\lambda_i$ to 0, as then the steady-state production rate will be zero. So we always assume that $b$ takes values in the set $[0, \min_i\{\bar\lambda_i\} - \rho]$, for some $\rho > 0$. This means that Problem 2 is a special case of Problem 1, as $\Omega^{n+1}(\bar\lambda, b)$ is a convex polytope contained in $\mathbb{R}^{n+1}_{++}$. By Prop. 1, every solution of Problem 2 is contained in the set $\{v^0, \dots, v^n\}$. In other words, every minimizer corresponds to subtracting all the available budget $b$ from a single rate. This immediately yields a simple and efficient algorithm for solving Problem 2: use the spectral representation of $R$ to compute $R(v^i)$, $i = 0, \dots, n$, and then find the minimum of all these values. Since the matrix $A$ in (5) is symmetric and tridiagonal, calculating $R(v^i)$ can be done efficiently even for large values of $n$. We wrote a simple (and unoptimized) MATLAB script for solving Problem 2, and ran it on a Mac laptop with a 2.6 GHz Intel Core i7 processor. For an RFM with $n = 500$ (a typical coding region includes a few hundred codons [50]), rates $\bar\lambda_i = 1$, $i = 0, \dots, 500$, and $b = 0.1$, the optimal solution is found in 3.14 seconds.
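The vertex-enumeration algorithm for Problem 2 can be sketched as follows. This is not the MATLAB script mentioned above: it is a small Python illustration in which $R$ is evaluated by bisection on the steady-state equations rather than by the spectral representation, and the names `steady_state_rate` and `solve_problem2` are hypothetical.

```python
def steady_state_rate(rates, iters=100):
    """Steady-state production rate R, by bisection on the steady-state equations."""
    n = len(rates) - 1
    lo, hi = 0.0, min(rates)
    for _ in range(iters):
        R = 0.5 * (lo + hi)
        e_next, feasible = 0.0, True          # e_{n+1} := 0
        for i in range(n, 0, -1):             # e_i = R / (lambda_i * (1 - e_{i+1}))
            e_next = R / (rates[i] * (1.0 - e_next))
            if e_next >= 1.0:                 # density outside (0, 1): R too large
                feasible = False
                break
        if feasible and rates[0] * (1.0 - e_next) > R:
            lo = R                            # lambda_0 * (1 - e_1) > R: R too small
        else:
            hi = R
    return 0.5 * (lo + hi)


def solve_problem2(base_rates, b):
    """Evaluate R at each vertex (budget b taken from one rate) and keep the minimum."""
    best_index, best_rate = None, float("inf")
    for i in range(len(base_rates)):
        vertex = list(base_rates)
        vertex[i] -= b                        # vertex i: base rates minus b * d^i
        rate = steady_state_rate(vertex)
        if rate < best_rate:
            best_index, best_rate = i, rate
    return best_index, best_rate


best_index, best_rate = solve_problem2([1.0, 1.0, 1.0], 0.1)
print(best_index, best_rate)
```

For three equal rates and $b = 0.1$ the minimum is attained by reducing the middle rate, in line with the equal-rates result discussed later in this section.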
Example 1 may suggest that reducing the slowest transition rate by b always yields an optimal solution, but in general this is not true (see Example 3 below).
One may also consider a different feasible set in Problem 2, namely,
$$\Big\{\lambda \in \mathbb{R}^{n+1} : \lambda_i \le \bar\lambda_i,\ i = 0, \dots, n,\ \ \sum_{i=0}^{n} (\bar\lambda_i - \lambda_i) \le b\Big\},$$
i.e. here the total reduction is up to $b$. However, by Theorem 1, $\frac{\partial}{\partial \lambda_i} R(\lambda) > 0$ for all $i$, and thus an optimal solution for this problem is guaranteed to agree with an optimal solution of Problem 2.
The next example demonstrates the effect of increasing the total reduction $b$ on the optimal solution of Problem 2.
[...] $\Delta R := R(\bar\lambda) - R(\lambda^*)$, that is, the optimal reduction in the protein production rate that can be obtained for various values of $b$. Figure 3 depicts $\Delta R$ as a function of $b$. It may be seen that $\Delta R$ increases quickly with $b$ (specifically, the relation is superlinear).

B. Optimal reduction and sensitivities
It is also possible to derive theoretical results on the structure of an optimal solution $\lambda^*$ in Problem 2 using the sensitivities $s_i(\lambda) := \frac{\partial}{\partial \lambda_i} R(\lambda)$. Note that these can be computed efficiently using (6).

Proposition 2 Consider Problem 2.
If there exist $i, j \in \{0, \dots, n\}$ such that $s_i(\bar\lambda) < s_j(\bar\lambda)$ then any optimal solution $\lambda^*$ satisfies $\lambda^*_i = \bar\lambda_i$. In other words, if the sensitivity of the steady-state production rate to rate $\lambda_i$ at $\bar\lambda$ is lower than some other sensitivity then an optimal solution will not include a reduction in $\bar\lambda_i$. Indeed, it is better to distribute the reduction budget over some other, more sensitive, rates.

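Remark 1 Note that since $R$ is a strictly concave function of the rates,
$$s_i(\bar\lambda - \varepsilon d^i) > s_i(\bar\lambda) \quad \text{for all } \varepsilon \in (0, \bar\lambda_i),$$
for any $\bar\lambda \in \mathbb{R}^{n+1}_{++}$ and any $i \in \{0, \dots, n\}$. In other words, a decrease in $\bar\lambda_i$ increases the sensitivity w.r.t. this rate.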
Proposition 2 leads to the following definition.

Definition 1
Given an RFM with rates $\bar\lambda$, a transition rate $\bar\lambda_j$ is called a bottleneck rate if $s_j(\bar\lambda) > s_i(\bar\lambda)$, for all $i \ne j$.
In other words, a bottleneck rate is one with a maximal sensitivity.
Combining this with Proposition 2 immediately yields the following result.

Corollary 1
Given an RFM with rates $\bar\lambda$, suppose that $\bar\lambda_j$ is a bottleneck rate. Then the unique optimal solution to Problem 2 is obtained by reducing $\bar\lambda_j$ by $b$.
An important observation is that the slowest rate along the mRNA molecule and the bottleneck rate may be different. The next example demonstrates this.
However, note that Remark 1 implies that if some rate $\lambda_i$ is decreased enough then it will eventually become a bottleneck rate.
Proposition 2 can be used to derive analytic results in cases where we can obtain explicit information on the sensitivities at a point $\bar\lambda \in \mathbb{R}^{n+1}_{+}$. The next two results demonstrate this.
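The difference between the slowest rate and the bottleneck rate can be checked numerically on a toy instance. The rates $[0.95 \ 1 \ 1]'$ below are a hypothetical illustration chosen for this sketch, not an example taken from the paper, and the sensitivities are approximated by central finite differences instead of the spectral formula (6).

```python
def steady_state_rate(rates, iters=100):
    """Steady-state production rate R, by bisection on the steady-state equations."""
    n = len(rates) - 1
    lo, hi = 0.0, min(rates)
    for _ in range(iters):
        R = 0.5 * (lo + hi)
        e_next, feasible = 0.0, True          # e_{n+1} := 0
        for i in range(n, 0, -1):             # e_i = R / (lambda_i * (1 - e_{i+1}))
            e_next = R / (rates[i] * (1.0 - e_next))
            if e_next >= 1.0:
                feasible = False
                break
        if feasible and rates[0] * (1.0 - e_next) > R:
            lo = R
        else:
            hi = R
    return 0.5 * (lo + hi)


def sensitivities(rates, h=1e-6):
    """Central finite-difference approximation of dR/dlambda_i for every i."""
    sens = []
    for i in range(len(rates)):
        up, down = list(rates), list(rates)
        up[i] += h
        down[i] -= h
        sens.append((steady_state_rate(up) - steady_state_rate(down)) / (2.0 * h))
    return sens


rates = [0.95, 1.0, 1.0]                      # hypothetical rates, n = 2
sens = sensitivities(rates)
slowest = min(range(len(rates)), key=lambda i: rates[i])
bottleneck = max(range(len(sens)), key=lambda i: sens[i])
print(slowest, bottleneck, [round(s, 4) for s in sens])
```

In this instance the slowest rate is the initiation rate, while the largest sensitivity belongs to the middle rate, so the two notions point to different positions.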
In other words, in the case where all the rates are equal, the bottleneck is at the center of the chain. These results are closely related to the fact that in a dynamic model for phosphorelay [51], that is very similar to the RFM, the middle layer in the model is the most sensitive to changes in the input. This also agrees with the so called "edge-effect" in the HTASEP [52], [53], [54], i.e. the fact that the steady-state output rate is less sensitive to the rates that are close to the edges of the chain. For more on the sensitivity of TASEP to manipulations in the initiation, hopping, and exit rates, see [54], [55], [56], [57].
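The location of the bottleneck for equal rates can be illustrated with a few lines of code. The sketch assumes the standard fact that, for a tridiagonal matrix of size $m$ with zero diagonal and equal off-diagonal entries, the positive eigenvector has entries $\sin(k\pi/(m+1))$, $k = 1, \dots, m$. By (6), the sensitivities for equal rates are then proportional to the products of consecutive entries. The function names are hypothetical.

```python
import math


def relative_sensitivities(n):
    """Values proportional to s_0, ..., s_n for an RFM with n sites and equal rates."""
    m = n + 2                                           # size of the matrix A
    v = [math.sin(k * math.pi / (m + 1)) for k in range(1, m + 1)]
    # s_i is proportional to v_{i+1} * v_{i+2} (1-based), i.e. v[i] * v[i+1] here.
    return [v[i] * v[i + 1] for i in range(n + 1)]


def bottleneck_indices(n, tol=1e-12):
    """Indices i whose sensitivity is maximal (up to a small tolerance)."""
    s = relative_sensitivities(n)
    top = max(s)
    return [i for i, si in enumerate(s) if top - si < tol]


even_case = bottleneck_indices(4)   # n even: a single middle rate
odd_case = bottleneck_indices(5)    # n odd: two middle rates tie
print(even_case, odd_case)
```

For even $n$ the maximum is at $i = n/2$; for odd $n$ the two middle indices tie, which matches the two cases treated in the proof in the Appendix.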

Proposition 4
Consider an RFM with dimension $n$ and rates $\bar\lambda$ such that $\bar e_1 = \cdots = \bar e_n := e_c$, i.e. all the steady-state occupancies are equal, and $e_c$ denotes their common value.
1) If $e_c < 1/2$ then the unique optimal solution to Problem 2 is
$$\lambda^* = \bar\lambda - b\, d^0. \tag{10}$$
2) If $e_c > 1/2$ then the unique optimal solution to Problem 2 is
$$\lambda^* = \bar\lambda - b\, d^n. \tag{11}$$
3) If $e_c = 1/2$ then (10) and (11) are the optimal solutions.
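Proposition 4 can be probed numerically. The sketch below builds rates with equal steady-state occupancies from the steady-state equations (with $\bar\lambda_0$ scaled to one, so that $\bar\lambda_i = 1/e_c$ for the internal rates and $\bar\lambda_n = (1 - e_c)/e_c$), and then locates the most sensitive rate by finite differences. The helper names and the choice $n = 3$ are assumptions made for illustration.

```python
def steady_state_rate(rates, iters=100):
    """Steady-state production rate R, by bisection on the steady-state equations."""
    n = len(rates) - 1
    lo, hi = 0.0, min(rates)
    for _ in range(iters):
        R = 0.5 * (lo + hi)
        e_next, feasible = 0.0, True          # e_{n+1} := 0
        for i in range(n, 0, -1):             # e_i = R / (lambda_i * (1 - e_{i+1}))
            e_next = R / (rates[i] * (1.0 - e_next))
            if e_next >= 1.0:
                feasible = False
                break
        if feasible and rates[0] * (1.0 - e_next) > R:
            lo = R
        else:
            hi = R
    return 0.5 * (lo + hi)


def sensitivities(rates, h=1e-6):
    """Central finite-difference approximation of dR/dlambda_i for every i."""
    sens = []
    for i in range(len(rates)):
        up, down = list(rates), list(rates)
        up[i] += h
        down[i] -= h
        sens.append((steady_state_rate(up) - steady_state_rate(down)) / (2.0 * h))
    return sens


def equal_occupancy_rates(e_c, n):
    """Rates with lambda_0 = 1 whose steady-state occupancies all equal e_c (n >= 2)."""
    return [1.0] + [1.0 / e_c] * (n - 1) + [(1.0 - e_c) / e_c]


def most_sensitive(e_c, n=3):
    s = sensitivities(equal_occupancy_rates(e_c, n))
    return max(range(len(s)), key=lambda i: s[i])


low_occupancy = most_sensitive(0.3)    # expected: initiation rate (index 0)
high_occupancy = most_sensitive(0.7)   # expected: exit rate (index n = 3)
balanced = sensitivities(equal_occupancy_rates(0.5, 3))
print(low_occupancy, high_occupancy, [round(s, 4) for s in balanced])
```

At $e_c = 1/2$ the initiation and exit sensitivities coincide and exceed the internal ones, consistent with the third case of the proposition.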
In other words, if the equal occupancy is relatively low [high] then maximal inhibition of the production rate is obtained by subtracting the entire reduction budget from the initiation [exit] rate, leaving all the other rates unchanged.
In some cases it may be more natural to define the transition rate reduction in relative rather than absolute terms. This is captured by the following optimization problem.
In practice, each codon (or coding region) admits a minimal and a maximal possible decoding rate. There are also minimal and maximal values for the initiation rate. These bounds are determined by the biophysical properties of the transcript and the intracellular environment. To model this, we can modify the optimization problems described above to include a bound $\ell_i$ on the maximal allowed reduction of rate $i$, for $i = 0, \dots, n$. The next problem demonstrates such a modification for Problem 2.
In other words, the feasible set $\Phi^{n+1}$ in Problem 4 is the intersection of the set $\Omega^{n+1}$ (defined in Problem 2), and the closed $(n + 1)$-dimensional cube $\Psi^{n+1}$ that models constraints on the maximal possible reduction of each rate. Since $\Phi^{n+1}$ is compact and convex (being the intersection of two compact and convex sets), Problem 4 admits a solution that is an extreme point of $\Phi^{n+1}$. In general, not all the rates can be reduced by $b$, and thus an optimal solution may include a reduction of several rates.
Example 5 Consider Problem 4 for an RFM with dimension $n = 2$, rates $\bar\lambda_i = 1.0$, $i = 0, 1, 2$, and parameters $b = 0.85$, and $\ell_i = 0.4$, $i = 0, 1, 2$. In other words, the total possible reduction is 0.85, but any rate can be reduced by no more than 0.4. Fig. 4 depicts the feasible set $\Phi^3$ (blue polytope) that is the intersection of the set $\Omega^3$ (gray polytope) and the set $\Psi^3$ (green cube). Shown also are the three extreme points of $\Phi^3$: $v^1 = [0.6 \ 0.6 \ 0.95]'$, $v^2 = [0.95 \ 0.6 \ 0.6]'$, and $v^3 = [0.6 \ 0.95 \ 0.6]'$. A calculation yields $R(v^1) = R(v^2) = 0.2538$, whereas $R(v^3) = 0.2764$. It follows that $\lambda^* = v^1$ and $\lambda^* = v^2$ are optimal solutions. Note that these solutions correspond to reducing several rates along the mRNA molecule. Note also that $s(\bar\lambda) = [0.1056 \ 0.1708 \ 0.1056]'$, so both optimal solutions correspond to a maximal possible reduction in the most sensitive rate, $\bar\lambda_1$, and a maximal possible reduction in one of the two next most sensitive rates, $\bar\lambda_0$ or $\bar\lambda_2$.
In some cases, there may be positions along the coding region that we cannot modify due to their potential effect on various intracellular processes. An important advantage of Problem 4 is that it allows capturing this by simply setting some of the $\ell_i$'s to zero. On the other hand, in down regulation of a viral gene it may be desirable to distribute the synonymous codon modifications over many mRNA sites in order to reduce the chance of spontaneous mutations yielding the original wild type. This is captured by Problem 4 when we set the $\ell_i$'s to small non-zero values, as then an optimal solution will include a transition rate reduction in many sites.
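The numbers in Example 5 can be reproduced with a short script. The sketch below is an illustration under stated assumptions: it enumerates the extreme points of the set of reductions $\{r : 0 \le r_i \le \ell_i,\ \sum_i r_i = b\}$ using the fact that at a vertex at most one coordinate lies strictly between its bounds, and evaluates $R$ by bisection on the steady-state equations. The helper names are hypothetical.

```python
from itertools import product


def steady_state_rate(rates, iters=100):
    """Steady-state production rate R, by bisection on the steady-state equations."""
    n = len(rates) - 1
    lo, hi = 0.0, min(rates)
    for _ in range(iters):
        R = 0.5 * (lo + hi)
        e_next, feasible = 0.0, True          # e_{n+1} := 0
        for i in range(n, 0, -1):             # e_i = R / (lambda_i * (1 - e_{i+1}))
            e_next = R / (rates[i] * (1.0 - e_next))
            if e_next >= 1.0:
                feasible = False
                break
        if feasible and rates[0] * (1.0 - e_next) > R:
            lo = R
        else:
            hi = R
    return 0.5 * (lo + hi)


def reduction_vertices(b, caps, tol=1e-12):
    """Extreme points of {r : 0 <= r_i <= caps[i], sum(r) = b}."""
    m = len(caps)
    found = set()
    for free in range(m):                     # the one coordinate not fixed at a bound
        others = [k for k in range(m) if k != free]
        for choice in product(*[(0.0, caps[k]) for k in others]):
            rest = b - sum(choice)            # value forced on the free coordinate
            if -tol <= rest <= caps[free] + tol:
                r = [0.0] * m
                for k, value in zip(others, choice):
                    r[k] = value
                r[free] = min(max(rest, 0.0), caps[free])
                found.add(tuple(round(x, 10) for x in r))
    return sorted(found)


base = [1.0, 1.0, 1.0]
results = []
for r in reduction_vertices(0.85, [0.4, 0.4, 0.4]):
    vertex = [round(lam - ri, 10) for lam, ri in zip(base, r)]
    results.append((vertex, steady_state_rate(vertex)))
for vertex, rate in results:
    print(vertex, round(rate, 4))
```

The two vertices that apply the full allowed reduction to the middle rate give the smaller production rate, as stated in the example.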

C. A biological example
To demonstrate how the results above can be used to analyze translation and provide guidelines for re-engineering the mRNA, we consider the S. cerevisiae gene YBL025W that encodes the protein RRN10 which is related to regulation of RNA polymerase I. This gene has 145 codons (excluding the stop codon). Similarly to the approach used in [28], we divided this mRNA into 6 consecutive pieces: the first piece includes the first 24 codons (that are also related to later stages of initiation [49]). The other pieces include 25 non-overlapping codons each, except for the last one that includes 21 codons.
To model this using an RFM with $n = 5$ sites, we first estimated the elongation rates $\lambda_1, \dots, \lambda_5$ using ribo-seq data for the codon decoding rates [42], normalized so that the median elongation rate of all S. cerevisiae mRNAs becomes 6.4 codons per second [58]. The site rate is $(\text{site time})^{-1}$, where site time is the sum over the decoding times of all the codons in this site. These rates thus depend on various factors including availability of tRNA molecules, amino acids, aminoacyl-tRNA synthetase activity and concentration, and local mRNA folding [42], [1], [49]. Note that if we replace a codon in a site of mRNA by a synonymous slower codon then the decoding time increases and thus the rate associated with this site decreases.
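The relation between codon decoding times and the site rate is simple enough to spell out in code. In the sketch below the three-codon site and two of its decoding times are hypothetical placeholders; only the values 0.1128 s and 0.2246 s correspond to the AGA to CGG substitution quoted later in this section.

```python
def site_rate(decoding_times):
    """Site rate = 1 / (sum of the decoding times of the codons in the site)."""
    return 1.0 / sum(decoding_times)


# Hypothetical three-codon site; the last entry is the AGA decoding time (seconds).
wild_type_times = [0.1500, 0.1400, 0.1128]
# Same site after replacing AGA by the slower synonymous codon CGG.
mutant_times = [0.1500, 0.1400, 0.2246]

wild_type_rate = site_rate(wild_type_times)
mutant_rate = site_rate(mutant_times)
rate_reduction = wild_type_rate - mutant_rate   # contribution to the budget b
print(wild_type_rate, mutant_rate, rate_reduction)
```

A slower synonymous codon increases the site time and therefore lowers the site rate; the difference between the two rates is the part of the reduction budget consumed by this substitution.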
The initiation rate (that corresponds to the first piece) was estimated based on the ribosome density per mRNA levels, as this value is expected to be approximately proportional to the initiation rate when initiation is rate limiting [28], [59]. Again we applied a normalization that brings the median initiation rate of all S. cerevisiae mRNAs to be 0.8 [60]. A calculation yields that the steady-state production rate in this RFM is R = 0.0732.
In order to analyze the solution of Problem 2 for this RFM we calculated the sensitivities using (6). This yields: $s(\bar\lambda) = [0.0795 \ 0.0669 \ 0.0611 \ 0.0578 \ 0.0328 \ 0.0092]'$, so $\bar\lambda_0$ is a bottleneck rate. This means that the solution for Problem 2 is to subtract the entire reduction budget $b$ from $\bar\lambda_0$. In biological terms, this suggests that maximal inhibition of production should be based on replacing some (or all) of the first 24 codons with slower synonymous codons. For comparison with the optimization scenarios described below, consider the total budget $b = 0.0089$. The solution for Problem 2 is then to reduce $\bar\lambda_0$ by $b$, and this yields
$$R^* = 0.0725. \tag{14}$$
Reducing $\bar\lambda_0$ by $b$ in the model is possible by substituting codons in the first site with their slowest synonymous mutation (for example, the third codon AGA should be replaced by the synonymous codon CGG, increasing the codon decoding time from 0.1128 seconds to 0.2246 seconds). Now suppose that we are not interested in modifying these codons because in this region there are various regulatory signals that we may not want to change (see, for example, [49]). To maximize inhibition of production rate under this constraint, we apply Problem 4, with $\ell_0 = 0$, and $\ell_i > b$ for all $i \ne 0$. Now the optimal solution is to subtract $b$ from $\bar\lambda_1$. Note that $\bar\lambda_1$ has the second largest sensitivity. This yields $R^* = 0.0726$, which is, as expected, higher than the value in (14). Again, the biological data shows that such a reduction can be done by synonymously [...]
Note that all the rates are reduced and that the total reduction is $b$. This yields $R = 0.0727$, which is again higher than the value in (14).
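The reported values can be sanity-checked with a first-order estimate. Because $R$ is concave, $R(\bar\lambda - b\, d^i) \le R(\bar\lambda) - b\, s_i(\bar\lambda)$, and for a small budget the right-hand side should be close to the true value. The sketch below uses only the numbers quoted above; it is a consistency check, not a replacement for the full computation.

```python
R0 = 0.0732                                   # steady-state production rate
s = [0.0795, 0.0669, 0.0611, 0.0578, 0.0328, 0.0092]   # sensitivities from (6)
b = 0.0089                                    # total reduction budget

# The bottleneck rate is the one with the largest sensitivity.
bottleneck = max(range(len(s)), key=lambda i: s[i])


def first_order_estimate(i):
    """Linearized production rate after subtracting b from rate i."""
    return R0 - b * s[i]


estimate_rate0 = first_order_estimate(0)      # reduce the initiation rate
estimate_rate1 = first_order_estimate(1)      # reduce the first elongation rate
print(bottleneck, round(estimate_rate0, 4), round(estimate_rate1, 4))
```

To four decimals, the two estimates agree with the values 0.0725 and 0.0726 reported for reducing the initiation rate and the first elongation rate, respectively.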

DISCUSSION
There are several approaches for effectively down-regulating translation. Global down-regulation can be achieved by controlling basic translation factors or by using drugs that induce ribosome stalling [61], [62], [63]. Here we consider down regulation of specific genes via targeting specific codons/regions in these genes. This leads to the problem of finding the codon regions that have the most effect on the steady-state production rate. We study this problem of optimal down regulation of mRNA translation using a mathematical model for ribosome flow, the RFM. All possible modifications of the rates define a feasible set of rates, and, under certain conditions, we give a simple algorithm for finding the optimal solution, that is, the rates that lead to a maximal decrease in the protein production rate. For some specific cases, we also derive theoretical results on the optimal solution.
Our results show that the solution must focus on the positions along the mRNA molecule where the transition rate has the strongest effect on the protein production rate. However, this position is not necessarily the one with the minimal rate (though in many cases there are correlations between the two definitions). Many previous studies in the field emphasized the importance of the translation bottleneck [64], [21], [56]; however, in these studies the bottleneck is always defined as the minimal rate. We believe that the sensitivity of the coding region sites should be further studied in order to better understand the evolution of transcripts and their design.
The optimization problems posed here are flexible enough to capture various scenarios. For example, in some cases it may be desirable to introduce a minimal number of changes in the transcript to obtain the desired decrease in the translation rate. Indeed, generating mutations and using suitable RNAi molecules is costly in time and money. Also, any change in the translation rates can affect various important phenomena such as co-translational folding [47], [48], [49], as well as other properties that are encoded in the coding region [49], [65], [66]. In other cases, such as generating a down-regulated virus strain, it may be desirable to introduce as many mutations as possible.
There are various approaches for synthesizing molecules that block mRNA translation (see e.g. http://www.gene-too In practice, when determining an optimal position to target (e.g. with RNAi molecules) one must take into account additional biophysical aspects. Examples include the GC content at the different regions along the mRNA, the folding of the mRNA, the potential binding affinity of the RNAi and the mRNA, potential undesired binding of the RNAi to additional mRNAs or regions within the mRNA, etc. Nevertheless, we feel that our results can be integrated to improve the design of such tools.
In practice, there are many mRNA molecules in the cell and they all compete for the finite pool of free ribosomes. In particular, if more ribosomes are stuck in a traffic jam on a certain mRNA molecule then the pool of free ribosomes is depleted yielding a reduction in the production rates in other mRNA molecules. The RFM is a model for ribosome flow along a single isolated mRNA molecule. This is a reasonable model when the expression levels (e.g. the mRNA levels and the total number of ribosomes on the mRNA molecules related to the gene) are relatively low, so that changes in the translation dynamics on one mRNA have a negligible effect on the pool of ribosomes and thus on the other mRNAs. A model for a network of RFMs, interconnected via a dynamic pool of free ribosomes, has been studied in [41]. It may be of interest to study the problem of down regulation of a specific mRNA molecule within this framework. In this case, one can also down regulate the mRNA indirectly by affecting the ribosomal pool. However, the tools used here do not directly apply, as the convexity results for a single chain do not necessarily carry over to the case of a network of RFMs.
The results here suggest several biological experiments for studying the problem of optimal down regulation and, in particular, validating the theoretical predictions derived using the RFM. Libraries encoding the same protein using mRNAs with different codons (but similar mRNA levels and translation initiation rates) can be generated as was done in [16]. For each variant the protein levels, that are expected to monotonically increase with the production rate [28], can be measured either via a reporter protein [16] or directly [67]. The codon decoding rates can be estimated based on ribo-seq experiments [16], [42]. Such an experimental testbed can be used to validate the results reported in this study.

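This means that
$$s_i(\bar\lambda) = a - b \cos\left(\frac{(2i + 3)\pi}{n + 3}\right), \quad i = 0, \dots, n, \tag{15}$$
where $a, b > 0$ are constants that do not depend on $i$. If $n$ is even then the cosine function in (15) admits a unique minimum at $i = n/2$, and combining this with Proposition 2 completes the proof. If $n$ is odd then the cosine function in (15) admits two minima: at $i = \lfloor n/2 \rfloor$ and at $i = \lfloor n/2 \rfloor + 1$. Now arguing as in the proof of Proposition 2 and using the particle-hole symmetry of the RFM completes the proof.
Proof of Proposition 4. If $\bar e_1 = \cdots = \bar e_n := e_c$, then (3) yields
$$\bar\lambda_0 = 1, \qquad \bar\lambda_i = \frac{1}{e_c},\ \ i = 1, \dots, n - 1, \qquad \bar\lambda_n = \frac{1 - e_c}{e_c},$$
where we scaled $\bar\lambda_0$ to one w.l.o.g. In this case, the Perron eigenvector $v \in \mathbb{R}^{n+2}_{++}$ of the matrix $A(\bar\lambda)$ is given by (see also [37]):
$$v = \big[\, 1 \ \ (1 - e_c)^{-1/2} \ \ (1 - e_c)^{-1/2} \mu^{1/2} \ \ \dots \ \ (1 - e_c)^{-1/2} \mu^{(n-1)/2} \ \ \mu^{n/2} \,\big]',$$
where $\mu := e_c/(1 - e_c)$. We consider two cases. If $e_c = 1/2$ then $v'v = 2(n + 1)$ and applying Theorem 1 yields the sensitivities:
$$s_0 = s_n = \frac{1}{2(n + 1)}, \qquad s_j = \frac{1}{4(n + 1)},\ \ j = 1, \dots, n - 1.$$
Thus, $s_0 = s_n > s_j$, for all $j \notin \{0, n\}$, and arguing as in the proof of Proposition 2 and using the particle-hole symmetry implies that the two optimal solutions are $\bar\lambda - b d^0$ and $\bar\lambda - b d^n$. If $e_c \ne 1/2$ then Theorem 1 yields
$$s_0 = \frac{2 (1 - e_c)}{v'v}, \qquad s_j = \frac{2 e_c (1 - e_c) \mu^j}{v'v},\ \ j = 1, \dots, n - 1, \qquad s_n = \frac{2 e_c \mu^n}{v'v}. \tag{19}$$
When $e_c < 1/2$ [$e_c > 1/2$] (19) yields $s_0 > s_j$, for all $j \ne 0$ [$s_n > s_j$, for all $j \ne n$]. Combining this with Proposition 2 completes the proof.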