Abstract
The concomitant folding of a protein with its synthesis on the ribosome is influenced by a number of different timescales including the translation rate. Here we present a kinetic formalism to describe cotranslational folding and predict the effects of variable translation rates on this process. Our approach, which utilizes equilibrium data from arrested ribosome nascent chain complexes, provides domain folding probabilities in quantitative agreement with molecular simulations of folding at different translation rates. We show that the effects of single codon mutations in messenger RNA that alter the translation rate can lead to a dramatic increase in the extent of folding under specific conditions. The kinetic formalism that we discuss can describe the cotranslational folding process occurring on a single ribosome molecule as well as for a collection of stochastically translating ribosomes.
Introduction
Ribosome-bound nascent protein chains are particularly vulnerable to misfolding and to aberrant interactions with other cellular components^{1}. To avoid these potentially dangerous possibilities and facilitate the folding process^{2}, a variety of quality control mechanisms are associated with translating ribosomes, including those involving molecular chaperones and other ancillary factors^{1}. An additional level of control is provided by the opportunity for proteins to fold during synthesis^{3,4,5,6}, thus potentially enhancing folding yields^{7} and avoiding misfolded or intermediate species^{8,9}. Given its importance, it is not surprising that the cotranslational folding process can be regulated by the modulation of the rates at which successive amino acids are covalently attached to the nascent chain during synthesis. Thus, for example, reduced folding yields have been observed when slow-translating messenger RNA codons are mutated to fast-translating codons^{10}. Even single synonymous mutations have been reported to decrease the total enzymatic activity of specific types of proteins, presumably because of cotranslational misfolding^{11}, leading to disease^{12}. Furthermore, slow-translating codons have been observed to appear more frequently at domain boundaries^{10}, which can result in increased folding yields. All these results indicate that the interplay of the timescales of domain folding and amino-acid addition to the nascent chain is crucial in determining the extent of cotranslational folding (Fig. 1a).
Approaches based on the kinetic modelling of the molecular processes involved in translation have provided profound insights into the diverse functions of the ribosome. For example, the ability of the ribosome to discriminate between cognate and near-cognate transfer RNA has been explained using kinetic equations^{13,14}. Here we extend this strategy to the prediction of the extent of nascent chain folding during continuous translation. This approach is based on the use of data on folding kinetics from arrested ribosome nascent chain (RNC) complexes and the time required to add individual residues to the nascent chain, quantities that can be measured using fluorescence or single molecule methods^{15,16}. Making predictions based on arrested (that is, equilibrium) RNC data is convenient, because it is experimentally easier to probe such systems than RNCs undergoing continuous, non-equilibrium translation^{17}. Our approach is applicable to both single molecule and bulk cotranslational folding occurring during continuous protein synthesis.
Results
The extent of cotranslational folding on a single ribosome
To develop our approach, we first note that the folding of protein domains consisting of fewer than about 100 residues often occurs in bulk solution without significantly populating any intermediate state, and hence can be described phenomenologically by a two-state model^{18} (Fig. 1b). In this scheme, a protein can interconvert between folded (F) and denatured (D) states. In what follows, we will consider this model in the context of a translating protein, allowing us to predict the extent of cotranslational folding at different rates of translation.
Translation introduces the additional timescale, τ_{A}, of amino-acid addition to the two-state kinetic scheme (Fig. 1c). Because the chemical environment surrounding a protein domain changes as it is synthesized, the timescales of its folding (τ_{F,i}) and unfolding (τ_{D,i}) are a function of the nascent chain length i, that is, the number of residues comprising the nascent chain at a particular point during its synthesis^{4}. The time available for the domain to interconvert between folded and denatured states at length i is equal to τ_{A,i+1}, corresponding to the time it takes to attach amino acid i+1 to the nascent chain. τ_{A,i+1} has been shown to be influenced by a number of factors including the identity of the mRNA codon^{19}, the intracellular concentration of cognate and near-cognate aminoacyl-tRNAs^{20}, and the presence of secondary structure within the substrate mRNA^{21}. For an apparent two-state folding protein, larger τ_{A,i+1} values will increase the probability P_{F}(i,t) that the domain will fold (that is, achieve its native structure) by affording the domain more time to do so at a nascent chain length i and time t after initiation of synthesis (Fig. 1d).
To derive an equation relating these three timescales (τ_{A,i+1}, τ_{F,i}, τ_{D,i}), we first consider the behaviour of a single ribosome translocating along an mRNA molecule, and the time dependence of its nascent chain length. At a given nascent chain length i, the ribosome will dwell at codon i+1 waiting for this codon's cognate tRNA to be selected from the cytosol of the cell. This selection process involves a number of steps and a range of associated molecules such as elongation factor thermo unstable (EF-Tu). As we are concerned specifically with the nascent chain's length dependence as a function of time, we do not need to consider explicitly the details of these other chemical steps, for the reasons that follow. The time it takes to select the cognate tRNA and accommodate it into the A-site of the ribosome structure is stochastic in nature, but, on average, it is estimated in Escherichia coli to range from tens to hundreds of milliseconds depending on the identity of the tRNA molecule^{20}. Once the A-site and P-site tRNAs are aligned, and receive sufficient thermal energy to pass over the transition state barrier, the chemical step of peptide bond formation, which changes the nascent chain length from i to i+1, takes on the order of picoseconds to nanoseconds^{22} and is known as the transition path time^{23,24}. This separation of at least six orders of magnitude between the transition path time and τ_{A} timescales (nanoseconds or less versus milliseconds) means that, for an individual ribosome molecule, the transition from nascent chain length i to i+1 appears instantaneous relative to the time the ribosome spends at either of these chain lengths.
As a consequence, the probability, P(i), that this single ribosome molecule will contain a nascent chain of length i at time t is given by a boxcar function, which equals 1 in the time interval [t_{i}^{0}, t_{i}^{0}+τ_{A,i+1}) and is zero otherwise (Fig. 2a). t_{i}^{0} is the time at which the ith amino acid is added to the nascent chain after initiation of translation. The change in P(i) with respect to time is
$$\frac{dP(i)}{dt} = \delta\left(t - t_{i}^{0}\right) - \delta\left(t - t_{i}^{0} - \tau_{A,i+1}\right) \qquad (1)$$
where δ(t) is the Dirac delta function centred at time t after initiation of translation (Fig. 2a).
Next, we note that the experimentally observed timescale of the folding and unfolding process of a protein domain in free solution is typically on the order of milliseconds or more^{25}, and may be much longer near the ribosome surface^{26}. Therefore, the picosecond-to-nanosecond transition-path time of peptide bond formation will also appear as instantaneous relative to the milliseconds or more folding/unfolding timescale. As a consequence, the probability that the nascent chain is in the folded state is equal immediately before (denoted P_{F}(i, t=t_{i}^{0}+τ_{A,i+1})) and immediately after (denoted P_{F}(i+1, t=t_{i+1}^{0})) the addition of the i+1 amino acid (Fig. 1d); that is, the starting point of folding at length i+1 is equal to the ending point at length i. Thus, during continuous translation, the extent of folding at a given nascent chain length is a function of the extent of folding at shorter lengths, and, hence, cotranslational folding depends recursively on what has happened at earlier times during the synthesis of the protein (Figs. 2b,c). As translation is a non-equilibrium process, memory effects can become prevalent, and so it is not surprising that the extent of cotranslational domain folding depends on the states populated at earlier times during synthesis (Fig. 2c).
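The specific behaviour of a single ribosome translocating along an mRNA containing N codons is therefore characterized by the series of dwell times at each codon, {τ_{A}}. If we have many independent measurements of domain folding on ribosomes that exhibit the same series of dwell times, then we can treat the domain folding probability as continuous and write down the differential equation defining the domain folding probability with respect to time, for t_{i}^{0} ≤ t < t_{i}^{0}+τ_{A,i+1}, as
$$\frac{dP_{F}(i,t)}{dt} = \frac{1 - P_{F}(i,t)}{\tau_{F,i}} - \frac{P_{F}(i,t)}{\tau_{D,i}} \qquad (2)$$
and its solution is
$$P_{F}(i,t) = P_{F}^{E}(i) + \left[P_{F}(i,t_{i}^{0}) - P_{F}^{E}(i)\right] e^{-\lambda(i)\left(t - t_{i}^{0}\right)} \qquad (3)$$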
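We substitute equation (3) into the recursive equations shown in Fig. 2c and rearrange them to find that, for a domain that is denatured at i=1, P_{F} at arbitrary nascent chain length i and time t is
$$P_{F}(i,t) = P_{F}^{E}(i)\left[1 - e^{-\lambda(i)\left(t - t_{i}^{0}\right)}\right] + e^{-\lambda(i)\left(t - t_{i}^{0}\right)} \sum_{j=1}^{i-1} P_{F}^{E}(j)\left[1 - e^{-\lambda(j)\tau_{A,j+1}}\right] \prod_{k=j+1}^{i-1} e^{-\lambda(k)\tau_{A,k+1}} \qquad (4)$$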
In equations (3) and (4), P_{F}^{E}(i) is the equilibrium probability of folding and equals [τ_{F,i}λ(i)]^{−1}, with λ(i) being the rate of interconversion of the folded and denatured states that equals [τ_{F,i}]^{−1}+[τ_{D,i}]^{−1}. This is in contrast to the out-of-equilibrium quantity P_{F}(i, t_{i}^{0}) in equation (3), which is the folding probability immediately after adding the ith residue to the nascent chain. τ_{F,i} and τ_{D,i} are the average times of folding and unfolding at nascent chain length i on an arrested RNC. The placement of the first residue (i=1) in the P-site of the ribosome, corresponding to fMet-tRNA in prokaryotes^{27}, is designated as time point zero, t_{1}^{0}=0 s, and the time at which the ith residue is added is
$$t_{i}^{0} = \sum_{j=2}^{i} \tau_{A,j}$$
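The recursion described above can be sketched numerically. The following is a minimal Python sketch, not the authors' code: the function name `folding_curve`, its arguments and the example timescales are assumptions chosen for illustration. During each dwell it relaxes the folding probability toward its equilibrium value at rate λ(i), and it carries the end value over as the starting point at the next chain length, beginning from a denatured domain.

```python
import math

def folding_curve(tau_f, tau_d, tau_a, p_start=0.0):
    """Folding probability just before each residue addition.

    tau_f[n], tau_d[n]: folding and unfolding times on an arrested RNC
                        at the n-th chain length considered.
    tau_a[n]:           dwell time at that length (time to add the next residue).
    p_start:            folding probability on reaching the first length
                        (assumed 0, i.e. the domain starts denatured).
    All times must share one unit (ms in the example below).
    """
    curve = []
    p = p_start
    for tf, td, ta in zip(tau_f, tau_d, tau_a):
        lam = 1.0 / tf + 1.0 / td        # interconversion rate lambda(i)
        p_eq = (1.0 / tf) / lam          # equilibrium folding probability
        # Exponential relaxation toward p_eq during one dwell of length ta.
        p = p_eq + (p - p_eq) * math.exp(-lam * ta)
        curve.append(p)                  # starting point at the next length
    return curve

# Illustrative (hypothetical) timescales in ms, constant over ten chain lengths.
tau_f = [5.0] * 10
tau_d = [50.0] * 10
slow = folding_curve(tau_f, tau_d, [60.0] * 10)
fast = folding_curve(tau_f, tau_d, [1.3] * 10)
print(slow[-1], fast[-1])
```

With these illustrative numbers the slowly translated chain sits essentially at its equilibrium folding probability at every length, whereas the rapidly translated chain lags behind it, which is the qualitative behaviour the formalism is meant to capture.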
Accurate prediction of individual codon translation rate effects
Equation (4) is a function solely of τ_{F,i}, τ_{D,i} and τ_{A,i}, and is a closed-form solution to differential equations (1) and (2); therefore it provides an exact solution to the kinetic model shown in Fig. 1c. This equation expresses the probability that a domain is folded at each codon during continuous translation in terms of the equilibrium quantities P_{F}^{E}(i) and λ(i) that can be measured on arrested ribosomes, and the translation time of each codon (τ_{A,i}), which can be measured by FRET and laser optical tweezer methods^{15,16}. To date, however, few such measurements at different nascent chain lengths have been reported. Therefore, to test equation (4) rigorously, we generated an independent data set representing the probability of domain folding at various translation rates using coarse-grained molecular simulations (Supplementary Methods) of the synthesis of protein G on the ribosome from Thermus thermophilus (Fig. 1a).
Protein G is a single domain protein whose folded architecture consists of an α-helix located adjacent to a four-stranded β-sheet platform^{28}. The coarse-grained model that we use has been shown previously to be consistent with a range of experimental data from arrested RNC complexes^{4,29}. As in analogous experiments^{17}, we attached an unstructured linker to the carboxy terminus of protein G (Fig. 3a) to allow folding and unfolding of this domain to occur near the exit tunnel vestibule, where nascent chain tertiary interactions are sterically permitted^{29,30}.
We first calculated the equilibrium folding and unfolding kinetics (that is, τ_{F,i} and τ_{D,i}) of protein G on arrested RNCs containing nascent chain lengths ranging from 81 to 92 AA (Fig. 3b). These timescales can be seen to vary with the nascent chain length, a result attributable to the change in chemical environment around the domain that arises in the simulations from electrostatic and excluded volume interactions between the nascent chain and the ribosome surface. We then simulated the continuous translation of protein G by covalently attaching new glycine residues to the nascent chain's C terminus at the biologically relevant^{20} constant time intervals of 60, 10, 5, 2.5 and 1.3 ms, starting from a nascent chain length of 71 AA; at this length, protein G is unfolded on the ribosome as the C-terminal portion of the domain is in the exit tunnel^{4}. To obtain statistically significant results, we carried out between 32 and 384 independent protein synthesis simulations at each translation rate.
The effects of translation rate on the extent of protein G folding at each nascent chain length are shown in Fig. 3c, and the corresponding root-mean-squared deviations of the protein G domain from its X-ray structure are shown in Fig. 4. We observe, consistent with previous conjectures^{31}, that the greater the translation rate, the smaller the probability that the domain is folded at a given nascent chain length. Furthermore, at synthesis times close to the average value in E. coli, that is, τ_{A}=50 ms, we find that continuous translation and arrested RNCs result in the same extent of folding as a function of nascent chain length (Fig. 3c). This result occurs because the folding of protein G, during continuous translation at τ_{A}=60 ms, occurs under quasi-equilibrium conditions, where the folding reaction is under thermodynamic control, whereas at τ_{A}=1.3 ms cotranslational folding occurs under non-equilibrium conditions, where folding is under kinetic control^{4}. It is important to emphasize that domains that fold on timescales of greater than 50 ms are more likely to be under kinetic control at synthesis timescales of τ_{A}≤50 ms (τ_{F}=2.4 ms for protein G in free solution^{25}), and hence show a deviation between the non-equilibrium and equilibrium folding curves P_{F}(i, t) and P_{F}^{E}(i). In a database of single domain folding timescales^{25} under physiologically relevant conditions, a quarter of them have τ_{F}≥50 ms. Thus, at average E. coli synthesis rates during exponential growth^{32}, cotranslational folding of 25% or more of domains in multidomain proteins may be under kinetic control.
Importantly for the purpose of this study, the data in Fig. 3c provide a means to test the accuracy of equation (4). Inserting the arrested RNC folding kinetics from Fig. 3b into equation (4) and setting τ_{A} to the corresponding value used in the simulations, we find this kinetic formalism accurately and rapidly predicts the extent of cotranslational folding as a function of the translation rate (Fig. 3c). Thus, our approach captures the interplay of translation rate and folding and denaturation timescales and its consequence for the extent of cotranslational folding.
To test the sensitivity of equation (4) to single codon mutations that locally alter the translation rate along an mRNA molecule, we simulated cotranslational folding of protein G when a single 'fast'-translating codon (τ_{A,87}=1.3 ms) was placed at codon 87 in the context of a 'slow'-translating mRNA sequence (τ_{A}=10 ms). Conversely, we also simulated a system in which a single 'slow'-translating codon (τ_{A,90}=10 ms) was placed at codon 90 in the context of a 'fast'-translating mRNA sequence (τ_{A}=1.3 ms). We find that equation (4) accurately predicts the change in the extent of domain folding that results from single synonymous codon mutations (Fig. 5a). This is a crucial demonstration of the utility of this formalism, as synonymous mutations have been shown to alter folding yields dramatically^{19}. These results also demonstrate that the predictions from this kinetic formalism are accurate and sensitive to the effect of variable translation rates at the level of single codons.
While the folding probabilities are shown as a function of nascent chain length in Figs 3c and 5a, equation (4) can also accurately predict these folding curves as a function of the time after the initiation of translation (Fig. 5b).
Application to a collection of translating ribosomes
In the preceding treatment, we considered a single ribosome molecule translocating along an mRNA molecule. Equation (4) therefore represents the average domain folding probability of a nascent chain on a ribosome that translocates with a specific series of dwell times {τ_{A}}. As translocation of a ribosome along mRNA is stochastic, with a distribution of amino-acid addition times at a codon i, experiments on different ribosomes can yield different series of dwell times while they translate the same mRNA sequence.
How can we combine the exact result of equation (4), which utilizes a specific series of dwell times, with the stochastic nature of an ensemble of ribosomes, each with its own series of dwell times? If the probability density function P_{i}(τ_{A}) of amino-acid addition times at codon i is known a priori, then for a specific series of N dwell times, labelled as set k ({τ_{A}}_{k}), we can calculate the probability p_{k} of that series occurring by random chance as
$$p_{k} = \prod_{i=1}^{N} P_{i}\left(\tau_{A,i+1}^{k}\right) \qquad (5)$$
Therefore, by inserting the same series of dwell times in both equations (4) and (5), and multiplying the results as p_{k}P_{F}(i=N, t_{N}^{0}+τ_{A,i+1}), we obtain the contribution of the P_{F}(i) folding curve of a single translating ribosome (for example, Fig. 3c) to the folding curve that would result from averaging over a large number of independent, stochastically translating ribosomes.
This result is useful for three reasons. First, equation (5) allows for the calculation of the probability of obtaining a particular single molecule trace (defined by the set of dwell times) in an experiment. Second, it allows for the numerical simulation of an arbitrarily large number of independent, stochastically translating ribosomes and each of their corresponding cotranslational folding probability curves. And finally, with sufficient such simulations, the distribution of folding probability curves and their average can be calculated for an ensemble of stochastically translating ribosomes. Importantly, this approach can be applied to arbitrary P_{i}(τ_{A}) distributions, thus providing it significant versatility.
To illustrate these points, consider an amino-acid addition time distribution P_{i}(τ_{A}) that is exponentially distributed and is therefore equal to ‹τ_{A,i+1}›^{−1} exp(−τ^{k}_{A,i+1}/‹τ_{A,i+1}›), where ‹τ_{A,i+1}› is the average time required for amino-acid addition to a nascent chain of length i. Values of this time have already been estimated for all 64 codons in E. coli^{20}. τ^{k}_{A,i+1} is the time it takes to add the i+1 residue to the nascent chain in the kth experiment in which a single ribosome translocating along mRNA is monitored. If N=91, as in the protein G construct discussed above, and ‹τ_{A,i+1}› is taken as 60 ms for all codons, then the probability of observing a single ribosome translate a protein in which it dwells at each codon for 20 ms is effectively zero (about 10^{−171}). To simulate the individual folding curves of 1,000 ribosomes stochastically translating this protein G construct, however, we can randomly sample τ_{A} values from the exponentially distributed P_{i+1}(τ_{A}) for each codon (see Methods) and construct 1,000 unique dwell time sets [{τ_{A}}_{k}]. For each τ_{A} set, we can use equation (4) to calculate the resulting folding curve. Figure 6 shows these 1,000 folding curves as a function of time (Fig. 6a) and nascent chain length (Fig. 6b). These results show that the kinetic model that we described can be utilized to predict how amino-acid addition timescales and their underlying distribution affect the extent of cotranslational folding of a protein domain at the resolution of an individual ribosome molecule, or for a large collection of ribosomes.
Exact solution for a collection of ribosomes
When P_{i+1}(τ_{A}) is exponentially distributed, it is possible to derive an exact expression relating the average cotranslational folding curve from a collection of stochastically translating ribosomes as a function of nascent chain length (equation (6), Attila Szabo, personal communication). That is, the blue line in Fig. 6b can be predicted without having to resort to the numerical simulations discussed in the previous section, although, by doing so, the information on the underlying distribution of folding curves is lost.
To derive the ensemble averaged folding curve as a function of nascent chain length, denoted ‹P_{F}(i)›, a probabilistic approach can be utilized to analyse the elementary reaction steps in Fig. 1c (see Methods). Under these conditions,
$$\langle P_{F}(i)\rangle = \sum_{j=1}^{i} \frac{\tau_{F,j}^{-1}}{\langle\tau_{A,j+1}\rangle^{-1} + \tau_{F,j}^{-1} + \tau_{D,j}^{-1}} \prod_{l=j+1}^{i} \frac{\langle\tau_{A,l+1}\rangle^{-1}}{\langle\tau_{A,l+1}\rangle^{-1} + \tau_{F,l}^{-1} + \tau_{D,l}^{-1}} \qquad (6)$$
where the superscript of '−1' indicates the reciprocal of these timescales. To test the accuracy of equation (6), we used it to calculate ‹P_{F}(i)› for protein G and compared it with results from the numerical simulations described in the previous section. We find excellent agreement between this exact result and the numerical simulations (Fig. 6b). Thus, equation (6) can predict the effect of per codon translation rates on the average cotranslational folding curve that arises from bulk experiments.
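The ensemble average for exponentially distributed dwell times can also be evaluated as a simple recursion over chain length. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function name `mean_folding_curve`, the variable names and the example timescales are hypothetical, and the two pathway probabilities are the splitting probabilities of the two-state scheme in which folding, unfolding and amino-acid addition compete as first-order processes.

```python
def mean_folding_curve(tau_f, tau_d, mean_tau_a):
    """Ensemble-averaged folding probability just before each residue addition.

    Assumes exponentially distributed dwell times with means mean_tau_a[n],
    and a domain that is denatured on reaching the first chain length.
    All times must share one unit.
    """
    curve = []
    p = 0.0
    for tf, td, ta in zip(tau_f, tau_d, mean_tau_a):
        k_f, k_d, k_a = 1.0 / tf, 1.0 / td, 1.0 / ta
        total = k_a + k_f + k_d
        # Probability of arriving folded at the next length, starting folded.
        phi_ff = (k_a + k_f) / total
        # Probability of arriving folded at the next length, starting denatured.
        phi_df = k_f / total
        p = p * phi_ff + (1.0 - p) * phi_df
        curve.append(p)
    return curve

# Illustrative (hypothetical) timescales in ms, constant over ten chain lengths.
curve = mean_folding_curve([5.0] * 10, [50.0] * 10, [60.0] * 10)
print(curve[0], curve[-1])
```

When the timescales do not change with chain length, the fixed point of this recursion is the equilibrium folding probability, so the averaged curve approaches the arrested-RNC value from below as the chain grows.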
Discussion
We have presented two equations (equations (4) and (6)) that predict the extent of cotranslational domain folding based on per codon translation timescales, and the timescales of domain folding and denaturation on arrested RNC complexes at equilibrium (Fig. 3b). We have derived an exact expression for the domain folding probability in the case of a single translating ribosome (equation (4)), and shown how this expression can be utilized to predict the behaviour of a large number of stochastically translating ribosomes. Finally, an exact expression for the cotranslational folding curve was derived for ribosomes translating with exponential dwell times at each codon (equation (6)).
The utility of each of these equations depends on the questions that one is interested in addressing and the type of experiment (bulk versus single molecule) that is being carried out. In analysing and predicting cotranslational folding behaviour on individual ribosomes, equation (4) is perhaps the most relevant. The application, via numerical methods, of equation (4) to a collection of stochastically translating ribosomes is of direct consequence to both single molecule and bulk experiments, as this approach offers the ability to calculate the individual ribosome folding curves as well as the ensemble average. This numerical approach can handle arbitrary distributions of amino-acid addition timescales and is thus not limited to exponential dwell time distributions. Bulk experiments, where the average cotranslational folding curve as a function of nascent chain length may be measured from a collection of ribosomes, can be predicted using equation (6). Thus, these equations are applicable under a wide range of conditions.
Laser optical tweezers have recently^{26} been used to measure the folding rate under tension of T4 lysozyme arrested on the ribosome at two different linker lengths. While the unfolding rate at zero force was not estimated at either length, these experiments clearly demonstrate that it is possible to measure τ_{F,i} and τ_{D,i} experimentally, which are key inputs in our approach. We expect that, as more studies measuring these rates are carried out on this and other proteins, such data, when combined with our approach, will be useful in predicting what happens during continuous translation.
A number of additional translation-associated processes were not explicitly considered in the reaction scheme (Fig. 1c). For example, the competitive (and reversible) binding of near-cognate and non-cognate tRNAs for a codon can slow down the rate of amino-acid addition by cognate tRNA molecules^{20}. Furthermore, chaperones such as trigger factor directly interact with nascent chains during their synthesis, and can slow the rate of cotranslational folding of at least some proteins^{33}. These processes do not diminish the utility of our approach, because each of them can effectively be accounted for by incorporating it into the timescales of amino-acid addition (in the case of competitive binding) or into the rates of folding and unfolding (in the case of trigger factor). The mathematical dependence of τ_{A} on near-cognate and non-cognate tRNA concentrations and their competitive binding rates has been worked out previously^{20}. Thus, combining that model with equation (4) provides a means to model the effect of competitive tRNA binding on cotranslational folding. Similarly, when quantitative experimental measurements become available for the effect of trigger factor on the rates of domain folding and unfolding, they can be incorporated implicitly into effective timescales in these equations.
The kinetic models that we have described here are based on domains that fold cotranslationally in a two-state manner. This property applies to a variety of small proteins (typically ≤100 residues) and enables an analytical solution to be obtained for the kinetics of cotranslational folding. For protein domains larger than those examined here, which may populate intermediate states on the ribosome, the kinetic scheme in Fig. 1c can be modified to account for such additional states as they are experimentally identified. Although the additional complexity of such kinetic schemes may make it difficult to find an analytical solution, they could always be solved by numerical methods.
The approach we have proposed here has many potential applications in the areas of in vivo protein folding, biotechnology and synthetic biology. For example, when coupled with models of translation rates that account for codon usage and tRNA concentrations^{34}, this formalism provides a means to predict cotranslational folding behaviour of entire proteomes under varying cellular conditions and to aid in the design of synthetic transcriptomes that optimize the extent of cotranslational folding. Equations (4) and (6) also provide a way for experimentalists to map directly the results from more easily studied arrested RNCs to the realistic situation of continuous translation. Thus, such kinetic modelling of the ribosome, when combined with a variety of different experimental data, provides new research avenues and the potential for novel insights in a number of different areas.
Methods
An exact solution for a collection of translating ribosomes
An equation can be derived relating the domain folding probability immediately before the addition of the next amino acid to the nascent chain for a ribosome that dwells with an exponential waiting time distribution at each codon (equation (6), Attila Szabo, personal communication). This probability, denoted ‹P_{F}(i)›, is equal to the probability of taking the pathway in Fig. 1c F_{i}→F_{i+1} and can be calculated as
$$\langle P_{F}(i)\rangle = \langle P_{F}(i-1)\rangle\,\Phi_{F\to F}(i) + \langle P_{D}(i-1)\rangle\,\Phi_{D\to F}(i) \qquad (7)$$
where ‹P_{F}(i−1)› and ‹P_{D}(i−1)› are, respectively, the probabilities that, when the nascent chain changes from length i−1 to i, the domain was either in the folded or denatured state. Φ_{F→F}(i) is the probability that, beginning in the folded state at length i, the RNC complex will reach the folded state at length i+1 before reaching the denatured state at length i+1. Likewise, Φ_{D→F}(i) is the probability that, beginning in the denatured state at length i, the system will reach the folded state at length i+1 before reaching the denatured state at length i+1. Because there are only two states in our reaction scheme (Fig. 1c), we have that ‹P_{D}(i−1)›=1−‹P_{F}(i−1)›. Substituting this into equation (7), we have
$$\langle P_{F}(i)\rangle = \Phi_{D\to F}(i) + \langle P_{F}(i-1)\rangle\left[\Phi_{F\to F}(i) - \Phi_{D\to F}(i)\right] \qquad (8)$$
and we see that ‹P_{F}(i)› obeys a recursive relationship.
Using the probabilistic method^{35} for calculating pathway probabilities in reaction schemes, Φ_{F→F}(i) and Φ_{D→F}(i) can be shown to equal, respectively, (‹τ_{A,i+1}›^{−1}+τ_{F,i}^{−1})/(‹τ_{A,i+1}›^{−1}+τ_{F,i}^{−1}+τ_{D,i}^{−1}) and τ_{F,i}^{−1}/(‹τ_{A,i+1}›^{−1}+τ_{F,i}^{−1}+τ_{D,i}^{−1}). Inserting these terms into equation (8), and using the boundary condition that ‹P_{F}(i−1)›=0 at i=1 (that is, for a nascent chain comprising one residue, the domain is denatured), this recursive relationship when solved equals equation (6).
Numerical simulation of a collection of translating ribosomes
To simulate the stochastic nature of translation, it is necessary to consider the randomly distributed dwell times {τ_{A}} that the ribosome exhibits during its translation of an mRNA molecule. As the underlying P_{i}(τ_{A}) distribution of amino-acid addition times at codon i has not yet been experimentally determined, here we assume it to be exponentially distributed with P_{i}(τ_{A})=‹τ_{A,i+1}›^{−1} exp(−τ_{A}/‹τ_{A,i+1}›). For each ribosome, we constructed its {τ_{A}} by randomly sampling from this distribution, using inverse transform sampling in which τ_{A,i+1}=−‹τ_{A,i+1}› ln(R), where R is a random number selected from a uniform distribution in the range of (0,1). For each ribosome, this procedure results in 91 dwell times, representing a ribosome stochastically translating the protein G construct (Fig. 3a). This procedure was repeated 1,000 times, each yielding a unique {τ_{A}}, which represents the behaviour of 1,000 different synthesis events of this protein. Each {τ_{A}} was then inserted into equation (4) to yield its corresponding cotranslational folding curve (Fig. 6).
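The sampling step of this procedure can be sketched as follows. This is a minimal Python illustration, not the code used in the study: the function name `sample_dwell_times`, the fixed random seed and the uniform 60 ms mean are assumptions. Because `random.random()` can return exactly 0, the sketch draws R as one minus that value so that the logarithm is always defined.

```python
import math
import random

def sample_dwell_times(mean_tau_a, rng):
    """Draw one dwell-time set by inverse transform sampling.

    mean_tau_a: mean amino-acid addition time at each codon (any time unit).
    rng:        a random.Random instance.
    """
    # 1 - rng.random() lies in (0, 1], so log() never receives 0.
    return [-mean * math.log(1.0 - rng.random()) for mean in mean_tau_a]

rng = random.Random(0)                 # fixed seed, for reproducibility only
mean_tau_a = [60.0] * 91               # assumed: 60 ms mean at all 91 codons
dwell_sets = [sample_dwell_times(mean_tau_a, rng) for _ in range(1000)]

all_times = [t for dwell in dwell_sets for t in dwell]
print(sum(all_times) / len(all_times))  # sample mean, close to 60 ms
```

Each of the resulting 1,000 dwell-time sets would then be passed to the single-ribosome expression to obtain one folding curve per simulated ribosome.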
Analysis of coarse-grained simulations
Mapping simulation timescales to experimental timescales. Low viscosity Langevin dynamics, as used in the coarse-grained simulations (Supplementary Methods), accelerate molecular dynamics while leaving the thermodynamic properties of the system unaltered. To map these accelerated kinetics to the experimentally relevant high-viscosity situation in aqueous media, we multiply the simulation time by the constant τ_{F}^{E}/τ_{F}^{S}, the ratio of experimentally measured to calculated folding times. τ_{F}^{E} is the experimentally (E) measured folding time of protein G in bulk solution and equals 2.4 ms (ref. 25), whereas τ_{F}^{S} is the average folding time from these simulations (S) in the absence of the ribosome and equals 3.6 ns. This constant represents a linear scaling between the simulation time and the experimental time. Thus, in these simulations, when a new glycine residue is inserted into the growing chain during continuous translation every 90 ns (=6×10^{6} integration time steps) of simulation time, this interval corresponds to an experimental time of τ_{A}=60 ms. Likewise, τ_{F}, τ_{D} and the other τ_{A} values reported in the main text are the results of multiplying their simulation times by this constant.
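The arithmetic of this mapping is easy to check. The short Python sketch below uses only the two folding times and the 90 ns interval quoted above; the variable and function names are assumptions made for illustration.

```python
# Folding times quoted in the text.
TAU_F_EXPERIMENT_S = 2.4e-3   # 2.4 ms, measured in bulk solution
TAU_F_SIMULATION_S = 3.6e-9   # 3.6 ns, from simulations without the ribosome

# Linear scaling constant between simulation time and experimental time.
SCALE = TAU_F_EXPERIMENT_S / TAU_F_SIMULATION_S   # about 6.7e5

def simulation_ns_to_experiment_ms(t_ns):
    """Convert a simulation time in ns to the mapped experimental time in ms."""
    return t_ns * 1e-9 * SCALE * 1e3

print(simulation_ns_to_experiment_ms(90.0))   # about 60 ms
```

The same constant implies that 90 ns spread over 6×10^{6} integration steps corresponds to 15 fs of simulation time per step.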
The probability of domain folding at equilibrium, P_{F}^{E}, was calculated from the replica exchange simulations (Supplementary Methods). A given simulation conformation of protein G was considered to be folded if its fraction of native contacts was greater than 50%, and otherwise was considered unfolded. The folded/unfolded time series for each replica was constructed using this definition, and the time series from replicas at different temperatures were combined in the WHAM equations^{36} to calculate P_{F}^{E}. The stability of the folded state of protein G with respect to its denatured state, ΔG_{ND}, is equal to −k_{B}T ln[P_{F}^{E}/(1−P_{F}^{E})], where k_{B} is Boltzmann's constant and T is the temperature.
The mean folding time τ_{F} of protein G equals the average of the set of first passage times {${\tau}_{\text{F},i}$} determined from temperature quench simulations at various nascent chain lengths (Supplementary Methods). τ_{D} is calculated as τ_{F} exp(−ΔG_{ND}/k_{B}T), where k_{B} is Boltzmann's constant and T is the simulation temperature.
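For a two-state system, the unfolding time follows from the folding time and the equilibrium folding probability by detailed balance. The Python sketch below illustrates that relationship; it assumes the sign convention in which ΔG_{ND} is negative when the folded state is the more stable one, and the function names and example numbers are hypothetical.

```python
import math

def stability_in_kT(p_fold_eq):
    """Delta G_ND in units of k_B*T (negative when the folded state is favoured)."""
    return -math.log(p_fold_eq / (1.0 - p_fold_eq))

def unfolding_time(tau_f, p_fold_eq):
    """Two-state unfolding time from the folding time and equilibrium P_F."""
    return tau_f * math.exp(-stability_in_kT(p_fold_eq))

# Illustrative numbers: tau_F = 5 ms and P_F at equilibrium = 10/11.
print(unfolding_time(5.0, 10.0 / 11.0))   # about 50 ms
```

The result is consistent with the two-state identity that the equilibrium folding probability equals τ_{D}/(τ_{F}+τ_{D}).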
The probability of domain folding during continuous synthesis simulations, P_{F}(i), was calculated as (1/N)Σ_{n=1}^{N} θ(${Q}_{\text{BB},i}$−0.50), where the summation is over the N independent trajectories simulated for the given system, and θ(${Q}_{\text{BB},i}$−0.50) is the Heaviside step function that equals 1 if more than half of the native backbone contacts ${Q}_{\text{BB},i}$ in the structure of protein G are made in the last frame of the simulation at nascent chain length i and 0 otherwise. ${Q}_{\text{BB},i}$ is the fraction of native backbone contacts that are formed, where C is the number of native backbone contacts within the crystal structure, S (=56) is the number of interaction sites in protein G, and r_{jk}^{F} and r_{jk} are, respectively, the spatial distances between interaction sites j and k in the crystal structure and the simulation structure. In this analysis, a native contact is identified in the crystal structure if any heavy atoms of residues j and k are within 4.5 Å of each other.
The standard error of the mean (s.e.m.) of τ_{F} was calculated by breaking the 152 independent folding trajectories into 15 sets of 10 or 11 τ_{F,i} values each, calculating the average value of each set and then dividing the standard deviation of the 15 averages by √15. To calculate the s.e.m. of ΔG_{ND}, the replica exchange simulation time-series data were broken into five independent sets, with approximately 20,000 points in each replica in each set. We then calculated ΔG_{ND} using each data set in the WHAM equations, and calculated the s.e.m. using these five ΔG_{ND} values. The s.e.m. of τ_{D} was calculated using standard propagation of error equations.
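The block-averaging procedure for the s.e.m. can be sketched as follows. The sketch assumes contiguous blocks and the sample (n−1) standard deviation, neither of which is specified in the text; the function names are hypothetical.

```python
import math
from statistics import fmean, stdev


def split_into_blocks(values, n_blocks: int):
    """Split values into n_blocks contiguous sets whose sizes differ by at most one."""
    if n_blocks < 2 or len(values) < n_blocks:
        raise ValueError("need at least two blocks and one value per block")
    base, extra = divmod(len(values), n_blocks)
    blocks, start = [], 0
    for b in range(n_blocks):
        size = base + (1 if b < extra else 0)
        blocks.append(values[start:start + size])
        start += size
    return blocks


def block_sem(values, n_blocks: int) -> float:
    """Standard deviation of the block averages divided by sqrt(n_blocks)."""
    block_means = [fmean(block) for block in split_into_blocks(values, n_blocks)]
    return stdev(block_means) / math.sqrt(n_blocks)


# 152 values in 15 blocks yields sets of 10 or 11 values, as in the text.
sizes = sorted({len(b) for b in split_into_blocks(list(range(152)), 15)})
print(sizes)
```

The same function applies to the five ΔG_{ND} estimates if each is treated as a block of one value.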
Additional information
How to cite this article: O'Brien, E. P. et al. Prediction of variable translation rate effects on cotranslational protein folding. Nat. Commun. 3:868 doi: 10.1038/ncomms1850 (2012).
Change history
26 February 2013
A correction has been published and is appended to both the HTML and PDF versions of this paper. The error has not been fixed in the paper.
References
 1
Hartl, F. U. & Hayer-Hartl, M. Converging concepts of protein folding in vitro and in vivo. Nat. Struct. Mol. Biol. 16, 574–581 (2009).
 2
Thirumalai, D., O'Brien, E. P., Morrison, G. & Hyeon, C. Theoretical perspectives on protein folding. Annu. Rev. Biophys. 39, 159–183 (2010).
 3
Nicola, A. V., Chen, W. & Helenius, A. Cotranslational folding of an alphavirus capsid protein in the cytosol of living cells. Nat. Cell Biol. 1, 341–345 (1999).
 4
O'Brien, E. P., Christodoulou, J., Vendruscolo, M. & Dobson, C. M. New scenarios of protein folding can occur on the ribosome. J. Am. Chem. Soc. 133, 513–526 (2011).
 5
Elcock, A. H. Molecular simulations of cotranslational protein folding: fragment stabilities, folding cooperativity, and trapping in the ribosome. PLOS Comput. Biol. 2, 824–841 (2006).
 6
Fedyukina, D. V. & Cavagnero, S. Protein folding at the exit tunnel. Annu. Rev. Biophys. 40, 337–359 (2011).
 7
Ugrinov, K. G. & Clark, P. L. Cotranslational folding increases GFP folding yield. Biophys. J. 98, 1312–1320 (2010).
 8
Clark, P. L. & King, J. A newly synthesized, ribosome-bound polypeptide chain adopts conformations dissimilar from early in vitro refolding intermediates. J. Biol. Chem. 276, 25411–25420 (2001).
 9
Netzer, W. J. & Hartl, F. U. Recombination of protein domains facilitated by cotranslational folding in eukaryotes. Nature 388, 343–349 (1997).
 10
Komar, A. A., Lesnik, T. & Reiss, C. Synonymous codon substitutions affect ribosome traffic and protein folding during in vitro translation. FEBS Lett. 462, 387–391 (1999).
 11
Tsai, C. J. et al. Synonymous mutations and ribosome stalling can lead to altered folding pathways and distinct minima. J. Mol. Biol. 383, 281–291 (2008).
 12
Kimchi-Sarfaty, C. et al. A 'silent' polymorphism in the MDR1 gene changes substrate specificity. Science 315, 525–528 (2007).
 13
Hopfield, J. J. Kinetic proofreading: a new mechanism for reducing errors in biosynthetic processes requiring high specificity. Proc. Natl Acad. Sci. USA 71, 4135–4139 (1974).
 14
Ninio, J. Kinetic amplification of enzyme discrimination. Biochimie 57, 587–595 (1975).
 15
Khushoo, A., Yang, Z., Johnson, A. E. & Skach, W. R. Ligand-driven vectorial folding of ribosome-bound human CFTR NBD1. Mol. Cell 41, 682–692 (2011).
 16
Uemura, S. et al. Real-time tRNA transit on single translating ribosomes at codon resolution. Nature 464, 1012–1017 (2010).
 17
Hsu, S. T. D. et al. Structure and dynamics of a ribosome-bound nascent chain by NMR spectroscopy. Proc. Natl Acad. Sci. USA 104, 16516–16521 (2007).
 18
Jackson, S. E. & Fersht, A. R. Folding of chymotrypsin inhibitor 2. 1. Evidence for a two-state transition. Biochemistry 30, 10428–10435 (1991).
 19
Zhang, G., Hubalewska, M. & Ignatova, Z. Transient ribosomal attenuation coordinates protein synthesis and cotranslational folding. Nat. Struct. Mol. Biol. 16, 274–280 (2009).
 20
Fluitt, A., Pienaar, E. & Viljoen, H. Ribosome kinetics and aa-tRNA competition determine rate and fidelity of peptide synthesis. Comput. Biol. Chem. 31, 335–346 (2007).
 21
Qu, X. et al. The ribosome uses two active mechanisms to unwind messenger RNA during translation. Nature 475, 118–121 (2011).
 22
Schwartz, S. D. & Schramm, V. L. Enzymatic transition states and dynamic motion in barrier crossing. Nat. Chem. Biol. 5, 552–559 (2009).
 23
Dellago, C., Bolhuis, P. G., Csajka, F. S. & Chandler, D. Transition path sampling and the calculation of rate constants. J. Chem. Phys. 108, 1964–1977 (1998).
 24
Chung, H. S., McHale, K., Louis, J. M. & Eaton, W. A. Single-molecule fluorescence experiments determine protein folding transition path times. Science 335, 981–984 (2012).
 25
De Sancho, D., Doshi, U. & Munoz, V. Protein folding rates and stability: how much is there beyond size? J. Am. Chem. Soc. 131, 2074–2075 (2009).
 26
Kaiser, C. M., Goldman, D. H., Chodera, J. D., Tinoco, I. Jr & Bustamante, C. The ribosome modulates nascent protein folding. Science 334, 1723–1727 (2011).
 27
Bingel-Erlenmeyer, R. et al. A peptide deformylase-ribosome complex reveals mechanism of nascent chain processing. Nature 452, 108–111 (2008).
 28
Gronenborn, A. M. et al. A novel, highly stable fold of the immunoglobulin binding domain of streptococcal protein G. Science 253, 657–661 (1991).
 29
O'Brien, E. P., Hsu, S. T. D., Christodoulou, J., Vendruscolo, M. & Dobson, C. M. Transient tertiary structure formation within the ribosome exit port. J. Am. Chem. Soc. 132, 16928–16937 (2010).
 30
Kosolapov, A. & Deutsch, C. Tertiary interactions within the ribosomal exit tunnel. Nat. Struct. Mol. Biol. 16, 405–411 (2009).
 31
Purvis, I. J. et al. The efficiency of folding of some proteins is increased by controlled rates of translation in vivo: a hypothesis. J. Mol. Biol. 193, 413–417 (1987).
 32
Young, R. & Bremer, H. Polypeptide-chain-elongation rate in Escherichia coli B/r as a function of growth rate. Biochem. J. 160, 185–194 (1976).
 33
Agashe, V. R. et al. Function of trigger factor and DnaK in multidomain protein folding: Increase in yield at the expense of folding speed. Cell 117, 199–209 (2004).
 34
Czech, A., Fedyunin, I., Zhang, G. & Ignatova, Z. Silent mutations in sight: covariations in tRNA abundance as a key to unravel consequences of silent mutations. Mol. Biosyst. 6, 1767–1772 (2010).
 35
Ninio, J. Alternative to the steady-state method: derivation of reaction rates from first-passage times and pathway probabilities. Proc. Natl Acad. Sci. USA 84, 663–667 (1987).
 36
Kumar, S., Bouzida, D., Swendsen, R. H., Kollman, P. A. & Rosenberg, J. M. The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. J. Comput. Chem. 13, 1011–1021 (1992).
Acknowledgements
We thank Attila Szabo for stimulating discussions on chemical kinetic modelling and for proposing and deriving equation (6); Sophie Jackson for a careful reading of the manuscript; Robert Best for providing CHARMM source code for the double-well dihedral potential and Debye-Hückel electrostatic calculations; Changbong Hyeon for useful suggestions on modelling electrostatic interactions in coarse-grained models; and John Christodoulou for illuminating discussions about cotranslational folding. This work was supported by an NSF postdoctoral grant (EPO), BBSRC and the Wellcome Trust (MV and CMD), and the EPSRC (EPO, MV and CMD). This study utilized the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health, Bethesda, Maryland (http://biowulf.nih.gov).
Author information
Contributions
E.P.O., M.V., and C.M.D. designed the research. E.P.O. carried out the research and analysed the data. E.P.O., M.V., and C.M.D. interpreted the data and wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Tables S1–S2, Supplementary Methods and Supplementary References (PDF 579 kb)