Cochaperones enable Hsp70 to use ATP energy to stabilize native proteins out of the folding equilibrium

The heat shock protein 70 (Hsp70) chaperones, vital to the proper folding of proteins inside cells, consume ATP and require cochaperones in assisting protein folding. It is unclear whether Hsp70 can utilize the free energy from ATP hydrolysis to fold a protein into a native state that is thermodynamically unstable in the chaperone-free equilibrium. Here I present a model of Hsp70-mediated protein folding, which predicts that Hsp70, as a result of differential stimulation of ATP hydrolysis by its Hsp40 cochaperone, dissociates faster from a substrate in fold-competent conformations than from one in misfolding-prone conformations, thus elevating the native concentration above and suppressing the misfolded concentration below their respective equilibrium values. Previous models would not make or imply these predictions, which are experimentally testable. My model quantitatively reproduces experimental refolding kinetics, predicts how modulations of the Hsp70/Hsp40 chaperone system affect protein folding, and suggests new approaches to regulating cellular protein quality.

proximity then facilitates the J domain binding to Hsp70 and accelerating its ATP hydrolysis 26-28. Following ATP hydrolysis, the chaperone returns from the ADP-state to the ATP-state through nucleotide exchange, which is often catalyzed by nucleotide exchange factors (NEF) such as the bacterial GrpE 29,30.
It is unclear whether Hsp70 can use the free energy from ATP hydrolysis to drive its substrate protein toward the native state, N, and away from the misfolded state, M, such that $f_N/f_M > f_{N,\mathrm{eq}}/f_{M,\mathrm{eq}}$, where $f_S$ is the fraction of the substrate in state S at the steady state of Hsp70-mediated folding, and $f_{S,\mathrm{eq}}$ is the corresponding fraction at the folding equilibrium in the absence of the chaperone. Previous models 10,31 mostly considered the chaperone as an unfoldase/holdase (which need not consume free energy) that pulls the substrate out of the misfolded state and holds it in an unfolded state. It was proposed that the free energy from ATP hydrolysis was used to achieve ultra-affinity in substrate binding 9,32. As an unfoldase/holdase, Hsp70 would also pull the substrate out of the native state into the unfolded state; unless Hsp70 has a higher affinity for the native substrate than for the misfolded substrate, which contradicts experimental observations, these models would predict $f_N/f_M \le f_{N,\mathrm{eq}}/f_{M,\mathrm{eq}}$.
Here I propose a model of Hsp70-mediated protein folding, in which Hsp70 and Hsp40 together constitute a molecular machine that uses the free energy from ATP hydrolysis to actively drive a protein toward its native state, so that $f_N/f_M > f_{N,\mathrm{eq}}/f_{M,\mathrm{eq}}$. It suggests that without Hsp40, Hsp70 alone cannot change the ratio $f_N/f_M$ from the equilibrium value $f_{N,\mathrm{eq}}/f_{M,\mathrm{eq}}$. My model thus answers the question of why Hsp70 requires both the Hsp40 cochaperones and ATP consumption in assisting protein folding. It also explains the puzzling non-monotonic dependence of folding efficiency on the chaperone and cochaperone concentrations, and it makes quantitative predictions of how protein folding is affected by modulations of the chaperone environment, including changes in the ATPase activity or the nucleotide exchange rate of Hsp70. These predictions may be readily tested by experiments, and may inform rational approaches to manipulating chaperone-mediated protein folding.

Results
My model is based on two assumptions supported by experimental observations. The first assumption is that a substrate protein can adopt two additional conformational states besides the misfolded, M, and the native, N, states: the unfolded and misfolding-prone state, U, and the fold-competent state, F. A protein in the F state is unfolded but poised to fold into the native state (Fig. 1a,b). Such intermediate states of folding have been observed experimentally 4. Conformational transitions can occur between M and U, between U and F, and between F and N (Fig. 1b). The second assumption is that Hsp40's affinity for a substrate protein is higher if the substrate is in the misfolding-prone conformation than if it is in the fold-competent conformation. Experimental observations suggest that the misfolding-prone conformation is less compact and exposes more hydrophobic sites (thus providing more accessible sites for Hsp40 binding) than the fold-competent conformation 4. Consequently, a substrate in the U state is more likely to be Hsp40-bound than one in the F state.
The key idea of my model is a mechanism by which Hsp70 actively drives a substrate toward the native state and away from the misfolded state. Based on the above assumptions, an Hsp70 molecule bound to a substrate molecule in the U state will on average have a substantially higher ATP hydrolysis rate (because of the higher probability of cis stimulation by an Hsp40 molecule bound to the same substrate molecule) than if it is bound to a substrate molecule in the F state. If the nucleotide exchange rate is between these two hydrolysis rates, an Hsp70 bound to a substrate in the U state will be driven toward the ADP-state, where it slowly dissociates from the substrate, while an Hsp70 bound to a substrate in the F state will be driven toward the ATP-state, where it rapidly dissociates from the substrate. This enables Hsp70, when bound to substrate molecules, to act like a Maxwell's demon 33: it quickly releases the fold-competent molecules so that they can fold, but it retains the misfolding-prone molecules to prevent misfolding. By this mechanism, Hsp70 drives the folding along the reaction path

M → U → U·C·ATP → U·C·ADP → F·C·ADP → F·C·ATP → F → N,

where S·C·X represents the complex between a substrate in conformation S and the chaperone C bound to nucleotide X = ATP, ADP (Fig. 1b). One ATP molecule is consumed in this reaction path and the free energy is used to compel the substrate into the native state.
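The mechanism described above can be illustrated with a minimal numerical sketch. The following toy model is not the fitted model of this paper: it uses eight states (M, U, F, N and the four complexes S·C·X), hypothetical first-order rate constants chosen only for illustration, equal numbers of binding sites in U and F, and constant free chaperone concentrations folded into pseudo-first-order binding rates. It solves for the steady state and compares f_N/f_M with its chaperone-free equilibrium value, once with equal ATP hydrolysis rates in U and F (no discrimination by Hsp40) and once with hydrolysis much faster in U than in F.

```python
# Toy 8-state sketch of the proposed mechanism. All rate constants are
# hypothetical (in 1/s) and are NOT the fitted parameters of the model.

STATES = ["M", "U", "F", "N", "UCT", "UCD", "FCT", "FCD"]  # T = ATP, D = ADP


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def native_to_misfolded_ratio(hyd_u, hyd_f, exchange=0.1):
    """Steady-state f_N / f_M for given ATP hydrolysis rates in U and F."""
    rates = {
        ("M", "U"): 0.01, ("U", "M"): 1.0,      # misfolding is favored
        ("U", "F"): 0.1, ("F", "U"): 0.1,
        ("F", "N"): 0.1, ("N", "F"): 0.01,
    }
    for s, hyd in (("U", hyd_u), ("F", hyd_f)):
        rates[(s, s + "CT")] = 1.0              # binding of ATP-bound Hsp70
        rates[(s + "CT", s)] = 1.0              # fast release in the ATP-state
        rates[(s, s + "CD")] = 0.001            # binding of ADP-bound Hsp70
        rates[(s + "CD", s)] = 0.001            # slow release in the ADP-state
        rates[(s + "CT", s + "CD")] = hyd       # ATP hydrolysis
        rates[(s + "CD", s + "CT")] = exchange  # nucleotide exchange
    for x in ("T", "D"):                        # U <-> F while chaperone-bound
        rates[("UC" + x, "FC" + x)] = rates[("U", "F")]
        rates[("FC" + x, "UC" + x)] = rates[("F", "U")]

    n = len(STATES)
    idx = {s: i for i, s in enumerate(STATES)}
    k = [[0.0] * n for _ in range(n)]           # master equation dp/dt = k p
    for (src, dst), rate in rates.items():
        k[idx[dst]][idx[src]] += rate
        k[idx[src]][idx[src]] -= rate
    k[n - 1] = [1.0] * n                        # replace one equation by sum(p) = 1
    p = solve_linear(k, [0.0] * (n - 1) + [1.0])
    return p[idx["N"]] / p[idx["M"]]


# Chaperone-free equilibrium: (N/F) * (F/U) * (U/M)
ratio_eq = (0.1 / 0.01) * (0.1 / 0.1) * (0.01 / 1.0)
ratio_equal_hyd = native_to_misfolded_ratio(hyd_u=0.01, hyd_f=0.01)
ratio_differential = native_to_misfolded_ratio(hyd_u=10.0, hyd_f=0.001)
print(ratio_eq, ratio_equal_hyd, ratio_differential)
```

With these illustrative numbers, the steady-state ratio with equal hydrolysis rates coincides with the equilibrium value, whereas differential hydrolysis raises f_N/f_M above it. The size of the effect depends entirely on the assumed rates.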
My model predicts that Hsp70 needs Hsp40 in order to alter the folding equilibrium. For Hsp70 to drive substrate folding toward the native state, the above mechanism does not require that more Hsp70 molecules bind to a substrate in the U state than to a substrate in the F state (although this is true and is reflected in previous models); instead, it requires that an individual Hsp70 molecule, when bound to a substrate, dissociates more slowly if the substrate is in the U state than if it is in the F state. Hsp40 thus plays a critical role because a substrate-bound Hsp70 distinguishes, probabilistically, between the U and F states of the substrate by sensing whether an Hsp40 is also bound to the same substrate. Defining the excess free energy of folding as

$$\Delta\Delta G = RT \ln \frac{f_N/f_M}{f_{N,\mathrm{eq}}/f_{M,\mathrm{eq}}},$$

where R is the gas constant and T the temperature, it can be shown algebraically (see Methods) and numerically (Fig. 1c) that without cochaperones, ΔΔG = 0. This prediction is consistent with the results from the single-molecule experiment of DnaK-mediated refolding 34, where DnaK alone in the presence of ATP was unable to alter the ratio of the misfolded and folded fractions.
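As a small numerical illustration of the excess free energy, the sketch below assumes the sign convention in which ΔΔG is positive when the chaperone raises f_N/f_M above its equilibrium value, that is, ΔΔG = RT ln[(f_N/f_M)/(f_N,eq/f_M,eq)]. The function name and the example fractions are hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def excess_free_energy(f_n, f_m, f_n_eq, f_m_eq, temperature_k=298.15):
    """Excess free energy of folding in kJ/mol.

    Assumed sign convention: positive when f_N/f_M exceeds the
    chaperone-free equilibrium ratio.
    """
    ratio = (f_n / f_m) / (f_n_eq / f_m_eq)
    return R_KJ_PER_MOL_K * temperature_k * math.log(ratio)


# Hypothetical fractions: a tenfold increase in f_N/f_M at 25 degrees C
ddg = excess_free_energy(f_n=0.4, f_m=0.4, f_n_eq=0.05, f_m_eq=0.5)
print(round(ddg, 2))  # about 5.71 kJ/mol, i.e. RT ln(10)
```

An unchanged ratio gives ΔΔG = 0, which is the result stated above for Hsp70 without cochaperones.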
In order to simplify calculations using my model, I assume that a protein's hydrophobic binding sites for Hsp40 and Hsp70, which can be exposed in the U and F states, become entirely buried upon folding (F → N) and misfolding (U → M), so that a protein in the M or N state has zero exposed hydrophobic binding sites and does not bind to either Hsp70 or Hsp40 (Fig. 1a). The hydrophobic burial in the misfolded state may be due to intramolecular contacts between incorrectly folded domains within a monomeric protein, or due to intermolecular contacts between different protein molecules as a result of oligomerization or mild, reversible aggregation. A protein may also aggregate irreversibly, and will not refold even in the presence of the chaperone system 5 [...]

Figure 1. [...] (Table 2). (b) The transitions between the microscopic states in the chaperone-mediated folding pathway. S·C·X represents the complex between the substrate in the conformational state S (= U, F) and the Hsp70 chaperone (denoted as C) bound to nucleotide X (= ATP, ADP). The transitions between S and S·C·X correspond to the chaperone binding to and unbinding from the substrate. The transition of S·C·ATP to S·C·ADP corresponds to ATP hydrolysis, and its reverse to nucleotide exchange. Hsp70 binding stabilizes the substrate in the intermediate states, thus catalyzing the folding reaction. Hsp40 (J) can form a ternary complex with the substrate and Hsp70, thus stimulating ATP hydrolysis, if the substrate is in the U state, but not if the substrate is in the F state. Differential ATP hydrolysis by Hsp70 bound to the substrate in the U and F states drives the refolding through the pathway highlighted in red. The lengths of the reaction arrows are linear with respect to the logarithms of the exemplary rate constants (in 1/s) for the DnaK/DnaJ/GrpE-mediated refolding of luciferase at 25 °C (Tables 1 and 2). [...] (Table 1).
My model quantitatively reproduces the experimentally observed refolding kinetics under various conditions, capturing the slow spontaneous refolding and denaturation of luciferase, the acceleration of refolding with chaperone assistance, and the necessity of GrpE (Fig. 2a). The intermediate conformations U and F in my model in the case of luciferase may correspond to the experimentally identified intermediate conformations $I_2$ and $I_1$ of luciferase 4: the free energy difference between N and F at 25 °C, according to the fitted parameters, is 20 kJ/mol, close to the experimental value of 15 kJ/mol between N and $I_1$, measured at 10 °C. Consistent with previous experimental observations 31, my model suggests that the Hsp70-mediated refolding proceeds in two steps: (1) rapid unfolding of the misfolded substrate, stabilized by the ADP-bound DnaK, followed by (2) slow conversion to the native state (Fig. 3a).
Next, I used my model to analyze how the refolding kinetics and yield change with respect to the DnaK concentration, which have been experimentally studied for LucDHis6, a variant of luciferase 31. [...] (Fig. 3b) keeps the protein folding out of equilibrium, elevating the native population above and suppressing the misfolded population below their respective equilibrium values (Fig. 2c). Notably, the refolding yield peaks around [DnaK] = 1 µM, and it decreases at higher DnaK concentrations. This non-monotonic dependence on the DnaK concentration was also reported for the wild-type luciferase while this work was under review 36. According to my model, the excess free energy at the steady state always increases with increasing DnaK concentrations, but the native population reaches a maximum and then decreases (Fig. 2c), because at high DnaK concentrations, the substrate is trapped in the DnaK-bound state and thus prevented from folding into the native state.

I used my model to estimate the ATP consumption in the DnaK/DnaJ/GrpE-mediated folding, which has been experimentally measured for LucDHis6 31 (Fig. 4). My model estimates that, in the initial minutes of refolding, approximately 150 ATP molecules are consumed to refold one LucDHis6 (Fig. 4a). This is reasonably close to the experimental result of ~50 ATP molecules consumed per refolded LucDHis6 when the stoichiometry of DnaK:LucDHis6 is 1:1, significantly higher than the experimental number of ~5 when LucDHis6 is in excess of DnaK, and significantly lower than the estimates of >1000 for many other substrates in other experiments 31,37-39. The discrepancy between the model and the experimental results may be attributable to the approximations in my model and the inaccuracies in the input kinetic parameters. ATP hydrolysis continues at the steady state and the free energy is utilized to promote the native state and suppress the misfolded state (Fig. 4b,c).
As [DnaK] exceeds 1 µM, the ATP consumption rate increases rapidly without a commensurate increase in the excess free energy. My analysis thus suggests that DnaK may be most free-energy efficient at maintaining protein folding out of equilibrium when its concentration is in the sub-micromolar range, a prediction that may be tested experimentally.
My model suggests that Hsp70 can keep a protein folded even if it thermodynamically tends to misfold and aggregate (here I consider only reversible aggregation). The chaperone is thus able to play a critical role in maintaining protein conformations, not just in the folding of nascent chains 40. Higher DnaK concentrations are required to suppress aggregation at increasing substrate concentrations (Fig. 5a) or at decreasing substrate stabilities (Fig. 5b). This may explain how cells that overexpress DnaK can tolerate higher numbers of mutations in the chaperone's substrates 41. Because the excess free energy plateaus at high chaperone concentrations (Fig. 2c), my results imply a limit on the chaperones' capacity to prevent aggregation: there exists a threshold of aggregation tendency (Fig. 5a,b, the black arrows) above which the chaperone can no longer maintain high native concentrations and prevent misfolding/aggregation at the same time.
My model suggests that Hsp70 only drives the folding of proteins with sufficiently slow conversion between the U and F states (Fig. 5c-e), implying that Hsp70 substrates tend to be slowly refolding proteins (Fig. 5d). If the conversion between U and F is too fast, the chaperone diminishes, rather than increases, the native fraction in comparison to the chaperone-free equilibrium. As the conversion slows, the chaperone drives the steady-state native fraction higher, but at the price of a longer refolding time (Fig. 5e), a trade-off reminiscent of that between speed and specificity in the kinetic proofreading mechanism 42,43, where the expenditure of free energy (such as from ATP or GTP consumption) is utilized to increase the specificity of chemical reactions.
My model explains the observation that folding is less efficient at both low and high DnaJ concentrations 17 (Fig. 6a). At low DnaJ concentrations, ATP hydrolysis is slow, and nucleotide exchange drives DnaK toward the ATP-state, in which it dissociates from the substrate rapidly and is thus unable to prevent aggregation. At high DnaJ concentrations, a large fraction of the substrate in the U state is bound to DnaJ. These DnaJ-bound substrate molecules are trapped in the U state, unable to progress toward the F state, resulting in diminished folding.

Figure caption fragment: [...] (red) and misfolded (blue) populations, but the ratio between the two remains unchanged from its equilibrium value: ΔΔG = 0 (orange, right y-axis).

Table footnotes: [...] Here, I have taken [...] and fit $k_{N \to F}$ to the refolding data. c There should be more accessible DnaK binding sites in the U state than in the F state. Here, I arbitrarily set the ratio between the two. Although the values of the other fitting parameters will change accordingly, the quality of the fit and the predictions of my model are insensitive to this ratio (at least for values between 1 and 100). d I assume that the effective distance, L, between the Hsp70 molecule and the J domain of the Hsp40 molecule bound to the same substrate molecule is L = 4.3 nm; the effective concentration is then [...]. The quality of the fit and the predictions of my model are insensitive to this parameter.

Table fragment (columns: Reaction, Parameter, Rate, Reference): [...] $1.28 \times 10^{6}\ \mathrm{M^{-1}\,s^{-1}}$; $1.17 \times 10^{4}\ \mathrm{M^{-1}\,s^{-1}}$ [...]

Table footnotes: [...] Substrate-free DnaK has a basal ATP hydrolysis rate of 0.001 s$^{-1}$, but the substrate can further accelerate ATP hydrolysis by up to 9-fold 17,53. I take the ATP hydrolysis rate of substrate-bound DnaK to be 10-fold higher than the basal rate. The predictions of my model are insensitive to this parameter. c This is the experimental rate for the temperature T = 25 °C. For the higher temperature T = 30 °C, I used an arbitrary but reasonable 5.5-fold higher value of 10 s$^{-1}$ because I could not find any reported experimental value for this temperature. d The kinetic rates for GrpE binding to DnaK were not determined, but the steady-state ratio in the two-step dissociation reaction, [...], was determined. I chose an arbitrary diffusion-limited association rate for GrpE binding to DnaK in my calculations. e I note that $k_C(0)$ and $E_a$ appear too large to be physically meaningful; they should instead be taken simply as numerical parameters that yield an excellent fit of the Arrhenius equation to the experimental data.
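For orientation, the effective concentration associated with the tethering distance L = 4.3 nm quoted in the table footnote above can be estimated under a simple assumption that is mine and not necessarily the one used for that footnote: one J domain uniformly distributed in a sphere of radius L around the substrate-bound Hsp70. The sketch below evaluates c_eff = 1/(N_A · (4/3)πL³); the function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def effective_concentration_molar(distance_nm):
    """Concentration (mol/L) of one molecule in a sphere of the given radius.

    Assumption for illustration only: the tethered partner is uniformly
    distributed in a sphere of radius `distance_nm`.
    """
    radius_dm = distance_nm * 1e-8  # 1 nm = 1e-8 dm, so the volume is in litres
    volume_l = 4.0 / 3.0 * math.pi * radius_dm ** 3
    return 1.0 / (AVOGADRO * volume_l)


c_eff = effective_concentration_molar(4.3)
print(f"{c_eff * 1e3:.2f} mM")  # roughly 5 mM
```

Under this assumption the effective concentration is in the millimolar range, far above typical micromolar solution concentrations of the cochaperone, which is why stimulation in cis by a co-bound Hsp40 can dominate.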
[...] Table 2, and the conditions of the experiments considered in this paper are summarized in Table 3.

Scientific REPORTs | (2018) 8:13213 | DOI:10.1038/s41598-018-31641-w

My model also explains the observation that folding decreases at both low and high GrpE concentrations 29 (Fig. 6b). For the chaperone to effectively assist folding, nucleotide exchange should be much slower than ATP hydrolysis when the chaperone binds to a substrate in the U state, but much faster than ATP hydrolysis when it binds to a substrate in the F state, so that the chaperone is driven toward the ADP-state in the former case, and toward the ATP-state in the latter case (Fig. 1b). At low GrpE concentrations, nucleotide exchange is slow, leaving DnaK bound to the substrate in the F state predominantly in the ADP-state (as reflected by the low population of F·C·ATP; Fig. 6b), which slows its dissociation from the substrate and thus prevents the latter from folding to the native state. At high GrpE concentrations, nucleotide exchange is fast, and DnaK is driven into the ATP-state and does not stay bound to the substrate in the U state long enough to prevent the substrate from aggregating, as reflected by the decreasing population of U·C·ADP (Fig. 6b). To maximize substrate folding, a higher nucleotide exchange rate should accompany a higher stimulated ATP hydrolysis rate (Fig. 6c-e).
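The requirement on the nucleotide exchange rate can be made concrete with a deliberately reduced two-state picture, which ignores chaperone binding, unbinding, and substrate conformational changes: a substrate-bound Hsp70 switches from the ATP-state to the ADP-state at the hydrolysis rate and back at the exchange rate, so the steady-state probability of the ADP-state is k_hyd/(k_hyd + k_ex). The rate values below are hypothetical and serve only to show the trend.

```python
def adp_state_probability(k_hyd, k_ex):
    """Steady-state probability of the ADP-state in a two-state ATP/ADP cycle."""
    return k_hyd / (k_hyd + k_ex)


K_HYD_U = 1.0   # hypothetical Hsp40-stimulated hydrolysis rate on U (1/s)
K_HYD_F = 0.01  # hypothetical hydrolysis rate on F (1/s)

for k_ex in (0.001, 0.1, 10.0):  # slow, intermediate, fast nucleotide exchange
    p_u = adp_state_probability(K_HYD_U, k_ex)
    p_f = adp_state_probability(K_HYD_F, k_ex)
    print(f"k_ex={k_ex:g}: P(ADP | U)={p_u:.3f}, P(ADP | F)={p_f:.3f}")
```

Only at the intermediate exchange rate is Hsp70 mostly in the ADP-state on U and mostly in the ATP-state on F. With slow exchange it is mostly ADP-bound on both, and with fast exchange mostly ATP-bound on both, so the nucleotide state no longer separates the two conformations.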
My model predicts that Hsp70 chaperones with higher Hsp40-stimulated ATP hydrolysis rates can drive substrate folding to higher native fractions (Fig. 6e), at the cost of higher free energy expenditure (Fig. 6f). This result explains a previous experimental observation that a small molecule that enhances ATP hydrolysis by Hsp40-bound Hsp70 can induce higher yields of substrate folding 44. Modulation of the ATP hydrolysis or the nucleotide exchange rates by small molecules may represent a therapeutic opportunity in the treatment of misfolding- or proteostasis-related diseases 45.

Discussion
Key to my model is the assumption that Hsp40 has different affinities for a substrate in different conformations, favoring the misfolding-prone conformation over the fold-competent conformation. This is supported by a number of experimental observations. Hsp40 binds to exposed hydrophobic sites using a Zn finger-like domain within its CTD 20. It has been inferred from experiments that the binding of Hsp40 CTD to substrate is the strongest for unfolded peptides, weaker for partially unfolded proteins, and the weakest for native proteins, and that Hsp40 is able to distinguish the substrate conformations 22. Hsp40 has been experimentally shown to bind to a number of denatured proteins, including denatured luciferase 20-22, but it binds to few native proteins, notably to σ32, which adopts a loosely folded and highly flexible conformation 46, and to RepE, the binding to which depends on the G/F-rich domain (outside CTD) of Hsp40 instead of the Zn finger-like domain 22. These results suggest that Hsp40 preferentially binds to loose conformations with many exposed hydrophobic sites, which, together with the experimental observation that the fold-competent conformation is more compact with fewer exposed hydrophobic sites than the misfolding-prone conformation 4, provides support for the assumption. Future experiments may directly test the assumption.
My model makes two distinct predictions that subject it to future experimental tests and possible falsification. First, it predicts that some thermodynamically unstable substrates depend on continuous Hsp70 assistance to maintain their native structures, and such a substrate in the steady state of Hsp70-mediated folding will gradually lose its native structure upon disruption of the chaperone system. Second, it predicts that an Hsp70 molecule bound to a substrate molecule will dissociate faster if the substrate is in the fold-competent conformation than if it is in the misfolding-prone conformation, and that this difference will disappear in the absence of Hsp40.

Figure 5. [...] (b) The chaperones can prevent aggregation at decreasing substrate stability. I vary the protein stability by changing the rate constant of conversion, $k_{N \to F}$, from the N state to the F state; the corresponding change in the folding free energy $\Delta\Delta G_{\mathrm{folding}}$ is indicated on the top axis. The native and the misfolded concentrations, as well as the DnaK concentration required to prevent aggregation, are shown as in panel a. (c,d) Hsp70 is more efficient at folding substrates with slower conversion between the U and the F states. Here, I take the kinetic parameters of luciferase folding at 25 °C, and simultaneously scale the forward and reverse rates of the reaction U ⇌ F by the same factor, thus changing the kinetics without affecting the folding equilibrium. The times, $t_{1/2}$, for the refolding of the misfolded substrate to reach half of the native fraction at equilibrium (spontaneous refolding) or the steady state (mediated by DnaK/DnaJ/GrpE), as well as the excess free energy (orange, right y-axis), are plotted against the hypothetical rates of conversion in c. The native fractions (red, left y-axis) and the excess free energy (orange, right y-axis) at the steady state are plotted against $t_{1/2}$ of spontaneous refolding in d; the equilibrium native fraction is shown as the red dotted line. [...]

Figure 6. [...], and the populations of F·C·ATP and U·C·ADP (right y-axis). The optimal GrpE concentration is indicated by the red stars, and the in-plot numbers show the corresponding ratios of the nucleotide exchange rate to the ATP hydrolysis rates in the U and F states. In a and b, the rates of GrpE-catalyzed nucleotide exchange and DnaJ-catalyzed ATP hydrolysis are adjusted for the temperature of 30 °C (see Methods and Table 1). (c) Folding efficiency at different hypothetical rates of nucleotide exchange, for different values of the ATP hydrolysis rate in the U state. Native fractions (solid lines, left y-axis) are diminished at both low and high nucleotide exchange rates. At high rates of nucleotide exchange, the excess free energies (dashed lines, right y-axis) approach zero, indicating that Hsp70 can no longer drive protein folding. (d) The excess free energy as a function of the nucleotide exchange and the DnaJ-catalyzed ATP hydrolysis rates. The rates used to model DnaK/DnaJ/GrpE-mediated folding at 30 °C are indicated by the red circle. (e) Folding efficiency increases with the DnaJ-catalyzed ATP hydrolysis rate, yielding higher native fractions (solid lines, left y-axis) and larger excess free energies (dashed lines, right y-axis). (f) Higher ATP hydrolysis rate yields larger excess free energy (orange, right y-axis, top), at the price of a higher rate of ATP consumption (red, left y-axis, top). The ratio of the two (bottom) changes only slightly.
In support of the first prediction above, a recent experiment has demonstrated that luciferase at 37 °C can be kept active by the DnaK/DnaJ/GrpE chaperone system when there is sufficient ATP, but it rapidly loses its activity when ATP is depleted by the addition of apyrase 13 . The interpretation of this experiment, however, is complicated because apyrase also affects the luciferase activity assay. Here, based on my model, I propose an alternative experiment, in which Hsp70-mediated maintenance of luciferase activity is disrupted by inhibiting the simultaneous binding of Hsp40 to Hsp70 and to the substrate protein. For example, an isolated J-domain (e.g., DnaJ with its CTD deleted) can be used to compete against Hsp40 in binding to the Hsp70; alternatively, two D-peptides known to compete against substrate for binding to DnaJ, without binding to DnaK, can be used to inhibit DnaJ binding to the substrate 47 . When the J-domain or the D-peptide is added in excess to luciferase kept active by the DnaK/DnaJ/GrpE system, my model predicts that luciferase will lose its activity.
The second prediction of my model may be tested by kinetic experiments. Luciferase can be unfolded to different extents at different concentrations of the chemical denaturant guanidinium chloride (GdmCl): luciferase adopts a more compact unfolded structure with fewer exposed hydrophobic sites at lower concentrations of the denaturant than at higher concentrations 4. My model predicts that, in the presence of Hsp40 and ATP, the dissociation rate of Hsp70 from luciferase denatured at low concentrations of GdmCl will be higher than from luciferase denatured at high concentrations of GdmCl, and that this difference in the dissociation rates will disappear in the absence of Hsp40.
Single-molecule experiments 25,34 may provide a more stringent test of the second prediction of my model, if one can monitor both the residence time of Hsp70 on a substrate molecule and the probability that the same substrate molecule subsequently folds into the native structure. My model predicts that in the presence of Hsp40 and ATP, the folding probability will be higher if the residence time is shorter, but this correlation will vanish in the absence of Hsp40. Such experiments may be feasible if, for instance, separate fluorescence signals to detect Hsp70-substrate binding and substrate folding become available.

Methods
Model of Hsp70-mediated protein folding. I denote Hsp70 as C, Hsp40 (J protein) as J, and the NEF as E.
[Y] denotes the solution concentration of the molecular species Y. There are four types of reactions explicitly considered in my model (Fig. 1b):
1) Hsp70 binding to the substrate: S + C·X ⇌ S·C·X, with S = U, F and X = ATP, ADP.
2) Conformational transitions of the substrate. An Hsp70-free substrate can adopt any of the four conformational states: M ⇌ U ⇌ F ⇌ N. The chaperone-bound substrate can only be in and transition between the U and F states: U·C·X ⇌ F·C·X.
3) ATP hydrolysis: S·C·ATP → S·C·ADP.
4) Nucleotide exchange: S·C·ADP → S·C·ATP.
The details of the kinetic rates of the above reactions are described below.
Hsp70 binding to the substrate. The [...]

Conformational transitions of the substrate. The transition rates between conformations S and S′ are different between a chaperone-free substrate ($k_{S \to S'}$) and a chaperone-bound substrate ($k_{S\cdot C\cdot X \to S'\cdot C\cdot X}$) (Fig. 1b). The condition of thermodynamic cycle closure dictates that [...]

Because Hsp40 has different affinities for different substrate conformations, the transition rates between the conformations will depend on whether the substrate is bound to Hsp40. I treat the effects of Hsp40 on the reactions implicitly by making the affected rate constants dependent on the solution Hsp40 concentration [J] (see below).
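The cycle-closure condition can be sketched as follows for the cycle S → S′ → S′·C·X → S·C·X → S, around which no ATP is hydrolyzed, so that detailed balance applies. The association constants $K_{S}^{X}$ for the binding of C·X to a substrate in conformation S are notation introduced here for illustration and may differ from the symbols used in the original equations.

```latex
% Assumed notation: K_S^X is the association constant of C.X for a
% substrate in conformation S (a hypothetical symbol, for illustration).
\frac{k_{S\cdot C\cdot X \to S'\cdot C\cdot X}}{k_{S'\cdot C\cdot X \to S\cdot C\cdot X}}
  = \frac{k_{S \to S'}}{k_{S' \to S}} \cdot \frac{K_{S'}^{X}}{K_{S}^{X}}
```

In words, binding a chaperone shifts the conformational equilibrium toward whichever conformation it binds more tightly, by exactly the ratio of the two binding constants.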
For the transition U·C·X ⇌ F·C·X, I assume that the bound chaperone does not hinder the substrate from going from the F state to the U state, because a binding site available in the F state is most likely also available in the U state (based on the assumption $n_U > n_F$). Thus I take $k_{F\cdot C\cdot X \to U\cdot C\cdot X} = k_{F \to U}$. It follows from thermodynamic cycle closure that the rate of the reverse transition (I use the superscript dagger to indicate rates that are influenced by the presence of Hsp40) is [...]

I take the rate of (reversible) aggregation to be proportional to the substrate concentration: [...]

[...] simplicity and lack of experimental parameters. In the absence of the nucleotide exchange factor, the rate-limiting step in the reaction is the dissociation of ADP, with the rate constant $k_{d,\mathrm{ADP}}$, whereas when catalyzed by the NEF, the rate-limiting step is the conformational change, with rate constant $k_C$ 29,50. The overall rate of reaction at a given NEF concentration, [E], is then approximately [...]

Solving the kinetic equations. To simplify the calculations of refolding kinetics, I make the approximation that the solution concentrations of Hsp70, Hsp40, and NEF remain constant throughout the refolding process, which is true if they are in large excess of the substrate-bound chaperone, cochaperone, and NEF. Under this approximation, refolding kinetics is described by a set of linear ordinary differential equations, which are solved by eigenvalue decomposition of the rate matrix. This simplification allows quick and robust fitting of the folding kinetic parameters to the experimental refolding data. The steady-state calculations do not use this approximation.
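As a minimal illustration of solving linear kinetics by eigenvalue decomposition, the sketch below treats a hypothetical two-state system A ⇌ B, for which the rate matrix has eigenvalues 0 and −(k_ab + k_ba) and the decomposition can be written in closed form. The full model has more states and would normally be handled with a numerical eigensolver; the state names and rate values here are illustrative only.

```python
import math


def two_state_populations(p_a0, k_ab, k_ba, t):
    """Populations of A and B at time t for A <-> B with first-order rates.

    The rate matrix [[-k_ab, k_ba], [k_ab, -k_ba]] has eigenvalue 0 with
    eigenvector (k_ba, k_ab) (the steady state) and eigenvalue -(k_ab + k_ba)
    with eigenvector (1, -1) (the relaxation mode). The total population is
    assumed to be normalized to 1.
    """
    total = k_ab + k_ba
    p_a_ss = k_ba / total         # steady-state population of A
    amplitude = p_a0 - p_a_ss     # coefficient of the decaying mode
    p_a = p_a_ss + amplitude * math.exp(-total * t)
    return p_a, 1.0 - p_a


for t in (0.0, 1.0, 10.0, 100.0):
    p_a, p_b = two_state_populations(p_a0=1.0, k_ab=0.3, k_ba=0.1, t=t)
    print(f"t={t:6.1f} s  p_A={p_a:.4f}  p_B={p_b:.4f}")
```

For a larger network the same structure holds: each eigenvector contributes one exponential term, and the eigenvector with zero eigenvalue gives the long-time limit.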