Reversible thermal unfolding of a yfdX protein with chaperone-like activity

yfdX proteins are ubiquitously present in a large number of virulent bacteria. A member of this protein family in E. coli is known to be up-regulated by the multidrug response regulator. Their abundance in such bacteria suggests an important yet unidentified functional role for these proteins. Here, we study the thermal response and stability of the yfdX protein STY3178 from Salmonella Typhi using circular dichroism, steady state fluorescence, dynamic light scattering and nuclear magnetic resonance experiments. We observe the protein to be stable up to a temperature of 45 °C. It folds back to the native conformation from the unfolded state after heating to temperatures as high as 80 °C. The kinetic measurements of unfolding and refolding show Arrhenius behavior, where refolding involves a lower activation energy barrier than unfolding. We propose a homology model to understand the stability of the protein. Our molecular dynamics simulation studies on this model structure at high temperature show that the structure of this protein is quite stable. Finally, we report a possible functional role of this protein as a chaperone, capable of preventing DTT induced aggregation of insulin. Our studies have broader implications for understanding the role of yfdX proteins in bacterial function and virulence.

Temperature induced unfolding and conformational stability. The α-helical secondary structure remains stable and nearly unchanged up to 50 °C when the protein is heated gradually in steps of 10 °C starting from 20 °C (Fig. 1a). The structural signature changes significantly beyond 50 °C. The two minima at 209 and 222 nm disappear completely at 60 °C (Fig. 1a), indicating total loss of secondary structure. No further change in the CD spectrum is observed at 70 °C compared to the 60 °C spectrum (Fig. 1a). Thus we find that the protein unfolds completely at 60 °C.
Next we monitor the position of the emission maximum for excitation at 280 nm in steady state fluorescence. The emission peak position of the protein remains unaltered up to 45 °C (Fig. 1b), beyond which the maximum starts shifting towards higher wavelength. A jump of nearly 10 nm in the emission peak position is observed between 45 °C and 55 °C. This shift towards higher wavelength is an indication of protein unfolding. Beyond 55 °C, the emission peak position does not change any further. We also observe a decrease in fluorescence emission intensity with increasing temperature. These observations are qualitatively similar to those from the CD measurements.
The HSQC spectrum of the ¹⁵N-labelled protein shows well dispersed peaks between 6 and 10.5 ppm at 25 °C (Fig. 1c). We observe a large number of peaks clustered around the 7.5-8.5 ppm region, as found for α-helical proteins (Fig. 1c). The spectrum also contains well dispersed peaks outside these clusters. This indicates the presence of some β-sheet elements in STY3178. The dispersion of the HSQC spectrum changes only marginally when we increase the temperature from 25 °C to 45 °C. This suggests that the tertiary structural fold is indeed stable at 45 °C, in agreement with the CD and fluorescence data. Only a few peaks in the HSQC spectra are sensitive to increasing temperature (Fig. 1c). Interestingly, most of the temperature-sensitive peaks belong to the well dispersed regions and not to the core helical region.

Reversibility in unfolding.
We check the reversibility of unfolding by monitoring refolding with CD while cooling the protein from an elevated temperature. The thermally unfolded protein at 70 °C is gradually cooled to 20 °C by decreasing the temperature in steps of 10 °C. The protein remains unfolded down to 50 °C (Fig. 1d). Upon further cooling to 40 °C, the protein refolds and the α-helical secondary structure is regained (Fig. 1d). The CD spectrum after cooling to 20 °C (Fig. 1d) is very similar to the native protein spectrum (Fig. 1a) recorded prior to thermal unfolding. This observation demonstrates reversibility in the thermal unfolding of STY3178. We observe a blue shift of ~8 nm, compared to the unfolded state, in the fluorescence emission peak position upon cooling (Fig. 1b). This again indicates protein refolding, in which a native-like structure is formed. We also observe an increase in fluorescence intensity upon refolding, in contrast to the decrease during unfolding. However, in the refolded protein the final emission peak intensity is lower and the peak position is red shifted by 2-3 nm compared to the native protein, suggesting a slight rearrangement of side-chain conformations. Overall, the steady state fluorescence and CD results agree, indicating reversibility in the thermal unfolding of this protein.
The conformational stability under equilibrium conditions is probed by heating the protein for a longer period of time. We heat the protein in water baths at different temperatures (50 °C to 100 °C) for 30 minutes and then cool it down to room temperature. The CD spectra of the cooled proteins are then compared with the native one. We find that the protein, when heated within the range 50 °C to 80 °C, refolds completely to a native-like structure, as shown in Fig. 1e. There is a slight decrease in ellipticity of the refolded protein when it is treated at 90 °C, indicating partial loss of structure (Fig. 1e). We further observe that no refolding is achieved when the protein is heated at 100 °C (Fig. 1e). Thus, the reversibility of folding is fully maintained in STY3178 for treatment up to 80 °C.
We measure the hydrodynamic size of STY3178 in the temperature range 20 °C to 60 °C (Fig. 1f) using DLS. The folded protein in solution at 20 °C has a hydrodynamic diameter of around 6.5 nm, as reported earlier 2 . The variation of this size with temperature is only around 1 nm, which is within the fluctuation limit. This indicates that the oligomerization state of the protein does not change upon heating and remains stable. When we cool the same protein from 60 °C to 20 °C, we again observe similar hydrodynamic sizes (Fig. 1f).
Heating or cooling rate dependence. We perform ellipticity measurements for different heating rates. The fraction of folded protein (f N ) is estimated from the ellipticity at 222 nm [θ 222 ] for each temperature (as described in the methods section). For none of the heating rates does the protein unfold below 50 °C, and f N remains constant. Above 50 °C, f N decreases for all rates (Fig. 2a). We observe a difference in the half denaturation temperature (T m ), the temperature at which f N = 0.5, for the different heating rates. T m is lower for slower heating rates and increases for faster heating rates, as tabulated in Table 1. In other words, the longer the protein remains at elevated temperature above 50 °C, the faster it unfolds. We also monitor the refolding of the protein using different cooling rates. The plot of f N versus temperature for different cooling rates (Fig. 2b) shows that refolding of the protein starts when the temperature falls below 50 °C. The half renaturation temperature (T m ʾ) of the protein is thus lower than the half denaturation temperature (T m ) (Table 1), indicating hysteresis in unfolding and refolding. These observations indicate that faster temperature scans leave insufficient time for the protein to reach equilibrium at a given temperature, causing the hysteresis. This is also indicative of a kinetically controlled process [29][30][31] . The hysteresis decreases with decreasing rates of heating and cooling.
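The estimate of f N and of the half denaturation temperature can be sketched in a few lines of Python. This is only an illustration: it assumes a two-state normalization of [θ 222 ] between native and unfolded baselines and a linear interpolation to locate f N = 0.5. The function names, the baseline values and the ellipticity readings are hypothetical and are not the measured data.

```python
def fraction_folded(theta, theta_native, theta_unfolded):
    """Two-state normalization (an assumption): 1 at the native baseline, 0 at the unfolded one."""
    return (theta - theta_unfolded) / (theta_native - theta_unfolded)


def half_transition_temperature(temps, f_n, level=0.5):
    """Linearly interpolate the temperature at which f_N crosses `level`."""
    points = list(zip(temps, f_n))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if (f0 - level) * (f1 - level) <= 0 and f0 != f1:
            return t0 + (level - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("f_N never crosses the requested level")


# Hypothetical ellipticity readings at 222 nm (mdeg) for one heating scan.
temps = [20, 30, 40, 50, 55, 60, 70]  # degrees Celsius
theta_222 = [-30.0, -30.0, -30.0, -29.0, -16.0, -3.0, -2.0]
theta_native, theta_unfolded = -30.0, -2.0  # assumed baselines

f_n = [fraction_folded(th, theta_native, theta_unfolded) for th in theta_222]
t_m = half_transition_temperature(temps, f_n)
print(f"T_m estimate: {t_m:.1f} degrees C")
```

Running the same two functions on scans recorded at different heating and cooling rates would give the T m and T m ʾ values whose difference measures the hysteresis.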
Kinetics of unfolding and refolding. The kinetics of unfolding of STY3178 is measured from the change in f N with time at a given temperature in the range 53 °C to 65 °C (see methods), where the f N values tend to zero after a sufficiently long time (Fig. 3a). f N decreases below 50% within approximately 50 minutes when the protein is heated in this temperature range. The plot of f N against time follows a single exponential decay. We obtain the rate of unfolding (k u ) from the exponential fit of f N versus time (a few representative cases are shown in Fig. 3b). The k u values follow Arrhenius behavior, as can be seen from the plot of ln k u versus 1/T, where T is the absolute temperature (Fig. 3c). The activation energy of unfolding (E a u ) is calculated from the slope of the plot. The value E a u ~ 246.9 kJ/mol obtained for STY3178 is comparable to the activation energies of unfolding reported in the literature [32][33][34] . In the temperature range below 53 °C the protein does not reach the completely unfolded state despite long measurement times. Hence this region has not been considered for estimating E a u .
Scientific Reports | 6:29541 | DOI: 10.1038/srep29541
We monitor the refolding kinetics of the protein from the change in ellipticity at 222 nm [θ 222 ], starting from the thermally unfolded state at 70 °C. The unfolded protein is cooled directly to different temperatures in the range 30 °C to 40 °C. There is a time lag (t L ) of about 100 seconds, after which f N increases rapidly to reach the folded state (f N ≈ 0.95) within approximately 300 seconds. The data for the entire measurement time (0 to 600 seconds), averaged over repeated sets of experiments, are shown in Fig. 3d for different temperatures. The finite t L is probably due to stabilization after temperature quenching and depends slightly on temperature. We calculate Δ f N = f N − f R , where f R is the value of f N at t = t L . The f R values are typically small but non zero for different temperatures. Similarly, we define the rise time Δ t = t − t L .
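The two fitting steps used for the unfolding kinetics, a single exponential fit for the rate and a linear fit of ln k against 1/T for the activation energy, can be sketched as follows. This is a minimal illustration using only the Python standard library, not the fitting procedure actually used for the data. It assumes that f N decays to zero, so that ln f N is linear in time. The decay curves are synthetic, generated from an assumed activation energy and prefactor, and all function names are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def unfolding_rate(times, f_n):
    """Rate k_u from f_N(t) = exp(-k_u t), via a linear fit of ln f_N against t."""
    slope, _ = linear_fit(times, [math.log(f) for f in f_n])
    return -slope


def activation_energy(temps_kelvin, rates):
    """Arrhenius: ln k = ln A - E_a/(R T), so E_a = -R * slope of ln k versus 1/T."""
    inv_t = [1.0 / t for t in temps_kelvin]
    slope, _ = linear_fit(inv_t, [math.log(k) for k in rates])
    return -R * slope


# Synthetic decay curves generated with assumed (hypothetical) parameters.
assumed_ea, prefactor = 250.0, 1.0e36  # kJ/mol and 1/s
temps_kelvin = [326.15, 330.15, 334.15, 338.15]  # 53, 57, 61 and 65 degrees C
times = [0.0, 600.0, 1200.0, 1800.0, 2400.0]  # seconds

rates = []
for temp in temps_kelvin:
    k_true = prefactor * math.exp(-assumed_ea / (R * temp))
    decay = [math.exp(-k_true * s) for s in times]
    rates.append(unfolding_rate(times, decay))

ea_fit = activation_energy(temps_kelvin, rates)
print(f"recovered E_a = {ea_fit:.1f} kJ/mol")
```

With noisy experimental curves, a nonlinear fit of the exponential is usually preferable to the log-linear shortcut used here, because taking logarithms amplifies the noise at small f N .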
The plots of Δ f N versus Δ t show a single exponential rise; a few of them are shown in Fig. 3e. We calculate the rate of refolding (k f ) from the fitted curves. Figure 3f shows the plot of ln k f versus 1/T, confirming Arrhenius behavior upon refolding. The activation energy of refolding (E a f ) is ~ −58.66 kJ/mol. The quality of the linear fit for refolding kinetics (Fig. 3f) is somewhat poorer than that for unfolding (Fig. 3c), as reflected in the R 2 values of the fits (0.97 for unfolding and 0.85 for refolding). This leads us to check the dependences of ln(k u /T) and ln(k f /T) on 1/T as suggested in ref. 20. The plots, shown in Supplementary Fig. S1, confirm linear dependences in both cases. This suggests that the activation heat capacities 20 for both unfolding and refolding are negligible.
Molecular modeling. In the absence of a molecular structure, we propose a homology model of STY3178 to understand the stability of the protein, based on the template yfdX protein structure (PDB 3DZA) from K. pneumoniae. The sequence similarity of STY3178 with the 3DZA protein is nearly 40%. The homology model contains residues A 11 -Q 183 , excluding 21 N-terminal residues of the construct. At the C-terminal end of STY3178 there are 16 residues for which no homology is obtained from the template. We therefore add these residues de novo using Swiss-PdbViewer and minimize the homology model for refinement. We perform an MD simulation of this minimized homology model for 300 ns at 310 K. The MD simulated monomer is primarily helical, containing ten helices in total along with a two stranded antiparallel β-sheet, as shown in Fig. 4a. The helices primarily form two helix bundles: helices H2, H3, H7 and H8 form one bundle, and helices H5, H6 and H10 form the other.
Molecular dynamics simulations at elevated temperatures. Our experimental data show that this protein unfolds at elevated temperature without change in its assembly. This leads us to test the thermal stability of the monomer model using MD simulations. We simulate structures at 310 K, 350 K and 400 K. We find that the root mean square fluctuation (RMSF) increases with temperature (Fig. 4b).
In particular, there are two regions which show enhanced RMSF with temperature. The first region is part of H3 and H4 along with the adjacent loop residues, and the other region comprises E1, H7 and the adjacent loops. The backbone dihedral (φ and ψ) distributions for the majority of residues do not change with increasing temperature (Figs 4c and S2). Only a handful of residues show sensitivity in the φ and ψ distributions: N14, D17 and N18 from H1; D69, W70 and N71 from H4; A132 from H6; S182, Q183, S184 and V185 from H9; and V193, H195, A197 and A198 from H10. An example of such a change is shown for residue W70 in Fig. 4d and for S184 in Supplementary Fig. S3. The Ramachandran plots for these residues indicate structural changes from helix to loop and β-sheet (Supplementary Fig. S4). The relative standard deviations at 350 K and 400 K with respect to those at 310 K, given by r 350 = σ 350 /σ 310 and r 400 = σ 400 /σ 310 respectively, are shown in Fig. 4e,f for different residues, colour coded according to their values. Residues showing r 350 , r 400 ≤ 1 for both φ and ψ are shown in green in Fig. 4e,f. The majority of residues belong to the region 1 < r 350 , r 400 < 4, with moderate fluctuations in either φ or ψ or both at 350 K and 400 K, and are shown in orange in Fig. 4e,f. The residues with r 350 , r 400 > 4 are marked in red and belong to the terminal helices H1, H9 and H10. Despite the enhancement in fluctuations, the overall structure of STY3178 remains stable. In contrast, we observe enhanced RMSF and loss of secondary structural elements for lysozyme at 400 K in our simulation using the same force field, as shown in Supplementary Fig. S5, in agreement with an earlier report 35 .
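The relative standard deviation and the colour coding described above can be illustrated with a short Python sketch. It rests on several assumptions: a plain (non-circular) standard deviation is used, which ignores wrap-around of angles at ±180°; a residue is coloured red when either φ or ψ exceeds the threshold; and a ratio of exactly 4, which the text leaves undefined, falls into the orange class. The function names and the angle samples are hypothetical.

```python
import math


def std_dev(angles_deg):
    """Population standard deviation of dihedral angles in degrees (no wrap-around handling)."""
    mean = sum(angles_deg) / len(angles_deg)
    return math.sqrt(sum((a - mean) ** 2 for a in angles_deg) / len(angles_deg))


def relative_sigma(angles_hot, angles_ref):
    """r_T = sigma_T / sigma_310 for one dihedral of one residue."""
    return std_dev(angles_hot) / std_dev(angles_ref)


def colour_class(r_phi, r_psi):
    """Green: both ratios <= 1. Red: either ratio > 4 (assumption). Orange: otherwise."""
    if r_phi <= 1 and r_psi <= 1:
        return "green"
    if r_phi > 4 or r_psi > 4:
        return "red"
    return "orange"


# Hypothetical dihedral samples (degrees) for a single residue.
phi_310 = [-62.0, -60.0, -58.0, -60.0]
phi_400 = [-70.0, -60.0, -50.0, -60.0]
psi_310 = [-43.0, -41.0, -39.0, -41.0]
psi_400 = [-44.0, -41.0, -38.0, -41.0]

r_phi = relative_sigma(phi_400, phi_310)
r_psi = relative_sigma(psi_400, psi_310)
label = colour_class(r_phi, r_psi)
print(f"r_phi = {r_phi:.2f}, r_psi = {r_psi:.2f}, class = {label}")
```

For residues whose dihedrals sample values near ±180°, a circular standard deviation would be the safer choice.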

Discussion
We predict the function of STY3178 using the bioinformatics tool MODexplorer 36 . The output from the MODexplorer server indicates nearly 50 proteins which show structural similarity > 50% with STY3178. Most of these 50 proteins have helices similar to those of STY3178. While some of the proteins with similar structures have no reported function, the majority of them are either chaperones or assist chaperone activity. These observations indicate that STY3178 may have chaperone activity. To verify this possibility, we monitor DTT induced aggregation of the insulin B-chain in the presence of STY3178. At 42 °C, in the absence of STY3178, we observe a steep rise in absorbance at 360 nm (A 360 ) with time, followed by a saturation plateau (Fig. 5), as reported earlier 37 . In the presence of STY3178 at the same temperature, we find lower absorbance at 360 nm with time, indicating less aggregation of the insulin B-chain in the presence of DTT. This prevention of aggregation depends on the amount of STY3178 present in the reaction mixture. Figure 5 shows the plot of absorbance at 360 nm (A 360 ) versus reaction time (30 minutes) in the absence (black) and presence of STY3178 at molar ratios to insulin of 0.1 (red), 0.25 (green) and 0.5 (blue). A systematic lowering of A 360 with increasing concentration of STY3178 demonstrates its capability to prevent insulin aggregation. Thus, we indeed observe in vitro that STY3178 is capable of chaperone activity.
The STY3178 sequence from GenBank (gene ID gi|16758993:c3049965-3049366) is submitted to the SignalP 4.1 server 38 to identify any possible signal peptide sequence. The construct of STY3178 used in the experiments does not contain 9 N-terminal residues. The SignalP server predicts the first 12 residues in the N-terminal region of this protein to be a signal peptide. Experimentally we observe that STY3178 is soluble in aqueous medium. This indicates that STY3178 cannot be a membrane embedded protein. Thus the presence of a predicted signal peptide at the N-terminus indicates a possible localization in the periplasm. The exact cellular localization of this protein is not yet established. However, servers like Cello 39,40 and LocTree3 41 predict its sub-cellular localization to be in the periplasm. Many chaperones have been identified in the periplasm of bacteria, such as peptidyl-prolyl isomerases and disulphide bond isomerases [42][43][44] . Since ATP is absent in the periplasm 42 , periplasmic chaperones are capable of performing their activity without ATP assistance. The predicted sub-cellular localization and the ATP independent chaperone activity of STY3178 are both consistent with its localization in the periplasm.

Conclusion
To summarize, our experimental studies reveal reversible thermal unfolding and structural stability at elevated temperature for STY3178. The CD, steady state fluorescence and NMR data show that the protein is stable up to 45 °C and folds back even after heating at 80 °C. Further, the time dependent CD measurements between 20 °C and 70 °C show reversibility in unfolding with hysteresis. The kinetics of unfolding and refolding both follow Arrhenius behavior, with activation energies of ~246.9 kJ/mol and −58.66 kJ/mol, respectively. Our molecular dynamics simulation studies on the proposed monomeric model structure of the protein show stability at high temperature. Most importantly, we find that STY3178 is capable of ATP independent chaperone activity. This observation matches well with the bioinformatically predicted periplasmic localization of the protein. Our studies indicate that STY3178 may be important for bacterial cellular functions through its chaperone activity. Hence yfdX proteins need to be fully characterized to understand bacterial virulence.

Methods
The gene sty3178, carried on plasmid pET28a, is overexpressed in E. coli using 0.2 mM isopropyl-β-D-thiogalactoside for 4 hours in a shaker (Innova 42, New Brunswick Scientific) as reported earlier 2 . Protein extraction and purification are done following the earlier reported protocol 2 . All the reported measurements are carried out in a buffer containing 50 mM phosphate (pH 7), 250 mM NaCl and 1 mM PMSF. In all the experiments, the protein concentration is calculated in terms of the monomer molecular weight (MW = 23107.7 Da).
Circular Dichroism. All the circular dichroism (CD) measurements are carried out in a Jasco J-815 CD spectrometer equipped with a Peltier temperature control unit (Jasco). The sample concentration and path length for all the experiments are 10 μM and 3 mm, respectively. Every CD spectrum is acquired in the far UV-region (200-250 nm).
Unfolding of the protein is monitored in the temperature range of 20 °C to 70 °C by increasing the temperature in increments of 10 °C, with equilibration for 10 minutes at each temperature (30 °C, 40 °C, 50 °C, 60 °C and 70 °C). Similarly, refolding of the protein is achieved by decreasing the temperature from 70 °C to 20 °C in decrements of 10 °C (60 °C, 50 °C, 40 °C, 30 °C and 20 °C). Background correction is done for each experiment by subtracting the ellipticity of the buffer in the far UV-region at the same temperature.
The transition from the folded state to the unfolded state is monitored by measuring the ellipticity at 222 nm [θ 222 ] upon gradual heating from 20 °C to 70 °C. The fraction of folded protein (f N ) at each temperature is calculated using the expression 29 [...]
[...] for this region (S 184 -H 199 ) using Swiss-PdbViewer 52 by adding one residue at a time and minimizing thereafter. Hydrogen atoms are added to the final model containing residues A 11 -H 199 . The model is solvated with explicit solvent in a rectangular parallelepiped water box of dimensions 69.4 × 62.2 × 73 Å³ and, after neutralization with counter ions, minimized using NAMD 53 for refinement.

Molecular dynamics simulation of protein model. NAMD 53 is used for the molecular dynamics (MD) simulation of the minimized structure. The CHARMM27 54 force field and the TIP3P 55 water model are used for the MD simulation, which is performed at 1 atm pressure in the isothermal-isobaric (NPT) ensemble following the standard protocol, with periodic boundary conditions and a 1 femtosecond (fs) time-step. The particle-mesh Ewald method is applied to treat electrostatic interactions. Energy minimization of 10,000 steps is performed prior to the 300 nanosecond (ns) simulation of the system, which contains 29848 atoms in total. The initial MD simulation is performed at 310 K. An equilibrated structure from the 310 K ensemble is used for the 350 K and 400 K simulations. The root mean square deviation (RMSD) of the Cα atoms is calculated at all the temperatures over the full trajectories up to 300 ns. We then estimate the root mean square fluctuation (RMSF) of the Cα atom of each residue at 310 K, 350 K and 400 K over the equilibrated trajectories. The histogram distribution, Ramachandran plot and standard deviation (σ) of the backbone dihedral angles φ and ψ of each residue are calculated over the 150-300 ns portion of the trajectories. The relative change in σ for each residue at 350 K (r 350 = σ 350 /σ 310 ) and 400 K (r 400 = σ 400 /σ 310 ) is then estimated with respect to that at 310 K.
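As an illustration of the RMSF estimate, the following Python sketch computes the per-atom fluctuation about the time-averaged position. It assumes that the frames have already been aligned to a common reference to remove overall translation and rotation, a step not shown here. The function name and the toy coordinates are hypothetical, and in practice a trajectory analysis package would be used instead.

```python
import math


def rmsf(frames):
    """Per-atom RMSF from aligned frames.

    frames: list of frames, each a list of (x, y, z) tuples, one per C-alpha atom.
    Returns sqrt(<|r_i - <r_i>|^2>) for every atom i, averaged over frames.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][d] for f in frames) / n_frames for d in range(3)]
        msd = sum(
            sum((f[i][d] - mean[d]) ** 2 for d in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


# Toy trajectory (hypothetical): one static atom and one oscillating along x.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
values = rmsf(frames)
print(values)
```

Comparing the per-residue values obtained at 310 K, 350 K and 400 K highlights the regions whose mobility grows with temperature.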
The MD simulation for the protein lysozyme (PDB 193L) is performed for 100 ns with a total of 21002 atoms at 310 K and 400 K, following the protocols mentioned above.
Chaperone activity assay. Thermal aggregation of 50 μM insulin in the presence of 20 mM DTT is monitored from the absorbance at 360 nm (A 360 ). A 10 mm path length cell is used in a BioSpectrometer (Eppendorf) for the A 360 measurement at 42 °C, as described in an earlier protocol 37 . The experiment is performed both in the absence and in the presence of pure STY3178. The molar ratios of STY3178 to insulin used in the assay are 0.1:1, 0.25:1 and 0.5:1. The buffer used in the experiment is 50 mM phosphate (pH 7), 250 mM NaCl and 1 mM PMSF. Both proteins are equilibrated at 42 °C for 10 minutes prior to the addition of DTT.
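The STY3178 concentrations implied by these molar ratios follow from simple arithmetic, sketched below in Python. The calculation assumes that the ratios refer to the STY3178 monomer, consistent with the statement in the methods that protein concentration is calculated from the monomer molecular weight. The function name is hypothetical.

```python
INSULIN_UM = 50.0  # insulin concentration in the assay, micromolar
MW_DA = 23107.7  # STY3178 monomer molecular weight, g/mol
RATIOS = [0.1, 0.25, 0.5]  # STY3178 : insulin molar ratios


def chaperone_concentrations(insulin_um, ratios, mw_da):
    """Return (ratio, micromolar, mg/mL) for each STY3178:insulin molar ratio."""
    rows = []
    for ratio in ratios:
        conc_um = ratio * insulin_um
        mg_per_ml = conc_um * 1e-6 * mw_da  # mol/L times g/mol gives g/L, i.e. mg/mL
        rows.append((ratio, conc_um, mg_per_ml))
    return rows


rows = chaperone_concentrations(INSULIN_UM, RATIOS, MW_DA)
for ratio, conc_um, mg_per_ml in rows:
    print(f"ratio {ratio}: {conc_um:.1f} uM STY3178 = {mg_per_ml:.3f} mg/mL")
```

The three ratios thus correspond to 5, 12.5 and 25 μM STY3178 monomer against 50 μM insulin.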