Transition between protein-like and polymer-like dynamic behavior: Internal friction in unfolded apomyoglobin depends on denaturing conditions

Equilibrium dynamics of different folding intermediates and denatured states is strongly connected to the exploration of the conformational space on the nanosecond time scale and might have implications for understanding protein folding. For the first time, the same protein system, apomyoglobin, has been investigated using neutron spin-echo spectroscopy in different states: native-like, partially folded (molten globule) and completely unfolded, following two different unfolding paths, using acid or guanidinium chloride (GdmCl). While the internal dynamics of the native-like state can be understood using normal mode analysis based on high-resolution structural information of myoglobin, models from polymer science are employed for the unfolded and even for the molten globule states. The Zimm model accurately describes the slowly relaxing, expanded GdmCl-denatured state, ignoring the individuality of the different amino acid side chains. The dynamics of the acid-unfolded and molten globule states are similar in the framework of the Zimm model with internal friction, where the chains still interact and hinder each other: the first Zimm relaxation time is as large as the internal friction time. Transient formation of secondary structure elements in the acid-unfolded state and the presence of α-helices in the molten globule state lead to internal friction to a similar extent.


Results
Structural properties. According to the circular dichroism (CD) measurements, apoMb in its native-like form at pD 6 contains 49% secondary structure elements (see Fig. 2). Under acid denaturation, apoMb contains 25% secondary structure elements at pD 4 and 4.3% at pD 2. At 3 M GdmCl, there are about 6% secondary structure elements in the protein molecule. The secondary structure contents of apoMb at pD 2 and of the GdmCl-denatured state are both very small and comparable to each other within the error of the applied technique.
Small-angle neutron scattering (SANS) was used to gain information on these folding states. The labile protons of the protein were exchanged with deuterium, and all solvents used were deuterated to decrease the incoherent neutron scattering. The pD value was determined as the pH meter read-out plus 0.4. Data from dilute (3-5 mg/mL) protein solutions, which show no signs of intermolecular interactions or aggregates, were used to characterize the form of the protein molecule in each folding/denaturation state (see Fig. 3). All measurements in this study were performed at 10 °C to minimize the risk of aggregation. The scattering curve of apoMb at pD 6 is well described by a generalized Guinier model 29 where Rg is the radius of gyration and α is a parameter describing the three-dimensional form of the protein. The model is valid in the range qRg < 1.3. With α = 0, the protein is a spheroid with Rg = 1.5 nm. With a hydrodynamic radius R_H of approximately 2 nm determined by dynamic light scattering (DLS), this state has a high degree of compactness: R_H/Rg = 1.32 (30 and the references therein). The theoretical limit of a solid sphere is given by R_H/Rg = (5/3)^0.5 = 1.29, while the average for a random-coil polymer (or a polymer in θ-solvent) gives a ratio of 0.65 31 .
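The two simple relations used in this paragraph, the pD correction and the compactness ratio R_H/Rg, can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, and the inputs are the rounded values quoted in the text, so the ratio comes out slightly different from the reported 1.32.

```python
import math


def pd_from_ph_reading(ph_reading: float) -> float:
    """pD as used in the text: pH-meter read-out plus 0.4."""
    return ph_reading + 0.4


def compactness_ratio(r_h_nm: float, r_g_nm: float) -> float:
    """Ratio of hydrodynamic radius to radius of gyration (same units)."""
    return r_h_nm / r_g_nm


# Solid-sphere limit quoted in the text: R_H/Rg = (5/3)^0.5
SOLID_SPHERE_LIMIT = math.sqrt(5.0 / 3.0)

# Rounded values from the text for apoMb at pD 6 (R_H ~ 2 nm, Rg = 1.5 nm)
ratio_pd6 = compactness_ratio(2.0, 1.5)
```

With the rounded inputs the ratio lies close to the solid-sphere limit, in line with the compact, globular character described for apoMb at pD 6.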
The measured SANS curves of the partially and completely unfolded proteins are well described by the polymer with excluded volume model 32,33 (see Table 1 and Fig. 3). This analytical model has been used to describe various polymer systems 34 . Whereas a Gaussian polymer chain has orientationally uncorrelated links between the beads and the length of these segments follows a Gaussian probability distribution, this model also considers excluded volume effects, reflected by the excluded volume parameter ν. It is related to the Porod exponent m through ν = 1/m and is also known as the critical exponent. The statistical segment length of the polymer chain, also known as the Kuhn length l, and the degree of polymerization n can be extracted from the corresponding formula of the model. The compactness of a polymer is also related to the excluded volume parameter ν 30 . Applying the polymer with excluded volume model to the present denatured protein structures is appropriate, given that the theory behind it is validated in practice by several techniques. In a simple picture, the denaturation by acid occurs because the amino acid side chains become protonated and repel each other, destabilizing the secondary structure elements. ApoMb at pD 4, the molten globule state with 30% content of secondary structure elements, is more compact (R_H/Rg = 1.18 and ν = 0.46) than apoMb at pD 2 (4% content of secondary structure elements, R_H/Rg = 0.67, ν = 0.55). Denaturation by GdmCl occurs through a slightly different process: some of the amino acid units become protonated (the pH meter read-out for the buffer of the GdmCl-denatured protein is 4.5), and the guanidinium chloride molecules interact with the protein chain, leading to an expansion of the unfolded molecule 35-37 . This is reflected in our data: larger Rg and R_H values, and also less compactness compared to the other unfolded states (ν = 0.64). Similar to the urea-unfolded state of apoMb investigated by Eliezer et al. 38 , this could be a mixture of monomer and dimer. In other words, apoMb at pD 6 is a typical globular protein, whereas the partially and completely acid-unfolded states, apoMb at pD 4 and pD 2, are more compact than the denaturant-unfolded state. The GdmCl-denatured state has a larger size and lacks compactness. ApoMb at pD 2 has the typical R_H/Rg value for a polymer in good solvent 39,40 and the typical ν-value for a chain with excluded volume interactions 41 .
Figure. Normalized Kratky-Porod representation of the SANS data with the models used to obtain the form factor. The apoMb at pD 6 structure shows the characteristic peak of a globular protein; apoMb at pD 4 is a typical molten globule (reaching a maximum at qRg = 0.2 Å⁻¹ · 25.4 Å ≈ 5). The pD 2 and GdmCl data are specific for unfolded states.
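To make the excluded-volume polymer model more concrete, the following sketch evaluates its form factor numerically. It assumes the commonly used form of the model, P(q) = 2∫₀¹ (1 − x) exp(−U x^(2ν)) dx with U = q²Rg²(2ν + 1)(2ν + 2)/6, together with Rg² = l² n^(2ν) / ((2ν + 1)(2ν + 2)); the formula used by the authors is not reproduced in the text, so this choice and all function names are assumptions made for illustration.

```python
import math


def excluded_volume_form_factor(q: float, rg: float, nu: float,
                                steps: int = 2000) -> float:
    """Form factor of a polymer with excluded volume (assumed standard form).

    Integrates 2 * (1 - x) * exp(-U * x**(2*nu)) over x in [0, 1]
    with the midpoint rule; q and rg must use reciprocal units.
    """
    u = (q * rg) ** 2 * (2.0 * nu + 1.0) * (2.0 * nu + 2.0) / 6.0
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += (1.0 - x) * math.exp(-u * x ** (2.0 * nu))
    return 2.0 * h * total


def debye_form_factor(q: float, rg: float) -> float:
    """Debye function of a Gaussian chain, the nu = 0.5 limit."""
    u = (q * rg) ** 2
    if u == 0.0:
        return 1.0
    return 2.0 * (math.exp(-u) + u - 1.0) / u ** 2


def kuhn_length(rg: float, nu: float, n: float) -> float:
    """Invert Rg^2 = l^2 n^(2 nu) / ((2 nu + 1)(2 nu + 2)) for l."""
    return rg * math.sqrt((2.0 * nu + 1.0) * (2.0 * nu + 2.0)) / n ** nu
```

For ν = 0.5 the integral reduces to the Debye function of a Gaussian chain, which provides a simple consistency check of the numerical integration.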
The structure factor, which is concentration-dependent, is obtained by dividing the scattering curve of the concentrated solution by the form factor (see SI). The data are smoothed and averaged to reduce the noise. Whereas the form factor describes the shape of a molecule in solution, the structure factor characterizes the interaction between these molecules. The structure factor is needed in order to correct the dynamics data reported later. Intermolecular interactions are well described in the case of apoMb by a mean spherical approximation (MSA) structure factor 42,43 , originally developed for macro-ion solutions. The model was implemented using the Python package Jscatter 44 , an adaptation of the original Fortran code 45 . ApoMb at pD 6 is close to its isoelectric point (estimated by ExPASy 46 to lie at 7.20), so there is only a slight difference between the number of positively and negatively charged residues. The charge on the surface is not distributed uniformly and the monomers attract each other (the structure factor is larger than 1 in the low q-regime). The curve has its minimum at q = 0.07 Å⁻¹, suggesting that monomers start to interact with each other at a typical distance of 2π/q = 90 Å. A radius of gyration of 17 Å (close to the one obtained by fitting the form factor) and a screening length of 30 Å are obtained by fitting the MSA model. In contrast to apoMb at pD 6, the structure factors of the solutions of apoMb denatured by acid and by GdmCl show that the monomers repel each other. This repulsion can be attributed to the charge state (fit results are available in the SI).
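The division step and the conversion of a q-value into a real-space distance can be sketched in a few lines. The concentration normalization shown here is an assumption (the text only states that the concentrated curve is divided by the form factor), and both function names are hypothetical.

```python
import math
from typing import List, Sequence


def structure_factor(i_conc: Sequence[float], i_dilute: Sequence[float],
                     c_conc: float, c_dilute: float) -> List[float]:
    """Pointwise S(q) = (I_conc / c_conc) / (I_dilute / c_dilute).

    Both curves must be sampled on the same q-grid; the dilute curve
    stands in for the form factor (assumption made for this sketch).
    """
    if len(i_conc) != len(i_dilute):
        raise ValueError("curves must share the same q-grid")
    return [(a / c_conc) / (b / c_dilute) for a, b in zip(i_conc, i_dilute)]


def correlation_distance(q_inv_angstrom: float) -> float:
    """Real-space distance 2*pi/q (in Å) associated with a q-value in 1/Å."""
    return 2.0 * math.pi / q_inv_angstrom
```

For q = 0.07 Å⁻¹ the second function returns roughly 90 Å, the typical interaction distance quoted for apoMb at pD 6.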
Dynamical properties. Neutron spin-echo spectroscopy (NSE) measures temporal and spatial correlations between different scattering particles and from internal motions within the particles, resulting in the normalized intermediate scattering function (ISF) S(q, t)/S(q, 0). The ISF can be analyzed for each q-value separately, either through its initial slope or as a stretched exponential (Kohlrausch-Williams-Watts). Alternatively, the data can be modelled simultaneously for all q-values according to polymer models.
Investigation of the spectra initial slope. From the initial slope of the ISF, the effective diffusion coefficient D_1 is obtained using S(q, t)/S(q, 0) = A exp(−D_1 t − D_2 t^2) (see SI). According to the theory of de Gennes 40 and Doi 22 , the overlap concentration c* = M/(N_A 4πRg^3/3) is the border between the dilute and the semidilute regime of a polymer solution. ApoMb has a molecular weight of M = 16951 g/mol (the molecular weight of myoglobin minus that of the heme group), and for Rg = 2 nm the calculated overlap concentration is c* = 840 g/L. At 30 mg/mL, the solution is far below the overlap concentration and can thus be treated as a dilute solution. With an assumed Rg of 3 nm the overlap concentration would be c* = 249 g/L, still about one order of magnitude larger than the maximum protein concentration used in the experiments presented here. However, this classification is derived for polymer systems and does not account for any surface charge or forces between the protein molecules. Empirically, it was shown that intermolecular interactions and solvent-mediated interactions have to be considered as well 47 . Intermolecular interactions are represented by the structure factor. Solvent-mediated interactions are represented by the hydrodynamic function H(c, q), which can be approximated as a q-independent constant, given that its value in the low q-regime is close to its value in the high q-regime. At low q-values, H_{c,q0} = D_c S_{q0}/D_0, where D_0 is the extrapolated diffusion constant at infinite dilution, D_c is the diffusion coefficient at concentration c measured by DLS, and S_{q0} = S(q = 0.026 nm⁻¹) is the value of the structure factor at the DLS-specific q-value. At large q-values, the hydrodynamic function H_{c,qL} can be approximated as the ratio between the measured viscosities of the concentrated and the dilute protein solution, η_conc and η_{c=0}, respectively. For these solutions of apoMb, the values of H_{c,q0} and H_{c,qL} are close to each other (see Fig. 3) and we assume that the hydrodynamic functions are constant in the q-range of interest.
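The overlap concentrations quoted above follow directly from c* = M/(N_A 4πRg³/3). A minimal sketch of this arithmetic is given below; the function name is hypothetical and the only inputs are the molecular weight and the radii of gyration stated in the text.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def overlap_concentration_g_per_l(molar_mass_g_mol: float, rg_nm: float) -> float:
    """Overlap concentration c* = M / (N_A * 4/3 * pi * Rg^3) in g/L."""
    rg_cm = rg_nm * 1e-7                              # nm -> cm
    volume_cm3 = 4.0 / 3.0 * math.pi * rg_cm ** 3     # sphere of radius Rg
    c_g_per_cm3 = molar_mass_g_mol / (AVOGADRO * volume_cm3)
    return c_g_per_cm3 * 1000.0                       # g/cm^3 -> g/L


c_star_2nm = overlap_concentration_g_per_l(16951.0, 2.0)  # about 840 g/L
c_star_3nm = overlap_concentration_g_per_l(16951.0, 3.0)  # about 249 g/L
```

Both values are far above the highest protein concentration of 30 mg/mL (30 g/L), which is why the solutions are treated as dilute in the polymer sense.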
Thereby, the effective diffusion coefficients D_eff for the protein monomers are obtained (see Fig. 4). They comprise information on translational diffusion, rotational diffusion and internal dynamics of the single molecule. To a good approximation, these motions can be decoupled 18 . For apoMb denatured by acid (at pD 2 and pD 4) and by GdmCl, D_eff has a linear dependence on q, which is specific for the Zimm regime of local chain relaxations 48 , whereas for apoMb at pD 6 the value of D_eff has a non-linear dependence on q (see Fig. 4). The dynamics of the mostly folded protein apoMb at pD 6 deviates from the dynamics of the more denatured protein solutions. It is therefore discussed in the following paragraphs. First, translational and rotational diffusion can be determined in the rigid-body approximation, directly from PDB structures using HYDROPRO 49 . ApoMb at pD 6 resembles the native structure of myoglobin. Given that there are no available PDB structures of the heme-free forms, and motivated by the work of Stadler et al. 50 , which shows that myoglobin and apoMb at pD 6 have similar characteristics in solution, the crystal structure of myoglobin (PDB ID: 2v1k) was used for the calculation. For T = 283.15 K, η = 1.67 mPa·s, φ = 0.720 cm³/g (solute partial specific volume) and ρ = 1 g/cm³ (solution density), the 9x9 diffusion matrix D is obtained. It comprises the translational and the rotational diffusion matrices, whose traces are the translational diffusion coefficient of 5.96 Å²/ns and the rotational diffusion coefficient of 9.83 μs⁻¹.
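As a plausibility check of the rigid-body translational diffusion coefficient, the Stokes-Einstein relation R = k_B T/(6πηD) can be applied to the values given above. This is not part of the HYDROPRO calculation itself; it is a sketch, with a hypothetical function name, that treats the protein as an equivalent sphere.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def stokes_radius_nm(d_trans_a2_per_ns: float, temperature_k: float,
                     viscosity_pa_s: float) -> float:
    """Equivalent-sphere hydrodynamic radius from Stokes-Einstein, in nm."""
    d_si = d_trans_a2_per_ns * 1e-20 / 1e-9  # Å^2/ns -> m^2/s
    radius_m = K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * d_si)
    return radius_m * 1e9                    # m -> nm


# Values from the text: D = 5.96 Å^2/ns, T = 283.15 K, eta = 1.67 mPa·s
r_h_rigid_body = stokes_radius_nm(5.96, 283.15, 1.67e-3)
```

The result is close to 2.1 nm, consistent with the hydrodynamic radius of approximately 2 nm determined by DLS for apoMb at pD 6.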
The q-dependence of the coupled rotational and translational diffusion is obtained from the coordinates r of the amino acids in the protein, their individual neutron scattering lengths b, the form factor F(q), and the diffusion matrix obtained above, using the formula derived by Ortega et al. 47 . The brackets represent the ensemble average over the remaining variables. While the integration over the position space for the single particle is 1, the orientation average can be replaced by an averaging over q-space. The exchange occurring between the protons at the protein surface and the solvent is also considered. The calculated D_0(q) values are shown in Fig. 5 together with the experimentally derived D_eff(q) values. As can be seen in Fig. 5, the difference ΔD_eff(q) = D_eff(q) − D_0(q) between the measured NSE data points and the calculated D_trans−rot accounts for approximately 20% of the total dynamics and can be due to internal movements of the α-helices or other internal dynamic processes. We performed a normal mode analysis using the MMTK package 51-53 . We determined the effective diffusion specific to the first non-trivial mode, mode number 7, at the temperature T. We observe that the diffusion coefficient of the first non-trivial normal mode of the PDB structure 2v1k, describing the movement of the α-helices which allows access to the heme group, shows a similar dependence on q (see inset of Fig. 5).
Investigation of the spectra using stretched exponential functions. Another common practice in NSE data evaluation is modeling with a stretched exponential function, characteristic of relaxation processes. The stretching exponent β for apoMb at pD 6 is, averaged over all q-dependent data sets, 0.9, a value close to 1, so that the protein is seen rather as a point, where translational diffusion dominates and the internal dynamics is small in comparison (about 20%). In contrast, for the pD 2 [...]
τ_p is the relaxation time characteristic of the normal mode p, with η the solvent viscosity, ν the critical exponent, k_B the Boltzmann constant and T the temperature. R_E is the end-to-end distance of the polymer chain. In the exponent of the first term of equation 3 one can find the hydrodynamic function H(c,q) mentioned earlier, divided by the structure factor S(c,q). In the Zimm and ZIF models, the normal modes all have the same amplitude: A(p) = 1. Internal friction reflects the intrinsic resistance of a polymer to changes in its conformation and occurs due to dihedral angle rotational barriers, hydrogen bonding or intrachain collisions. As opposed to the Zimm model, the ZIF model incorporates the internal friction of the polymer chain as a resistive spring installed in parallel to the entropic spring connecting the beads. By solving the Langevin equation, a mode-independent relaxation time τ_intern is obtained. It is added to each Zimm mode τ_p so that τ_pZIF = τ_p + τ_intern. In this way, the higher-frequency normal modes of the Zimm model are damped in the ZIF model.
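The effect of adding a mode-independent internal friction time to every Zimm mode can be illustrated with a short sketch. It assumes the usual Zimm scaling τ_p = τ_1 p^(−3ν) for the mode relaxation times, with the first-mode time τ_1 passed in directly instead of being computed from η, R_E and T; the function names and the numerical values in the example are hypothetical.

```python
from typing import List


def zimm_mode_times(tau_1: float, nu: float, n_modes: int) -> List[float]:
    """Zimm relaxation times tau_p = tau_1 * p**(-3*nu), p = 1..n_modes (assumed scaling)."""
    return [tau_1 * p ** (-3.0 * nu) for p in range(1, n_modes + 1)]


def zif_mode_times(tau_1: float, tau_intern: float, nu: float,
                   n_modes: int) -> List[float]:
    """ZIF relaxation times tau_pZIF = tau_p + tau_intern, as stated in the text."""
    return [tau_p + tau_intern for tau_p in zimm_mode_times(tau_1, nu, n_modes)]


# Illustrative numbers only: first Zimm time equal to the internal friction time
zimm = zimm_mode_times(10.0, 0.5, 20)
zif = zif_mode_times(10.0, 10.0, 0.5, 20)
```

In the pure Zimm case the relaxation times of the higher modes fall off rapidly with p, whereas in the ZIF case no mode can relax faster than τ_intern, which is the damping of the higher-frequency modes described above.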
The NSE spectra can be simulated based on the equation defining I(q, t). Using the information on the translational diffusion from DLS, on the viscosity (from direct measurements), on the hydrodynamic function (see Table 2) and on the critical exponent ν and Rg obtained from the SANS data (see Table 1), the simulation can be performed. In Fig. 6a,b, the dotted lines are simulated NSE spectra of apoMb at pD 4 and pD 2 using the Zimm model, under the assumption that the polymer consists of 20 beads. The simulation reproduces the spectra well, but the large q-values and the longer Fourier times are not described properly by the Zimm model. Without any prior knowledge of the internal friction time, the ZIF model was fitted simultaneously for all q for each sample, with only D and τ_intern as free parameters. The fit results are presented in Table 3. The values obtained for the center-of-mass diffusion coefficients D are comparable within error bars with the ones obtained via DLS measurements for both pD 2 and pD 4. Although apoMb at pD 4 is a molten globule and has a significantly higher content of secondary structure elements, its dynamics can still be understood similarly to that of the totally unfolded state. The whole structure needs a similar time to relax (t_Zimm) and both protein states experience a similar internal friction (τ_intern). However, for apoMb at pD 4, the ZIF model deviates significantly from the experimental NSE spectra at longer Fourier times, especially for the ISF at q = 0.07 Å⁻¹, which is reflected in the larger χ² value. This could be because the model does not account for any residual secondary structure content. An interpretation of the experimental NSE data might be achieved by coarse-grained computer simulations, which are beyond the scope of the present manuscript. We refer to future studies to clarify that aspect.
In contrast to these two, apoMb denatured by GdmCl has almost double the Zimm relaxation time and no internal friction time is observed (see Fig. 6c). The dynamics of GdmCl-denatured apoMb can be described very well using the Zimm model only. This supports the mechanism of denaturation described by Heyda et al. and Huerta et al. 35,37 : GdmCl increases the solubility of hydrophobic residues and the local energetic barriers are lowered. The trends observed for denatured states of intrinsically disordered proteins (IDPs) in different concentrations of GdmCl 36 are confirmed. Several studies including the Zimm model support the idea that the centre-of-mass diffusion coefficient of the protein scales with the chain length, or with the bead number N 6,26,36 . These studies are performed on proteins where the chain length is varied, which is not the case for the present work: apoMb always consists of the same number of amino acids, independently of its folding state. The bead number should not be confused with the chain length. The choice of the bead number when the protein is considered a polymer is arbitrary, but even when we increase the bead number, having less than 7 amino acids per bead, the ZIF model does not change its validity (see Supplementary Information).
Table 2. Values of the hydrodynamic functions in the low (H_{c,q0}) and large q-regime (H_{c,qL}) determined by different methods. The SANS, DLS and viscosity measurements were performed at 283 K. The viscosity value η_conc of the apoMb pD 2 solution with the highest concentration could not be determined accurately.
Further polymer models (Zimm with damping of the mode amplitudes 21 , compacted Zimm with internal friction 26 , the Zimm analogues of the Rouse model with non-local interactions and of the Rouse model with anharmonic potentials 54 ) have been considered to interpret these data, but none leads to better results.
Some studies claiming that internal friction does not play a role are performed on smaller proteins 28 or the solvent viscosity is varied significantly 55 . In those cases, the friction with the solvent, and not the internal friction may be the dominant dissipation mechanism. For the data presented in this work for apoMb at pD 2 and pD 4, the ZIF model is the best fit.

Discussion
By comparing two different denaturation pathways, we could gain insights into the effect of the denaturant on the structure and dynamics of the model system apomyoglobin. Both pathways started from the native-like form, apoMb at pD 6. The protein in this folding state retains many structural features of the holoprotein, and its dynamics shows internal collective modes, which are no longer seen in any of the unfolded states investigated (see Fig. 1). Its internal dynamics, accounting for less than 20% of the total dynamics of the protein, is of biological relevance: the α-helices perform this movement to incorporate the heme group during protein synthesis 56 .
In the case of acid denaturation, apoMb at pD 4 has a high content of secondary structure elements, observed by CD spectroscopy and SANS. However, its dynamics can, if at all, be described by the same polymer model (ZIF) as the dynamics of the acid-unfolded state, apoMb at pD 2 (see Fig. 1B,C,G). Although similar Zimm relaxation and internal friction times are obtained, the data are not modelled as well. The GdmCl-unfolded apoMb does not show internal friction, suggesting that this denaturant screens the protein chains, reducing the interaction between them (see Fig. 1D,F). The observations of Zheng et al. 56 , Borgia et al. 36 and Samanta et al. 26 on IDPs are also confirmed for apoMb: internal friction becomes larger with a considerable increase of protein compactness.
Previous QENS experiments showed that molecular dynamics on the faster ps to ns time scale are similar between apoMb at pD 2 and apoMb at pD 4, but differ significantly from apoMb at pD 6 57 . That dynamic picture is corroborated here by NSE for slower collective dynamics as well. The first folding step in apoMb does not have a significant effect on collective internal dynamics. A fundamental change in the physical nature of the dynamics of Mb due to protein folding occurs only in the following folding step into the native state, where the heme pocket is formed. By comparing the internal friction in apoMb at pD 4 with that of an IDP with a similar content of secondary structure 20 , we see that internal friction dominates the Zimm mode spectrum even more strongly for the IDP than for apoMb at pD 4. This shows that apoMb at pD 4 and apoMb at pD 2 still need to be seen as comparatively soft protein conformations. Therefore, the formation of the G and H helices in the apoMb at pD 4 state is not that important for the motions seen by NSE. Motions in apoMb at pD 2 and pD 4 are rather influenced by the transient formation of secondary structure content. If more information on intermediate states experiencing constant folding/refolding transitions were available, the dynamics of the denatured proteins observed by NSE could be modelled as an equilibrium, an average distribution of the intermediate state dynamics. Recent single-molecule techniques allow the observation of such intermediate states 58 , whilst theories such as Zimm-Bragg 59 claim that chemical unfolding is a multi-state process involving a mixture of conformations. To relate the NSE observations directly to an in-depth understanding of the chemical unfolding process of apoMb, such experiments and theories would be necessary.
Although proteins are known to adopt their unique structure based on the individuality of their amino acid side chains, coarse-grained polymer models can characterize the nanosecond dynamics. In the case of GdmCl-denatured apomyoglobin, the protein loses all its protein-like features and behaves like a Zimm polymer. This is mostly due to the binding of GdmCl to the side chains, which removes their individuality and leads to a more polymer-like behavior. Moreover, apoMb at pD 2, which could still exhibit hydrogen bonding and some transient elements of secondary structure, loses its protein-like features, but behaves like a non-ideal polymer, with internal friction.

Methods
Sample preparation. ApoMb was prepared from horse-heart myoglobin (Sigma-Aldrich) following the butanone method to extract the heme group (as performed in 50 , adapting the method described in 60 ), and then refolded by dialysis in 20 mM NaH₂PO₄/Na₂HPO₄ (Sigma Life Science, >99.5% and Sigma-Aldrich, >99%) pH 7 buffer and distilled water. Before storage in the freezer, the apoMb solution was lyophilized. To replace the exchangeable protons by deuterium ions, the freeze-dried apoMb powder was dissolved in heavy water (99.9% ²H, Sigma-Aldrich), incubated for 1 day, and lyophilized again. The obtained powder was stored at −20 °C. In order to obtain the molten globule state of apoMb, the powder was dissolved in ²H₂O and centrifuged to remove the large aggregates. To the supernatant solution of concentration 2 mg/mL and pH 6, 0.1 M ²HCl (Sigma-Aldrich) was added until the pH read-out value was 3.6 (monitored by a Metrohm pH meter). This corresponds to a pD value of 4. The buffer-exchanged protein solution was centrifuged (Heraeus Instruments) to the final concentrations (Vivaspin 3,000 MWCO concentration units, Sartorius, Göttingen, Germany).
Circular dichroism (CD). Circular dichroism was measured on a Jasco J1100 spectropolarimeter (JASCO, Tokyo, Japan), in the range 180-250 nm, with a pitch of 1 nm, a scanning speed of 100 nm/min, and 3 accumulations per measurement. The samples were measured at a concentration of 300 μM in 0.01 cm thick quartz cuvettes under constant nitrogen flow at 10 °C. According to the BeStSel Single Spectrum Analysis 61 , the α-helix composition of apoMb varied as follows: pD 6: 49%, pD 4: 25%, pD 2: 4.3%, GdmCl: 6%. In the case of the GdmCl-denatured solution, only the range 200-240 nm was considered for data analysis because GdmCl absorbs strongly in the range 180-200 nm.

Small-angle neutron scattering (SANS).
The scattering vector q is defined as q = 4nπ/λ sin(θ/2) with the incident neutron wavelength λ and the scattering angle θ. The investigation of the form and structure factor was performed for apoMb at pD 2 and pD 6 at the instrument KWS-2 at the MLZ in Garching 63 . The in situ DLS option at this instrument helped to acquire data confirming that the samples did not show considerable aggregation during the neutron measurement. Protein concentrations were 3, 6, 15 and 30 mg/mL. The corresponding buffers, empty cells and references were measured as well. Hellma quartz cells of 1 mm and 2 mm were used for high and low protein concentrations, respectively. The neutron wavelength was set to 4.5 Å, and measurements were performed at 3 detector positions: 2, 8 and 20 m. All measurements were performed at 10 °C. For the low-concentration solutions, the background-corrected intensities were linearly extrapolated to infinite dilution to extract the form factor per unit mass. The measured SANS curves of apoMb at pD 2, at pD 4 and in GdmCl are well described by a polymer with excluded volume model, while apoMb at pD 6 is globular, so the corresponding SANS curve is described by a Guinier model. The structure factor was obtained by dividing the SANS curve of the highest concentration by the one of the lowest concentration.
Neutron spin-echo spectroscopy (NSE). Solutions of apoMb at pD 2 were investigated at the instrument SNS-NSE, the neutron spin-echo spectrometer at the Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA 64 . It is a time-of-flight instrument: the Larmor precession of the neutron spin in a preparation zone with magnetic field before the sample encodes the individual velocities of the incoming neutrons into a precession angle. The other samples were measured at J-NSE "Phoenix" at MLZ, Garching 65 . The instrument covers a q-range of 0.03-1.0 Å⁻¹, reaching Fourier times of 250-90 ns using 12 and 8 Å neutrons. In the experiments presented here, a q-range of 0.03-0.15 Å⁻¹ was explored using 12 and 8 Å neutrons. A graphite powder sample was measured as a scattering reference, followed by the protein sample and the buffer solution. All measurements were performed at 10 °C. NSE data evaluation was performed with the data reduction software DrSPINE 66,67 .
Viscosimetry. The viscosity of all protein solutions and buffers was measured at 10 °C using a rolling-ball viscometer Lovis 2000 M/ME. Each measurement was performed 3 times and the average value was reported.
UV/VIS spectroscopy. The sample absorption in a cell with a path length of 0.1 mm (Hellma, Germany) was measured using UV/VIS spectroscopy (Cary 300). For the very low concentrations (<1 mg/mL), a 5 mm thick quartz Hellma cell was used. The concentrations were determined from the absorption values using the molar extinction coefficient ε_280nm = 13980 M⁻¹ cm⁻¹ calculated from the amino acid sequence (ExPASy 46 ).
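The concentration determination described here follows the Beer-Lambert law, c = A/(ε·d). A minimal sketch is given below, using the extinction coefficient and the molecular weight of apoMb stated in the text; the function name and the absorbance value in the example are hypothetical.

```python
EPSILON_280 = 13980.0   # molar extinction coefficient in 1/(M cm), from the text
MOLAR_MASS = 16951.0    # molecular weight of apoMb in g/mol, from the text


def concentration_mg_per_ml(absorbance: float, path_cm: float,
                            epsilon: float = EPSILON_280,
                            molar_mass: float = MOLAR_MASS) -> float:
    """Protein concentration from the absorbance at 280 nm (Beer-Lambert law)."""
    molar_concentration = absorbance / (epsilon * path_cm)  # mol/L
    return molar_concentration * molar_mass                 # g/L == mg/mL


# Hypothetical reading in the 0.1 mm (0.01 cm) cell
example = concentration_mg_per_ml(0.2474, 0.01)
```

An absorbance of about 0.25 in the 0.1 mm cell would thus correspond to roughly 30 mg/mL, the highest concentration used in this study.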