Intrinsic Localized Modes in Proteins

Nicolaï, Adrien; Delarue, Patrice; Senet, Patrick

doi:10.1038/srep18128

Published: 11 December 2015

Scientific Reports volume 5, Article number: 18128 (2015)

Abstract

Protein dynamics is essential for proteins to function. Here we predicted the existence of rare, large nonlinear excitations, termed intrinsic localized modes (ILMs), of the main chain of proteins based on all-atom molecular dynamics simulations of two fast-folder proteins and of a rigid α/β protein at 300 K and at 380 K in solution. These nonlinear excitations arise from the anharmonicity of the protein dynamics. The ILMs were detected by computing the Shannon entropy of the protein main-chain fluctuations. In the non-native state (significantly explored at 380 K), the probability of their excitation was increased by a factor between 9 and 28 for the fast-folder proteins and by a factor of 2 for the rigid protein. This enhancement in the non-native state was due to glycine, as demonstrated by simulations in which glycine was mutated to alanine. These ILMs might play a functional role in the flexible regions of proteins and in proteins in a non-native state (i.e., misfolded or unfolded states).

Introduction

Intrinsic localized modes (ILMs)¹ are members of the large soliton family². They have been predicted and observed in crystals and antiferromagnetic materials^3,4 and arise from the anharmonicity of interatomic potentials and from the discreteness of matter at the atomic scale. We expect proteins in solution (water) to support ILMs, that is, strongly localized waves, because of their well-known anharmonicity^{5,6,7,8,9,10,11}. Despite the importance of protein dynamics for biological function^{12,13,14,15,16}, the actual occurrence of ILMs in proteins remains to be demonstrated both theoretically and experimentally.

Referring to the anharmonicity of hydrogen bonds between the amide N-H bonds and the carbonyl C=O groups of the protein backbone, Davydov proposed the existence of localized waves in α-helices several decades ago¹⁷. Using a one-dimensional quantum Hamiltonian, Davydov predicted the self-localization of C=O bond vibrations through their anharmonic coupling with the low-frequency modes of the polypeptide chain¹⁷. According to his proposal, this localized wave would provide a mechanism to propagate energy within a protein¹⁸. Experimental evidence of Davydov’s solitons in proteins remains elusive^19,20,21,22. The one-dimensional Davydov model is a crude approximation of the structure of a protein. Several authors have attempted to simulate solitons in proteins, including their three-dimensional features, using simplified (coarse-grained) classical^23,24,25 or quantum dynamic models²⁶. These models are very difficult to relate to actual protein dynamics. Protein dynamics span several time-scales in which local displacements of atoms are coupled to larger-scale conformational motions; a full description of these dynamics therefore requires an atomistic approach¹⁶.

In recent years, all-atom molecular dynamics (MD) simulations have become a powerful tool that is complementary to experiments to investigate the dynamics of proteins in solution at the atomic scale^27,28,29. Here, we used all-atom MD simulations, including realistic interactions between water and amino acids and the full effects of temperature, to theoretically establish the lifetime and statistics of ILMs in proteins.

We predict a new type of fully classical ILMs in proteins: solitons localized in both time and space (similarly to the Peregrine solitons^30,31). These intermittent ILMs are due to the anharmonicity of the potential energy surface describing the torsional degrees of freedom of the main chain of proteins. The torsional degrees of freedom play an important role in protein dynamics because they govern the low-frequency functional modes of proteins^32,33,34. The main-chain torsional angles γ are built from four C^α atoms of consecutive residues in the amino acid sequence^35,36 (Fig. 1). Because the length of the C^α-C^α virtual bond between two consecutive residues is nearly constant, the main-chain conformation is entirely described by the main-chain torsional angles γ and the main-chain bond angles θ (Fig. 1). These coarse-grained angles (CGA) (γ, θ) are part of coarse-grained protein models^37,38 and are used to analyze large conformational changes of proteins and protein folding in all-atom MD simulations^36,39.
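The construction of the coarse-grained angles from C^α positions can be sketched as follows. This is a minimal illustration, assuming an (N, 3) NumPy array of C^α coordinates; the function name `coarse_grained_angles` and the zero-based indexing are hypothetical and are not taken from the paper, which labels the pairs (γ_i, θ_i) with i = 2 to N − 2.

```python
import numpy as np

def coarse_grained_angles(ca):
    """Return (gamma, theta) in radians from an (N, 3) array of C-alpha positions.

    gamma[k] is the dihedral built on residues k, k+1, k+2, k+3 (N - 3 values).
    theta[k] is the bond angle at residue k+1, built on residues k, k+1, k+2
    (N - 2 values). How gamma and theta are paired is left to the caller.
    """
    ca = np.asarray(ca, dtype=float)
    b = ca[1:] - ca[:-1]                      # N - 1 virtual-bond vectors

    # Bond angle: angle between the two virtual bonds meeting at a residue.
    cos_t = np.einsum("ij,ij->i", -b[:-1], b[1:]) / (
        np.linalg.norm(b[:-1], axis=1) * np.linalg.norm(b[1:], axis=1)
    )
    theta = np.arccos(np.clip(cos_t, -1.0, 1.0))

    # Dihedral angle from the normals of two consecutive planes (atan2 form).
    b0, b1, b2 = b[:-2], b[1:-1], b[2:]
    n1 = np.cross(b0, b1)
    n2 = np.cross(b1, b2)
    b1u = b1 / np.linalg.norm(b1, axis=1, keepdims=True)
    x = np.einsum("ij,ij->i", n1, n2)
    y = np.einsum("ij,ij->i", np.cross(n1, n2), b1u)
    gamma = np.arctan2(y, x)
    return gamma, theta
```

With this convention a planar cis arrangement of four C^α atoms gives γ = 0 and a planar trans arrangement gives |γ| = π.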

The origin of ILMs found in proteins can be better understood by drawing an analogy between the fluctuations of the protein main chain and those of a simple mechanical system known to support solitons. The mechanical analog of the protein main chain is a chain of rigid pendulums coupled by harmonic torsional springs and rotating around the same axis² [Fig. 2(a)]. At each time, the position of each pendulum i is defined by its angle α_i relative to the vertical. In this mechanical model, anharmonicity arises from the gravitational restoring force, which is proportional to sin(α_i). This mechanical system may support ILMs (sine-Gordon solitons)², which interpolate between two rest states (no restoring force) of the system (α_i = 0 for all i and α_i = 2π for all i). The rest state α_i = 0 for all i and the ILM solution of the dynamic equations², computed at the maximum of its amplitude (α = π) for a localization chosen at i = 10, are shown in Fig. 2(a). The localized wave is characterized by a well-defined profile of the angle α [Fig. 2(b)] called a kink. Due to the symmetry of the system, an anti-kink solution also exists. The combination of the kink and anti-kink solutions may lead to a localized solution [chosen at i = 10 in Fig. 2(b)] that does not propagate but does oscillate as a function of time, a so-called sine-Gordon discrete breather². The position of each pendulum can be represented alternatively as a rotating unit vector in a plane: u_i = (cos(α_i), sin(α_i)) [Fig. 2(c)]. For each pendulum i, we denote by Δu_i the difference between the vector u_i in the excited state of the chain [blue arrows in Fig. 2(c)] and in its uniform state [red arrows in Fig. 2(c)]. The kink [panel (b)] leads to a peak in Δu_i² as a function of the pendulum position i [panel (d)] (in the example shown in Fig. 2, the sine-Gordon soliton is centered at i = 10). Consequently, the time-dependent fluctuations of the pendulums can be viewed as an ensemble of vectors rotating in a plane [Fig. 2(c)]. Their displacements between different times allow the definition of the localized character of the excitation [Fig. 2(d)].
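The peak produced by a kink in the pendulum chain can be reproduced numerically. The sketch below assumes the continuum sine-Gordon kink profile α(x) = 4 arctan(exp((x − x₀)/w)) sampled on the pendulum sites; the chain length, center and width are illustrative values, and the function names are hypothetical.

```python
import numpy as np

def kink_profile(n_sites=20, center=10, width=1.0):
    """Pendulum angles for a kink going from 0 to 2*pi, centered on site `center`.

    Assumption: the continuum sine-Gordon kink is sampled at integer sites 1..n_sites.
    """
    sites = np.arange(1, n_sites + 1)
    return 4.0 * np.arctan(np.exp((sites - center) / width))

def squared_displacement_from_rest(alpha):
    """|u_i(alpha_i) - u_i(0)|^2 with u_i = (cos(alpha_i), sin(alpha_i))."""
    u = np.stack([np.cos(alpha), np.sin(alpha)], axis=-1)
    u_rest = np.stack([np.ones_like(alpha), np.zeros_like(alpha)], axis=-1)
    return np.sum((u - u_rest) ** 2, axis=-1)

alpha = kink_profile()
du2 = squared_displacement_from_rest(alpha)
peak_site = int(np.argmax(du2)) + 1   # convert back to 1-based site numbering
```

Because α = π at the kink center, the squared displacement reaches its maximum possible value of 4 there and decays to zero on both sides, where the pendulums sit in the equivalent rest states α = 0 and α = 2π.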

As for the motion of each pendulum, the dynamics of each torsional angle γ_i of the protein main chain can be represented by a unit vector u_i = (cos(γ_i), sin(γ_i)) rotating in a plane^36,40. For small fluctuations around their equilibrium orientation, the vectors are harmonically coupled, as are the pendulums in the mechanical model. However, for large angular displacements, anharmonicity arises due to the nonlinearity of the dihedral potential energy surface. As for the pendulums, the orientations of each vector u_i corresponding to γ_i = 0 and γ_i = 2π are equivalent, and solitons similar to discrete breathers of sine-Gordon type are therefore expected in proteins by analogy.

More generally, the fluctuations of each pair of CGAs in the protein can be represented by a unit vector:

$$\mathbf{u}_i = \left(\cos\gamma_i \sin\theta_i,\ \sin\gamma_i \sin\theta_i,\ \cos\theta_i\right),$$

with one end fixed and the other describing a stochastic path along the surface of a sphere [Fig. 1(b)]. An example of the vectors u_i built on (γ_i, θ_i) (Fig. 1), computed from an all-atom MD trajectory of Trp-cage protein⁴¹ (see the Results section), is shown for a uniform state (red arrows) and an excited state (blue arrows) occurring 1 ps later at i = 10 [Fig. 2(c)]. The localized character of this particular excitation can be seen in Fig. 2(d), where the values of Δu_i² are compared for the pendulum model and the protein. As seen in Fig. 2(c), the vector displacements are primarily due to the fluctuations of the γ torsional angles. As shown in the Results section, the typical ILM shown in Fig. 2(d) is a rare event in MD simulations of proteins and is localized in both time and space.
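As a sketch of this representation, the code below builds the unit vectors from arrays of (γ, θ) using the component form given in the Methods section, and computes the squared displacement between consecutive snapshots. The array layout (frames along the first axis, CGA pairs along the second) and the function names are assumptions made for illustration.

```python
import numpy as np

def unit_vectors(gamma, theta):
    """u = (cos(gamma) sin(theta), sin(gamma) sin(theta), cos(theta)); angles in radians."""
    gamma = np.asarray(gamma, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.stack(
        [np.cos(gamma) * np.sin(theta),
         np.sin(gamma) * np.sin(theta),
         np.cos(theta)],
        axis=-1,
    )

def squared_displacements(gamma, theta):
    """|u_i(t) - u_i(t-1)|^2 for arrays of shape (n_frames, n_pairs).

    Returns an array of shape (n_frames - 1, n_pairs).
    """
    u = unit_vectors(gamma, theta)
    du = u[1:] - u[:-1]
    return np.sum(du ** 2, axis=-1)
```

Since u_i is a unit vector, each squared displacement lies between 0 and 4; the upper bound corresponds to a vector that flips to the opposite point on the sphere between two snapshots.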

In the present work, we present evidence of ILMs of the soliton type [as in Fig. 2(d)] in the spontaneous, unbiased fluctuations of the main chain of model proteins at different temperatures using all-atom MD in explicit solvent (water). We predict the existence, statistics and biophysical properties of ILMs in proteins and their relation with the protein free-energy landscape. The particular questions we address are as follows: what is the probability of finding an ILM occurring spontaneously in the native (folded) and non-native (misfolded or unfolded) states of a protein in solution? How do the ILMs depend on the secondary structures of the protein and on its chemical composition?

Results

Evidence of ILMs in unbiased MD simulations

As the loss of rigidity due to the unfolding of a protein increases the anharmonicity of its free-energy landscape, we investigated the dynamics of two ultrafast-folder proteins, Trp-cage^41,42 and the chicken villin headpiece fragment HP-36^43,44, above their folding temperature. These proteins were chosen because they have been extensively studied using MD simulations and experiments and because unfolding events can be reproduced by unbiased all-atom MD simulations in explicit solvent within a reasonable computational time^45,46,47,48. Trp-cage is a 20-residue protein, designed to aid in understanding protein folding mechanisms, that consists of one α-helix and one 3₁₀-helix⁴¹. HP-36 is a 36-residue protein corresponding to the C terminus of the 76-residue chicken villin headpiece domain^43,44. It consists of three α-helices. Because of their small size and fast kinetics, Trp-cage and HP-36 have become typical model proteins for MD simulations of protein folding^45,46,47,48. For comparison, we also analyzed the dynamics of a rigid 46-residue α/β model protein (VA3)⁴⁰. Because of the presence of three disulfide bonds (namely 3–40, 4–32 and 16–26), VA3 remains folded in all MD trajectories⁴⁰ while exploring the non-native state at 380 K.

For Trp-cage, three all-atom MD trajectories with different initial conditions (run 1, run 2 and run 3) at T = 380 K and one MD trajectory at T = 300 K, each of 500 ns duration, were conducted in explicit water. For HP-36, two all-atom MD trajectories with different initial conditions (run 1 and run 2) at T = 380 K and one MD trajectory at T = 300 K, each of 500 ns duration, were conducted in explicit water (see the Methods section). In addition, one all-atom MD trajectory at 300 K and one at 380 K, each of 500 ns duration, were performed for VA3 (see the Methods section). The coordinates of the proteins were recorded every ps. Each MD run, therefore, represents 500,001 snapshots, from which the vector u_i associated with each pair (γ_i, θ_i) (Fig. 1) was computed. The fluctuations of the protein main chain between two consecutive snapshots were represented by the sequence of the displacements Δu_i(t) = u_i(t) − u_i(t − 1) for all i = 2 to N − 2 and t ≥ 1. The degree of localization of these fluctuations was measured by the normalized Shannon entropy S computed from the square displacements Δu_i(t)² along the sequence (see the Methods section). An excitation localized on a single pair of CGAs corresponds to S = 0 (minimum entropy, strongly localized fluctuations) and an excitation uniformly distributed on all CGAs corresponds to S = 1 (maximum entropy, delocalized fluctuations). The calculation of S(t) is a systematic means to detect rare large localized excitations in MD trajectories. The ILMs are defined here by the excitations for which S ≤ 0.5 (see the Methods section for the choice of this cutoff value). For example, the ILM shown in Fig. 2(d), which is the excitation that has the largest amplitude in the MD run 1 (T = 380 K) of Trp-cage, had a value of S = 0.47. The results for the Trp-cage protein are discussed next and similar results for additional MD runs of Trp-cage, HP-36 and VA3 are shown in the Supplementary Information.

ILMs (S ≤ 0.5) are rare events. For example, in the MD run 1 of Trp-cage, only 251 ILMs were found (Table 1), which represents 0.05% of the total number of main-chain fluctuations recorded over 500 ns. The probability of observing ILMs was similar in the other MD runs of Trp-cage (Table 1). In all MD runs of Trp-cage, the most frequent ILMs were located at the same specific positions along the amino acid sequence [Fig. 3(a)]: i = 9, 10, 14 and 18. The excitations at i = 9, 10 and 14 are typical solitons of sine-Gordon type [compare Fig. 2(d) to Fig. 3(b)]. The largest amplitudes of ILMs were found for these three sites, with Δu² = 3.0, 3.3 and 3.1 for a soliton centered at i = 9, 10 and 14, respectively. Most of the solitons (80%) centered at i = 9, 10 and 14 with the largest amplitudes (Δu² > 2.0) corresponded to cis-trans or trans-cis transitions of the four C^α segments [typical examples are shown in Fig. 3(c)].

Table 1 Number N_ILM of ILMs detected in MD simulations of Trp-cage at T = 380 K and 300 K.

The solitons localized at i = 9, 10 and 14 [shown in Fig. 3] shared a common feature: they all had a glycine (GLY) residue at i + 1 in the amino acid sequence (in bold font in Table 1). As shown in SI (Supplementary Tables 1 and 2), similar results were found for HP-36 and VA3 proteins: solitons localized at i = 51 and i = 73 in HP-36 have a GLY residue at i = 52 and at i = 74, respectively, and solitons localized at i = 36 in VA3 have a GLY residue at i = 37. In the Trp-cage, the most frequent soliton (i = 10) corresponded to a rotation around a virtual bond formed by two GLYs (GLY10-GLY11) (Table 1). Because of its small side-chain (H atom), GLY can adopt a larger set of conformations in a polypeptide chain, which may explain why the highest probabilities of ILMs are observed at i = 9, 10 and 14. To test this hypothesis, we ran an MD trajectory for the mutant Trp-cage G15A at T = 380 K. The number of ILMs of soliton type at i = 14 decreased by an order of magnitude (Table 1). To further test the role of GLYs, we ran an MD trajectory for the triple mutant Trp-cage G10A-G11A-G15A at T = 380 K. The number of ILMs of soliton type at i = 9, 10 and 14 decreased drastically (Table 1). This observation reveals, for the first time, the role of GLY residues in the localization and the probability of the appearance of ILMs of the soliton type.

The ILMs located at the C-terminus of the chain (i = 18) are not similar to sine-Gordon solitons (Supplementary Figure 1) but are localized excitations that also exist in a harmonic chain with free ends and are due to the broken symmetry of the chain at its extremities⁴⁹. These ILMs do not depend on the presence of GLY residues and their probability is similar for Trp-cage and its mutants (Table 1).

A few ILMs (S ≤ 0.5), all with small amplitudes (0.4 < Δu² < 1.9), were also observed very rarely (not more than six times in 500,001 snapshots) at i = 2, 7, 8, 11, 12, 13, 15, 16 and 17 in the MD runs of Trp-cage (Supplementary Table 3). Except at i = 2, which corresponds to a mode located at the N-terminus (Supplementary Figure 1), all of these ILMs were similar to sine-Gordon solitons.

Free-energy landscape and ILMs

The probability of observing an ILM at a given time is smaller if the protein is in its native state (rigid, folded state) than in a non-native state (flexible, misfolded or unfolded states). The native state of Trp-cage is defined by the ensemble of the most probable conformations explored at T = 300 K. The native state is better represented as basins in the free-energy landscape of the protein^6,16.

We represented the free-energy landscape of the main chain of Trp-cage by the sequence of the effective free-energy maps V(γ, θ)_n computed from the probability densities of each pair of CGA (γ, θ)_n in the MD trajectories (see the Methods section and Supplementary Figures 2 and 3). The sequence of V(γ, θ)_n has proven useful in describing protein folding⁵⁰, conformation dynamics⁴⁰ and allosteric communication⁵¹. For each V(γ, θ)_n, we defined the native basin as the region of the (γ, θ)_n space within 3 k_BT from the minimum of V(γ, θ)_n at T = 300 K (Supplementary Figure 2). The 3 k_BT cutoff ensures that all of the experimental structures of Trp-cage observed by NMR at 282 K⁴¹ correspond to (γ, θ) angles located in the native basins (Supplementary Figure 4).

To quantify the native character (NC) of the protein as a function of time in MD trajectories at T = 380 K, we computed the percentage of CGAs remaining within their native basins, as illustrated in Fig. 4(a) for a typical trajectory. At T = 380 K, Trp-cage partially folds/unfolds during MD runs, i.e., explores non-native states far from the native basins of the free-energy maps, as shown for selected CGA pairs in Fig. 4(b) and for all of the CGAs in Supplementary Figure 3. Typical structures of the protein in native and non-native states are shown in Fig. 4(c) and in the Supplementary Figure 5 for another MD trajectory (run 2) for comparison. As shown in Fig. 4(a), a sharp transition occurs from the initial native state of Trp-cage, which lasts for approximately 100 ns, to a non-native state in which the molecule remains until the end of the trajectory. Results similar to those presented in Fig. 4(a) were found for all the MD trajectories of Trp-cage and HP-36 at T = 380 K (with different sequences of folding and unfolding events depending on the initial conditions of the MD runs, see Supplementary Figures 5 and 6). As clearly shown in Fig. 4 [bottom of panel (a)], the number of ILMs was larger in the non-native portion of the trajectory than in its native portion. The same results were observed for all MD runs of Trp-cage, HP-36 and VA3 (Supplementary Figures 5 and 6) and were quantified by computing the probability of observing a soliton in the native portion and in the non-native portion of the trajectory (Table 2). For the fast-folder proteins examined in the present work, Table 2 demonstrates that the probability of observing a soliton is larger by a factor varying between approximately 9 and 28 in the non-native portion of the trajectory compared to the native portion at T = 380 K. The probability of observing a soliton is only about twice as large in the non-native portion of the MD trajectory of VA3 as in the native portion of the trajectory at T = 380 K (Table 2). Because of its three disulfide bridges, VA3 did not unfold in the MD run and explored a non-native state with a relatively high average NC (average NC = 75%, see Supplementary Figure 6). In reference to Table 2, it is worth noting that the probability of observing a soliton in the non-native portion of the trajectories at T = 300 K is difficult to evaluate accurately, as the time spent in a non-native state is extremely small.
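The two quantities used in this comparison, the native character and the ratio of soliton probabilities, can be sketched as below. The inputs are assumptions: a boolean array marking which CGA pairs lie inside their native basin at each frame, a boolean flag per frame marking detected ILMs, and a boolean flag per frame marking the native portion of the trajectory. How frames are assigned to the native portion is not specified here and is left to the caller; the function names are hypothetical.

```python
import numpy as np

def native_character(in_basin):
    """Percentage of CGA pairs inside their native basin at each frame.

    in_basin: boolean array of shape (n_frames, n_pairs).
    """
    return 100.0 * np.asarray(in_basin, dtype=float).mean(axis=1)

def soliton_probability_ratio(is_ilm, is_native):
    """Ratio P_nn / P_n of ILM probabilities in non-native versus native frames.

    Assumes both portions contain at least one frame. Returns inf when ILMs
    occur but none falls in the native portion, and nan when no ILM is found
    in either portion (the ratio is then undefined).
    """
    is_ilm = np.asarray(is_ilm, dtype=bool)
    is_native = np.asarray(is_native, dtype=bool)
    p_native = is_ilm[is_native].mean()
    p_non_native = is_ilm[~is_native].mean()
    if p_native == 0:
        return float("inf") if p_non_native > 0 else float("nan")
    return float(p_non_native / p_native)
```

For example, one ILM in four native frames and two ILMs in four non-native frames give probabilities of 0.25 and 0.5 and therefore a ratio of 2.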

Table 2 Statistics of ILMs of the soliton type in the native and non-native states computed from MD simulations of Trp-cage, chicken villin headpiece fragment HP-36, and VA3 (MD trajectories of a duration of 500 ns are presented).

Statistics of ILMs at different time resolutions

To improve the statistics of ILMs, we ran eight additional short trajectories of 1 ns duration using initial non-native structures of Trp-cage extracted at different times from MD run 1 at T = 380 K and recording a snapshot every fs. The displacements of the main chain, Δu_i, were first computed from these trajectories by using a sliding window of Δt = 1 ps shifted every fs. That is, the time-scale of the displacements Δu_i was identical to that discussed in the previous sections. The ILMs of the soliton type similar to those reported in Fig. 3(b) were detected by computing S(t) every fs [see Fig. 5(a)]. Seven solitons were detected [Fig. 5(a)] at i = 10 (solitons 1 to 5); i = 3 (soliton 6) and i = 14 (soliton 7) (not shown). Soliton 3 had the largest amplitude and was the most localized [Fig. 5(b)]. The probability of observing solitons in these short trajectories was similar to that reported in Table 1. If the snapshots were recorded only every ps (as in the 500 ns MD runs 1, 2 and 3), only soliton 3 (located at i = 10) would have been detected. Interestingly, the probability density of the entropy P(S) was fairly independent of the time-scale used to compute the main-chain displacements Δu_i [see Fig. 5(c) for Δt = 100 fs and Δt = 10 fs]. The ILMs (S ≤ 0.5) were always found in the tails of P(S) (rare events).

Soliton dynamics at a femtosecond time resolution

As an example, we analyzed the dynamics of soliton 3 (located at i = 10) detected in the trajectory shown in Fig. 5(a), for which snapshots were recorded every fs. The time t = 0 was defined as the time at which the soliton was detected, with Δt = 1 ps in the trajectory analyzed in Fig. 5(a). The displacement Δu₁₀(t) was computed at each fs by summing the displacements Δu₁₀ every fs from t = −5 ps to t = 5 ps (by assuming Δu₁₀(t = −5 ps) ≡ 0). The values of Δu₁₀(t)² shown in Fig. 6(a) demonstrate that the ILM is strongly localized between t = −1 ps and t = 2.5 ps. A careful analysis of the variation of S(t) [Fig. 6(b)] shows that the lifetime of the soliton (i.e., the time for which S(t) remains lower than or close to 0.5) is only 400 fs (from t = −300 fs to t = 100 fs). This extreme time-space localization corresponds to a jump in the γ₁₀ free-energy profile from the top of a barrier (γ₁₀ = −30°) to a small metastable state (γ₁₀ = 60°) [Fig. 6(c)]. As already noted in the introduction and shown in Fig. 2, the ILMs of the soliton-type correspond mainly to a large variation of the dihedral angle γ, as seen for soliton 3 in Fig. 6(c).
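A lifetime of this kind can be estimated from a sampled S(t) series as in the sketch below. It assumes a strict criterion S ≤ cutoff (the text uses the looser "lower than or close to 0.5"), counts the contiguous samples around the detection frame that satisfy it, and multiplies by the sampling interval. The accumulated displacement is obtained as a running sum of the per-step displacement vectors. Function names and the strict threshold are illustrative choices.

```python
import numpy as np

def cumulative_displacement(du_steps):
    """Running sum of per-step displacement vectors, shape (n_steps, dim).

    Equivalent to u(t) - u(t_start) when the displacement is set to zero at t_start.
    """
    return np.cumsum(np.asarray(du_steps, dtype=float), axis=0)

def lifetime_around(s, t0_index, cutoff=0.5, dt_fs=1.0):
    """Duration (in fs) of the contiguous run with S <= cutoff that contains t0_index.

    The duration is counted as (number of samples in the run) * dt_fs.
    Returns 0.0 if the sample at t0_index is itself above the cutoff.
    """
    s = np.asarray(s, dtype=float)
    if s[t0_index] > cutoff:
        return 0.0
    lo = t0_index
    while lo > 0 and s[lo - 1] <= cutoff:
        lo -= 1
    hi = t0_index
    while hi < len(s) - 1 and s[hi + 1] <= cutoff:
        hi += 1
    return (hi - lo + 1) * dt_fs
```

Raising `cutoff` lengthens the run, which mirrors the dependence of the lifetime on the strictness of the localization criterion.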

Discussion

In the present all-atom MD study, ILMs were successfully detected in the spontaneous thermal fluctuations of the main chain of proteins. We found that ILMs of soliton type are short-lived events. In the example detailed in Fig. 6, the lifetime is ~400 fs (S ≤ 0.5) [it would be 1,500 fs if we adopted a less strict definition of the spatial localization of the main-chain deformation (S ≤ 0.6)]. These sub-picosecond localized excitations may be accessible for experimental investigation using 2D-IR spectroscopy, as demonstrated for the fast dynamics of the Ramachandran dihedral angular fluctuations of a three-peptide chain⁵². The probability of an ILM is approximately 0.05% in MD simulations. As rare events, these ILMs do not contribute significantly to the partition function of the system; they are a priori negligible from a thermodynamic equilibrium point of view. However, the sine-Gordon localized excitations may help to cross an activation barrier, as illustrated in Fig. 6(c), and may be important from a kinetic point of view. Intermittent events often do play a role in biological function, for example, in conformational gating of a ligand in enzymes⁵³.

Solitons were previously predicted to occur in stiff regions of a protein (in an α-helix or at catalytic sites) using simplified models of a polypeptide chain²³. The opposite is found here by using all-atom MD simulations of two α-helical fast-folder proteins and a rigid protein. We found ILMs in the flexible regions of the proteins (loops and N- and C-termini), as illustrated in Fig. 3(a). These ILMs are different from the usual classical discrete breathers as they are strongly localized in time. Solitons with the highest amplitudes mainly corresponded to (incomplete) cis-trans or trans-cis transitions [Fig. 3(c)], which are more probable in flexible segments of a protein main chain. The flexibility governs the probability of observing a soliton: the probability of observing an ILM of the soliton type [Fig. 2(d)] is enhanced in the non-native state compared to that in the native state and in protein segments containing GLY residues. This result is expected because the main chain experiences more of the anharmonic portion of the free-energy landscape in large fluctuations than it does in small fluctuations. The non-native state was significantly explored at T = 380 K. To confirm that the solitons occur more frequently in our simulations because the protein is unstructured (and is thus more flexible) and not because the temperature is increased, we ran an additional 20 ns trajectory at T = 300 K by selecting one of the most unfolded structures of Trp-cage as the initial structure of the new trajectory, as we did previously³⁹. The probability of observing solitons was enhanced by a factor of approximately 6 compared to that of the native state at the same temperature (Table 1). Therefore, ILMs of the soliton type might play a functional role in misfolded proteins and in unfolded proteins.

Methods

MD simulations

All-atom MD simulations in explicit water (TIP3P water model⁵⁴) of Trp-cage (model 1 in PDB ID: 1L2Y)⁴¹, the chicken villin headpiece subdomain HP-36 (model 1 in PDB ID: 1VII)⁴⁴ and the protein VA3 (model 1 in PDB ID: 1ED0)⁵⁵ were conducted using the GROMACS software package⁵⁶ and the AMBER99sb-ILDN force field⁵⁷. In addition, one all-atom MD run of the Trp-cage at T = 300 K and one at 380 K using CHARMM27^58,59 and AMBER99SB*-ILDN-q⁶⁰ force fields were conducted. The results found using these force fields are reported in the Supplementary Table 4 and were similar to those obtained with the AMBER99sb-ILDN force field. The time step used in all simulations was 1 fs and the list of neighbors was updated every 5 fs with the grid method and a cutoff radius of 1.0 nm. The coordinates of all the atoms in the simulation box were saved every 1 ps. The initial velocities were chosen randomly. We used the NPT ensemble with a cubic box of 4.55 nm for Trp-cage, 5.21 nm for HP-36 and 5.27 nm for VA3. The temperature and pressure were maintained at the desired values by using the Nosé-Hoover thermostat^61,62 and the Parrinello-Rahman⁶³ barostat, respectively. The electrostatic term was computed by using the particle mesh Ewald algorithm⁶⁴ (with a radius of 1 nm) using the Fast Fourier Transform optimization (with an order equal to four for the interpolation). The cutoff algorithm was applied for the non-Coulomb potentials with a radius of 1.0 nm. The system was warmed up for 50 ps and equilibrated for 1 ns with lower restraints, finishing with no restraints at the desired temperature. We performed three MD runs of Trp-cage and two MD runs of HP-36 at T = 380 K using different initial conditions and one MD run at T = 300 K for each protein (named run 1 in Table 1). In addition, we performed one MD run for each mutant of the Trp-cage ([G15A] and [G10A-G11A-G15A]) and for VA3 at T = 300 K and one at T = 380 K. Each MD run was of 500 ns duration. In total, the thirteen long MD runs corresponded to 6.5 μs of simulations.

In addition, we performed eight extra short MD runs of Trp-cage of 1 ns duration at T = 380 K using the same procedure as described above, except that the coordinates of all the atoms in the simulation box were saved every 1 fs. The initial structures of those MD runs were extracted at different times from the MD run 1 of Trp-cage at T = 380 K, corresponding to t = 100, 161, 209, 283, 300, 441 and 459 ns. Data from the MD run using the frame at t = 441 ns are presented in the present paper (Figs 5 and 6). The results for the other seven runs were similar. Finally, we performed an MD run of Trp-cage of 20 ns duration at T = 300 K (named run 2 in Table 1) using a completely unfolded structure of the protein obtained in a previous work³⁹ at T = 450 K using the same procedure described above.

Free-energy map and free-energy profiles

An effective free-energy map V((γ, θ)_i) and two free-energy profiles V(γ_i) and V(θ_i) were computed for each pair of CGAs (γ, θ)_i by using

$$V((\gamma,\theta)_i) = -k_B T \ln P((\gamma,\theta)_i), \qquad V(\gamma_i) = -k_B T \ln P(\gamma_i), \qquad V(\theta_i) = -k_B T \ln P(\theta_i),$$

where k_B is the Boltzmann constant, T is the temperature and P((γ, θ)_i), P(γ_i), P(θ_i) are the probability density functions (PDF) of the pair (γ, θ)_i, of γ_i and of θ_i, respectively. The PDFs were computed from the MD trajectories on a time-scale of 500 ns.
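A histogram-based estimate of such a free-energy map can be sketched as follows, assuming the standard Boltzmann inversion V = −k_B T ln P. The bin count, the angular ranges, the energy units (kJ/mol) and the choice to shift the minimum of V to zero are illustrative assumptions, as are the function names; the basin helper applies the 3 k_BT criterion described in the Results section.

```python
import numpy as np

K_B = 0.0083144626  # Boltzmann constant in kJ/(mol K)

def free_energy_map(gamma, theta, temperature, bins=36):
    """Effective free energy V = -k_B T ln P on a (gamma, theta) grid.

    gamma in [-pi, pi] and theta in [0, pi], both in radians.
    Empty bins get V = +inf. The minimum of V is shifted to zero (assumption).
    """
    hist, g_edges, t_edges = np.histogram2d(
        gamma, theta, bins=bins,
        range=[[-np.pi, np.pi], [0.0, np.pi]], density=True,
    )
    with np.errstate(divide="ignore"):
        v = -K_B * temperature * np.log(hist)
    v -= v[np.isfinite(v)].min()
    return v, g_edges, t_edges

def native_basin(v, temperature, width_kt=3.0):
    """Boolean mask of grid cells within width_kt * k_B T of the minimum of V."""
    return v <= width_kt * K_B * temperature
```

In practice the map at T = 300 K would be used to define the basin mask, and angles sampled at another temperature would then be tested against that mask.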

Normalized Shannon entropy S and localized excitations

The quantity p_i measures the fluctuations of the pair of CGAs i (i = 2 to N − 2) relative to the N − 3 pairs of CGAs along the sequence and is defined by

$$p_i(t) = \frac{\Delta u_i(t)^2}{\sum_{j=2}^{N-2} \Delta u_j(t)^2},$$

where Δu_i is the displacement of the vector u_i between two consecutive snapshots, Δu_i(t) = u_i(t) − u_i(t − 1), and u_i = (cos(γ_i)sin(θ_i), sin(γ_i)sin(θ_i), cos(θ_i)). By definition, 0 ≤ p_i(t) ≤ 1. The localization of the structural fluctuations of the protein main chain can be quantified by

$$S(t) = -\frac{1}{\ln(N-3)} \sum_{i=2}^{N-2} p_i(t) \ln p_i(t),$$

where S(t) can be interpreted as a normalized Shannon entropy, 0 ≤ S(t) ≤ 1. The maximum localization of the fluctuations occurs for S = 0 (p_i = 0 for i ≠ k and p_k = 1) and the maximum delocalization occurs for S = 1 (p_i = 1/(N − 3) for all i).
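The entropy can be evaluated for a single snapshot as in the sketch below. It assumes p_i = Δu_i² / Σ_j Δu_j² and S = −Σ_i p_i ln p_i / ln(N − 3), a form consistent with the two limiting cases stated above, and it uses the convention 0 · ln 0 = 0. The function name is hypothetical, and the input is the array of the N − 3 squared displacements at one time.

```python
import numpy as np

def normalized_shannon_entropy(du2):
    """Normalized Shannon entropy of one snapshot's squared displacements.

    du2: 1D array of the N - 3 values of |du_i|^2 (must not sum to zero).
    Returns a value between 0 (one site carries everything) and 1 (uniform).
    """
    du2 = np.asarray(du2, dtype=float)
    p = du2 / du2.sum()
    nonzero = p > 0                      # convention: 0 * ln(0) = 0
    entropy = -np.sum(p[nonzero] * np.log(p[nonzero]))
    return float(entropy / np.log(p.size))
```

For a 20-residue chain the input has 17 entries; a displacement carried by a single site gives S = 0 and equal displacements on all 17 sites give S = 1.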

An ILM is defined by a value of S ≤ S_cutoff, where S_cutoff is chosen such that the correlation coefficient between the sequences of the Δu_i(t)² of all ILMs localized at the same site i is larger than 0.9. For example, in the MD run 1 (T = 380 K), the most strongly localized excitation at i = 10 had a value of S = 0.40 with a maximum amplitude Δu₁₀(t)² = 2.42. All excitations located at i = 10 are highly correlated (ρ > 0.9) if S ≤ 0.5, i.e., they all represent the same single peak of localized excitation (Supplementary Figure 7). It is worth noting that the number of ILMs at i = 10 in this MD run increases as a function of the S_cutoff value (see Supplementary Table 5), as expected. However, the probability of finding an ILM at i = 10 in the non-native state is larger than in the native state for different values of S_cutoff. The ratio P_nn/P_n is infinite for S_cutoff = 0.45 and decreases to 4.6 at S_cutoff = 0.6 (Supplementary Table 5) because at S_cutoff = 0.6 we include excitations that are not centered on a single site, such as the one shown in red in the Supplementary Figure 7. A similar conclusion was drawn for all of the MD runs studied here, which sets the value of S_cutoff to 0.5.
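The cutoff check described above can be sketched as follows, given a precomputed series S(t) and the matching array of squared displacements. Two assumptions are made for illustration: the site of an ILM is taken as the position of the largest Δu_i² in that frame, and the similarity between excitations is measured by the Pearson correlation between their Δu² profiles along the sequence. Site indices are zero-based array positions, not the residue labels used in the text, and the function names are hypothetical.

```python
import numpy as np

def select_ilms(s, du2, cutoff=0.5):
    """Frames with S <= cutoff and the site (argmax of du2) of each one.

    s: array of shape (n_frames,); du2: array of shape (n_frames, n_pairs).
    """
    s = np.asarray(s, dtype=float)
    du2 = np.asarray(du2, dtype=float)
    frames = np.flatnonzero(s <= cutoff)
    sites = du2[frames].argmax(axis=1)
    return frames, sites

def min_profile_correlation(du2, frames, sites, site):
    """Smallest pairwise Pearson correlation among ILM profiles at one site.

    Returns 1.0 when fewer than two ILMs are found at that site.
    """
    profiles = np.asarray(du2, dtype=float)[frames[sites == site]]
    if len(profiles) < 2:
        return 1.0
    return float(np.corrcoef(profiles).min())
```

A cutoff would then be accepted for a site when `min_profile_correlation` stays above 0.9, and tightened when it falls below that value.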

Additional Information

How to cite this article: Nicolaï, A. et al. Intrinsic Localized Modes in Proteins. Sci. Rep. 5, 18128; doi: 10.1038/srep18128 (2015).

References

Flach, S. & Willis, C. R. Discrete breathers. Phys. Rep. 295, 181–264 (1998).
Article CAS ADS MathSciNet Google Scholar
Dauxois, T. & Peyrard, M. Physics of Solitons (Cambridge University Press, 2006).
Sievers, A. J. & Takeno, S. Intrinsic localized modes in anharmonic crystals. Phys. Rev. Lett. 61, 970–973 (1988).
Article CAS ADS Google Scholar
Schwarz, U. T., English, L. Q. & Sievers, A. J. Experimental Generation and Observation of Intrinsic Localized Spin Wave Modes in an Antiferromagnet. Phys. Rev. Lett. 83, 223–226 (1999).
Article CAS ADS Google Scholar
McCammon, J. A., Gelin, B. R. & Karplus, M. Dynamics of folded proteins. Nature 267, 585–590 (1977).
Article CAS ADS Google Scholar
Frauenfelder, H., Petsko, G. A. & Tsernoglou, D. Temperature-dependent X-ray diffraction as a probe of protein structural dynamics. Nature 280, 558–563 (1979).
Article CAS ADS Google Scholar
Parak, F. & Knapp, E. W. A consistent picture of protein dynamics. Proc. Natl. Acad. Sci. USA 81, 7088–7092 (1984).
Article CAS ADS Google Scholar
Doster, W., Cusack, S. & Petry, W. Dynamical transition of myoglobin revealed by inelastic neutron scattering. Nature 337, 754–756 (1989).
Hayward, S., Kitao, A. & Gō, N. Harmonicity and anharmonicity in protein dynamics: A normal mode analysis and principal component analysis. Proteins 23, 177–186 (1995).
García, A. E. Large-amplitude nonlinear motions in proteins. Phys. Rev. Lett. 68, 2696–2699 (1992).
Doster, W. The dynamical transition of proteins, concepts and misconceptions. Eur. Biophys. J. 37, 591–602 (2008).
Gerstein, M., Lesk, A. M. & Chothia, C. Structural Mechanisms for Domain Movements in Proteins. Biochemistry 33, 6739–6749 (1994).
Benkovic, S. J. & Hammes-Schiffer, S. Enzyme Motions Inside and Out. Science 312, 208–209 (2006).
Rashin, A. A., Rashin, A. H. & Jernigan, R. L. Diversity of function-related conformational changes in proteins: coordinate uncertainty, fragment rigidity and stability. Biochemistry 49, 5683–5704 (2010).
Henzler-Wildman, K. A. et al. A hierarchy of timescales in protein dynamics is linked to enzyme catalysis. Nature 450, 913–916 (2007).
Henzler-Wildman, K. & Kern, D. Dynamic personalities of proteins. Nature 450, 964–972 (2007).
Scott, A. Davydov’s soliton. Phys. Rep. 217, 1–67 (1992).
Davydov, A. S. Solitons and energy transfer along protein molecules. J. Theoret. Biol. 66, 379–387 (1977).
Xie, A. H., Meer, L., Hoff, W. & Austin, R. H. Long-lived amide I vibrational modes in myoglobin. Phys. Rev. Lett. 84, 5435–5438 (2000).
Edler, J., Pfister, R., Pouthier, V., Falvo, C. & Hamm, P. Direct Observation of Self-Trapped Vibrational States in α-Helices. Phys. Rev. Lett. 93, 106405 (2004).
Austin, R. H. et al. Tilting after Dutch windmills: probably no long-lived Davydov solitons in proteins. J. Biol. Phys. 35, 91–101 (2009).
Kobus, M., Nguyen, P. H. & Stock, G. Coherent vibrational energy transfer along a peptide helix. J. Chem. Phys. 134, 124518 (2011).
Juanico, B., Sanejouand, Y.-H., Piazza, F. & De Los Rios, P. Discrete Breathers in Nonlinear Network Models of Proteins. Phys. Rev. Lett. 99, 238104 (2007).
Piazza, F. Nonlinear excitations match correlated motions unveiled by NMR in proteins: a new perspective on allosteric cross-talk. Phys. Biol. 11, 036003 (2014).
Krokhotin, A., Liwo, A., Maisuradze, G. G., Niemi, A. J. & Scheraga, H. A. Kinks, loops and protein folding, with protein A as an example. J. Chem. Phys. 140, 025101 (2014).
Feddersen, H. Localization of vibrational energy in globular protein. Phys. Lett. A 154, 391–395 (1991).
Karplus, M. & Petsko, G. A. Molecular dynamics simulations in biology. Nature 347, 631–639 (1990).
Lindorff-Larsen, K., Best, R. B., DePristo, M. A., Dobson, C. M. & Vendruscolo, M. Simultaneous determination of protein structure and dynamics. Nature 433, 128–132 (2005).
Shaw, D. E. et al. Atomic-level characterization of the structural dynamics of proteins. Science 330, 341–346 (2010).
Peregrine, D. H. Water waves, nonlinear Schrödinger equations and their solutions. Anziam J. 25, 16–43 (1983).
Kibler, B. et al. The Peregrine soliton in nonlinear fibre optics. Nat. Phys. 6, 790–795 (2010).
Gō, N., Noguti, T. & Nishikawa, T. Dynamics of a small globular protein in terms of low-frequency vibrational modes. Proc. Natl. Acad. Sci. USA 80, 3696–3700 (1983).
Bahar, I., Atilgan, A. R., Demirel, M. C. & Erman, B. Vibrational Dynamics of Folded Proteins: Significance of Slow and Fast Motions in Relation to Function and Stability. Phys. Rev. Lett. 80, 2733–2736 (1998).
Keskin, O., Jernigan, R. L. & Bahar, I. Proteins with Similar Architecture Exhibit Similar Large-Scale Dynamic Behavior. Biophys. J. 78, 2093–2106 (2000).
Nishikawa, K., Momany, F. A. & Scheraga, H. A. Low-Energy Structures of Two Dipeptides and Their Relationship to Bend Conformations. Macromolecules 7, 797–806 (1974).
Senet, P., Maisuradze, G. G., Foulie, C., Delarue, P. & Scheraga, H. A. How main-chains of proteins explore the free-energy landscape in native states. Proc. Natl. Acad. Sci. USA 105, 19708–19713 (2008).
Korkut, A. & Hendrickson, W. A. A force field for virtual atom molecular mechanics of proteins. Proc. Natl. Acad. Sci. USA 106, 15667–15672 (2009).
Maisuradze, G. G., Senet, P., Czaplewski, C., Liwo, A. & Scheraga, H. A. Investigation of Protein Folding by Coarse-Grained Molecular Dynamics with the UNRES Force Field. J. Phys. Chem. A 114, 4471–4485 (2010).
Cote, Y., Maisuradze, G. G., Delarue, P., Scheraga, H. A. & Senet, P. New Insights into Protein (Un)Folding Dynamics. J. Phys. Chem. Lett. 6, 1082–1086 (2015).
Cote, Y., Senet, P., Delarue, P., Maisuradze, G. G. & Scheraga, H. A. Anomalous diffusion and dynamical correlation between the side chains and the main chain of proteins in their native state. Proc. Natl. Acad. Sci. USA 109, 10346–10351 (2012).
Neidigh, J. W., Fesinmeyer, R. M. & Andersen, N. H. Designing a 20-residue protein. Nat. Struct. Biol. 9, 425–430 (2002).
Qiu, L., Pabit, S. A., Roitberg, A. E. & Hagen, S. J. Smaller and faster: the 20-residue Trp-cage protein folds in 4 μs. J. Am. Chem. Soc. 124, 12952–12953 (2002).
McKnight, C. J., Doering, D. S., Matsudaira, P. T. & Kim, P. S. A Thermostable 35-Residue Subdomain within Villin Headpiece. J. Mol. Biol. 260, 126–134 (1996).
McKnight, C. J., Matsudaira, P. T. & Kim, P. S. NMR structure of the 35-residue villin headpiece subdomain. Nat. Struct. Mol. Biol. 4, 180–184 (1997).
Zhou, R. Trp-cage: Folding free energy landscape in explicit water. Proc. Natl. Acad. Sci. USA 100, 13280–13285 (2003).
Paschek, D., Nymeyer, H. & García, A. E. Replica exchange simulation of reversible folding/unfolding of the Trp-cage miniprotein in explicit solvent: On the structure and possible role of internal water. J. Struct. Biol. 157, 524–533 (2007).
Paschek, D., Hempel, S. & García, A. E. Computing the stability diagram of the Trp-cage miniprotein. Proc. Natl. Acad. Sci. USA 105, 17754–17759 (2008).
Bandyopadhyay, S., Chakraborty, S., Balasubramanian, S., Pal, S. & Bagchi, B. Atomistic Simulation Study of the Coupled Motion of Amino Acid Residues and Water Molecules around Protein HP-36: Fluctuations at and around the Active Sites. J. Phys. Chem. B 108, 12608–12616 (2004).
Lucas, A. A. Phonon Modes of an Ionic Crystal Slab. J. Chem. Phys. 48, 3156–3168 (1968).
Maisuradze, G. G., Liwo, A., Senet, P. & Scheraga, H. A. Local vs Global Motions in Protein Folding. J. Chem. Theory Comput. 9, 2907–2921 (2013).
Nicolaï, A., Delarue, P. & Senet, P. Decipher the Mechanisms of Protein Conformational Changes Induced by Nucleotide Binding through Free-Energy Landscape Analysis: ATP Binding to Hsp70. PLoS Comput. Biol. 9, e1003379 (2013).
Woutersen, S., Mu, Y., Stock, G. & Hamm, P. Subpicosecond conformational dynamics of small peptides probed by two-dimensional vibrational spectroscopy. Proc. Natl. Acad. Sci. USA 98, 11254–11258 (2001).
Zhou, H.-X., Wlodek, S. T. & McCammon, J. A. Conformation gating as a mechanism for enzyme specificity. Proc. Natl. Acad. Sci. USA 95, 9280–9283 (1998).
Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W. & Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79, 926–935 (1983).
Romagnoli, S. et al. NMR structural determination of viscotoxin A3 from Viscum album L. Biochem. J. 350, 569–577 (2000).
van der Spoel, D. et al. Gromacs User Manual version 4.6.6 (2011). URL http://manual.gromacs.org/documentation/ Date of access: 15 October 2015.
Lindorff-Larsen, K. et al. Improved side-chain torsion potentials for the Amber ff99sb protein force field. Proteins 78, 1950–1958 (2010).
MacKerell, A. D. et al. All-atom empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem. B 102, 3586–3616 (1998).
MacKerell, A. D., Feig, M. & Brooks, C. L. Extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. J. Comput. Chem. 25, 1400–1415 (2004).
Best, R. B., De Sancho, D. & Mittal, J. Residue-specific α-helix propensities from molecular simulation. Biophys. J. 102, 1462–1467 (2012).
Nosé, S. A unified formulation of the constant temperature molecular dynamics methods. J. Chem. Phys. 81, 511–519 (1984).
Hoover, W. G. Canonical dynamics: Equilibrium phase-space distributions. Phys. Rev. A 31, 1695–1697 (1985).
Parrinello, M. & Rahman, A. Polymorphic transitions in single crystals: A new molecular dynamics method. J. Appl. Phys. 52, 7182–7190 (1981).
Darden, T., York, D. & Pedersen, L. Particle mesh Ewald: An N·log(N) method for Ewald sums in large systems. J. Chem. Phys. 98, 10089–10092 (1993).

Acknowledgements

The calculations were performed using HPC resources from DSI-CCUB (Université de Bourgogne). The authors thank the Conseil Régional de Bourgogne for the funding (PARI Nano2bio).

Author information

Authors and Affiliations

Department of Physics, Applied Physics and Astronomy, Rensselaer Polytechnic Institute, Troy, 12180, New York, United States
Adrien Nicolaï
Laboratoire Interdisciplinaire Carnot de Bourgogne, UMR 6303 CNRS-Univ. Bourgogne Franche-Comté, 9 Av. A. Savary, BP 47 870, Dijon Cedex, F-21078, France
Adrien Nicolaï, Patrice Delarue & Patrick Senet

Contributions

P.S. designed the research. A.N. and P.D. performed the simulations. A.N., P.D. and P.S. analyzed the data. A.N. and P.S. wrote the paper.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Electronic supplementary material

Supplementary Information

Supplementary Video

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Nicolaï, A., Delarue, P. & Senet, P. Intrinsic Localized Modes in Proteins. Sci Rep 5, 18128 (2015). https://doi.org/10.1038/srep18128

Received: 23 July 2015
Accepted: 13 November 2015
Published: 11 December 2015
DOI: https://doi.org/10.1038/srep18128
