The dynamics of single protein molecules is non-equilibrium and self-similar over thirteen decades in time

Journal name:
Nature Physics
Volume:
12
Pages:
171–174
DOI:
doi:10.1038/nphys3553

Internal motions of proteins are essential to their function. The time dependence of protein structural fluctuations is highly complex, manifesting subdiffusive, non-exponential behaviour with effective relaxation times existing over many decades in time, from ps up to ~10^2 s (refs 1,2,3,4). Here, using molecular dynamics simulations, we show that, on timescales from 10^−12 to 10^−5 s, motions in single proteins are self-similar, non-equilibrium and exhibit ageing. The characteristic relaxation time for a distance fluctuation, such as inter-domain motion, is observation-time-dependent, increasing in a simple, power-law fashion, arising from the fractal nature of the topology and geometry of the energy landscape explored. Diffusion over the energy landscape follows a non-ergodic continuous time random walk. Comparison with single-molecule experiments suggests that the non-equilibrium self-similar dynamical behaviour persists up to timescales approaching the in vivo lifespan of individual protein molecules.

At a glance

Figures

  1. Yeast PGK.
    Figure 1: Yeast PGK.

    Blue, N-terminal domain (residues 1–185); red, C-terminal domain (residues 200–389); yellow, hinge region (residues 186–199 and 390–416). R(t) indicates the inter-domain centre-of-mass distance.

  2. Non-equilibrium inter-domain dynamics of PGK.
    Figure 2: Non-equilibrium inter-domain dynamics of PGK.

    a, TA-MSD (Supplementary Equations 1 and 2) averaged over five independent trajectories for t = 100 ps, 10 ns and 500 ns, together with the TA-MSD for t = 17 μs. Dotted reference lines indicating power laws with different exponents are plotted as a visual guide. b, ACFs (Supplementary Equations 3 and 5) of the inter-domain distance trajectories, calculated from different independent MD trajectories. A reference line at e^−1 is plotted to serve as a visual guide. c, Scaling behaviour between the observed characteristic time τc and the observation time t. The logarithms (to base 10) of the characteristic relaxation times τc of the inter-domain distance fluctuation of PGK and ePepN, of the intra-domain structural fluctuation within the single-domain protein K-Ras (see Supplementary Information), and of the average for the distance fluctuations between residue side-chain pairs in PGK (see Supplementary Information), are plotted against the logarithm (to base 10) of the observation time, t. τc obtained from MD simulations is defined as the time at which the normalized autocorrelation function decays to 1/e. A reference line for the power-law relationship τc(t) ∝ t^0.9 is plotted as a visual guide. The error bars shown with the red circles represent the standard deviation of log10(τc) associated with individual residue pairs. d, Power spectral density (PSD), S(f), of the inter-domain distance fluctuation of PGK versus frequency, f ([f] = 10^−12 Hz), calculated using the Welch algorithm (ref. 27). Different coloured symbols indicate different observation times: black, t = 100 ps; red, t = 10 ns; blue, t = 500 ns; and magenta, t = 17 μs. The inset shows the estimated PSD of protein structural fluctuation based on the experimental single-molecule data published in ref. 4, obtained by numerical Fourier transform of an analytical fit to the experimentally measured autocorrelation function.

  3. Network representation of conformational transitions in PGK.
    Figure 3: Network representation of conformational transitions in PGK.

    a, Conformational transition network from the 17 μs MD simulation, containing 530 vertices and 2,345 edges. The circles represent structural clusters; the diameter and colour of each circle indicate the cluster size (the darker and larger the circle, the larger the cluster), defined as the number of conformations belonging to the cluster. The integer label on each vertex indicates its rank in terms of cluster size. The arrows represent the transitions between the clusters. The thickness and colour of each arrow indicate the transition frequency (the thicker the arrow and the warmer the colour, the higher the transition frequency). The graphical representation of the network was generated using the Python library graph-tool (http://graph-tool.skewed.de). b, Conformational transition network from the 500 ns MD simulation, containing 243 vertices and 951 edges. High-resolution versions of both network illustrations are provided in Supplementary Figs 9 and 10.

Main

Although fluctuations of distances between atoms in folded proteins are necessarily spatially bounded (confined), it is conceivable that, as the timescale of observation is increased, a protein may incorporate into these fluctuations slower pathways over its energy landscape. The question then arises as to whether there is a finite characteristic time associated with any given structural change at all or whether, instead, the timescale on which a structural fluctuation is observed determines the apparent characteristic relaxation time obtained for the motion. To examine this question we performed molecular dynamics (MD) simulations of different lengths (observation times) to characterize the internal dynamics of three globular proteins of markedly different size and structure: one with a single structural domain (K-Ras), one with two structural domains (phosphoglycerate kinase (PGK)) and one with four structural domains (the Escherichia coli aminopeptidase N (ePepN)). We examine in detail the motion between the two domains of PGK (Fig. 1), which is of direct functional importance (ref. 5). The time-averaged mean-square displacement (TA-MSD; Supplementary Equations 1 and 2), a function of the lag time Δ and the observation time t (the length of the trajectory), is calculated from the time series of the distance R(t) between the centres of mass of the two domains and is presented in Fig. 2a; the corresponding normalized displacement autocorrelation function (ACF), C(Δ; t) (Supplementary Equations 3 and 5), is presented in Fig. 2b. The TA-MSD does not reach a plateau over the timescale examined. Furthermore, TA-MSDs calculated over different observation times t are shifted relative to each other, with the slope decreasing as t increases, a signature of ageing and observation-time-dependent dynamics (ref. 6).
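
As an illustration of the quantity discussed above, the following minimal sketch estimates a TA-MSD from a one-dimensional distance time series. It assumes the standard discrete estimator (the squared displacement at lag Δ averaged over all start frames of a trajectory of length t); whether Supplementary Equations 1 and 2 take exactly this form is an assumption. The function names and the synthetic random-walk trajectory are hypothetical stand-ins for an MD-derived R(t); a plain random walk is used only to exercise the estimator and does not itself show ageing.

```python
import random


def ta_msd(series, lag):
    """Time-averaged MSD of a 1-D series at an integer lag (in frames)."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    # Average the squared displacement over all start frames 0 .. n-lag-1.
    total = sum((series[i + lag] - series[i]) ** 2 for i in range(n - lag))
    return total / (n - lag)


def synthetic_distance(n_frames, seed=0, step=0.01, start=3.8):
    """Hypothetical stand-in for R(t): an unconfined Gaussian random walk (nm)."""
    rng = random.Random(seed)
    r, out = start, []
    for _ in range(n_frames):
        r += rng.gauss(0.0, step)
        out.append(r)
    return out


if __name__ == "__main__":
    # Compare the TA-MSD at fixed lags for two observation lengths.
    for n_frames in (1_000, 10_000):
        traj = synthetic_distance(n_frames)
        print(n_frames, [round(ta_msd(traj, lag), 5) for lag in (1, 10, 100)])
```

Evaluating the same lags for trajectories of different length is the comparison that exposes observation-time dependence in real data.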

C(Δ; t) shifts towards longer lag times with increasing t, again consistent with ageing. Furthermore, an intriguing commonality is found in the dynamics examined on different timescales: Fig. 2c shows that τc, the characteristic time of the inter-domain motion, increases in a power-law fashion with the observation time, t, as τc(t) ∝ t^θ, with θ ≈ 0.9, showing no sign of convergence. Remarkably, data from single-molecule experiments on the distance fluctuations between side-chain pairs (refs 3,4) fall close to the same power-law relationship (Fig. 2c), although these were obtained on other proteins and on observation timescales of t ≥ 300 s, more than seven orders of magnitude longer than the MD simulations. Together, Fig. 2a–c reveals strong non-stationarity (ageing) of the inter-domain dynamics and suggests a power-law dependence extending from 10^−12 to 10^2 s.
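
The analysis above can be sketched in a few lines. The code below computes a normalized autocorrelation function of the fluctuations about the trajectory mean, reads off τc as the lag at which it first decays to 1/e (the definition given in the Fig. 2 caption), and fits the exponent θ as the slope of log10(τc) against log10(t). The ACF estimator is an assumed standard form, not necessarily that of Supplementary Equations 3 and 5, and all function names and the numbers in the usage example are hypothetical.

```python
import math


def normalized_acf(series, max_lag):
    """Normalized autocorrelation of fluctuations about the trajectory mean."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev) / n
    acf = []
    for lag in range(max_lag + 1):
        cov = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
        acf.append(cov / var)
    return acf


def relaxation_time(acf, dt=1.0):
    """First lag at which the ACF falls to 1/e, with linear interpolation."""
    threshold = math.exp(-1.0)
    for lag in range(1, len(acf)):
        if acf[lag] <= threshold:
            a, b = acf[lag - 1], acf[lag]
            return dt * (lag - 1 + (a - threshold) / (a - b))
    return None  # the ACF did not decay to 1/e within the lags supplied


def power_law_exponent(obs_times, tau_values):
    """Least-squares slope of log10(tau_c) against log10(t)."""
    xs = [math.log10(t) for t in obs_times]
    ys = [math.log10(tau) for tau in tau_values]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Made-up (t, tau_c) pairs lying exactly on tau_c = 0.1 * t**0.9.
    t_obs = [1e2, 1e4, 5e5, 1.7e7]
    tau_c = [0.1 * t ** 0.9 for t in t_obs]
    print(power_law_exponent(t_obs, tau_c))
```

Fitting in log-log space mirrors the way Fig. 2c is plotted, so the fitted slope corresponds directly to the reference line shown there.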

The residues probed in the experimental single-molecule studies are only ~0.3–0.4 nm apart (refs 3,4), in contrast to the inter-domain distance explored above, the average of which is ~3.8 nm. Therefore, we investigated the distance fluctuations between the side chains of 32 selected PGK residue pairs in a variety of structural environments (see Supplementary Information for details). Although there are substantial variations in the results between individual residue pairs, the average behaviour (Fig. 2c) shows the same quantitative time dependence as described above for the inter-domain centre-of-mass motion, with characteristic relaxation times continuously increasing with t and following practically the same power law. Thus, the non-equilibrium scaling behaviour holds both for global (for example, inter-domain) protein motion and for a substantial fraction of local motions, and in some cases even for distance fluctuations between adjacent residues on the same α-helix, an example of which is shown in Supplementary Fig. 8. To determine whether these results are specific to PGK, we also performed MD simulations of two other proteins under similar conditions: a much larger, four-domain enzyme (ePepN) and a much smaller, single-domain GTPase, human K-Ras. Both these proteins exhibit the same observation-time-dependent internal dynamics as shown above for PGK (see Fig. 2c and Supplementary Figs 3–5).

The power spectrum of the PGK inter-domain distance fluctuation, S(f), where f is the frequency, is shown in Fig. 2d. S(f) obtained on different observation timescales can be concatenated onto a single profile. For f < 0.1 THz, and over nearly five frequency decades in the MD, S(f) scales approximately as f^−1. Moreover, the power spectrum calculated from the single-molecule experimental data (ref. 4) on the ms to 10^2 s timescales also follows the same 1/f behaviour (inset in Fig. 2d). Hence, the 1/f dependence also extends from ps up to ~10^2 s timescales. This ‘1/f noise’, often also referred to as ‘flicker noise’ or ‘pink noise’, indicates self-similarity of the corresponding dynamics on different timescales (ref. 7). The lack of a characteristic frequency associated with a 1/f spectrum is consistent with the observation-time-dependent relaxation time: motions with ever-lower frequencies are sampled as the observation length is increased.
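
A Welch-type estimate of S(f) and its log-log slope can be sketched as follows. This is a minimal, dependency-free illustration that assumes Hann-windowed segments with 50% overlap and a naive discrete Fourier transform; the segment length, overlap and window actually used for Fig. 2d are not stated here, so these choices and the function names are assumptions. For real trajectories an FFT-based library routine would be the practical choice.

```python
import cmath
import math


def welch_psd(series, seg_len, dt=1.0):
    """Welch-style PSD: Hann-windowed, 50%-overlapping, mean-removed segments.

    Returns (frequencies, one-sided PSD) for bins 0 .. seg_len // 2.
    """
    if seg_len < 2 or seg_len > len(series):
        raise ValueError("seg_len must be between 2 and len(series)")
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / seg_len) for n in range(seg_len)]
    norm = sum(w * w for w in window) / dt  # window power times sampling rate
    half = seg_len // 2
    starts = range(0, len(series) - seg_len + 1, max(1, seg_len // 2))
    psd = [0.0] * (half + 1)
    for s in starts:
        seg = series[s:s + seg_len]
        mean = sum(seg) / seg_len
        x = [(v - mean) * w for v, w in zip(seg, window)]
        for k in range(half + 1):
            # Naive DFT of the windowed segment (O(N^2), acceptable for a sketch).
            coeff = sum(x[n] * cmath.exp(-2j * math.pi * k * n / seg_len)
                        for n in range(seg_len))
            psd[k] += abs(coeff) ** 2 / norm
    count = len(starts)
    for k in range(half + 1):
        # Double interior bins to fold negative frequencies into one side.
        scale = 2.0 if 0 < k < seg_len / 2 else 1.0
        psd[k] *= scale / count
    freqs = [k / (seg_len * dt) for k in range(half + 1)]
    return freqs, psd


def loglog_slope(freqs, psd):
    """Least-squares slope of log10(S) against log10(f), skipping non-positive values."""
    pts = [(math.log10(f), math.log10(s)) for f, s in zip(freqs, psd) if f > 0 and s > 0]
    mx = sum(x for x, _ in pts) / len(pts)
    my = sum(y for _, y in pts) / len(pts)
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx
```

A slope close to −1 over the fitted frequency range would correspond to the 1/f behaviour described above; spectra from trajectories of different length can be fitted separately and then compared.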

The above observations concern the dynamics of folded proteins in their broad, global, free energy minima. As the average structure of most globular proteins is well defined, at sufficiently long times the system will eventually feel a restoring force centred at the mean position, that is, a confining potential. However, within this broad global free energy well there are many small, local minima separated by barriers of different heights (refs 8,9). The dynamics of folded proteins can thus be considered as that of a fictitious particle diffusing on a rugged energy landscape possessing many wells of various depths (ref. 10). To understand the features of this landscape that give rise to the self-similar dynamics, we mapped the individual MD trajectories onto networks based on the transitions between different metastable conformational states (refs 11,12; see Supplementary Information for details). Clusters of similar structures form the vertices (nodes) of the network. Whenever, during the simulation, the protein transits between two clusters, an edge is added between the two vertices, thus forming a conformational cluster transition network (CCTN). Similar networks have been used previously to describe complex dynamics (for example, refs 11,12). Graphical illustrations of the networks obtained are shown in Fig. 3. The degree distributions of the networks, defined as the probability P(d) of finding a vertex connected to d direct neighbours, fully overlap on different timescales within the statistical errors, indicating topologically self-similar, fractal networks (Supplementary Fig. 11a). (The fractality here is in the geometry and topology of the CCTN, in contrast to the fractality of the protein structure itself, which has been related to the vibrational density of states (refs 13,14).)
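
The network construction described above can be sketched as follows, assuming that each trajectory frame has already been assigned a cluster label (the clustering procedure itself is described in the Supplementary Information and is not reproduced here). Edges are counted as directed transitions between consecutive frames, and the degree of a vertex is taken as its number of distinct neighbours irrespective of direction; both conventions, like the function names, are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_transition_network(labels):
    """Count directed transitions between consecutive, distinct cluster labels."""
    edges = Counter()
    for a, b in zip(labels, labels[1:]):
        if a != b:  # remaining in the same cluster adds no edge
            edges[(a, b)] += 1
    return edges


def degree_distribution(edges, vertices):
    """P(d): fraction of vertices with d distinct neighbours, ignoring direction."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    degrees = Counter(len(neighbours[v]) for v in vertices)
    total = sum(degrees.values())
    return {d: c / total for d, c in sorted(degrees.items())}


if __name__ == "__main__":
    # Hypothetical cluster label per frame.
    labels = [0, 0, 1, 2, 1, 0, 3]
    edges = build_transition_network(labels)
    sizes = Counter(labels)  # cluster size = number of frames in each cluster
    print(dict(edges))
    print(degree_distribution(edges, set(labels)))
    print(sizes.most_common())
```

Computing P(d) for label sequences taken from trajectories of different length gives the distributions whose overlap is discussed above.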

A natural simplified physical description of the dynamics of the fictitious particle diffusing over the CCTN is a ‘continuous time random walk’ (CTRW; ref. 15), in which, at each step, the random walker draws a jump distance and a waiting time from their respective distributions. Non-ergodic subdiffusive motion arises if the average waiting time diverges, as, for example, for a power-law waiting time distribution. Diverging waiting times can occur in diffusion on very rugged potential surfaces possessing many deep ‘traps’ in which the system is stuck for extended periods of time (ref. 16). Superposition of thermal noise, representing motion within the trap, leads to the noisy CTRW description introduced in ref. 17.
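
A minimal sketch of such a walk is given below. It assumes Pareto-distributed waiting times, ψ(τ) ∝ τ^(−1−α) for τ ≥ τ0, whose mean diverges for 0 < α ≤ 1, together with Gaussian jump lengths and optional white Gaussian noise added to each sampled position. The walk is unconfined, whereas the model discussed in the text includes an external potential; the noise model, the parameter values and the function name are likewise illustrative assumptions rather than the formulation of ref. 17.

```python
import random


def simulate_ctrw(total_time, alpha=0.7, tau0=1.0, jump_sd=1.0,
                  noise_sd=0.0, dt=1.0, seed=0):
    """Sample a 1-D CTRW on a regular time grid of spacing dt.

    Waiting times follow a Pareto law with tail exponent alpha, so their
    mean diverges when 0 < alpha <= 1.
    """
    rng = random.Random(seed)
    n_samples = int(total_time / dt)
    positions = []
    x = 0.0
    # 1 - random() lies in (0, 1], which avoids raising zero to a negative power.
    next_jump = tau0 * (1.0 - rng.random()) ** (-1.0 / alpha)
    for i in range(n_samples):
        t = i * dt
        # Perform every jump whose waiting period has elapsed by time t.
        while next_jump <= t:
            x += rng.gauss(0.0, jump_sd)
            next_jump += tau0 * (1.0 - rng.random()) ** (-1.0 / alpha)
        noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        positions.append(x + noise)
    return positions


if __name__ == "__main__":
    traj = simulate_ctrw(10_000, alpha=0.7, noise_sd=0.1)
    print(len(traj), traj[-1])
```

Trajectories generated for different values of total_time can be passed through a TA-MSD or ACF estimator to explore how the apparent relaxation behaviour depends on the observation length.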

The decay behaviour of the tails of the ACFs at long Δ is best described by an incomplete beta function (see Supplementary Equation 7), which is the analytical ACF for a non-ergodic subdiffusive CTRW in an external potential, derived from the fractional Fokker–Planck formalism (ref. 18). The full-range ACFs obtained on all observation timescales and for all three proteins are well fitted by a noisy CTRW model (see Supplementary Equation 9, Fig. 2b and Supplementary Figs 3–5). We note that all ACFs obtained here deviate significantly from power-law behaviour for large Δ. In contrast, the ACFs from related ergodic subdiffusive models, such as fractional Brownian motion or processes governed by the fractional Langevin equation, are described by a Mittag–Leffler function (refs 15,19), which decays as a power law for large lag times Δ and is independent of t. Furthermore, projecting the motion of interest (for example, the inter-domain motion) onto the CCTN filters out the noise component, leaving the subdiffusive CTRW dynamics (see Supplementary Section 7 and Supplementary Fig. 12).

If observed over a sufficiently long timescale, a single folded protein must eventually reach dynamical equilibrium. At that point the system will be ergodic (the time average of all quantities depending on the dynamics will converge to the ensemble-averaged values), the MSDs will plateau and the subdiffusive CTRW description will break down. The single-molecule spectroscopic experiments indicate that this timescale must be longer than ~10^2 s (refs 3,4). Hence, a single protein will not reach equilibrium over most timescales on which functional processes, such as ligand binding, allostery and catalysis, usually occur (that is, μs upwards). Combination of the present results with the single-molecule experimental data indicates that self-similar dynamics persists from ps through to the timescales of relevant biological processes in the cellular environment. Indeed, the median half-life of cellular proteins in yeast cells (Saccharomyces cerevisiae) has been estimated at ~45 min (ref. 20), barely an order of magnitude longer than the observation time lengths of the single-molecule experiments. Therefore, the dynamics of an individual protein may exhibit non-equilibrium, self-similar dynamical behaviour throughout its typical biological lifespan.

On functional timescales, although the spatial dependence of fluctuations may evolve only extremely slowly with time, the time dependence of the associated motion remains well out of equilibrium, leading to the absence of a finite average in the power-law relationship revealed here between the characteristic time and observation time. This complexity means that canonical assumptions relating dynamics to function break down. Thermal equilibrium is assumed in classical theories for chemical reactions catalysed by proteins, such as the Michaelis–Menten formalism or transition state theory. However, on functional timescales any two protein molecules will sample different regions of conformational space and the associated time-averaged dynamical properties, including catalytic rate constants, will be different. Conformational dynamics modulates enzyme catalytic activity and in some cases can determine the overall rate-limiting step (ref. 21). The present picture is consistent with single-molecule experiments revealing that enzymes possess dynamic disorder (ref. 22), undergoing internal motions on timescales longer than those on which they function, such that the catalytic rate constant of an individual enzyme can strongly fluctuate over time (refs 23,24). The internal motions lead to fluctuations in the height of the effective reaction barrier, occurring on timescales similar to or longer than the reaction, leading to dispersed kinetics and deviation from classical Michaelis–Menten behaviour (refs 2,24,25). In this case, the reaction rate, which would otherwise be Arrhenius, is convolved with the temporal fluctuation of the barrier height, resulting in a long-tailed distribution of the rate (refs 24,25,26). The present work indicates that the corresponding non-equilibrium dynamics is a continuous time random walk on a self-similar, fractal conformational cluster transition network, a general phenomenon associated with the complexity of globular protein structure.
The prevalence of non-equilibrium fractal dynamics in single protein molecules on the timescales of macromolecular function may fundamentally change our appreciation of the relationship between protein dynamics and functional activity.

References

  1. Lim, M., Jackson, T. A. & Anfinrud, P. A. Nonexponential protein relaxation: Dynamics of conformational change in myoglobin. Proc. Natl Acad. Sci. USA 90, 5801–5804 (1993).
  2. Wang, Y. & Lu, H. P. Bunching effect in single-molecule T4 lysozyme nonequilibrium conformational dynamics under enzymatic reactions. J. Phys. Chem. B 114, 6669–6674 (2010).
  3. Yang, H. et al. Protein conformational dynamics probed by single-molecule electron transfer. Science 302, 262–266 (2003).
  4. Min, W., Luo, G., Cherayil, B. J., Kou, S. C. & Xie, X. S. Observation of a power-law memory kernel for fluctuations within a single protein molecule. Phys. Rev. Lett. 94, 198302 (2005).
  5. Bernstein, B. E., Michels, P. & Hol, W. Synergistic effects of substrate-induced conformational changes in phosphoglycerate kinase activation. Nature 385, 275–278 (1997).
  6. Neusius, T., Sokolov, I. M. & Smith, J. C. Subdiffusion in time-averaged, confined random walks. Phys. Rev. E 80, 011109 (2009).
  7. Weissman, M. B. 1/f noise and other slow, nonexponential kinetics in condensed matter. Rev. Mod. Phys. 60, 537–571 (1988).
  8. Frauenfelder, H., Sligar, S. G. & Wolynes, P. G. The energy landscapes and motions of proteins. Science 254, 1589–1603 (1991).
  9. Wales, D. J. Energy Landscapes (Cambridge Univ. Press, 2003).
  10. Frauenfelder, H. & Leeson, D. T. The energy landscape in non-biological and biological molecules. Nature Struct. Biol. 5, 757–759 (1998).
  11. Noé, F., Horenko, I., Schütte, C. & Smith, J. C. Hierarchical analysis of conformational dynamics in biomolecules: Transition networks of metastable states. J. Chem. Phys. 126, 155102 (2007).
  12. Neusius, T., Daidone, I., Sokolov, I. M. & Smith, J. C. Configurational subdiffusion of peptides: A network study. Phys. Rev. E 83, 021902 (2011).
  13. Reuveni, S., Granek, R. & Klafter, J. Proteins: Coexistence of stability and flexibility. Phys. Rev. Lett. 100, 208101 (2008).
  14. Reuveni, S., Granek, R. & Klafter, J. Anomalies in the vibrational dynamics of proteins are a consequence of fractal-like structure. Proc. Natl Acad. Sci. USA 107, 13696–13700 (2010).
  15. Metzler, R., Jeon, J.-H., Cherstvy, A. G. & Barkai, E. Anomalous diffusion models and their properties: Non-stationarity, non-ergodicity, and ageing at the centenary of single particle tracking. Phys. Chem. Chem. Phys. 16, 24128–24164 (2014).
  16. Bouchaud, J.-P. & Georges, A. Anomalous diffusion in disordered media: Statistical mechanics, models and physical applications. Phys. Rep. 195, 127–293 (1990).
  17. Jeon, J.-H., Barkai, E. & Metzler, R. Noisy continuous time random walks. J. Chem. Phys. 139, 121916 (2013).
  18. Burov, S., Metzler, R. & Barkai, E. Aging and nonergodicity beyond the Khinchin theorem. Proc. Natl Acad. Sci. USA 107, 13228–13233 (2010).
  19. Jeon, J.-H., Leijnse, N., Oddershede, L. B. & Metzler, R. Anomalous diffusion and power-law relaxation of the time averaged mean squared displacement in worm-like micellar solutions. New J. Phys. 15, 045011 (2013).
  20. Belle, A., Tanay, A., Bitincka, L., Shamir, R. & O'Shea, E. K. Quantification of protein half-lives in the budding yeast proteome. Proc. Natl Acad. Sci. USA 103, 13004–13009 (2006).
  21. Henzler-Wildman, K. A. et al. A hierarchy of timescales in protein dynamics is linked to enzyme catalysis. Nature 450, 913–916 (2007).
  22. Zwanzig, R. Rate processes with dynamical disorder. Acc. Chem. Res. 23, 148–152 (1990).
  23. Lu, H. P., Xun, L. & Xie, X. S. Single-molecule enzymatic dynamics. Science 282, 1877–1882 (1998).
  24. Min, W. et al. Fluctuating enzymes: Lessons from single-molecule studies. Acc. Chem. Res. 38, 923–931 (2005).
  25. English, B. P. et al. Ever-fluctuating single enzyme molecules: Michaelis–Menten equation revisited. Nature Chem. Biol. 8, 87–94 (2006).
  26. Flomenbom, O. et al. Stretched exponential decay and correlations in the catalytic activity of fluctuating single lipase molecules. Proc. Natl Acad. Sci. USA 102, 2368–2372 (2005).
  27. Welch, P. The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms. IEEE Trans. Audio Electroacoust. 15, 70–73 (1967).

Acknowledgements

Anton computer time was provided by the National Center for Multiscale Modeling of Biological Systems (MMBioS) through Grant P41GM103712-S1 from the National Institutes of Health (NIH) and the Pittsburgh Supercomputing Center (PSC). The Anton machine at PSC was generously made available by D.E. Shaw Research. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the US Department of Energy under Contract No. DE-AC05-00OR22725 and resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the US Department of Energy under Contract No. DE-AC02-05CH11231. L.H. acknowledges the support from NSF China 11504231. We thank I. M. Sokolov, A. P. Sokolov and F. Noé for fruitful discussions and T. Splettstößer (http://www.scistyle.com) for rendering the 3D protein structure shown in Fig. 1.

Author information

Affiliations

  1. Center for Molecular Biophysics, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, USA

    • Xiaohu Hu,
    • Micholas Dean Smith,
    • Xiaolin Cheng &
    • Jeremy C. Smith
  2. Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee 37996, USA

    • Xiaohu Hu
  3. Institute of Natural Sciences & Department of Physics and Astronomy, Shanghai Jiao Tong University, Shanghai 200240, China

    • Liang Hong
  4. Wiesbaden Business School, Rhein-Main University of Applied Sciences, Bleichstr. 44, D-65183 Wiesbaden, Germany

    • Thomas Neusius
  5. Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee 37996, USA

    • Jeremy C. Smith

Contributions

X.H. performed and conceived the research, analysed the results and wrote the manuscript. L.H. analysed the results and wrote the manuscript. M.D.S. performed the research. T.N. analysed the results and wrote the manuscript. X.C. analysed the results and wrote the manuscript. J.C.S. conceived the research, analysed the results and wrote the manuscript.

Competing financial interests

The authors declare no competing financial interests.
