Main

Although fluctuations of distances between atoms in folded proteins are necessarily spatially bounded (confined), it is conceivable that, as the timescale of observation is increased, a protein may incorporate into these fluctuations slower pathways over its energy landscape. The question then arises as to whether there is any finite characteristic time associated with a given structural change or whether, instead, the timescale on which a structural fluctuation is observed determines the apparent characteristic relaxation time obtained for the motion. To examine this question we have performed molecular dynamics (MD) simulations to characterize the internal dynamics of three globular proteins of markedly different size and structure: one with a single structural domain (K-Ras), one with two structural domains (phosphoglycerate kinase (PGK)) and one with four structural domains (the Escherichia coli aminopeptidase N (ePepN)). MD simulations of different lengths (observation times) were performed. We examine in detail the motion between the two domains of PGK (Fig. 1), which is of direct functional importance (ref. 5). The time-averaged mean-square displacement (TA-MSD; Supplementary Equations 1 and 2), a function of the lag time Δ and the observation time t (the length of the trajectory), is calculated from the time series of the distance R(t) between the centres of mass of the two domains and is presented together with the corresponding normalized displacement autocorrelation function (ACF), C(Δ; t) (Supplementary Equations 3 and 5), in Fig. 2a,b. The TA-MSD does not reach a plateau over the timescale examined. Furthermore, TA-MSDs calculated over different observation times t are shifted relative to each other, with the slope decreasing as t increases, a signature of ageing and observation-time-dependent dynamics (ref. 6).
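As an illustration of the two quantities introduced above, the following minimal sketch estimates a TA-MSD and a normalized displacement ACF from a discretely sampled distance series. The estimators are the standard textbook forms and are an assumption here, because Supplementary Equations 1–5 are not reproduced in this text; the function names and the use of integer frame lags are illustrative choices.

```python
from statistics import fmean


def ta_msd(series, lag):
    """Time-averaged MSD of one trajectory at an integer lag (in frames)."""
    n = len(series) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must be >= 0 and shorter than the trajectory")
    # Average the squared displacement over all start times in the trajectory.
    return fmean((series[i + lag] - series[i]) ** 2 for i in range(n))


def normalized_acf(series, lag):
    """Autocorrelation of the displacement from the mean, scaled so C(0) = 1."""
    n = len(series) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must be >= 0 and shorter than the trajectory")
    mean = fmean(series)
    dev = [x - mean for x in series]
    var = fmean(d * d for d in dev)
    if var == 0.0:
        raise ValueError("series is constant; the ACF is undefined")
    return fmean(dev[i] * dev[i + lag] for i in range(n)) / var
```

Applying both functions to trajectories of different lengths t gives the families of curves whose dependence on t is discussed in the text.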

Figure 1: Yeast PGK.

Blue, N-terminal domain (residues 1–185); red, C-terminal domain (residues 200–389); yellow, hinge region (residues 186–199 and 390–416). R(t) indicates the inter-domain centre-of-mass distance.

Figure 2: Non-equilibrium inter-domain dynamics of PGK.

a, TA-MSD (Supplementary Equations 1 and 2) averaged over five independent trajectories for t = 100 ps, 10 ns and 500 ns, together with the TA-MSD for t = 17 μs. Dotted reference lines indicating power laws with different exponents are plotted as a visual guide. b, ACFs (Supplementary Equations 3 and 5) of the inter-domain distance trajectories, calculated from different independent MD trajectories. A reference line at e^−1 is plotted to serve as a visual guide. c, Scaling behaviour between the observed characteristic time τc and the observation time t. The logarithms (to base 10) of the characteristic relaxation time τc of the inter-domain distance fluctuation of PGK and ePepN, of the intra-domain structural fluctuation within the single-domain protein K-Ras (see Supplementary Information), and of the average for the distance fluctuations between residue side-chain pairs in PGK (see Supplementary Information) are plotted against the logarithm (to base 10) of the observation time, t. τc obtained from MD simulations is defined as the time at which the normalized autocorrelation function decays to 1/e. A reference line for the power-law relationship τc(t) ∼ t^0.9 is plotted as a visual guide. The error bars shown with the red circles represent the standard deviation of log10(τc) associated with individual residue pairs. d, Power spectral density (PSD), S(f), of the inter-domain distance fluctuation of PGK versus frequency, f ([f] = 10^−12 Hz), calculated using the Welch algorithm (ref. 27). Different coloured symbols indicate different observation times: black, t = 100 ps; red, t = 10 ns; blue, t = 500 ns; and magenta, t = 17 μs. The inset shows the estimated PSD of protein structural fluctuation based on the experimental single-molecule data published in ref. 4, obtained by numerical Fourier transform of an analytical fit to the experimentally measured autocorrelation function.

C(Δ; t) shifts towards longer lag times with increasing t, again consistent with ageing. Furthermore, an intriguing commonality is found in the dynamics examined on different timescales: Fig. 2c shows that τc, the characteristic time of the inter-domain motion, increases in a power-law fashion with the observation time, t, as τc(t) ∼ t^θ, with θ ≈ 0.9, showing no sign of convergence. Remarkably, data from single-molecule experiments on the distance fluctuations between side-chain pairs (refs 3,4) fall close to the same power-law relationship (Fig. 2c), although these were obtained on t ≥ 300 s observation timescales, more than seven orders of magnitude longer than the MD, and on other proteins. Together, Fig. 2a–c reveals strong non-stationarity (ageing) of the inter-domain dynamics and suggests a power-law dependence extending from 10^−12 to 10^2 s.
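The two steps behind a plot of this kind, reading τc off an ACF at the 1/e level and estimating θ from a log-log fit, can be sketched as follows. This is an illustrative sketch only: the linear interpolation between sampled lags and the unweighted least-squares fit are assumptions, and the function names are hypothetical.

```python
import math


def characteristic_time(lags, acf):
    """First lag at which the ACF falls to 1/e, linearly interpolated.

    Returns None if the ACF never reaches 1/e within the sampled lags.
    """
    target = 1.0 / math.e
    for i in range(1, len(acf)):
        if acf[i] <= target < acf[i - 1]:
            frac = (acf[i - 1] - target) / (acf[i - 1] - acf[i])
            return lags[i - 1] + frac * (lags[i] - lags[i - 1])
    return None


def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx
```

Passing the observation times t and the corresponding τc values to `loglog_slope` returns an estimate of the exponent θ. For a stationary process with a finite relaxation time, this slope would be expected to fall towards zero once t greatly exceeds that time.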

The residues probed in the experimental single-molecule studies are only 0.3–0.4 nm apart from each other (refs 3,4), in contrast to the inter-domain distance explored above, the average of which is 3.8 nm. Therefore, we investigated the distance fluctuations between the side chains of 32 selected PGK residue pairs in a variety of structural environments (see Supplementary Information for details). Although there are substantial variations in the results between individual residue pairs, the average behaviour (Fig. 2c) shows the same quantitative time dependence as described above for the inter-domain centre-of-mass motion, with characteristic relaxation times continuously increasing with t and following practically the same power law. Thus, the non-equilibrium scaling behaviour holds both for global (for example, inter-domain) protein motion and a substantial fraction of local motions, and in some cases even for distance fluctuations between adjacent residues on the same α-helix, an example of which is shown in Supplementary Fig. 8. To determine whether these results are specific to PGK, we also performed MD simulations of two other proteins under similar conditions: a much larger, four-domain enzyme (ePepN) and a much smaller, single-domain GTPase, human K-Ras. Both these proteins exhibit the same observation-time-dependent internal dynamics as shown above for PGK (see Fig. 2c and Supplementary Figs 3–5).

The power spectrum of the PGK inter-domain distance fluctuation, S(f), where f is the frequency, is shown in Fig. 2d. S(f) obtained on different observation timescales can be concatenated onto a single profile. For f < 0.1 THz, and over nearly five frequency decades in the MD, S(f) scales approximately as f^−1. Moreover, the power spectrum calculated from the single-molecule experimental data (ref. 4) on the ms to 10^2 s timescales also follows the same 1/f behaviour (inset in Fig. 2d). Hence, the 1/f dependence also extends from ps up to 10^2 s timescales. This ‘1/f-noise’, often also referred to as ‘flicker noise’ or ‘pink noise’, indicates self-similarity of the corresponding dynamics on different timescales (ref. 7). The lack of a characteristic frequency associated with a 1/f spectrum is consistent with the observation of the time-dependent relaxation time: motions with ever-lower frequencies are sampled as the observation length is increased.
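For readers unfamiliar with the Welch estimate mentioned above, the idea is to average windowed periodograms over overlapping segments of the time series. The sketch below is a small pure-Python version written for clarity, not speed. The Hann window, the 50% overlap, the per-segment mean removal and the two-sided density normalization are all assumptions made for illustration and are not taken from the analysis described here.

```python
import cmath
import math


def welch_psd(series, seg_len, dt=1.0):
    """Average Hann-windowed periodograms over half-overlapping segments.

    Returns (freqs, psd) for the non-negative frequencies of one segment.
    """
    if seg_len < 2:
        raise ValueError("seg_len must be at least 2")
    step = seg_len // 2
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * k / seg_len)
              for k in range(seg_len)]
    norm = dt / sum(w * w for w in window)  # two-sided density scaling
    starts = range(0, len(series) - seg_len + 1, step)
    if not starts:
        raise ValueError("series is shorter than one segment")
    n_freq = seg_len // 2 + 1
    psd = [0.0] * n_freq
    for s in starts:
        seg = series[s:s + seg_len]
        mean = sum(seg) / seg_len
        x = [(v - mean) * w for v, w in zip(seg, window)]
        for j in range(n_freq):
            # Naive discrete Fourier transform at frequency index j.
            amp = sum(x[k] * cmath.exp(-2j * math.pi * j * k / seg_len)
                      for k in range(seg_len))
            psd[j] += abs(amp) ** 2 * norm
    freqs = [j / (seg_len * dt) for j in range(n_freq)]
    return freqs, [p / len(starts) for p in psd]
```

A straight-line fit of log S(f) against log f over the non-zero frequencies then estimates the spectral exponent, which is close to −1 for 1/f noise. For trajectories of realistic length an FFT-based routine such as `scipy.signal.welch` is the practical choice.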

The above observations concern the dynamics of folded proteins in their broad, global, free energy minima. As the average structure of most globular proteins is well defined, at sufficiently long times the system will eventually feel a restoring force centred at the mean position, that is, a confining potential. However, within this global broad free energy well there are many small, local minima separated by barriers with different heights (refs 8,9). The dynamics of folded proteins can thus be considered as a fictitious particle diffusing on a rugged energy landscape possessing many wells of various depths (ref. 10). To understand the features of this landscape that give rise to the self-similar dynamics, we mapped the individual MD trajectories onto networks based on the transitions between different metastable conformational states (refs 11,12; see Supplementary Information for details). Clusters of similar structures form the vertices (nodes) of the network. Whenever, during the simulation, the protein transits between two clusters an edge is added between the two vertices, thus forming a conformational cluster transition network (CCTN). Similar networks have been used several times previously to describe complex dynamics, for example, refs 11,12. Graphical illustrations of the networks obtained are shown in Fig. 3. The degree distributions of the networks, defined as the probability P(d) of finding a vertex connected to d direct neighbours, fully overlap on different timescales within the statistical errors, indicating topologically self-similar, fractal networks (Supplementary Fig. 11a). (The fractality here is in the geometry and topology of the CCTN, in contrast to the fractality of the protein structure itself, which has been related to the vibrational density of states (refs 13,14).)
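The construction described above can be sketched in a few lines, assuming that each trajectory frame has already been assigned a cluster label; the clustering itself is described in the Supplementary Information and is not reproduced here. Counting neighbours without regard to edge direction is an assumption of this sketch, and the function names are hypothetical.

```python
from collections import Counter


def build_cctn(labels):
    """Count directed transitions between consecutive, distinct cluster labels."""
    edges = Counter()
    for a, b in zip(labels, labels[1:]):
        if a != b:  # staying in the same cluster is not a transition
            edges[(a, b)] += 1
    return edges


def degree_distribution(edges, vertices):
    """P(d): fraction of vertices with d distinct direct neighbours."""
    neighbours = {v: set() for v in vertices}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    counts = Counter(len(n) for n in neighbours.values())
    total = len(neighbours)
    return {d: c / total for d, c in sorted(counts.items())}
```

For example, `degree_distribution(build_cctn(labels), set(labels))` gives P(d) for one trajectory. Repeating this for trajectories of different lengths allows the distributions to be compared across timescales.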

Figure 3: Network representation of conformational transitions in PGK.

a, Conformational transition network from the 17 μs MD simulation, containing 530 vertices and 2,345 edges. The circles represent structural clusters; the diameter and the colour scale of each circle indicate the cluster size (the darker and larger the circle, the larger the cluster size), defined by the number of conformations belonging to the cluster. The integer label on each vertex indicates its index based on its rank in terms of the cluster size. The arrows represent the transitions between the clusters. The thickness of the arrow and warmness of colour scale indicate the transition frequency (the thicker and warmer the colour, the higher the transition frequency). The graphical representation of the network is generated using the Python library graph-tool (http://graph-tool.skewed.de). b, Conformational transition network from the 500 ns MD simulation, containing 243 vertices and 951 edges. High-resolution versions of both network illustrations are provided in Supplementary Figs 9 and 10.

A natural simplified physical description of the dynamics of the fictitious particle diffusing over the CCTN is a ‘continuous time random walk’ (CTRW; ref. 15), in which, at each step, the random walker draws from a distribution of jumping distances and waiting times. Non-ergodic subdiffusive motion arises if the average waiting time diverges, as, for example, in a power-law waiting time distribution. Diverging waiting times can occur in diffusion on very rugged potential surfaces possessing many deep ‘traps’ in which the system is stuck for extended periods of time (ref. 16). Superposition of thermal noise, representing motion within the trap, leads to the noisy CTRW description introduced in ref. 17.
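A toy simulation may help to make the CTRW picture concrete. The sketch below is a deliberately simplified stand-in and not the noisy CTRW model of ref. 17: waiting times are drawn from a Pareto law whose mean diverges for α ≤ 1, jumps have a fixed length, confinement is imposed by clipping the position to an interval rather than by an external potential, and the optional noise is white and Gaussian. All parameter names and default values are hypothetical.

```python
import random


def ctrw_trajectory(n_samples, alpha=0.5, step=1.0, half_width=5.0,
                    dt=1.0, noise_sd=0.0, seed=0):
    """Sample a confined 1D CTRW at regular times i * dt.

    Waiting times are Pareto distributed (minimum 1, tail exponent alpha),
    so their mean diverges for alpha <= 1.
    """
    rng = random.Random(seed)
    x = 0.0
    next_jump = rng.paretovariate(alpha)
    out = []
    for i in range(n_samples):
        t_obs = i * dt
        # Perform every jump scheduled up to the current observation time.
        while next_jump <= t_obs:
            x += rng.choice((-step, step))
            x = max(-half_width, min(half_width, x))  # crude confinement
            next_jump += rng.paretovariate(alpha)
        noise = rng.gauss(0.0, noise_sd) if noise_sd > 0.0 else 0.0
        out.append(x + noise)
    return out
```

Because the walker can remain trapped for times comparable to the whole trajectory, time averages computed from trajectories of different lengths need not agree, which is the ageing behaviour discussed in the text.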

The decay behaviour of the tails of the ACFs at long Δ is best described by an incomplete beta function (see Supplementary Equation 7), which is the analytical ACF for a non-ergodic subdiffusive CTRW in an external potential, derived from the fractional Fokker–Planck formalism (ref. 18). The full-range ACFs obtained on all observation timescales and for all three proteins are well fitted by a noisy CTRW model (see Supplementary Equation 9 and Fig. 2b and Supplementary Figs 3–5). We note that all ACFs obtained here deviate significantly from power-law behaviour for large Δ. In contrast, the ACFs from related ergodic subdiffusive models, such as fractional Brownian motion or processes governed by the fractional Langevin equation, are described by a Mittag–Leffler function (refs 15,19), which decays as a power law for large lag times Δ, and is independent of t. Furthermore, projecting the motion of interest (for example, the inter-domain motion) onto the CCTN filters out the noise component, leaving the subdiffusive CTRW dynamics (see Supplementary Section 7 and Supplementary Fig. 12).

If observed over a sufficiently long timescale a single, folded protein must eventually reach dynamical equilibrium. At that point the system will be ergodic (the time average of all quantities depending on the dynamics will converge to the ensemble-averaged values), the MSDs will plateau and the subdiffusive CTRW description will break down. The single-molecule spectroscopic experiments indicate this timescale must be longer than 10^2 s (refs 3,4). Hence, a single protein will not reach equilibrium over most timescales on which functional processes, such as ligand binding, allostery and catalysis, usually occur (that is, μs upwards). Combination of the present results with the single-molecule experimental data indicates that self-similar dynamics persists from ps through to the timescales of relevant biological processes in the cellular environment. Indeed, the median half-life of cellular proteins in yeast cells (Saccharomyces cerevisiae) has been estimated at 45 min (ref. 20), barely an order of magnitude longer than the observation time lengths of the single-molecule experiments. Therefore, the dynamics of an individual protein may exhibit non-equilibrium, self-similar dynamical behaviour throughout its typical biological lifespan.

On functional timescales, although the spatial dependence of fluctuations may evolve only extremely slowly with time, the time dependence of the associated motion remains well out of equilibrium, leading to the absence of a finite average in the power-law relationship revealed here between the characteristic time and observation time. This complexity means that canonical assumptions relating dynamics to function break down. Thermal equilibrium is assumed in classical theories for chemical reactions catalysed by proteins, such as the Michaelis–Menten formalism or transition state theory. However, on functional timescales any two protein molecules will sample different regions of conformational space and the associated time-averaged dynamical properties, including catalytic rate constants, will be different. Conformational dynamics modulates enzyme catalytic activity and in some cases can determine the overall rate-limiting step (ref. 21). The present picture is consistent with single-molecule experiments revealing that enzymes possess dynamic disorder (ref. 22), undergoing internal motions on timescales longer than those on which they function, such that the catalytic rate constant of an individual enzyme can strongly fluctuate over time (refs 23,24). The internal motions lead to fluctuations in the height of the effective reaction barrier, occurring on timescales similar to or longer than the reaction, leading to dispersed kinetics and deviation from classical Michaelis–Menten behaviour (refs 2,24,25). In this case, the reaction rate, which would otherwise be Arrhenius, is convolved with the temporal fluctuation of the barrier height, resulting in a long-tailed distribution of the rate (refs 24,25,26). The present work indicates that the corresponding non-equilibrium dynamics is a continuous time random walk on a self-similar, fractal conformational cluster transition network, a general phenomenon associated with the complexity of globular protein structure.
The prevalence of non-equilibrium fractal dynamics in single protein molecules on the timescales of macromolecular function may fundamentally change our appreciation of the relationship between protein dynamics and functional activity.