Structural and dynamic origins of ESR lineshapes in spin-labeled GB1 domain: the insights from spin dynamics simulations based on long MD trajectories

Site-directed spin labeling (SDSL) ESR is a valuable tool to probe protein systems that are not amenable to characterization by x-ray crystallography, NMR or EM. While general principles that govern the shape of SDSL ESR spectra are known, its precise relationship with protein structure and dynamics is still not fully understood. To address this problem, we designed seven variants of GB1 domain bearing R1 spin label and recorded the corresponding MD trajectories (combined length 180 μs). The MD data were subsequently used to calculate time evolution of the relevant spin density matrix and thus predict the ESR spectra. The simulated spectra proved to be in good agreement with the experiment. Further analysis confirmed that the spectral shape primarily reflects the degree of steric confinement of the R1 tag and, for the well-folded protein such as GB1, offers little information on local backbone dynamics. The rotameric preferences of R1 side chain are determined by the type of the secondary structure at the attachment site. The rotameric jumps involving dihedral angles χ1 and χ2 are sufficiently fast to directly influence the ESR lineshapes. However, the jumps involving multiple dihedral angles tend to occur in (anti)correlated manner, causing smaller-than-expected movements of the R1 proxyl ring. Of interest, ESR spectra of GB1 domain with solvent-exposed spin label can be accurately reproduced by means of Redfield theory. In particular, the asymmetric character of the spectra is attributable to Redfield-type cross-correlations. We envisage that the current MD-based, experimentally validated approach should lead to a more definitive, accurate picture of SDSL ESR experiments.

H to build the propagators   It is assumed that the step 1 ps

 
is sufficiently small so that the Hamiltonian ( ) t H does not change appreciably during this time interval. Therefore, Eq. (S1) can be viewed as a legitimate representation of the evolution operator (note that it differs slightly from the conventional definition of the evolution operator, but we find the form in Eq. (S1) somewhat more convenient). With this definition, the propagation of the spin density matrix ( ) t σ over the time interval 1 ps   is given by: Next, we calculate the "packet" propagators: The time interval T has been set to 50 ps and 0, 1, 2, ... m  . Hence, we prepare the set of pack U matrices, where the first one describes the evolution of the system from 0 to 50 ps, the second -from 50 S3 to 100 ps, the third -from 100 to 150 ps, etc. The choice of T  50 ps is motivated by the Nyquist theorem for sampling of the ESR free induction decay.
As a next step, we use pack U matrices to describe the evolution of the system over extended time intervals: where 5 ns   , 1, 2, 3, ... k  and 0, 1, 2, ... l  . Thus the first series of matrices ( 1 k  ) describes the evolution of the system from 0 to 0.05 ns, from 5 to 5.05 ns, from 10 to 10.05 ns, etc. The second series of matrices ( 2 k  ) describes the evolution from 0 to 0.1 ns, from 5 to 5.1 ns, from 10 to 10.1 ns, etc. One such series of propagators is schematically illustrated in Fig. S1 using red arrows. Figure S1. A series of propagators ext U (schematically indicated by red arrows) with a fixed value of k.
For the sake of illustration, this graph represents a short 50-ns MD trajectory. We comment on the meaning of 5 ns   below.
Finally, we average all propagators ext U with a given value of k : where L is the length of the MD trajectory (assumed commensurate with 5 ns   ) and the function   ceil x rounds the argument x upward to the nearest integer. The expressions Eqs. (S5, S6) hold for kT   (illustrated in Fig. S1) as well as kT   ; they correspond to a variant of a sliding time-window scheme.
Using the average propagators   kT Γ we calculate the time-evolution of the spin-density matrix: S4 where x S refers to the x-component of the electron spin operator. This result is used as a basis to calculate the spectrum   s  according to Eqs. (4,5).
At this point it is appropriate to comment on the choice of  . This parameter is essentially the stride used to generate ext U matrices. In principle, choosing smaller  leads to better statistical properties of the result, but increases the computational time. For the system at hand, we satisfied ourselves that 5 ns   is the optimal setting. Specifically, we re-calculated all spectra using 0.5 ns   (results not shown) and found them to be identical to those obtained with 5 ns   , whereas the time required to compute a single spectrum increased sevenfold, from 5.5 hrs to 40 hrs. We note that the proper choice of  maybe be dependent on the details of the system and the length of the trajectory. The setting of this parameter needs to be tested to ensure good convergence properties of the calculations.
One additional possibility exists to improve statistical properties of the simulated spectra   s  . While each of our MD trajectories is sufficiently long, the simulated nitroxide group does not sample different spatial orientations in a perfectly isotropic manner (this is just a consequence of finite simulation length).
To improve on this aspect, for each trajectory we have generated a series of replica trajectories that differ from each other by a constant rotation. Specifically, multiple copies of the original trajectory have been generated such that the initial orientations of the NO bond cover 225 nodes uniformly distributed on a unit sphere. 4 Each of the copies is used to independently compute   s  and the results are subsequently averaged. We have found that the grid containing 225 nodes is adequate for the purpose of these analyses -use of finer grids does not change the shape of the spectrum. In principle, it is possible to enhance orientational sampling by introducing an additional dimension to the grid, corresponding to rotation about the NO bond. In practice, however, this proved to be unnecessary.
We have also undertaken additional series of simulations to probe the effect of small variation in the A G, which were reported for proxyl spin label in ortho-terphenyl (believed to be a good model of protein interior). 5 In this case, the simulated spectra were clearly off the mark. Therefore, we stayed with the initial choice of PAS g and PAS A , which showed the best results in our calculations. Nevertheless, one should bear in mind that small S5 environment-dependent variations in PAS g and PAS A are in principle possible. This may be relevant for future efforts to achieve quantitative accuracy in interpreting ESR spectra of spin-labeled proteins.
Finally, we have also explored the effect from two additional interactions: quadrupolar interaction for 14 N nuclear spin 1 I  and the Zeeman interaction for the same nuclear spin. The corresponding terms have been added to the time-dependent Hamiltonian   t H , Eq. (1). The parameters of the quadrupolar tensor were taken from the paper by Savitsky et al. 5 Using this extended Hamiltonian, we repeated the calculations of ESR spectra for all spin-labeled GB1 variants investigated in this work. It was found that including these two interactions has no appreciable effect on the appearance of the simulated spectra. This is expectable since both of them are small compared to the hyperfine coupling, which dominates spin dynamics of 14 N and, more broadly, spin dynamics of the coupled two-spin system.

Determination of tumbling time
The following procedure has been used to extract rot  from the MD trajectory. First, we parameterized the rotation of GB1 molecule from frame i to frame 1 i  along the entire trajectory. To this end, we superimposed the scaffold of GB1 from frame i onto the scaffold of GB1 from frame 1 i  (specifically, we superimposed C α atoms belonging to the secondary structure, where secondary structure was defined based on crystallographic coordinates of the protein). This operation can be represented as a combination of translation of the center of mass (of no interest for us) and rotation. In this manner we determined the matrices , 1 i i Ω , describing the step-wise reorientational motion of GB1. As a next step, we considered the set of 225 vectors of unit length 0 v providing optimal sampling of a unit sphere. 4 For each of these vectors we apply consecutive rotations , 1 i i Ω and thus construct vector trajectory   In turn, this vector trajectory is used to construct the temporal correlation function: good, indicating that this procedure is well suited to obtain an accurate measure of isotropic reorientational diffusion in a protein with moderate anisotropy, such as GB1.
As described in the main text, the extracted value of rot  is in good agreement with the experimental results, as well as theoretical predictions. It is important to realize, however, that this agreement involves a certain element of error compensation. On one hand, it is known that the standard TIP3P model employed in our simulation underestimates shear viscosity of water. 6 On the other hand, the commonly used NPT ensemble with collision frequency 1 = 2 ps   effectively raises solvent viscosity. 7 The fortuitous compensation of these two factors leads to rot  value in line with expectations.
Note that, in principle, we do not need to rely on error cancelation to obtain correct rot  . Using one of more advanced water models in conjunction with NVE ensemble offers a good practical solution to this problem. 8 Furthermore, good results can be obtained using NPT ensemble with the thermostat developed by Bussi and co-workers. 9-10 Importantly, the length of MD simulations accessible through GPU computing greatly exceeds the typical time scale of protein tumbling: in our case, each trajectory is nearly 4 orders of magnitude longer than rot  . All of this taken together makes it possible to model protein tumbling with a quantitative accuracy.

Adjustment of tumbling time
The goal here is to build a pseudo-trajectory, which would differ from the original trajectory in only one characteristic -namely, the rate of the overall protein tumbling. For this purpose, we have developed the algorithm that is described below. In what follows, protein coordinates from the i-th frame in the original MD trajectory are denoted i v and the corresponding coordinates in the pseudo-trajectory are denoted i V . The transformation that superimposes i v onto j V via C α atoms in the secondary-structure regions is We are mainly interested in the rotational portion of this transformation, which is To construct the pseudo-trajectory, we assume that 0 0  V v and formulate the recursive algorithm to 2. We apply this transformation to 1 i v :  has to pay special attention to the orientational sampling scheme, see SI section 1. In particular, highquality sampling is necessary for the mutants with restricted R1 dynamics, such as F30R1 (see Fig. S12).

Redfield-theory calculations
Redfield formalism that we have used in this paper to calculate ESR spectra of spin-labeled GB1 is fundamentally the standard Redfield formalism. However, there are certain aspects of these calculations that deserve a special comment. A brief summary of our computational procedure is presented below.
We treat here the system that is comprised of two spins: the electron spin S=1/2 and 14 N nuclear spin I=1.
We begin by dividing the Hamiltonian, Eq.
Next, we diagonalize this matrix, thus arriving at the eigenoperators i V and the corresponding eigenvalues i  (transition frequencies):    Figure S3. ESR spectra of seven spin-labeled mutants of GB1. The experimental and simulated spectra are plotted with broad black lines and thin red lines, respectively. The apparent difference in noise level, cf. panels (h) and (i), is a plotting artefact. Specifically, the data are scaled such that the maximum amplitude of the calculated signal is set to 1.0 (providing good view of all spectra). In the case of broad low-intensity spectra, such as F30R1, this representation emphasizes the presence of noise. The superscript # is used to indicate duplicate MD trajectories of Y3R1 and K10R1. The overall agreement between the calculations and experiment is good across the board. However, in the case of Y3R1 and F30R1, the simulation may not be fully representative of the actual state of the sample (cf. sharp spectral features marked in the plot by green daggers; see discussion in the main text). In the case of K28R1, the observed deviations between the experiment and the simulations may stem from the multicomponent character of the experimental spectrum. Figure S4. Superposition of 1 H, 15 N HSQC maps from 15 N-enriched samples of GB1 F30C (blue) and F30R1 reduced by overnight incubation with 10 mM ascorbic acid (red). Contour levels are chosen such as to allow direct comparison between the two spectra. The spectrum of F30R1 re-acquired after 20 days was unchanged compared to the one shown in this plot. Figure S5. Histogram of 15 N linewidths in the HSQC spectral maps from GB1 F30C (blue) and F30R1 reduced by application of ascorbate (red). The data were obtained by reprocessing the data in Fig. S4 (without the window function in the 15 N dimension) and fitting the resulting peak shapes using the routine nlinLS from the package NMRPipe. 14 The comparison includes both backbone and side-chain resonances, but is limited to well-resolved peaks with substantial intensity.      lack of convergence. This is particularly relevant for 3  (green curve) since 3  jumps occur relatively infrequently during the course of the simulation, see text.
S19 Figure S11. The simulated ESR spectra from nine MD trajectories of spin-labeled variants of GB1: (red curves) standard calculations identical to those shown in Fig. S3; (green curves) the trajectories have been pre-processed to increase the rate of the protein overall tumbling, 2.0   ; (blue curves) the trajectories have been pre-processed to lower the rate of the protein overall tumbling, 0.5

 
. The details of the MD processing scheme are described in the main text and SI section 2.2. In this graph, each spectrum is normalized according to the height of its central line. Figure S12. The simulated ESR spectra from nine MD trajectories of spin-labeled variants of GB1: (red curves) standard calculations identical to those shown in Fig. S3; (blue curves) the trajectories have been pre-processed to quench the overall tumbling,    . For the mutants with restricted R1 dynamics, such as F30R1, the    spectrum approaches the limiting case of a static powder spectrum. In this situation, it is important to employ a high-quality sampling scheme encompassing the three orientational degrees of freedom of the spin-labeled protein. The scheme used in our work is relatively sparse (see SI section 1 for discussion) and hence gives rise to appreciable computational artefacts, such as seen in panel (h). Further consideration of    spectra, which are relevant for large proteins, microcrystalline proteins, precipitated proteins, membrane proteins in micelles and lipid bilayers, etc., is deferred to future work. Unlike in Figs. S3 and S11, the individual spectra in this graph are not normalized (and hence can be compared to each other in terms of signal intensities). Figure S13. ESR spectra of seven spin-labeled mutants of GB1 calculated via MD-based direct propagation scheme (red lines, same as in Fig. S3) or, alternatively, Redfield formalism (green lines). The small-to-moderate deviations between the two methods can be attributed to the following factors. First, for samples such Y3R1 and F30R1, the Redfield theory is pushed to the limits of its applicability (see the main text for discussion). Second, some of the trajectories feature rare transitions between distinctive conformational states (e.g. K10R1 discussed in the main text). This means that MD statistics is essentially inadequate. The two methods, the direct propagation scheme and Redfield scheme, have different properties with respect to statistical sampling of the MD data. As a consequence, they respond differently to the insufficient statistics. This is likely a contributing factor to the discrepancies between the two spectra in panel (d) as well as panel (e). Figure S14. ESR spectra of free 15N Δ MTSSL , N8R1 15N and T44R1 15N calculated by means of the full-fledged Redfield-theory treatment (orange lines, same as in Fig. 4d-f), as well as the altered version of the same treatment, where the contributions to the Redfield matrix due to cross-correlations between the anisotropy of g-tensor and the hyperfine interaction have been set to zero (blue lines).