Ultrafast coherent motion and helix rearrangement of homodimeric hemoglobin visualized with femtosecond X-ray solution scattering

Ultrafast motion of molecules, particularly the coherent motion, has been intensively investigated as a key factor guiding the reaction pathways. Recently, X-ray free-electron lasers (XFELs) have been utilized to elucidate the ultrafast motion of molecules. However, the studies on proteins using XFELs have been typically limited to the crystalline phase, and proteins in solution have rarely been investigated. Here we applied femtosecond time-resolved X-ray solution scattering (fs-TRXSS) and a structure refinement method to visualize the ultrafast motion of a protein. We succeeded in revealing detailed ultrafast structural changes of homodimeric hemoglobin involving the coherent motion. In addition to the motion of the protein itself, the time-dependent change of electron density of the hydration shell was tracked. Besides, the analysis on the fs-TRXSS data of myoglobin allows for observing the effect of the oligomeric state on the ultrafast coherent motion.

~20 min, and HbI was reduced with 60 μL of 1 M sodium dithionite solution under the nitrogen atmosphere. Then, the reduced sample was exposed to CO gas for ~20 minutes to generate HbI(CO)2. The preparation of the sample solutions was performed just before the x-ray solution scattering measurement.

The effect of the hydration shell changes to the TRXSS signal
In the previous theoretical study on myoglobin (Mb), it was argued that the electron density of the hydration shell around the protein changes during the ultrafast time-domain after reaction initiation triggered by light. 2 We conducted simulations using CRYSOL to verify the effect of electron density change of the hydration shell during the photoreaction on difference scattering curves. 3 In the simulations, three different electron density changes of the hydration shell after photolysis were assumed. For the simulation, a candidate structure of I1 reported in the previous TRXSS study was selected to generate the difference curves. 1 The difference curves were calculated assuming the transition from the carboxy structure (PDB entry: 3sdh) to the selected candidate structure under three different electron density changes of the hydration shell. The change of the hydration shell density was described by changing the parameter of the contrast of the hydration shell for CRYSOL which uses the implicit solvent model to describe the hydration shell with the thickness of 3 Å. The parameters such as the electron density of the bulk solvent were set as the default value for CRYSOL during the simulations.
According to our simulation results, the time-resolved SAXS signal (especially where q < 0.1 Å -1 ) is affected significantly by the change of electron density of the hydration shell ( Supplementary Fig. 1a) regardless of the degree of the electron density changes. Such an effect is absent in the signal of a wider angle region ( Supplementary Fig. 1b) in all three electron density changes of the hydration shell. In Supplementary Fig. 1c, the double-difference of theoretical difference curves assuming the different changes of the electron density of the hydration shell (black), and double-differences of the experimental difference curves of the time delays where the oscillatory feature was observed (red and blue) are plotted. The doubledifference due to the change of the hydration shell shows different features compared to the double-differences of the experimental data in terms of the shape and amplitude (particularly in the q range from 0.13 Å -1 to 0.14 Å -1 ). In addition, as shown in Supplementary Fig. 9, the changes of Rg and volume during the coherent motion are correlated to the signal amplitude at q = 0.13 Å -1 where the effect of the hydration shell is relatively small. Considering this result, the time-resolved difference scattering signal in the SAXS region with q < 0.13 Å -1 was excluded for the kinetic analysis and structure refinement on the fs-WAXS data to exclude the effect due to the hydration shell and elucidate the structural change of the bare protein. To check any dependency on protein structure, the same simulation was performed on other candidate structures of I1 reported in the previous TRXSS study. The effects of the electron density changes of the hydration shell on the SAXS and WAXS regions were the same regardless of the structures.
In addition, a simulation which considers the change of the reduced form factor was performed to verify the effect of the change of the excluded volume to the difference scattering curves. In this simulation, 'effective atomic radius' which is an input variable for CRYSOL and related to the displaced volume per atomic group was changed to check the effect of the reduced atomic form factor to the difference curve. Default input values for CRYSOL were used for other parameters such as the electron density contrast of the hydration shell. Our structure refinement results indicate that the volume expands less than 1%, which indicates that the average atomic radius would increase less than 0.3%. A candidate structure obtained from the structure refinement was selected and used to calculate the difference scattering curve of the transition from the carboxy structure to the candidate structure, using two different effective atomic radii. The difference scattering curves calculated assuming the original effective atomic radius (1.622) and the effective atomic radius increased by ~0.3% (1.627) show similar features, which indicates that the change of the reduced form factor due to the change of the excluded volume does not change the main interpretation ( Supplementary Fig. 1d)

Estimation of instrument response function (IRF) and kinetic analysis
The thickness of the capillary used for the experiment was ~500 μm. The temporal duration of the x-ray pulse was ~50 fs. Other parameters such as the temporal duration of the laser pulse (~250 fs) were obtained from the previous fs-TRXSS study performed at the XPP beamline of LCLS and used for IRF estimation of this study. 4 Considering such parameters and the time jitter between the laser and x-ray pulse, the full-width half maximum (FWHM) of IRF was estimated to be ~800 fs.

Carlo (MC) simulations on fs-WAXS data
Supplementary Fig. 6a shows the change of root-mean-squared-deviations (RMSDs) and χ 2 values after performing the ERRB structure refinement of I0 as a representative case. The RMSDs and χ 2 values of the initial structures used as starting structures for the structure refinement and the refined structures obtained by the structure refinement are indicated by black and blue dots, respectively. We note that the RMSDs of the initial structures with reference to the carboxy state extend up to ~1.2 Å, which is larger than the RMSD of the crystal structure at 5 ns (~0.16 Å), indicating that the initial structures cover a wider conformational space than the crystal structure at 5 ns can reach. Clustering was performed on the refined structures with χ 2 values less than 250 using the gromos method implemented in the GROMACS package, and the cutoff value for the clustering was 0.045. 5 The structures in the first cluster were selected as the candidate structures indicated by red dots. The mother structures yielding the candidate structures are indicated by magenta dots. The RMSDs between mother structures and candidate structures are about 0.3 Å on average.
Since the number of the fitting parameters is large, the structure refinement using simple MC simulation can induce overfitting. However, ERRB structure refinement aided by MC simulations utilizes a target function in addition to MC simulations to guide the movement of the rigid bodies. Since the target function is composed of various terms such as the χ 2 term and the collision-avoiding term which substantially reduce the effective degrees of freedom and the conformational space, guidance using the target function can decrease the possibility of the overfitting. Indeed, comparison of standard deviations of the RMSDs against the carboxy structure shows that the standard deviation decreases from 0.079 Å for the mother structures of the candidate structures to 0.052 Å for the candidate structures via the structure refinement. This result indicates that the candidate structures from the different mother structures have converged well into a target structure through the guidance of the target function used for the structure refinement.

The calculation of a theoretical difference scattering curves of the reaction intermediates
The theoretical difference scattering curves of the reaction intermediates were calculated by averaging the theoretical difference scattering curves of the candidate structures of the reaction intermediates. The theoretical difference scattering curves of the reaction intermediates were used to generate the calculated curves shown in Fig. 1b.

The calculation of a difference distance map (DDM)
DDMs were calculated by subtracting the distance map of reference structures from the distance map of target structures. Since the structural changes of two subunits are similar with the RMSD of ~0.3 Å ( Supplementary Fig. 8a), the distance of the same residue pairs in each subunit of HbI were averaged to calculate intra-subunit distance maps. For instance, the distance between residue i and residue j in subunit A was averaged with the distance between residue i and residue j in subunit B. For inter-subunit distance map calculation, a distance between a residue pair in subunit A and subunit B (for example, residue i in subunit A and residue j in subunit B) was averaged with the distance between the same residue pair but in subunit B and subunit A (residue j in subunit A and residue i in subunit B). For calculating DDMs using I0 as a reference structure, we used the averaged distance maps of candidate structures of I0.

Similarity between difference distance maps (DDMs)
Similarity between two difference distance maps was calculated using Pearson correlation coefficient. Pearson correlation coefficient between DDMs was calculated using the components in the DDMs as follows.
,, , , where xi,j and yi,j are the (i, j)th components of the region of interest of two DDMs, and x and y are the means of xi,j and yi,j, respectively.

The calculation of a displacement plot
The displacement was calculated as the changes in the distance of the Cα atom of each residue from the iron atom with respect to the reference structure. In the case of HbI, which has a homodimeric structure, the displacements of the same residue in each subunit were averaged to generate the displacement plot of a candidate structure. Finally, the displacement plots of the candidate structures were averaged to yield the averaged displacement plot of the reaction intermediate.

The calculation of the radius of gyration (Rg) and volume for fs-WAXS data
Rg was calculated from the candidate structures of the structure refinement results using the following equation, , where mi and ri are the mass and the position of the atoms or atomic groups in the protein, respectively, and rCOM is the position of the center of the mass of the protein. The volumes of the protein structures were calculated using the volume calculation program package developed and distributed by Voss et al. 6 The errors of Rg and volume were calculated as the standard error of the mean (SEM) among the candidate structures.

Time profile calculation of the radius of gyration (Rg) and volume for fs-WAXS data
For the WAXS data, the changes of the radius of gyration (ΔRg) and volume (ΔV) with respect to the carboxy structure (PDB entry: 3sdh) at a time delay were calculated by adding ΔRg and ΔV of two reaction intermediates (I0 and I1) multiplied by their concentrations using the following equations.
where ΔRg X and ΔV X are ΔRg and ΔV of the reaction intermediate X, respectively, and C X (t)

The Guinier analysis of fs-SAXS data
The Guinier analysis of the fs-SAXS data was performed following a previously reported method. 4 The difference scattering curve at each time delay was fitted using the following equation.
, where I(q,V,Rg) is the intensity calculated using the following Guinier approximation and C(t) is the population of the photolyzed HbI.
In Supplementary Equation (6), A is a scaling factor between the observed and theoretical intensities, V is the volume of the protein, and ⍴p and ⍴b are the electron densities of the protein and the solvent, respectively. For consistency, we used the same scaling factor (A) used in the ERRB structure refinement aided by MC simulations for the Guinier analysis.
Other parameters used for the fitting were ⍴pV = 17740 (electron), ⍴b = 0.34 (electron/Å 3 ). The Rg and volume of the carboxy structure were calculated following the aforementioned methods used for the analysis of fs-WAXS data. The fitting and error analysis were conducted using the MINUIT package written at CERN. 7 The changes of Rg and volume of the system were calculated by C(t)Rg(t) and C(t)V(t). The changes of Rg and volume at time delays earlier than -0.3 ps were assumed to be zero, since the changes of Rg and volume were inaccurate at these time delays due to the negligible amount of the photolyzed HbI.

The fitting of the SAXS region using the change of electron density of the hydration shell
The fitting was performed in a similar way to that described in Fig. 1b

Single-layer model vs. multiple-layer model to describe the density change in the hydration shell
In a previous theoretical study on Mb, it was suggested that the solvent molecules located farther than 3 Å from the protein can be perturbed due to the pressure generated by the protein quake. 2 Since CRYSOL, the program used to calculate the scattering curve, describes the hydration shell as a layer with a thickness of 3 Å , the density change dependent on the distance from the protein surface is approximated as a constant density of a single hydration shell instead of the density changes of multiple shell layers. To check the effect of this approximation, we performed a series of simulations considering the effect of multiple shell layers and their density changes.
In the simulations, the protein body and solvent layers were assumed a sphere with a radius of 21 Å and spherical shells with a thickness of 3 Å , respectively. The form factor for the protein body was calculated as follows.
, where Δeprotein is the difference between the number of electrons in the protein and that in the bulk solvent and R is the radius of the protein body. Δeprotein was determined using the volume of the protein and the electron densities of the protein (0.430 e -/Å 3 ) and the bulk solvent (0.334 e -/Å 3 ). The form factor for the solvent layers were calculated as follows.   was changed as depicted in Supplementary Fig. 12c. Then, the difference curve was calculated by subtracting the x-ray scattering curves before and after the density change. The magnitude of the difference curve due to the electron density change in the i-th water layer was quantified by taking the squared sum of the difference scattering intensities within the q range from 0.05 Å -1 to 0.1 Å -1 and normalized by that of the 1st solvent layer (that is, the relative contribution to the signal). The simulation result shown in Supplementary Figure 12d indicates that the electron density changes in the solvent layers closer to the protein body have higher contributions to the difference curve.

Kinetic and structural analysis of myoglobin (Mb)
The collection of TRXSS data of Mb was described elsewhere. 4 The TRXSS data of Mb was processed and analyzed in a similar way to that used for the case of HbI. The data with 0.17 Å -1 ≤ q ≤ 1.0 Å -1 was used for the WAXS analysis. The kinetic analysis was performed via SVD analysis and kinetic modeling. The SVD analysis was performed in a similar way to that used for HbI. From the SVD results, two SVD components were classified as the major components.
The rSVs were reconstructed with two exponential decay functions convoluted with an IRF (500 fs). One of the time constants used in the rSV reconstruction was fixed to 70 ps during the analysis, which is the time constant observed in the previous TRXSS study on horse heart Mb. 8 As a result of the SVD analysis, the rSVs were reconstructed with two exponential decay The helix label follows the helix identification in the carboxy structure of HbI (PDB entry: 3sdh).