Cross-validation of distance measurements in proteins by PELDOR/DEER and single-molecule FRET

Pulsed electron-electron double resonance spectroscopy (PELDOR or DEER) and single-molecule Förster resonance energy transfer spectroscopy (smFRET) are recent additions to the toolbox of integrative structural biology. Both methods are frequently used to visualize conformational changes and to determine nanometer-scale distances in biomacromolecules including proteins and nucleic acids. A prerequisite for the application of PELDOR/DEER and smFRET is the presence of suitable spin centers or fluorophores in the target molecule, which are usually introduced via chemical biology methods. The application portfolios of the two methods overlap: each allows distances to be determined, distance changes to be monitored, and conformational heterogeneity and dynamics to be visualized. Both methods can provide qualitative information that facilitates mechanistic understanding, for instance on conformational changes, as well as quantitative data for structural modelling. Despite their broad application, a comprehensive comparison of the accuracy of PELDOR/DEER and smFRET is still missing, and we set out here to fill this gap. For this purpose, we prepared a library of double cysteine mutants of three well-studied substrate binding proteins that undergo large-scale conformational changes upon ligand binding. The distances between the introduced spin or fluorescence labels were determined via PELDOR/DEER and smFRET, using established standard experimental protocols and data analysis routines. The experiments were conducted in the presence and absence of the natural ligands to investigate how well the ligand-induced conformational changes could be detected by the two methods. Overall, we found good agreement for the determined distances, yet some surprising inconsistencies occurred. In our set of experiments, we identified the sources of the discrepancies as the use of cryoprotectants for PELDOR/DEER and label-protein interactions for smFRET. Our study highlights the strengths and weaknesses of both methods and paves the way for higher confidence in quantitative comparisons of PELDOR/DEER and smFRET results in the future.

for an overview). For a full description of each method's theoretical background, the reader is referred to the many detailed theoretical reviews and textbooks (e.g. 46-49).

Because most proteins are diamagnetic and devoid of any suitable fluorophores, PELDOR/DEER and FRET experiments usually require the attachment of spin or fluorescence labels (Figure 1). This is often accomplished by the site-specific introduction of cysteines. The thiol groups of cysteines can be reacted with linker-functionalized dyes or labels, for example via maleimides or thiosulfate esters (see Figure 2 for some typical examples). If the introduction of cysteines is not an option, it is possible to use alternative labelling approaches such as labelled nanobodies 50 or unnatural amino acids. The latter can either be fluorescent or paramagnetic themselves or bear functional groups that can be chemoselectively labelled, for instance by click chemistry 51-54. Although the types of labels used for PELDOR/DEER and FRET are quite different, the requirements for suitable labelling positions in proteins are essentially the same: the residue should be solvent-accessible and of no functional importance.

For PELDOR/DEER spectroscopy, the distance between the labels ought to be in the range of 1.5 to 8.0 nm (longer distances of up to 16 nm are accessible with fully deuterated samples) 46,55,56. The ideal distance for FRET experiments is around the Förster radius of the particular FRET pair with a typical dynamic range of 3-8 nm (Figure 1F), but in principle, distances up to 10-15 nm can be measured 57. Typically, labelling positions are chosen such that the distance change between conformations is as large as possible. In practice, the pool of suitable sites is often surprisingly small. Fortunately, software programs exist to assist in the identification of optimal labelling positions in the case of an available structure or model of the

We started with the sialic acid TRAP transporter SBP from Haemophilus influenzae. Figure 3A shows a difference distance map of HiSiaP based on the open and closed crystal structures. The map represents all distance changes between the Cβ atoms of the two states.
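A difference distance map of the kind described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: it takes two arrays of Cβ coordinates (in Å) that list the same residues in the same order for both states, and the function names and toy coordinates are hypothetical, not taken from the authors' workflow.

```python
import numpy as np

def pairwise_distances(coords):
    """Return the N x N matrix of distances between all rows of an N x 3 array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def difference_distance_map(coords_open, coords_closed):
    """Distance matrix of the open state minus that of the closed state.

    Both arrays must list the same residues in the same order (assumption).
    """
    if coords_open.shape != coords_closed.shape:
        raise ValueError("both states need the same number of residues")
    return pairwise_distances(coords_open) - pairwise_distances(coords_closed)

def largest_changes(ddm, top=3):
    """Return (i, j, change) for the residue pairs with the largest absolute change."""
    i, j = np.triu_indices_from(ddm, k=1)  # each pair once, no diagonal
    order = np.argsort(-np.abs(ddm[i, j]))[:top]
    return [(int(i[k]), int(j[k]), float(ddm[i[k], j[k]])) for k in order]

# Toy coordinates (hypothetical): the third residue moves on "closure".
toy_open = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [30.0, 0.0, 0.0]])
toy_closed = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [10.0, 10.0, 0.0]])
toy_map = difference_distance_map(toy_open, toy_closed)
print(largest_changes(toy_map, top=2))
```

Because only internal distances enter the map, the two structures do not need to be superimposed first. Pairs with large absolute entries are candidates for labelling; solvent accessibility and functional importance still have to be checked separately.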

The PELDOR/DEER results are shown above (grey curves for simulation, black curves for experiment) and the FRET distances below the x-axis (grey bars for simulation, black bars for experiment). Raw data for all experiments and confidence intervals of the PELDOR/DEER distributions are provided in the Supplementary Information. The red shade around the PELDOR/DEER data is the error margin calculated using the validation tool of DeerAnalysis 40.

We picked pairs of sites with pronounced distance changes (dark areas) of up to 1.8 nm for labelling. Figure 3B shows the open (PDB-ID: 2CEY) and closed structures (PDB-ID: 3B50) of the protein with the predicted accessible volumes of the spin (magenta) and FRET (blue) labels at the selected labelling sites (residues 55, 58, 134, 175, 228). For PELDOR/DEER, all double mutants (58/134, 55/175, 175/228 and 112/175) were labelled with MTSSL, which is by far the most commonly used spin label for proteins (SI Figure 1). In each case, two PELDOR/DEER measurements were performed, one in the presence (1 mM) and one in the absence of sialic acid (Neu5Ac). Similar to the previously published PELDOR/DEER data for the Vibrio cholerae homolog VcSiaP, which shares 49 % amino acid sequence identity and 69 % sequence similarity with HiSiaP 24, the EPR time traces obtained for HiSiaP were of excellent quality with clearly visible oscillations and high signal-to-noise ratios (SI Figure 2). The

respectively (Figure 2). This FRET pair was chosen for its high photostability, signal intensity and proven compatibility with various protein samples 17,28,31. Labelling quality and sample purity were assessed by size exclusion chromatography, and all samples showed high labelling efficiencies (> 90%) and good donor-to-acceptor labelling ratios (up to ~ 50:50) (SI Figure 3).

The experiments were conducted with freely diffusing molecules at ~50 pM concentration to derive mean FRET-efficiency values for the apo and holo states of HiSiaP. All FRET measurements gave high-quality ES-histograms with clearly defined FRET populations (SI Figure 4). However, some of the populations appeared to be broader than expected, indicating that they were either composed of molecules with additional conformational flexibility, or there

ligand-free conditions. All other variants (55/175, 175/228, 112/175) failed to reproduce the trends from EPR and structural predictions (grey bars). The smFRET data instead suggested that the apo protein adopted a conformation that was more closed than the substrate-bound conformation. Since variant 58/134 agreed with both the PELDOR/DEER results and the models based on X-ray structures, it appeared unlikely that a completely unexpected structural feature of the protein was responsible for the observed discrepancies. Considering the known ± 5 Å experimental accuracy of FRET 71, one might argue that the two states were simply not discernible for the "offending" double mutants (contradicting the simulation results in Figure 3C). To examine whether the apo protein was in fact capable of adopting such an "apo-closed" conformation under the conditions in which the FRET experiments were performed, we carried out a burst-variance analysis of three HiSiaP mutants (SI Figure 5). The results showed that the protein exists in a single conformation and does not switch rapidly between distinct states (on the millisecond timescale). These results, however, do not rule out transitions on a timescale below ~500 µs. The detection of such dynamics would require PIE or MFD analysis of smFRET assays 72. Also, the smFRET experiment gives no information about the possibility that the protein was trapped in a closed state, but data below suggest otherwise. A possible explanation for the discrepancy between crystal structures and smFRET distances for the two label pairs is that the fluorescence labels were partly immobilized by an interaction with

Interestingly, we could observe such a broadened population 73 caused by these two labelling combinations very clearly for a distinct dye combination (Alexa Fluor 546/Star 635P) for mutant 112/175 (SI Figure 6). Notably, for all mutants with experimental apo distances that deviated from the simulations, the 1D-E-histograms showed broadening of the populations, indicating artefacts due to fluorophore interaction with the protein surface (SI Figure 4).

To explore the possibility of such unwanted dye-protein interactions, we investigated

The TMR/Cy5 label pair was not our first choice, because it is inferior to Alexa Fluor 555/647 in terms of signal intensity and photostability. Also, for many proteins, charged labels are known to be less prone to stick to the protein surface than hydrophobic labels. In anisotropy and lifetime measurements on the 175 variant in apo and holo state (Figure

Figure 5C) with the expectation that no substrate-induced distance change occurs (Figure 5C).

All variants showed very good agreement between experimental and simulated values in smFRET experiments. Ligand binding was confirmed using microscale thermophoresis as shown previously 63.

PELDOR/DEER distance measurements using MTSSL were performed with the same set of mutants (Figure 5). For three of the four MalE variants, this yielded good-quality time traces, while the quality for 87/127 was not as high, with a relatively low modulation depth and signal-to-noise ratio (SI Figure 9). For all mutants, except 29/352, the measured apo distances closely matched the predictions obtained from the crystal structure (Figure 5C). Notably, the latter mutant also had the worst match between the simulation and experiment for the smFRET experiments. The addition of 1 mM maltose (Kd for MalE is 1 µM) to the 134/186 mutant protein (our negative control) had, as expected, little effect on the position of the distance peak. Whereas mutant 87/127 seemed to completely close upon substrate addition, the mutant 36/352 revealed what appeared to be a mixture of the holo and apo states. A similar result was obtained for mutant 29/352 with the difference that the "apo distance" in the presence of maltose was ~5 Å longer, and its value was more similar to the simulated apo distance than the distance determined in the actual experiment without maltose. As mentioned above, the quality of the PELDOR/DEER time traces for the 87/127 mutant was not as high, and the distance change between open and closed conformation was relatively small. Therefore, we cannot exclude the possibility that this mutant also existed in an open-closed mixture after substrate addition (SI Figure 9).

represented by magenta meshes.

Because binding constants are temperature dependent, and the PELDOR/DEER samples were frozen before the measurement, we checked whether complete closure of MalE was achievable at a higher substrate concentration by repeating the measurements with 10 mM maltose. Within error, these experiments yielded the same mixtures of the holo and apo states as seen with 1 mM maltose (SI Figure 9).
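As a rough plausibility check on the saturation argument, the expected ligand-bound fraction can be estimated from a simple 1:1 binding isotherm. The sketch below is an illustration, not part of the original analysis: it assumes that maltose is in large excess over the protein and that the quoted Kd of 1 µM applies, which strictly holds only at the temperature at which it was measured. The function name is hypothetical.

```python
def fraction_bound(ligand_molar, kd_molar):
    """Bound fraction for simple 1:1 binding.

    Assumes the free ligand concentration equals the total ligand
    concentration, i.e. ligand in large excess over protein.
    """
    if ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return ligand_molar / (kd_molar + ligand_molar)

KD_MALTOSE = 1e-6  # 1 µM, the Kd for MalE quoted in the text

for conc in (1e-3, 1e-2):  # 1 mM and 10 mM maltose
    print(f"{conc * 1e3:g} mM maltose: {fraction_bound(conc, KD_MALTOSE):.4f} bound")
```

Under these assumptions, both concentrations correspond to an occupancy of about 99.9 % or more, so the tenfold increase in maltose would only matter if the effective Kd shifted by several orders of magnitude during freezing.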
Since the lack of complete closure did not result from a sub-saturating maltose concentration in the frozen samples, and it was not observed in the smFRET data, we reasoned that perhaps the cryoprotectant that was used for PELDOR/DEER experiments might be the culprit. Figure 5C shows the PELDOR/DEER results for mutants 36/352 and 29/352 in

cryoprotectant, the length of the PELDOR/DEER time trace had to be shortened to achieve a good signal-to-noise ratio (SI Figure 10), and in these cases, a single distance peak with reasonable correspondence to the holo state was observed for both mutants 29/352 and 36/352. Interestingly, the measured distances were shorter than those measured in the presence of cryoprotectant (Figure 5C, red traces).

In summary, for MalE, both methods were able to detect the substrate-induced closure of the

supernatant from purified mutants showed that they did not contain detectable glutamine traces (µM concentrations would be needed to explain our observations, SI Figure 14).
In summary, for the SBD2 protein, both methods were able to discern the open and closed states. However, the reason for the quantitative differences between the PELDOR/DEER and FRET experiments remained elusive.

Estimating the influence of linker length on the accuracy of predicted distance distributions

As mentioned above, the discrepancies in our comparisons can be caused by protein-label

were simulated by distributing 1000 dummy atoms (representing a spin center or fluorophore) in each of two spheres that were located 50 Å apart. The radius of the sphere represents the length of the linker that connects the fluorophore or spin center to the C-alpha atom of the labelled residue. Interactions with the protein surface (grey arcs) are indicated and lead to a clustering of labels at that position. Depending on the degree of interaction between protein and label, the accessible volume approach becomes less accurate. B) Histograms of 1000 experiments described in A) with a 10 Å linker (upper plot) and 20 Å linker (lower plot) and varying degrees of protein-label interaction. The percentage indicates what fraction of the 1000 dummy atoms is localized at the interaction site. As an example of a long (20 Å) and immobilized linker, the protein structure of MMP-12 (matrix metalloproteinase, PDB-ID 5L79 77) in conjugation with a Cy5.5 fluorophore (K241, colored spheres) was selected. The surface of the protein is shown in grey and the accessible volume of the fluorophore, calculated with FPS 70, is shown as a blue mesh. C) The difference between the experimental smFRET and PELDOR/DEER results (black) and the simulated smFRET and PELDOR/DEER results (white).
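The dummy-atom simulation described in the caption can be sketched as a small Monte Carlo experiment. This is a minimal sketch under assumptions that the text does not spell out: the interaction site is drawn uniformly within each sphere, the immobilized fraction of atoms is placed exactly at that site, both labels carry their own site, and the prediction error is taken as the centroid distance with interaction minus the centroid distance for freely distributed atoms. All function names are hypothetical.

```python
import numpy as np

def sample_in_sphere(rng, n, radius, center):
    """Draw n points uniformly from a solid sphere around center."""
    directions = rng.normal(size=(n, 3))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    radii = radius * rng.random(n) ** (1.0 / 3.0)  # uniform in volume
    return center + directions * radii[:, None]

def label_cloud(rng, n, radius, center, stuck_fraction):
    """Free label positions, with a fraction pinned to one random site."""
    cloud = sample_in_sphere(rng, n, radius, center)
    n_stuck = int(round(stuck_fraction * n))
    if n_stuck > 0:
        # Assumption: the interaction site lies anywhere in the sphere.
        site = sample_in_sphere(rng, 1, radius, center)[0]
        cloud[:n_stuck] = site
    return cloud

def prediction_errors(linker, stuck_fraction, separation=50.0,
                      n_atoms=1000, n_trials=1000, seed=0):
    """Error (in Å) of the free-label prediction for each random trial."""
    rng = np.random.default_rng(seed)
    centers = (np.zeros(3), np.array([separation, 0.0, 0.0]))
    errors = np.empty(n_trials)
    for t in range(n_trials):
        stuck = [label_cloud(rng, n_atoms, linker, c, stuck_fraction) for c in centers]
        free = [sample_in_sphere(rng, n_atoms, linker, c) for c in centers]
        d_experiment = np.linalg.norm(stuck[0].mean(axis=0) - stuck[1].mean(axis=0))
        d_accessible = np.linalg.norm(free[0].mean(axis=0) - free[1].mean(axis=0))
        errors[t] = d_experiment - d_accessible
    return errors

if __name__ == "__main__":
    for linker in (10.0, 20.0):
        for fraction in (0.1, 0.5, 0.9):
            err = prediction_errors(linker, fraction, n_trials=200)
            print(f"linker {linker:.0f} Å, {fraction:.0%} immobilized: "
                  f"mean |error| = {np.abs(err).mean():.1f} Å")
```

With these assumptions the mean absolute error grows with both the immobilized fraction and the linker length, which is the qualitative trend the histograms are meant to convey. The absolute numbers depend on the choices listed above and should not be read as a reproduction of the published figure.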
We then determined the distance between the geometric averages of the 1000 atoms (this corresponds to the experiment) and compared it to the distance of 1000 randomly distributed atoms in the same volume (this corresponds to the accessible volume approach) to calculate the prediction error. This procedure was repeated 1000 times to achieve a statistical distribution of the "interaction site" within the accessible volume. The results are summarized in Figure 7B: For a weakly immobilized label (10 %, magenta), the prediction error was low, even for linkers of 20 Å length (a scenario similar to the use of Alexa Fluor labels; Figure 2). This changed

Figure 7B shows the size of the accessible volume for the label (blue) for comparison. Note that this interaction is intramolecular and thus not a crystal contact. Similar observations have been made for the R1 nitroxide spin label 75.

We performed PELDOR/DEER and smFRET experiments on three substrate binding proteins (HiSiaP, MalE and SBD2) to conduct a comprehensive cross-validation of these two important integrative structural biology methods. For this purpose, we used the same labelling sites with both methods and measured the inter-label distances in the presence or absence of their substrates. Both methods showed good consistency with each other and with structural models. To obtain a more quantitative comparison, we took 24 of our measurements (where a single distance peak was observed) and calculated the difference between the average experimental distances. The PELDOR/DEER and smFRET measurements on the same double mutant differed by about 5 Å with an overall spread of ± 10 Å (Figure 7C). To an extent, this difference can surely be attributed to the different label structures and especially to the different linker lengths of the labels used (compare Figure 2). We hence used in silico labelling programs to estimate the extent of these differences 59,60.
Figure 7C shows that, for most of our measurements, the difference between the two methods is larger (and sometimes much larger) than can be explained by the different linker lengths when a freely rotating label is assumed (otherwise, the black and grey lines in Figure 7C should coincide). Our simulations above (Figure 7A

for distance simulations based on accessible volume and exclude orientational effects on FRET efficiency 70. This requirement would be hard to meet with very short and thus also rigid linkers. The viability of approach (i) depends on the type of macromolecule. Nucleic acids, for instance, have a highly negatively charged and predictable surface. So, negatively charged labels (such as many sulphonated dye molecules) will be repelled and diffuse almost freely in their accessible volume. For proteins, it is much more difficult to predict how the label will interact with the macromolecular surface. It is therefore more straightforward to change the type of label (e.g. charged vs hydrophobic) or to try different labelling positions to validate unexpected or inconsistent results. Indeed, we showed for HiSiaP that switching fluorophores to alter charge and linker length enabled us to correctly detect the expected conformational change.

is to measure as many distances as possible and then to investigate whether a particular observation is truly independent of the labelling site. In this regard, it should also be mentioned that the common practice of re-using the same labelling position for several distance measurements has obvious practical advantages, but renders it much more difficult to validate the different measurements within a single dataset.

therefore no simple task to study the temperature dependence of conformational transitions. However, newly developed labels, such as the trityl spin labels, might pave the way for routine room-temperature PELDOR experiments 67.

Our failure to find the reason for the discrepancies in the SBD2 example is a reminder of the high number of further parameters that can potentially influence the outcome of experiments. One prominent example that was not investigated here is the sample concentration. Whereas

Yes. Multiple distances can be measured in one experiment, using just one type of label and standard equipment.
Yes, but technically demanding.

Physical state of the sample?
Usually frozen solution (50 K). Measurements in aqueous solution are possible but require specific labels (e.g. trityl) and very large or immobilized macromolecules. Depending on the label, pulse sequences other than PELDOR/DEER might be required.
Liquid solution at room temperature and standard cell conditions (37°C)

In vivo measurements?
Not a standard experiment, especially not under physiological conditions. But, measurements in manually injected frog oocytes or using paramagnetic unnatural amino acids in E. coli have been done 52,92 .

Time resolution
Freeze-quenched samples can be measured. The time resolution depends on the freeze-quench equipment.

Time frame for measurements
Normally several hours per measurement for Q-band
Diffusing molecules: 30-60 minutes
Immobilized molecules: minutes to hours

With this map we identified regions with large conformational changes and selected amino acids inside these regions that are located on the surface of the protein to obtain good accessibility. For smFRET studies, optimized double cysteine mutants were created with an as yet unpublished software tool for optimal fluorophore labelling (Gebhardt & Cordes, unpublished), which will be described in a forthcoming publication. In short: residues were rated based on different parameters such as solvent exposure or conservation to obtain a labelling feasibility estimate. Residues with high ratings were paired to find good smFRET pairs with a large distance change between apo and holo (or no distance change as negative control).

Protein expression and purification

The TRAP SBPs HiSiaP and VcSiaP were expressed and purified as described before 24. To prevent co-purification of the substrate, the E. coli cells were cultured in M9 minimal medium.

For purification, the protein was loaded onto a benchtop Ni-affinity chromatography column, followed

and a total sample volume of 10 µL was transferred into a glass capillary, which was sealed with superglue.

For the room-temperature cw-EPR measurements, the magnetic field of the spectrometer was set to a center field of 3448 G and the microwave frequency to 9.631694 GHz. The microwave power was set to

The PELDOR/DEER experiments were performed on an ELEXSYS E580 pulsed spectrometer from Bruker in combination with an ER 5106QT-2 Q-band resonator. The temperature was set to 50 K with a continuous-flow helium cryostat (CF935, Oxford Instruments) and a temperature control system (ITC 502, Oxford Instruments). The PELDOR/DEER time traces were recorded with the pulse sequence π/2(νA)-τ1-π(νA)-(τ1+t)-π(νB)-(τ2-t)-π(νA)-τ2-echo. The frequency νA of the detection pulses was set 80 MHz lower than the frequency of the pump pulse νB, which was set to the resonator frequency and the maximum of the nitroxide spectrum.

Typically, the shot repetition time was 1000 µs and the lengths of τ1 and τ2 were 12 and 24 ns,

In silico distance simulations

the average distance and the distance distribution between two of these ensembles can be determined.

For smFRET we used the FRET-restrained positioning and screening method established by the Seidel lab 60. This method allows the determination of a FRET-efficiency-averaged model distance between the two dyes using the crystal structure information. For distance simulations we employed a simple dye model, in which three parameters were used to determine the accessible volume the dye can sample: (i) linker length (linker), (ii) linker width (W), and (iii) the fluorophore volume, which can be derived from an ellipsoid using R1, R2 and R3. With this information, the average distance between two of these spheres was calculated. The dye parameters for the different fluorophores are shown in Table 2. An average distance was

Polarization optics are mounted in homebuilt, 3D-printed rotation mounts and the APD is protected from scattered light with a 3D-printed shutter unit.

The excitation power was 10 µW and the concentration was fine-tuned to obtain a count rate of ~50 kHz under magic angle conditions. All anisotropy and lifetime measurements were recorded for

MTSSL was verified with room-temperature cw-EPR spectroscopy (X-band). The labelling efficiencies were determined with the spectrometer software and are given next to the spectra.