Resolving dynamics and function of transient states in single enzyme molecules

We use a hybrid fluorescence spectroscopic toolkit to monitor T4 Lysozyme (T4L) in action by unraveling the kinetic and dynamic interplay of the conformational states. In particular, by combining single-molecule and ensemble multiparameter fluorescence detection, EPR spectroscopy, mutagenesis, FRET-positioning and screening, and other biochemical and biophysical tools, we characterize three short-lived conformational states over the ns-ms timescale. The use of 33 FRET-derived distance sets to screen available T4L structures reveals that T4L in solution mainly adopts the known open and closed states in exchange at 4 µs. A newly found minor state, which is sampled at 230 µs and is not disclosed by any of the more than 500 crystal structures of T4L available at present, may be actively involved in the product release step in catalysis. The presented fluorescence spectroscopic toolkit will likely accelerate the development of dynamic structural biology by identifying transient conformational states that are highly abundant in biology and critical in enzymatic reactions.

Enzymes adopt distinct conformational states during catalysis 1,2 , where transiently populated ("excited") states are often of critical importance in the enzymatic cycle. These states are short-lived and therefore "hidden" to many experimental techniques. Classical structural biology methods often struggle to fully capture enzymes during catalytic action because the conformational rearrangements often span several decades in time (ns-ms) [3][4][5][6][7][8] . Hence, there is an urgent need to develop experimental and analysis methods to overcome this challenge. Recently, we demonstrated by simulated experiments that a new analysis toolkit ("FRET on rails") combined with molecular simulations can resolve short-lived conformational states of proteins 9 .
Here, we apply and extend the fluorescence analysis toolkit 9,10 , developed for dynamic structural biology, to interrogate the catalytic cycle of an enzyme 11 . In particular, the analysis (1) captures an excited, short-lived state and (2) identifies its potential relevance in the enzyme's catalytic cycle. The presented approach may serve as a blueprint for future enzymologic studies with the well-established single-molecule multiparameter fluorescence detection (MFD) experiments in that it enables detecting hidden states by the unique time-resolution (picoseconds) and sensitivity (single-molecule) of fluorescence.
We use lysozyme (T4L) of the bacteriophage T4 as a development platform and probe its conformational dynamics and structural features. Structurally, T4L 12 consists of two interrelated subdomains, the N-terminal subdomain (NTsD) and the C-terminal subdomain (CTsD), differing in their folding behavior and stability 13 . A long α-helix (helix c) links the two subdomains (Fig. 1a). To date, more than 500 structural models of T4L are available within the Protein Data Bank (PDB). In this ensemble, T4L adopts several opening angles corresponding to a classic hinge-bending motion of the NTsD with respect to the CTsD. The enzymatic function of T4L is to cleave the glycosidic bond between N-acetylmuramic acid and N-acetylglucosamine of the saccharides of the bacterial cell wall 14 .
T4L in solution is thought to adopt conformations that are open to various degrees, and a covalent adduct of the protein and its processed enzymatic product can crystallize in a closed conformation [14][15][16] . Therefore, T4L is thought to follow a classical Michaelis-Menten mechanism (MMm) characterized as a two-state system (Fig. 1b). Here, an open and a closed conformational state fulfil the distinct functions of substrate binding and substrate cleavage, respectively 17 . In a MMm, the product dissociates stochastically from the enzyme. For other enzymes, e.g., horseradish peroxidase 18 , an "active" product release state was identified. Recent experimental findings for T4L suggest the involvement of more than two states in catalysis 19,20 , where the turnover time was estimated to be between 10 and 50 ms [20][21][22][23] , while the conformational dynamics fell within the ns to sub-ms range 4,15,[20][21][22][23][24][25][26][27][28][29] . Such complex cases, with distinct interconverting conformational states, open additional reaction paths and yield disperse kinetics 30 .
For a full description of an enzymological cycle, the number of enzymatic states, their connectivity, the conformational structures of the states, and the states' chemical function have to be unraveled. Technically, we achieve these objectives by a hybrid approach combining classic biochemical methods (mutagenesis & HPLC), probe-based spectroscopy, and molecular simulations. Förster resonance energy transfer (FRET) and electron paramagnetic resonance (EPR) spectroscopy probe distances between bioorthogonally introduced probes through dipolar coupling. In FRET spectroscopy, the coupling is measured between a donor (D) and acceptor (A) fluorophore.
In confocal MFD single-molecule FRET (smFRET) experiments, freely diffusing molecules are repeatedly excited by a pulsed light source, and the emitted fluorescence photon is detected with picosecond time-resolution by time-correlated single photon counting (TCSPC) for several milliseconds per molecule (diffusion time, t diff ) 31 (Fig. 1c). smFRET experiments are ideal to study kinetics because no sophisticated strategies are necessary to synchronize molecules prior to the analysis. Consequently, it is possible to probe reliably protein kinetics over seven decades in time (sub ns-ms).
Distinct features of photon streams are highlighted by different representations (Fig. 1d). (1) A MFD-histogram is particularly valuable to reveal the number of states, identify dynamics, and inform on state connectivities. A MFD-histogram is generated by analyzing two complementary FRET-indicators, the average intensity-based FRET-efficiency, E, and the fluorescence-averaged donor lifetime in the presence of acceptor, 〈τ D(A) 〉 F , for individual single-molecule events [31][32][33] .
(2) Filtered fluorescence correlation spectroscopy (fFCS) quantifies exchange dynamics among the states by determining relaxation times 34,35 . (3) The analysis of fluorescence decays reveals populations of states and equilibrium distance distributions. (4) Finally, these experimental distances can be translated to structural models by molecular simulations 10,36,37 .
Following the concepts that were established for simulated data to resolve the structure and dynamics of proteins by integrative studies with FRET experiments 8,9 (Fig. 1e and more detailed in Supplementary Fig. 1), we start out with a systematic design of a FRET network for T4L to simultaneously monitor its dynamics and structural features. In step II, we use a combination of MFD to resolve the conformational states stable at the ns timescales. In step III, we quantify the conformational dynamics by employing fFCS and Monte Carlo simulations to resolve the connectivity of the conformational states. Following this, we perform a statistical analysis to further substantiate the existence of a hidden conformational state of T4L that was clearly identified above. In step IV, we identify structural models by using three distinct distance sets to screen an ensemble of structural models and to compare our identified states with models in the PDB. In the final step V, we derive an experimental energy landscape of T4L's enzymatic cleavage cycle, based on shifting equilibrium, by mutating key active site residues that mimic functional enzyme states at various steps during substrate hydrolysis. Overall, our results demonstrate the potential of fluorescence spectroscopy to go beyond traditional experimental methods for obtaining a dynamic structural picture of enzymes in action. The existence of the identified hidden/excited conformational state is also corroborated by other analytical tools such as chromatography and EPR spectroscopy.

Results
Detecting T4L's states by MFD. In our smFRET-experiments, we monitor the distance between a donor (D) and acceptor (A) attached to specific amino acids of a T4L variant (see Methods). We designed a network of 33 distinct T4L variants to probe hinge-bending motions of T4L from different spatial directions (Fig. 2a) that cover the whole protein.
In Fig. 2b, c, we present MFD-histograms with the two FRET-indicators for two exemplary variants of our FRET network. Three peaks are identified in each MFD-histogram. In both histograms, a major and a minor FRET peak are present. The third peak, located at a low FRET-efficiency E, corresponds to molecules that carry no acceptor fluorophore or an inactive one (DOnly).
These E values are shown as horizontal lines in the marginal distributions of Fig. 2b, c. A comparison with the major peak ( Fig. 2b: Fig. 2c: E > 0.7) demonstrates that they are similar but not identical to known structural models. Next, we will show how dynamic exchange explains the observed peaks.
In MFD-histograms, FRET-lines (Supplementary Methods) serve as a unique guide to visualize conformational dynamics by peak shifts and splitting, as in NMR relaxation dispersion measurements. A static FRET-line relates E and 〈τ D(A) 〉 F for molecules in the absence of dynamics (magenta line, Fig. 2b, c). States that exchange on a time scale much slower than the observation time (quasi-static case) are separated in a MFD-histogram and follow the static FRET-line. However, a shift of a peak to the right with respect to the static FRET-line is a model-free indication of sub-ms dynamics 31 , because the FRET-efficiencies in a MFD-histogram are averaged over the observation time of the molecules (~ms). Thus, very fast exchanging states result in a single average peak that is shifted to the right of the static FRET-line. These peaks can be described by dynamic FRET-lines, which connect the exchanging states. For a visual representation of the possible transitions, the dynamic FRET-lines of the identified exchanging states are displayed in the MFD-diagrams (dark green, cyan, and light green). A dynamic FRET-line connecting high FRET states with the DOnly population (gray) demonstrates the lack of significant photobleaching or blinking of acceptor dyes.
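The rightward shift described above can be illustrated numerically. The following is a minimal Python sketch under simplifying assumptions: it uses the idealized static FRET-line E = 1 - τ/τ0 without linker broadening, and a two-state system that exchanges much faster than the observation time. The donor lifetime of 4.0 ns and the lifetimes of the two limiting states are hypothetical illustration values, and the function names are not taken from the authors' software; the FRET-lines used in the paper are defined in the Supplementary Methods.

```python
# Idealized FRET-lines without linker broadening; all numbers are hypothetical.
TAU_D0 = 4.0  # ns, assumed donor lifetime in the absence of acceptor


def static_efficiency(tau_da, tau_d0=TAU_D0):
    """Static FRET-line: E of a single state with donor lifetime tau_da."""
    return 1.0 - tau_da / tau_d0


def dynamic_point(x1, tau1, tau2, tau_d0=TAU_D0):
    """Averaged (E, <tau>_F) for fast exchange between two limiting states.

    x1 is the fraction of time spent in state 1 (donor lifetime tau1).
    """
    x2 = 1.0 - x1
    tau_x = x1 * tau1 + x2 * tau2                   # species-averaged lifetime
    tau_f = (x1 * tau1**2 + x2 * tau2**2) / tau_x   # fluorescence-averaged lifetime
    efficiency = 1.0 - tau_x / tau_d0               # intensity-based efficiency
    return efficiency, tau_f


def shift_from_static_line(x1, tau1, tau2, tau_d0=TAU_D0):
    """Horizontal offset (ns) of the averaged peak from the static FRET-line."""
    efficiency, tau_f = dynamic_point(x1, tau1, tau2, tau_d0)
    tau_static = tau_d0 * (1.0 - efficiency)  # lifetime on the static line at this E
    return tau_f - tau_static


if __name__ == "__main__":
    for x1 in (0.0, 0.25, 0.5, 0.75, 1.0):
        e, tau_f = dynamic_point(x1, tau1=3.2, tau2=0.8)
        shift = shift_from_static_line(x1, 3.2, 0.8)
        print(f"x1={x1:.2f}  E={e:.3f}  <tau>_F={tau_f:.3f} ns  shift={shift:+.3f} ns")
```

In this idealized picture the offset equals the variance of the lifetime divided by its mean, so it vanishes for a pure state and is positive for any mixture, which is why a dynamically averaged peak lies to the right of the static line.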
In the presented data, the major populations are shifted to the right of the static FRET-line (Fig. 2b, c). This gives clear evidence for a dynamic exchange faster than ms. For molecules in very rapid (µs) exchange between an open and a closed conformation, we expected to detect a single averaged peak in MFD-histograms. Hence, taking fast exchange into account, the major peak of the smFRET data is in agreement with known X-ray structures 14,39 and kinetic data 4,13,20,21,23,24,29,[40][41][42] , most likely corresponding to the dynamic averaging of the hinge bending mechanism.
However, in 18 out of 33 MFD-histograms, we visually identify additional minor populations, which are in slow exchange with the major populations. Surprisingly, these minor populations (E > 0.8, Fig. 2b) and (0.2 < E < 0.6, Fig. 2c) [...] predicted average open and closed conformation (Supplementary Fig. 2). This is a first indication of the existence of a third, conformationally excited, structurally distinct conformer.
[Fig. 1 caption, continued] Assuming that the two limiting states (yellow and blue) exchange on timescales faster than ms with exchange rate constants k f and k b , we find only a single population (orange) shifted towards a longer fluorescence lifetime that is located on the dynamic FRET-line (green) connecting these two limiting states. Thus, FRET-lines serve as a visual guide to interpret 2D MFD-histograms, with deviations from the static FRET-line being indicative of dynamic averaging and dynamics at the sub-ms and ms timescales. Filtered fluorescence correlation spectroscopy (fFCS) computes the species-specific cross-correlation function (sCCF (green), Supplementary Methods). The observed anti-correlation reveals a characteristic relaxation time t R related to the inverse of the sum of the exchange rate constants, k f and k b . In eTCSPC, the distribution of the fluorescent photons (yellow/blue: individual states; orange: mixture) is detected with respect to an excitation pulse with ps resolution to reveal populations stable on the ns timescale (Supplementary Methods). Finally, in molecular simulations, the experimental results are compared to available structural models. e Flowchart for the hybrid FRET toolkit for determining structural dynamics. Based on a network of FRET variants, the conformational states and their exchange dynamics are determined, which are then used to identify the structural models. T4L variants with mutations altering their enzymatic activity relate the structural models to enzymatic states. Based on the gathered information, the enzymatic cycle can be modeled.
In conclusion, MFD-histograms identify three conformers in T4L referred to as C 1 , C 2 , and C 3 . The conformers C 1 and C 2 are likely in fast exchange, while C 3 is in slow exchange with C 1 or C 2 . These conformers may represent limiting states in the exchange dynamics 43,44 .
Following the workflow presented in Fig. 1e, we next determine the kinetic signatures via fFCS and resolve remaining ambiguities by simulations of MFD-experiments.
Connectivity of states in a kinetic network. To construct a reaction scheme of T4L's enzymatic cycle, the variant S44pAcF/ I150C is used as pseudo-wildtype ("wt ** "). At first, we carry out control experiments by comparing for this variant (DA)-labeled and reversely (AD)-labeled T4L variants. In this way, we could exclude potential dye artifacts ( Supplementary Fig. 3a-c, Supplementary Note 1) because the kinetic behavior was independent of the labeling scheme. The MFD-histogram of S44pAcF/I150C (Fig. 3a) reveals a typical pattern: a major population C 1 /C 2 (0.2 < E < 0.6) and a minor C 3 population (E > 0.8) similar to the variants presented in Fig. 2b, c.
To unravel the kinetic behavior of an enzyme, one has to be aware that an enzymatic cycle with multiple states can be described by a transition rate matrix, which contains all exchange rate constants of the states. To recover T4L's transition rate matrix, we determine a set of relaxation times by fFCS (see next paragraph) and the species fractions of the states by analysis of the fluorescence decays (for details see the section below). This analysis results in ambiguous solutions, which are resolved by simulating MFD experiments making use of the information contained in smFRET experiments.
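How a transition rate matrix links rate constants to relaxation times and species fractions can be sketched for a linear three-state scheme. The Python sketch below is illustrative only: it assumes a chain C1 ⇋ C2 ⇋ C3, uses the closed-form eigenvalues of the 3 × 3 rate matrix, and is called with hypothetical rate constants (in 1/µs) that are not the values determined in the paper.

```python
import math


def linear_three_state(k12, k21, k23, k32):
    """Rate matrix, relaxation times, and equilibrium fractions for C1 <-> C2 <-> C3.

    k_ij is the rate constant of the transition Ci -> Cj; all rate constants
    must share one unit (here 1/us), so relaxation times come out in us.
    """
    # Transition rate matrix K for dx/dt = K x with x = (x1, x2, x3).
    # Every column sums to zero, so the total population is conserved.
    rate_matrix = [
        [-k12, k21, 0.0],
        [k12, -(k21 + k23), k32],
        [0.0, k23, -k32],
    ]
    # One eigenvalue of K is 0 (equilibrium). The other two are -r, where r
    # solves r^2 - s*r + p = 0.
    s = k12 + k21 + k23 + k32
    p = k12 * k23 + k21 * k32 + k12 * k32
    root = math.sqrt(s * s - 4.0 * p)
    relaxation_times = (2.0 / (s + root), 2.0 / (s - root))  # (fast, slow)
    # Equilibrium fractions from detailed balance along the chain.
    weights = (k21 * k32, k12 * k32, k12 * k23)
    total = sum(weights)
    fractions = tuple(w / total for w in weights)
    return rate_matrix, relaxation_times, fractions


if __name__ == "__main__":
    # Hypothetical rate constants in 1/us, chosen only for illustration.
    K, times, x = linear_three_state(k12=0.10, k21=0.15, k23=0.002, k32=0.003)
    print("relaxation times (us):", times)
    print("equilibrium fractions:", x)
```

A system with three states has two non-zero eigenvalues and therefore two relaxation times, which is the reason that two measured relaxation times imply at least three states.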
Kinetic network of conformational states resolved by fFCS. By fFCS, we probe transitions in T4L on all relevant timescales 34,35 to resolve the kinetic network of conformational states. fFCS uses species-specific information encoded as a characteristic pattern within the ns-regime of the polarization-resolved fluorescence decays 34,45 . This amplifies the contrast compared to conventional FCS for resolving relaxation times with high precision. We find very good agreement between the normalized species cross-correlation functions (sCCF) of the (AD)- and (DA)-labeled molecules. A global analysis of the sCCFs and the species autocorrelation functions (sACF) requires at least two relaxation times (t R1 = 4 μs and t R2 = 230 μs, Fig. 3b, Supplementary Note 2). In summary, the two relaxation times obtained by sCCFs independently support the hypothesis of the interconversion between three states at sub-ms timescales. Moreover, in line with the MFD-histograms, we find a fast and a slow relaxation time.
Simulation of the kinetic network of T4 Lysozyme. The three identified conformers C 1 , C 2 , and C 3 are assigned by their characteristic species fractions (see below) to the corresponding structural states open, closed, and excited, respectively.
Three distinct linear kinetic reaction schemes are possible: C 1 ⇋ C 2 ⇋ C 3 , C 1 ⇋ C 3 ⇋ C 2 , and C 2 ⇋ C 1 ⇋ C 3 . [...] is unlikely due to the lack of bursts across all FRET variants that connect C 3 ⇋ C 1 with an effective slower rate to satisfy equilibrium. These bursts would follow the dynamic FRET-lines as guides between states in the MFD-diagram (Fig. 3a and Supplementary Fig. 2). Nonetheless, the sequential closing from the most open state (lowest FRET-efficiency in the variant S44pAcF/I150C) to the most compact state (highest FRET-efficiency in the variant S44pAcF/I150C) is depicted by C 1 ⇋ C 2 ⇋ C 3 , so that we can discard the models C 1 ⇋ C 3 ⇋ C 2 and C 2 ⇋ C 1 ⇋ C 3 . With the relaxation times determined by fFCS and the species fractions obtained by analysis of the fluorescence decays (see next section), we calculate the exchange rate constants and find two competing solutions (Supplementary Note 3, Equation (36)). The exchange between C 1 and C 2 can be either slow (Fig. 3c) or fast (Fig. 3d).
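Why two competing solutions arise can be made explicit with a small calculation. The Python sketch below is a self-derived illustration and does not reproduce Equation (36) of the Supplementary Note: for a linear scheme C1 ⇋ C2 ⇋ C3 obeying detailed balance, the two relaxation rates fix only the sum and the product of (k12 + k21) and (k23 + k32), so either pair can be the fast one. The relaxation times of 4 µs and 230 µs are taken from the text; the species fractions passed in the example are hypothetical.

```python
import math


def relaxation_times(k12, k21, k23, k32):
    """Forward model: (fast, slow) relaxation times of C1 <-> C2 <-> C3."""
    s = k12 + k21 + k23 + k32
    p = k12 * k23 + k21 * k32 + k12 * k32
    root = math.sqrt(s * s - 4.0 * p)
    return 2.0 / (s + root), 2.0 / (s - root)


def solve_linear_scheme(t_r1, t_r2, x1, x2, x3):
    """Return both rate-constant sets (k12, k21, k23, k32) matching the inputs."""
    total = x1 + x2 + x3
    x1, x2, x3 = x1 / total, x2 / total, x3 / total  # normalize fractions
    r1, r2 = 1.0 / t_r1, 1.0 / t_r2
    s = r1 + r2                                 # equals k_a + k_b
    q = r1 * r2 * (x1 + x2) * (x2 + x3) / x2    # equals k_a * k_b
    disc = s * s - 4.0 * q
    if disc < 0.0:
        raise ValueError("inputs are incompatible with a linear three-state scheme")
    root = math.sqrt(disc)
    sums = ((s + root) / 2.0, (s - root) / 2.0)
    solutions = []
    # k_a = k12 + k21 and k_b = k23 + k32; the fast/slow assignment is ambiguous.
    for k_a, k_b in (sums, sums[::-1]):
        k12 = k_a * x2 / (x1 + x2)  # detailed balance: x1 * k12 = x2 * k21
        k21 = k_a * x1 / (x1 + x2)
        k23 = k_b * x3 / (x2 + x3)  # detailed balance: x2 * k23 = x3 * k32
        k32 = k_b * x2 / (x2 + x3)
        solutions.append((k12, k21, k23, k32))
    return solutions


if __name__ == "__main__":
    # Relaxation times in us from the text; fractions are hypothetical.
    for label, ks in zip(("C1/C2 fast", "C1/C2 slow"),
                         solve_linear_scheme(4.0, 230.0, 0.50, 0.29, 0.21)):
        print(label, ["%.5f" % k for k in ks], relaxation_times(*ks))
```

Both parameter sets reproduce the same relaxation times and species fractions, so additional information, such as the shape of the simulated MFD-histograms, is needed to choose between them.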
To solve this ambiguity, we simulate sm-experiments of the two possible solutions 43 [...] To conclude, we experimentally determine all reaction rate constants that define the reaction network, and the resulting species fractions. This description covers the µs-ms range and captures the relevant global motions of T4L.
Characterization of the third conformer by eTCSPC. As demonstrated by fFCS analysis, T4L is highly dynamic. Hence, the FRET-efficiencies in smFRET-histograms only represent dynamic averages of states 46 . Therefore, to resolve the limiting states of the system, we record high-precision fluorescence decays by eTCSPC and analyze the distribution of the photon arrival times, t, with respect to the excitation pulse. This analysis benefits from the absence of polarization effects owing to magic-angle detection, low background fluorescence, and the absence of photobleaching. Moreover, it can reveal DA-distance distributions and species populations 47 . To dissect the donor quenching by FRET (i.e., FRET-induced donor decay), we jointly analyze the DA- and DOnly-dataset, where the fluorescence lifetime distribution is shared with the DA-dataset. For physically meaningful analysis results, we explicitly consider the DA-distribution broadening due to the linkers by normal distributions 47 . The analysis results of all 33 FRET-datasets are discussed using the variant S36pAcF/P86C shown in Fig. 4a (for other variants see Supplementary Note 4, Supplementary Fig. 5a). We display the experimental data by fluorescence decays of the DA- and the corresponding DOnly-sample (Fig. 4a). In agreement with the MFD-histograms and the fFCS data, 1-component models result in broad DA-distributions and/or are insufficient to describe the data (Fig. 4a, weighted residuals, violet). For S36pAcF/P86C, we obtain both an unphysical distribution width and significant deviations in the weighted residuals, a strong indication that more than one conformer is present.
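The structure of such a decay model can be sketched as follows. This minimal Python example is not the authors' fitting code: it assumes a single-exponential donor with a hypothetical lifetime of 4.0 ns, a hypothetical Förster radius of 52 Å, and hypothetical means, widths, and fractions for three states, and it omits the convolution with the instrument response function, background, and the DOnly fraction that a real eTCSPC analysis requires.

```python
import math

R0 = 52.0      # Angstrom, assumed Forster radius (hypothetical value)
TAU_D0 = 4.0   # ns, assumed donor lifetime without acceptor (hypothetical value)


def fret_rate(r_da, r0=R0, tau_d0=TAU_D0):
    """Forster rate constant k_FRET (1/ns) at donor-acceptor distance r_da."""
    return (1.0 / tau_d0) * (r0 / r_da) ** 6


def gaussian_weights(mean, sigma, n=201, span=4.0):
    """Discretized, normalized normal distribution of DA distances."""
    lo = mean - span * sigma
    step = 2.0 * span * sigma / (n - 1)
    grid = [lo + i * step for i in range(n)]
    grid = [r for r in grid if r > 0.0]  # distances must be positive
    weights = [math.exp(-0.5 * ((r - mean) / sigma) ** 2) for r in grid]
    total = sum(weights)
    return grid, [w / total for w in weights]


def donor_decay(t, states, r0=R0, tau_d0=TAU_D0):
    """Ideal donor decay in the presence of acceptor for a mixture of states.

    states: list of (species_fraction, mean_distance, sigma) tuples.
    """
    value = 0.0
    for fraction, mean, sigma in states:
        grid, weights = gaussian_weights(mean, sigma)
        for r, w in zip(grid, weights):
            k_total = 1.0 / tau_d0 + fret_rate(r, r0, tau_d0)
            value += fraction * w * math.exp(-k_total * t)
    return value


if __name__ == "__main__":
    # Hypothetical three-state mixture: (fraction, mean distance in A, width in A).
    states = [(0.50, 50.0, 6.0), (0.29, 42.0, 6.0), (0.21, 35.0, 6.0)]
    for t in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(f"t={t:.1f} ns  f_DA={donor_decay(t, states):.4f}  "
              f"f_D0={math.exp(-t / TAU_D0):.4f}")
```

In a global analysis of the kind described in the text, the species fractions would be shared among all variants while the distances remain specific to each variant.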
The analysis of the fluorescence decays by a 2-component model yields an inconsistent assignment by the species fractions ( Supplementary Fig. 5b, c). This is evident by significant differences among the species fractions (Supplementary Table 2b). Moreover, the DA-distances disagree with known structural models (compare Supplementary Table 2d, e).
In our effort to seek a consistent description of all measured fluorescence decays, we develop a joint/global model function. For such description, we treated all fluorescence decays as a single dataset sharing common species fractions for the states. This reduces the number of free parameters and dramatically stabilizes the optimization algorithm. Because the global 2-component model (Fig. 4, cyan, Supplementary Table 2c) shows no agreement with the data, we consequently used a 3-component model (Fig. 4a orange, Supplementary Table 2d-f) to describe the data.
To analyze the precision of this fit, the uncertainties ΔR DA of the obtained distances, 〈R DA 〉, from the 3-component model need to be determined. ΔR DA depends on statistical uncertainties and systematic errors. We use the known shot noise of the fluorescence decays to estimate the statistical uncertainties, ΔR DA (k FRET ), of the FRET-rate constant k FRET (Fig. 4b, Supplementary Table 2g). Moreover, we record polarization-resolved fluorescence decays of the donor and acceptor by eTCSPC to analyze the time-resolved anisotropy (Supplementary Table 3a, b) for estimating systematic errors, ΔR DA (κ 2 ), due to the orientation factor κ 2 . In conclusion, we can demonstrate that ΔR DA (κ 2 ) dominates the overall uncertainty of ΔR DA (Eq. (5), Supplementary Table 2d-g).
Moreover, we sample the model parameters of a 3-component model for individual datasets by a Markov chain Monte Carlo (MCMC) method. This demonstrates that, for given state populations, the mean distances 〈R DA,1 〉, 〈R DA,2 〉, and 〈R DA,3 〉 are very well defined (compare red to black in Fig. 4b). This also shows that a global model, which interrelates the state populations among datasets, improves the capability to resolve interdye distances.
A global 3-component model has too many degrees of freedom (Supplementary Methods) to be exhaustive when sampling by MCMC. Hence, we vary the state population of the minor state, x (C 3 ), while optimizing all other model parameters (support plane analysis). This way, we determine the dependency of x(C 3 ) on the quality parameter χ 2 r of all measurements (Fig. 4c). This analysis (1) shows that the minor state population is in the range of 0.1-0.27 and best agrees with the data for 0.21 (Fig. 4c, p- (5)).
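The idea of a support plane analysis can be shown with a toy problem. The Python sketch below is an assumption-laden illustration rather than the global 33-dataset analysis of the paper: the model is a two-exponential decay with hypothetical fixed lifetimes, the only re-optimized parameter is the amplitude (solved by weighted linear least squares), and the data are synthetic with a seeded pseudo-random generator. It does not compute the statistical threshold that delimits the confidence interval.

```python
import math
import random


def model_shape(t, x_minor, tau_minor=0.8, tau_major=3.0):
    """Unit-amplitude decay with minor-state fraction x_minor (lifetimes in ns)."""
    return (x_minor * math.exp(-t / tau_minor)
            + (1.0 - x_minor) * math.exp(-t / tau_major))


def chi2_r_at_fixed_x(times, data, sigma, x_minor):
    """Reduced chi^2 with x_minor held fixed and the amplitude re-optimized."""
    shape = [model_shape(t, x_minor) for t in times]
    # Weighted linear least squares for the remaining free parameter.
    num = sum(d * s / e**2 for d, s, e in zip(data, shape, sigma))
    den = sum(s * s / e**2 for s, e in zip(shape, sigma))
    amplitude = num / den
    chi2 = sum(((d - amplitude * s) / e) ** 2
               for d, s, e in zip(data, shape, sigma))
    return chi2 / (len(times) - 2)  # two model parameters: x_minor, amplitude


def support_plane(times, data, sigma, grid):
    """Scan x_minor over grid and return a list of (x_minor, chi2_r)."""
    return [(x, chi2_r_at_fixed_x(times, data, sigma, x)) for x in grid]


def make_synthetic_data(x_true=0.21, amplitude=10000.0, n=200, seed=1):
    """Synthetic decay with Gaussian noise of shot-noise magnitude."""
    rng = random.Random(seed)
    times = [0.05 * i for i in range(n)]
    ideal = [amplitude * model_shape(t, x_true) for t in times]
    sigma = [math.sqrt(max(y, 1.0)) for y in ideal]
    data = [rng.gauss(y, e) for y, e in zip(ideal, sigma)]
    return times, data, sigma


if __name__ == "__main__":
    times, data, sigma = make_synthetic_data()
    grid = [i / 100.0 for i in range(0, 51)]
    scan = support_plane(times, data, sigma, grid)
    best_x, best_chi2 = min(scan, key=lambda item: item[1])
    print(f"best x_minor = {best_x:.2f}, chi2_r = {best_chi2:.3f}")
```

The range of the scanned parameter for which the reduced χ² stays below a chosen statistical threshold then serves as its confidence interval.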
In summary, only a 3-component analysis describes all FRET samples and reference samples in a global model. This analysis recovers a set of physically meaningful average DA-distances that are grouped automatically and unbiased by their state populations. Additionally, the 3-component model is consistent with the fFCS data and with the dynamic FRET-lines displaying dynamically averaged sm-subpopulations in MFD (Fig. 2,  Supplementary Fig. 2).
The integrated results are consistent with a view that T4L adopts three states (C 1 , C 2 , and C 3 ), as opposed to the expected two conformational states based on structural pre-knowledge.
Structural features of conformational states. To compare the experimental distances 〈R DA,exp 〉 obtained from the fluorescence decays (under consideration of their respective uncertainties ΔR DA ) to the structural models deposited in the PDB, we cluster all available 578 structures of T4L and align them. We observe that the structural models of T4L group into open, ajar, and closed clusters (based on the proximity of the CTsD and NTsD, Supplementary Table 4) with an intra-cluster root mean-squared displacement of less than 1.8 Å. The representative structures of these clusters are given by PDB IDs 172L, 1JQU, and 148L for the open, ajar, and closed conformations, respectively (Fig. 5a).
Next, we apply the FRET positioning system (FPS) 34 to compute an error function (χ 2 r,FPS ) that compares the three sets of 33 distances 〈R DA,exp 〉 to the modeled distances 〈R DA,model 〉 by FPS. In χ 2 r,FPS , we consider explicitly the uncertainties, ΔR DA , of the distance 〈R DA,exp 〉 9 . The overall agreement (minimum χ 2 r,FPS ) for the distance sets for C 1 and C 2 is best for 172L and 148L, respectively (Fig. 5b). In Fig. 5c, 〈R DA,model 〉 for 172L and 148L are compared to 〈R DA,exp 〉 of C 1 and C 2 , respectively. A linear regression (red line) with a slope close to one demonstrates the absence of significant systematic deviations.
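The screening step can be sketched as an uncertainty-weighted comparison of distance sets. In the Python example below, the normalization by the number of distances is an assumption and may differ from the definition of χ 2 r,FPS used by the authors; the candidate names and all distances are hypothetical, and the modeled distances would in practice come from dye simulations on each structure.

```python
def chi2_r_fps(r_exp, r_model, delta_r):
    """Reduced chi^2 between experimental and modeled mean DA distances.

    Assumes normalization by the number of distances (an assumption).
    """
    if not (len(r_exp) == len(r_model) == len(delta_r)):
        raise ValueError("all inputs must have the same length")
    terms = [((e - m) / d) ** 2 for e, m, d in zip(r_exp, r_model, delta_r)]
    return sum(terms) / len(terms)


def best_matching_model(r_exp, delta_r, candidates):
    """Return (name, chi2_r) of the candidate structure that fits best.

    candidates: dict mapping a structure name to its modeled distances.
    """
    scores = {name: chi2_r_fps(r_exp, r_model, delta_r)
              for name, r_model in candidates.items()}
    name = min(scores, key=scores.get)
    return name, scores[name]


if __name__ == "__main__":
    # Hypothetical distances in Angstrom for three FRET pairs.
    r_exp = [50.0, 42.0, 38.0]
    delta_r = [3.0, 3.0, 3.0]
    candidates = {
        "open-like": [49.0, 43.0, 37.0],
        "closed-like": [41.0, 36.0, 44.0],
    }
    print(best_matching_model(r_exp, delta_r, candidates))
```

A state for which no candidate reaches a low score, as reported for C 3 , is not represented in the screened ensemble.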
Structurally, the ajar state is more closed than the open state and more open than the closed state, most likely representing an intermediate conformation or it could arise from structural instabilities introduced by specific mutations such as W158L 48 . The deviation from the open and closed state is clearly reflected in the elevated χ 2 r,FPS . Consequently, within our precision we can safely assign C 1 as open and C 2 as closed state. Screening results of other structures in the PDB against the FRET data are very similar to the results for the discussed cluster representatives, as expected ( Supplementary Fig. 6). However, none of the structures can be assigned to the C 3 state as judged by the disagreement with the data (Fig. 5b, χ 2 r,FPS ). Thus, we conclude that C 3 is an excited conformational state of currently unknown structure.
Relevant functional states in the enzymatic cleavage cycle. Detection of C 3 by EPR. We use double electron-electron resonance (DEER) to provide additional support for the C 3 state. Multiple DEER studies on T4L have shown interspin distributions for wt T4L 49,50 . Here, we show the distribution of interspin distances of the adduct form of the variant T26E/S44pAcF/I150C labeled with the appropriate spin label MTSSL to produce the variant T26E(+)-44R1/150R1, which displays a satellite population with an interspin distance of ~35 Å resembling the enzyme-product complex EP within the catalytic cleavage cycle of T4L (Fig. 6a, Supplementary Fig. 5d). The most frequently observed distance falls at interspin distances of 42 Å with another less populated state at interspin distances of >50 Å. These may correspond to the various sub-states of the closed (C 2 ) and open (C 1 ) states, respectively. To ensure that this small population is not an artifact of the Tikhonov regularization algorithm 51,52 or due to the rotamer populations of the spin label-carrying side chain, we lower the pH to influence the conformational equilibrium of the states 53 . The FRET-experiment with the variant S44pAcF/I150C shows an increase in the population of C 3 at pH 2 (Fig. 6b), and the analogous DEER experiment at pH 3 shows a remarkably similar redistribution of interspin distances. Compared to physiological pH conditions (Fig. 6a, dashed trace), these distances exhibit a shortening that is consistent with the C 3 [...]
Trapped reaction states of T4L. To mimic functional enzymatic states, we mutated the residues E11 and T26 at the active site using the backbone of the S44pAcF/I150C variant, also named wt ** 14,39,54 . We use wt ** because of the advantage in clearly resolving all three conformations of the free enzyme (E) by FRET. These mutations help to identify the role of C 3 during enzyme catalysis: E11A, which inactivates T4L, causes the enzyme to bind its substrate S (peptidoglycan from Micrococcus luteus) while preventing the expected hydrolysis reaction 54 . Thus, in the presence of excess substrate, this mutation mimics the enzyme-substrate complex (ES). We monitor the effect of the substrate binding for the E11A mutation by FCS and compare the characteristic translational diffusion times, t diff , in both the absence and presence of substrate. While t diff is small (0.54 ms, Fig. 6c, green curve) without the substrate, it increases by several orders of magnitude when the large substrate is introduced (Fig. 6c, yellow curve). Moreover, the shift towards larger donor anisotropy values upon incubation with substrate provides additional evidence for substrate binding without cleavage (Supplementary Fig. 6e).
Sub-ensemble TCSPC analysis of the DA-subpopulation of the ES state (E11A/wt ** in the presence of substrate, Fig. 6d, Supplementary Fig. 7a-d, Supplementary Note 5) reveals an increase of 125% in the population corresponding to C 2 compared to the free enzyme state E, with a concomitant reduction of C 1 . In contrast, no effect of substrate binding for wt ** -(DA) is observed because ES is not trapped (Fig. 6e).
Although the variant T26E cleaves the substrate, the formation of a covalent adduct (PDB ID 148L) prevents a release of the formed product 14 . Therefore, we use this intermediate adduct to mimic the product-bound enzyme state (EP). To confirm the adduct formation under our measurement conditions, we monitor the adduct formation of labeled T4L (T26E/wt ** variant) by HPLC (Fig. 6f). T4L without substrate (E) elutes at ~18 min. After incubation with the substrate, the peak of E drops, and a new elution peak at ~12 min is detected with increasing incubation time (Fig. 6f, Supplementary Fig. 8), indicative of the adduct form of T4L (EP). Both ensemble (Fig. 6e) and sm MFD-measurements (Fig. 6g, Supplementary Fig. 7e-h) show a significant increase of the relative fraction of the C 3 state, an effect also observed in the EPR measurements (Fig. 6a). In the T26E variant, the accumulation of the C 3 state is connected to the inability of this variant to release a part of the product 14 . We conclude that the new excited conformational state must be involved in this step.

Discussion
In the following, we present the experimental evidence for the C 3 state and its structural properties. To corroborate the existence of C 3 , we discuss our experiments in four aspects: (1) the kinetic behavior in sm-experiments, (2) the error statistics of data analysis, (3) the structural validation of the obtained FRET parameters, and (4) the effect of specific mutations.
Aspect 1: Kinetic behavior. In 18 out of the 33 variants with FRET-pairs, the sm-experiments directly show the presence of an additional DA-subpopulation in the MFD-histograms, which differs significantly from C 1 and C 2 (Figs. 2, 3, Supplementary Fig. 2). This DA-subpopulation is either populated or depopulated by specific mutations that alter the overall catalytic activity of T4L. Moreover, our global fluctuation analysis recovers at least two relaxation times that are shared throughout all studied variants (Fig. 3, Supplementary Fig. 9). Applying kinetic theory, two relaxation times indicate at least three states in equilibrium, which are reproduced by Brownian dynamics simulations (Supplementary Fig. 4).
Aspect 2: Error statistics. Key to the analysis and determination of C 3 by ensemble fluorescence decays is the use of global analysis of all 33 variants, which reduces the number of free parameters, increases fitting quality (Supplementary Methods), and gives a consistent description with sm-experiments. Moreover, assuming that CTsD and NTsD are rigid, there are six independent degrees of freedom in the system, which we significantly oversample by measuring 33 variants.
Aspect 3: Structural validation. In contrast to our 3-component model, the global analysis of eTCSPC data using a 2-component model yields two distance sets, which cannot describe the expected interdye distances of the known conformers (C 1 (172L) and C 2 (148L)). Furthermore, for the 2-component model, we do not observe the expected linear correlation between the modeled and experimental interdye distances, as shown in Supplementary  Fig. 5b, c.
Aspect 4: Specific mutations. The final point corroborating the existence of C 3 is provided by the results of a few specific mutations. The variant Q69pAcF/P86C is especially informative, as the donor is placed in the middle of helix c (orange, Fig. 1a), which connects both domains, while the acceptor is located in the middle of helix d, which is part of the CTsD (brown, Fig. 1a). According to FPS, the interdye distances for the C 1 (172L) and C 2 (148L) states are hardly distinguishable by FRET, with 〈R DA 〉 of 34 and 35 Å, respectively. Assuming that both domains preserve their secondary structure, the compaction of T4L in C 3 can only proceed by kinking helix c. This conclusion is consistent with previous studies that identify V75 as the subdomain boundary and as critical for the protein stability of the pseudo-wild-type construct wt * , with a boundary for the local stability to unfolding around residue N68 13,55 . Given the location of the dyes and the extension of the dye linkers, the expected dye orientations will lead to an increase of the interdye distance for C 3 , i.e., a greater interdye distance is expected for C 3 compared to the experimental distances for C 1 (39 Å) and C 2 (37 Å). The additional observed distance of 52.4 Å agrees with this hypothesis (Supplementary Table 2f, Fig. 2c).
In conclusion, the existence of C 3 demonstrates a greater level of complexity of the domain motions of T4L than a single hinge-bending motion, which is in agreement with recent indirect observations 20,24,29 . The complex exchange dynamics between the conformations with relaxation times of 4 and 230 µs and the small population of C 3 may explain the difficulties of other experimental biophysical methods and MD simulations in identifying this exchange, and some heterogeneity in interspin distances observed in previous studies for similar conditions 49 .
Relating conformational states and enzyme function. A threestep process characterizes the T4L hydrolysis of peptidoglycans. First, the glycosidic bond of the substrate (S) is protonated by E11 followed by the simultaneous nucleophilic attack of water molecules, which are hydrogen-bonded to residues D20 and T26, on the C-1 carbon of S. As a result, the covalent adduct (ES) is observed in PDB ID 148L 14 . Second, the proton is presumably returned from D20 to E11 via solvent transfer. The third and final step is the product release from the active site to regenerate the enzyme to the original state.
In view of the structural dynamics and to link T4L's functional cycle to our three observed conformations, we use an extended MMm (eMMm) scheme as suggested by Kou et al. 30 (Fig. 7a).
Here, this model considers a sequential succession of three steps to go from a free enzyme via a product bound state back to the relaxed free enzyme state. The substrate S binds reversibly to the enzyme E to form an enzyme-substrate complex ES and is converted to the product P, resulting in an EP complex with the product still bound to the enzyme. A transition of E to an excited state E * then releases P from the complex, followed by a relaxation of the free enzyme E * to E.
Next, we consider our results in the light of the eMMm framework. First, by using the S44/I150C backbone and two key functional T4L variants (E11A, T26E), we relate the conformational states (C 1 , C 2 , and C 3 ) to the above-described reaction states (E, ES, and EP) in order to elucidate the functional role of E * (Fig. 7). We observed a significant difference in the populations of the conformers (Fig. 6) between the three reaction states.
To connect conformational equilibria with catalysis, we monitor the relative changes in species fractions observed across the functional variants in both the absence and presence of substrate (upper panel of Fig. 6e) by generating a 3 × 3 state matrix (Fig. 7b). As indicated in the matrix, specific conformational states are favored in each enzyme reaction state (Fig. 7c,d).
For this representation, we use the relative population changes as compared to the wt ** to monitor the conformational populations of the different enzyme states.
In the free enzyme state E, the open conformation C 1 is mostly populated to enable substrate binding, which initiates the catalytic cycle through the formation of ES. Through this cycle, the closed conformation C 2 now becomes the most abundant conformation 14 . In this conformation, the glycosidic bond can be cleaved such that C 2 connects both ES and EP. In our studies, we determine that the product release may occur in the compact conformation C 3 , a population that is greatly increased in EP. Thus, C 3 links EP and E so that the original enzyme E is regenerated from EP, which closes the enzymatic cycle. Consequently, the compact state C 3 corresponds to the excited conformational state E * (Fig. 7a-d), the function of which is to disperse the product 18,29 (Fig. 7b, d). This series of events shows a sequential closing from the most open conformation C 1 to the most compact form C 3 along two coordinates: the reaction state and the conformational state.
Most strikingly, even under saturating conditions, which favor the ES and EP states, the enzyme was observed to remain in dynamic equilibrium between the conformational states, rather than transforming entirely into a single conformational state.
To visualize the relative energetic changes of the enzyme during the various steps of the catalytic cycle, we use the species fractions and reaction rate constants to compute the relative energy landscape based on the Arrhenius equation (Fig. 7d, Supplementary Fig. 11). We observe a sequential closing of the enzyme to populate C 3 . This is consistent with a ratchet model that provides directionality to the reaction 58-60 beyond the directionality introduced by the excess of S. It also agrees with our Monte Carlo simulations (Fig. 3), which are incompatible with the unlikely off-pathway route from C 1 to C 2 through C 3 due to steric hindrances, and with the fast hinge-bending motion (4 μs) expected from structural models (PDB ID 172L and 148L).
All our evidence suggests that the conformational state C 3 , which appears to be more compact than any other structure known of T4L, is compulsory after rather than before catalytic cleavage. Thus, the compact nature of this structure suggests a functional role that is related to product release via an excited state E * (Fig. 7a). This mechanism can be an evolutionary advantage when directionality is required for function. On the contrary, considering a system with only two conformational states and without an active cleaning mechanism, the stochastic dissociation of the product can become rate-limiting given a high affinity of the product to the enzyme in the EP state. Indeed, a large surfeit of substrate is always characteristic of the in vivo conditions for T4L. Thus, the use of a three-state system to decouple the substrate access and product release can mitigate the occurrence of substrate inhibition in a two-state system when the route to the active site is clogged by excess substrate concentrations 61 .
In summary, we studied 33 distinct FRET-pairs to effectively oversample the anticipated simple hinge-bending motion of T4L. Due to the high precision, we identified three substrate-dependent fluorescence states that are in fast kinetic exchange. Inverting the positions of the dyes, e.g. S44pAcF/I150C-DA vs. S44pAcF/I150C-AD (Fig. 3b), rules out specific interactions of the dyes with T4L.
Functional variants change the relative populations of the fluorescence states that are determined in a substrate-dependent manner (Fig. 6). Given successive compaction via three conformational states (C 1 , C 2 , and C 3 ), and three reaction states (E, ES, and EP), we considered an eMM reaction scheme (Fig. 7a, c) to provide a meaningful description of the data. Mutagenesis and stability studies indicate the stability of CTsD and a flexibility of the NTsD 4,13,40-42,62 , which may be a necessary principle for the construction of enzymes undergoing conformational changes during catalysis. The combination of known structural models and fluorescence data is used to create a proposed novel structural state in the catalytic cycle of T4L involving a rearrangement of the reactive NTsD with respect to the CTsD, which is deemed consistent with the eMMm for enzyme kinetics. For a complete structural insight, we are now using the obtained FRET-restraints to present a potential model of C 3 , which is left for a future report.

Fig. 7 Energy landscape of T4L. a Extended Michaelis-Menten scheme. b T4L interconverts between three major C 1 , C 2 , and C 3 conformational states. The population fractions of C 1 , C 2 , and C 3 are normalized to the variant S44pAcF/I150C in the absence of substrate using the relative changes in Fig. 6e to correct for the influence of specific mutations in that absence. The different font sizes represent the species fractions x i for each conformer according to Supplementary Table 2h and satisfy x 1 + x 2 + x 3 = 1. The three enzyme states are monitored via the following three enzyme variants: (i) the free enzyme state E via S44pAcF/I150C; (ii) the enzyme-substrate state ES via the inactive E11A/S44C/I150C with bound substrate; and (iii) the enzyme product state EP via the product adduct with T26E/S44pAcF/I150C after substrate cleavage. The reaction rate constants are calculated according to the process detailed in Supplementary Note 3, the confidence intervals of which are shown in Supplementary Table 4c. c The peptidoglycan chain with n subunits (S n ) is cleaved into two products (P i and P j with n = i + j) by T4L, both of which can be further processed by T4L until only the dimer of N-acetylglucosamine and N-acetylmuramic acid remains. The gray shaded steps indicate the conformational/reaction states observed. d Relative Gibbs free energy landscapes are calculated using $\Delta G^{0} = -k_B T \ln\left(k_{ji}/k_{ij}\right)$, where k B is the Boltzmann constant and T is the temperature; k ij are the reaction rate constants between states C i and C j for the data presented in (a). The activation energies are calculated according to $\Delta G^{0+} = -k_B T \ln\left(k_{ji}/k_{0}\right)$, assuming $k_0 = 10^{3}\ \mathrm{ms}^{-1}$ as an arbitrary constant. The distributions consider C 1 , C 2 , and C 3 to follow a Gaussian distribution as a function of the interdye distance R DA . The Gaussian widths (σ i ) are adjusted to satisfy the energy differences and calculated activation energies. Each energy landscape is independently normalized to C 1 .
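The two free-energy relations quoted in the caption of Fig. 7d can be evaluated with a few lines of code. The following is only a minimal sketch: energies are returned in units of k B T, the function names are hypothetical, and the rate constants in the usage lines are illustrative placeholders rather than the fitted values of Supplementary Table 4c.

```python
import math

# Pre-exponential constant k_0 = 10^3 ms^-1, the arbitrary value stated in the caption.
K0_PER_MS = 1.0e3


def relative_free_energy_kt(k_ji: float, k_ij: float) -> float:
    """Delta G^0 = -k_B T ln(k_ji / k_ij), returned in units of k_B T."""
    return -math.log(k_ji / k_ij)


def activation_energy_kt(k_ji: float, k0: float = K0_PER_MS) -> float:
    """Delta G^0+ = -k_B T ln(k_ji / k_0), returned in units of k_B T."""
    return -math.log(k_ji / k0)


if __name__ == "__main__":
    # Hypothetical rate constants in ms^-1 (placeholders, not measured values).
    k_12, k_21 = 100.0, 150.0
    print("Delta G^0 / k_B T  :", relative_free_energy_kt(k_21, k_12))
    print("Delta G^0+ / k_B T :", activation_energy_kt(k_21))
```

Multiplying the returned values by k B T at the measurement temperature converts them into absolute energies.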
We anticipate that the presented integrative approach combining the fluorescence spectroscopic toolkit and computational information can accelerate the development of dynamic structural biology 63 by resolving the behaviors of long- and short-lived excited states for purposes of characterizing their functional relevance. This approach is highly relevant as we move towards understanding biomolecular dynamics in situ, where "invisible" molecular effects (i.e., ionic, viscous, and crowding effects 64 ) have the potential to modulate weak interactions with important repercussions in biological systems. The elucidation of excited conformational states is necessary for a thorough understanding of the mechanisms of enzymes. Thus, a comprehensive description of a dynamic molecular system contains intertwined kinetic and structural information, which is often difficult to obtain by traditional methods. Such information can be archived in the data bank PDB-dev 65 so that excited conformational states gain the urgently needed visibility.

Methods
Sample preparation. T4L cysteine and amber (TAG) mutants were generated via site-directed mutagenesis in the pseudo-wild-type construct containing the mutations C54T and C97A (wt * ), which was subsequently cloned into the pET11a vector (Life Technologies Corp.) [66][67][68] . All primer sequences are listed in Supplementary Table 6. The plasmid containing the gene with the desired mutant was cotransformed with pEVOL 66 into BL21(DE3) E. coli (Life Technologies Corp.) and plated onto LB-agar plates supplemented with the respective antibiotics, ampicillin and chloramphenicol. A single colony was inoculated into 100 mL of LB medium containing the above-mentioned antibiotics and grown overnight at 37 °C in a shaking incubator. 50 mL of the overnight culture was used to inoculate 1 L of LB medium supplemented with the respective antibiotics and 0.4 g/L of pAcF (SynChem) and grown at 37 °C until an OD 600 of 0.5 was reached. Protein production was induced for 6 h by addition of 1 mM IPTG and 4 g/L of arabinose.
Cells were harvested and lysed in 50 mM HEPES, 1 mM EDTA, and 5 mM DTT at pH 7.5, and the protein was purified using a monoS 5/5 column (GE Healthcare) with an eluting gradient from 0 to 1 M NaCl according to standard procedures. High-molecular weight impurities were removed by passing the eluted protein through a 30 kDa Amicon concentrator (Millipore), followed by concentration and buffer exchange of the protein flow-through to 50 mM PB, 150 mM NaCl pH 7.5 with a 10 kDa Amicon concentrator. For the double cysteine mutant containing E11A, the temperature was reduced to 20 °C after induction and the cells were grown for an additional 20 h to increase the fraction of soluble protein. This mutant was produced and purified as described above, except that only ampicillin for selection and IPTG for induction were needed.
Site-specific labeling of T4L was accomplished using orthogonal chemistry. To probe the T4L structure by FRET studies (Fig. 2a), we labeled the keto group of the p-acetyl-L-phenylalanine (pAcF) amino acid at the N-terminal subdomain using hydroxylamine linker chemistry for the donor dye Alexa488 (Life Technologies Corp.). Cysteine mutants were labeled via a thiol reaction with maleimide linkers of the Alexa647 acceptor dye.
In one exceptional case of the sample S44pAcF/I150C-AD, the labeling was reversed in order to test the reproducibility of the filtered FCS (Fig. 3b) and FRET measurements with different dyes. Acceptor dye Alexa647 with hydroxylamine linker was used to label the pAcF and the Alexa488 donor dye with maleimide linker was used to label the cysteine residue of the mutant S44pAcF/I150C-AD.
For spin labeling, the S44C/I150C double mutant was diluted to a final concentration of ~10 µM in labeling buffer (50 mM MOPS, 25 mM NaCl, pH 6.8) and a 10-fold molar excess of a methanethiosulfonate nitroxide (MTSSL) was added overnight 68 . The next day, excess spin reagent was removed using a desalting column (HiPrep 26/10, GE Healthcare) according to the manufacturer's instructions and the sample was concentrated with a 15 kDa Amicon concentrator (Millipore).
Binding of labeled T4L mutants to peptidoglycan from Micrococcus luteus (Sigma-Aldrich) was monitored by reverse phase chromatography using a C-18 column out of ODS-A material (4 × 150 mm, 300 Å) (YMC Europe GmbH, Dinslaken, Germany). The labeled protein (1 µM) was incubated with the substrate at 1 mg/mL in PBS. At various points of the reaction, 25 µL of the mixed sample were injected and eluted with a gradient from 0 to 80% acetonitrile containing 0.01% trifluoroacetic acid for 25 min at a flow rate of 0.5 mL/min. The labeled complex elution was monitored by absorbance at 495 nm.
Single-molecule experiments. For single-molecule measurements with multiparameter fluorescence detection, we added 40 µM TROLOX to the measurement buffer to minimize the acceptor blinking and 1 µM unlabeled T4L to prevent any adsorption to the cover glass. A custom-built confocal microscope with a dead time-free detection scheme using 8 detectors (four green (τ-SPAD, PicoQuant, Germany) and four red channels (APD SPCM-AQR-14, Perkin Elmer, Germany)) was used for MFD and fFCS measurements. A time-correlated single photon counting (TCSPC) module with 8 synchronized input channels (HydraHarp 400, PicoQuant, Germany) was used to register the detected photon counts in the Time-Tagged Time-Resolved (TTTR) mode. For more details on TTTR, see ref. 69 . The data were analyzed by established MFD procedures 31,33,43 and software; a more detailed description is given in Supplementary Methods. Exemplary data analysis is shown in Supplementary Fig. 3 and Supplementary Note 1; MFD-histograms of all measurements are collected in Supplementary Fig. 2 and Supplementary Fig. 10.
Filtered FCS. Filtered FCS (fFCS) is a derivative of fluorescence correlation spectroscopy (FCS). In fFCS, the information on the fluorescent species contained in the time- and polarization-resolved fluorescence decays (examples shown in Supplementary Fig. 9a, c) was used to amplify the transitions between the species of interest 32,34,45 . For this, we arbitrarily constructed species-selective filters (examples shown in Supplementary Fig. 9b, d) based on the major and minor population in the smFRET experiment and calculated species-selective auto- (sACF) and cross-correlation functions (sCCF). In total, four curves (two sACFs and two sCCFs) were analyzed jointly using established fitting models (Supplementary Methods Equations (15)(16)(17)) 32,34,45 . For more details see Supplementary Methods.
Fluorescence decay analysis. Fluorescence decays of all samples and single-labeled reference samples were collected on either an IBH-5000U (IBH, Scotland) or a FluoTime 200 (PicoQuant, Germany) system. We collected high-precision fluorescence decay histograms with 30 million photons to precisely determine the FRET parameters of limiting states together with their corresponding structural properties. eTCSPC has the advantage of better photon statistics, polarization-free measurements due to magic-angle detection, a keenly evolved instrumental response function (IRF), low background fluorescence, and the absence of photobleaching at low excitation powers. As the fluorophores are coupled via long and flexible linkers, a DA-distance distribution results even for single protein conformations. For our data analysis, we assumed that the dyes rotate quickly (κ 2 = 2/3) and diffuse slowly compared to the fluorescence lifetime (~ns) 38 . We validated the assumption of fast rotating dyes by time-resolved anisotropy measurements (Supplementary Table 3a) 47 . We compared three different fit models (Supplementary Table 5). The results are summarized in Supplementary Table 2. We estimated the statistical uncertainty of the model parameters by making use of the known shot noise of the fluorescence decays. We randomly sampled the model parameters by a Markov chain Monte Carlo (MCMC) method to estimate their uncertainties for a single dataset (Supplementary Methods) 70 . The applied joint, global fit significantly reduced the overall dimensionality of the analysis but still left too many degrees of freedom (Supplementary Methods) for an exhaustive sampling by MCMC. Hence, we applied a support plane analysis to estimate the model parameter uncertainties, in which we systematically varied x 3 while minimizing all other parameters.
Electron paramagnetic resonance. For double electron-electron resonance (DEER) measurements of doubly spin-labeled proteins, ~200 µM spin-labeled T4L containing 20% glycerol (v/v) was placed in a quartz capillary (1.5 mm i.d. × 1.8 mm o.d.; VitroCom) and then flash-frozen in liquid nitrogen. Sample temperature was maintained at 80 K. The four-pulse DEER experiment was conducted on a Bruker Elexys 580 spectrometer fitted with an MS-2 split ring resonator. Pulses of 8 ns (π/2) and 16 ns (π) were amplified with a TWT amplifier (Applied Engineering Systems). Pump frequency was set at the maximum of the central resonance, and the observe frequency was 70 MHz less than the pump frequency. Dipolar data were analyzed by using a custom program, LongDistances 71 , written in LabVIEW (National Instruments Co.). Distance distributions were acquired using Tikhonov regularization 51 .
Recovering the reaction network by Brownian dynamics simulations. To solve the ambiguity in the connectivity of states and kinetics of T4L, i.e. between the two possible analytical solutions of the transition rate matrix (Supplementary Methods Equation (31)), we used Brownian dynamics simulation of single-molecule and fFCS experiments. Simulations of single-molecule measurements were done via Brownian dynamics [72][73][74][75] . The spatial intensity distribution of the observation volume was assumed to be a 3D Gaussian. In contrast to other simulators, freely diffusing molecules in an "open" volume were used. Transition kinetics was modeled by allowing i → j transitions. The times that molecules spend in the i and j states (t i and t j , respectively) are exponentially distributed with the corresponding transition rate constants. Simulated photon counts were saved in SPC-132 data format (Becker & Hickl GmbH, Berlin, Germany) and treated as experimental data. To quantify the difference between the two possible, simulated models and the experimental data, we calculated the relative χ 2 for the one-dimensional and two-dimensional MFD-histograms (Supplementary Note 3, Supplementary Table 1a, b).
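The state-hopping part of such a simulation, in which dwell times in each state are exponentially distributed, can be illustrated as follows. This is a minimal sketch under stated assumptions and not the simulator used in this work: diffusion through the observation volume and photon emission are omitted, the function names are hypothetical, and the rate constants are placeholders.

```python
import random


def simulate_two_state(k_ij, k_ji, total_time, rng):
    """Return a list of (state, dwell_time) pairs for a molecule hopping
    between state i (coded 0) and state j (coded 1).

    Dwell times are drawn from exponential distributions whose rates are the
    rate constants for leaving the current state.
    """
    trajectory = []
    state, elapsed = 0, 0.0
    while elapsed < total_time:
        rate = k_ij if state == 0 else k_ji
        dwell = rng.expovariate(rate)  # exponentially distributed dwell time
        trajectory.append((state, dwell))
        elapsed += dwell
        state = 1 - state  # switch to the other state
    return trajectory


def mean_dwell(trajectory, state):
    """Average dwell time spent in the given state."""
    dwells = [t for s, t in trajectory if s == state]
    return sum(dwells) / len(dwells)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical rate constants (inverse time units), for illustration only.
    traj = simulate_two_state(k_ij=0.25, k_ji=1.0, total_time=50000.0, rng=rng)
    print("mean dwell in i:", mean_dwell(traj, 0))  # expected near 1 / k_ij
    print("mean dwell in j:", mean_dwell(traj, 1))  # expected near 1 / k_ji
```

The mean dwell time in each state approaches the inverse of the rate constant for leaving it, which gives a quick consistency check on any simulated trajectory.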
Simulation of interdye distances and structural modelling. Accessible contact volume (ACV) simulations and interdye distances. The accessible volume (AV) considers dyes as hard sphere models connected to the protein via flexible linkers (modeled as a flexible cylindrical pipe) 38 . The overall dimensions (width and length) of each linker are based on its chemical structure. For Alexa488, the five-carbon linker length was set to 20 Å, the width of the linker to 4.5 Å, and the dye radii to R 1 = 5 Å, R 2 = 4.5 Å and R 3 = 1.5 Å. For Alexa647, the dimensions used are: length = 22 Å, width = 4.5 Å and three dye radii R 1 = 11 Å, R 2 = 3 Å and R 3 = 3.5 Å. Here, the dye distribution was modeled by the accessible contact volume approach (ACV) 9 , which is similar to the accessible volume (AV) 38 , but defines an area close to the surface as contact volume.
Similar approaches have been introduced before to predict possible positions for EPR and FRET labels 10,36,37 . The dye is assumed to diffuse freely within the AV and its diffusion is hindered close to the surface. The part of the AV that is closer than 3 Å to the macromolecular surface (contact volume) is defined to have a higher dye density ρ Dye,trapped . The spatial density ρ Dye along the distance R from the surface is approximated by a step function: ρ Dye = ρ Dye,trapped for R < 3 Å and ρ Dye = ρ Dye,free for R ≥ 3 Å. The ρ Dye,free /ρ Dye,trapped ratio is calculated from the fraction of the trapped dye x Dye,trapped for each labeling position separately: ρ Dye,free /ρ Dye,trapped = V Dye,trapped ·(1 − x Dye,trapped )/(x Dye,trapped ·V Dye,free ). For this, the fraction x Dye,trapped was approximated by the ratio of the residual, r ∞ , and fundamental anisotropy, r 0 , determined by the time-resolved anisotropy decay of the directly excited dyes (Supplementary Table 3).
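The density weighting described above reduces to two short expressions. The sketch below uses hypothetical function names and placeholder inputs; the anisotropies and volumes are illustrative and are not taken from Supplementary Table 3.

```python
def trapped_fraction(r_inf: float, r0: float) -> float:
    """Approximate the trapped dye fraction x_trapped as r_inf / r0."""
    if r0 <= 0.0:
        raise ValueError("fundamental anisotropy r0 must be positive")
    x = r_inf / r0
    if not 0.0 < x < 1.0:
        raise ValueError("r_inf / r0 must lie strictly between 0 and 1")
    return x


def density_ratio_free_to_trapped(x_trapped: float,
                                  v_trapped: float,
                                  v_free: float) -> float:
    """rho_free / rho_trapped = V_trapped (1 - x_trapped) / (x_trapped V_free)."""
    return v_trapped * (1.0 - x_trapped) / (x_trapped * v_free)


if __name__ == "__main__":
    # Placeholder values: anisotropies are dimensionless, volumes in A^3.
    x = trapped_fraction(r_inf=0.1, r0=0.4)
    ratio = density_ratio_free_to_trapped(x, v_trapped=1000.0, v_free=3000.0)
    print("x_trapped =", x)
    print("rho_free / rho_trapped =", ratio)
```

A ratio below one means that the contact volume carries a higher dye density than the free part of the accessible volume.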
To account for dye linker mobility, we generated a series of ACV's for donor and acceptor dyes attached to T4L, placing the dyes at multiple separation distances. For each pair of ACV's, we calculated the distance between dye mean positions (R mp ),

$$R_{mp} = \left| \left\langle \vec{R}_{D(i)} \right\rangle - \left\langle \vec{R}_{A(j)} \right\rangle \right|,$$

where $\vec{R}_{D(i)}$ and $\vec{R}_{A(j)}$ are all the possible positions that the donor fluorophore and the acceptor fluorophore can take. However, in ensemble TCSPC (eTCSPC) the mean donor-acceptor distance is observed:

$$\langle R_{DA} \rangle = \left\langle \left| \vec{R}_{D(i)} - \vec{R}_{A(j)} \right| \right\rangle_{i,j},$$

which can be modeled with the accessible volume calculation. The relationship between R mp and 〈R DA 〉 can be derived empirically following a third order polynomial from many different simulations. The 〈R DA 〉 is not directly related to the distance between atoms on the backbone (Cα-Cα), except through the use of a structural model.
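The difference between the two distance measures can be made concrete with a small numerical example. The sketch below assumes equally weighted dye positions, which corresponds to a plain AV rather than the density-weighted ACV; the function names and coordinates are hypothetical.

```python
import math


def mean_position(points):
    """Arithmetic mean of a list of 3D points (equal weights assumed)."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def distance_between_mean_positions(donor, acceptor):
    """R_mp: distance between the mean donor and mean acceptor positions."""
    return math.dist(mean_position(donor), mean_position(acceptor))


def mean_interdye_distance(donor, acceptor):
    """<R_DA>: average of all pairwise donor-acceptor distances."""
    total = sum(math.dist(d, a) for d in donor for a in acceptor)
    return total / (len(donor) * len(acceptor))


if __name__ == "__main__":
    # Toy point clouds in Angstrom (illustrative only).
    donor = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    acceptor = [(0.0, 0.0, 0.0)]
    print("R_mp   =", distance_between_mean_positions(donor, acceptor))
    print("<R_DA> =", mean_interdye_distance(donor, acceptor))
```

In this toy case the mean positions coincide while every individual donor-acceptor distance is 1 Å, which shows why R mp and 〈R DA 〉 generally differ and need a conversion function.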
FRET positioning and Screening (FPS). FPS is done in four steps, and its flow is based on the recent work by Kalinin et al. 10 . For our experimental design, which uses the available PDB structures of T4L with respect to our FRET data, FPS calculates the donor and acceptor accessible volumes for each donor-acceptor labeling scheme. We then compute an error function for each conformational state C (i) , where N = 33 is the total number of FRET distances (〈R DA 〉) and $\Delta R_{DA,\mathrm{tot}}^{(i)}$ is the overall theoretically computed absolute uncertainty (see next section). In order to compare the structural models currently available in the PDB to our experimental results, we clustered all PDB models using the RMSD (Root Mean Squared Deviation) of Cα atoms as the similarity measure. Clustering allowed us to sort all PDB models into three distinct groups based on the similarity of their backbone shapes. We found that the structural models of T4L group into open, ajar, and closed clusters (based on the proximity of the CTsD and NTsD) with an intra-cluster RMSD of less than 1.8 Å. Representative structures of these clusters are given by PDB IDs 172L, 1JQU, and 148L for the open, ajar, and closed conformations, respectively (Fig. 5a). The clustering was done using the agglomerative hierarchical complete-linkage clustering of the "fastcluster" 76 software.
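A screening step of this kind can be sketched as follows. Because the exact error function is not reproduced here, the code assumes a reduced χ 2 with symmetric uncertainties, normalized by the number of distances N; this form, the function names, and all numbers are assumptions made for illustration only.

```python
def reduced_chi2(r_exp, r_model, delta_r):
    """Assumed error function: (1/N) * sum(((R_exp - R_model) / Delta_R)^2)."""
    if not (len(r_exp) == len(r_model) == len(delta_r)):
        raise ValueError("all inputs must have the same length")
    n = len(r_exp)
    return sum(((e - m) / s) ** 2 for e, m, s in zip(r_exp, r_model, delta_r)) / n


def rank_structures(r_exp, delta_r, candidates):
    """Return (chi2, name) pairs sorted from best to worst agreement.

    `candidates` maps a structure label to its list of modeled distances.
    """
    scored = [(reduced_chi2(r_exp, r_model, delta_r), name)
              for name, r_model in candidates.items()]
    return sorted(scored)


if __name__ == "__main__":
    # Placeholder distances in Angstrom; the real analysis uses N = 33 pairs.
    r_exp = [40.0, 50.0]
    delta_r = [2.0, 2.0]
    candidates = {"model_a": [42.0, 46.0], "model_b": [40.0, 51.0]}
    for chi2, name in rank_structures(r_exp, delta_r, candidates):
        print(name, chi2)
```

Structures with the lowest value agree best with the measured distance set, so the sorted output can be read directly as a ranking of candidate models.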
In Supplementary Table 4 we provide the complete breakdown of the three clusters. In Supplementary Fig. 6 we display the complete FRET-screening of the three clusters.
In the following, we describe the computation of the four individual contributions expressed as absolute and relative distance uncertainties, ΔR and δR, respectively.
(1) Dye model. The relative distance error δR dye model usually considers the asymmetry of the AVs for random labeling of two equivalent labeling sites (in general two cysteines) with δR dye model ≈ 1.5 % 47 . However, in this work we labeled T4L specifically (one cysteine, one p-acetylphenylalanine) so that ΔR dye model = 0.
(2) Uncertainty of the Förster Radius R 0 . The relative distance error $\delta R_{R_0}\left(r(t)_{dye}\right)$ considers the uncertainty of the Förster radius R 0 that is usually dominated by κ 2 errors related to the mutual orientation of donor and acceptor. First, we use the experimental anisotropy decays r(t) dye recorded by eTCSPC and MFD and the wobbling in a cone (WIC) model to compute the possible distribution of orientation factors, p(κ 2 ) 38,77 . As input we determined the residual anisotropies of the donor fluorescence r 3,D (Supplementary Table 3a), the directly excited acceptor fluorescence r 3,A (Supplementary Table 3b) and the FRET-sensitized acceptor fluorescence r 2,A(D) (Supplementary Table 3c). Based on these limits, we computed the distribution of the orientation factor p(κ 2 ) (Supplementary Table 3d) for each FRET pair as described in Sindbert et al. 38 . Next, we compute how p(κ 2 ) affects the experimentally observed interdye distance. Following the approach of Sindbert et al. 38 , we can assume that a DA pair is characterized by a single "true" DA distance R DA with κ 2 ≠ 2/3. As we calculate the DA distance assuming κ 2 = 2/3, we only recover an apparent DA distance, R app , which differs from R DA according to $R_{app}/R_{DA} = \left(\tfrac{3}{2}\,\kappa^{2}\right)^{-1/6}$. For a single R DA , a distribution of κ 2 therefore translates into a distribution of apparent distances R app . For each FRET pair the distribution p(κ 2 ) compiled in Supplementary Table 3d is transformed to a distribution of the relative distances R app / R DA = ξ. The standard deviation of the distribution p(ξ) is used as a relative approximate for the precision of the distance R DA .
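The transformation from p(κ 2 ) to p(ξ) can be sketched numerically. The code below takes a discretized p(κ 2 ) as input and does not implement the WIC model itself; the function names and the example histogram are hypothetical.

```python
import math


def apparent_distance_ratio(kappa2: float) -> float:
    """xi = R_app / R_DA = (3/2 * kappa^2)^(-1/6), valid for kappa^2 > 0."""
    if kappa2 <= 0.0:
        raise ValueError("kappa^2 must be positive")
    return (1.5 * kappa2) ** (-1.0 / 6.0)


def xi_mean_and_std(kappa2_values, probabilities):
    """Weighted mean and standard deviation of xi for a discretized p(kappa^2)."""
    total = sum(probabilities)
    xi = [apparent_distance_ratio(k) for k in kappa2_values]
    mean = sum(p * x for p, x in zip(probabilities, xi)) / total
    var = sum(p * (x - mean) ** 2 for p, x in zip(probabilities, xi)) / total
    return mean, math.sqrt(var)


if __name__ == "__main__":
    # Hypothetical three-bin histogram of kappa^2 (illustrative only).
    kappa2_bins = [0.4, 2.0 / 3.0, 1.0]
    weights = [0.25, 0.5, 0.25]
    mean_xi, std_xi = xi_mean_and_std(kappa2_bins, weights)
    print("accuracy estimate  <xi>   =", mean_xi)
    print("precision estimate std(xi) =", std_xi)
```

Because ξ depends on κ 2 only through a sixth root, even a fairly broad p(κ 2 ) maps onto a narrow p(ξ).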
The expectation value of p(ξ) can be used as an estimate of the accuracy. For the WIC model with the given residual anisotropies, the precision $\delta R_{DA,\mathrm{precision}}(\kappa^{2})$ dominates the relative uncertainty. Estimates for $\delta R_{DA,\mathrm{accuracy}}(\kappa^{2})$ are very close to one. Hence, we estimate the overall uncertainty attributed to κ 2 by the precision term.

(3) Uncertainty of the D-only reference. The absolute uncertainty of the donor-only reference ΔR reference considers the discrepancy between the photophysical properties of the donor in the FRET sample and the properties determined from the donor-only sample. This discrepancy is typically caused by unspecific labeling of the biomolecule and thus an unknown fraction of donor-acceptor molecules with respect to acceptor-donor molecules (see also Peulen et al. 47 , Fig. 12). In this work, ΔR reference was set to zero, since specific labeling was used, the donor position is known exactly, and an accurate donor-only sample was measured (Supplementary

For example, considering a 1σ-confidence interval, the fraction x 3 of R

Controls for the FRET analysis. Since the problems inherent in the use of smFRET studies are connected with complexities related to the labels, we performed ten controls to check for any potential label artifacts. Please refer to the