Transition path times of coupled folding and binding reveal the formation of an encounter complex

The association of biomolecules is the elementary event of communication in biology. Most mechanistic information of how the interactions between binding partners form or break is, however, hidden in the transition paths, the very short parts of the molecular trajectories from the encounter of the two molecules to the formation of a stable complex. Here we use single-molecule spectroscopy to measure the transition path times for the association of two intrinsically disordered proteins that form a folded dimer upon binding. The results reveal the formation of a metastable encounter complex that is electrostatically favored and transits to the final bound state within tens of microseconds. Such measurements thus open a new window into the microscopic events governing biomolecular interactions.

Interactions between biomolecules are at the heart of all processes involving cellular signaling and communication. The mechanisms of how such interactions form are thus essential for both a fundamental understanding of these processes and targeted therapeutic intervention. Most of the mechanistic information is, however, contained in the exceedingly short parts of the reaction trajectories that start when the two molecules first encounter each other via translational diffusion and end with the formation of the stably bound complex. The transition states and intermediates visited on these transition paths are high in free energy and correspondingly unstable. Observing them experimentally has thus been challenging, and only in a few cases has it been possible to obtain glimpses of their structural or dynamic properties [1][2][3][4][5][6][7]. The experimental challenge is analogous to the one in protein folding: in both cases, the kinetics can often be approximated by a simple two-state reaction, where instantaneous transitions connect the initial and final states, but the most interesting information is hidden within the microscopic paths underlying these transitions 8,9. Recent developments in single-molecule spectroscopy have started to reveal this information for transition paths in protein folding [10][11][12][13]. Here we show how these advances enable new ways of probing the transition paths of protein binding.
We investigate the association between the nuclear-coactivator binding domain (NCBD) of the CBP/p300 transcription factor and the activation domain of SRC-3 (ACTR), two members of the broad spectrum of intrinsically disordered proteins (IDPs), proteins that lack stable tertiary structure in isolation 14. The interaction between ACTR and NCBD is a paradigm of coupled folding and binding 15,16, a mechanism that is frequently observed for IDPs. NCBD, a marginally stable, molten-globule-like IDP with pronounced helical content even in the unbound state 17, and the largely unstructured ACTR 15 bind to each other with nanomolar affinity and form a cooperatively folded heterodimer 15. We monitor their interaction using single-molecule Förster resonance energy transfer (FRET) 18 by labeling ACTR with a donor fluorophore, immobilizing it on a surface, and adding acceptor-labeled NCBD to the solution (Fig. 1a). Upon excitation, unbound ACTR emits only donor photons; on binding of an NCBD molecule, energy transfer results in a decrease in donor emission and an increase in acceptor emission (Fig. 1c). The signal change during the transition is recorded by confocal single-photon counting at high count rates (on average 200 ms⁻¹) to be able to probe microsecond timescales. This high time resolution allows us to measure binding transition path times and their distribution, which reveal the presence of an encounter complex with a lifetime of ~80 µs. The formation of this transient intermediate, where the molecules have associated but not yet folded, is favored by electrostatic interactions, in contrast to the final folding step. Such measurements thus create new opportunities for deciphering the mechanisms of protein binding.

Results
Measuring average transition path times. We collected time traces of ACTR/NCBD binding events with the following approach (Fig. 1c, Supplementary Fig. 1 and 2): For each immobilized ACTR molecule, multiple association and dissociation events were first monitored at low excitation rate to ensure that the observed molecules are binding-competent, and to exclude a contribution from nonspecific surface interactions. Starting at a time when no NCBD was bound, the laser power was increased to the highest possible intensity that still allowed the observation of binding transitions with high photon rates before photobleaching occurred (Supplementary Fig. 1, see Methods). Dissociation events were not included since they exhibit the same change in observed FRET efficiency as acceptor bleaching. Because of microscopic reversibility, however, the statistics of transition path times for dissociation should be identical to those for binding. Examples of recorded transitions are shown in Fig. 1c and in Supplementary Fig. 2. Despite the relatively high photon count rates, it is difficult to identify the start and end of a transition by visual inspection. For this reason, the photon arrival times were analyzed on a photon-by-photon level with the maximum-likelihood approach developed by Gopich and Szabo, which was previously applied to protein folding by Chung and Eaton 10,19,20.
Fig. 1 Observing coupled folding and binding by single-molecule FRET. a ACTR labeled with Cy3B as the donor fluorophore (green) is immobilized on a polyethyleneglycol (PEG)-coated quartz cover slide via a biotin–avidin–biotin linkage and excited by a laser beam. NCBD free in solution is labeled with CF660R as the acceptor fluorophore (red). Cartoon based on PDB entries 1KBH 15 and 2KKJ 17. b Schematic free-energy landscape of the reaction with a molecular trajectory of a binding transition. The molecules are in the unbound or bound states most of the time and only rarely cross the energy barrier. A binding transition path extends from the time when $x_0$ is crossed until $x_1$ is reached without returning to $x_0$. c Examples of measured fluorescence time traces of binding events (ionic strength 108 mM, viscosity 1.28 cP), represented as binned (top) and single-photon data (bottom). The molecules start in the unbound state, indicated by high donor intensity (green) and low acceptor intensity (red). When NCBD binds to ACTR, the FRET efficiency increases in a rapid jump, corresponding to the transition path, with a concomitant increase in acceptor emission and a decrease in donor emission. In the single-photon representation of the transition-path region, donor photons are shown as green lines and acceptor photons as red lines. Segments of the trajectories identified as populating the transition path by the Viterbi algorithm are indicated by the gray shading in the single-photon time traces.
In this analysis, the likelihood of two kinetic models is compared (Fig. 2a): a two-state model and a model with an intermediate state that mimics the transient species populated during the transition 10-12 (see Methods). By maximizing the log-likelihood difference between the models, $\Delta\ln L$, with respect to the mean lifetime, $\tau_I$, and the transfer efficiency, $E_I$, of the intermediate state (Fig. 2b), we obtained the most likely values, $\hat{\tau}_I = 80 \pm 8$ µs and $\hat{E}_I = 0.72 \pm 0.02$, respectively, from the analysis of all 686 measured transitions (unless stated otherwise, ± indicates the standard error; here from 1000 bootstrapping trials). We tested the robustness of this result with two controls. To assess the influence of surface attachment, we immobilized NCBD instead of ACTR. Using Cy3B-labeled NCBD on the surface and CF660R-labeled ACTR in solution (see Supplementary Fig. 3A), we obtained $\langle t_\mathrm{TP}\rangle = 95 \pm 22$ µs and $\hat{E}_I = 0.66 \pm 0.05$, within experimental error of the above result. To assess the influence of the dyes, we measured immobilized Alexa488-labeled ACTR with CF660R-labeled NCBD in solution (see Supplementary Fig. 3B) and found $\langle t_\mathrm{TP}\rangle = 101 \pm 11$ µs, similar to the other dye pair; $\hat{E}_I = 0.52 \pm 0.03$ was lower than with Cy3B, as expected from the shorter Förster radius of Alexa488/CF660R (4.7 nm) compared to Cy3B/CF660R (6.0 nm).
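The bootstrap standard errors quoted above can be illustrated with a minimal sketch. It is not the authors' code: in the analysis described here, each bootstrap trial would re-maximize the likelihood over a resampled set of transitions, whereas this sketch uses the sample mean as a stand-in estimator. The function name `bootstrap_standard_error` and the synthetic input values are assumptions made for illustration only.

```python
import random
import statistics


def bootstrap_standard_error(values, estimator, n_trials=1000, seed=0):
    """Standard error of `estimator`, from resampling `values` with replacement."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_trials):
        # Draw n items with replacement and re-evaluate the estimator.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    # The spread of the resampled estimates approximates the standard error.
    return statistics.stdev(estimates)


# Synthetic stand-in data (NOT measured values): 686 exponentially
# distributed durations with a mean of 80 microseconds.
_rng = random.Random(1)
durations_us = [_rng.expovariate(1 / 80.0) for _ in range(686)]

se = bootstrap_standard_error(durations_us, statistics.mean)
print(f"mean = {statistics.mean(durations_us):.1f} us, bootstrap SE = {se:.1f} us")
```

Any estimator that maps a list of per-transition data to a single number can be passed in place of `statistics.mean`.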
The relatively high value of $\hat{E}_I$ indicates that the labeled C-terminal segments of ACTR and NCBD are in proximity already during the transition. The value of $\hat{\tau}_I$ can be interpreted as the average duration of the transition paths, $\langle t_\mathrm{TP}\rangle$ [10][11][12]. Remarkably, $\langle t_\mathrm{TP}\rangle$ is very long compared to the transition path times of the folding of monomeric proteins previously observed in simulations 21 and most experiments [10][11][12]. The two most likely reasons for this surprisingly slow passage to the bound state are: (i) slow diffusion on the free-energy surface caused by internal friction arising from non-native interactions 12,13,22,23, or (ii) a local free-energy minimum corresponding to a high-energy intermediate or encounter complex [24][25][26], where ACTR and NCBD are already in contact but have not yet found their final, stably bound structure.
Fig. 2 […] Table 3). d Solvent viscosity dependence of $\langle t_\mathrm{TP}\rangle$ (with linear fit, constrained to $\langle t_\mathrm{TP}\rangle \geq 0$ at zero viscosity, and 90% confidence interval) and $\hat{E}_I$ (average: solid line; standard deviation: dashed lines). Error bars indicate standard errors obtained from 1000 bootstrapping trials.
Internal friction versus high-energy intermediate. To examine the first possibility, we measured the dependence of $\langle t_\mathrm{TP}\rangle$ on solvent viscosity, since a viscosity-independent component in the dynamics is a common fingerprint of internal friction 12,22,23. The pronounced increase of $\langle t_\mathrm{TP}\rangle$ with viscosity and the absence of a significant intercept at zero viscosity (Fig. 2c, d) suggest that internal friction does not substantially contribute to the transition-path dynamics. We note that $\hat{E}_I$ (and thus the average inter-dye distance during the transition) exhibits no systematic change with viscosity, indicating that the addition of glycerol does not alter the conformational ensemble, attesting to the robustness of the analysis.
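As an illustration of how a viscosity dependence can be tested for a zero-viscosity intercept, the following sketch fits a straight line with the intercept constrained to be non-negative. It assumes an unweighted least-squares fit; whether the fit in this work was weighted by the measurement errors is not stated here. The function name `fit_line_nonneg_intercept` and all numbers in the example are hypothetical and are not the measured data.

```python
def fit_line_nonneg_intercept(x, y):
    """Least-squares line y = a + b*x subject to a >= 0.

    The problem is convex, so the constrained optimum is either the
    ordinary least-squares solution (if its intercept is non-negative)
    or the best line through the origin (constraint active, a = 0).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if intercept < 0:
        # Constraint is active: refit with the intercept fixed at zero.
        slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
        intercept = 0.0
    return intercept, slope


# Hypothetical illustration values only (not the measured data).
viscosity_cP = [1.0, 2.0, 3.0, 4.0]
t_tp_us = [62.0, 128.0, 185.0, 251.0]

intercept, slope = fit_line_nonneg_intercept(viscosity_cP, t_tp_us)
print(f"intercept = {intercept:.2f} us, slope = {slope:.2f} us/cP")
```

An intercept that is small compared with its uncertainty corresponds to the absence of a viscosity-independent component, which is the criterion used in the text to argue against internal friction.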
To probe the second possibility, the presence of a high-energy intermediate, as the cause for the long $\langle t_\mathrm{TP}\rangle$, we take into account not only the average, but the distribution of transition path times, which is sensitive to the shape of the free-energy barrier 27,28. The relatively large number of photons detected during the binding transitions enabled us to calculate $\Delta\ln L_j$ for every transition $j$ individually (orange curves in Fig. 2b) and identify the most likely transition path time, $\hat{\tau}_{I,j}$, for each. The resulting distribution of $\hat{\tau}_{I,j}$ (grey histograms in Fig. 3d) was then compared to the distributions expected for different barrier shapes. We tested barriers ranging from an inverted harmonic potential (representing a simple transition state) to a flat barrier top and a high-energy intermediate with different stabilities (Fig. 3a, b) and calculated the corresponding $t_\mathrm{TP}$ distributions for each potential by numerically solving the Smoluchowski equation 28 (Fig. 3c). The narrowest $t_\mathrm{TP}$ distributions are those for a simple transition state, and they broaden with increasing stability of the intermediate. When the stability of the intermediate approaches a few $k_\mathrm{B}T$, the transition path time becomes essentially equivalent to the lifetime of the intermediate, which, according to classical kinetics, should be exponentially distributed. Indeed, this is the behavior observed (Fig. 3c). To fully account for photon statistics, we simulated photon time traces with $t_\mathrm{TP}$ sampled from these distributions, analyzed them in the same way as the measured data, and compared the resulting distributions of $\hat{\tau}_{I,j}$ to the experiment (Fig. 3d) by calculating the $\chi^2$-distance between them (Fig. 3e). The validity of this analysis was tested based on synthetic data (see Supplementary Fig. 4 and Methods for details).
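The histogram comparison described above can be sketched as follows. This is a simplified illustration, not the analysis pipeline of this work: it compares raw samples drawn from two candidate distributions rather than values recovered from simulated photon time traces. The symmetric form of the $\chi^2$-distance used below is one common definition and is an assumption, as are the logarithmic binning, the function names, and the gamma distribution chosen to mimic a narrow distribution.

```python
import math
import random


def log_histogram(samples, lo, hi, n_bins):
    """Normalized histogram on logarithmically spaced bins in [lo, hi)."""
    counts = [0] * n_bins
    width = (math.log(hi) - math.log(lo)) / n_bins
    for s in samples:
        if lo <= s < hi:
            idx = min(int((math.log(s) - math.log(lo)) / width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


def chi2_distance(p, q):
    """Symmetric chi-square distance between two normalized histograms
    (assumed definition: 0.5 * sum (p - q)^2 / (p + q) over occupied bins)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


rng = random.Random(0)
n = 686  # same number of samples as transitions in the text

# Synthetic stand-ins: exponential lifetimes with mean 80 us (deep
# intermediate) versus a narrow distribution with the same mean.
reference = [rng.expovariate(1 / 80.0) for _ in range(n)]
exponential = [rng.expovariate(1 / 80.0) for _ in range(n)]
narrow = [rng.gammavariate(10.0, 8.0) for _ in range(n)]

bins = (1.0, 1000.0, 20)  # from 1 us to 1 ms, 20 logarithmic bins
h_ref = log_histogram(reference, *bins)
d_exp = chi2_distance(h_ref, log_histogram(exponential, *bins))
d_narrow = chi2_distance(h_ref, log_histogram(narrow, *bins))
print(f"exponential model: {d_exp:.3f}, narrow model: {d_narrow:.3f}")
```

A smaller distance indicates better agreement between a candidate distribution and the reference histogram.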
At all solvent viscosities, the best agreement between simulation and measurement is achieved with a stability of the intermediate of at least −5 $k_\mathrm{B}T$ (Fig. 3d, e, Supplementary Fig. 5), where the $t_\mathrm{TP}$ distribution is, within uncertainty, indistinguishable from an exponential distribution. This observation, together with the long $\langle t_\mathrm{TP}\rangle$ and high $\hat{E}_I$ (Fig. 2), indicates the presence of an encounter complex where ACTR and NCBD are associated, but still separated by a free-energy barrier from the stably bound and folded state. Notably, the value of at least 5 $k_\mathrm{B}T$ for the barrier height of escape from the encounter complex is consistent with an independent estimate based on Kramers theory 29: From the reconfiguration time of ACTR, $\tau_r \approx 75$ ns, which has recently been measured 30, we estimate the preexponential factor to be $\tau_0 \approx 2\pi\tau_r \approx 0.5$ µs 31. Assuming two symmetrical barriers and using $\hat{\tau}_I = 80$ µs as the escape time, we obtain a barrier height of $\ln(2\hat{\tau}_I/\tau_0)\,k_\mathrm{B}T \approx 5.8\,k_\mathrm{B}T$. (We note that our measurements do not allow us to determine the rate coefficients for the transitions from the intermediate to bound and the unbound states separately but only their sum 10. The largest asymmetry of the barriers bounding the intermediate and still compatible with our results would be ~11 $k_\mathrm{B}T$ versus ~5 $k_\mathrm{B}T$ (see Maximum likelihood analysis of binding transitions in Methods).) The shape of the observed $t_\mathrm{TP}$ distributions can thus be explained by a localized intermediate. We note that roughness of the energy landscape along the reaction coordinate cannot account for our findings, although both scenarios would lead to a longer mean transition path time 12. This is because introducing roughness, crudely speaking, amounts to lowering the effective diffusion coefficient along the reaction coordinate 32. Indeed, numerical experiments where we introduced sinusoidal roughness of different amplitudes and calculated the resulting $t_\mathrm{TP}$ distributions confirm that the roughness slows down the mean transition path time but has almost no effect on the shape of its distribution (see Supplementary Fig. 6).
Fig. 3 […] c Probability density (PD) functions of transition path times, $t_\mathrm{TP}$, calculated for the barrier shapes in (b), and an exponential $t_\mathrm{TP}$ distribution (dashed). d Histogram of $\hat{\tau}_{I,j}$ for the measurement at 108 mM ionic strength without glycerol (gray), compared to histograms from simulated photon time traces based on different $t_\mathrm{TP}$ distributions (colored, see legend). The smaller peaks in the sub-microsecond range are mostly due to transitions that were too fast to be accurately determined at the photon count rates of the corresponding time traces (see Methods and Supplementary Fig. 7). e $\chi^2$-distances between measured and simulated histograms for the measurements at different solvent viscosities, $\eta$ (standard errors from 27 simulations each).
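The Kramers-type estimate of the barrier height quoted in the text can be reproduced with a few lines of arithmetic. The sketch below uses only the values stated there ($\tau_r \approx 75$ ns, $\hat{\tau}_I = 80$ µs, two symmetrical barriers); the variable names are chosen for illustration.

```python
import math

tau_r = 75e-9   # reconfiguration time of ACTR in seconds (from the text)
tau_I = 80e-6   # lifetime of the intermediate in seconds (from the text)

# Pre-exponential factor, tau_0 ~ 2*pi*tau_r (about 0.5 microseconds).
tau_0 = 2 * math.pi * tau_r

# With two symmetrical barriers, escape occurs over either one, so the
# crossing time for a single barrier is 2 * tau_I.
barrier_kT = math.log(2 * tau_I / tau_0)

print(f"tau_0 = {tau_0 * 1e6:.2f} us, barrier = {barrier_kT:.1f} k_B T")
```

Using the rounded value $\tau_0 = 0.5$ µs instead of $2\pi\tau_r$ changes the result only in the second decimal place, so both routes give approximately 5.8 $k_\mathrm{B}T$.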
Ionic-strength dependence of transition path times. To further investigate the nature of the encounter complex, in particular the role of electrostatics, we measured $\langle t_\mathrm{TP}\rangle$, $\hat{E}_I$, and the association and dissociation rate coefficients, $k_\mathrm{on}$ and $k_\mathrm{off}$, respectively, as a function of ionic strength. $k_\mathrm{on}$ decreases about fivefold when the ionic strength is raised from 50 to 400 mM, while $k_\mathrm{off}$ increases only about twofold (Fig. 4a, b), which is consistent with previous kinetic measurements 33 and in accord with the opposite net charge of ACTR and NCBD [34][35][36]. The encounter complex, however, behaves very differently: neither $\langle t_\mathrm{TP}\rangle$ nor $\hat{E}_I$ changes significantly with increasing ionic strength (Fig. 4c, d). The strong ionic-strength dependence of $k_\mathrm{on}$ suggests that electrostatic interactions are formed already early during the transition, whereas the weak dependence of $\langle t_\mathrm{TP}\rangle$ indicates that the subsequent formation of the folded state, which involves packing of the hydrophobic core, is less electrostatically driven.

Discussion
In summary, the distribution of transition path times we have measured for the association of two IDPs reveals the presence of an encounter complex. Owing to this transient intermediate, it takes on average ~80 µs from the diffusional encounter of the two binding partners to the formation of the stably folded complex, much longer than the transition path time expected for the folding of a monomeric protein of similar size 10,11,21. Neither pronounced internal friction nor "roughness" of the free-energy surface, which have been shown to slow down some protein folding reactions involving non-native interactions or misfolding 12,13,22,23,37, is likely to be the cause of the long transition path times, as indicated by the strong solvent viscosity dependence of the transition path times we observe (Fig. 2d) and by simulations of the effect of energetic roughness on transition path time distributions (Supplementary Fig. 6).
The most likely mechanism for the coupled folding and binding of ACTR and NCBD is thus the initial formation of a transient encounter complex that is stabilized by the electrostatic interactions between the two oppositely charged IDPs 33, followed by folding (Fig. 4e). The barrier to encounter complex formation is very low, as reflected by an association rate coefficient that is only about an order of magnitude below the diffusion-limited value expected for a barrierless binding reaction 26,33,38,39. Although the stability of the encounter complex slows the overall transition path time compared to simple monomeric folding reactions, it still proceeds remarkably quickly. The search process may be facilitated by relatively specific initial contacts between the N-terminal helices of ACTR and NCBD that provide noncovalent connectivity and reduce conformational freedom 36, as suggested by ϕ-value analysis 5, an increase in association rates with helicity 40, and simulations 41. However, according to the ϕ-value analysis 5, these interactions are highly localized, and the extended hydrophobic interface between ACTR and NCBD is largely non-native near the transition state. This result implies that the electrostatic interactions favoring association 33 (Fig. 4a) are also predominantly non-native. However, if they interchange rapidly compared to the interconversion time to the native state and the net gain of electrostatic interactions upon folding is small, no pronounced dependence of $\langle t_\mathrm{TP}\rangle$ on salt concentration is expected, in agreement with our observations (Fig. 4c).
In spite of the small size of ACTR and NCBD, their kinetics of folding and binding have been observed to be remarkably complex. Stopped-flow measurements revealed multistate kinetics on timescales of milliseconds to seconds 42, and recent single-molecule experiments identified a contribution due to peptidyl-prolyl cis/trans isomerization in the range of tens of seconds 39. The timescale of 80 µs we observed here for the transition path time is similar to a fast kinetic phase observed during binding of ACTR to NCBD in temperature-jump experiments 43, which was attributed to the conformational exchange within NCBD previously identified by NMR 44. The lifetime of the encounter complex might thus be linked to the internal dynamics of the molten-globule-like NCBD.
How do our observations for ACTR/NCBD relate to the behavior in other coupled folding-and-binding reactions of IDPs? The recent increase in kinetic investigations of IDP interactions has made it clear that the underlying mechanisms are diverse and difficult to generalize 35,45. However, ACTR and NCBD do not seem to be an unusual case. With the high abundance of charged amino acids in IDPs 46, a pronounced role of electrostatics is commonly observed, especially for association rates 35,45. An initial binding event that precedes folding also seems to be a common scenario 45, often referred to as "induced fit" 47. However, even in cases where observations such as nonlinearities of concentration-dependent kinetics may indicate the presence of an encounter complex, characterizing its structural and dynamic properties has been more difficult, with NMR providing the most detailed insights so far 1,2. Since the equilibrium populations of transient intermediates along the path of coupled folding and binding are typically low, detecting them with ensemble kinetics, such as temperature-jump experiments 48, is challenging. Since much of the interesting mechanistic information is contained in the transition paths 13,20,49,50, probing them by single-molecule spectroscopy provides an opportunity for revealing the mechanisms of protein binding and complementing kinetic and structural information from other methods. Next steps for advancing this approach will be to combine it with multiple labeling positions 38 or three-color FRET 51 to map the structure and dynamics of encounter complexes in protein interactions in more detail.

Methods
Protein expression. ACTR-Avi: The coding sequence of a single-cysteine ACTR variant was cloned via BamHI/HindIII into a pAT222-pD expression vector (gift from J. Schöppe and A. Plückthun) 52, yielding a protein construct with an N-terminal Avi-tag and a thrombin-cleavable C-terminal His₆-tag (sequence of the cleaved construct: MAGLNDIFEA QKIEWHEGSM GSGSGTQNRP LLRNSLDDLV GPPSNLEGQS DERALLDQLH TLLSNTDATG LEEIDRALGI PELVNQGQAL EPKQDCGGPR). pBirAcm (Avidity, Aurora CO, USA) was cotransfected for in vivo biotinylation of Lys12 in the Avi-tag 53, and expression was carried out in Escherichia coli C41(DE3) (Merck). Cells were grown at 37 °C in TYH medium (for 1 l: 20 g tryptone, 10 g yeast extract, 11 g HEPES, 5 g NaCl, 1 g MgSO₄, pH 7.3), supplied with 0.5% (w/v) glucose, until they reached an OD₆₀₀ of 0.8. Then, 50 µM biotin in 10 mM bicine buffer (pH 8.3) and 1 mM IPTG were added to the culture. Expression continued for 3 h at 37 °C, after which cells were harvested by centrifugation. The harvested cells were lysed by sonication, and the His₆-tagged protein was enriched via immobilized metal ion affinity chromatography (IMAC) on Ni-IDA resin (ABT). The His₆-tag was then cleaved off with thrombin (Serva Electrophoresis) and separated from the protein by another round of IMAC. Finally, biotinylated protein was separated from impurities and nonbiotinylated protein via reversed-phase HPLC (RP-HPLC) on a C18 column (Reprosil Gold 200, Dr. Maisch, Germany) with a H₂O/0.1% trifluoroacetic acid–acetonitrile gradient. The purified protein was lyophilized, resuspended in buffer, and stored at −80 °C until use.
ACTR: ACTR containing a C-terminal cysteine was also inserted into the pAT222-pD expression vector containing an HRV 3C-cleavable N-terminal Avi-tag as well as a thrombin-cleavable C-terminal His₆-tag (sequence of the cleaved construct: GPSGTQNRPL LRNSLDDLVG PPSNLEGQSD ERALLDQLHT LLSNTDATGL EEIDRALGIP ELVNQGQALE PKQDCGGPR). ACTR was expressed like the other variants containing an N-terminal Avi-tag. After enrichment of the His₆-tag-containing protein by IMAC, the C-terminal His₆-tag was cleaved off, followed by a second round of IMAC. To obtain the fully cleaved protein, the Avi-tag was cleaved off by HRV 3C protease. The RP-HPLC purification was carried out as described above for two consecutive rounds.
NCBD: A construct with a single-cysteine residue and proline residues 20 and 23 replaced by alanine (to suppress kinetic heterogeneity due to peptidyl-prolyl cis/trans isomerization 39) was generated by site-directed mutagenesis (primers used (Microsynth): NCBD_P20A_P23A_fw: GCA TCT TCA GCG CAA CAG CAA CAG CAA GTT CTT AAC; NCBD_P20A_P23A_rev: GCT GTT GCG CTG AAG ATG CCG ATT TCA GCG TCC). Furthermore, the expression construct contained an N-terminal His₆-tag cleavable with HRV 3C protease (sequence of the cleaved construct: GPNRSISPSA LQDLLRTLKS ASSAQQQQQV LNILKSNPQL MAAFIKQRTA KYVANQPGMQ C). NCBD was coexpressed 54 with ACTR from a pET-47b(+) vector. Cell lysis and protein enrichment via IMAC were carried out as described above, followed by enzymatic cleavage of the His₆-tag with HRV 3C protease and separation of the tag from the proteins via another round of IMAC. Finally, ACTR and NCBD were separated with RP-HPLC as described above.
NCBD-Avi: The NCBD construct containing a single cysteine at the C-terminus as well as an N-terminal Avi-tag and a C-terminal cleavable His₆-tag was cloned, expressed and purified analogously to the ACTR-Avi variant (sequence of the cleaved construct: AGLNDIFEAQ KIEWHEGSMG SGSSPNRSIS PSALQDLLRT LKSASSAQQQ QQVLNILKSN PQLMAAFIKQ RTAKYVANQP GMQCGGPR). In this sequence, too, proline residues 20 and 23 were replaced by alanine to avoid kinetic heterogeneity.
Protein labeling. ACTR-Avi: Lyophilized protein was dissolved under nitrogen atmosphere to a concentration of 200 µM in 100 mM potassium phosphate buffer, pH 7.0, and was labeled for 3 h at room temperature with a 0.8-fold molar ratio of Cy3B or Alexa488 maleimide (GE Healthcare Life Sciences) to protein. Labeled protein was separated from unlabeled protein with RP-HPLC on a Sunfire C18 column (Waters) as described above.
ACTR: Lyophilized protein was dissolved to 50 µM in 50 mM sodium phosphate buffer, pH 7.0, and labeled with a 1.2-fold molar ratio of CF660R maleimide (Biotium) to protein. Labeled protein was separated from unlabeled protein with RP-HPLC on a Reprosil Gold 200 column, followed by RP-HPLC on a Sunfire C18 column.
NCBD: Lyophilized protein was dissolved to 170 µM in 100 mM potassium phosphate buffer, pH 7.0, and was labeled with a 1.2-fold molar ratio of CF660R to protein. Labeled protein was separated from unlabeled protein with RP-HPLC on a C18 column (Reprosil Gold 200), followed by RP-HPLC on a Sunfire column.
NCBD-Avi: Lyophilized protein was dissolved to 220 µM in 50 mM sodium phosphate buffer, pH 7.0, and labeled with a 1.2-fold molar ratio of Cy3B to protein. Labeled protein was separated from unlabeled protein with RP-HPLC on a Reprosil Gold 200 column.
The correct mass of all labeled proteins was confirmed by electrospray ionization mass spectrometry.
Sample preparation for surface experiments. Surface experiments were performed using quartz cover slides coated with polyethylene glycol (PEG) and covalently modified with biotin (Quartz Coverslip (1″ × 1″), Bio 01, MicroSurfaces Inc., Englewood, NJ, USA). To clean the slides before use, they were boiled in water containing 0.1% Tween 20 and sonicated for 5 min. Silicone chambers (Secure Seal Hybridization Chambers, SKU:621202, Grace Bio Labs, Bend, OR, USA) were glued to the cover slide to yield four measurement chambers per slide. Biotinylated protein was immobilized on the cover slides with a biotin−avidin−biotin bond. To accomplish this, 200 µg/ml Avidin D (Vector Labs, Burlingame CA, USA) in NaP buffer (50 mM sodium phosphate, pH 7.0, 0.01% Tween 20) was added to the well and incubated for 3 min, followed by three washing steps with NaP buffer. Biotinylated protein was immobilized at a concentration of 10 pM in NaP buffer.
Measurements were performed in NaP buffer, with H₂O replaced by D₂O (NaP/D₂O) to increase the quantum yield of the dyes 55. Compared to H₂O, the photon count rates in D₂O were increased by ~20% for Cy3B and by ~50% for CF660R. To improve the signal quality further, we employed an oxygen scavenging system consisting of 400 U/ml bovine liver catalase (Sigma), 0.4 mg/ml glucose oxidase from Aspergillus niger (Sigma), and 1% (w/v) glucose, as well as a redox system (1 mM ascorbic acid, 1 mM methyl viologen) 56. Concentrations of 20–80 nM acceptor-labeled NCBD or ACTR free in solution were used in the experiments.
To investigate the viscosity dependence, measurements were performed in NaP/D₂O buffer containing 0, 14, 32, and 45% (v/v) glycerol. The viscosity was determined with a DV-I+ 4.0 Digital-Viscometer (Brookfield, Lorch, Germany). The ionic-strength dependence was measured in NaP/D₂O buffer supplemented with 0, 50, 100, 175, 300, and 800 mM NaCl. The point at 51 mM ionic strength was measured in 20 mM sodium phosphate, 10 mM NaCl, pH 7.0, 0.01% Tween 20 (in D₂O). The pH of each solution was set to 7.0 by adjusting the ratio of monobasic to dibasic phosphate.
Instrumentation for surface experiments. Surface experiments were performed on a MicroTime 200 confocal single-molecule instrument (PicoQuant, Berlin, Germany). A continuous-wave laser at 532 nm (LBX-532-50-COL-PP, Oxxius S.A., Lannion, France) was used for excitation. The light was focused into the sample (UplanApo 60/1.20W; Olympus, Japan), and the emitted light was collected with the same objective. A triple-band mirror (zt405/530/630rpc, Chroma, USA) and a long-pass filter (532 LP Edge Basic, Chroma) were used to separate the 532-nm laser light from the emitted fluorescence. The fluorescence light was then focused through a 100 µm pinhole and split by a dichroic mirror (T 635 LPXR, Chroma) to separate donor and acceptor photons. Donor emission was filtered with a 585/65 ET bandpass filter (Chroma), acceptor emission with a RazorEdge LP 647 RU longpass filter (Chroma). Both photon streams were detected with avalanche photodiode detectors (SPCM-AQR-15, PerkinElmer, Waltham MA, USA) and photon arrival times recorded with a HydraHarp 400 event timer (PicoQuant). The temporal resolution is limited by the random jitter of the detectors (~50 ps). A function generator (33600A Series Waveform Generator, Keysight Technologies, USA) connected to the modulation input of the laser driver allowed fast (<3 ms) and automated switching of the laser intensity ( Supplementary Fig. 1). To scan the surface, the objective was mounted on a combination of two piezo-scanners, a P-733.2CL for XY-positioning and a PIFOC for Z-positioning (Physik Instrumente, Germany). To suppress oscillations of the scanner-stage, which can result in signal fluctuations, the digital notch filters were optimized for each axis.
Analysis of long photon time traces. To obtain the binding and unbinding rate coefficients, $k_\mathrm{on}$ and $k_\mathrm{off}$, as well as the transfer efficiencies of the unbound and bound states, $E_U$ and $E_B$, long photon time traces of surface-immobilized proteins were acquired at a laser power of 0.5 µW (measured at the back aperture of the objective). Time traces were inspected to ensure that no substantial brightness variations were occurring (e.g. caused by a drift of the molecule's position, long-lived dark states, or background fluctuations). Suitable traces were analyzed until photobleaching. Single-step photobleaching indicated that only one immobilized molecule was present in the confocal volume.
The pseudo-first-order association rate coefficient, $k'_\mathrm{on} = k_\mathrm{on} \cdot c_\mathrm{NCBD}$, the dissociation rate coefficient, $k_\mathrm{off}$, and the photon rates were determined using the maximum likelihood approach introduced by Gopich and Szabo 19. $k_\mathrm{on}$ is the second-order association rate coefficient, and $c_\mathrm{NCBD}$ is the concentration of NCBD free in solution (for Supplementary Fig. 3A, where ACTR is free in solution, this would be $c_\mathrm{ACTR}$). The likelihood of time trace $j$ is calculated from the general equation
$$L_j = \mathbf{p}_\mathrm{fin}^T \prod_{i=1}^{N_j} \left[ \mathbf{n}_{c_i,j} \exp\!\left( \left( \mathbf{K} - \mathbf{n}_{D,j} - \mathbf{n}_{A,j} \right) \tau_i \right) \right] \mathbf{p}_\mathrm{ini}, \qquad (1)$$
where $N_j$ is the total number of photons in the time trace; $c_i$ is the color of the $i$th photon (D or A); $\tau_{i=1} = 0$, and $\tau_{i>1}$ is the inter-photon time, i.e. the time interval between the detection of the $(i-1)$th and $i$th photon. The factors of the product are ordered such that later photons stand to the left. $\mathbf{K}$ is the rate matrix describing the association–dissociation dynamics. We include an additional dark state accounting for fluorophore blinking in the low-FRET unbound state, which is populated and depopulated with rate coefficients $k_{+b}$ and $k_{-b}$, respectively. Blinking also occurs in the high-FRET bound state but does not need to be included in the model, as it is not misrecognized as a transition. With the states ordered as unbound, bound, and dark, $\mathbf{K}$ is given by
$$\mathbf{K} = \begin{pmatrix} -(k'_\mathrm{on} + k_{+b}) & k_\mathrm{off} & k_{-b} \\ k'_\mathrm{on} & -k_\mathrm{off} & 0 \\ k_{+b} & 0 & -k_{-b} \end{pmatrix}. \qquad (2)$$
$k'_\mathrm{on}$ and $k_\mathrm{off}$ are the rate coefficients of association and dissociation observed for immobilized ACTR at a given bulk concentration of NCBD. $\mathbf{n}_{D,j}$ and $\mathbf{n}_{A,j}$ in Eq. (1) are diagonal matrices with the observed donor photon rates ($n^U_{D,j}$, $n^B_{D,j}$, $n^\mathrm{dark}_{D,j}$) and the acceptor photon rates ($n^U_{A,j}$, $n^B_{A,j}$, $n^\mathrm{dark}_{A,j}$) of the three states on the diagonal, respectively. The photon rates vary slightly from time trace to time trace, mainly because the immobilized molecules are placed at slightly different positions inside the laser focus. $\mathbf{p}_\mathrm{fin}^T = (1, 1, 1)$ is the transposed unity vector. The vector $\mathbf{p}_\mathrm{ini}$ contains the populations at the start of the measurement.
For the analysis of long time traces, we assume $\mathbf{p}_\mathrm{ini} = \mathbf{p}_\mathrm{eq}$, the equilibrium population of the three states, which is obtained from $\mathbf{K}\mathbf{p}_\mathrm{eq} = 0$. We maximize $\sum_j \ln L_j$, the sum over the logarithms of the likelihoods of all photon time traces, with respect to the pseudo-first-order association rate coefficient $\tilde{k}_\mathrm{on}$, $k_\mathrm{off}$, $k_{+b}$, $k_{-b}$, $\mathbf{n}_{D,j}$, and $\mathbf{n}_{A,j}$. For this purpose, we constrained the acceptor photon rate of the dark state to the acceptor photon rate of the unbound state, which is essentially the background signal of the acceptor detection channel, $n^\mathrm{dark}_{A,j} = n^U_{A,j}$. Analogously, we constrained the donor photon rate of the dark state to the corresponding value of the bound state, $n^\mathrm{dark}_{D,j} = n^B_{D,j}$, which is a good approximation since the transfer efficiency in the bound state is very high (i.e., $E_B \approx 0.9$).
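The photon-by-photon likelihood described above can be illustrated with a minimal sketch. It assumes the standard Gopich-Szabo form, in which each photon contributes a factor $\mathbf{n}_{c_i} \exp((\mathbf{K} - \mathbf{n}_D - \mathbf{n}_A)\tau_i)$; the function names, the two-state example without a dark state, and all numerical values are hypothetical and chosen only for illustration. The pure-Python matrix exponential is a stand-in for a numerical library routine and is far too slow for real data.

```python
import math

def mat_mul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_exp(a, terms=20):
    """Matrix exponential via scaling and squaring of a Taylor series."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[x / 2 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def log_likelihood(colors, taus, K, n_D, n_A, p_ini):
    """ln of 1^T prod_i [n_{c_i} exp((K - n_D - n_A) tau_i)] p_ini.

    colors: 'D' or 'A' per photon; taus: inter-photon times (taus[0] = 0);
    K: rate matrix whose columns sum to zero; n_D, n_A: photon rates per state.
    """
    size = len(p_ini)
    p = list(p_ini)
    log_l = 0.0
    for color, tau in zip(colors, taus):
        gen = [[(K[i][j] - (n_D[i] + n_A[i]) * (i == j)) * tau
                for j in range(size)] for i in range(size)]
        prop = mat_exp(gen)
        p = [sum(prop[i][j] * p[j] for j in range(size)) for i in range(size)]
        rates = n_D if color == "D" else n_A
        p = [rates[i] * p[i] for i in range(size)]
        norm = sum(p)  # rescale after every photon to avoid numerical underflow
        log_l += math.log(norm)
        p = [x / norm for x in p]
    return log_l

# Hypothetical two-state example (states U, B); all rates in 1/ms.
k_on, k_off = 0.02, 0.016
K = [[-k_on, k_off], [k_on, -k_off]]
p_eq = [k_off / (k_on + k_off), k_on / (k_on + k_off)]  # solves K p_eq = 0
n_D, n_A = [40.0, 5.0], [2.0, 45.0]
colors = ["D", "D", "A", "A", "A"]
taus = [0.0, 0.03, 0.02, 0.025, 0.02]
print(log_likelihood(colors, taus, K, n_D, n_A, p_eq))
```

In a full analysis, this quantity would be summed over all time traces and maximized numerically with respect to the rate coefficients and photon rates.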
To obtain the second-order association rate coefficient, $k_\mathrm{on} = \tilde{k}_\mathrm{on} / c_\mathrm{NCBD}$, from the pseudo-first-order rate coefficient $\tilde{k}_\mathrm{on}$, the concentration of labeled protein free in solution needs to be known accurately. Because the concentrations can vary by up to 25% from experiment to experiment due to surface adhesion and pipetting errors, concentrations were determined directly in the sample with fluorescence correlation analysis 57 . The fluorescence of CF660R-labeled NCBD in solution was measured before and after each experiment, and the amplitude of the correlation curve was used to determine the average number of molecules inside the confocal volume, which is proportional to the concentration. The nominal concentrations were then corrected by the relative concentrations found from the correlation analysis.
For converting the photon count rates to transfer efficiencies, they need to be corrected for background fluorescence ($bg_A$ and $bg_D$), crosstalk between the detection channels (acceptor emission to donor channel, $\beta_{AD}$, and donor emission to acceptor channel, $\beta_{DA}$), acceptor direct excitation ($\alpha$), and differences in the quantum yields of the dyes and the detection efficiencies of the two channels ($\gamma$). We determined $bg_A$ and $bg_D$ for each time trace after the molecule had photobleached and corrected the measured photon count rates:

$$n'_{A,j} = n_{A,j} - bg_{A,j} \quad \text{and} \quad n'_{D,j} = n_{D,j} - bg_{D,j}. \tag{3}$$

$\beta_{AD}$ and $\alpha$ are negligible for these dye pairs in our instrument, so we can calculate the transfer efficiency as:

$$E_j = \frac{n'_{A,j} - \beta_{DA,j}\, n'_{D,j}}{n'_{A,j} - \beta_{DA,j}\, n'_{D,j} + \gamma_j\, n'_{D,j}}. \tag{4}$$

Since we know that $E_U = 0$, we can determine $\beta_{DA,j}$ and $\gamma_j$ directly from the measured time traces. In the unbound state, Eq. (4) simplifies to

$$n'^{U}_{A,j} - \beta_{DA,j}\, n'^{U}_{D,j} = 0, \tag{5}$$

and therefore

$$\beta_{DA,j} = \frac{n'^{U}_{A,j}}{n'^{U}_{D,j}}. \tag{6}$$

After correcting for $\gamma_j$, the total photon count rates in the bound and unbound state should be the same:

$$n'^{U}_{A,j} - \beta_{DA,j}\, n'^{U}_{D,j} + \gamma_j\, n'^{U}_{D,j} = n'^{B}_{A,j} - \beta_{DA,j}\, n'^{B}_{D,j} + \gamma_j\, n'^{B}_{D,j}. \tag{7}$$

Since $n'^{U}_{A,j} - \beta_{DA,j}\, n'^{U}_{D,j} = 0$, and we know $\beta_{DA,j}$, we can calculate $\gamma_j$:

$$\gamma_j = \frac{n'^{B}_{A,j} - \beta_{DA,j}\, n'^{B}_{D,j}}{n'^{U}_{D,j} - n'^{B}_{D,j}}. \tag{8}$$

We calculated $\beta_{DA,j}$ and $\gamma_j$ for each time trace and used them to calculate $E_{B,j}$ with Eq. (4). $E_{B,j}$ was then averaged over all time traces to get a mean transfer efficiency, $\langle E_B \rangle$. All parameters determined from long photon time traces are listed in Supplementary Tables 1 and 2.

Measuring transition path times. To measure transition path times, high-intensity time traces were recorded in an automated fashion. The piezo-driven scanning stage of the microscope allows surface-immobilized labeled proteins to be localized in a 20 µm × 20 µm region of the cover slide. In the next step, the identified molecules are brought into focus one by one. The flow chart of the data acquisition procedure is detailed in Supplementary Fig. 1A.
Initially, fluorescence is recorded at a laser power of 0.5 µW (measured at the back aperture of the objective), and donor and acceptor photons are binned (binning interval 10 ms). If the photon count ratio $n_A / (n_A + n_D)$ is below 0.5 for five consecutive bins (i.e. no binding partner is bound), the laser is switched to high power (5-50 µW) for 0.9 s in order to detect a potential binding event with much higher photon count rates. Afterwards, the laser is switched off and the objective is moved to the next ACTR molecule. By always switching the laser to high power when the ACTR molecule is in the unbound state, we increase the probability of observing a binding transition instead of an unbinding transition during the period of high laser power. Additionally, the initial part of the recording at low laser power allows us to verify that the laser is indeed positioned on a functional molecule that shows the anticorrelated changes in donor and acceptor signal characteristic of binding and unbinding. In Supplementary Fig. 1B, an example of a time trace with the switch between low and high laser power is shown. We monitored only binding transitions because unbinding transitions exhibit the same change in observed FRET as acceptor photobleaching (in both cases the transfer efficiency drops to zero), which would bias the observed transition path times. In Supplementary Fig. 8, the log likelihood difference plots are compared for binding transitions, the unbinding/photobleaching transitions, and all transitions combined. For the unbinding/photobleaching transitions, no significant peak in the log likelihood difference is observed, as expected if transition paths for photobleaching are much faster than for unbinding.
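The switching criterion described above can be sketched as follows. This is an illustration only, not the acquisition software used for the measurements: the function name, the data layout, and the treatment of empty bins (which here reset the count of consecutive bins) are assumptions.

```python
def first_trigger_bin(photons, bin_width=0.010, threshold=0.5, n_consecutive=5):
    """Return the index of the bin that completes a run of `n_consecutive`
    bins with acceptor ratio n_A / (n_A + n_D) below `threshold`, or None.

    photons: list of (time_in_s, channel) tuples sorted by time,
             with channel 'D' (donor) or 'A' (acceptor).
    """
    if not photons:
        return None
    n_bins = int(photons[-1][0] // bin_width) + 1
    n_A = [0] * n_bins
    n_D = [0] * n_bins
    for t, channel in photons:
        b = int(t // bin_width)
        if channel == "A":
            n_A[b] += 1
        else:
            n_D[b] += 1
    run = 0
    for b in range(n_bins):
        total = n_A[b] + n_D[b]
        if total > 0 and n_A[b] / total < threshold:
            run += 1
            if run == n_consecutive:
                return b  # the laser would be switched to high power here
        else:
            run = 0  # bound-like or empty bin: start counting again
    return None

# Hypothetical trace: three bound-like bins followed by seven unbound-like bins.
photons = []
for b in range(10):
    channels = ["A", "A", "A", "D"] if b < 3 else ["D", "D", "D", "A"]
    for k, channel in enumerate(channels):
        photons.append(((b + 0.2 * (k + 1)) * 0.010, channel))
print(first_trigger_bin(photons))
```

In this example the first three bins have an acceptor ratio of 0.75 and the following bins a ratio of 0.25, so the criterion is met at the end of the fifth unbound-like bin.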
Analysis of high-intensity photon time traces. The high-intensity time traces were inspected and transitions were identified visually. A time window was centered on each transition such that it did not contain any other transitions or blinking events. The duration of this window was chosen so that it was at least 1 ms long and contained at least 1000 photons. The range of the resulting window lengths is shown for each dataset in Supplementary Table 3. The photon count rates of the unbound and bound states were determined from the donor and acceptor emission before and after the transition.
To exclude time traces with blinking, blinking events were identified in the following way: For every detection channel and conformational state, the probability for each observed inter-photon time was calculated given the observed mean photon rate and number of photons, assuming exponentially distributed inter-photon times. Blinking events were defined as inter-photon times with a probability of less than 0.01, and the window used for analysis was chosen small enough to exclude all blinking events. If a blinking event occurred within less than 1 ms from the transition, the time trace was not used for analysis. The resulting photon time traces were then used for the maximum likelihood analysis. Supplementary Fig. 2 shows representative time traces, and Supplementary Table 3 shows for each dataset the number of analyzed transitions, the average total photon count rates, the range of window lengths, and the resulting $\langle t_\mathrm{TP} \rangle$ and $\hat{E}_I$.
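One possible reading of this blinking criterion is sketched below. The precise probability used is not spelled out above, so the following is an assumption: for a channel with mean photon rate $n$ and $N$ inter-photon times, a gap of duration $\tau$ is flagged if the probability that at least one of $N$ exponentially distributed gaps is as long as $\tau$ or longer, $1 - (1 - e^{-n\tau})^N$, is below 0.01. The function name and the example trace are hypothetical.

```python
import math

def blinking_gaps(times, p_threshold=0.01):
    """Flag improbably long inter-photon times in one detection channel.

    times: sorted photon arrival times of one channel within one state.
    Returns a list of (gap_index, gap_duration, probability) tuples.
    """
    gaps = [b - a for a, b in zip(times, times[1:])]
    n_gaps = len(gaps)
    rate = n_gaps / (times[-1] - times[0])  # observed mean photon rate
    flagged = []
    for i, tau in enumerate(gaps):
        p_single = math.exp(-rate * tau)  # P(one gap >= tau)
        if p_single < 1.0:
            # P(at least one of n_gaps gaps >= tau), evaluated stably
            p_any = -math.expm1(n_gaps * math.log1p(-p_single))
        else:
            p_any = 1.0
        if p_any < p_threshold:
            flagged.append((i, tau, p_any))
    return flagged

# Hypothetical trace (times in microseconds): regular photons with one 50 µs gap.
times = [float(i) for i in range(1000)] + [1049.0 + i for i in range(1000)]
print(blinking_gaps(times))
```

With this definition, the threshold automatically becomes stricter for longer traces, since long gaps become more likely by chance as the number of photons grows.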
Maximum likelihood analysis of binding transitions. To obtain transition path times, we apply the method introduced by Chung and Eaton 10-12 . To approximate transition paths, a simple three-state model was used, where the transition path is described by a virtual intermediate state, I, between the unbound and bound states, U and B, respectively (here we assume that NCBD is in solution and ACTR immobilized; experiments with ACTR in solution and immobilized NCBD are described analogously):

$$\mathrm{U} \; \underset{k_I}{\overset{k'_\mathrm{on} \cdot c_\mathrm{NCBD}}{\rightleftharpoons}} \; \mathrm{I} \; \underset{k'_\mathrm{off}}{\overset{k_I}{\rightleftharpoons}} \; \mathrm{B}. \tag{9}$$

The depopulation of I is described by the rate coefficient $k_I$. The lifetime of the intermediate state, $\tau_I = 1/(2k_I)$, corresponds to the transition path time, $t_\mathrm{TP}$. The rates from I to U and I to B are not necessarily equal; however, in our analysis we can only measure the lifetime of I, which is the inverse of the sum of the two rates. Since we only consider segments in the time traces where transitions from U to B occur, and we assume that we observe no U → I → U and B → I → B transitions, we set the rate coefficients for leaving I to be equal to simplify the analysis 10-12 . The rate coefficients to I from both directions are $k'_\mathrm{on} \cdot c_\mathrm{NCBD}$ and $k'_\mathrm{off}$. They are related to $k_\mathrm{on}$ and $k_\mathrm{off}$ in a two-state model that assumes an instantaneous transition,

$$k_\mathrm{on} = \tfrac{1}{2} k'_\mathrm{on} \quad \text{and} \quad k_\mathrm{off} = \tfrac{1}{2} k'_\mathrm{off}. \tag{10}$$

The factor 1/2 arises because the intermediate state can also react back to the original state in the three-state model, and so on average only every second attempt leads to binding or unbinding. The idea behind this approach of transition path time analysis is to compare the likelihood of an instantaneous transition, $L_j(\tau_I = 0)$, with the likelihood of an intermediate state with lifetime $\tau_I$ and transfer efficiency $E_I$, $L_j(\tau_I, E_I)$. Schematic FRET efficiency time traces for these two cases are shown in Fig. 2a. In both cases, the likelihood is calculated according to the general formula in Eq. (1).
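For $L_j(0)$, we use the rate matrix for the two-state model (states U and B) given by

$$\mathbf{K} = \begin{pmatrix} -\tilde{k}_\mathrm{on} & k_\mathrm{off} \\ \tilde{k}_\mathrm{on} & -k_\mathrm{off} \end{pmatrix}, \tag{11}$$

and for $L_j(\tau_I, E_I)$ we use the three-state model (states U, I, and B),

$$\mathbf{K} = \begin{pmatrix} -\tilde{k}'_\mathrm{on} & k_I & 0 \\ \tilde{k}'_\mathrm{on} & -2k_I & k'_\mathrm{off} \\ 0 & k_I & -k'_\mathrm{off} \end{pmatrix}, \tag{12}$$

where $\tilde{k}_\mathrm{on} = k_\mathrm{on} \cdot c_\mathrm{NCBD}$ and $\tilde{k}'_\mathrm{on} = k'_\mathrm{on} \cdot c_\mathrm{NCBD}$. To prevent random fluctuations in photon rate from being misrecognized as transitions to I, $\tilde{k}_\mathrm{on}$ and $k_\mathrm{off}$ were set to 0.1 s$^{-1}$ and $\tilde{k}'_\mathrm{on}$ and $k'_\mathrm{off}$ to 0.2 s$^{-1}$, which is slow compared to the average length of the fluorescence time traces. This approach is valid since we directly compare the two models 10 . The photon rates of the intermediate state of the three-state model are given by

$$n^{I}_{D,j} = n^{U}_{D,j} + \frac{E_I - \langle E_U \rangle}{\langle E_B \rangle - \langle E_U \rangle} \left( n^{B}_{D,j} - n^{U}_{D,j} \right) \tag{13}$$

and

$$n^{I}_{A,j} = n^{U}_{A,j} + \frac{E_I - \langle E_U \rangle}{\langle E_B \rangle - \langle E_U \rangle} \left( n^{B}_{A,j} - n^{U}_{A,j} \right), \tag{14}$$

where $\langle E_U \rangle$ is zero, and $\langle E_B \rangle$ was determined from long time traces acquired at low excitation power (see Analysis of long photon time traces). The likelihoods for a time trace to have originated from an instantaneous transition or from a transition of finite duration, corresponding to an intermediate state I with lifetime $\tau_I$ and transfer efficiency $E_I$, can be calculated with Eq. (1). To compare the two models, the log likelihoods are subtracted,

$$\Delta \ln L_j(\tau_I, E_I) = \ln L_j(\tau_I, E_I) - \ln L_j(0), \tag{15}$$

and $\tau_I$ and $E_I$ are varied systematically to obtain log likelihood difference plots (Fig. 2b). The $\Delta \ln L_j$ values of multiple time traces can be added to yield a combined log likelihood difference; from its maximum, the most likely lifetime, $\hat{\tau}_I = \langle t_\mathrm{TP} \rangle$, and most likely transfer efficiency, $\hat{E}_I$, can be determined:

$$\left( \hat{\tau}_I, \hat{E}_I \right) = \underset{\tau_I,\, E_I}{\arg\max} \sum_j \Delta \ln L_j(\tau_I, E_I). \tag{16}$$

One can also find the most likely value for $t_\mathrm{TP}$ of an individual transition by maximizing $\Delta \ln L_j$, although with higher uncertainty.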
To test whether the peaks observed in the log likelihood difference plots originate from molecular binding rather than random fluctuations in the fluorescence signal, we used the following control: For each measured time trace, we deleted segments containing the transition and surrounding intervals of varying length (see Supplementary Fig. 9A). We analyzed these altered datasets again with the maximum likelihood method. If the likelihood peak is caused by the finite duration of the binding transition, then we expect the peak to disappear upon deleting the transition region. However, if it was due to fluctuations of the fluorescence signal, it would persist. In Supplementary Fig. 9B, the likelihood curves at $E_I = \hat{E}_I = 0.72$ are shown for time traces with segments of different lengths deleted. The likelihood peak disappears if more than about 250 µs are deleted, indicating that the measured peak is caused by the transition path times of binding transitions. Upon deleting segments of 80 µs, corresponding to $\langle t_\mathrm{TP} \rangle$, there is still a substantial peak, because many of the transitions are longer than 80 µs (owing to the tail of the $t_\mathrm{TP}$ distribution, Fig. 3).
To simplify the analysis, we assume the barriers for escape from I to be symmetric in height. However, we can estimate the maximum asymmetry in barrier heights from I to U and from I to B compatible with our experimental observations based on the following considerations (cf. Fig. 4e): The ratio of about 0.1 between the observed association rate coefficient and a purely diffusion-limited collision rate (~$10^9$ M$^{-1}$ s$^{-1}$) yields an overall activation free-energy barrier for binding of ~2.3 $k_\mathrm{B}T$. From the observed dissociation rate and a preexponential factor of 0.5 µs (see main text), we obtain an overall activation free-energy barrier for dissociation of ~11 $k_\mathrm{B}T$ for the case of equal barriers from I to U and B (since in that case $k_{B \to I} = k'_\mathrm{off} = 2 k_\mathrm{off} = 32$ s$^{-1}$). In this case, our estimate for the barrier heights for the escape from I is ~5.8 $k_\mathrm{B}T$ (see main text). These restraints correspond to the scenario shown in Fig. 4e (solid line). Our results would in principle, however, also be compatible with a situation where both the free energy of I and the barrier from I to B are reduced to the same extent. If we choose as a limit for this reduction the point where the free energies of I and B are equal, the barrier for I to U would be ~11 $k_\mathrm{B}T$. $k_{I \to U}$ would then be negligible compared to $k_{I \to B}$ for leaving I, in which case $1/k_{I \to B} = 80$ µs, resulting in a barrier from I to B of 5 $k_\mathrm{B}T$. The largest asymmetry of the barriers bounding I would thus be 11 $k_\mathrm{B}T$ versus 5 $k_\mathrm{B}T$.
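The barrier heights quoted above follow from the relation $k = \tau_0^{-1} \exp(-\Delta G^{\ddagger} / k_\mathrm{B}T)$ with $\tau_0 = 0.5$ µs. The short calculation below reproduces the numbers as a consistency check; the variable and function names are illustrative, and $k_\mathrm{off} = 16$ s$^{-1}$ is taken from the stated relation $2 k_\mathrm{off} = 32$ s$^{-1}$.

```python
import math

TAU_0 = 0.5e-6  # preexponential factor in s (from the text)
K_OFF = 16.0    # two-state dissociation rate coefficient in 1/s (2 * k_off = 32 1/s)
T_TP = 80e-6    # mean transition path time, i.e. lifetime of I, in s

def barrier_kT(rate, tau_0=TAU_0):
    """Barrier height in units of k_B T from rate = exp(-barrier) / tau_0."""
    return -math.log(rate * tau_0)

# Binding: observed rate is ~0.1 of the diffusion-limited collision rate.
binding = -math.log(0.1)
# Dissociation with equal barriers from I: k(B -> I) = 2 * k_off.
dissociation = barrier_kT(2.0 * K_OFF)
# Escape from I with symmetric barriers: k_I = 1 / (2 * t_TP) in each direction.
escape_symmetric = barrier_kT(1.0 / (2.0 * T_TP))
# Limiting asymmetric case: escape only towards B, k(I -> B) = 1 / t_TP.
escape_to_bound = barrier_kT(1.0 / T_TP)

print(f"binding:          {binding:.1f} kT")
print(f"dissociation:     {dissociation:.1f} kT")
print(f"escape (sym.):    {escape_symmetric:.1f} kT")
print(f"escape (I to B):  {escape_to_bound:.1f} kT")
```

The four values evaluate to about 2.3, 11.0, 5.8, and 5.1 $k_\mathrm{B}T$, in line with the estimates given in the text.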
Analysis of transition path time distributions. To quantify the distribution of transition path times, we calculated $\Delta \ln L_j$ for every transition individually (using the $\hat{E}_I$ value found above), and for each transition $j$, we identified the $\hat{\tau}_{I,j}$ with the highest likelihood and generated a histogram from the resulting values (Fig. 3d). To find the underlying $t_\mathrm{TP}$ distribution, we need to consider the broadening of the distribution due to the limited photon statistics. Photon time traces were thus simulated based on different theoretical transition path time distributions (see below) and analyzed in the same way as the measured data. To ensure that the photon statistics and shot-noise broadening are equivalent to those in the experimental data, we used the lengths and photon count rates of the measured time traces for the simulations. The simulations were performed in the following way: First, state trajectories were generated, each containing a single transition in the center, with a transition path time chosen randomly from the given theoretical distribution. For each state, photons were simulated with exponentially distributed inter-photon times, using the inverse of the total experimentally observed photon count rate of each state as mean inter-photon times. Photons were then randomly assigned to the acceptor or donor channel in accordance with the ratio of the corresponding photon count rates of the states observed experimentally. The photon count rates in the intermediate state were calculated using the $\hat{E}_I$ obtained from the measured data.
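The simulation procedure described above can be sketched as follows. The function names, the photon count rates, and the window length are hypothetical; the linear interpolation used for the photon rates of the intermediate state (with $E_U = 0$) is an assumption of this sketch.

```python
import random

def intermediate_rates(rates_U, rates_B, e_I, e_B):
    """Donor and acceptor rates of I, interpolated linearly in E (E_U = 0)."""
    f = e_I / e_B
    return tuple(u + f * (b - u) for u, b in zip(rates_U, rates_B))

def simulate_transition_trace(window, t_tp, rates, rng):
    """Simulate one photon time trace with a single U -> I -> B transition.

    window: total length of the trace in s; the transition is centered in it.
    t_tp:   transition path time (duration of the I segment) in s.
    rates:  dict mapping 'U', 'I', 'B' to (donor_rate, acceptor_rate) in 1/s.
    Returns a list of (time, channel) tuples with channel 'D' or 'A'.
    """
    t_tp = min(t_tp, window)
    t_start = 0.5 * (window - t_tp)
    segments = [("U", 0.0, t_start),
                ("I", t_start, t_start + t_tp),
                ("B", t_start + t_tp, window)]
    photons = []
    for state, t0, t1 in segments:
        n_D, n_A = rates[state]
        total = n_D + n_A
        # Exponential inter-photon times with mean 1 / total; restarting the
        # clock at each segment boundary is valid because of memorylessness.
        t = t0 + rng.expovariate(total)
        while t < t1:
            channel = "A" if rng.random() < n_A / total else "D"
            photons.append((t, channel))
            t += rng.expovariate(total)
    return photons

# Hypothetical photon rates (1/s) and an exponentially distributed t_TP.
rng = random.Random(1)
rates = {"U": (100e3, 5e3), "B": (15e3, 90e3)}
rates["I"] = intermediate_rates(rates["U"], rates["B"], e_I=0.72, e_B=0.9)
t_tp = rng.expovariate(1.0 / 80e-6)  # mean transition path time of 80 µs
trace = simulate_transition_trace(2e-3, t_tp, rates, rng)
print(len(trace), "photons simulated")
```

Traces generated in this way would then be passed through the same maximum likelihood analysis as the measured data, so that shot noise affects both in the same way.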
These simulated photon time traces were then analyzed in the same way as the measured time traces, and $\hat{\tau}_{I,j}$ was quantified for all transitions. The resulting $\hat{\tau}_{I,j}$ histogram, $H_s$, of the simulated data was then compared to the measured histogram, $H_m$, by calculating the $\chi^2$-distance between the two histograms. By simulating data with different theoretical $t_\mathrm{TP}$ distributions and finding the one with the smallest $\chi^2$-distance to the measured data, we can identify the distribution of underlying transition path times that agrees best with the measured data. In addition to calculating the $\chi^2$-distance, we also performed a k-sample Anderson-Darling test 58 (see Supplementary Fig. 5), which tests whether two samples originate from the same underlying distribution, independent of the functional form of the distribution. Like the $\chi^2$-distance, this method finds the best agreement with our measured data for an exponential distribution. The accuracy of this analysis was tested based on simulations (see Brownian dynamics simulations of transition paths).
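A histogram comparison of this kind can be sketched as below. The normalization is an assumption: the sketch uses one common definition of the $\chi^2$-distance between two histograms, $\tfrac{1}{2} \sum_i (p_i - q_i)^2 / (p_i + q_i)$ with $p$ and $q$ normalized to unit sum, which may differ from the definition used for the published analysis. The histogram counts and candidate names are made up.

```python
def chi2_distance(h_m, h_s):
    """Chi-square distance between two histograms with identical binning.

    Both histograms are normalized to unit sum; the result is 0 for identical
    shapes and 1 for histograms without any overlap.
    """
    if len(h_m) != len(h_s):
        raise ValueError("histograms must have the same number of bins")
    total_m, total_s = sum(h_m), sum(h_s)
    distance = 0.0
    for a, b in zip(h_m, h_s):
        p, q = a / total_m, b / total_s
        if p + q > 0:  # bins that are empty in both histograms contribute 0
            distance += (p - q) ** 2 / (p + q)
    return 0.5 * distance

# Made-up counts: a measured histogram and two simulated candidates.
measured = [5, 20, 40, 25, 10]
candidates = {
    "candidate A": [6, 18, 42, 24, 10],
    "candidate B": [30, 30, 20, 15, 5],
}
best = min(candidates, key=lambda name: chi2_distance(measured, candidates[name]))
print(best)
```

The candidate distribution with the smallest distance would be taken as the one agreeing best with the measured histogram.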
There are three peaks present in the $\hat{\tau}_{I,j}$ histograms (see Fig. 3d): The largest at $\langle t_\mathrm{TP} \rangle$, a smaller one at ~100 ns, and a third one at ~1 ns. The one at 1 ns arises from all transitions lacking a maximum in their $\Delta \ln L_j$ plots, which are all collected in the shortest bin. The peak at ~100 ns also appears in the simulated datasets (see Fig. 3d), suggesting that it originates from the analysis of transitions that are too short or have too low a photon rate to be resolved accurately. To test this hypothesis, we simulated photon time traces of binding transitions with varying photon count rates and determined $\hat{\tau}_{I,j}$ histograms as for the experimental data (see Supplementary Fig. 7). Indeed, all three peaks are present in the simulated results, and the ones at ~1 ns and ~100 ns decrease in amplitude with increasing photon count rates. Even though this observation indicates that some transitions in our measurements are not resolved, the method of determining the barrier shape is still expected to be valid, since we take the limited photon statistics into account when simulating the photon time traces we compare to the experimental data (Fig. 3d).
Theoretical transition path time distributions. In a description of coupled folding and binding as Brownian motion on a 1D free-energy surface, the distribution of transition path times depends on the shape of the free-energy barrier and the effective diffusion coefficient. While the barrier shape determines the shape of the $t_\mathrm{TP}$ distribution, the diffusion coefficient only determines the overall timescale of the distribution. We tested different kinds of barrier shapes, including parabolic barriers of different heights, a flat barrier, and barriers with intermediates of different depths (Fig. 3a, b). We modeled three classes of potentials: barriers with a transition state, a flat barrier, and barriers with an intermediate. For the flat barrier,

$$V(x) = 0. \tag{19}$$

The transition path boundaries are $x_0$ and $x_1$, with $x_1 = -x_0$, and the heights or depths of the potentials are given by $\Delta V = V(0) - V(x_1)$. From these functions, we calculated the $t_\mathrm{TP}$ distributions numerically according to the procedure described in the appendix of ref. 28 . Specifically, this distribution is proportional to the flux of trajectories starting infinitely close to the left boundary, at $x = x_0 + \varepsilon$, and exiting through the right boundary, $x_1$, without returning to $x_0$. The flux $J$ was computed by solving the Smoluchowski equation,

$$J(x, t) = -D \left[ \beta V'(x) + \frac{\partial}{\partial x} \right] p(x, t), \tag{23}$$

with absorbing boundary conditions, $p(x_0, t) = p(x_1, t) = 0$. The spectral expansion method was used to solve the Smoluchowski equation numerically: in this method, the Smoluchowski equation is first transformed into the equivalent Schrödinger equation, with absorbing boundaries being equivalent to introducing infinite potential walls at $x = x_0, x_1$. The Schrödinger equation is then solved by diagonalizing its effective Hamiltonian using particle-in-a-box wavefunctions as the basis.
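For the flat barrier, the spectral expansion can be written in closed form, because the particle-in-a-box functions are then already the eigenfunctions and no diagonalization is needed. The sketch below covers only this special case; it is not the general procedure of ref. 28. Taking the limit $\varepsilon \to 0$ of the normalized flux through $x_1$ gives $p(t_\mathrm{TP}) = 2\lambda_1 \sum_{n \ge 1} (-1)^{n+1} n^2 e^{-n^2 \lambda_1 t_\mathrm{TP}}$ with $\lambda_1 = \pi^2 D / L^2$ and $L = x_1 - x_0$, whose mean is $L^2 / (6D)$. The function names, the number of terms, and the integration grid are choices made for this illustration.

```python
import math

def p_tp_flat(t, length, diff, n_terms=100):
    """Transition path time density for free diffusion (V = 0) between
    absorbing boundaries separated by `length`, from the spectral series."""
    lam1 = math.pi ** 2 * diff / length ** 2
    series = sum((-1) ** (n + 1) * n * n * math.exp(-n * n * lam1 * t)
                 for n in range(1, n_terms + 1))
    return 2.0 * lam1 * series

def moments(length, diff, n_points=3000):
    """Norm and mean of the density by trapezoidal integration."""
    lam1 = math.pi ** 2 * diff / length ** 2
    h = 0.01 / lam1  # the density is negligible below the first grid point
    ts = [h * (i + 1) for i in range(n_points)]
    ps = [p_tp_flat(t, length, diff) for t in ts]
    norm = h * (sum(ps) - 0.5 * (ps[0] + ps[-1]))
    tp = [t * p for t, p in zip(ts, ps)]
    mean = h * (sum(tp) - 0.5 * (tp[0] + tp[-1]))
    return norm, mean

# Choose D such that the mean transition path time is 80 µs for L = 0.4.
length = 0.4
diff = length ** 2 / (6.0 * 80.0)  # in (length units)^2 per µs
norm, mean = moments(length, diff)
print(f"norm = {norm:.4f}, mean t_TP = {mean:.2f} µs")
```

For barriers with a nonzero potential, the same basis would be used, but the effective Hamiltonian would have to be diagonalized numerically before the flux is evaluated.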
For each $t_\mathrm{TP}$ distribution calculated in this way, simulations of photon time traces were performed with different diffusion coefficients, corresponding to different values of $\langle t_\mathrm{TP} \rangle$ (ranging from 90 to 130% of the measured $\langle t_\mathrm{TP} \rangle$, in steps of 5%). Twenty-seven simulations were done for each value of $\langle t_\mathrm{TP} \rangle$. $\hat{\tau}_I$ was determined for each simulation with the maximum-likelihood method, and for each set of 27, the average of all $\hat{\tau}_I$ was determined. The set with an average $\hat{\tau}_I$ closest to the experimentally observed $\langle t_\mathrm{TP} \rangle$ was then used for comparison to the measured $\hat{\tau}_{I,j}$ histograms. The diffusion coefficients resulting in the best agreement with the measured values of $\langle t_\mathrm{TP} \rangle$ are listed in Supplementary Table 4.
Brownian dynamics simulations of transition paths. To validate our method of finding the transition path time distribution, we performed Brownian dynamics simulations to generate transition paths for different barrier shapes. We then simulated fluorescence time traces based on these transition paths and analyzed them as described in Analysis of transition path time distributions to test whether we could correctly identify the barrier shape on which the simulations were based. We used three different representative potentials for the Brownian dynamics simulations, including a barrier with a transition state (see Supplementary Fig. 4A). We adjusted the effective diffusion coefficient, $D$, for each potential so that $\langle t_\mathrm{TP} \rangle$ between $r_0 = 0.8$ and $r_1 = 1.2$ (dashed lines in Supplementary Fig. 4A,B) was 80 µs. We then simulated transitions with time steps of 0.1 µs (see Supplementary Fig. 4B) and converted distances into transfer efficiencies using the Förster equation:

$$E(r) = \frac{1}{1 + (r / R_0)^6}, \tag{27}$$

with the Förster radius $R_0 = 1$. The transfer efficiency time traces were then discretized into 20 states in steps of $\Delta E = 0.05$. To ensure that the photon statistics are equivalent to those of the experimental data (dataset at 1.28 cP), we used the measured photon count rates to calculate the donor and acceptor photon count rates of each state,

$$n_{D,j}(E) = E \left( n^{B}_{D,j} - n^{U}_{D,j} \right) + n^{U}_{D,j} \quad \text{and} \quad n_{A,j}(E) = E \left( n^{B}_{A,j} - n^{U}_{A,j} \right) + n^{U}_{A,j}, \tag{28}$$

and simulated photon time traces based on the simulated state trajectories and the determined photon count rates as described in Analysis of transition path time distributions. These photon time traces were then analyzed in the same way as the measured data. Both the $\chi^2$-distance and the k-sample Anderson-Darling test correctly identify the original barrier shapes (Supplementary Fig. 4C).
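The generation of transition paths by Brownian dynamics can be sketched for the simplest case of a flat potential. The function names, the starting point one step width inside the boundary, the number of paths, and the choice of $D$ via the flat-potential relation $\langle t_\mathrm{TP} \rangle = (r_1 - r_0)^2 / (6D)$ are assumptions of this sketch; other potentials would enter through the hypothetical `beta_force` argument, which represents $-\beta V'(x)$.

```python
import math
import random

def forster_efficiency(r, r0=1.0):
    """Transfer efficiency from distance via the Förster equation."""
    return 1.0 / (1.0 + (r / r0) ** 6)

def transition_path(x0, x1, diff, dt, rng, beta_force=lambda x: 0.0):
    """Run one overdamped trajectory started just inside x0.

    Returns the list of positions if x1 is reached before x0
    (a transition path), otherwise None.
    """
    sigma = math.sqrt(2.0 * diff * dt)  # width of the random displacement
    x = x0 + sigma
    path = [x]
    while x0 < x < x1:
        x += diff * beta_force(x) * dt + sigma * rng.gauss(0.0, 1.0)
        path.append(x)
    return path if x >= x1 else None

def sample_transition_paths(n_paths, x0, x1, diff, dt, seed=0):
    """Collect `n_paths` successful transition paths."""
    rng = random.Random(seed)
    paths = []
    while len(paths) < n_paths:
        path = transition_path(x0, x1, diff, dt, rng)
        if path is not None:
            paths.append(path)
    return paths

# Flat potential between r0 = 0.8 and r1 = 1.2; D chosen for <t_TP> = 80 µs.
r0, r1, dt = 0.8, 1.2, 0.1  # dt in µs
diff = (r1 - r0) ** 2 / (6.0 * 80.0)
paths = sample_transition_paths(200, r0, r1, diff, dt)
mean_t_tp = dt * sum(len(p) - 1 for p in paths) / len(paths)
efficiency_trace = [forster_efficiency(r) for r in paths[0]]
print(f"mean t_TP = {mean_t_tp:.1f} µs")
```

Because of the finite time step and the finite number of paths, the mean transition path time from such a simulation is expected to scatter around the target value of 80 µs rather than match it exactly.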

Data availability
Data supporting the findings of this manuscript are available from the corresponding author upon reasonable request. A custom module for Mathematica (Wolfram Research) used for the analysis of single-molecule fluorescence data is available upon request.