## Introduction

Interactions between biomolecules are at the heart of all processes involving cellular signaling and communication. The mechanisms of how such interactions form are thus essential for both a fundamental understanding of these processes and targeted therapeutic intervention. Most of the mechanistic information is, however, contained in the exceedingly short parts of the reaction trajectories that start when the two molecules first encounter each other via translational diffusion and end with the formation of the stably bound complex. The transition states and intermediates visited on these transition paths are high in free energy and correspondingly unstable. Observing them experimentally has thus been challenging, and only in a few cases has it been possible to obtain glimpses of their structural or dynamic properties1,2,3,4,5,6,7. The experimental challenge is analogous to the one in protein folding: In both cases, the kinetics can often be approximated by a simple two-state reaction, where instantaneous transitions connect the initial and final states, but the most interesting information is hidden within the microscopic paths underlying these transitions8,9. Recent developments in single-molecule spectroscopy have started to reveal this information for transition paths in protein folding10,11,12,13. Here we show how these advances enable new ways of probing the transition paths of protein binding.

We investigate the association between the nuclear-coactivator binding domain (NCBD) of the CBP/p300 transcription factor and the activation domain of SRC-3 (ACTR), two members of the broad spectrum of intrinsically disordered proteins (IDPs), proteins that lack stable tertiary structure in isolation14. The interaction between ACTR and NCBD is a paradigm of coupled folding and binding15,16, a mechanism that is frequently observed for IDPs. NCBD, a marginally stable, molten-globule-like IDP with pronounced helical content even in the unbound state17, and the largely unstructured ACTR15 bind to each other with nanomolar affinity and form a cooperatively folded heterodimer15. We monitor their interaction using single-molecule Förster resonance energy transfer (FRET)18 by labeling ACTR with a donor fluorophore, immobilizing it on a surface, and adding acceptor-labeled NCBD to the solution (Fig. 1a). Upon excitation, unbound ACTR emits only donor photons; on binding of an NCBD molecule, energy transfer results in a decrease in donor emission and an increase in acceptor emission (Fig. 1c). The signal change during the transition is recorded by confocal single-photon counting at high count rates (on average 200 ms−1) to be able to probe microsecond timescales. This high time resolution allows us to measure binding transition path times and their distribution, which reveal the presence of an encounter complex with a lifetime of ~80 µs. The formation of this transient intermediate, where the molecules have associated but not yet folded, is favored by electrostatic interactions, in contrast to the final folding step. Such measurements thus create new opportunities for deciphering the mechanisms of protein binding.

## Results

### Measuring average transition path times

We collected time traces of ACTR/NCBD binding events with the following approach (Fig. 1c, Supplementary Figs. 1 and 2): For each immobilized ACTR molecule, multiple association and dissociation events were first monitored at low excitation rate to ensure that the observed molecules are binding-competent, and to exclude a contribution from nonspecific surface interactions. Starting at a time when no NCBD was bound, the laser power was increased to the highest possible intensity that still allowed the observation of binding transitions with high photon rates before photobleaching occurred (Supplementary Fig. 1, see Methods). Dissociation events were not included since they exhibit the same change in observed FRET efficiency as acceptor bleaching. Because of microscopic reversibility, however, the statistics of transition path times for dissociation should be identical to those for binding. Examples of recorded transitions are shown in Fig. 1c and in Supplementary Fig. 2. Despite the relatively high photon count rates, it is difficult to identify the start and end of a transition by visual inspection. For this reason, the photon arrival times were analyzed on a photon-by-photon level with the maximum-likelihood approach developed by Gopich and Szabo, which was previously applied to protein folding by Chung and Eaton10,19,20.

In this analysis, the likelihood of two kinetic models is compared (Fig. 2a): A two-state model and a model with an intermediate state that mimics the transient species populated during the transition10,11,12 (see Methods). By maximizing the log likelihood difference between the models, ΔlnL, with respect to the mean lifetime, τI, and the transfer efficiency, EI, of the intermediate state (Fig. 2b), we obtained the most likely values, $$\hat \tau _{\mathrm {I}} = 80\,\pm\,8\,\upmu {\mathrm s}$$ and $$\hat E_{\mathrm {I}} = 0.72 \pm 0.02$$, respectively, from the analysis of all 686 measured transitions (unless stated otherwise ± indicates the standard error; here from 1000 bootstrapping trials). We tested the robustness of this result with two controls. To assess the influence of surface-attachment, we immobilized NCBD instead of ACTR. Using Cy3B-labeled NCBD on the surface and CF660R-labeled ACTR in solution (see Supplementary Fig. 3A), we obtained 〈tTP〉 = 95 ± 22 µs and $$\hat E_{\mathrm{I}}$$ = 0.66 ± 0.05, within experimental error of the above result. To assess the influence of the dyes, we measured immobilized Alexa488-labeled ACTR with CF660R-labeled NCBD in solution (see Supplementary Fig. 3B) and found 〈tTP〉 = 101 ± 11 µs, similar to the other dye pair; $$\hat E_{\mathrm{I}}$$ = 0.52 ± 0.03 was lower than with Cy3B, as expected from the shorter Förster radius of Alexa488/CF660R (4.7 nm) compared to Cy3B/CF660R (6.0 nm).
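The effect of the Förster radius on the transfer efficiency can be illustrated with the Förster equation, E = 1/(1 + (r/R0)^6). The following is a minimal sketch that assumes a single fixed inter-dye distance, which is a simplification for a dynamic encounter complex; the distance of 5.0 nm and the function name `fret_efficiency` are hypothetical and serve only as illustration, whereas the two Förster radii are those quoted above.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Förster equation for a single, fixed donor-acceptor distance."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Förster radii quoted in the text (nm).
R0_CY3B_CF660R = 6.0
R0_ALEXA488_CF660R = 4.7

# Hypothetical inter-dye distance (nm), chosen only for illustration.
distance_nm = 5.0

e_cy3b = fret_efficiency(distance_nm, R0_CY3B_CF660R)
e_alexa = fret_efficiency(distance_nm, R0_ALEXA488_CF660R)

print(f"Cy3B/CF660R:     E = {e_cy3b:.2f}")
print(f"Alexa488/CF660R: E = {e_alexa:.2f}")
```

At any given distance, the dye pair with the shorter Förster radius yields the lower transfer efficiency, in line with the qualitative trend reported for the two dye pairs.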

The relatively high value of $$\hat E_{\mathrm{I}}$$ indicates that the labeled C-terminal segments of ACTR and NCBD are in proximity already during the transition. The value of $$\hat \tau _{\mathrm{I}}$$ can be interpreted as the average duration of the transition paths, 〈tTP〉.10,11,12 Remarkably, 〈tTP〉 is very long compared to the transition path times of the folding of monomeric proteins previously observed in simulations21 and most experiments10,11,12. The two most likely reasons for this surprisingly slow passage to the bound state are: (i) slow diffusion on the free-energy surface caused by internal friction arising from non-native interactions12,13,22,23, or (ii) a local free-energy minimum corresponding to a high-energy intermediate or encounter complex24,25,26, where ACTR and NCBD are already in contact but have not yet found their final, stably bound structure.

### Internal friction versus high-energy intermediate

To examine the first possibility, we measured the dependence of 〈tTP〉 on solvent viscosity, since a viscosity-independent component in the dynamics is a common fingerprint of internal friction12,22,23. The pronounced increase of 〈tTP〉 with viscosity and the absence of a significant intercept at zero viscosity (Fig. 2c, d) suggest that internal friction does not substantially contribute to the transition-path dynamics. We note that $$\hat E_{\mathrm{I}}$$—and thus the average inter-dye distance during the transition—exhibits no systematic change with viscosity, indicating that the addition of glycerol does not alter the conformational ensemble, attesting to the robustness of the analysis.

To probe the second possibility, the presence of a high-energy intermediate, as the cause for the long 〈tTP〉, we take into account not only the average, but the distribution of transition path times, which is sensitive to the shape of the free-energy barrier27,28. The relatively large number of photons detected during the binding transitions enabled us to calculate ΔlnLj for every transition j individually (orange curves in Fig. 2b) and identify the most likely transition path time, $$\hat \tau _{{\mathrm{I}},j}$$, for each. The resulting distribution of $$\hat \tau _{{\mathrm{I}},j}$$ (grey histograms in Fig. 3d) was then compared to the distributions expected for different barrier shapes. We tested barriers ranging from an inverted harmonic potential (representing a simple transition state) to a flat barrier top and a high-energy intermediate with different stabilities (Fig. 3a, b) and calculated the corresponding tTP distributions for each potential by numerically solving the Smoluchowski equation28 (Fig. 3c). The narrowest tTP distributions are those for a simple transition state, and they broaden with increasing stability of the intermediate. When the stability of the intermediate approaches a few kBT, the transition path time becomes essentially equivalent to the lifetime of the intermediate, which, according to classical kinetics, should be exponentially distributed. Indeed, this is the behavior observed (Fig. 3c). To fully account for photon statistics, we simulated photon time traces with tTP sampled from these distributions, analyzed them in the same way as the measured data, and compared the resulting distributions of $$\hat \tau _{{\mathrm{I}},j}$$ to the experiment (Fig. 3d) by calculating the χ2-distance between them (Fig. 3e). The validity of this analysis was tested based on synthetic data (see Supplementary Fig. 4 and Methods for details).
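The comparison of two histograms by a χ2-distance can be sketched as follows. The exact definition used for Fig. 3e is not given in this section, so the code assumes one common symmetric form, d = ½ Σ (p_i − q_i)² / (p_i + q_i), for normalized histograms p and q. The sampled distributions (an exponential and a narrower gamma distribution with the same mean) and all names are hypothetical and stand in for measured and simulated transition path times.

```python
import numpy as np


def chi2_distance(hist_a, hist_b):
    """Symmetric chi-squared distance between two histograms on the same bins.

    Assumed definition: 0.5 * sum((p - q)**2 / (p + q)) with p and q
    normalized to unit sum. Bins that are empty in both histograms are skipped.
    """
    p = np.asarray(hist_a, dtype=float)
    q = np.asarray(hist_b, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    denom = p + q
    mask = denom > 0
    return 0.5 * np.sum((p[mask] - q[mask]) ** 2 / denom[mask])


# Synthetic illustration only: two sets of "transition path times" (in µs)
# with the same mean of 80 µs but different shapes.
rng = np.random.default_rng(0)
exponential_times = rng.exponential(scale=80.0, size=5000)
narrow_times = rng.gamma(shape=4.0, scale=20.0, size=5000)

bins = np.linspace(0.0, 400.0, 41)
hist_exp, _ = np.histogram(exponential_times, bins=bins)
hist_narrow, _ = np.histogram(narrow_times, bins=bins)

print("distance:", chi2_distance(hist_exp, hist_narrow))
```

With this definition the distance is 0 for identical normalized histograms and 1 for histograms without any overlap.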

At all solvent viscosities, the best agreement between simulation and measurement is achieved with a stability of the intermediate of at least −5 kBT (Fig. 3d, e, Supplementary Fig. 5), where the tTP distribution is, within uncertainty, indistinguishable from an exponential distribution. This observation, together with the long 〈tTP〉 and high $$\hat E_{\mathrm{I}}$$ (Fig. 2), indicates the presence of an encounter complex where ACTR and NCBD are associated, but still separated by a free-energy barrier from the stably bound and folded state. Notably, the value of at least 5 kBT for the barrier height of escape from the encounter complex is consistent with an independent estimate based on Kramers theory29: From the reconfiguration time of ACTR, τr ≈ 75 ns, which has recently been measured30, we estimate the preexponential factor to be τ0 ≈ 2πτr ≈ 0.5 μs.31 Assuming two symmetrical barriers and using $$\hat \tau _{\mathrm{I}}\,=\,80\,{{\upmu {\mathrm s}}}$$ as the escape time, we obtain a barrier height of $${\mathrm{ln}}\left( {2\hat \tau _{\mathrm{I}}/\tau _0} \right)$$ kBT ≈ 5.8 kBT. (We note that our measurements do not allow us to determine the rate coefficients for the transitions from the intermediate to bound and the unbound states separately but only their sum.10 The largest asymmetry of the barriers bounding the intermediate and still compatible with our results would be ~11 kBT versus ~5 kBT (see Maximum likelihood analysis of binding transitions in Methods).)
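The Kramers-type estimate above is simple arithmetic and can be reproduced as follows. This is a minimal sketch that uses only the values stated in the text (τr ≈ 75 ns, τI = 80 µs, two symmetrical barriers); the variable names are chosen here for illustration.

```python
import math

tau_r = 75e-9   # reconfiguration time of ACTR in seconds (from the text)
tau_I = 80e-6   # most likely lifetime of the intermediate in seconds (from the text)

# Preexponential factor, tau_0 ≈ 2*pi*tau_r (about 0.5 µs).
tau_0 = 2.0 * math.pi * tau_r

# With two symmetrical barriers, escape over each one takes 2*tau_I on average,
# so the barrier height in units of kBT is ln(2*tau_I / tau_0).
barrier_kT = math.log(2.0 * tau_I / tau_0)

print(f"tau_0   = {tau_0 * 1e6:.2f} µs")
print(f"barrier = {barrier_kT:.1f} kBT")
```

Because the barrier height depends only logarithmically on the ratio of the two timescales, rounding τ0 to 0.5 µs changes the result by less than 0.1 kBT.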

The shape of the observed tTP distributions can thus be explained by a localized intermediate. We note that roughness of the energy landscape along the reaction coordinate cannot account for our findings, although both scenarios would lead to a longer mean transition path time12. This is because introducing roughness, crudely speaking, amounts to lowering the effective diffusion coefficient along the reaction coordinate32. Indeed, numerical experiments where we introduced sinusoidal roughness of different amplitudes and calculated the resulting tTP distributions confirm that the roughness slows down the mean transition path time but has almost no effect on the shape of its distribution (see Supplementary Fig. 6).

### Ionic-strength dependence of transition path times

To further investigate the nature of the encounter complex, in particular the role of electrostatics, we measured 〈tTP〉, $$\hat E_{\mathrm{I}}$$, and the association and dissociation rate coefficients, kon and koff, respectively, as a function of ionic strength. kon decreases about fivefold when the ionic strength is raised from 50 to 400 mM, while koff increases only about twofold (Fig. 4a, b), which is consistent with previous kinetic measurements33 and in accord with the opposite net charge of ACTR and NCBD34,35,36. The encounter complex, however, behaves very differently: neither 〈tTP〉 nor $$\hat E_{\mathrm{I}}$$ changes significantly with increasing ionic strength (Fig. 4c, d). The strong ionic-strength dependence of kon suggests that electrostatic interactions are formed already early during the transition, whereas the weak dependence of 〈tTP〉 indicates that the subsequent formation of the folded state, which involves packing of the hydrophobic core, is less electrostatically driven.

## Discussion

In summary, the distribution of transition path times we have measured for the association of two IDPs reveals the presence of an encounter complex. Owing to this transient intermediate, it takes on average ~80 μs from the diffusional encounter of the two binding partners to the formation of the stably folded complex, much longer than the transition path time expected for the folding of a monomeric protein of similar size10,11,21. Neither pronounced internal friction nor “roughness” of the free-energy surface, which have been shown to slow down some protein folding reactions involving non-native interactions or misfolding12,13,22,23,37, are likely to be the cause of the long transition path times, as indicated by the strong solvent viscosity dependence of the transition path times we observe (Fig. 2d) and by simulations of the effect of energetic roughness on transition path time distributions (Supplementary Fig. 6).

The most likely mechanism for the coupled folding and binding of ACTR and NCBD is thus the initial formation of a transient encounter complex that is stabilized by the electrostatic interactions between the two oppositely charged IDPs33, followed by folding (Fig. 4e). The barrier to encounter complex formation is very low, as reflected by an association rate coefficient that is only about an order of magnitude below the diffusion-limited value expected for a barrierless binding reaction26,33,38,39. Although the stability of the encounter complex slows the overall transition path time compared to simple monomeric folding reactions, it still proceeds remarkably quickly. The search process may be facilitated by relatively specific initial contacts between the N-terminal helices of ACTR and NCBD that provide noncovalent connectivity and reduce conformational freedom36, as suggested by ϕ-value analysis5, an increase in association rates with helicity40, and simulations41. However, according to the ϕ-value analysis5, these interactions are highly localized, and the extended hydrophobic interface between ACTR and NCBD is largely non-native near the transition state. This result implies that the electrostatic interactions favoring association33 (Fig. 4a) are also predominantly non-native. However, if they interchange rapidly compared to the interconversion time to the native state and the net gain of electrostatic interactions upon folding is small, no pronounced dependence of 〈tTP〉 on salt concentration is expected, in agreement with our observations (Fig. 4c).

In spite of the small size of ACTR and NCBD, their kinetics of folding and binding have been observed to be remarkably complex. Stopped-flow measurements revealed multistate kinetics on timescales of milliseconds to seconds42, and recent single-molecule experiments identified a contribution due to peptidyl-prolyl cis/trans isomerization in the range of tens of seconds39. The timescale of 80 µs we observed here for the transition path time is similar to a fast kinetic phase observed during binding of ACTR to NCBD in temperature-jump experiments43, which was attributed to the conformational exchange within NCBD previously identified by NMR44. The lifetime of the encounter complex might thus be linked to the internal dynamics of the molten-globule-like NCBD.

How do our observations for ACTR/NCBD relate to the behavior in other coupled folding-and-binding reactions of IDPs? The recent increase in kinetic investigations of IDP interactions has made it clear that the underlying mechanisms are diverse and difficult to generalize35,45. However, ACTR and NCBD do not seem to be an unusual case. With the high abundance of charged amino acids in IDPs46, a pronounced role of electrostatics is commonly observed, especially for association rates35,45. An initial binding event that precedes folding also seems to be a common scenario45, often referred to as “induced fit”47. However, even in cases where observations such as nonlinearities of concentration-dependent kinetics may indicate the presence of an encounter complex, characterizing its structural and dynamic properties has been more difficult, with NMR providing the most detailed insights so far1,2. Since the equilibrium populations of transient intermediates along the path of coupled folding and binding are typically low, detecting them with ensemble kinetics, such as temperature-jump experiments48, is challenging. Since much of the interesting mechanistic information is contained in the transition paths13,20,49,50, probing them by single-molecule spectroscopy provides an opportunity for revealing the mechanisms of protein binding and complementing kinetic and structural information from other methods. Next steps for advancing this approach will be to combine it with multiple labeling positions38 or three-color FRET51 to map the structure and dynamics of encounter complexes in protein interactions in more detail.

## Methods

### Protein expression

ACTR-Avi: The coding sequence of a single-cysteine ACTR variant was cloned via BamHI/HindIII into a pAT222-pD expression vector (gift from J. Schöppe and A. Plückthun)52, yielding a protein construct with an N-terminal Avi-tag and a Thrombin-cleavable C-terminal His6-tag (sequence of the cleaved construct: MAGLNDIFEA QKIEWHEGSM GSGSGTQNRP LLRNSLDDLV GPPSNLEGQS DERALLDQLH TLLSNTDATG LEEIDRALGI PELVNQGQAL EPKQDCGGPR). pBirAcm (Avidity, Aurora CO, USA) was cotransformed for in vivo biotinylation of Lys12 in the Avi-tag53, and expression was carried out in Escherichia coli C41(DE3) (Merck). Cells were grown at 37 °C in TYH medium (for 1 l: 20 g tryptone, 10 g yeast extract, 11 g HEPES, 5 g NaCl, 1 g MgSO4, pH 7.3), supplied with 0.5% (w/v) glucose, until they reached an OD600 of 0.8. Then, 50 µM biotin in 10 mM bicine buffer (pH 8.3) and 1 mM IPTG were added to the culture. Expression continued for 3 h at 37 °C, after which cells were harvested by centrifugation. The harvested cells were lysed by sonication, and the His6-tagged protein was enriched via immobilized metal ion affinity chromatography (IMAC) on Ni-IDA resin (ABT). The His6-tag was then cleaved off with thrombin (Serva Electrophoresis) and separated from the protein by another round of IMAC. Finally, biotinylated protein was separated from impurities and nonbiotinylated protein via reversed-phase HPLC (RP-HPLC) on a C18 column (Reprosil Gold 200, Dr. Maisch, Germany) with a H2O/0.1% trifluoroacetic acid−acetonitrile gradient. The purified protein was lyophilized, resuspended in buffer, and stored at −80 °C until use.

ACTR: ACTR containing a C-terminal cysteine was also inserted into the pAT222-pD expression vector containing an HRV 3C-cleavable N-terminal Avi-tag as well as a Thrombin-cleavable C-terminal His6-tag (sequence of the cleaved construct: GPSGTQNRPL LRNSLDDLVG PPSNLEGQSD ERALLDQLHT LLSNTDATGL EEIDRALGIP ELVNQGQALE PKQDCGGPR). ACTR was expressed like the other variants containing an N-terminal Avi-tag. After enrichment of the His6-tag-containing protein by IMAC, the C-terminal His6-tag was cleaved off, followed by a second round of IMAC. To obtain the fully cleaved protein, the Avi-tag was cleaved off by HRV-3C protease. The RP-HPLC purification was carried out as described above for two consecutive rounds.

NCBD: A construct with a single-cysteine residue and proline residues 20 and 23 replaced by alanine (to suppress kinetic heterogeneity due to peptidyl-prolyl cis/trans isomerization39) was generated by site-directed-mutagenesis (primers used (Microsynth): NCBD_P20A_P23A_fw: GCA TCT TCA GCG CAA CAG CAA CAG CAA GTT CTT AAC; NCBD_P20A_P23A_rev: GCT GTT GCG CTG AAG ATG CCG ATT TCA GCG TCC). Furthermore, the expression construct contained an N-terminal His6-tag cleavable with HRV 3C protease (sequence of the cleaved construct: GPNRSISPSA LQDLLRTLKS ASSAQQQQQV LNILKSNPQL MAAFIKQRTA KYVANQPGMQ C). NCBD was coexpressed54 with ACTR from a pET-47b(+) vector. Cell lysis and protein enrichment via IMAC were carried out as described above, followed by enzymatic cleavage of the His6-tag with HRV 3C protease and separation of the tag from the proteins via another round of IMAC. Finally, ACTR and NCBD were separated with RP-HPLC as described above.

NCBD-Avi: The NCBD construct containing a single cysteine at the C-terminus as well as an N-terminal Avi-tag and a C-terminal cleavable His6-tag was cloned, expressed and purified analogously to the ACTR-Avi variant (sequence of the cleaved construct: AGLNDIFEAQ KIEWHEGSMG SGSSPNRSIS PSALQDLLRT LKSASSAQQQ QQVLNILKSN PQLMAAFIKQ RTAKYVANQP GMQCGGPR). Also in this sequence, proline residues 20 and 23 were replaced by alanine to avoid kinetic heterogeneity.

### Protein labeling

ACTR-Avi: Lyophilized protein was dissolved under nitrogen atmosphere to a concentration of 200 µM in 100 mM potassium phosphate buffer, pH 7.0, and was labeled for 3 h at room temperature with a 0.8-fold molar ratio of Cy3B or Alexa488 maleimide (GE Healthcare Life Sciences) to protein. Labeled protein was separated from unlabeled protein with RP-HPLC on a Sunfire C18 column (Waters) as described above.

ACTR: Lyophilized protein was dissolved to 50 µM in 50 mM sodium phosphate buffer, pH 7.0, and labeled with a 1.2-fold molar ratio of CF660R maleimide (Biotium) to protein. Labeled protein was separated from unlabeled protein with RP-HPLC on a Reprosil Gold 200 column, followed by RP-HPLC on a Sunfire C18 column.

NCBD: Lyophilized protein was dissolved to 170 µM in 100 mM potassium phosphate buffer, pH 7.0, and was labeled with a 1.2-fold molar ratio of CF660R to protein. Labeled protein was separated from unlabeled protein with RP-HPLC on a C18 column (Reprosil Gold 200), followed by RP-HPLC on a Sunfire column.

NCBD-Avi: Lyophilized protein was dissolved to 220 µM in 50 mM sodium phosphate buffer, pH 7.0, and labeled with a 1.2-fold molar ratio of Cy3B to protein. Labeled protein was separated from unlabeled protein with RP-HPLC on a Reprosil Gold 200 column.

The correct mass of all labeled proteins was confirmed by electrospray ionization mass spectrometry.

### Sample preparation for surface experiments

Surface experiments were performed using quartz cover slides coated with polyethylene glycol (PEG) and covalently modified with biotin (Quartz Coverslip (1″ × 1″), Bio 01, MicroSurfaces Inc., Englewood, NJ, USA). To clean the slides before use, they were boiled in water containing 0.1% Tween 20 and sonicated for 5 min. Silicone chambers (Secure Seal Hybridization Chambers, SKU:621202, Grace Bio Labs, Bend, OR, USA) were glued to the cover slide to yield four measurement chambers per slide. Biotinylated protein was immobilized on the cover slides with a biotin−avidin−biotin bond. To accomplish this, 200 µg/ml Avidin D (Vector Labs, Burlingame CA, USA) in NaP buffer (50 mM sodium phosphate, pH 7.0, 0.01% Tween 20) was added to the well and incubated for 3 min, followed by three washing steps with NaP buffer. Biotinylated protein was immobilized at a concentration of 10 pM in NaP buffer.

Measurements were performed in NaP buffer, with H2O replaced by D2O (NaP/D2O) to increase the quantum yield of the dyes55. Compared to H2O, the photon count rates in D2O were increased by ~20% for Cy3B and by ~50% for CF660R. To improve the signal quality further, we employed an oxygen scavenging system consisting of 400 U/ml bovine liver catalase (Sigma), 0.4 mg/ml glucose oxidase from Aspergillus niger (Sigma), and 1% (w/v) glucose, as well as a redox system (1 mM ascorbic acid, 1 mM methyl viologen)56. Concentrations of 20−80 nM acceptor-labeled NCBD or ACTR free in solution were used in the experiments.

To investigate the viscosity dependence, measurements were performed in NaP/D2O buffer containing 0, 14, 32, and 45% (v/v) glycerol. The viscosity was determined with a DV-I+ 4.0 Digital-Viscometer (Brookfield, Lorch, Germany). The ionic-strength dependence was measured in NaP/D2O buffer supplied with 0, 50, 100, 175, 300, and 800 mM NaCl. The point at 51 mM ionic strength was measured in 20 mM sodium phosphate, 10 mM NaCl, pH 7.0, 0.01% Tween 20 (in D2O). The pH of each solution was set to 7.0 by adjusting the ratio of monobasic to dibasic phosphate.

### Instrumentation for surface experiments

Surface experiments were performed on a MicroTime 200 confocal single-molecule instrument (PicoQuant, Berlin, Germany). A continuous-wave laser at 532 nm (LBX-532-50-COL-PP, Oxxius S.A., Lannion, France) was used for excitation. The light was focused into the sample (UplanApo 60/1.20W; Olympus, Japan), and the emitted light was collected with the same objective. A triple-band mirror (zt405/530/630rpc, Chroma, USA) and a long-pass filter (532 LP Edge Basic, Chroma) were used to separate the 532-nm laser light from the emitted fluorescence. The fluorescence light was then focused through a 100 µm pinhole and split by a dichroic mirror (T 635 LPXR, Chroma) to separate donor and acceptor photons. Donor emission was filtered with a 585/65 ET bandpass filter (Chroma), acceptor emission with a RazorEdge LP 647 RU long-pass filter (Chroma). Both photon streams were detected with avalanche photodiode detectors (SPCM-AQR-15, PerkinElmer, Waltham MA, USA) and photon arrival times recorded with a HydraHarp 400 event timer (PicoQuant). The temporal resolution is limited by the random jitter of the detectors (~50 ps). A function generator (33600A Series Waveform Generator, Keysight Technologies, USA) connected to the modulation input of the laser driver allowed fast (<3 ms) and automated switching of the laser intensity (Supplementary Fig. 1). To scan the surface, the objective was mounted on a combination of two piezo-scanners, a P-733.2CL for XY-positioning and a PIFOC for Z-positioning (Physik Instrumente, Germany). To suppress oscillations of the scanner-stage, which can result in signal fluctuations, the digital notch filters were optimized for each axis.

### Analysis of long photon time traces

To obtain the binding and unbinding rate coefficients, kon and koff, as well as the transfer efficiencies of the unbound and bound states, EU and EB, long photon time traces of surface-immobilized proteins were acquired at a laser power of 0.5 µW (measured at the back aperture of the objective). Time traces were inspected to ensure that no substantial brightness variations were occurring (e.g. caused by a drift of the molecule’s position, long-lived dark states, or background fluctuations). Suitable traces were analyzed until photobleaching. Single-step photobleaching indicated that only one immobilized molecule was present in the confocal volume.

The pseudo-first-order association rate coefficient, $$\bar k_{{\mathrm{on}}} = k_{{\mathrm{on}}} \cdot c_{{\mathrm{NCBD}}}$$, the dissociation rate coefficient, koff, and the photon rates were determined using the maximum likelihood approach introduced by Gopich and Szabo19. kon is the second-order association rate coefficient, and cNCBD is the concentration of NCBD free in solution (for Supplementary Fig. 3A, where ACTR is free in solution, this would be cACTR). The likelihood of time trace j is calculated from the general equation

$$L_j = \mathbf{p}_{\mathrm{fin}}^{\mathrm{T}} \left\{ \prod\limits_{i = 1}^{N_j} \mathbf{n}_{c_i,j} \exp\left[ \left( \mathbf{K} - \mathbf{n}_{\mathrm{D},j} - \mathbf{n}_{\mathrm{A},j} \right) \tau_i \right] \right\} \mathbf{p}_{\mathrm{ini}},$$
(1)

where Nj is the total number of photons in the time trace; ci is the color of the ith photon (D or A); τi = 0 for i = 1, and for i > 1, τi is the inter-photon time, i.e., the time interval between the detection of the (i − 1)th and the ith photon. K is the rate matrix describing the association−dissociation dynamics. We include an additional dark state accounting for fluorophore blinking in the low-FRET unbound state, which is populated and depopulated with rate coefficients k+b and k−b, respectively. Blinking also occurs in the high-FRET bound state but does not need to be included in the model, as it is not misrecognized as a transition. K is given by

$${\mathbf{K}} = \left( {\begin{array}{*{20}{c}} { - (\bar k_{{\mathrm{on}}} + k_{{\mathrm{ + b}}})} & {k_{{\mathrm{off}}}} & {k_{ - {\mathrm{b}}}} \\ {\bar k_{{\mathrm{on}}}} & { - k_{{\mathrm{off}}}} & 0 \\ {k_{{\mathrm{ + b}}}} & 0 & { - k_{ - {\mathrm{b}}}} \end{array}} \right).$$
(2)

$$\bar k_{{\mathrm{on}}}$$ and koff are the rate coefficients of association and dissociation observed for immobilized ACTR at a given bulk concentration of NCBD.

nD,j and nA,j in Eq. (1) are diagonal matrices with the observed donor photon rates ($$n_{{\mathrm{D}},j}^{\mathrm{U}}$$, $$n_{{\mathrm{D}},j}^{\mathrm{B}}$$, $$n_{{\mathrm{D}},j}^{{\mathrm{dark}}}$$) and the acceptor photon rates ($$n_{{\mathrm{A}},j}^{\mathrm{U}}$$, $$n_{{\mathrm{A}},j}^{\mathrm{B}}$$, $$n_{{\mathrm{A}},j}^{{\mathrm{dark}}}$$) of the three states on the diagonal, respectively. The photon rates vary slightly from time trace to time trace, mainly because the immobilized molecules are placed at slightly different positions inside the laser focus. $${\mathbf{p}}_{{\mathrm{fin}}}^{\mathrm{T}} = (1,1,1)$$ is the transposed unity vector. The vector pini contains the populations at the start of the measurement. For the analysis of long time traces, we assume pini = peq, the equilibrium population of the three states, which is obtained from Kpeq = 0. We maximize $$\mathop {\sum}\nolimits_j {\ln (L_j)}$$, the sum over the logarithms of the likelihoods of all photon time traces, with respect to $$\bar k_{{\mathrm{on}}}$$, koff, k+b, k−b, nD,j, and nA,j. For this purpose, we constrained the acceptor photon rate of the dark state to the acceptor photon rate of the unbound state, which is essentially the background signal of the acceptor detection channel $$( {n_{{\mathrm{A}},j}^{{\mathrm{dark}}} = n_{{\mathrm{A}},j}^{\mathrm{U}}})$$. Analogously, we constrained the donor photon rate of the dark state to the corresponding value of the bound state $$( {n_{{\mathrm{D}},j}^{{\mathrm{dark}}} = n_{{\mathrm{D}},j}^{\mathrm{B}}})$$, which is a good approximation since the transfer efficiency in the bound state is very high (i.e., EB ≈ 0.9).
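The evaluation of Eq. (1) for a single time trace can be sketched as follows. This is an illustrative implementation under stated assumptions, not the analysis code used for the measurements: photons are applied in chronological order, the matrix exponential is obtained by eigendecomposition, and the vector is renormalized after every photon to avoid numerical underflow. All function names and all numerical values in the example are hypothetical.

```python
import numpy as np


def rate_matrix(k_on_bar, k_off, k_plus_b, k_minus_b):
    """Rate matrix of Eq. (2); states ordered as unbound, bound, dark."""
    return np.array([
        [-(k_on_bar + k_plus_b), k_off, k_minus_b],
        [k_on_bar, -k_off, 0.0],
        [k_plus_b, 0.0, -k_minus_b],
    ])


def equilibrium_populations(K):
    """Solve K p = 0 together with sum(p) = 1 by least squares."""
    n = K.shape[0]
    A = np.vstack([K, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p


def log_likelihood(colors, delays, K, n_D, n_A, p_ini):
    """Logarithm of Eq. (1) for one photon time trace.

    colors : sequence of 'D' or 'A', one entry per photon
    delays : inter-photon times (first entry 0)
    n_D, n_A : per-state donor and acceptor photon rates (1-D arrays)
    """
    n_D = np.asarray(n_D, dtype=float)
    n_A = np.asarray(n_A, dtype=float)
    M = K - np.diag(n_D) - np.diag(n_A)
    evals, evecs = np.linalg.eig(M)
    evecs_inv = np.linalg.inv(evecs)

    v = np.asarray(p_ini, dtype=float)
    log_l = 0.0
    for color, tau in zip(colors, delays):
        # exp(M * tau) via V diag(exp(lambda * tau)) V^-1
        propagator = ((evecs * np.exp(evals * tau)) @ evecs_inv).real
        v = propagator @ v
        # Multiply by the diagonal photon-rate matrix of the detected color.
        v = (n_D if color == "D" else n_A) * v
        norm = v.sum()  # p_fin^T = (1, 1, 1) sums over all states
        log_l += np.log(norm)
        v = v / norm
    return log_l


# Hypothetical parameters (rates in 1/ms), for illustration only.
K = rate_matrix(k_on_bar=0.02, k_off=0.01, k_plus_b=0.005, k_minus_b=0.5)
p_eq = equilibrium_populations(K)
n_D = [50.0, 5.0, 5.0]    # unbound, bound, dark (dark constrained to bound)
n_A = [2.0, 45.0, 2.0]    # unbound, bound, dark (dark constrained to unbound)
print(log_likelihood(["D", "D", "A"], [0.0, 0.03, 0.01], K, n_D, n_A, p_eq))
```

In a full analysis, the sum of such log likelihoods over all time traces would be maximized with a numerical optimizer with respect to the rate coefficients and photon rates.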

To obtain the second-order association rate coefficient, $$k_{{\mathrm{on}}} = \bar k_{{\mathrm{on}}}/c_{{\mathrm{NCBD}}}$$, the concentration of labeled protein free in solution needs to be known accurately. Because the concentrations can vary by up to 25% from experiment to experiment due to surface adhesion and pipetting errors, concentrations were determined directly in the sample with fluorescence correlation analysis57. The fluorescence of CF660R-labeled NCBD in solution was measured before and after each experiment, and the amplitude of the correlation curve was used to determine the average number of molecules inside the confocal volume, which is proportional to the concentration. The nominal concentrations were then corrected by the relative concentrations found from the correlation analysis.

For converting the photon count rates to transfer efficiencies, they need to be corrected for background fluorescence (bgA and bgD), crosstalk between the detection channels (acceptor emission to donor channel, βAD, and donor emission to acceptor channel, βDA), acceptor direct excitation (α), and differences in the quantum yields of the dyes and the detection efficiencies of the two channels (γ). We determined bgA and bgD for each time trace after the molecule had photobleached and corrected the measured photon count rates:

$$n_{{\mathrm{A}},j}^{\prime} = n_{{\mathrm{A}},j} - {\mathrm{bg}}_{{\mathrm{A}},j}\hskip 6pt {\mathrm{and}}\hskip 6pt n_{{\mathrm{D}},j}^\prime = n_{{\mathrm{D}},j} - {\mathrm{bg}}_{{\mathrm{D}},j}.$$
(3)

βAD and α are negligible for these dye pairs in our instrument, so we can calculate the transfer efficiency as:

$$E_j = \frac{{n_{{\mathrm{A}},j}^\prime - \beta _{{\mathrm{DA}},j}n_{{\mathrm{D}},j}^\prime }}{{n_{{\mathrm{A}},j}^\prime - \beta _{{\mathrm{DA}},j}n_{{\mathrm{D}},j}^\prime + \gamma _jn_{{\mathrm{D}},j}^\prime }}.$$
(4)

Since we know that EU = 0, we can determine βDA,j and γj directly from the measured time traces. In the unbound state, Eq. (4) simplifies to

$$n_{{\mathrm{A}},j}^{\prime {\mathrm{U}}} - \beta _{{\mathrm{DA}},j}n_{{\mathrm{D}},j}^{\prime {\mathrm{U}}} = 0,$$
(5)

and therefore

$$\beta _{{\mathrm{DA}},j} = \frac{{n_{{\mathrm {A}},j}^{\prime {\mathrm{U}}}}}{{n_{{\mathrm{D}},j}^{\prime {\mathrm{U}}}}}.$$
(6)

After correcting for γj, the total photon count rates in the bound and unbound state should be the same:

$$n_{{\mathrm{A}},j}^{\prime {\mathrm{B}}} - \beta _{{\mathrm{DA}},j}n_{{\mathrm{D}},j}^{\prime {\mathrm{B}}} + \gamma _jn_{{\mathrm{D}},j}^{\prime {\mathrm{B}}} = n_{{\mathrm{A}},j}^{\prime {\mathrm{U}}} - \beta _{{\mathrm{DA}},j}n_{{\mathrm{D}},j}^{\prime {\mathrm{U}}} + \gamma _jn_{{\mathrm{D}},j}^{\prime {\mathrm{U}}}.$$
(7)

Since $$n_{{\mathrm{A}},j}^{\prime {\mathrm{U}}} - \beta _{{\mathrm{DA}},j}n_{{\mathrm{D}},j}^{\prime {\mathrm{U}}} = 0$$, and we know βDA,j, we can calculate γj:

$$\gamma _j = \frac{{n_{{\mathrm{A}},j}^{\prime {\mathrm{B}}} - \beta _{{\mathrm{DA}},j}n_{{\mathrm{D}},j}^{\prime {\mathrm{B}}}}}{{n_{{\mathrm{D}},j}^{\prime {\mathrm{U}}} - n_{{\mathrm{D}},j}^{\prime {\mathrm{B}}}}}.$$
(8)

We calculated βDA,j and γj for each time trace and used them to calculate EB,j with Eq. (4). EB,j was then averaged over all time traces to get a mean transfer efficiency, 〈EB〉. All parameters determined from long photon time traces are listed in Supplementary Tables 1 and 2.
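The correction steps of Eqs. (3)–(8) can be sketched for a single time trace as follows. This is a minimal illustration, not the authors' analysis code; the function name `fret_corrections` and its argument names are hypothetical, and the inputs are assumed to be mean count rates of the unbound and bound states plus the background rates measured after photobleaching.

```python
def fret_corrections(nA_U, nD_U, nA_B, nD_B, bgA, bgD):
    """Return (beta_DA, gamma, E_B) for one time trace from raw count rates.

    nA_U, nD_U: acceptor/donor rates in the unbound state
    nA_B, nD_B: acceptor/donor rates in the bound state
    bgA, bgD:   background rates of the two channels
    """
    # Eq. (3): background subtraction
    aU, dU = nA_U - bgA, nD_U - bgD
    aB, dB = nA_B - bgA, nD_B - bgD

    # Eq. (6): donor-to-acceptor crosstalk, using E_U = 0
    beta_DA = aU / dU

    # Eq. (8): gamma from equal corrected total rates in U and B
    gamma = (aB - beta_DA * dB) / (dU - dB)

    # Eq. (4): transfer efficiency of the bound state
    acceptor_corrected = aB - beta_DA * dB
    E_B = acceptor_corrected / (acceptor_corrected + gamma * dB)
    return beta_DA, gamma, E_B
```

Averaging the returned `E_B` over all time traces would then give 〈EB〉.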

### Measuring transition path times

To measure transition path times, high-intensity time traces were recorded in an automated fashion. The piezo-driven scanning stage of the microscope allows surface-immobilized labeled proteins to be localized in a 20 µm × 20 µm region of the cover slide. In the next step, the identified molecules are brought into focus one by one. The flow chart of the data acquisition procedure is detailed in Supplementary Fig. 1A. Initially, fluorescence is recorded at a laser power of 0.5 µW (measured at the back aperture of the objective), and donor and acceptor photons are binned (binning interval 10 ms). If the photon count ratio nA / (nA + nD) is below 0.5 for five consecutive bins (i.e., no binding partner is bound), the laser is switched to high power (5–50 µW) for 0.9 s in order to detect a potential binding event with much higher photon count rates. Afterwards, the laser is switched off and the objective is moved to the next ACTR molecule. By always switching the laser to high power when the ACTR molecule is in the unbound state, we increase the probability of observing a binding transition instead of an unbinding transition during the period of high laser power. Additionally, the initial part of the recording at low laser power allows us to verify that the laser is indeed positioned on a functional molecule that shows anti-correlated changes in donor and acceptor signal characteristic of binding and unbinding. In Supplementary Fig. 1B, an example of a time trace with the switch between low and high laser power is shown.
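The switching criterion described above can be illustrated with a short sketch. The function name and the treatment of empty bins are assumptions made here for illustration; the actual instrument control software is not described in the text.

```python
def should_switch_to_high_power(donor_bins, acceptor_bins,
                                threshold=0.5, n_consecutive=5):
    """Return True if the last n_consecutive 10 ms bins all indicate
    the unbound state, i.e. nA / (nA + nD) < threshold in each of them."""
    if len(donor_bins) != len(acceptor_bins):
        raise ValueError("donor and acceptor bin lists must have equal length")
    if len(donor_bins) < n_consecutive:
        return False
    recent = zip(donor_bins[-n_consecutive:], acceptor_bins[-n_consecutive:])
    for n_d, n_a in recent:
        total = n_d + n_a
        if total == 0:
            # Assumption: a bin without photons is treated as inconclusive.
            return False
        if n_a / total >= threshold:
            return False
    return True
```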

We monitored only binding transitions because unbinding transitions exhibit the same change in observed FRET as acceptor photobleaching (in both cases the transfer efficiency drops to zero), which would bias the observed transition path times. In Supplementary Fig. 8, the log likelihood difference plots are compared for binding transitions, the unbinding/photobleaching transitions, and all transitions combined. For the unbinding/photobleaching transitions, no significant peak in the log likelihood difference is observed, as expected if transition paths for photobleaching are much faster than for unbinding.

### Analysis of high-intensity photon time traces

The high-intensity time traces were inspected and transitions were identified visually. Around each transition, a time window was placed such that it contained no other transitions or blinking events. The duration of this window was chosen so that it was at least 1 ms long and contained at least 1000 photons. The range of the resulting window lengths is shown for each dataset in Supplementary Table 3. The photon count rates of the unbound and bound states were determined from the donor and acceptor emission before and after the transition.

To exclude time traces with blinking, blinking events were identified in the following way: For every detection channel and conformational state, the probability for each observed inter-photon time was calculated given the observed mean photon rate and number of photons, assuming exponentially distributed inter-photon times. Blinking events were defined as inter-photon times with a probability of less than 0.01, and the window used for analysis was chosen small enough to exclude all blinking events. If a blinking event occurred within less than 1 ms from the transition, the time trace was not used for analysis. The resulting photon time traces were then used for the maximum likelihood analysis. Supplementary Fig. 2 shows representative time traces, and Supplementary Table 3 shows for each dataset the number of analyzed transitions, the average total photon count rates, the range of window lengths, and the resulting 〈tTP〉 and $$\hat E_{\mathrm{I}}$$.
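One possible reading of this blinking criterion is sketched below. The text does not give the exact formula, so the following is an assumption: for a segment with mean rate n and N inter-photon times, a gap Δt is assigned the probability that at least one of N exponentially distributed gaps is this long or longer, 1 − (1 − exp(−nΔt))^N, and gaps with a probability below 0.01 are flagged. The function name `blinking_gaps` is hypothetical, and the function would be applied separately to each detection channel and state.

```python
import math

def blinking_gaps(photon_times, p_cut=0.01):
    """Return indices i of gaps (photon_times[i] -> photon_times[i+1])
    that are improbably long for a Poisson process with the observed rate."""
    gaps = [t1 - t0 for t0, t1 in zip(photon_times, photon_times[1:])]
    n_gaps = len(gaps)
    if n_gaps == 0:
        return []
    rate = n_gaps / (photon_times[-1] - photon_times[0])  # mean photon rate
    flagged = []
    for i, gap in enumerate(gaps):
        p_single = math.exp(-rate * gap)  # P(one gap >= observed gap)
        if p_single >= 1.0:
            p_any = 1.0
        else:
            # 1 - (1 - p_single)**n_gaps, written to stay accurate for tiny p
            p_any = -math.expm1(n_gaps * math.log1p(-p_single))
        if p_any < p_cut:
            flagged.append(i)
    return flagged
```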

### Maximum likelihood analysis of binding transitions

To obtain transition path times, we apply the method introduced by Chung and Eaton10,11,12. To approximate transition paths, a simple three-state model was used, where the transition path is described by a virtual intermediate state, I, between the unbound and bound states, U and B, respectively (here we assume that NCBD is in solution and ACTR immobilized; experiments with ACTR in solution and immobilized NCBD are described analogously):

$${\mathrm{U}}\ \mathop{\leftrightarrows}\limits_{{k_{\mathrm{I}}}}^{{k_{{\mathrm{on}}}^\prime \cdot c_{{\mathrm{NCBD}}}}}\ {\mathrm{I}}\ \mathop{\leftrightarrows}\limits_{{k_{{\mathrm{off}}}^\prime }}^{{k_{\mathrm{I}}}}\ {\mathrm{B}}.$$

The depopulation of I is described by the rate coefficient kI. The lifetime of the intermediate state, τI = 1/(2kI), corresponds to the transition path time, tTP. The rates from I to U and I to B are not necessarily equal; however, in our analysis we can only measure the lifetime of I, which is the inverse of the sum of the two rates. Since we only consider segments in the time traces where transitions from U to B occur, and we assume that we observe no U → I → U and B → I → B transitions, we set the rate coefficients for leaving I to be equal to simplify the analysis10,11,12. The rate coefficients to I from both directions are $$k_{{\mathrm{on}}}^\prime \cdot c_{{\mathrm{NCBD}}}$$ and $$k_{{\mathrm{off}}}^\prime$$. They are related to kon and koff in a two-state model that assumes an instantaneous transition,

$${\mathrm{U}}\ \mathop{\leftrightarrows}\limits_{{k_{{\mathrm{off}}}}}^{{k_{{\mathrm{on}}} \cdot c_{{\mathrm{NCBD}}}}}\ {\mathrm{B,}}$$

via

$$k_{{\mathrm{on}}} \approx \frac{1}{2}k_{{\mathrm{on}}}^\prime \hskip 6pt {\mathrm{and}}\hskip 6pt k_{{\mathrm{off}}} \approx \frac{1}{2}k_{{\mathrm{off}}}^\prime .$$
(9)

The factor 1/2 arises because the intermediate state can also react back to the original state in the three-state model, and so on average only every second attempt leads to binding or unbinding.

The idea behind this approach of transition path time analysis is to compare the likelihood of an instantaneous transition, Lj(τI = 0), with the likelihood of an intermediate state with lifetime τI and transfer efficiency EI, Lj(τI, EI). Schematic FRET efficiency time traces for these two cases are shown in Fig. 2a. In both cases, the likelihood is calculated according to the general formula in Eq. (1). For Lj(0), we use the rate matrix for the two-state model given by

$${\mathbf{K}} = \left( {\begin{array}{*{20}{c}} { - \bar k_{{\mathrm{on}}}} & {k_{{\mathrm{off}}}} \\ {\bar k_{{\mathrm{on}}}} & { - k_{{\mathrm{off}}}} \end{array}} \right),$$
(10)

and for Lj(τI, EI) we use the three-state model,

$${\mathbf{K}} = \left( {\begin{array}{*{20}{c}} { - \bar k_{{\mathrm{on}}}^\prime } & {k_{\mathrm{I}}} & 0 \\ {\bar k_{{\mathrm{on}}}^\prime } & { - 2k_{\mathrm{I}}} & {k_{{\mathrm{off}}}^\prime } \\ 0 & {k_{\mathrm{I}}} & { - k_{{\mathrm{off}}}^\prime } \end{array}} \right),$$
(11)

where $$\bar k_{{\mathrm{on}}} = k_{{\mathrm{on}}} \cdot c_{{\mathrm{NCBD}}}$$ and $$\bar k_{{\mathrm{on}}}^\prime = k_{{\mathrm{on}}}^\prime \cdot c_{{\mathrm{NCBD}}}$$. To prevent random fluctuations in photon rate from being misrecognized as transitions to I, $$\bar k_{{\mathrm{on}}}$$ and $$k_{{\mathrm{off}}}$$ were set to 0.1 s−1 and $$\bar k_{{\mathrm{on}}}^\prime$$ and $$k_{{\mathrm{off}}}^\prime$$ to 0.2 s−1, which is slow compared to the inverse of the average length of the fluorescence time traces. This approach is valid since we directly compare the two models10. Since we used only time traces starting in the unbound state and ending in the bound state, we have $${\mathbf{p}}_{{\mathrm{ini}}}^{\mathrm{T}} = (1,0)$$ or $${\mathbf{p}}_{{\mathrm{ini}}}^{\mathrm{T}} = (1,0,0)$$, and $${\mathbf{p}}_{{\mathrm{fin}}}^{\mathrm{T}} = (0,1)$$ or $${\mathbf{p}}_{{\mathrm{fin}}}^{\mathrm{T}} = (0,0,1)$$. For the two-state model, nD,j and nA,j are

$${\mathbf{n}}_{{\mathrm{D}},j} = \left( {\begin{array}{*{20}{c}} {n_{{\mathrm{D}},j}^{\mathrm{U}}} & 0 \\ 0 & {n_{{\mathrm{D}},j}^{\mathrm{B}}} \end{array}} \right)\hskip 6pt {\mathrm{and}}\hskip 6pt {\mathbf{n}}_{{\mathrm{A}},j} = \left( {\begin{array}{*{20}{c}} {n_{{\mathrm{A}},j}^{\mathrm{U}}} & 0 \\ 0 & {n_{{\mathrm{A}},j}^{\mathrm{B}}} \end{array}} \right).$$
(12)

These rates are obtained from the photon rates in the individual time traces before and after the transition. For the three-state model, the corresponding matrices are

$${\mathbf{n}}_{{\mathrm{D}},j} = \left( {\begin{array}{*{20}{c}} {n_{{\mathrm{D}},j}^{\mathrm{U}}} & 0 & 0 \\ 0 & {n_{{\mathrm{D}},j}^{\mathrm{I}}} & 0 \\ 0 & 0 & {n_{{\mathrm{D}},j}^{\mathrm{B}}} \end{array}} \right)\hskip 6pt {\mathrm{and}}\hskip 6pt {\mathbf{n}}_{{\mathrm{A}},j} = \left( {\begin{array}{*{20}{c}} {n_{{\mathrm{A}},j}^{\mathrm{U}}} & 0 & 0 \\ 0 & {n_{{\mathrm{A}},j}^{\mathrm{I}}} & 0 \\ 0 & 0 & {n_{{\mathrm{A}},j}^{\mathrm{B}}} \end{array}} \right).$$
(13)

The photon rates of the intermediate state of the three-state model are given by

$$n_{{\mathrm{c}},j}^{\mathrm{I}} = n_{{\mathrm{c}},j}^{\mathrm{U}} + \frac{{E_{\mathrm{I}} - \left\langle {E_{\mathrm{U}}} \right\rangle }}{{\left\langle {E_{\mathrm{B}}} \right\rangle - \left\langle {E_{\mathrm{U}}} \right\rangle }}\left( {n_{{\mathrm{c}},j}^{\mathrm{B}} - n_{{\mathrm{c}},j}^{\mathrm{U}}} \right)\hskip 6pt {\mathrm{with}}\hskip 6pt {\mathrm{c = A,D}}$$
(14)

where 〈EU〉 is zero, and 〈EB〉 was determined from long time traces acquired at low excitation power (see Analysis of long photon time traces).
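As an illustration of Eqs. (11) and (14), the inputs of the three-state model for one time trace can be assembled as follows. This is a sketch with hypothetical function names; rates are in s−1 and the states are ordered (U, I, B).

```python
def intermediate_rate(n_U, n_B, E_I, E_B_mean, E_U_mean=0.0):
    """Eq. (14): photon rate of state I for one channel (donor or acceptor),
    interpolated linearly between the rates of U and B."""
    fraction = (E_I - E_U_mean) / (E_B_mean - E_U_mean)
    return n_U + fraction * (n_B - n_U)

def three_state_rate_matrix(k_on_bar_prime, k_off_prime, tau_I):
    """Eq. (11): rate matrix with k_I = 1 / (2 * tau_I).
    Column index = state of origin, row index = destination state."""
    k_I = 1.0 / (2.0 * tau_I)
    return [
        [-k_on_bar_prime, k_I, 0.0],
        [k_on_bar_prime, -2.0 * k_I, k_off_prime],
        [0.0, k_I, -k_off_prime],
    ]
```

Each column of the returned matrix sums to zero, as required for probability conservation.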

The likelihoods for a time trace to have originated from an instantaneous transition or from a transition of finite duration, corresponding to an intermediate state I with lifetime τI, and transfer efficiency EI can be calculated with Eq. (1). To compare the two models, the log likelihoods are subtracted,

$${\mathrm{\Delta ln}}L_j(\tau _{\mathrm{I}},E_{\mathrm{I}}) = {\mathrm{ln}}L_j(\tau _{\mathrm{I}},E_{\mathrm{I}}) - {\mathrm{ln}}L_j(0),$$
(15)

and τI and EI are varied systematically to obtain log likelihood difference plots (Fig. 2b). ΔlnLj values of multiple time traces can be added to yield a combined log likelihood difference; from its maximum, the most likely lifetime, $$\hat \tau _{\mathrm{I}} = \left\langle {t_{{\mathrm{TP}}}} \right\rangle$$, and most likely transfer efficiency, $$\hat E_{\mathrm{I}}$$, can be determined:

$${\mathrm{\Delta ln}}L = \mathop {\sum}\limits_j {{\mathrm{\Delta ln}}L_j} .$$
(16)

One can also find the most likely value for tTP of an individual transition by maximizing ΔlnLj, although with higher uncertainty.
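The grid scan over τI and EI and the summation of Eqs. (15) and (16) can be sketched as below. The log likelihood functions themselves (Eq. (1)) are passed in as callables and are not implemented here; `scan_delta_lnL`, `lnL_three_state`, and `lnL_two_state` are hypothetical names introduced for illustration.

```python
def scan_delta_lnL(traces, lnL_three_state, lnL_two_state, taus, efficiencies):
    """Sum Delta lnL_j over all traces on a (tau_I, E_I) grid.

    lnL_three_state(trace, tau_I, E_I) and lnL_two_state(trace) are assumed
    to return the log likelihoods of the two models for a single trace.
    Returns ((max_value, tau_hat, E_hat), surface), where surface maps
    (tau_I, E_I) to the summed log likelihood difference.
    """
    baseline = [lnL_two_state(trace) for trace in traces]  # lnL_j(0)
    surface = {}
    best = None
    for tau in taus:
        for e_i in efficiencies:
            total = sum(lnL_three_state(trace, tau, e_i) - l0
                        for trace, l0 in zip(traces, baseline))
            surface[(tau, e_i)] = total
            if best is None or total > best[0]:
                best = (total, tau, e_i)
    return best, surface
```

Calling the function with a single trace gives the per-transition estimate mentioned above.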

To test whether the peaks observed in the log likelihood difference plots originate from molecular binding rather than random fluctuations in the fluorescence signal, we used the following control: For each measured time trace, we deleted segments containing the transition and surrounding intervals of varying length (see Supplementary Fig. 9A). We analyzed these altered datasets again with the maximum likelihood method. If the likelihood peak is caused by the finite duration of the binding transition, then we expect the peak to disappear upon deleting the transition region. However, if it was due to fluctuations of the fluorescence signal, it would persist. In Supplementary Fig. 9B, the likelihood curves at $$E_{\mathrm{I}} = \hat E_{\mathrm{I}} = 0.72$$ are shown for time traces with segments of different lengths deleted. The likelihood peak disappears if more than about 250 µs are deleted, indicating that the measured peak is caused by the transition path times of binding transitions. Upon deleting segments of 80 µs, corresponding to 〈tTP〉, there is still a substantial peak, because many of the transitions are longer than 80 µs (owing to the tail of the tTP distribution, Fig. 3).

To simplify the analysis, we assume the barriers for escape from I to be symmetric in height. However, we can estimate the maximum asymmetry in barrier heights from I to U and from I to B compatible with our experimental observations based on the following considerations (cf. Fig. 4e): The ratio of about 0.1 between the observed association rate coefficient and a purely diffusion-limited collision rate (~$$10^{9}$$ M−1 s−1) yields an overall activation free-energy barrier for binding of ~2.3 kBT. From the observed dissociation rate and a preexponential factor of 0.5 μs (see main text), we obtain an overall activation free-energy barrier for dissociation of ~11 kBT for the case of equal barriers from I to U and B (since in that case kB→I = k′off = 2 koff = 32 s−1). In this case, our estimate for the barrier heights for the escape from I is ~5.8 kBT (see main text). These restraints correspond to the scenario shown in Fig. 4e (solid line). Our results would in principle, however, also be compatible with a situation where both the free energy of I and the barrier from I to B are reduced to the same extent. If we choose as a limit for this reduction the point where the free energies of I and B are equal, the barrier for I to U would be ~11 kBT. kI→U would then be negligible compared to kI→B for leaving I, in which case 1/kI→B = 80 μs, resulting in a barrier from I to B of ~5 kBT. The largest asymmetry of the barriers bounding I would thus be 11 kBT versus 5 kBT.
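The barrier estimates quoted above can be checked with a few lines of arithmetic. The sketch assumes a rate expression of the form k = (1/τ0) exp(−ΔG/kBT) with τ0 = 0.5 μs; this functional form is an assumption made here and is not spelled out in the text, although it reproduces the quoted values.

```python
import math

def barrier_kBT(rate_per_s, tau0_s=0.5e-6):
    """Barrier height in units of kBT from k = (1/tau0) * exp(-dG/kBT)."""
    return math.log(1.0 / (rate_per_s * tau0_s))

estimates = {
    # ratio of ~0.1 between observed and diffusion-limited association
    "binding": math.log(1.0 / 0.1),
    # k'_off = 2 k_off = 32 s^-1
    "dissociation": barrier_kBT(32.0),
    # symmetric escape from I: k_I = 1 / (2 * 80 us)
    "escape_symmetric": barrier_kBT(1.0 / (2 * 80e-6)),
    # fully asymmetric limit: k_I->B = 1 / (80 us)
    "escape_I_to_B": barrier_kBT(1.0 / 80e-6),
}

for name, value in estimates.items():
    print(f"{name}: {value:.2f} kBT")
```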

### Analysis of transition path time distributions

To quantify the distribution of transition path times, we calculated ΔlnLj for every transition individually (using the $$\hat E_{\mathrm{I}}$$ value found above), and for each transition j, we identified the $$\hat \tau _{{\mathrm{I}},j}$$ with the highest likelihood and generated a histogram from the resulting values (Fig. 3d). To find the underlying tTP distribution, we need to consider the broadening of the distribution due to the limited photon statistics. Photon time traces were thus simulated based on different theoretical transition path time distributions (see below) and analyzed in the same way as the measured data. To ensure that the photon statistics and shot-noise broadening are equivalent to those in the experimental data, we used the lengths and photon count rates of the measured time traces for the simulations. The simulations were performed in the following way: First, state trajectories were generated, each containing a single transition in the center, with a transition path time chosen randomly from the given theoretical distribution. For each state, photons were simulated with exponentially distributed inter-photon times, using the inverse of the total experimentally observed photon count rate of each state as mean inter-photon times. Photons were then randomly assigned to the acceptor or donor channel in accordance with the ratio of the corresponding photon count rates of the states observed experimentally. The photon count rates in the intermediate state were calculated using the $$\hat E_{\mathrm{I}}$$ obtained from the measured data.
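The simulation procedure described in this paragraph can be sketched as follows. The function and argument names are hypothetical, and the example rates in the usage note are placeholders rather than measured values.

```python
import random

def simulate_transition_trace(window, t_tp, rates, rng):
    """Simulate one photon time trace with a single U -> I -> B transition.

    window: total length of the trace in s
    t_tp:   transition path time in s, placed at the center of the window
    rates:  dict mapping 'U', 'I', 'B' to (donor_rate, acceptor_rate) in s^-1
    rng:    random.Random instance
    Returns a list of (arrival_time, channel) with channel 'D' or 'A'.
    """
    t_start = 0.5 * (window - t_tp)
    t_end = t_start + t_tp
    segments = [("U", 0.0, t_start), ("I", t_start, t_end), ("B", t_end, window)]
    photons = []
    for state, seg_start, seg_end in segments:
        n_d, n_a = rates[state]
        total = n_d + n_a
        # Exponential inter-photon times with mean 1/total; restarting the
        # clock at a segment boundary is valid because the process is memoryless.
        t = seg_start + rng.expovariate(total)
        while t < seg_end:
            channel = "A" if rng.random() < n_a / total else "D"
            photons.append((t, channel))
            t += rng.expovariate(total)
    return photons
```

For an exponential tTP distribution with a mean of 80 µs, `t_tp` could be drawn with `rng.expovariate(1 / 80e-6)`; for other barrier shapes it would be drawn from the corresponding numerically computed distribution.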

These simulated photon time traces were then analyzed in the same way as the measured time traces, and $$\hat \tau _{{\mathrm{I}},j}$$ was quantified for all transitions. The resulting $$\hat \tau _{{\mathrm{I}},j}$$ histogram, Hs, of the simulated data was then compared to the measured histogram, Hm, by calculating the χ2-distance:

$$\chi ^2 = \mathop {\sum}\limits_{i = 1} {\frac{{(H_{{\mathrm{m}},i} - H_{{\mathrm{s}},i})^2}}{{H_{{\mathrm{m}},i} + H_{{\mathrm{s}},i}}}} .$$
(17)

By simulating data with different theoretical tTP distributions and finding the one with the smallest χ2-distance to the measured data, we can identify the distribution of underlying transition path times that agrees best with the measured data. In addition to calculating the χ2-distance, we also performed a k-sample Anderson−Darling test58 (see Supplementary Fig. 5), which tests whether two samples originate from the same underlying distribution, independent of the functional form of the distribution. Like the χ2-distance, this method finds the best agreement with our measured data for an exponential distribution. The accuracy of this analysis was tested based on simulations (see Brownian dynamics simulations of transition paths).
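The χ2-distance of Eq. (17) is straightforward to compute from two histograms with identical binning. In the sketch below, bins that are empty in both histograms are skipped to avoid division by zero; this handling is an assumption, since the text does not specify it.

```python
def chi2_distance(h_measured, h_simulated):
    """Eq. (17): chi-squared distance between two histograms."""
    if len(h_measured) != len(h_simulated):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for m, s in zip(h_measured, h_simulated):
        if m + s > 0:  # assumption: bins empty in both histograms contribute 0
            total += (m - s) ** 2 / (m + s)
    return total
```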

There are three peaks present in the $$\hat \tau _{{\mathrm{I}},j}$$ histograms (see Fig. 3d): The largest at 〈tTP〉, a smaller one at ~100 ns, and a third one at ~1 ns. The one at 1 ns arises from all transitions lacking a maximum in their ΔlnLj plots, which are all collected in the shortest bin. The peak at ~100 ns also appears in the simulated datasets (see Fig. 3d), suggesting that it originates from the analysis of transitions that are too short or have too low a photon rate to be resolved accurately. To test this hypothesis, we simulated photon time traces of binding transitions with varying photon count rates and determined $$\hat \tau _{{\mathrm{I}},j}$$ histograms as for the experimental data (see Supplementary Fig. 7). Indeed, all three peaks are present in the simulated results, and the ones at ~1 ns and ~100 ns decrease in amplitude with increasing photon count rates. Even though this observation indicates that some transitions in our measurements are not resolved, the method of determining the barrier shape is still expected to be valid, since we take the limited photon statistics into account when simulating the photon time traces we compare to the experimental data (Fig. 3d).

### Theoretical transition path time distributions

In a description of coupled folding and binding as Brownian motion on a 1D free-energy surface, the distribution of transition path times depends on the shape of the free-energy barrier and the effective diffusion coefficient. While the barrier shape determines the shape of the tTP distribution, the diffusion coefficient only determines the overall timescale of the distribution. We tested different kinds of barrier shapes, including parabolic barriers of different heights, a flat barrier, and barriers with intermediates of different depths (Fig. 3a, b). We modeled the barriers with the following equations:

$${\mathrm{Barriers}}\,{\mathrm{with}}\,{\mathrm{transition}}\,{\mathrm{state}}:V(x) = - \frac{{{\mathrm{\Delta }}V}}{{x_1^2}}x^2,$$
(18)
$${\mathrm{Flat}}\,{\mathrm{barrier}}:V(x) = 0,$$
(19)
$${\mathrm{Barriers}}\,{\mathrm{with}}\,{\mathrm{intermediate}}:V(x) = - 2\frac{{{\mathrm{\Delta }}V}}{{x_1^2}}x^2 + \frac{{{\mathrm{\Delta }}V}}{{x_1^4}}x^4.$$
(20)

The transition path boundaries are x0 and x1, with x1 = −x0, and the heights or depths of the potentials are given by ΔV = V(0) − V(x1).

From these functions, we calculated the tTP distributions numerically according to the procedure described in the appendix of ref. 28. Specifically, this distribution is proportional to the flux of trajectories starting infinitely close to the left boundary, at x = x0 + ε, and exiting through the right boundary, x1, without returning to x0:

$$p(t_{{\mathrm{TP}}}) = \lim\limits_{\varepsilon \to 0} \frac{{J(x_1,t_{{\mathrm{TP}}})}}{{{\int}_0^\infty J{(x_1,t)}{\mathrm d}t }}.$$
(21)

The flux J was computed by solving the Smoluchowski equation,

$$\frac{{\partial p(x,t)}}{{\partial t}} = - \frac{{\partial J}}{{\partial x}},$$
(22)
$$J(x,t) = - D\left( {\beta V\prime (x) + \frac{\partial }{{\partial x}}} \right)p(x,t),$$
(23)

with absorbing boundary conditions, p(x0, t) = p(x1, t) = 0. The spectral expansion method was used to solve the Smoluchowski equation numerically: in this method, the Smoluchowski equation is first transformed into the equivalent Schrödinger equation, with absorbing boundaries being equivalent to introducing infinite potential walls at x = x0, x1. The Schrödinger equation is then solved by diagonalizing its effective Hamiltonian using particle-in-a-box wavefunctions as the basis.
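For orientation, the standard textbook substitution behind this transformation can be written out as follows. It is not taken from the source; β = 1/kBT, and ψ, H, and φn are notation introduced here. Writing

$$p(x,t) = {\mathrm e}^{ - \beta V(x)/2}\psi (x,t)$$

and inserting into Eqs. (22) and (23) gives an imaginary-time Schrödinger equation,

$$\frac{{\partial \psi (x,t)}}{{\partial t}} = - H\psi (x,t),\hskip 6pt H = - D\frac{{\partial ^2}}{{\partial x^2}} + D\left( {\frac{{\left( {\beta V\prime (x)} \right)^2}}{4} - \frac{{\beta V\prime \prime (x)}}{2}} \right).$$

The absorbing boundaries translate into ψ(x0, t) = ψ(x1, t) = 0, which the particle-in-a-box functions satisfy by construction:

$$\varphi _n(x) = \sqrt {\frac{2}{{x_1 - x_0}}} \sin \left( {\frac{{n\pi (x - x_0)}}{{x_1 - x_0}}} \right),\hskip 6pt n = 1,2, \ldots$$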

For each tTP distribution calculated in this way, simulations of photon time traces were performed with different diffusion coefficients, corresponding to different values of 〈tTP〉 (ranging from 90 to 130% of the measured 〈tTP〉, in steps of 5%). Twenty-seven simulations were done for each value of 〈tTP〉. $$\hat \tau _{\mathrm{I}}$$ was determined for each simulation with the maximum-likelihood method, and for each set of 27, the average of all $$\hat \tau _{\mathrm{I}}$$ was determined. The set with an average $$\hat \tau _{\mathrm{I}}$$ closest to the experimentally observed 〈tTP〉 was then used for comparison to the measured $$\hat \tau _{{\mathrm{I}},j}$$ histograms. The diffusion coefficients resulting in the best agreement with the measured values of 〈tTP〉 are listed in Supplementary Table 4.

### Brownian dynamics simulations of transition paths

To validate our method of finding the transition path time distribution, we performed Brownian dynamics simulations to generate transition paths for different barrier shapes. We then simulated fluorescence time traces based on these transition paths and analyzed them as described in Analysis of transition path time distributions to test whether we could correctly identify the barrier shape on which the simulations were based. We used three different representative potentials for the Brownian dynamics simulations (see Supplementary Fig. 4A):

Barrier with transition state:

$$V(r) = 80\left( {\left( {1.1\left( {r - 1} \right)} \right)^4 - \left( {1.1\left( {r - 1} \right)} \right)^2} \right).$$
(24)

Flat barrier top:

$$V(r) = 80\left( {\left( {1.1\left( {r - 1} \right)} \right)^4 - \left( {1.1\left( {r - 1} \right)} \right)^2} \right) - 6{\mathrm e}^{ - 2\left( {\frac{{20}}{7}} \right)^2\left( {r - 1} \right)^2}.$$
(25)

Barrier with intermediate:

$$V(r) = 80\left( {\left( {1.1\left( {r - 1} \right)} \right)^4 - \left( {1.1\left( {r - 1} \right)} \right)^2} \right) - \frac{{44}}{5}{\mathrm e}^{ - 2 \cdot 5^2\left( {r - 1} \right)^2}.$$
(26)

We adjusted the effective diffusion coefficient, D, for each potential so that 〈tTP〉 between r0 = 0.8 and r1 = 1.2 (dashed lines in Supplementary Fig. 4A,B) was ~80 µs. We then simulated transitions with time steps of 0.1 µs (see Supplementary Fig. 4B) and converted distances into transfer efficiencies using the Förster equation:

$$E = \frac{1}{{1 + \left( {r/R_0} \right)^6}},$$
(27)

with the Förster radius R0 = 1. The transfer efficiency time traces were then discretized into 20 states in steps of ΔE = 0.05. To ensure that the photon statistics are equivalent to those of the experimental data (dataset at 1.28 cP), we used the measured photon count rates to calculate the donor and acceptor photon count rates of each state,

$$n_{{\mathrm{D}},j}(E) = E\left( {n_{{\mathrm{D}},j}^{\mathrm{B}} - n_{{\mathrm{D}},j}^{\mathrm{U}}} \right) + n_{{\mathrm{D}},j}^{\mathrm{U}}\hskip 6pt {\mathrm{and}}\hskip 6pt n_{{\mathrm{A}},j}(E) = E\left( {n_{{\mathrm{A}},j}^{\mathrm{B}} - n_{{\mathrm{A}},j}^{\mathrm{U}}} \right) + n_{{\mathrm{A}},j}^{\mathrm{U}},$$
(28)

and simulated photon time traces based on the simulated state trajectories and the determined photon count rates as described in Analysis of transition path time distributions. These photon time traces were then analyzed in the same way as the measured data. Both the χ2-distance and the k-sample Anderson−Darling test correctly identify the original barrier shapes (Supplementary Fig. 4C).
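The chain from potential to state-dependent photon rates can be illustrated with a minimal sketch for the potential of Eq. (24). Several choices below are assumptions not stated in the text: the integration scheme (Euler–Maruyama), V(r) being in units of kBT, and the values of D and the time step in the test, which are placeholders. All function names are hypothetical.

```python
import math
import random

def dV_transition_state(r):
    """Derivative of Eq. (24), V(r) = 80 (u^4 - u^2) with u = 1.1 (r - 1)."""
    u = 1.1 * (r - 1.0)
    return 80.0 * (4.0 * u**3 - 2.0 * u) * 1.1

def brownian_trajectory(r_start, D, dt, n_steps, dV, rng):
    """Overdamped Langevin dynamics, integrated with Euler-Maruyama steps
    (assumed scheme); the potential is taken to be in units of kBT."""
    sigma = math.sqrt(2.0 * D * dt)
    r = r_start
    trajectory = [r]
    for _ in range(n_steps):
        r += -D * dV(r) * dt + sigma * rng.gauss(0.0, 1.0)
        trajectory.append(r)
    return trajectory

def forster_efficiency(r, R0=1.0):
    """Eq. (27)."""
    return 1.0 / (1.0 + (r / R0) ** 6)

def discretize_efficiency(E, step=0.05):
    """Map a transfer efficiency to one of 1/step states (20 for step = 0.05)."""
    n_states = int(round(1.0 / step))
    return min(int(E / step), n_states - 1)

def state_photon_rate(E, n_U, n_B):
    """Eq. (28), for either the donor or the acceptor channel."""
    return E * (n_B - n_U) + n_U
```

Transition paths would then be extracted as the trajectory segments that cross from r0 = 0.8 to r1 = 1.2 without returning to r0.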