Understanding protein folding under conditions similar to those found in vivo remains challenging. Folding occurs mainly vectorially as a polypeptide emerges from the ribosome or from a membrane translocon. Protein folding during membrane translocation is particularly difficult to study. Here, we describe a single-molecule method to characterize the folded state of individual proteins after membrane translocation, by monitoring the ionic current passing through the pore. We tag both N and C termini of a model protein, thioredoxin, with biotinylated oligonucleotides. Under an electric potential, one of the oligonucleotides is pulled through an α-hemolysin nanopore, driving the unfolding and translocation of the protein. We trap the protein in the nanopore as a rotaxane-like complex using streptavidin stoppers. The protein is subjected to cycles of unfolding, translocation and refolding by switching the voltage polarity. We find that the refolding pathway after translocation is slower than in bulk solution due to the existence of kinetic traps.
Protein folding has attracted much attention during the last 50 years, because it is central to cell function and disease. Early studies examined how proteins folded in vitro, typically after chemical or thermal denaturation1,2. More recently, technical developments have allowed a deeper understanding of protein folding under experimental conditions closer to those found in vivo3,4,5,6,7. A relevant event within the cell is the emergence of a polypeptide from the ribosome exit tunnel5,8. The nascent chain emerges N terminus-first and, before the C terminus is complete, the protein may start folding9,10, which is a strikingly different process to experimental refolding in bulk solution4. Second, during or after synthesis by ribosomes, proteins may be transported between compartments in the cell11 or into the extracellular medium12. During these processes, they must cross membranes by using one of a diverse set of translocons. For example, in bacteria alone, there are at least ten secretion systems13,14 and translocated substrates include many of the virulence factors secreted by pathogens14. Usually, secretion systems translocate unfolded polypeptides, and therefore the substrate must refold after or during the translocation process15,16,17.
Currently, pulse-chase experiments, which have a temporal resolution of minutes, are often used to study the folded state of proteins following membrane translocation in cells18,19. Similar temporal resolution is available when experiments are performed in vitro with reconstituted translocon components20. Further, these ensemble experiments frequently fail to resolve the elementary steps of complex processes21. Studies of translocation to the periplasm of substrates that contain disulfides have been a remarkable exception12. By using a mutant of DsbA, a disulfide-bond-forming enzyme, the formation of covalent mixed-disulfide complexes between DsbA and a polypeptide substrate was slowed, which allowed the resolution of intermediates in the folding process. However, this approach is limited to disulfide-containing substrates in bacteria.
We recently introduced a method to study co-translocational protein unfolding at the single-molecule level22,23,24,25, which avoids the requirement to synchronize an ensemble measurement. To achieve this, a single α-hemolysin pore (αHL) is inserted into a lipid bilayer. The αHL pore forms a conduit of ~2 nm diameter (~1.3 nm at its narrowest point), which can accommodate alpha helices but forces a protein to pass unfolded26. A model protein, thioredoxin (Trx), derivatized at one of its termini with a negatively-charged oligonucleotide, is driven into the pore by an electric field, which causes unfolding and translocation of the protein22,23. Steps in the ionic current carried by the pore represent intermediates in the translocation process27. In an alternative approach, a signal sequence is placed at the C terminus of a polypeptide and recognized by the molecular motor ClpX, which drives translocation24,25. Both approaches allow real-time tracking of the unfolding and translocation process. However, following translocation, the protein substrate is released into the trans compartment and cannot be further examined to determine whether it has refolded.
Here, we generate a rotaxane-like structure that prevents release of the protein substrate into the medium28 and allows examination of the folding status of the translocated substrate. The overall process involves unfolding, translocation and refolding of the protein substrate. Devoid of molecular motors and chaperones, our model system constitutes a simplification of the more complex in vivo process but has the vectorial nature of protein folding as in vivo. We first attached 3′-biotinylated oligonucleotides to each terminus of Trx. The oligonucleotide-Trx-oligonucleotide conjugate with one end bound to monovalent streptavidin (mSA)29 is driven into the nanopore by an applied potential. The direction of translocation (N-to-C or C-to-N) is ascertained by using distinguishable oligonucleotides. The oligonucleotide that reaches the trans side of the bilayer is captured by a second streptavidin (SA). The interlocked architecture of the threaded complex bound to one streptavidin at each side of the membrane constitutes a transmembrane protein rotaxane and is stable until either streptavidin dissociates. The protein substrate can then be driven from one side of the membrane to the other by repeatedly switching the polarity of the voltage between positive and negative. Our approach allows an examination of the evolution of the folding landscape of Trx, beginning within milliseconds after translocation.
Identification of the terminus leading protein translocation
We used a mutant of E. coli thioredoxin 2 (PDB code 2TRX) lacking the natural disulfide bond (C32S-C35S), with three mutations that increase stability (A22P-I23V-P68A)30 and both an N-terminal and a C-terminal cysteine (S1C and the added C109). This mutant, V5, was selectively modified on the N-terminal cysteine by using a click reaction with a 2-cyanobenzothiazole (CBT)31 to form a stable thiazoline ring32. To introduce the CBT functionality into a thiol-modified oligonucleotide, we coupled the amino group of 6-amino-2-CBT to p-maleimidophenyl isocyanate (Supplementary Fig. 1a, b). The product, a maleimide-CBT crosslinker, was reacted with the 5′-thiol of a 3′-biotinylated oligonucleotide composed of 30 cytosines, and the resultant 5′-CBT-oligo(dC)30-biotin-3′ was purified by ion-exchange chromatography (Fig. 1a and Supplementary Fig. 1c–f).
Following bacterial expression and purification of S1C-V5-C109, more than half of the N-terminal cysteines were found modified with pyruvate33, which was removed by incubation with methoxyamine exposing the 1,2-aminothiol (Supplementary Fig. 2). Subsequent incubation of the protein, reduced with TCEP, with 5′-CBT-oligo(dC)30-biotin-3′, followed by ion-exchange chromatography, produced a product that gave a single band of ~20 kDa upon electrophoresis (Supplementary Fig. 3a, d, f), consistent with a single oligonucleotide coupled to the N terminus of S1C-V5-C10931. Because CBT reacts site-specifically with 1,2-aminothiols, the C-terminal cysteine was not altered. The conjugate was then reacted at the C-terminal cysteine with the 5′-thiol of a 3′-biotinylated oligonucleotide, composed of 30 adenines, that had been activated with 2,2′-dipyridyl disulfide (Supplementary Fig. 3b, c)34. The purified product produced a band of ~30 kDa upon electrophoresis (Supplementary Fig. 3f), and the native mass spectrometry (native MS) was consistent with S1C-V5-C109 derivatized with different oligonucleotides at the N and C termini (hereafter V5 stands for the S1C-A22P-I23V-P68A-C32S-C35S-C109 mutant, and oligo(dC)30-V5-oligo(dA)30 for the mutant modified with the oligonucleotides, Supplementary Fig. 3g). One of the amide bonds of the maleimide group was hydrolyzed to completion during the synthesis of oligo(dC)30-V5-oligo(dA)30, which produced a + 18 peak in the MS but left the second amide and the linkage intact.
We added oligo(dC)30-V5-oligo(dA)30 to the cis side of a phospholipid bilayer containing a single αHL pore. Under an applied potential of +70 to +140 mV, the construct was driven into the pore, by either of the two oligonucleotides, causing the unfolding and translocation of the protein. Previously we have shown that an oligonucleotide tag at the C-terminus does not modify the stability of the native state as judged by the similar urea induced unfolding curve, but it only serves to drive the translocation22. The ionic current passing through the pore revealed the various molecular steps of the process23 (Fig. 1b–d). First the leading oligonucleotide, oligo(dC)30 for N terminus-first or oligo(dA)30 for C terminus-first, threaded into the pore (level 2 in Fig. 1e). The ionic current differed depending on the composition of the oligonucleotide, with oligo(dC)30 giving a higher residual conductance value than oligo(dA)30 (Ires% = 16.5 ± 0.5% (n = 36) and Ires% = 14.2 ± 0.2% (n = 43) at +140 mV, respectively, Fig. 1f, g), in accord with previous studies with immobilized oligonucleotides35,36. The ionic current during level 2 can therefore be used to determine the orientation in which oligo(dC)30-V5-oligo(dA)30 threads through the pore for each event that is registered. After threading, the protein unfolded by way of a long-lived intermediate (level 3 in Fig. 1c, d) that spontaneously unfolded and then passed through the pore by diffusion22 (level 4 in Fig. 1c, d). Once the leading oligonucleotide has entered the trans compartment and left the electric field, there is no longer a pulling force on the protein. Following the translocation of the remaining unfolded protein, the pore remained open until another oligo(dC)30-V5-oligo(dA)30 entered. The terminal oligonucleotide translocated too rapidly to generate a distinguishable ionic current level. 
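The orientation assignment described above can be sketched in a few lines. This is only an illustration, assuming that the residual current is expressed as a percentage of the open-pore current and that a simple midpoint cutoff between the two reported level 2 means separates the populations. The function names, the cutoff and the example currents are hypothetical and are not taken from the analysis pipeline used in this work.

```python
def residual_current_percent(i_blocked_pa, i_open_pa):
    """Residual current as a percentage of the open-pore current."""
    return 100.0 * i_blocked_pa / i_open_pa

# Mean level 2 values reported in the text at +140 mV.
IRES_DC30 = 16.5  # oligo(dC)30 leading: N terminus-first
IRES_DA30 = 14.2  # oligo(dA)30 leading: C terminus-first

def leading_terminus(ires_percent, cutoff=(IRES_DC30 + IRES_DA30) / 2):
    """Assign the entry direction; the midpoint cutoff is an assumption."""
    return "N" if ires_percent > cutoff else "C"

# Hypothetical event: 100 pA open pore, 14.0 pA during level 2.
ires = residual_current_percent(14.0, 100.0)
print(round(ires, 1), leading_terminus(ires))
```

In practice the assignment would be cross-checked against other features of the event rather than relying on a single threshold.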
The dwell time and ionic current of level 3 also depended on the direction of translocation, being longer and noisier if co-translocational unfolding is initiated through the N terminus, as already reported23. These features can also be used to determine the direction in which oligo(dC)30-V5-oligo(dA)30 initiates translocation, and the directionality of pore entry inferred from the Ires% values in level 2 agrees with the features observed in level 3. We study the C terminus-first translocation reaction because, when the protein is pulled back, it translocates N terminus-first, and vectorial refolding therefore proceeds as in most biological systems. For these molecules, the dwell time in level 2 reflects the unfolding kinetics of a C-terminal region22,23, and the dwell time in level 3 reflects the unfolding kinetics of the partly unfolded intermediate (Fig. 1c, d). The rate constants determined here (k2→3 = 6.9 ± 0.5 s−1 and k3→4 = 1.0 ± 0.1 s−1 at +100 mV, n = 160, Supplementary Fig. 8a, b) are in good agreement with those reported22 for the construct C1S-V5-C109-oligo(dC)30 (k2→3 = 6.0 ± 1.0 s−1 and k3→4 = 1.5 ± 0.1 s−1 at +100 mV). These results suggest that simultaneous N-terminal and C-terminal modifications have a minimal effect on the kinetics of co-translocational protein unfolding.
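As an illustration of how a rate constant relates to a set of dwell times, the sketch below uses the maximum-likelihood estimate for an exponential distribution (number of events divided by the summed dwell time). This is a swapped-in shortcut, not the histogram fitting used for the values above, and the dwell times are simulated rather than measured. The function name and the random seed are hypothetical.

```python
import math
import random

def rate_from_dwell_times(dwell_times_s):
    """Maximum-likelihood rate (s^-1) and its approximate standard error,
    assuming the dwell times are exponentially distributed."""
    n = len(dwell_times_s)
    k = n / sum(dwell_times_s)
    return k, k / math.sqrt(n)

# Simulated data: 160 dwell times drawn with a true rate of 6.9 s^-1.
random.seed(0)
simulated = [random.expovariate(6.9) for _ in range(160)]
k, se = rate_from_dwell_times(simulated)
print(f"k = {k:.1f} +/- {se:.1f} s^-1")
```

With 160 events the relative uncertainty of such an estimate is about 1/sqrt(160), or roughly 8%.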
Formation of a transmembrane protein rotaxane
Oligo(dC)30-V5-oligo(dA)30 present in the cis compartment was then treated with a sub-stoichiometric amount of mSA. Each of the two oligonucleotides carried a terminal biotin and, therefore, a significant fraction of the construct bound only a single mSA29, leaving the other terminus free. When these complexes were pulled into the pore by voltage (+100 to +140 mV), they produced the characteristic ionic current signature that represents threading, unfolding and translocation of the protein (Fig. 2a, b). However, following translocation, the ionic current signal does not recover to the open pore level but remains at a new level (level 5, in Fig. 2a, b). The residual current in level 5 also depended on whether the process was initiated N or C terminus-first, with Ires% = 17.8 ± 0.1% (n = 33, +140 mV) and Ires% = 17.4 ± 0.1% (n = 32, +140 mV), respectively (Supplementary Fig. 4). During level 5, mSA on the cis side atop the pore stops the movement of the construct, with the terminal oligonucleotide threaded into the pore35. The immobilized oligonucleotides were in the 5′-end-first orientation in level 5, and the higher Ires% values for the oligo(dA)30 blockades than for those arising from the oligo(dC)30 blockades, opposite to the result in level 2 (3′-end-first), is in agreement with previous work35,36.
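The rationale for using a sub-stoichiometric amount of mSA can be illustrated with a simple binomial estimate. The sketch assumes that the two biotins bind mSA independently and with equal probability p, which is a simplification that we did not test. The function name and the example value of p are hypothetical.

```python
def occupancy_fractions(p):
    """Fractions of construct with zero, one or two biotins bound to mSA,
    assuming independent and equivalent sites (an assumption)."""
    return {
        "none": (1 - p) ** 2,
        "one": 2 * p * (1 - p),
        "both": p ** 2,
    }

# Hypothetical case: one in four biotins occupied.
fractions = occupancy_fractions(0.25)
print(fractions)
```

Under this assumption, keeping p low makes doubly bound constructs, which cannot thread, a small minority of the bound species.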
The protein was hence in the trans compartment, and when the potential was changed from +140 mV to −40 mV the complex was pulled in the opposite direction, from trans to cis, i.e., it was forced to retro-translocate. At the beginning of this voltage step, the ionic current signal moved transiently to a new level 6, before the characteristic ionic current level of the open pore at −40 mV was obtained (level 1* in Fig. 2a, b, the asterisk signifies the negative potential). Level 6 therefore represents the unfolding and translocation in the trans to cis direction.
Alternatively, after oligo(dC)30-V5-oligo(dA)30 threaded C terminus-first, instead of stepping the voltage to −40 mV we added mSA to the trans compartment, which bound to the biotinylated oligo(dA)30 that had led the translocation process (Fig. 2c). Hence, a rotaxane-like structure was generated (level 5 in Fig. 2c, d)28. In this state, a voltage step from +140 to −40 mV did not produce the current level of the open pore (level 1*), but instead the signal assumed a new level 7 in which the translocating Trx V5 is presumed to have moved completely back into the cis compartment, and the C-terminal oligonucleotide is threaded through the pore. After a selected time in level 7, the potential was switched back to +140 mV, and we observed an ionic current pattern similar to that produced during the initial translocation, with levels 2′, 3′, 4′, and 5′ (Fig. 2d). Importantly, the time provided in level 7 was time available to allow the protein to refold, and the dwell times during subsequent forward re-translocation (levels 2′ and 3′) revealed the kinetic stability of the refolded state. Finally, and in agreement with the proposed model, when the voltage was switched from +80 mV (a lower voltage to produce an extended level 2′) to −40 mV during residence in level 2′ or 3′ (i.e., while the protein was still or largely still in the cis compartment), level 6′ was not observed, but rather level 7 reappeared (Supplementary Fig. 5a, c). Similarly, if the voltage was switched from −40 mV to +80 mV during level 6′ (i.e., while the protein was largely in the trans compartment), levels 2′, 3′ and 4′ were never observed (Supplementary Fig. 5a, d).
These results are consistent with the formation of a transmembrane protein rotaxane. In this state, the protein can be translocated forwards and backwards across the pore by changing the polarity of the applied potential (Fig. 2c) and thereby subjected to multiple cycles of unfolding, translocation and refolding, observable with millisecond time resolution. Yet higher temporal resolution is challenging, because of capacitive transients after the voltage steps. In addition, the use of monovalent traptavidin (mTA)37 in the cis compartment, which dissociates from biotin more than ten-times slower than mSA, and traptavidin (TA) in the trans compartment, allowed over 24 h of recording on the same individual rotaxane.
The refolding landscape of a translocated protein substrate
We next aimed to define the folding landscape of Trx after translocation through the αHL pore, by subjecting the protein in the rotaxane to cycles of unfolding, translocation and refolding (Fig. 3a, b). In this work, we analyze constructs that enter C terminus-first from the cis compartment. Therefore, the refolding we investigate is initiated during or after retro-translocation N terminus-first from the trans into the cis compartment.
The protein may refold outside the pore during level 7′ (Fig. 3a). We held the protein at level 7′ for a selected period (the refolding time) and then switched to a positive potential (+100 mV) to probe the state of the protein38, which at this point could have returned to the native state, remained in an unfolded state or be in an intermediate state. By comparing the dwell times in levels 2′ and 3′ to the corresponding dwell times during the initial unfolding of the native protein (levels 2 and 3), we analyzed the stability of individual refolded Trx molecules. The dwell time in level 4′, which represents diffusive translocation of the remaining unfolded polypeptide, was unaffected by the refolding time and was similar to that of level 4 of the native protein (Supplementary Fig. 6). We allowed the protein to refold at level 7′ for 0.09 s, 0.49 s, 0.99 s, 1.49 s, 10 s, 100 s, or 1000 s, and probed the state of individual Trx molecules hundreds of times under each condition.
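The voltage cycling described above can be written down as a simple list of steps. The sketch below is only a schematic of the protocol logic: the holding potential of −40 mV and the probe potential of +100 mV follow the text, whereas the probe duration and all function names are hypothetical.

```python
# Refolding times used in this work (seconds).
REFOLDING_TIMES_S = [0.09, 0.49, 0.99, 1.49, 10, 100, 1000]

def cycle(refolding_time_s, probe_s=5.0, hold_mv=-40, probe_mv=100):
    """One cycle as (voltage in mV, duration in s) steps.
    The probe duration is an assumed placeholder."""
    return [(hold_mv, refolding_time_s), (probe_mv, probe_s)]

def protocol(refolding_time_s, n_cycles):
    """Repeat the same cycle n_cycles times on one rotaxane."""
    steps = []
    for _ in range(n_cycles):
        steps.extend(cycle(refolding_time_s))
    return steps

steps = protocol(REFOLDING_TIMES_S[0], n_cycles=3)
print(steps)
```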
When we allowed the translocated Trx V5 protein molecule to refold for 90 ms, levels 2′ and 3′ were observed in only 99 out of 365 re-translocation events (Fig. 3c). In the remaining cases, level 2′ (which has been previously related to the folding of ~40 amino acids at the C terminus22,23), level 3′ (previously related to the fold of the remainder of the polypeptide chain22,23) or both were absent (Supplementary Fig. 7). The absence of these levels represents either a lack of folding or the formation of weakly folded regions that unfold faster than the detection limit (~0.2 ms). Importantly, we observed that the longer the refolding time, the more frequently the signal contained both levels 2′ and 3′ (Fig. 3c). The time course of this folding process (measured as the presence of both levels 2′ and 3′) was not limited by the translocation of the unfolded polypeptide (which is ~50-fold faster than the folding event, Supplementary Fig. 6) and was well described by a single exponential function, yielding a rate of 2.5 ± 0.4 s−1.
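A single-exponential time course of this kind can be linearized to recover the rate. The sketch below assumes the form f(t) = plateau × (1 − exp(−kt)) with no offset and a known plateau, and it is applied to synthetic points generated from that same model, not to the measured fractions. The function name and the plateau value used for the synthetic data are assumptions.

```python
import math

def fit_single_exponential(times_s, fractions, plateau):
    """Least-squares slope through the origin of -ln(1 - f/plateau) vs t.
    Returns the rate in s^-1; requires every f to be below the plateau."""
    num = 0.0
    den = 0.0
    for t, f in zip(times_s, fractions):
        y = -math.log(1.0 - f / plateau)
        num += t * y
        den += t * t
    return num / den

# Synthetic, noise-free data with k = 2.5 s^-1 and an assumed plateau of 0.9.
times = [0.09, 0.49, 0.99, 1.49]
plateau = 0.9
synthetic = [plateau * (1 - math.exp(-2.5 * t)) for t in times]
k_fit = fit_single_exponential(times, synthetic, plateau)
print(round(k_fit, 3))
```

With real, noisy fractions a nonlinear fit with a free plateau would be preferable, because the logarithm amplifies errors for points close to the plateau.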
After 10 s of refolding, a plateau was reached where ~90% of the re-translocation events showed both levels 2′ and 3′. Nonetheless, the co-translocational unfolding of the refolded Trx V5 was faster than that of the Trx V5 that was unfolded for the first time (Fig. 3d). We observed a shorter dwell time in level 2′, which converted to level 3′ with a rate constant of k2′→3′ = 250 ± 10 s−1 (at +100 mV, n = 345), compared to the value for the native Trx V5 of k2→3 = 6.9 ± 0.5 s−1 (at +100 mV, n = 160). The dwell time in level 3′ was also shorter, but the difference was less obvious (less than 7-fold reduction, Supplementary Fig. 8).
These results imply that Trx V5 had not reached the native state after 10 s of refolding. The folding kinetics of Trx are complicated because there is an invariant cis proline, Pro-76, buried in the native structure. In bulk solution, when Trx is placed under denaturing conditions for a few seconds, proline residues do not have time to isomerize and refolding is fast (sub-millisecond)39,40. When Trx is left under denaturing conditions for hundreds of seconds, proline isomerization occurs and Trx then refolds slowly (t1/2 ~ 500 s)41. In this respect, the unfolding and translocation of Trx V5 through the pore typically takes less than ~1.5 s. This gives little time for proline isomerization to occur (proline isomerization takes from 10^3 s, cis to trans, to 10^4 s, trans to cis42, roughly once every 1000 cycles), and we therefore expected refolding to occur in the sub-millisecond regime. Given that 10 s after translocation the refolded Trx V5 remains in a state that unfolds faster than the native Trx V5, we conclude that the folding mechanism differs from the fast folding route that occurs in bulk solution.
We next used oligo(dC)30-V5-oligo(dA)30 to explore the folded state of Trx V5 after prolonged folding times. We first confirmed that the Trx V5 mutant is a slow folder in bulk solution by measuring the recovery of tryptophan fluorescence after dilution from 4 M guanidinium chloride. Refolding proceeded at 2.3 × 10−3 s−1, a similar rate to that of wild-type Trx30,39,40,41, showing that in these cases folding is rate-limited by proline isomerization (we could not use the tagged substrate because we do not generate enough sample). By contrast, after retro-translocation through the nanopore and a 1000 s delay for refolding, we only observed ~15% of the molecules unfolding with dwell times in levels 2′ and 3′ that were characteristic of the native state (Supplementary Fig. 8). This yields a refolding rate of 1.4 × 10−4 s−1, >15-fold slower than that observed in bulk solution, which, unlike refolding upon translocation, requires proline isomerization. In atomic force microscopy and optical tweezer experiments, tethering to a surface, a cantilever or a bead slows down folding due to dragging forces43. We expect this effect to be negligible in our case: one terminus of the protein is anchored to the pore embedded in the lipid bilayer and the other is tethered to an oligonucleotide-streptavidin complex, neither of which produces a strong drag. Moreover, the N and C termini of Trx are fully exposed to the solvent, suggesting that the addition of the tags should have a minor effect on the kinetics. Therefore, refolding after protein translocation conforms with neither the fast folding nor the slow folding pathways observed for Trx in bulk solution.
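The relation between the fraction of native-like events at a single time point and an apparent first-order rate can be checked with a back-of-the-envelope calculation. The sketch assumes simple first-order kinetics, f = 1 − exp(−kt), and uses the rounded fraction of ~15%, so it reproduces the reported rate only approximately. The function name is hypothetical.

```python
import math

def first_order_rate(fraction_converted, time_s):
    """Apparent rate (s^-1) from one time point, assuming f = 1 - exp(-k t)."""
    return -math.log(1.0 - fraction_converted) / time_s

# ~15% native-like events after 1000 s (rounded value from the text).
k_pore_estimate = first_order_rate(0.15, 1000.0)

# Ratio of the rates quoted in the text: bulk solution vs after translocation.
fold_slower = 2.3e-3 / 1.4e-4

print(f"{k_pore_estimate:.2e} s^-1, {fold_slower:.1f}-fold")
```

The single-point estimate is of the same order as the reported value; the residual difference is within what rounding of the ~15% fraction would produce.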
An advantage of our approach is that a given refolded state can be characterized with at least as many variables as the number of observable unfolding steps, which allows the use of bivariate scatter plots to improve the ability to distinguish different populations and better resolve the folding landscape. We display dwell times in levels 2′ and 3′ for the refolding times of 0.09 s, 0.49 s, 0.99 s, 1.49 s, 10 s, 100 s, and 1000 s acquired from three different rotaxanes and the superimposed data to better visualize the identified populations (Fig. 4a). The plots are snapshots of how the protein moves through the folding landscape. Trx V5 emerged unfolded from the pore, and after 0.09 s, ~73% of the events examined lacked any detectable folding (level 2′ or 3′). The remainder (~27%) had entered one of two states, population A or population B, both less stable than the native state based on the shorter dwell times in levels 2′ and 3′. During the first second (0.99 s), the fraction of events with undetectable folding was reduced drastically to ~17%, while population A remained constant (from ~10% at t = 0.09 s to ~17% at t = 0.99 s) and population B increased (from ~15% at t = 0.09 s to ~65% at t = 0.99 s). As the population with undetectable folding disappeared, so did population A. After 10 s of refolding, most of the events were found within population B, and the native state became slowly populated, reaching ~15% at 1000 s.
To set our analysis on a more quantitative basis, we used k-means clustering44 to classify the events into population A, population B, or the native state (Supplementary Fig. 9). When either level 2′, level 3′ or both were absent, the events were assigned to an unfolded state. By using this approach, we found that the native state was characterized by k2′→3′ = 6.2 s−1 and k3′→4′ = 1.1 s−1 (at +100 mV, n = 2251), which are close to the values obtained by fitting 1D histograms to exponential distributions (k2→3 = 6.9 s−1 and k3→4 = 1.0 s−1 at +100 mV, n = 160, Supplementary Fig. 8a, b). Next, the ability of various kinetic models to describe the time-dependent evolution of the populations obtained by k-means clustering was evaluated based on the Akaike information criterion (AIC)45 and the Bayesian information criterion (BIC)46 (Supplementary Fig. 10). These are estimators that allow model selection based on information theory. Remarkably, the best-fitting model was the one that imposed an equilibrium between the unfolded state and both populations A and B (Fig. 4b, c), which suggests that both populations A and B are kinetic traps and that successful folding only occurs from the unfolded state. While we do not know the exact structural nature of these intermediates, the folding funnel hypothesis predicts that along the folding pathway the protein would acquire more native-like structure as the free energy decreases47. We therefore take the recovery of mechanical stability as a proxy of nativeness. Similarly, folding upon emergence from the ribosome exit tunnel during protein synthesis may lead to folded but non-native structures5. A similar result was reported for the α-lytic protease, which in the absence of its pro-region folds to a native-like structure with its three disulfides in place that lacks the activity and stability of the native state48.
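The clustering step can be illustrated with a minimal version of Lloyd's k-means algorithm applied to pairs of dwell times. This is a sketch under several assumptions: the dwell times are hypothetical, the clustering is done on log10-transformed values, and the initial centers are picked by hand. None of these choices is taken from the analysis in Supplementary Fig. 9, and the clusters are deliberately left unnamed.

```python
import math

def kmeans(points, centers, n_iter=50):
    """Plain Lloyd's algorithm. Returns (centers, labels)."""
    labels = []
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Move each center to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centers.append(old)  # keep an empty cluster in place
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

# Hypothetical (level 2', level 3') dwell times in seconds.
dwells = [(0.001, 0.005), (0.0012, 0.006),
          (0.004, 0.1), (0.005, 0.12),
          (0.15, 1.0), (0.12, 0.8)]
points = [(math.log10(a), math.log10(b)) for a, b in dwells]
centers, labels = kmeans(points, [points[0], points[2], points[4]])
print(labels)
```

Working in log space keeps populations that differ by orders of magnitude in dwell time on a comparable scale, which is one reasonable choice among several.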
To seek support for this interpretation, we performed the same experiments on a Trx mutant of lower thermodynamic stability. We assume that the native state is the deepest well in the free energy landscape. We also assume that intermediates share structural similarities with the native state. A final assumption is that transition states between one state and another share structural features of them both. Based on these assumptions, it follows that destabilization of the native state should cause slower folding kinetics in the presence of on-pathway intermediates49,50. Conversely, destabilization would result in faster folding kinetics if the observed intermediates are off-pathway. We removed the stabilizing mutations A22P-I23V-P68A, which increase the unfolding temperature by 10 °C30, from Trx V5 to give Trx V2, and then formed a transmembrane protein rotaxane that was subjected to cycles of unfolding, translocation and refolding. This less stable mutant behaved similarly to Trx V5 but co-translocational unfolding of the native state was faster (k2→3 = 11.6 ± 0.4 s−1 and k3→4 = 7.5 ± 0.3 s−1 at +100 mV, n = 172), as expected22. Refolding, in agreement with the kinetic trap interpretation, was also faster. After 1000 s, Trx V2 had recovered native state stability in ~83% of the cycles examined. The refolding rate was 1.8 × 10−3 s−1, whereas that of Trx V5 was 1.4 × 10−4 s−1. The kinetic model that best fitted the data for Trx V2 again requires that the observed intermediate is off-pathway (Fig. 4d–f and Supplementary Figs. 11, 12). Therefore, in the nanopore system explored here, upon vectorial emergence from the pore, Trx rapidly collapses to non-native state(s) that show moderate stability and are off-pathway. Because both levels 2′ and 3′ are present, these states could have folds similar to that of the native state; nevertheless, they must return to the unfolded state to be converted to the native state.
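The qualitative consequence of off-pathway traps can be illustrated by integrating a scheme in which the unfolded state U exchanges with two traps, A and B, and converts irreversibly to the native state N. The sketch uses forward Euler integration and entirely hypothetical rate constants; it is not the fitted model of Fig. 4, and the irreversible U to N step is a simplifying assumption.

```python
def simulate(rates, t_end_s, dt_s=1e-3):
    """Forward-Euler integration of U <-> A, U <-> B, U -> N.
    Starts with all molecules unfolded; returns final populations."""
    u, a, b, n = 1.0, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end_s / dt_s))):
        du = (-(rates["k_ua"] + rates["k_ub"] + rates["k_un"]) * u
              + rates["k_au"] * a + rates["k_bu"] * b)
        da = rates["k_ua"] * u - rates["k_au"] * a
        db = rates["k_ub"] * u - rates["k_bu"] * b
        dn = rates["k_un"] * u
        u += du * dt_s
        a += da * dt_s
        b += db * dt_s
        n += dn * dt_s
    return {"U": u, "A": a, "B": b, "N": n}

# Hypothetical rate constants (s^-1), chosen only for illustration.
shallow_trap = {"k_ua": 2.0, "k_au": 1.0, "k_ub": 5.0, "k_bu": 0.5, "k_un": 0.2}
deep_trap = dict(shallow_trap, k_bu=0.05)  # slower escape from trap B

pop_shallow = simulate(shallow_trap, t_end_s=10.0)
pop_deep = simulate(deep_trap, t_end_s=10.0)
print(pop_shallow["N"], pop_deep["N"])
```

In this toy scheme, slowing the escape from a trap lowers the native population reached at a given time, which is the behavior expected if destabilizing off-pathway intermediates accelerates folding.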
We have presented a single-molecule approach that, through the formation of a protein rotaxane in a transmembrane pore, allows the evaluation of the folded state of a protein at different refolding times after membrane translocation. The approach has millisecond temporal resolution and allows for characterization of the pathway to the refolded state with more than one parameter, which provides a better resolved folding landscape. Another important facet of our study, by comparison with recent single-molecule methods that examine the folding landscape of proteins51,52, is that the protein emerges vectorially from the pore as it does from a ribosome or translocon although in our case the rate may be 2–3 orders of magnitude faster. The approach requires repeated oligonucleotide-driven translocation and retro-translocation of the protein, which cannot be achieved by using a motor protein24,25.
We find that the folding mechanism of Trx following N terminus-first membrane translocation differs from that observed in bulk solution upon the removal of a denaturant such as urea or guanidinium chloride. In the latter, folding proceeds through a short-lived molten globule intermediate, if the protein has been in the unfolded state for only a few seconds40. If the protein is unfolded for hundreds of seconds, allowing proline isomerization to occur, folding is far slower and multiple intermediates are observed, all of them on pathways to the native state41. By contrast, in our nanopore approach, the protein is in the unfolded state only during the period of retro-translocation (tens of milliseconds), which implies that proline isomerization does not occur and hence is unlikely to be involved in the subsequent refolding process. We therefore would expect the rapid folding behavior: if the protein does not fold vectorially during retro-translocation, once it completes translocation it should rapidly equilibrate in the unfolded state53 and the rate of folding should approximate that of the fast folding pathway observed in bulk solution. Because the refolding rates we observed are instead more than an order of magnitude slower, the folding mechanism must be different from that in bulk solution. We show that this is a consequence of detours to off-pathway intermediates. We therefore suggest that the folding landscape we are observing is caused by the vectorial folding of the polypeptide as it emerges from the pore.
Our results are in line with analyses of T4 lysozyme4, HemK5, and flavodoxin54, which were artificially stalled as they emerged from the ribosome exit tunnel. For these particular cases at least, folding while tethered to the ribosome by the C terminus is slower than in bulk solution. This suggests that the vectorial nature of folding inherent to protein synthesis and membrane translocation may increase the probability of forming folding intermediates, including off-pathway ones. The generality of this phenomenon and how chaperones modulate vectorial folding remain interesting yet unaddressed questions.
Finally, the methodology presented here can find interesting uses not only to study membrane associated processes at the single-molecule level but also in biotechnology applications, providing a novel way to prepare arrays of nanopore sensors.
Synthesis of maleimide-CBT crosslinker
17.5 mg of 6-amino-2-cyanobenzothiazole (CBT, Sigma) (0.1 mmol) and 10.7 mg of p-maleimidophenyl isocyanate (Sigma) (0.05 mmol) were dissolved in 1 mL of anhydrous dimethyl sulfoxide and stirred for 30 min at room temperature. The product was purified by preparative high-performance liquid chromatography (HPLC) on an Agilent 1260 Infinity system equipped with a Supelcosil PLC-18 12 μm, 250 × 21.2 mm column. The mobile phase consisted of 0.1% formic acid in H2O (buffer A), and 0.1% formic acid in acetonitrile (buffer B). A gradient from 5 to 95% of buffer B in buffer A was run over 30 min at a flow rate of 5 mL min−1. The separation was monitored at 335 nm. The identity of the product was confirmed by mass spectrometry (MS) and nuclear magnetic resonance (NMR) (Supplementary Fig. 1a, b).
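The molar amounts quoted above can be checked from the masses. The molecular weights used below (175.21 g mol−1 for 6-amino-2-cyanobenzothiazole, C8H5N3S, and 214.18 g mol−1 for p-maleimidophenyl isocyanate, C11H6N2O3) are calculated from the molecular formulas and are not stated in the text; the function name is hypothetical.

```python
# Molecular weights in g/mol, calculated from the molecular formulas
# (assumed values, not given in the text).
MW_AMINO_CBT = 175.21  # C8H5N3S
MW_PMPI = 214.18       # C11H6N2O3

def millimoles(mass_mg, mw_g_per_mol):
    """Amount in mmol from a mass in mg (mg divided by g/mol gives mmol)."""
    return mass_mg / mw_g_per_mol

cbt_mmol = millimoles(17.5, MW_AMINO_CBT)
pmpi_mmol = millimoles(10.7, MW_PMPI)
print(round(cbt_mmol, 3), round(pmpi_mmol, 3), round(cbt_mmol / pmpi_mmol, 1))
```

The amine is therefore used in roughly two-fold molar excess over the isocyanate.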
1.0 mg of an oligonucleotide (Biomers) composed of 30 cytosines with a 5′-thiol (hexamethylene linker) and a 3′-biotin (triethylene glycol spacer) was dissolved in 1 mL of 10 mM Tris·HCl, 1 mM EDTA, 5 mM TCEP, pH 7.5. After heating at 65 °C for 1.5 h, TCEP was removed with an Amicon Ultra column (0.5 mL, 3 kDa). The reduced oligonucleotide was then treated with the maleimide-CBT crosslinker (15 equivalents dissolved in a minimal volume of dimethyl sulfoxide) at room temperature for 2 h. The modified oligonucleotide (5′-CBT-oligo(dC)30-biotin-3′) was purified by ion-exchange chromatography on a monoQ FF Sepharose column (GE Healthcare) with an AKTApurifier FPLC (GE Healthcare) eluted with 0–1 M KCl in TE buffer (10 mM Tris·HCl, 1 mM EDTA, pH 8.0). The identity of the product was confirmed by MS (Supplementary Fig. 1c–f).
5′-S-thiopyridyl-oligo(dA)30-biotin-3′ was synthesized by incubating 1.0 mg of a reduced oligonucleotide composed of 30 adenines with a 5′-thiol (hexamethylene linker) and a 3′-biotin (triethylene glycol spacer) with 20 mM 2,2′-dithiodipyridine (Sigma) in acetonitrile for 2 h at room temperature (Supplementary Fig. 3b). The modified oligonucleotide was purified by ion-exchange chromatography on the monoQ FF Sepharose column.
The thioredoxin (Trx) mutants S1C-V5-C109 (S1C-A22P-I23V-C32S-C35S-P68A-C109) and S1C-V2-C109 (S1C-C32S-C35S-C109) were each cloned into the pET 30a+ plasmid (TopGene). E. coli BL21(DE3) cells (Novagen) were transformed with the plasmid and grown in LB medium supplemented with kanamycin (10 mg L−1) at 37 °C with shaking. During the exponential growth phase (OD600nm = 0.6), protein production was induced with isopropyl-β-D-1-thiogalactopyranoside (0.4 mM, final concentration). After overnight growth, the cells were collected by centrifugation, suspended in TE buffer (30 mM Tris·HCl, 1 mM EDTA, pH 8.3) and lysed by sonication. Cell debris was removed by centrifugation, and the supernatant containing the protein was mixed with one quarter of its volume of 10% streptomycin sulfate (w/v) in TE buffer, added dropwise at 4 °C with stirring. After 2 h, the solution was centrifuged at 20,000 × g for 30 min, and the supernatant was loaded onto a Superdex 75 10/300 column (GE Healthcare). The peak corresponding to Trx was almost pure, but was further purified by ion-exchange chromatography on a monoQ FF column eluted with 0–1 M KCl in TE buffer. The purity of the protein was estimated by SDS-PAGE in 18% Criterion TGX Precast Midi Protein Gels (Bio-Rad) stained with InstantBlue Protein Stain (Expedeon). Protein concentration was estimated spectrophotometrically from the absorbance at 280 nm by using a molar extinction coefficient of 14,100 M−1 cm−1 (ref. 30).
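The spectrophotometric estimate at the end of this paragraph follows the Beer–Lambert law, c = A280 / (ε · l). A minimal sketch is given below; the 1 cm path length, the dilution factor and the example absorbance are illustrative assumptions, and the function name is hypothetical. Only the extinction coefficient comes from the text.

```python
EPSILON_280 = 14_100.0  # M^-1 cm^-1, molar extinction coefficient quoted in the text


def trx_concentration_molar(a280: float, path_cm: float = 1.0,
                            dilution: float = 1.0) -> float:
    """Beer-Lambert estimate of protein concentration (mol/L) from A280.

    path_cm and dilution are assumptions of this sketch (1 cm cuvette, undiluted).
    """
    if a280 < 0 or path_cm <= 0 or dilution <= 0:
        raise ValueError("absorbance must be >= 0; path length and dilution must be > 0")
    return a280 * dilution / (EPSILON_280 * path_cm)


# Illustrative reading, not a measured value from the study.
conc = trx_concentration_molar(0.282)
print(f"{conc * 1e6:.1f} uM")
```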
The N-terminal cysteines of Trx as purified are not reactive because they are modified with aldehydes such as pyruvate33. The cysteines were deprotected by incubation with methoxyamine (0.4 M methoxyamine, 100 mM sodium phosphate, 150 mM NaCl, 5 mM TCEP, pH 7.0) at room temperature overnight. The protein was recovered by buffer exchange on a PD MiniTrap G-25 column (GE Healthcare). To modify the N-terminal cysteine with 5′-CBT-oligo(dC)30-biotin-3′, a 10-fold excess of protein was incubated with 0.1–0.2 mg of the oligonucleotide in TE buffer (10 mM Tris·HCl, 1 mM EDTA, pH 7.5) at room temperature overnight. The 3′-biotin-oligo(dC)30-S1C-V5-C109 conjugate was purified by ion-exchange chromatography on a monoQ FF Sepharose column eluted with 0–1 M KCl in the same buffer. The eluted conjugate was immediately mixed with 5′-S-thiopyridyl-oligo(dA)30-biotin-3′ and incubated for 16 h at room temperature22. The product (3′-biotin-oligo(dC)30-S1C-V5-C109-oligo(dA)30-biotin-3′) was purified by ion-exchange chromatography, and the mass was verified by SDS-PAGE and native MS (Supplementary Fig. 3).
Purification and formation of heptameric α-hemolysin pores
α-Hemolysin (αHL) monomers were synthesized by in vitro transcription/translation and oligomerized into heptameric pores on rabbit red blood cell membranes27. [35S]Methionine radioactivity was used to identify the oligomeric band upon SDS-PAGE. The heptameric pores were eluted from the gel in a concentration range of 0.1–1 ng μL−1, ready to use in single-molecule experiments.
Expression and purification of streptavidin variants
Monovalent streptavidin (mSA) is a tetramer (SAe1D3) containing one streptavidin (SA) subunit with wild-type biotin binding affinity with a C-terminal Glu6 tag (SAe) and three “dead” subunits (D) with negligible biotin binding affinity29,55. Subunits were expressed as inclusion bodies in E. coli, solubilized in guanidinium hydrochloride, refolded by dilution, and the desired heterotetramer was purified by ion-exchange chromatography as described55. mSA assembly and composition were validated by mobility on SDS-PAGE with Coomassie Blue (Bio-Rad) staining for samples with or without boiling55. Monovalent traptavidin (mTA) is a tetramer (Tre1D3) containing one traptavidin (TA) subunit with a C-terminal Glu6 tag (Tre) and three “dead” subunits (D)37,56. mTA was purified and validated by using the same procedure as for mSA. Tetravalent SA had the subunit composition SAe4. Tetravalent TA had the subunit composition Tre4. Tetravalent forms were expressed and refolded as above, with purification by iminobiotin-Sepharose as described55.
A bilayer of 1,2-diphytanoyl-sn-glycero-3-phosphocholine (Avanti Polar Lipids) was made by using the Müller-Montal method on a 100 μm-diameter aperture in a Teflon film (25 μm thick, Goodfellow) that separated two compartments (cis and trans). Each compartment was filled with electrolyte solution (10 mM HEPES, 2 M KCl, pH 7.2) and connected to the headstage of an Axopatch 200B amplifier (Molecular Devices) with Ag/AgCl electrodes. A lid was used to cover the compartments to avoid evaporation, and the temperature was held at 23 °C with a circulating bath. The signal from the amplifier was stored on a computer by using a Digidata 1440A digitizer (Molecular Devices). Following membrane formation, heptameric αHL pores (0.2 μL, see above) were added to the cis compartment under an applied potential of +100 mV. After a single insertion event (identified by a step increase in the conductance of 1 nS), the cis compartment was perfused with fresh buffer by using a PHD 2000 push-pull syringe pump (Harvard Apparatus) to prevent further insertions. Data were low-pass filtered at 5 kHz, which corresponds to a temporal resolution of ~0.2 ms, and sampled at 20 kHz. Voltage protocols in Clampex (Molecular Devices) were used to automatically change the voltage at defined times. The negative potential at level 7′ was maintained for 0.1 s, 0.5 s, 1 s, 1.5 s, 10 s, 100 s, or 1000 s, and therefore the protein was allowed to refold for 0.09 s, 0.49 s, 0.99 s, 1.49 s, 10 s, 100 s, or 1000 s, respectively (the average duration of the retro-translocation, level 6′, was deducted to calculate the refolding times at short timescales).
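The arithmetic behind the recording parameters and refolding times can be sketched as follows. The 0.01 s retro-translocation duration is not stated explicitly; it is inferred here from the quoted pairs (for example, 0.1 s held versus 0.09 s of refolding), and the 10 s threshold below which the correction is applied is likewise an assumption of this sketch. Function names are hypothetical.

```python
# Inferred from the text (0.1 s hold -> 0.09 s refolding); an assumption of this sketch.
RETRO_TRANSLOCATION_S = 0.01
# Holds at or above this duration are left uncorrected, matching the quoted values.
SHORT_TIMESCALE_LIMIT_S = 10.0


def refolding_time(hold_s: float) -> float:
    """Refolding time = hold time at level 7' minus the mean duration of level 6',
    applied only at short timescales where the correction is not negligible."""
    if hold_s < SHORT_TIMESCALE_LIMIT_S:
        return hold_s - RETRO_TRANSLOCATION_S
    return hold_s


def current_step_pa(conductance_ns: float, potential_mv: float) -> float:
    """Ohm's law: nS x mV gives pA."""
    return conductance_ns * potential_mv


holds = [0.1, 0.5, 1, 1.5, 10, 100, 1000]
refold = [round(refolding_time(h), 2) for h in holds]
sample_interval_ms = 1000 / 20_000   # 20 kHz sampling

print(refold)
print(f"Expected current step on insertion: {current_step_pa(1, 100):.0f} pA")
print(f"Sampling interval: {sample_interval_ms} ms")
```

A single pore of 1 nS conductance at +100 mV therefore appears as a current step of about 100 pA, and the 20 kHz sampling rate is four times the 5 kHz filter cutoff.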
Statistics and reproducibility
The state of individual molecules was probed hundreds of times under each condition. These data were collected in at least three independent experiments for each condition. Raw data files were first analyzed with pClamp (Molecular Devices) to obtain the current amplitude and dwell time in each level. The event histograms of the natural logarithm of the dwell times were fit to an exponential distribution with Igor Pro (Wavemetrics):
$$f(x) = A\,k\,e^{x}\exp\left(-k\,e^{x}\right)$$
with: A, amplitude; k, rate constant; x, the natural logarithm of the dwell time. The rate constants in the main text are the best-fit values ± 1σ confidence intervals.
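For exponentially distributed dwell times t with rate constant k, the density of x = ln t is proportional to k·e^x·exp(−k·e^x), which peaks at x = −ln k. The sketch below illustrates this with simulated dwell times. It is an illustration only: the data are synthetic, and the rate constant is estimated by maximum likelihood (k = n / Σt) rather than by the histogram fit in Igor Pro used in the study. The function names and the value of `true_k` are hypothetical.

```python
import math
import random


def log_dwell_density(x: float, k: float, amplitude: float = 1.0) -> float:
    """Density of x = ln(dwell time) for an exponential process with rate k:
    A * k * exp(x) * exp(-k * exp(x))."""
    return amplitude * math.exp(math.log(k) + x - k * math.exp(x))


def estimate_rate(dwell_times: list[float]) -> float:
    """Maximum-likelihood rate constant for exponential dwell times: n / sum(t).
    Swapped in for the histogram fit; not the procedure used in the paper."""
    return len(dwell_times) / sum(dwell_times)


rng = random.Random(0)      # fixed seed so the sketch is reproducible
true_k = 5.0                # s^-1, arbitrary value for the simulation
dwells = [rng.expovariate(true_k) for _ in range(5000)]

k_hat = estimate_rate(dwells)
peak_x = -math.log(k_hat)   # position of the maximum of the log-dwell histogram

print(f"estimated k = {k_hat:.2f} s^-1, histogram peak at ln(t) = {peak_x:.2f}")
```

Plotting dwell times on a logarithmic axis turns each exponential component into a peaked distribution, which makes components with well-separated rate constants easier to distinguish by eye.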
The 95% confidence interval in the estimation of the refolded fraction, p, (Fig. 3c) was estimated by using:
$$p \pm 1.96\sqrt{\frac{p\,(1-p)}{n}}$$
with: n, the number of observations.
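A common way to compute a 95% confidence interval for a proportion from p and n is the normal-approximation (Wald) interval, p ± 1.96·√(p(1 − p)/n). The sketch below assumes that form; the example values are illustrative, not data from the study, and clamping the bounds to [0, 1] is a choice made here.

```python
import math


def wald_interval_95(p: float, n: int) -> tuple[float, float]:
    """Normal-approximation 95% confidence interval for a proportion p
    estimated from n observations, clamped to [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    half_width = 1.96 * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Illustrative numbers only: half of 100 probed events scored as refolded.
low, high = wald_interval_95(0.5, 100)
print(f"refolded fraction 0.50, 95% CI [{low:.3f}, {high:.3f}]")
```

The interval narrows as 1/√n, so probing each molecule hundreds of times keeps the uncertainty in the refolded fraction within a few percent.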
k-means clustering analysis was carried out in Igor Pro, considering only the cases where both levels 2′ and 3′ were detected. In k-means clustering, the centroids define the center of each population (Supplementary Figs. 9, 11). Each data point is assigned to the population with the closest centroid. For Trx V5, a ~5% overestimation of the unfolded population and of population A was assumed, due to an overlap with population B. Therefore, the size of each population should be taken as a rough estimate. The different kinetic models used to explain the time-course evolution of refolding were evaluated with the SimBiology application in MATLAB (MathWorks), which provided the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) (Supplementary Figs. 10, 12).
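The assignment and update steps of k-means, together with the standard definitions of AIC and BIC, can be sketched as below. This is a generic illustration in pure Python, not the Igor Pro or SimBiology implementation: the two-dimensional points, the starting centroids and the function names are hypothetical, and the features actually clustered in the study are not specified here.

```python
import math


def kmeans(points, centroids, iterations=100):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    then move each centroid to the mean of its members, until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: mean of each cluster (an empty cluster keeps its centroid).
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical two-feature data forming two well-separated populations.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
print(labels)
print(centroids)
```

For both criteria a lower value indicates a better trade-off between goodness of fit and the number of parameters; BIC penalizes additional parameters more strongly once the number of observations exceeds about seven.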
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Relevant data and/or materials are available upon reasonable request from D.R-L. and/or H.B.
Anfinsen, C. B., Haber, E., Sela, M. & White, F. H. Jr The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proc. Natl Acad. Sci. USA 47, 1309–1314 (1961).
Lipman, E. A., Schuler, B., Bakajin, O. & Eaton, W. A. Single-molecule measurement of protein folding kinetics. Science 301, 1233–1235 (2003).
Rief, M., Gautel, M., Oesterhelt, F., Fernandez, J. M. & Gaub, H. E. Reversible unfolding of individual titin immunoglobulin domains by AFM. Science 276, 1109–1112 (1997).
Kaiser, C. M., Goldman, D. H., Chodera, J. D., Tinoco, I. Jr & Bustamante, C. The ribosome modulates nascent protein folding. Science 334, 1723–1727 (2011).
Holtkamp, W. et al. Cotranslational protein folding on the ribosome monitored in real time. Science 350, 1104–1107 (2015).
Kim, S. J. et al. Protein folding. Translational tuning optimizes nascent protein folding in cells. Science 348, 444–448 (2015).
Kowarik, M., Kung, S., Martoglio, B. & Helenius, A. Protein folding during cotranslational translocation in the endoplasmic reticulum. Mol. Cell 10, 769–778 (2002).
Nilsson, O. B. et al. Cotranslational protein folding inside the ribosome exit tunnel. Cell Rep. 12, 1533–1540 (2015).
Ugrinov, K. G. & Clark, P. L. Cotranslational folding increases GFP folding yield. Biophys. J. 98, 1312–1320 (2010).
Fedyukina, D. V. & Cavagnero, S. Protein folding at the exit tunnel. Annu. Rev. Biophys. 40, 337–359 (2011).
Okamoto, K. et al. The protein import motor of mitochondria: a targeted molecular ratchet driving unfolding and translocation. EMBO J. 21, 3659–3671 (2002).
Kadokura, H. & Beckwith, J. Detecting folding intermediates of a protein as it passes through the bacterial translocation channel. Cell 138, 1164–1173 (2009).
Tsirigotaki, A., De Geyter, J., Sostaric, N., Economou, A. & Karamanou, S. Protein export through the bacterial Sec pathway. Nat. Rev. Microbiol. 15, 21–36 (2017).
Costa, T. R. et al. Secretion systems in Gram-negative bacteria: structural and mechanistic insights. Nat. Rev. Microbiol. 13, 343–359 (2015).
Junker, M., Besingi, R. N. & Clark, P. L. Vectorial transport and folding of an autotransporter virulence protein during outer membrane secretion. Mol. Microbiol. 71, 1323–1332 (2009).
Li, L. et al. Crystal structure of a substrate-engaged SecY protein-translocation channel. Nature 531, 395–399 (2016).
Matouschek, A., Pfanner, N. & Voos, W. Protein unfolding by mitochondria. The Hsp70 import motor. EMBO Rep. 1, 404–410 (2000).
Wilson, R. et al. The translocation, folding, assembly and redox-dependent degradation of secretory and membrane proteins in semi-permeabilized mammalian cells. Biochem. J. 307(Pt 3), 679–687 (1995).
Geiger, R., Gautschi, M., Thor, F., Hayer, A. & Helenius, A. Folding, quality control, and secretion of pancreatic ribonuclease in live cells. J. Biol. Chem. 286, 5813–5822 (2011).
Bonardi, F. et al. Probing the SecYEG translocation pore size with preproteins conjugated with sizable rigid spherical molecules. Proc. Natl Acad. Sci. USA 108, 7775–7780 (2011).
Ha, T. Single-molecule methods leap ahead. Nat. Methods 11, 1015–1018 (2014).
Rodriguez-Larrea, D. & Bayley, H. Multistep protein unfolding during nanopore translocation. Nat. Nanotechnol. 8, 288–295 (2013).
Rodriguez-Larrea, D. & Bayley, H. Protein co-translocational unfolding depends on the direction of pulling. Nat. Commun. 5, 4841 (2014).
Nivala, J., Mulroney, L., Li, G., Schreiber, J. & Akeson, M. Discrimination among protein variants using an unfoldase-coupled nanopore. ACS nano 8, 12365–12375 (2014).
Nivala, J., Marks, D. B. & Akeson, M. Unfoldase-mediated protein translocation through an alpha-hemolysin nanopore. Nat. Biotechnol. 31, 247–250 (2013).
Song, L. et al. Structure of staphylococcal alpha-hemolysin, a heptameric transmembrane pore. Science 274, 1859–1866 (1996).
Maglia, G., Heron, A. J., Stoddart, D., Japrung, D. & Bayley, H. Analysis of single nucleic acid molecules with protein nanopores. Methods Enzymol. 475, 591–623 (2010).
Sanchez-Quesada, J., Saghatelian, A., Cheley, S., Bayley, H. & Ghadiri, M. R. Single DNA rotaxanes of a transmembrane pore protein. Angew. Chem. Int. Ed. Engl. 43, 3063–3067 (2004).
Howarth, M. et al. A monovalent streptavidin with a single femtomolar biotin binding site. Nat. Methods 3, 267–273 (2006).
Rodriguez-Larrea, D. et al. Role of conservative mutations in protein multi-property adaptation. Biochem. J. 429, 243–249 (2010).
Ren, H. et al. A biocompatible condensation reaction for the labeling of terminal cysteine residues on proteins. Angew. Chem. Int. Ed. Engl. 48, 9658–9662 (2009).
Rosen, C. B. & Francis, M. B. Targeting the N terminus for site-selective protein modification. Nat. Chem. Biol. 13, 697–705 (2017).
Gentle, I. E., De Souza, D. P. & Baca, M. Direct production of proteins with N-terminal cysteine for site-specific conjugation. Bioconjug. Chem. 15, 658–663 (2004).
Iyer, M., Norton, J. C. & Corey, D. R. Accelerated hybridization of oligonucleotides to duplex DNA. J. Biol. Chem. 270, 14712–14717 (1995).
Stoddart, D., Heron, A. J., Mikhailova, E., Maglia, G. & Bayley, H. Single-nucleotide discrimination in immobilized DNA oligonucleotides with a biological nanopore. Proc. Natl Acad. Sci. USA 106, 7702–7707 (2009).
Purnell, R. F., Mehta, K. K. & Schmidt, J. J. Nucleotide identification and orientation discrimination of DNA homopolymers immobilized in a protein nanopore. Nano Lett. 8, 3029–3034 (2008).
Chivers, C. E. et al. A streptavidin variant with slower biotin dissociation and increased mechanostability. Nat. Methods 7, 391–393 (2010).
Kiefhaber, T. Kinetic traps in lysozyme folding. Proc. Natl Acad. Sci. USA 92, 9029–9033 (1995).
Kelley, R. F., Wilson, J., Bryant, C. & Stellwagen, E. Effects of guanidine hydrochloride on the refolding kinetics of denatured thioredoxin. Biochemistry 25, 728–732 (1986).
Georgescu, R. E., Li, J. H., Goldberg, M. E., Tasayco, M. L. & Chaffotte, A. F. Proline isomerization-independent accumulation of an early intermediate and heterogeneity of the folding pathways of a mixed alpha/beta protein, Escherichia coli thioredoxin. Biochemistry 37, 10286–10297 (1998).
Roderer, D. J., Scharer, M. A., Rubini, M. & Glockshuber, R. Acceleration of protein folding by four orders of magnitude through a single amino acid substitution. Sci. Rep. 5, 11840 (2015).
Reimer, U. et al. Side-chain effects on peptidyl-prolyl cis/trans isomerisation. J. Mol. Biol. 279, 449–460 (1998).
Berkovich, R. et al. Rate limit of protein elastic response is tether dependent. Proc. Natl Acad. Sci. USA 109, 14416–14421 (2012).
Lloyd, S. P. Least squares quantization in PCM. IEEE Trans. Inf. Theory 28, 129–137 (1982).
Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723 (1974).
Schwarz, G. Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978).
Leopold, P. E., Montal, M. & Onuchic, J. N. Protein folding funnels: a kinetic approach to the sequence-structure relationship. Proc. Natl Acad. Sci. USA 89, 8721–8725 (1992).
Baker, D., Sohl, J. L. & Agard, D. A. A protein-folding reaction under kinetic control. Nature 356, 263–265 (1992).
Baldwin, R. L. On-pathway versus off-pathway folding intermediates. Fold. Des. 1, R1–R8 (1996).
Matouschek, A., Kellis, J. T. Jr, Serrano, L., Bycroft, M. & Fersht, A. R. Transient folding intermediates characterized by protein engineering. Nature 346, 440–445 (1990).
Stigler, J., Ziegler, F., Gieseke, A., Gebhardt, J. C. & Rief, M. The complex folding network of single calmodulin molecules. Science 334, 512–516 (2011).
Pirchi, M. et al. Single-molecule fluorescence spectroscopy maps the folding landscape of a large protein. Nat. Commun. 2, 493 (2011).
Zwanzig, R. Two-state models of protein folding kinetics. Proc. Natl Acad. Sci. USA 94, 148–150 (1997).
Houwman, J. A., Andre, E., Westphal, A. H., van Berkel, W. J. & van Mierlo, C. P. The ribosome restrains molten globule formation in stalled nascent flavodoxin. J. Biol. Chem. 291, 25911–25920 (2016).
Fairhead, M. et al. Plug-and-play pairing via defined divalent streptavidins. J. Mol. Biol. 426, 199–214 (2014).
Chivers, C. E. et al. How the biotin-streptavidin interaction was made even stronger: investigation via crystallography and a chimaeric tetramer. Biochem. J. 435, 55–63 (2011).
The authors thank Ellina Mikhailova for α-hemolysin prepared by in vitro transcription and translation. We thank Dr. Jonathan Hopper and Professor Carol Robinson for the MS of the construct oligo(dC)30-V5-oligo(dA)30. H.B. was funded by the NIH and Oxford Nanopore Technologies and M.H. by the Biotechnology and Biological Sciences Research Council (BBSRC, grant BB/M02122X/1). J.F. was supported by the China Scholarship Council and G.V. by a Medical Research Council studentship and Merton College Oxford. D.R.-L. is a recipient of a Ramón y Cajal Fellowship (RYC-2013-12799). D.R.-L. was funded by MINECO grants BIO2017-88946-R and BFU2016-81754-ERC (FEDER funds). This work was supported in part by the Fundación Biofísica Bizkaia and the Basque Excellence Research Centre (BERC) program of the Basque Government.
H.B. declares no competing non-financial interests but the following competing financial interest: H.B. is the Founder of, a consultant for and a share-holder of Oxford Nanopore Technologies, a company engaged in the development of nanopore sensing and sequencing technologies. The remaining authors declare no financial or non-financial interests.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Feng, J., Martin-Baniandres, P., Booth, M.J. et al. Transmembrane protein rotaxanes reveal kinetic traps in the refolding of translocated substrates. Commun Biol 3, 159 (2020). https://doi.org/10.1038/s42003-020-0840-5