The recent development of chemical and bio-conjugation techniques allows for the engineering of various protein polymers. However, most of the polymerization process is difficult to control. To meet this challenge, we develop an enzymatic procedure to build polyprotein using the combination of a strict protein ligase OaAEP1 (Oldenlandia affinis asparaginyl endopeptidases 1) and a protease TEV (tobacco etch virus). We firstly demonstrate the use of OaAEP1-alone to build a sequence-uncontrolled ubiquitin polyprotein and covalently immobilize the coupled protein on the surface. Then, we construct a poly-metalloprotein, rubredoxin, from the purified monomer. Lastly, we show the feasibility of synthesizing protein polymers with rationally-controlled sequences by the synergy of the ligase and protease, which are verified by protein unfolding using atomic force microscopy-based single-molecule force spectroscopy (AFM-SMFS). Thus, this study provides a strategy for polyprotein engineering and immobilization.
Protein conjugation and polymerization is a natural biochemical process and has important applications in biomaterial and biomedicine engineering1,2. Compared with synthetic polymers, one of the unique features of most biopolymers, such as protein, is a uniform structure with a well-controlled sequence of amino acids, while multi-domain protein consists of similar or different protein subdomains. This clustering of the same or different protein domains often results in enhanced biological function and stability3. Several biochemical reaction-based methods, especially the cysteine-based coupling, have been developed for building protein polymer, in which protein monomer is designed with additional cysteines or specific residues as the basic unit for ligation4,5. However, unlike natural multi-domain proteins, it is rarely reported that these polymers and biomaterials are of well-controlled subunit sequence like their natural origin, and a bio-synthetic route for this purpose remains a key challenge. Another approach is to build the complete gene into one open reading frame for the artificial protein oligomer, just like the natural way. For example, a so-called polyprotein strategy has been developed to build protein oligomer to mimic natural modular protein6,7,8. The fused polyprotein comprises identical or different multiple protein domains whose genes are built using recombinant DNA technology. However, the engineering of toxic or large-sized protein polymer is often challenging. Also, many proteins such as metalloprotein and delicate enzyme may need purification as a monomer. Thus, the application of recombinant DNA technology for building polyprotein is also limited.
To address this challenge, we develop an enzymatic, stepwise construction of protein polymer/polyprotein with a relatively-precise controlled sequence using a protein ligase and a protein protease. OaAEP1 is a recently developed, efficient and strict endopeptidase, which links two peptides/proteins covalently as a peptide bond through two termini in less than 30 min9. It requires a ligation unit with only two N-terminal GL residues (NH2-Gly-Leu) and three C-terminal NGL residues (Asn-Gly-Leu-COOH) (Fig. 1a)10,11. Thus, based on the ligation unit GL-POI-NGL (POI: protein of interest), OaAEP1 can be used to build polyprotein with an uncontrolled sequence in the solution, similar to the bi-cysteine or other ligase-based coupling methods4,12. The synthesized polyprotein is characterized by the SDS-PAGE gel method, at the ensemble level. Moreover, the designed polymers are unambiguously verified by protein domain unfolding using AFM-based SMFS, at the single-molecule level. AFM-based SMFS can mechanically unfold each protein domain and characterize protein mechanics13,14,15,16,17,18,19,20,21,22,23. The unfolding of polyprotein leads to a characteristic sawtooth-like force-extension curve, in which each force peak corresponds to the unfolding of each protein domain24,25,26,27,28. Thus, the polymerization number of the protein polymer can be directly counted to verify our design29,30,31. A poly-ubiquitin molecule is built and characterized. Compared with the PAGE gel method, the protein is measured under a native condition at room temperature, and only well-folded protein shows expected stability. Also, any linker effect between ligated proteins can be examined. Consequently, SMFS measurement of the protein polymer not only confirms the polymer design but also provides complementary information about protein stability and folding32,33,34.
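The ligation rule described above can be illustrated with a toy string model. This is a minimal sketch, not a model of the enzyme: the helper `ligate` and the lowercase placeholder `ub` are hypothetical, and the code assumes that the C-terminal Gly-Leu of the donor leaves during ligation so that an N-G-L junction remains between the two units.

```python
def ligate(donor: str, acceptor: str) -> str:
    """Toy model of one OaAEP1 ligation between two tagged sequences."""
    if not donor.endswith("NGL"):
        raise ValueError("donor needs a C-terminal NGL tag")
    if not acceptor.startswith("GL"):
        raise ValueError("acceptor needs an N-terminal GL tag")
    # Assumption: the donor's terminal GL is released and the acceptor's
    # N-terminal GL is joined to the remaining Asn, leaving an NGL junction.
    return donor[:-2] + acceptor


# "ub" is a lowercase placeholder for the protein of interest (POI).
unit = "GL" + "ub" + "NGL"
dimer = ligate(unit, unit)
print(dimer)
```

Because the product still carries an N-terminal GL and a C-terminal NGL, it can react again in solution, which is why OaAEP1-only ligation gives a polymer of uncontrolled length.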
In addition to polyprotein engineering, OaAEP1 is implemented for protein immobilization for SMFS, together with the specific cohesin-dockerin (Coh-XDoc) receptor-ligand pair35,36,37. This configuration ensures the complete unfolding of all protein subdomains, so that the polymerization number is counted correctly. Traditionally, the polyprotein sample is deposited on a glass coverslip and is picked up by the AFM tip randomly through a non-specific interaction6,7. Although this setup is simple, the pick-up ratio of high-quality single-molecule events is very low (<0.01%). With the development of surface (bio)chemistry methods for covalent anchoring of protein (such as thiol chemistry, chloroalkane chemistry, isopeptide bonds, and click chemistry) and of specific receptor-ligand pairs for reversible linkage of the sample (such as streptavidin-biotin, cohesin-dockerin, and antigen-antibody interactions), site-specific anchoring and probing configurations for AFM experiments have become possible and are becoming a standard method36,38,39,40,41,42,43. Our method provides an alternative for this purpose that can be adopted in all of these AFM systems. Only two short peptide tags are needed for the ligation, which leads to simple protein preparation without further chemical modification.
Moreover, this OaAEP1 ligase-mediated method is cysteine-free and operates at the level of the monomeric protein module. It enables the study of challenging protein systems such as metalloproteins, which need initial purification as a monomer. Because the overexpression of a metalloprotein often results in a mixture of different metal forms, additional purification is required to obtain a pure-metal form monomer for ligation44,45.
Lastly, we take advantage of a removable TEV protease site, which is compatible with OaAEP1 ligation, to achieve the stepwise protein polymerization on the surface. When a TEV site (ENLYFQ/G) plus a leucine (L) is engineered at the N-terminus of the protein unit as ENLYFQ/G-L-POI, the TEV cleavage results in an N-terminal GL residue as GL-POI, and is compatible with further OaAEP1 ligation. Consequently, an enzymatic, stepwise biosynthesis of polyprotein with a relatively well-controlled sequence is achieved. Thus, our enzymatic method provides a method for polyprotein sample preparation, both sequence-uncontrolled and controlled, as well as protein immobilization for single-molecule studies, especially for the complex metalloprotein system46,47,48,49,50.
A sequence-uncontrolled polyprotein built by OaAEP1-only
Ubiquitin (Ub), which has been well characterized by single-molecule AFM, was chosen for demonstration. Poly-ubiquitin is also a natural signal for protein degradation. A construct GL-Ub-NGL was built for the OaAEP1 polymerization and reacted in buffer solution. The Coomassie-stained SDS-PAGE gel of the product clearly showed the rapid construction of Ub polymer at least up to a pentamer within 20 min (Fig. 1b). Quantitatively, ~25% of the protein unit was ligated to a dimer and 10% to a trimer, while ~60% of the protein remained a monomer (Fig. 1b). As expected, the polyprotein length was uncontrolled, and the yields diminished rapidly as the chain grew. Nevertheless, the yields and dispersity were comparable to those of most other protein monomer-based polymerization methods, such as the bi-cysteine and sortase-based methods.
In addition, we found that OaAEP1 is also a robust ligase under harsh conditions, such as under the acidic solutions and in the presence of metal ions. First, the same ubiquitin polymerization reaction was tested under different pH levels. The SDS-PAGE gel results showed the polymerization was well-performed under acidic condition (pH 4 to 7) (Fig. 1c). Moreover, the reaction was performed in the presence of the most biologically-relevant metal ions. Twelve different metal ions including Fe(III), Co(II), Ni(II), Cu(II), Zn(II), Mn(II), Ca(II), Mg(II), Al(III), Cd(II), Hg(II), and Pb(II) were tested, respectively. The SDS-PAGE gel results showed that the reaction was not affected by most metal ions at the concentrations of 0.2 mM, except 0.002 mM for Hg (Fig. 1d, Supplementary Fig. 1). These concentrations are much higher than the free metal level in most metalloprotein solutions. These experiments indicate that OaAEP1 ligase is a versatile enzyme that is suitable for constructing challenging protein under harsh conditions.
Next, the polyprotein sample obtained by OaAEP1-only ligation was used directly for single-molecule AFM characterization. For better performance, the sample was purified by gel-filtration chromatography to enrich higher polymerization degrees, in which the Ub tetramer was the most abundant species (Supplementary Fig. 2). The protein solution was deposited on a clean glass coverslip and was then pressed and captured by the AFM tip (Fig. 2a). Stretching the polyproteins resulted in a typical sawtooth-like force-extension curve with multiple peaks, which corresponded to the unfolding of each ubiquitin monomer (Fig. 2b). For example, seven unfolding peaks were observed in curve 2 of Fig. 2b, indicating that seven ubiquitin units were unfolded. Two key experimental results, the contour length increment (ΔLc) and the force (F) upon unfolding, were analyzed, and the results are shown in Fig. 2c. The contour length increment is related to how many protein residues are unfolded and extended upon protein unfolding. Previous single-molecule AFM unfolding experiments of ubiquitin polyprotein built by the recombinant DNA method showed a ΔLc of ~24 nm, which was from the full extension of the 76 amino acids of Ub (76 aa × 0.36 nm − 4 nm ≈ 23 nm, where 4 nm is the distance between the N and C termini of the folded Ub), and an average unfolding force of 203 pN51,52. Here, our ligase-built ubiquitin polymer showed comparable results, with an average ΔLc and standard deviation of 23.1 ± 2 nm, and an average unfolding force and standard deviation of 202 ± 44 pN (n = 198). These results validate our method for building polyprotein for single-molecule measurement. Moreover, the unchanged unfolding force of Ub in our construct indicates that the three new connecting NGL residues between protein monomers introduce no linker effect. Finally, there was a large dispersity, as between one and seven ubiquitin unfolding peaks were recorded for each molecule.
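The contour length estimate quoted above can be reproduced in a few lines. This is a minimal sketch using only the numbers given in the text (0.36 nm per residue, 76 residues, 4 nm between the termini of folded Ub); the function name `expected_delta_lc` is a hypothetical helper introduced for illustration.

```python
RESIDUE_LENGTH_NM = 0.36  # contour length per amino acid, as used in the text


def expected_delta_lc(n_residues: int, folded_end_to_end_nm: float) -> float:
    """Expected contour length increment (nm) on unfolding one domain.

    The fully extended chain length minus the distance already spanned
    by the folded domain between its N and C termini.
    """
    return n_residues * RESIDUE_LENGTH_NM - folded_end_to_end_nm


# Ubiquitin: 76 residues, 4 nm between termini in the folded state.
ub_delta_lc = expected_delta_lc(76, 4.0)
print(f"Expected dLc for Ub: {ub_delta_lc:.2f} nm")
```

The estimate lies within one standard deviation of the measured 23.1 ± 2 nm, which is the comparison the text relies on to assign each force peak to one Ub domain.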
Thus, more experiments were performed to obtain statistics. As shown in Fig. 2d, most curves showed three Ub unfolding peaks (n = 51, 52%) or four Ub unfolding peaks (n = 26, 27%). Curves with only one or two ubiquitin unfolding peaks were not analyzed, as the signal from such a short molecule is strongly affected by non-specific interactions.
Notably, the Ub trimer (52%) rather than the tetramer (27%) was detected most often, which differs from the composition of the polyprotein mixture identified earlier (Supplementary Fig. 2). Here, the polyprotein sample adsorbed on the coverslip was picked up at a random location along the long protein polymer, through a weak non-specific interaction between the protein and the AFM tip or coverslip. It is therefore possible that the full-length protein polymer was not completely captured and unfolded. For example, the seven unfolding peaks of Ub observed above (Fig. 2b) cannot guarantee that a ubiquitin heptamer was captured. It was perhaps an octamer, but the tip pressed on only the seventh Ub domain, so one domain was not unfolded and counted. Another possibility is that an octamer was fully captured, but the stability of one domain was higher than the non-specific detachment force of the protein from either the AFM tip or the glass coverslip. In that case, the domain remained folded when the octamer detached. Both scenarios lead to force-extension curves with fewer unfolding peaks and a large dispersity.
The covalent immobilization of (poly)protein in the AFM system
To solve this problem, a covalent attachment of the protein sample in the AFM system with defined immobilization geometry and strong attachment force was developed using OaAEP1 as well. We demonstrated this application based on a strong and reversible type III cohesin-dockerin-Xmodule (Coh-XDoc, or Coh-Doc) receptor-ligand pair developed by Dr. Nash and Dr. Gaub for single-molecule studies35,53. Protein polymer, GL-(POI)n-NGL, as obtained above, was directly used here. First, it was covalently linked to the NH2-Gly-Leu functionalized glass coverslip using OaAEP1 through its C-terminal Asn-Gly-Leu residues. Then, it was ligated with the Coh-NGL through its N-terminus as Coh-(POI)n-NGL-Glass. Finally, it was directly probed by a Protein Marker-XDoc functionalized AFM tip forming a complete force loop: AFM Tip-Protein Marker-(XDoc-Coh)-(POI)n-Glass (Fig. 3a). The experimental details are described in Supplementary Methods. (Ub)n obtained previously was used directly and measured in this configuration. It showed similar unfolding force and ΔLc results to before (Fig. 3b). In contrast, from the statistical analysis, it is clear that protein with a larger polymerization number was detected with higher frequency, and the tetramer presents most now (32%, Fig. 3c, d). This indicates the polyprotein was stretched between the two ends in this configuration. Fifty-two curves were randomly selected for force analysis (Fig. 3e). The force peak with a ΔLc of ~55 nm was from the unfolding of the protein marker CBM (cellulose-binding module), and sometimes Xmodule in the XDoc unfolded before the Coh-XDoc complex dissociated, with a ΔLc of ~34 nm 35.
To verify that all subdomains were unfolded in this covalent attachment setup, a (Ub)6 polyprotein with a known number of subdomains was built by the recombinant DNA method and tested. Coh-(Ub)6 was used for coverslip attachment, and CBM-XDoc was used for the tip functionalization. As a result, a polyprotein CBM-(XDoc-Coh)-(Ub)6 including one marker protein CBM and six ubiquitin domains was stretched. Most AFM measurements showed the expected full-length polyprotein unfolding scenario with six Ub (292 out of 322, 91%), as demonstrated by the overlap of the force-extension curves with CBM unfolding first (Fig. 3f, n = 52). Consequently, this OaAEP1-facilitated covalent protein immobilization method enables the complete unfolding and correct counting of folded subdomains in the polyprotein molecule.
Pure-metal form polyprotein built from the purified monomer
Our monomer-based ligation method enables the study of more challenging protein systems such as metalloproteins, which sometimes show several different metal forms and need purification as a monomer first. A well-characterized iron-sulfur protein, rubredoxin (Rd), with a native Fe(Cys)4 metal center was chosen for demonstration. The overexpression of rubredoxin in E. coli results in a mixture of the native iron form and a zinc-substituted form, so a polyprotein built by the classic recombinant DNA method has a mixed and uncontrolled metal form (Fig. 4a)44. As a monomer, by contrast, the protein mixture can first be separated into pure Fe-Rd and pure Zn-Rd using ion-exchange chromatography (Fig. 4b). The purity of each form was confirmed by its characteristic UV-Vis spectrum (Fig. 4c). As a result, the pure-metal form Rd can be polymerized into the poly-metalloprotein samples (Fe-Rd)n and (Zn-Rd)n for AFM measurement. Here, we fused the marker protein GB1 to Rd as GL-(GB1-Rd)-NGL for the experiment (Fig. 4d). GB1 was used as a single-molecule fingerprint with a known ΔLc of 18 nm as well as an internal force caliper (178 pN). A previous cysteine-based protein coupling method using C-(GB1-Rd)-C showed that the unfolding force of Fe(III)-Rd was 211 pN with a ΔLc of 12.6 nm54. Here, the AFM measurement of the two polyproteins built by our method showed comparable results: 194 ± 63 pN (n = 184) and 12.6 ± 1.5 nm for Fe(III)-Rd (Fig. 4e), and 124 ± 52 pN (n = 246) and 12.4 ± 1.7 nm for Zn-Rd (Fig. 4f).
Sequence-controlled polyproteins built on the glass surface
To rationally control the sequence of the polyprotein built by OaAEP1, we first validated in principle that the TEV cleavage site is compatible with our OaAEP1 ligation system for a stepwise protein polymerization. As shown in the schematic of Fig. 5a, when a TEV site (ENLYFQ/G) plus a leucine (L) was added at the N-terminal part of a protein, the TEV protease cleavage results in an N-terminal Gly-Leu. As a result, the cleaved protein was then compatible with further OaAEP1 ligation and ultimately led to a protein polymer. Ubiquitin was used here for demonstration again. The SDS-PAGE gel result proved the validity of such a procedure for the construction of ubiquitin dimer based on Coh-tev-L-Ub (for cleavage) and Coh-tev-L-Ub-NGL (for ligation) (Fig. 5b).
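The compatibility between TEV cleavage and OaAEP1 ligation can be illustrated with a toy sequence model. This is a minimal sketch under stated assumptions: sequences are plain strings, `coh` and `ub` are lowercase placeholders for the cohesin and ubiquitin domains, and `tev_cleave` is a hypothetical helper that cuts between Q and G of the ENLYFQ/G site as described in the text.

```python
from typing import Tuple

TEV_SITE = "ENLYFQG"  # TEV recognition site; cleavage occurs between Q and G


def tev_cleave(seq: str) -> Tuple[str, str]:
    """Return (released N-terminal fragment, remaining C-terminal fragment)."""
    idx = seq.find(TEV_SITE)
    if idx == -1:
        raise ValueError("no TEV site found")
    cut = idx + len(TEV_SITE) - 1  # position just after Q
    return seq[:cut], seq[cut:]


# Coh-tev-L-Ub-NGL, with lowercase placeholders for the folded domains.
construct = "coh" + TEV_SITE + "L" + "ub" + "NGL"
released, exposed = tev_cleave(construct)
print(released, exposed)
```

The remaining fragment starts with Gly-Leu, the N-terminal tag that OaAEP1 needs, so the extra leucine placed after the TEV site is what turns a cleaved unit into a ligation-ready one.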
However, the ligation efficiency of OaAEP1 was not high enough for efficient polymerization. Previously, the OaAEP1-only ligation resulted in Ub dimer formation with a yield of 25%. Here, for a one-step ligation, the efficiency was 20%. Although this is satisfactory for one-step protein labeling or dimer construction, it makes building a relatively long protein oligomer/polymer challenging, because the yield decreases exponentially with the number of reaction rounds. Consequently, it is necessary to increase the ligation efficiency.
For a chemical reaction, when one reactant is added in excess, the chemical equilibrium will be pushed toward the product. Thus, we modified the ratio between the two reactants to increase the yield. The one-step ligation between Coh-tev-L-Ub-NGL and GL-Ub was used here for the test. When the two reactants were in an equal molar ratio, the ligation efficiency was 20%. By increasing their ratio from 1 to 3, the yield improved to 40%, and 50% at a ratio of 5. Finally, when a ratio of 10 to 1 was used, the efficiency increased to 75% (Fig. 5c). Here, the ligation efficiency is calculated based on the formation of the product, Coh-tev-L-(Ub)2 dimer, whose band color in the gel intensifies clearly as the reactant ratio increases. To resolve between the reactant monomer Coh-tev-L-Ub-NGL and the product dimer, the sample migrates for a long time. As a result, the band of the other smaller reactant monomer GL-Ub is unclear upon the long migration. By increasing the concentration ten times for all the samples, the decrease of this monomer can also be detected and used for the efficiency calculation with similar results (Supplementary Fig. 3). Consequently, the yield (75%) for OaAEP1-ligation is obtained and is enough to build a polyprotein.
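The effect of per-step efficiency on stepwise polymerization can be estimated with a simple calculation. This is a rough sketch, assuming that every ligation round succeeds independently with the same efficiency and that a chain of n units needs n − 1 successful rounds after the first unit is attached; the function name `full_length_fraction` is hypothetical, and the model ignores surface effects and partial products.

```python
def full_length_fraction(step_efficiency: float, n_units: int) -> float:
    """Fraction of chains reaching n_units under independent, equal steps."""
    if not 0.0 <= step_efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if n_units < 1:
        raise ValueError("a chain has at least one unit")
    return step_efficiency ** (n_units - 1)


for efficiency in (0.20, 0.75):
    for n_units in (2, 3, 5):
        fraction = full_length_fraction(efficiency, n_units)
        print(f"efficiency {efficiency:.0%}, {n_units}-mer: {fraction:.2%}")
```

Under these assumptions a pentamer would form in well under 1% of chains at 20% efficiency but in roughly a third of chains at 75%, which is why raising the single-step yield matters so much for stepwise construction.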
We then built the protein polymer on a functionalized glass surface based on the stepwise OaAEP1 ligation and TEV cleavage reaction (Fig. 6a). First, the C-terminus of the protein unit was linked to the glass surface which naturally protects the protein unit from the sequence-uncontrolled polymerization. In addition, it allows the removal of excessive protein monomer and enzymes by simple buffer washing. Moreover, the resultant polymer was ready for single-molecule AFM characterization. We first ligated the monomer Coh-tev-L-Ub-NGL on the NH2-Gly-Leu (GL) functionalized glass coverslip as the Coh-tev-L-(Ub)1-NGL-Glass, denoted as (Ub)1 here for simplicity. The cohesin incorporated here was for single-molecule AFM measurement using an XDoc-functionalized tip. The AFM results of (Ub)1 showed the corresponding unfolding event with only one Ub peak (Number, N = 54). Then, (Ub)1 was cleaved by protease TEV as GL-(Ub)1-NGL-Glass to expose the GL residues for the second-round ligation. By adding the protein unit Coh-tev-L-Ub-NGL in excess with OaAEP1, Coh-tev-L-(Ub)2-NGL-Glass as a ubiquitin dimer was obtained. (Ub)2 was also characterized by single-molecule AFM, showing the corresponding dimer formation (N = 155, 88%). By repeating this stepwise cleavage and ligation procedure, we obtained ubiquitin polymer up to (Ub)5. The AFM experiments showed that 31% (N = 83) of curves had five Ub unfolding peaks (Fig. 6c). Through the whole process, the same polyprotein sample was characterized at each polymerization stage using the same AFM tip, and the maximum unfolding peaks of polymer picked up corresponded to the desired length (Fig. 6b). Nevertheless, there was still a fraction of molecules that were not reacted, as the ligation efficiency was not 100%. Indeed, when the polymer grew to hexamer, the yield became even lower (Supplementary Fig. 4).
Similarly, metalloprotein Rd was constructed as a sequence-controlled polyprotein on the glass surface and characterized. It demonstrated the feasibility of this stepwise method for constructing metalloprotein polymer (Fig. 6b). Finally, the protein copolymer (Ub-Rd)3 was also built by adding the protein units Ub and Rd one by one. Qualitatively, protein polymer built up to decamer can still be captured (Supplementary Fig. 5). Nevertheless, the yield becomes too low after five ligation cycles, and only a few curves can be found. Generally speaking, a protein pentamer can be obtained with a reasonable yield.
In this study, we have developed a simple, enzymatic methodology for building both sequence-uncontrolled and sequence-controlled polyproteins using protein ligase OaAEP1 and protease TEV. The protein ligation and polymerization were achieved under a mild condition, which should be applicable for most delicate proteins. In addition, the OaAEP1 is robust under harsh conditions, such as acidic solution and additional metal ions, which further expands the application of this method to build complex proteins. Most importantly, only two and three short residues as peptide tags are needed for the OaAEP1 ligation, which leads to minimal perturbance to the protein. The resultant peptide linkage between proteins is both thermally and mechanically stable, proved by the joint SDS-PAGE gel and single-molecule force spectroscopy experiments.
By adding one reactant in excess, the ligation efficiency increases from 20 to 75%. This allows for the construction of a relatively long protein oligomer. Statistically, pentameric protein, such as (Ub)5, can be obtained on the glass surface with a small polydispersity and percentage of ~30%. And a long protein decamer (Ub-Rd)5 can be obtained but only with several molecules under current ligation efficiency. Further improvement of the ligation efficiency or a new method to purify the resultant protein is necessary to achieve such a long protein polymer. Nevertheless, these results are already an advance compared with other ligase-based protein ligation method. Previous protein ligase-dependent polymerization seldom reported the construction of a protein pentamer, and the yield for a protein tetramer was <1% at the single-molecule level12,55. Thus, our method provides a tool for biotechnology and protein engineering, which is suitable for both protein coupling and immobilization.
Other enzymatic methods are well suited for linking proteins as a dimer or for protein labeling, such as sortase, split intein, butelase, and SpyCatcher-SpyTag12,55,56,57. Sortase is a similar peptidase that is promising for building long protein polymers. Several groups have used it for single-molecule AFM studies, both for covalent protein attachment and for polyprotein construction, with important discoveries12,53. Unfortunately, wild-type sortase is not a strict ligase and can hydrolyze the ligated linker itself. The engineering of a better sortase is under intense study, with substantial improvements being developed58. The SnoopTag/SnoopCatcher system, based on the SpyCatcher-SpyTag system, is another powerful ligation method with a high ligation efficiency of >95%, achieved by forming an intermolecular isopeptide bond59,60. Thus, a long protein polymer such as a decamer can be obtained in a similar, stepwise fashion with a high yield. By comparison, OaAEP1 requires only two and three additional amino acids at the two termini, and a short linkage of three amino acids is present after ligation. Thus, either of these two methods may be applied depending on the requirements.
From the perspective of single-molecule study, our method also provides a way to construct polyprotein samples with a better-controlled length from monomeric protein units. Recombinant DNA technology has been employed to build polyproteins for single-molecule AFM study, which are composed of multiple identical or different protein domains. Stretching such a polyprotein results in characteristic sawtooth-like force-extension curves from the stepwise unfolding of each domain under force. This strategy significantly increases data reliability and collection efficiency and has become the gold standard for AFM-based SMFS7. However, it relies on repetitive cloning cycles and is time-consuming. Recently, a much more efficient Gibson assembly-based method was developed for building long polyprotein genes61,62. Nevertheless, the engineering of toxic or large protein polymers is often challenging, as misfolding or the inability to express a large protein often occurs and limits the application of this approach.
Another simple approach is to build the polymer at the monomeric protein level, by expressing individual protein monomers first and then conjugating them as a polyprotein. A bi-cysteine-based protein coupling method was first developed and is widely used. By engineering two cysteines on a single protein, the formation of intermolecular disulfide bonds enables the construction of polyprotein and has led to many important discoveries4,63. This approach is efficient and enables the construction of larger proteins such as poly-GFP. A modified method was later developed in which a reduction-resistant thioether bond is formed using maleimide-thiol chemistry64. Recently, similar ligase-based polymerization methods using sortase were also developed12,55. All these methods enable the pre-purification of protein monomer before polymerization. Thus, they are suitable for building complex proteins such as metalloproteins. However, these polymerization methods all yield a sequence-uncontrolled protein polymer with a large dispersity. The protein units are also linked in a mixed geometry of head-to-tail and head-to-head. For the sortase-based method, a long protein polymer was difficult to obtain due to the low ligation efficiency12,55. Here, our method utilizes a cysteine-free ligase, OaAEP1, for protein ligation, polymerization, and immobilization with a relatively high yield.
In conclusion, we develop an enzymatic method to synthesize both sequence-controlled and non-controlled polymerized protein. The robustness and high efficiency of the ligase enable the engineering and study of a wide range of protein, both delicate and complex, as well as providing an efficient way for both protein sample coupling and immobilization for single-molecule studies.
The gene coding for protein of interest: ubiquitin from human (Ub), rubredoxins (Rd, Rd represents Zn-formed rubredoxin from Clostridium pasteurianum, if not specified, Fe-form rubredoxin from Pyrococcus furiosus), the B1 domain of immunoglobulin G (GB1), the cellulose-binding module (CBM), type III cohesin-dockerin-X module domain complex from Ruminococcus flavefaciens (Coh-Xmodule-Doc, or Coh-XDoc), tobacco etch virus (TEV) protease (fused with superfold GFP as GFP-TEV for use), elastin-like polypeptides (ELP) were ordered from Genscript Inc, respectively. Regular PCR procedure was used for the further addition of N-terminal GL and C-terminal NGL to the protein if needed. Typically, a three-restriction digestion enzyme system BamHI-BglII-KpnI was used for connecting the gene of different protein fragments7. The same overhang after BamHI and BglII digestion allows the stepwise ligation between their fragments. All genes were finally confirmed by DNA sequencing from GenScript Inc. Typical protein overexpression and purification procedure, the general method for OaAEP1-only polymerization, and corresponding amino acid sequences are all provided in Supplementary Methods and Notes.
First, the pQE80L-POI or pET28a-POI expression plasmids were transformed into E. coli BL21(DE3) cells. Single colonies were picked into LB medium containing 100 µg mL−1 ampicillin sodium salt or 50 µg mL−1 kanamycin (continuous shaking, 37 °C, 16–20 h). After growth to saturation, the overnight cultures were diluted 1:50 into fresh LB medium containing ampicillin sodium salt or kanamycin and grown with continuous shaking at 37 °C for ~3 h. Fe(III)-Rd and Zn-Rd were instead overexpressed in M9 medium (supplemented with 0.4% glucose, 0.1 mM CaCl2, 2 mM MgSO4) with continuous shaking for 6 h. The cultures were induced with 1 mM isopropyl β-D-thiogalactoside (IPTG) when OD600 reached ~0.6 (for rubredoxin, 1 mM FeCl3 or ZnCl2 was added). The bacterial cultures were incubated for an additional 4–6 h (37 °C). Finally, 400 mL of bacterial culture was pelleted by centrifugation (13,260 × g, 25 min, 4 °C) and stored at −80 °C before purification.
The cells were then resuspended in 25 mL lysis buffer (50 mM Tris, pH 7.4) and lysed on ice using a Biosafer sonicator (15% amplitude for 30 min). The lysate was centrifuged (19,632 × g, 40 min, 4 °C) to pellet cell fragments and the supernatant fluids were applied to a Co-NTA or Ni-NTA affinity column (TALON) and washed with buffer containing 20 mM Tris, 400 mM NaCl, 2 mM imidazole, pH 7.4. The bound protein was eluted with elution buffer (20 mM Tris, 400 mM NaCl, 250 mM imidazole, pH 7.4). For rubredoxin we used an anion exchange chromatography (Mono Q 5/50 GL GE Healthcare) using a continuous salt gradient of 0–30% of buffer B (50 mM Tris, 1 M NaCl, pH 8.5) and then a size-exclusion chromatography (Superdex 200 increase 10/300 GL GE Healthcare) that had been pre-equilibrated in 50 mM Tris, 100 mM NaCl, pH 7.4 buffer in an AKTA FPLC system (GE Healthcare) for further purification to ensure the purity >95%.
Stepwise polyprotein preparation with controlled sequences
Here we used the sample preparation for ubiquitin homo-polymer, Coh-tev-L-(Ub)n-NGL, as an example. Ligation unit Coh-tev-L-Ub-NGL was first linked to the GL-ELP50nm-C functionalized coverslip by OaAEP1 and the C-terminus was blocked. Then, the TEV protease solution was added on the coverslip to cleave the TEV site in the protein unit (0.4 mg mL−1 TEV protease 100 μL, 75 mM NaCl, 0.5 mM EDTA, 25 mM Tris-HCl, pH 8.0, 10% [v/v] glycerol). Typically, it was reacted for ~1 hour at 25 °C to produce GL-Ub-NGL-glass with exposed N-terminal GL residues. TEV protease was then washed away. Then, ~5 times the amount of the ligation unit, compared with the first time, was added to the solution for the stepwise ligation by OaAEP1. Due to the incomplete ligation reaction, we estimated that the real ratio between the two reactants was beyond 10. As a result, ubiquitin dimer was obtained on the glass surface as Coh-tev-L-(Ub)2-NGL-Glass. To reach the desired polymerization number N, this stepwise ligation and cleavage procedure was then repeated for N-1 times, leading to the protein polymer GL-(Ub)n-NGL-Glass. The final TEV cleavage was omitted to obtain Coh-tev-L-(Ub)n-NGL-Glass, which was ready for single-molecule AFM experiment using a Coh-XDoc system. A similar procedure was used to build other protein homo-polymer (Rd)n, and the hetero-polymer (Ub-Rd)n.
Functionalized coverslip surface preparation
Glass coverslips (Sail Brand, China) were cleaned and activated by chromic acid treatment for 30 min at 80 °C. For amino-silanization, the coverslips were completely submerged in a 1% (v/v) solution of APTES in toluene for 1 h at room temperature, protected from light. The coverslips were then washed with toluene and absolute ethyl alcohol, dried under a stream of nitrogen, and incubated at 80 °C for 15 min. After this immobilization step, the coverslips were cooled to room temperature. Two hundred microliters of sulfo-SMCC solution (1 mg mL−1 in dimethyl sulfoxide, DMSO) was added between two silanized coverslips and incubated for 1 h, protected from light. The coverslips were washed first with DMSO and then with absolute ethyl alcohol to remove residual sulfo-SMCC, and dried under a stream of nitrogen. Then, 200 μL of 200 μM GL-ELP50nm-C protein solution was pipetted onto a functionalized coverslip and incubated for ~3 h. Finally, the coverslip was washed with Milli-Q water to remove unreacted GL-ELP50nm-C and was used immediately or stored at 4 °C.
If necessary, Cohesin-NGL was linked to the proteins of interest (POIs), such as GL-(Ub)n-NGL. OaAEP1-catalyzed coupling of the surface-bound GL-ELP50nm-C to Coh-POIs-NGL was carried out in the measurement buffer for 20–30 min. After washing with the measurement buffer, the sample cell was used for AFM-SMFS measurements.
Functionalized cantilever surface preparation
Silicon nitride cantilevers (MLCT, Bruker Corp.) were used as force probes. The surface chemistry of the cantilevers was similar to that of the coverslips. Cantilevers were cleaned by chromic acid treatment for 10 min at 80 °C, functionalized by amino-silanization with APTES, and then conjugated to sulfo-SMCC. C-ELP50nm-NGL was linked to the surface via the maleimide group of sulfo-SMCC for 1.5 h, and the unreacted ELP was removed with Milli-Q water. The ELP-functionalized cantilever was then immersed in 200 μL of 50 μM GL-CBM-XDoc protein solution containing 200 nM OaAEP1.
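As a worked check of the quantities in the last step, the sketch below converts the stated volume and concentrations into molar amounts and a substrate-to-enzyme ratio. The helper name `amount_pmol` is hypothetical, and the calculation assumes that both concentrations refer to the same 200 μL volume.

```python
def amount_pmol(volume_ul: float, concentration_um: float) -> float:
    """Amount of solute in pmol: 1 µL x 1 µM = 1e-6 L x 1e-6 mol/L = 1 pmol."""
    return volume_ul * concentration_um


VOLUME_UL = 200.0      # volume of the GL-CBM-XDoc solution
SUBSTRATE_UM = 50.0    # GL-CBM-XDoc concentration
ENZYME_UM = 0.2        # 200 nM OaAEP1 expressed in µM

substrate_pmol = amount_pmol(VOLUME_UL, SUBSTRATE_UM)
enzyme_pmol = amount_pmol(VOLUME_UL, ENZYME_UM)
substrate_to_enzyme = substrate_pmol / enzyme_pmol

print(f"GL-CBM-XDoc: {substrate_pmol / 1000:.1f} nmol")
print(f"OaAEP1: {enzyme_pmol:.1f} pmol")
print(f"substrate : enzyme = {substrate_to_enzyme:.0f} : 1")
```

Under these assumptions the solution holds 10 nmol of GL-CBM-XDoc and 40 pmol of OaAEP1, a 250-fold molar excess of substrate over ligase.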
Covalent attachment method
First, the silanized glass coverslip was functionalized with GL-ELP-C using thiol-maleimide chemistry, with the ELP serving as a spacer. Next, the polyprotein was covalently linked to the coverslip by OaAEP1 (Fig. 3a, Step 1), followed by ligation with Coh-NGL to give Coh-(POI)n-glass (Step 2). Similarly, the C-ELP-NGL functionalized AFM tip was linked to GL-CBM-XDoc (Step 3). Consequently, the cohesin-dockerin pair formed when the tip was pressed onto the coverslip, giving Tip-CBM-(XDoc-Coh)-(POI)n-glass (Step 4).
AFM-based SMFS experiments were performed on a Nanowizard 4 AFM (JPK, Germany). MLCT cantilevers with a spring constant (k) of ~40 pN nm−1 were used. Before each experiment, the spring constant of each cantilever was calibrated in solution using the equipartition theorem. All proteins were measured in AFM measurement buffer (100 mM Tris, 100 mM NaCl, pH 7.4). For measurements using non-specific interaction, 10 μL of the polyprotein sample at a concentration of ~1 mg mL−1 was diluted into 30 μL of measurement buffer and added to a clean glass coverslip. The protein was allowed to adsorb for 30 min. Unbound protein was washed away with 2 mL of the measurement buffer, and 1.5 mL of measurement buffer was used to cover the coverslip. The tip of the AFM cantilever was pressed onto the protein-deposited surface at a contact force of ~0.5 nN for hundreds of milliseconds; a single polyprotein molecule was picked up at a ratio of ~0.01% and stretched at a constant pulling velocity of 400 nm s−1 in all experiments. The spring constants of the cantilevers were ~44 pN nm−1 (non-specific protein immobilization) and 45 pN nm−1 (specific immobilization) for (Ub)n; 39 pN nm−1 for Coh-(Ub)6; ~56 and ~48 pN nm−1 for (GB1-Fe(III)-Rd)n; ~53 and ~41 pN nm−1 for (GB1-Zn-Rd)n; ~34 pN nm−1 for (Rd)6; ~61 pN nm−1 for (Ub)6; ~100 pN nm−1 for (Ub-Rd)3; and ~35 pN nm−1 for (Ub-Rd)5.
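The equipartition calibration mentioned above rests on the relation k = kB·T / ⟨x²⟩, where x is the thermal deflection of the free cantilever about its mean. The sketch below is a simplified illustration of that relation on a simulated deflection trace; it is not the instrument's calibration routine, which typically works from the thermal noise spectrum and applies mode-shape corrections. The temperature of 298.15 K, the function names, and the Gaussian noise model are assumptions.

```python
import random

K_B = 1.380649e-23  # Boltzmann constant, J/K


def spring_constant_equipartition(deflections_m, temperature_k=298.15):
    """Estimate k (N/m) from thermal deflections (m) via k = kB*T / <x^2>."""
    n = len(deflections_m)
    mean = sum(deflections_m) / n
    variance = sum((x - mean) ** 2 for x in deflections_m) / n
    return K_B * temperature_k / variance


def simulate_thermal_deflection(k_n_per_m, n_samples, temperature_k=298.15, seed=0):
    """Draw Gaussian deflections whose variance matches a cantilever of stiffness k."""
    rng = random.Random(seed)
    sigma = (K_B * temperature_k / k_n_per_m) ** 0.5
    return [rng.gauss(0.0, sigma) for _ in range(n_samples)]


# 1 pN/nm equals 1e-3 N/m, so a nominal 40 pN/nm cantilever is 0.04 N/m.
trace = simulate_thermal_deflection(k_n_per_m=0.04, n_samples=20000)
k_est = spring_constant_equipartition(trace)
print(f"estimated k = {k_est * 1e3:.1f} pN/nm")
```

At a pulling velocity of 400 nm s−1, a 40 pN nm−1 cantilever corresponds to a nominal loading rate of 400 × 40 = 16,000 pN s−1; the effective rate on the molecule is lower because the linkers and the unfolded polypeptide are compliant.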
Force-extension curves were analyzed using Igor Pro 6.12 (Wavemetrics). The curves were fitted with the worm-like-chain (WLC) model of polymer elasticity, with a persistence length of ~0.4 nm. For measurements combining covalent attachment with the specific interaction pair, the functionalized glass coverslips and cantilevers were used. In these experiments, only curves that showed the complete unfolding signature of the protein construct together with a high rupture force from cohesin-dockerin dissociation were selected, at a ratio of ~5%.
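The WLC model used for fitting is commonly written in the Marko-Siggia interpolation form, F(x) = (kB·T / p)·[1/(4(1 − x/Lc)²) − 1/4 + x/Lc], with persistence length p and contour length Lc. The sketch below evaluates this formula with p = 0.4 nm and inverts it for Lc at a single data point by bisection. It is an illustration only: the analysis in this work was done in Igor Pro, and a real fit uses many points per unfolding peak. The value kB·T = 4.114 pN nm (about 298 K) and the function names are assumptions.

```python
K_B_T_PN_NM = 4.114  # thermal energy at ~298 K in pN*nm (assumed temperature)


def wlc_force(extension_nm, contour_length_nm, persistence_length_nm=0.4):
    """Marko-Siggia interpolation formula for the WLC force in pN."""
    if not 0.0 <= extension_nm < contour_length_nm:
        raise ValueError("extension must satisfy 0 <= x < Lc")
    r = extension_nm / contour_length_nm
    return (K_B_T_PN_NM / persistence_length_nm) * (0.25 / (1.0 - r) ** 2 - 0.25 + r)


def contour_length_from_point(extension_nm, force_pn, persistence_length_nm=0.4):
    """Solve wlc_force(x, Lc) = F for Lc by bisection (one-point illustration)."""
    if extension_nm <= 0.0 or force_pn <= 0.0:
        raise ValueError("extension and force must be positive")
    # At fixed extension the force falls monotonically as Lc grows,
    # so the root can be bracketed between Lc just above x and a large Lc.
    lo = extension_nm * (1.0 + 1e-9)
    hi = 2.0 * extension_nm + 1.0
    while wlc_force(extension_nm, hi, persistence_length_nm) > force_pn:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if wlc_force(extension_nm, mid, persistence_length_nm) > force_pn:
            lo = mid  # chain still too short: force too high
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Force on a chain with Lc = 100 nm stretched to half its contour length.
print(f"F(50 nm, Lc = 100 nm) = {wlc_force(50.0, 100.0):.2f} pN")
```

Fitting consecutive unfolding peaks in this way gives a contour length per peak, and the difference between neighbouring peaks is the contour length increment used to identify which domain unfolded.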
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data supporting the findings of this work are available within the paper and its Supplementary Information files. A reporting summary for this Article is available as a Supplementary Information file. The datasets generated and analyzed during the current study are available from the corresponding author upon request. The source data underlying Figs. 2c, 3e, 4c, e, f are provided as a Source Data file.
This work was supported by the National Natural Science Foundation of China (Grant No. 21771103), the Natural Science Foundation of Jiangsu Province (Grant No. BK20160639), the Fundamental Research Funds for the Central Universities (Grant No. 14380171) and the Shuangchuang Program of Jiangsu Province for P.Z. B.W. was supported by Minister of Singapore Tier 1 Grant (2017-T1-001-168).
The authors declare no competing interests.
Peer review information: Nature Communications thanks Jorge Alegre-Cebollada, Michael Nash and the other anonymous reviewer(s) for their contribution to the peer review of this work.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Deng, Y., Wu, T., Wang, M. et al. Enzymatic biosynthesis and immobilization of polyprotein verified at the single-molecule level. Nat Commun 10, 2775 (2019). https://doi.org/10.1038/s41467-019-10696-x