Introduction

\(\beta \)-Propeller proteins can be found in all domains of life. They consist of repeating units adopting a circular shape around a central channel. These repeat units or “blades” consist of four-stranded anti-parallel \(\beta \)-sheets. The number of repeats varies between 4 and 12, with seven being the most common1. Hydrophobic interactions contribute the most to the stability of the fold, but in addition, most propellers possess a closing mechanism. In the four-bladed variants there is often a disulfide bridge between the N-terminal and C-terminal blade. In the larger propellers a so-called “Velcro closure” is found, in which the N- and C-terminal \(\beta \)-strands are part of the same \(\beta \)-sheet, keeping the propeller closed. This “Velcro” can occur in different positions; in the “Velcro 1-3” variant, the closing \(\beta \)-sheet consists of one N-terminal and three C-terminal strands, while for the “Velcro 3-1” variant, the opposite is true. Both of these configurations are common in nature2,3.

Like other repeat proteins, \(\beta \)-propellers are pseudo-symmetrical, with similar but not identical repeated motifs, suggesting evolution through multiple cycles of duplication and subsequent divergence. It is therefore believed that repeat proteins have evolved from smaller ancestral units, with diversification of the repeating units leading to the present natural sequences4,5. This process appears to have occurred independently many times, resulting in the existence of different distinct repeating motifs within the \(\beta \)-propeller family. The most archetypical are the WD40 repeats, but others including the Asp-box and Kelch repeats are also widespread4. While these families generally prefer a specific number of repeats, variations do occur. WD40 proteins for example are typically seven-fold symmetric, but six- and eight-fold proteins have also been observed6.

Repeat proteins are a favoured target for protein design because larger domains can be constructed from a smaller subset thus reducing the design space considerably7. Many of these studies aim to increase our understanding of protein evolution8. Pizza was the first successfully designed symmetrical \(\beta \)-propeller by reverse engineering the duplication-fusion process. Ancestral repeat sequences calculated from a natural protein were computationally mapped onto a designed symmetric backbone. This resulted in a symmetric \(\beta \)-propeller comprised of six identical blades9. Interestingly the protein can self-assemble from smaller fragments, validating these as probable evolutionary intermediates. The self-assembling properties of Pizza were further explored by engineering a metal-binding site into the Pizza protein, allowing it to scaffold a cadmium chloride nanocrystal, opening up possibilities for symmetric \(\beta \)-propeller proteins in the field of biomineralization10,11. Other Pizza derivatives exhibited catalytic activity12, bound inorganic metaloxoclusters13, or were joined with \(\alpha \)-helices to create protein cages14. As symmetric proteins are highly interesting for the creation of symmetric macromolecular assemblies, the same computational design strategy was subsequently used to make the eight-bladed propeller, Tako815 and the nine-bladed propeller Cake916. The latter can also fold as an eight-bladed propeller when only eight repeats are expressed. This structural plasticity may explain the origin of abundantly present odd-numbered \(\beta \)-propeller through duplication and fusion events, but it limits the applicability of Cake9 as a building block for larger assemblies. An attempt was made to create a protein that can only adopt the nine-bladed propeller fold by designing a three-fold symmetric protein instead of a nine-fold symmetric one. This resulted in the Scone protein which interestingly folds as a permuted eight-bladed propeller despite nine repeats are expressed in the open reading frame8. These unexpected observations highlight the need for further optimization of the computational design strategies in order to accurately control the folding symmetry of \(\beta \)-propellers.

In this study, we aimed to investigate whether there are geometric patterns linking the 3D structure to the fold symmetry. Perfectly symmetric \(\beta \)-propellers ease this analysis compared to natural pseudo-symmetric propellers. With Pizza6, Tako8, Cake8 and Cake9 we have six-, eight-, and nine-bladed proteins at hand. A five-bladed symmetric propeller was designed by the group of Tawfik. They selected functional proteins from an ancestral sequence library constructed from the sugar binding protein Tachylectin-2, resulting in a sequence of 47 amino acids that multimerized into a functional five-bladed propeller17. The group of Lupas more recently reported a highly repetitive propeller with 14 repeats folding as two seven-bladed propeller domains (PDB:2ymu) in Nostoc punctiforme. They named this protein “WD40-family Recently Amplified Propeller (WRAP)”. Similar to Cake however this sequence possesses the ability to adopt both multimeric eight- and nine-fold symmetry, when fragments are expressed18.

However, in nature the most commonly observed number of blades is seven, and so far no monomeric 7-bladed symmetrical propeller protein has been reported. Here, we report the crystallographic structure of a seven-fold repeating monomeric protein, based on 2ymu. We also sought to investigate the influence of the “Velcro” positioning on the stability of the protein19. As a full set of designed symmetric \(\beta \)-propellers from five to nine-fold symmetry is now available, we also analysed and compared their geometry in order deduce rules which may be used as guidelines or restraints in future designs.

Results

We began our project before a consensus sequence of the 14 WRAP blades was reported by the group of Lupas18. Therefore we independently aligned all repeats and identified the most common amino acid at each position. The WRAP repeats show almost perfect conservation with very few mutations, and the only position which is not highly conserved is the ninth residue of the inner \(\beta \)-strand, where arginine and tryptophan occur equally often. As neither amino acid dominated, we did not choose either. Choosing an arginine may have yielded a supercharged structure at the core potentially preventing stable folding, while choosing the tryptophan could have depleted the supply of this single codon amino acid during expression and therefore negatively influence the yield of the protein. Hence, we constructed a sequence logo based on 7142 sequences classified in the PROSITE database as WD40 repeats20. In agreement with previous studies21,22, we observed threonine to be the most commonly observed residue at this position. As this manually edited consensus sequence is only different in one residue compared to the WRAP structures later reported by the Lupas group18, we refer to the here reported sequence as WRAP-T. The sequence identity between WRAP-T and the highly repetitive natural propeller is 97%, with a minimum of 95% in the most divergent repeat. The template protein, repeat alignment and sequence logo are shown in Fig. 1.

Figure 1
figure 1

Design of the WRAP-T repeat. The structure 2ymu, the protein consists of 14 nearly identical WD40 repeats, folded into two seven-bladed \(\beta \)-propellers (a). A schematic representation of a single WD40 repeat. The repeat folds in four \(\beta \)-strands numbered form one to four from the inside of the cavity to the outside of the cavity (b). Sequence alignment of all 14 repeats of 2ymu, the sequence logo on top was constructed based on 7,142 sequences classified in the PROSITE database as WD40 repeats. The sequence of WRAP-T is shown on the bottom accompanied by a diagram showing the location of the \(\beta \)-strands (c). Panel (a) was created with PyMOL23.

In order to investigate the influence of the “Velcro closure” position we designed four synthetic DNA fragments expressing circularly permuted versions of the protein : nvWRAP-T without a “Velcro strap”, v31WRAP-T with a “Velcro” close to the central cavity, v22WRAP-T with a “Velcro” in the middle of the four \(\beta \)-strands and v31WRAP-T with the outer “Velcro closure” (also present in the model protein 2ymu). These “Velcro” mutants were created by shifting the N- and C-termini to the inside of the loop region between \(\beta \)-strands. To avoid bad interactions between the close termini, the N-terminal residue was removed. We decided to position the termini so that a glycine residue would be removed, theorizing this residues contribute little to the overall interaction energy and stability of the fold. In the case of V22WRAP-T, this was impossible as the loop lacks a glycine. Hence the central aspartate was removed instead. A model and diagram of the first repeat are shown in Fig. 2.

Figure 2
figure 2

Construction of the “Velcro” variants, each “Velcro” protein has the same sequence but the N-terminal amino acid is shifted within the repeat thus resulting in a different location of the “Velcro”. The first repeat is shown in colour, underneath each structure a diagram is shown illustrating the order of \(\beta \)-strands within the repeat. This figure was created with PyMOL23.

The DNA fragments were cloned into a pET28 vector, and expressed in E. coli. Purification with a Ni-NTA column and size-exclusion chromatography yielded protein of high purity, suitable for crystallization. Each protein variant was monodisperse and eluted at the same volume, indicating the same molecular size, which was verified using analytical gel filtration, see Fig. 3. Optimal crystallization conditions were determined using standard screens.

Figure 3
figure 3

Purification and stability data for the WRAP-T proteins. Normalized analytical size exclusion indicates all proteins have the same hydrodynamic volume (a). Measuring the CD signal at 218 nm while increasing the temperature shows the melting point of each protein see data Table 1 (b). The fluorescent signal of tryptophan was followed at 320 nm this allows for the calculation of fraction denatured under increasing Guanidine hydrochloride concentration (c).

Each protein crystallized in a different condition. High quality X-ray diffraction data-sets (see Table 1) were obtained with high resolution limits between 1.4 and 1.8 Å. Initial models were obtained using molecular replacement, starting with a propeller domain from PDB entry 2ymu. While the space group and packing differ between the crystals, the structure is identical for each protein. The backbone RMSD between each artificial protein and the 2ymu template is below 0.4 Å. The crystal structures also show that the proteins are highly internally symmetric with a backbone RMSD between individual repeats below 0.3 Å. The characteristic hydrogen bridge network in WD40 repeats, between a conserved histidine, serine, aspartate and tryptophan is clearly visible in the electron density, see Fig. 4.

Table 1 Data collection and refinement statistics of X-ray structures.
Figure 4
figure 4

The hydrogen bonding network in WD40 repeats. A single repeat from the crystal structure nvWRAP-T showing the highly conserved hydrogen bonding network between 111:Trp, 101:Ser, 83:His and 105:Asp. The electron density is shown as a \(F_c - F_o\) map at sigma level 1. Beside the structure a diagram indicating the hydrogen network is shown. This figure was created with PyMOL23.

In order to investigate the influence of the “Velcro” position on stability, the CD signal at 218 nm was monitored while raising the protein temperature, see Fig. 3. The resulting curves were fitted with a Boltzmann-sigmoid to determine the melting temperature and \(\Delta \)H, see Table 2. Thermal reversibility was tested by heating the samples to 85 \(^\circ \)C and then back to 20 \(^\circ \)C to compare the spectrum. Chemical stability was tested by measuring intensity changes in trypthophan fluorescence at the 330 nm wavelength under increasing concentrations of guanidine hydrochloride (GdnHCl). From this, the fraction of denatured protein could be calculated by assuming the protein is completely unfolded at the highest concentration. When possible this data was fitted with a Boltzmann-sigmoid curve to determine the concentration at which 50% of the protein is denatured and the proportionality constant (m), see Table 2. v13WRAP-T remained stable at high GdnHCl concentration thus fitting the data was not possible.

Table 2 Stability parameters of the proteins.

Discussion

A symmetric seven-bladed \(\beta \)-propeller was designed from a highly symmetric protein with unknown function (PDB:2ymu). This design completes the arsenal of designed symmetric \(\beta \)-propellers from five- to nine-fold symmetry. All these proteins consist of a motif between 40 and 49 amino acids long that is repeated multiple times. Some motifs permit multiple symmetries. The Cake sequence for example can both fold into an eight- and nine-bladed propeller depending on the number of repeats expressed16, and the WRAP sequence has also been shown to possess this property18. Most sequences however will only properly fold in a specific symmetry; Pizza will only adopt a six-bladed propeller, even when seven repeats are tandemly placed in a single polypeptide9. So far it has proved impossible to predict accurately the correct folding symmetry for a designed sequence, or to assess whether a designed structure will correctly fold. In a recent study we designed a nine-bladed propeller that was found in practice to form an eight-bladed propeller24. In order to design confidently a \(\beta \)-propeller ab initio, improved understanding of these structures is required. Analysis of artificial as well as natural proteins will assist in this process.

Until now, no successful design of a seven-bladed propeller has been reported. The first attempt was done by the group of M. Paoli in 2006. They created a consensus sequence, taking into account structural alignments to assess the selected amino acids, but this evaluation lacked a computational evaluation of the sequence. The resulting protein had a low stability, and the structure could not be verified, thus the authors concluded that is had adopted a molten globule state. The group of Baker also attempted to create WD40 \(\beta \)-propeller proteins utilizing Rosetta De Novo design25 with additional structural and sequence constraints from the protein family. However, none of their 13 designs including three seven-bladed propellers could be structurally verified26. This might be attributed to the worse RosettaHoles scores27 observed for these proteins compared to their other designs. This indicates worse side chain packing in these proteins. We attempted to use our previously reported \(\hbox {RE}_3\)Volutionary design method28, which was successful in designing the Pizza and Cake proteins, to design a seven-bladed propeller. The first attempt used the \(\beta \)-propeller domain of PAF-AH Holoenzyme from Homo Sapiens (PDB: 1vyh) as template, a second attempt used the beta-1 subunit form Bos Taurus Guanine nucleotide-binding protein. Both designs (sequences in Supporting materials) failed to express. For the creation of the WRAP-T protein an extremely repetitive natural template was used. This design was probably successful because only limited changes were made to the natural protein, while two previous designs had less then 50% sequence identity to their natural templates.

All the WRAP-T permutants expressed well and could be purified easily. Solving the crystal structures revealed that they possess an identical structure. Interestingly, the different position of the chain termini had an influence on the preferred crystallisation conditions and crystal packing. While for previously designed symmetric proteins, the proteins typically packed in distinct symmetric ways reflecting the inherent six-, eight- or nine-fold symmetry of the proteins, here no clear symmetric interactions could be observed within the lattice. This may be expected given the incompatibility of seven-fold internal symmetry with any space group. Despite the non-equivalent positions of each blade within the crystal structures, the proteins proved to be perfectly symmetric, with a TM-score above 0.97 calculated by CE-symm29. Analysing the thermal and chemical stability of the WRAP-T permutants revealed that nvWRAP-T, lacking the velcro closure, is just as stable as the variant v13WRAP-T. While v31WRAP-T is considerably more resistant to GdnHCl induced denaturation compared to the other proteins, its increased thermal stability is not that significant. As expected, v22WRAP-T is the least stable protein, reflecting the fact this conformation is almost never seen in nature.

In a previous study, we investigated the influence of the “Velcro” strap on the stability of the Pizza protein, which belongs to the NHL family of propeller proteins19. It was found that the two“non-Velcro” Pizza variants (respectively having the termini just in front of the first strand, or behind the last strand of the blade) are the least stable, having a melting temperature of 41.78 \(^{\circ }\)C and 45.32 \(^{\circ }\)C compared to 52.48 \(^{\circ }\)C for the regular Pizza. The “Velcro 2-2” variant is the least stable Pizza with a “Velcro”, having a melting temperature of 44.13 \(^{\circ }\)C. The “Velcro 3-1” variant is almost as stable as the regular Pizza, with a melting temperature of 52.48 \(^{\circ }\)C. The same trends can be found for chemical denaturation, with nv1Pizza6 having a \(C_{m}\) value of 0.660 M, nv2Pizza6 a value of 0.782 M and v22Pizza6 a value of 0.749 M, this can be compared to a value of 0.949 M for the regular Pizza. Just as with WRAP-T, the inside “Velcro3-1” position is highly resistant to chemical denaturation.

In both Pizza and WRAP-T, the “Velcro 2-2” conformation is the least stabilising, which presumably explains why it is rarely observed in nature. The few exceptions include the RCC1-like domains30, which are only distantly related to other \(\beta \)-propellers4. For WD40 repeats this decreased stability can be explained through the disruption of the highly conserved hydrogen bonding network between a serine/threonine in \(\beta \)-strand two, an aspartate and tryptophan in \(\beta \)-strand three and a histidine in the connecting loop, see Fig. 4. The importance of this conserved hydrogen bonding network for the WD40 class of proteins was previously confirmed by mutating the amino acids involved31. However the Pizza protein does not posses this network, yet the middle “Velcro” is still the least stable. Possibly burying the charged chain termini deep within the protein is destabilising enough to explain the observed results. In both cases the “Velcro 3-1” position yields the protein with the highest stability, especially against chemical denaturants such as GdnHCl. A striking difference however is the stability of the “non-Velcro” variant, the least stable for Pizza but as stable as the natural “Velcro” position in WRAP-T. This may be due to differences between the sequences, as the two different “non-Velcro” variants of Pizza have different stabilities as well. However, these two examples alone may not be sufficient to determine a clear correlation, and additional proteins will be needed to confirm whether the inner “Velcro” is consistently the most stable conformation for propeller proteins. Such information may be helpful in engineering \(\beta \)-propeller enzymes for industrial applications such as the production of fructans by fructosyltransferases32.

Figure 5
figure 5

Geometric analysis of different symmetric designer proteins: the five-fold Tachylectin-2 (PDB:5c2m), the six-fold Pizza (PDB:3ww9), seven-fold WRAP-T (PDB:7big), eight-fold Tako (PDB:6g6n) and Cake (PDB:6tjg) and the nine-fold Cake (PDB:6tjh). An alignment of a single blade from each protein. The spheres indicate the location of the chosen reference point (a). A structure based sequence alignment of a single blade from each protein, the red boxes show the location of the sphere reference point in each sequence (b). A cartoon of two subsequent blades with spheres indicating the reference points. The chosen parameters are shown: The distances from the inner position to the central axis (a,b); the distance between reference points in subsequent repeats, for inner points (c,d) and outer points (e,f); the rotation angle (\(\alpha \)); the twist angles of the blade as the dihedral angle between the \(\beta _{1}\)-strand and the central axis (\(\beta \)) and the internal dihedral angle between the two strands (\(\gamma \)); and the tilt angle between the central axis and the \(\beta _{1}\)-strand (\(\delta \)) (c). Graphs showing the average value for each parameter in function of repeat number. WRAP-T and Tako are both WD40 repeats and are represented by a hollow triangle, the other proteins belong to different families and are shown as a solid circle (d). Panels (a,c) were created with PyMOL23.

Another goal of this research was the investigation of common trends in the backbone arrangement, especially the blade orientation within the propeller architecture. These findings could serve as guidelines for the de novo design of propeller backbones with a desired symmetry. We performed a structural analysis of the currently reported perfectly symmetric monomeric propeller proteins, but excluded any non-symmetric proteins in order to avoid bias to the backbone structure induced by the pseudo-symmetry of the sequence. The investigated proteins include: Tachylectin-2 (PDB:5c2m), Pizza (PDB:3ww9), WRAP-T (PDB:7big), Tako8 (PDB:6g6n), Cake8 (PDB:6tjg), and Cake9 (PDB:6tjh). Each has a different number of repeats, with the Cake sequence giving both eight- and nine-fold symmetric variants. A structural alignment of a single repeat of each protein allowed for the selection of four reference \(C_{\alpha }\) atoms in each repeat marked in the structure and sequence alignment in Fig. 5. Two are located on the inside \(\beta _{1}\)-strand, the other two are found in the \(\beta _{3}\)-strand. From these reference points we calculated six distances and four angles to characterize the specific protein.

The distances from the inner position to the central axis (a, b); the distance between reference points in subsequent repeats, for inner points (c, d) and outer points (e, f); the rotation angle (\(\alpha \)); the twist angles of the blade as the dihedral angle between the \(\beta _{1}\)-strand and the central axis (\(\beta \)) and the internal dihedral angle between the two strands (\(\gamma \)); and finally the tilt angle between the central axis and the \(\beta _{1}\)-strand (\(\delta \)). Average values and standard deviation of these values are shown in Table 3, and plots are shown in Fig. 5. Both WRAP-T and Tako belong to the WD40 family of proteins and are marked by a hollow triangle, while the other propellers belong to different families and are marked by a circle. As could be expected, the rotation angle \(\alpha \) is entirely dictated by the symmetry \(\alpha = \frac{360 ^{\circ }}{n}\), with n the number of blades. The channel radius is also highly dependent on symmetry, generally increasing with increasing blade number. Tachylectin-2 is the exception, having a larger central channel than Pizza. A linear regression fit of the channel radii against symmetry number indicates a high correlation (\(r^2 > 0.95\)) for the larger propellers, but only a weak correlation if Tachylectin-2 is included.

Table 3 Geometric parameters of symmetric designer proteins.

The inter-blade distances also follow a linear pattern; while the inside inter-blade distance increases with the number of blades, the outside distance decreases. Again Tachylectin is the one exception to this rule. It should be noted that these distances are not independent, and are related following the simple model of an isosceles triangle, \(c=2\times a\sin {\frac{360}{n}}\). The same is true for the relation between d and b. A similar formula can be applied to e and f with the difference that the distance between the first and third \(\beta \)-strand has to be taken in to account. As can be seen from the alignment of all blades in Fig. 5, this distance is nearly identical for all symmetries, thus e and f are also completely determined by a/b and the number of repeats. This results in a linear decrease with \(r^2 > 0.8\) From these distances it is clear that Tachylectin-2 is the misfit within these proteins, although it is unclear whether this is true for all five-bladed propellers, or specific to Tachylectin-2 and its homologs. The dihedral twist angles show some relation with symmetry, with \(\beta \) increasing while the internal \(\gamma \) angle decreases, however \(r^2\) is only above 0.5. In this case Tako seems to be the misfit, and excluding this protein yields a much higher correlation, \(r^2>0.95\). The tilt angle \(\delta \) seems uncorrelated to the blade number with an \(r^2<0.2\). It is interesting to note that for the two Cake proteins, the dihedral angles are nearly identical while the tilt angles are different. It seems that this sequence can adopt different symmetry by increasing the channel radius and decreasing the tilt with respect to the central axis, similar as how a wire helix can be expanded or contracted in a similar way by rotating coils at each end in opposite directions about the central axis.

While the inter-blade distances obey simple geometric rules depending on the channel radius and symmetry, no design guidelines on blade orientation could be deduced from these few proteins. Tachylectin-2 shows that it is possible to have a larger channel with a low symmetry by including a longer loop between the third and fourth \(\beta \)-strand, filling the gap that would otherwise be created on the outside of the propeller. The Tako protein deviates from the trend and has drastically different angles from Cake8, the other eight-fold protein, showing that the same symmetry can be achieved by very different blade orientations.

It is known that close homologs of ring-shaped protein oligomers may have different numbers of subunits in the complex. For example the Tryptophan RNA-binding attenuation protein (TRAP) usually consists of eleven subunits, but the variant from B. halodurans (PDB: 3zzl) has twelve. This is caused by a deletion of the last five residues of each subunit, shifting the angle of rotation slightly. Equally the insertion of a larger amino acid on the inside of the circle also causes a shift in rotation angle, resulting in a conformation with twelve subunits33. For the monomeric \(\beta \)-propellers, it seems more difficult to find such rules, as deviations in repeat structure are not pronounced and the same sequence can result in multiple symmetries.

Starting from a highly repetitive \(\beta \)-propeller sequence, we designed a consensus motif sequence that formed a perfectly symmetric 7-bladed \(\beta \)-propeller. This protein completes a set of artificial propeller proteins with five to nine-fold internal symmetry. A structural analysis of these proteins was performed to deduce their geometric patterns and rules. Although some trends are present in this small dataset, no definitive rules for propeller protein design can be deduced from it. A geometric relation between blade angle, radial distance and inter-blade distance was found, and a trend in angles of orientation could be observed, but exceptions were found to both tendencies. In the future it would be useful to compare proteins with the same symmetry but belonging to different propeller sub-families, in order to determine whether blade orientation is controlled more by the symmetry or features of the sequence.

Methods

Protein preparation

Linear DNA sequences encoding the variants of WRAP-T were commercially prepared, amplified by PCR and inserted into pET28 vector using the NdeI and XhoI restriction sites. Proteins were expressed in E. coli BL21(DE3) and purified using published protocols34. Protein expression was induced by adding IPTG to a final concentration of 0.5 mM, and subsequently incubating the cells with shaking at 20 \(^\circ \)C for 20 h. Cells were harvested by centrifugation and suspended in 50 mM \(\hbox {NaH}_2\)PO4, 200 mM NaCl and 10 mM imidazole. After lysis by sonication, cell debris was removed and the supernatant loaded onto a 10 ml volume nickel-sepharose column equilibrated with the same buffer. The column was washed with a similar buffer containing 20 mM imidazole. The protein was finally eluted with buffer with 300 mM imidazole and digested with thrombin overnight at 4 \(^\circ \)C during dialysis into 50 mM \(\hbox {NaH}_2\)PO4, 200 mM NaCl, 10 mM imidazole. The protease:WRAP-T ratio was 1:200. The protein was re-loaded onto the washed nickel-sepharose column and the same steps were repeated. The fractions containing tag-free WRAP-T were pooled before loading onto a Superdex-200 column equilibrated with 20 mM HEPES and 200 mM NaCl buffer at pH 8.0.

The purified proteins were concentrated to 10 mg/mL and shown to be at least 95% pure by SDS-PAGE. All purified proteins were analysed by size-exclusion chromatography (SEC). The SEC analysis was performed using a Superdex 200 increase 10/300 GL column equilibrated with 20 mM HEPES and 200 mM NaCl buffer at pH 8.0.

Tryptophan fluorescence

Denaturation of the protein samples was measured by observing intrinsic tryptophan fluorescence with a Sapphire2 96-well plate reader following the protocol described in19. 2 \(\upmu \)L of protein sample (\(\hbox {OD}_{{280}}\) of 10) were pipetted into 98 \(\upmu \)L of guanidinium hydrochloride (GdnHCl) solution to give a final protein concentration of 0.5 \(\upmu \)g/mL. GdnHCl concentration was tested from zero to 6 M, in steps of of 0.25 M. Tryptophan fluorescence was measured after seven days storage at 20 \(^\circ \)C, by observing the emission intensity at 330 nm after excitation at 280 nm. Samples were prepared and measured in triplicate35. Fluorescence measurements from blank samples (containing no protein) were subtracted from the measured values. Normalized values for the fraction of denatured protein were obtained by scaling, with the maximum value set to 1 and the smallest to zero. The data points were fitted to a Boltzmann-sigmoid equation (1) with an in-house Python script utilising the SciPy library36.

$$\begin{aligned} F([Denaturant]) = \frac{(\alpha _{N} + \beta _{N}[Denaturant]) + (\alpha _{D} + \beta _{D}[Denaturant]) e^{-\frac{m(C_{m} - [Denaturant])}{RT}}}{1+e^{-\frac{m(C_{m} - [Denaturant])}{RT}}} \end{aligned}$$
(1)

F is the measured fluorescence, \(\alpha _{N}\) is the F value for the native state with zero denaturant, \(\beta _{N}\) is the slope of this signal, \(\alpha _{D}\) and \(\beta _{D}\) are the corresponding values for the completely denatured state. These parameters are introduced because the denatured and native states change linearly with the denaturant concentration. \(C_{m}\) is the concentration at which 50% of the protein is denatured, m is the constant of proportionality (\(= \frac{-\partial G_{D-N}}{\partial [denaturant]}\)) and has the dimensions of cal/mol/M37.

The difference in free energy between the “Velcro” mutants and the natural position of v13WRAP-T was calculated with Eq. (2)38.

$$\begin{aligned} \begin{aligned} \Delta \Delta G = 0.5(m + m')\Delta C_{m} \end{aligned} \end{aligned}$$
(2)

\(m^\prime \) is the m-value of the mutant and \(\Delta C_{m}\) is the difference in the concentration point at which 50% of the protein is denaturated.

CD spectroscopy

All circular dichroism measurements were performed with a JASCO J-1500 instrument following the protocol described in19. Thermal denaturation measurements were performed in a 2 mm path quartz cuvette using 0.05 mg/mL protein in 20 mM phosphate pH 7.6. The samples were heated from 20 to 85 \(^\circ \)C in steps of 0.2 \(^\circ \)C while monitoring the CD signal at a wavelength of 218 nm.

The CD signal was fitted to a Boltzmann-sigmoid equation (3) using the same script employed for the chemical denaturation experiments.

$$\begin{aligned} F(T) = \frac{\alpha _{N} +\beta _N T + (\alpha _{D} + \beta _{D}T)e^{-\frac{-\Delta H_{m}(1-\frac{T}{T_m})}{RT}}}{1+e^{-\frac{-\Delta H_{m}(1-\frac{T}{T_m})}{RT}}} \end{aligned}$$
(3)

F is the measured signal (in this case the measured CD signal at 218 nm), \(\alpha _{N}\) is the signal of the native state \(\alpha _{D}\) and \(\beta _{D}\) are the corresponding values for the completely denatured state. \(T_m\) is the temperature at which 50% of the protein is denatured, and \(\Delta H_{m}\) is the change of enthalpy upon denaturation39. The change in free energy between the “Velcro” mutants and the natural position of v13WRAP-T was calculated with Eq. (4)38.

$$\begin{aligned} \begin{aligned} \Delta \Delta G = \Delta H_{m} \frac{\Delta T_{m}}{T_{m}} \end{aligned} \end{aligned}$$
(4)

In which \(\Delta H_{m}\) is the difference in enthalpy and \(\Delta T_{m}\) the difference in melting temperature between the mutant and v13WRAP-T.

Crystallisation and structure determination

The proteins were dialysed against 20 mM HEPES pH 8.0, and the concentration adjusted to 10.0 mg/mL. Crystallisation experiments were performed at 20 \(^\circ \)C using the sitting-drop vapor diffusion method with a Gryphon robot (Art Robbins). The crystals were washed in mother liquor containing glycerol, 20% for nvWRAP-T, 25% for v31WRAP-T, 15% for v22WRAP-T and 10% for v13WRAP-T as cryo-protectant before being flash-frozen and stored in liquid nitrogen. Data-sets were collected on the I03 or I04 beamlines at the Diamond Light Source (DLS) (Oxfordshire, UK). A total of 3600 images were taken with a 0.1\(^\circ \) oscillation with a wavelength of 0.9763 Å using an Eiger2 X 16M detector. The data were processed with XDS (for v31WRAP-T)40 or DIALS41, and scaling was carried out with Aimless42. A single 7-bladed unit (Residues X to y) from parent template (PDB:2ymu) was used for molecular replacement with PHASER43. Refinement was carried out with PHENIX.REFINE44, with manual modifications performed with COOT45. The CCP4 suite was used for general data handling46. An overview of data collection and refinement statistics is given in Table 1. The structures are available from the PDB with entry codes 7bie, 7bid, 7bif, 7big for nvWRAP-T, v31WRAP-T, v22WRAP-T and v13WRAP-T, respectively.