Design and characterization of a protein fold switching network

Ruan, Biao; He, Yanan; Chen, Yingwei; Choi, Eun Jung; Chen, Yihong; Motabar, Dana; Solomon, Tsega; Simmerman, Richard; Kauffman, Thomas; Gallagher, D. Travis; Orban, John; Bryan, Philip N.

doi:10.1038/s41467-023-36065-3

Article
Open access
Published: 26 January 2023

Biao Ruan¹^na1,
Yanan He²^na1,
Yingwei Chen¹^na1,
Eun Jung Choi¹,
Yihong Chen²,
Dana Motabar ORCID: orcid.org/0000-0003-3615-109X^1,3,
Tsega Solomon^2,4,
Richard Simmerman¹,
Thomas Kauffman^2,4,
D. Travis Gallagher^2,5,
John Orban ORCID: orcid.org/0000-0002-3895-1800^2,4 &
Philip N. Bryan ORCID: orcid.org/0000-0001-7813-7523^1,2

Nature Communications volume 14, Article number: 431 (2023)

Abstract

To better understand how amino acid sequence encodes protein structure, we engineered mutational pathways that connect three common folds (3α, β-grasp, and α/β-plait). The structures of proteins at high sequence-identity intersections in the pathways (nodes) were determined using NMR spectroscopy and analyzed for stability and function. To generate nodes, the amino acid sequence encoding a smaller fold is embedded in the structure of an ~50% larger fold and a new sequence compatible with two sets of native interactions is designed. This generates protein pairs with a 3α or β-grasp fold in the smaller form but an α/β-plait fold in the larger form. Further, embedding smaller antagonistic folds creates critical states in the larger folds such that single amino acid substitutions can switch both their fold and function. The results help explain the underlying ambiguity in the protein folding code and show that new protein structures can evolve via abrupt fold switching.

Introduction

There have been remarkable advances recently in the ability to predict the tertiary structure of a protein from its primary amino acid sequence^1,2 as well as to design amino acid sequences that encode stable, unique protein structures³. It is also well-established, however, that some proteins have a propensity for two completely different, but well-ordered, conformations^{4,5,6,7,8,9,10,11,12}. Better insight into the ambiguity of the protein folding code would lead to a better understanding of how proteins evolve, how mutation is related to disease, and how function can be annotated to sequences of unknown structure^{13,14,15,16,17,18,19,20,21,22,23,24,25,26,27}. If the protein folding code were truly understood, it would be possible both to predict and design proteins that undergo profound switches in conformation. There has been significant progress in understanding natural proteins that switch folds¹¹ and predicting natural fold-switching proteins from amino acid sequence data²⁵. Designing proteins at the interface between different folds has been possible^7,28,29,30 but still presents a formidable challenge. It has been particularly challenging to design monomeric proteins that switch fold without a change in quaternary structure, and a better understanding is needed about how a very limited subset of intra-protein interactions can tip the balance from one fold and function to another^29,31,32.

Our goal here was to engineer monomeric proteins that are in a critical state between two distinct folds. To do this we chose three well-studied protein folds and designed a series of sequences such that each sequence is compatible with two sets of native interactions. Two of these folds are from Streptococcal Protein G which contains two types of domains that bind to serum proteins in blood: the G_A domain binds to human serum albumin (HSA)^33,34 and the G_B domain binds to the constant (Fc) region of IgG^35,36. The third protein is S6, a component of the 30S ribosomal subunit of Thermus thermophilus^{37,38,39,40,41}. For simplicity, the S6 fold is referred to as an S-fold, the G_A fold as an A-fold, and the G_B fold as a B-fold. These proteins share no significant sequence homology and are representative of three of the ten most common folds: the S-fold is a thioredoxin-like α/β plait; the A-fold is a homeodomain-like 3α-helix bundle; and the B-fold is a ubiquitin-like β grasp⁴².

Figure 1 depicts a network of high-identity sequence intersections (nodes) that connect the three folds. The arrows in Fig. 1 show a network originating with the natural S6 sequence. Circles represent nodes in the network at which structural and/or functional switches occur. The SI and S’I nodes are branch points and lead down diverging sequence pathways, one leading to a node with the A-fold (S/A) and one to a node with the B-fold (S/B). Intersecting mutational pathways lead from S/A to the native G_A protein and S/B to the native G_B protein. At this intersection (A/B), an A-fold switches to a B-fold.

**Fig. 1: Overview of engineered nodes in the S6, G_A, and G_B networks.**

Proteins around the A/B node have been extensively characterized in our earlier work^29,31,32. Here we determine that both G_A and G_B can switch into a third fold (α/β-plait) and show that these three folds and four functions (HSA-binding, IgG-binding, protease inhibition, and RNA-binding) can be connected in a network that avoids unfolded and functionless states. We describe how these nodes were engineered, determine key structures using NMR spectroscopy, and analyze stability and binding function. The ability to design and characterize nodes connecting three common small folds suggests that fold switching may be an intrinsic feature of the protein folding code and is important in the evolution of protein structure and function.

Results

Designing a functional switch from ribosomal protein to protease inhibitor

The S6 ribosomal protein is structurally homologous to subtilisin protease inhibitors known as prodomains (Fig. 2a, b)^43,44. Prodomain-type inhibitors have two binding surfaces with the protease. One surface comprises the last nine C-terminal amino acids of the inhibitor which bind in the substrate binding cleft of the protease (Fig. 2b). A second, more dynamic surface is formed between two subtilisin helices and the large surface of the β-sheet in the α/β-plait topology of the inhibitor (Fig. 2b)^45,46,47. As a result, the S6 protein could be converted into a subtilisin inhibitor protein of the same overall fold (denoted SI) by replacing its nine C-terminal amino acids with residues optimized to bind in the substrate binding cleft of subtilisin. This replacement results in new contacts between the SI β-sheet and the subtilisin surface helices (Fig. 2b).

**Fig. 2: Summary of switches in structure and function.**

The SI-protein is 99 amino acids in length and has a 10 residue loop between β2 and β3. However, there are many natural variations in the length of loops in the conserved α/β-plait topology⁴⁸. Therefore, we also engineered a 91 amino acid version of the S-fold (denoted S’I), which resembles the topology of natural prodomain inhibitors (Supplementary Fig. 1). Specifically, the S’I inhibitor has a longer loop connecting β1 to α1 and a shorter turn connecting β2 to β3 (Fig. 2b).

The SI and S’I proteins were expressed and purified by binding to a protease column⁴⁹. Their CD spectra were compared to that of the native S6 protein (Supplementary Fig. 1). Inhibition constants (K_I) were measured using an engineered RAS-specific subtilisin protease and the peptide substrate QEEYSAM-AMC⁴⁹. SI and S’I inhibit the RAS-specific protease with K_I values of 200 and 60 nM, respectively (Supplementary Table 1). The details of the competitive inhibition assay are described in the “Methods” section. The results demonstrate that a ribosomal protein can be converted into a protease inhibitor with minor modification (and without a fold switch). In addition, the SI and S’I proteins facilitated the engineering of subsequent switches to new folds and functions by linking each of the S-, A-, and B-folds to an easily measured binding function: protease inhibition (S- or S’-fold); HSA binding (A-fold, Fig. 2e)⁵⁰; and IgG binding (B-fold, Fig. 2f)⁵¹.
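As a rough illustration of what a competitive inhibition constant implies, the sketch below evaluates the textbook competitive-inhibition rate law, v = Vmax[S] / (Km(1 + [I]/K_I) + [S]), as a ratio of inhibited to uninhibited rates. It is a sketch under stated assumptions and not the fitting procedure from the “Methods” section: the function name, the default substrate concentration, and the default Km are hypothetical, and free inhibitor is assumed to equal total inhibitor (no tight-binding correction).

```python
def fractional_activity(inhibitor_nM, ki_nM, substrate_uM=0.0, km_uM=1.0):
    """Ratio v_i / v_0 under classical competitive inhibition.

    Derived from v = Vmax*[S] / (Km*(1 + [I]/Ki) + [S]); Vmax and the
    leading [S] cancel in the ratio. substrate_uM and km_uM are placeholder
    values (assumptions), not measured parameters from the paper.
    """
    if ki_nM <= 0 or km_uM <= 0:
        raise ValueError("ki_nM and km_uM must be positive")
    if inhibitor_nM < 0 or substrate_uM < 0:
        raise ValueError("concentrations must be non-negative")
    apparent_km = km_uM * (1.0 + inhibitor_nM / ki_nM)
    return (km_uM + substrate_uM) / (apparent_km + substrate_uM)


# K_I values reported in the text, evaluated in the limit [S] << Km
# (substrate_uM left at 0) with 200 nM inhibitor.
for name, ki in (("SI", 200.0), ("S'I", 60.0)):
    print(f"{name}: {fractional_activity(200.0, ki):.2f} of uninhibited rate")
```

In this low-substrate limit, 200 nM inhibitor leaves half of the activity for SI (K_I = 200 nM) and about 23% for S’I (K_I = 60 nM).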

Designing fold switches

In previous work, we created sequences that populate both A- and B-folds by threading the A-sequence through the B-fold, finding a promising alignment, and then using phage-display selection to reconcile one sequence to both folds^29,52,53. Here the approach is conceptually similar, except that we use Rosetta⁵⁴ as a computational design tool to test compatible mutations rather than phage display. The design process is as follows:

i. Thread the A- or B-sequence through both SI- and S’I-fold types.
ii. Identify alignments that minimize the number of catastrophic interactions.
iii. Design mutations to resolve unfavorable interactions in clusters of 4–6 amino acids using Pymol⁵⁵ and energy minimize using Rosetta-Relax⁵⁴.
iv. Optimize protein stability in the S-fold by computationally mutating amino acids at non-overlapping positions. Repeat energy minimization and evaluation with Rosetta-Relax.
v. To reduce uncertainties involved in computational design, conserve original amino acids whenever possible.
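
Steps i and ii amount to enumerating the registers at which the shorter sequence can be placed within the longer fold and ranking them by how many severe clashes each produces. The sketch below shows only that enumeration, under loud assumptions: `count_clashes` is a hypothetical stand-in score (charged residues landing on positions flagged as buried), and the burial mask would have to be supplied by the user. The work described here relied on structural inspection in Pymol and energy evaluation with Rosetta-Relax, not on this toy score.

```python
from typing import Callable, List, Sequence, Tuple

CHARGED = set("DEKR")  # assumption: charged side chains are penalized when buried


def count_clashes(short_seq: str, buried: Sequence[bool], offset: int) -> int:
    """Toy score (hypothetical, not the paper's method): number of charged
    residues of short_seq that land on buried positions of the long fold."""
    return sum(
        1
        for i, aa in enumerate(short_seq)
        if aa in CHARGED and buried[offset + i]
    )


def rank_registers(
    short_seq: str,
    buried: Sequence[bool],
    score: Callable[[str, Sequence[bool], int], int] = count_clashes,
) -> List[Tuple[int, int]]:
    """Score every placement of short_seq within a fold of len(buried)
    positions; return (zero-based offset, score) pairs, lowest score first."""
    n_registers = len(buried) - len(short_seq) + 1
    if n_registers < 1:
        raise ValueError("short_seq is longer than the fold")
    scored = [(off, score(short_seq, buried, off)) for off in range(n_registers)]
    return sorted(scored, key=lambda pair: pair[1])


# A 56-residue sequence can be placed in a 99-residue fold in 99 - 56 + 1 ways.
print(len(rank_registers("A" * 56, [False] * 99)))
```

A zero-based offset of 10 in this scheme corresponds to embedding the short sequence at residues 11–66 of the longer protein.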

We do not assume that this method is optimal; it is simply a practicable scheme for engineering sequences compatible with two sets of native interactions, whose structure, stability, and function are then evaluated. Initial designs were refined based on structural analysis with NMR, thermodynamic analysis of unfolding, and functional analysis using binding assays, as described below. All designed proteins were expressed in E. coli and purified to homogeneity as described in the “Methods” section.

Designing a switch from α/β-plait protease inhibitor to 3α HSA-binding protein

Alignment of the 56 amino acid HSA-binding A-fold with the 99 amino acid SI-fold, followed by mutation to resolve catastrophic interactions, produced low-energy switch candidates denoted S_a1 and A₁. The exact sequence of A₁ is embedded in S_a1 at positions 11–66 such that the α1 helices are structurally aligned (Fig. 3a, Supplementary Fig. 2A). Their final computational models were generated by Rosetta using the Relax application. The Relax protocol searches the local conformational space around an experimentally determined structure and is used only to evaluate whether the designed mutations have favorable native interactions within that limited conformational space. The designed models of S_a1 and A₁ are very similar in energy to the respective relaxed native structures (Supplementary Fig. 3 and Source data files).

**Fig. 3: Structure and dynamics of A₁ and S_a1.**

Structural analysis of A₁ and S_a1

Overall, the 3α-helical bundle topology of A₁ is very similar to the G_A parent structure from which it was derived⁵⁶. The sequence-specific chemical shift assignments for A₁ (Fig. 3b) were utilized to calculate a 3D structure with CS-Rosetta (Fig. 3c, Table 1). Our previous studies indicated close correspondence of CS-Rosetta and de novo structures for A- and B-folds with high sequence identity⁵⁷. The N-terminal residues 1–4 and the C-terminal residues 53–56 are disordered in the structure, consistent with {¹H}-¹⁵N steady-state heteronuclear NOE data (Fig. 3e). Likewise, S_a1 has the same overall βαββαβ-topology as the parent S6 structure (Fig. 3d, Table 2). The backbone chemical shifts (Fig. 3b) were used in combination with main chain inter-proton NOEs (Supplementary Fig. 4) to determine a three-dimensional structure utilizing CS-Rosetta (PDB 7MN1). The conformational ensemble shows well-defined elements of secondary structure at residues 2–10 (β1), 16–32 (α1), 40–44 (β2), 59–67 (β3), 73–81 (α2), and 86–92 (β4). The principal difference from the native structure is that the β2-strand is seven amino acids shorter in S_a1 than in S6. Heteronuclear NOE data show overall consistency with the structure, indicating that the long loop between the β2- and β3-strands from residues 45–58 is more flexible than other internal regions of the polypeptide chain (Fig. 3e).

Table 1 Structure statistics for A₁, B₁, and B₄

Table 2 Structure statistics for S_a1, S_b1, S_b2, and S_b3

Comparison of A₁ and S_a1 structures

Although the 56 amino acid sequence of A₁ is 100% identical to residues 11–66 of S_a1, a significant fraction of the backbone undergoes changes between the two structures. Most notably, while the α1 helices in both A₁ and S_a1 are similar in length, the regions corresponding to the α2 and α3 helices of A₁ form the β2 and β3 strands of S_a1 (Fig. 4a). Core amino acids in the α1-helix of A₁ correspond with residues that also contribute to the core of S_a1. However, the α1-helix in S_a1 contacts an almost entirely different set of residues (Fig. 4b). For example, amino acids L51, Y53, and I55 in the C-terminal tail of A₁ do not have extensive contact with α1 but the corresponding residues in S_a1 (L61, Y63, and I65) form close core interactions with α1 as part of the β3-strand. Most of the other core residues contacting the α1-helix of S_a1 are outside the 56 amino acid region coding for the A₁ fold. These include F4, V6, I8, and L10 from the β1-strand; A67 from the β3-strand; V72, L75, and L79 from the α2-helix; and V85 from the loop between the α2-helix and the β4-strand. Two additional residues, V88 and V90 (β4) also contribute significantly to the core but do not contact α1. Thus, except for the original topological alignment of the α1-helices, the cores of the 3α and α/β-plait folds are largely non-overlapping. In total, approximately half of the residues participating in the S_a1 core are not present in the A₁ sequence.

**Fig. 4: Structural differences between the 100% sequence identical regions of A₁ and S_a1.**

Energetics of unfolding for A₁/S_a1

Far-UV CD spectra were measured for S_a1 and A₁ and their thermal unfolding profiles were determined by measuring ellipticity at 222 nm versus temperature (Fig. 5 and Supplementary Fig. 5). S_a1 has a T_M of ~100 °C and an estimated ∆G_folding of −5.3 kcal/mol at 25 °C (Fig. 5b, Supplementary Table 1)⁵⁸. The ∆G_folding of the parent S6 is −8.5 kcal/mol⁴⁰. The Rosetta energy of the S_a1 design is similar to that of the native sequence (Supplementary Fig. 3). A₁ has a T_M of 65 °C and a ∆G_folding = −4.0 kcal/mol at 25 °C⁵⁸ (Fig. 5a, Supplementary Table 1). The ∆G_folding of the parent G_A is −5.6 kcal/mol^59,60. The Rosetta energy of the A₁ design is slightly more favorable than for the native sequence (Supplementary Fig. 3).
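
To put the ∆G_folding values above in terms of populations, the following sketch converts a folding free energy into the fraction of molecules folded. It assumes a simple two-state (folded/unfolded) equilibrium, which is an assumption of this example; the function name and the sign convention (∆G_folding = G_folded − G_unfolded, negative when the fold is stable) are chosen here for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fraction_folded(dg_folding_kcal: float, temp_c: float = 25.0) -> float:
    """Folded population for a two-state equilibrium.

    dg_folding_kcal is G(folded) - G(unfolded) in kcal/mol, so a negative
    value means the folded state is favored.
    """
    rt = R_KCAL * (temp_c + 273.15)
    unfolded_over_folded = math.exp(dg_folding_kcal / rt)  # [U]/[F]
    return 1.0 / (1.0 + unfolded_over_folded)


# Values quoted in the text at 25 °C.
for name, dg in (("A1", -4.0), ("S_a1", -5.3), ("parent G_A", -5.6), ("parent S6", -8.5)):
    print(f"{name}: fraction folded = {fraction_folded(dg):.5f}")
```

Under this model, even the least stable protein in this paragraph (A₁, −4.0 kcal/mol) is more than 99.8% folded at 25 °C.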

HSA binding

Initial engineering of the fold switch was carried out without consideration of preserving function. As a result, A₁ does not have detectable HSA binding affinity because two amino acids in the binding interface were mutated. Significant HSA binding is recovered, however, when the surface mutations E28Y and K29Y are made in A₁ (denoted A₂). These mutations do not appear to affect the structure of A₁ (Supplementary Fig. 5) but result in HSA binding with K_D ≤ 1 µM (Supplementary Table 1). This was determined by measuring binding to immobilized HSA as described in the “Methods” section.

Protease inhibition

S_a1 does not bind protease because C-terminal amino acids were not preserved in its design. It can be converted into a protease inhibitor, however, by replacing its three C-terminal amino acids (AAD) with DKLYRAL (denoted S_a1I). A version of S_a1I was also made that contains the exact 56 amino acid A₂ sequence by making E38Y, K39Y mutations (denoted S_a2I). S_a1, S_a1I, and S_a2I are similar in structure by CD analysis (Supplementary Fig. 5). The inhibition constant of S_a2I with the engineered subtilisin was determined to be 50 nM as described in the “Methods” section (Supplementary Table 1). Thus, a stable A-fold with HSA-binding function can be embedded within a 99 amino acid S-fold with protease inhibitor function (Fig. 2c, e). It should be noted that all HSA contact amino acids are preserved in both the A₂ and S_a2I sequences, but the three-dimensional topology necessary to form the HSA contact surface occurs only in the A-fold⁵⁰. Nevertheless, S_a2I was observed to bind weakly to HSA (K_D ~ 100 µM, Supplementary Table 1). This weak affinity suggests that some S_a2I molecules may populate the 3α fold even though the α/β-plait fold strongly predominates.

Designing a switch from α/β-plait protease inhibitor to β-grasp IgG-binding protein

In designing an S- to B-fold switch, we used two topological alignments. The first was between SI- and B-folds, where the β1 strands of each fold were aligned (Supplementary Figs. 2B and 6A). The second alignment was between S’I- and B-folds, where the long loop between β2 and β3 in SI was shortened in S’I to be more consistent with natural protease inhibitors. In this scheme, the α1β3β4 topology of the B-fold was aligned with the α1β2β3 topology of the S’I-fold (Fig. 6a, Supplementary Fig. 2C).

**Fig. 6: Structure and dynamics of S_b3 and B₄.**

Design and characterization of B₁, S_b1, B₂, and S_b2

In the first approach, alignment of the β1-strands of the B-fold and the S-fold and subsequent mutation to resolve catastrophic interactions produced low-energy switch candidates denoted B₁ and S_b1. The exact sequence of B₁ is embedded in S_b1 at positions 4–59 (Supplementary Fig. 6A). The computational models of B₁ and S_b1 show relatively small increases in energy compared to the corresponding relaxed native structures (Supplementary Fig. 3). The NMR structure of B₁ displayed a ββαββ topology identical to that of the parent B-fold, with a backbone RMSD of ~0.6 Å (Supplementary Fig. 6B, C). The topology of S_b1 is not the same as the parent S6 structure, however, and instead has a fold similar to that of B₁ (Supplementary Figs. 6B, D, and 7, PDB 7MQ4). Introducing 13 mutations into S_b1 generated a protein denoted S_b2 (Supplementary Fig. 8). S_b2 contains four β-strands and two α-helices and has the general features of the parent S-fold (Supplementary Fig. 9, PDB 7MN2). The 56 amino acid version of S_b2 (denoted B₂) has a significantly higher Rosetta energy than B₁, however, and is presumably unfolded (Supplementary Fig. 3). Thus, neither the B₁/S_b1 nor B₂/S_b2 protein pairs resulted in high identity sequences with different folds. Nonetheless, B₁ is 80% identical to the corresponding embedded region in the S-folded protein S_b2 (Supplementary Fig. 9A). The structures of B₁, S_b1, and S_b2 are described further in the Supplement and Tables 1 and 2.

Design of S_b3 and B₃

To improve the design of the S-to-B switch we aligned the B-fold with the S’ inhibitor fold and chose an alignment that creates a topological match between α1β3β4 in B and α1β2β3 in S’ (Supplementary Fig. 2C). Mutation to resolve deleterious interactions in this alignment produced low-energy switch candidates denoted B₃ and S_b3 (Supplementary Fig. 10). The exact sequence of B₃ is embedded in S_b3 at positions 1–56. The energy of the computational model for S_b3 is slightly more favorable than the relaxed native structure. The designed model of B₃ shows relatively small increases in energy compared to the relaxed native structure (Supplementary Fig. 3).

Structural analysis of S_b3 and B₃

NMR-based structure determination indicated that S_b3 has a βαββαβ secondary structure and an S-fold topology (Fig. 6a, b, d, PDB 7MP7). Ordered regions correspond with residues 4–10 (β1), 24–37 (α1), 42–46 (β2), 51–56 (β3), 62–70 (α2), and 79–85 (β4). Comparison of S_b3 with the parent S-fold indicates that the β1/α2/β4 portion of the fold is similar in both. In contrast, the β1–α1 loop is longer in S_b3 (13 residues) than in the parent S-fold (5 residues), while α1, β2, the β2–β3 loop, and β3 are all shorter than in the parent (Fig. 6d). Consistent with the S_b3 structure, the 13 amino acid β1-α1 loop is highly flexible (Fig. 6e). We also expressed and purified a truncated protein corresponding to the embedded B-fold, the 56 amino acid version of S_b3 (denoted B₃). The 2D ¹H–¹⁵N HSQC spectrum of B₃ at 5 °C and low concentrations (<20 μM) was consistent with a predominant, monomeric B-fold (Supplementary Fig. 11) but showed significant exchange broadening at 25 °C, indicative of low stability (see below). Presumably, the low stability is due to the less favorable packing of Y5 in the core of the B-fold compared with a smaller aliphatic leucine. However, additional, putatively oligomeric, species were also present for which relative peak intensities increased with increasing protein concentration. Due to its relatively low stability and sample heterogeneity, B₃ was not analyzed further structurally.

Design and analysis of point mutations that switch the fold of S_b3

We used the NMR structure of S_b3 to design a point mutation, tyrosine 5 to leucine (Y5L), that would stabilize the embedded B-fold without compromising native contacts in the S-fold (Supplementary Fig. 10). This mutant was therefore expected to shift the population to the B-fold. Two mutants were prepared, a Y5L mutant of S_b3 (denoted S_b4) and a Y5L mutant of B₃ (denoted B₄). B₄ is indeed more stable than B₃ (Fig. 5a, Supplementary Table 1). Assignment and structure determination of B₄ showed its topology to be identical to the parent B-fold (Fig. 6b, c). At concentrations above 100 μM, B₄ displayed a tendency for weak self-association similar to that seen for B₃. For S_b4, the HSQC spectrum exhibited approximately twice the number of amide cross-peaks relative to S_b3 (Fig. 7a), suggesting that S- and B-states were populated simultaneously. This was confirmed by the NMR assignment and also a comparison of the HSQC spectra for S_b4, B₄, and S_b3. A significant fraction of the S_b4 backbone amide signals (~50 peaks) closely matched those of B₄, indicating the presence of a B-state (Supplementary Fig. 12A–C). The close matching of these peaks is presumably because residues 1–56 in the B-state of S_b4 are identical in sequence to B₄. The largest amide shift perturbations between the B-state of S_b4 and B₄ occur for residues proximal to the C-terminus of the B-fold, such as G41, where S_b4 has additional residues and B₄ does not. Many of the S_b4 signals also matched well with S_b3, although the degree of similarity was not as extensive as with B₄ (Supplementary Fig. 12D–F). More significant amide chemical shift differences between the S-state of S_b4 and S_b3 are likely due to the Y5L mutation, which is a relatively large change located adjacent to the core. To resolve these ambiguities, backbone resonance assignments were made for the S-state of S_b4 (Fig. 7a; https://doi.org/10.13018/BMR51719; see the “Methods” section for details). Comparison of S_b4 S-state assignments with S_b3 indicated that most of the larger amide shift perturbations were in the β1 and β4 strands. Secondary shift analysis showed that the pattern of secondary structure elements for the S-state of S_b4 is similar to that of S_b3 (Fig. 7b). Inter-proton NOE analysis indicated that the arrangement of the β-strands is also similar (Fig. 7c). Together, these results show that S_b4 populates both S- and B-folds approximately equally at 25 °C. Moreover, a ZZ-exchange spectrum demonstrated that the S- and B-states of S_b4 are in slow conformational exchange on the NMR timescale (Fig. 7d).

**Fig. 7: S_b4 is an equilibrium mixture of S- and B-states.**

Finally, we designed a mutation of leucine 67 to arginine (L67R) in S_b4 to destabilize the S-fold without changing the sequence of the embedded B-fold. The mutant is denoted as S_b5 (Supplementary Fig. 10). This was expected to shift the population to the B-fold. The 2D ¹H-¹⁵N HSQC spectrum of S_b5 indicates that the L67R mutation does indeed destabilize the S-fold, with the loss of S-type amide cross-peaks and the concurrent appearance of a new set of signals indicating a switch to a B-fold. The superposition of the spectrum of S_b5 with that of B₄ shows that the new signals in S_b5 largely correspond with the spectrum of B₄ (Supplementary Fig. 13). Thus, the L67R mutation shifts the equilibrium from the S-fold to the B-fold. The additional signals (~25–30) in the central region of the HSQC spectrum that are not detected in B₄ are presumably due to the disordered C-terminal tail of S_b5. The C-terminal tail of S_b5 does not appear to interact extensively with the B-fold, as evidenced by few changes in chemical shifts or peak intensities in the B-region of S_b5 compared with B₄.

Structural comparison of S_b3 and B₄

The aligned amino acids 1–56 of S_b3 and B₄ have 98% sequence identity, the only difference being an L5Y mutation in S_b3 (Fig. 6a). The global folds of S_b3 and B₄ have large-scale differences, however (Fig. 8a, Supplementary Fig. 4). The β1-strands, while similar in length, are in opposite directions in S_b3 and B₄. The β1-strand forms a parallel-stranded interaction with β4 in B₄, but an antiparallel interaction with the corresponding β3-strand in S_b3. Whereas residues 9–20 form the 6-residue β1–β2 turn and the 6-residue β2-strand of B₄, these same amino acids constitute the end of β1 and 10 residues of the largely disordered β1–α1 loop in S_b3. The remainder of the B-region is topologically similar, with the α1/β3/β4 structure in B₄ matching the α1/β2/β3 structure in S_b3. Overall, however, the order of H-bonding in the 4-stranded β-sheets is quite different, with β2β3β1β4 in S_b3 and β3β4β1β2 in B₄.
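
The 98% figure follows directly from one mismatch in 56 aligned positions. The sketch below shows that arithmetic together with the conversion from a 1-based, inclusive residue range to a Python slice. The sequences are poly-glycine placeholders invented for this example, not the real S_b3 or B₄ sequences, and the helper names are hypothetical.

```python
def embedded_region(long_seq: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of long_seq."""
    if not 1 <= first <= last <= len(long_seq):
        raise ValueError("residue range outside the sequence")
    return long_seq[first - 1:last]


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identical positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Placeholder sequences (NOT real protein sequences): a short form with Leu at
# position 5 and a longer form with Tyr at position 5 plus a C-terminal extension.
short_form = "GGGGL" + "G" * 51            # 56 residues
long_form = "GGGGY" + "G" * 51 + "A" * 30  # extension length is arbitrary here

aligned = embedded_region(long_form, 1, 56)
print(f"{percent_identity(short_form, aligned):.1f}% identity over {len(aligned)} residues")
```

One difference in 56 positions gives 55/56, or about 98.2%, which rounds to the 98% quoted above.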

**Fig. 8: Structural differences in the high (~98%) sequence identity regions of B₄ and S_b3.**

The main core residues of B₄ consist of Y3, L5, L7, and L9 from β1, A26, F30, and A34 from α1, and F52 and V54 from β4 (Fig. 8b). In S_b3, the topologically equivalent regions of the core are A26, F30, and A34 from α1, and F52 and V54 from β3. Residues Y5, L7, and L9 from the β1 strand of S_b3 also form part of the core, but with different packing from B₄ due to the reverse orientation of β1. Residues A12 and A20, which contribute to the periphery of the core in B₄, are solvent accessible in the β1-α1 loop of S_b3. Most of the remaining core residues of S_b3 come from outside of the B-region and include amino acids from β3 (A56), α2 (V64, L67, A68, L71), and β4 (V80 and I82).

Energetics of unfolding for B₃/S_b3, B₄/S_b4, and S_b5

Far-UV CD spectra were measured for B₃, B₄, S_b3, S_b4, and S_b5 and their thermal unfolding profiles were determined by measuring ellipticity at 222 nm versus temperature (Fig. 5, Supplementary Fig. 10, Supplementary Table 1). As described above, the predominant form of S_b3 is an S-fold. CD and NMR analyses show that B₃ is predominantly a B-fold with a ∆G_folding of −1.2 kcal/mol at 25 °C⁵⁸. From the NMR analysis, it appears that the B-fold is in equilibrium with putatively dimeric states. This creates a situation in which the B-fold is both temperature-dependent and concentration-dependent. The predominant form at 5 °C and ≤18 µM is the B-fold, however. The low stability and concentration-dependent behavior of B₃ may indicate that some propensity for S-type conformations could persist in the 56-residue protein.

S_b4 has a temperature unfolding profile very similar to S_b3 (Fig. 5) even though both S- and B-folds are approximately equally populated at 25 °C in S_b4 (Fig. 7). This shows that the Y5L mutation results in two folds that are almost isoenergetic and both thermodynamically stable relative to the unfolded state. Further, because S- and B-folds are in equilibrium and approximately equally populated, the free energy of switching to the B-fold from the S-fold (∆G_{B-fold/S-fold}) is ~0 kcal/mol at 25 °C. The switch equilibrium reflects the influence of the antagonistic B-fold on the S-fold population in S_b4, where the leucine at residue 5 helps stabilize the alternative B-state at the expense of the S-state. Thermal denaturation by CD shows that B₄ has a ∆G_folding = −4.1 kcal/mol at 25 °C⁵⁸. The thermal unfolding profile of S_b5 shows a low-temperature transition with a midpoint ~10 °C and a major transition with a midpoint of ~60 °C (Fig. 5b). The NMR analysis indicates that the major transition is unfolding of the B-fold. Thus, the arginine at 67 in S_b5 makes the B-fold more favorable by making the S-fold unfavorable, consistent with the change in population from mixed to B-fold observed by NMR.
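
The link between "approximately equally populated" and ∆G_{B-fold/S-fold} ≈ 0 is the standard relation ∆G = −RT ln([B]/[S]). The sketch below is a minimal illustration of that relation; the function name is hypothetical, and the 60:40 example population is an illustrative assumption, not a measured value.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def switch_free_energy(pop_b: float, pop_s: float, temp_c: float = 25.0) -> float:
    """Free energy (kcal/mol) of switching from the S-fold to the B-fold,
    computed from relative populations as -RT * ln([B]/[S])."""
    if pop_b <= 0 or pop_s <= 0:
        raise ValueError("populations must be positive")
    rt = R_KCAL * (temp_c + 273.15)
    return -rt * math.log(pop_b / pop_s)


print(f"50:50 mixture: {switch_free_energy(0.5, 0.5):+.2f} kcal/mol")
print(f"60:40 mixture: {switch_free_energy(0.6, 0.4):+.2f} kcal/mol")
```

Because the dependence is logarithmic, even a 60:40 mixture corresponds to only about 0.24 kcal/mol in magnitude, so roughly equal peak populations place ∆G_{B-fold/S-fold} within a few tenths of a kcal/mol of zero.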

Protease inhibition

The S_b3 protein is closely related to S’I but lacks inhibitor function because C-terminal amino acids were changed in the design of the switch. It can be converted into a protease inhibitor, however, by altering C-terminal amino acids VTE to DKLYRAL. This mutant is denoted S_b3I. S_b3 and S_b3I appear similar in structure by CD analysis (Supplementary Fig. 10). The K_I for S_b3I with the engineered subtilisin was determined to be 50 nM (Supplementary Table 1).

IgG binding

Binding to IgG was determined for B₃ and S_b3I (Supplementary Table 1). B₃ and S_b3I bound to IgG Sepharose with K_D ≤ 1 µM and 10 µM, respectively. Presumably, S_b3I has significant IgG-binding activity because the α1β3 IgG binding surface of the B-fold is largely preserved in the S-fold. Thus, S_b3I is a dual-function protein with both IgG-binding and protease inhibitor functions (Fig. 2f).

Discussion

The entire network of intersecting pathways between the S-, A-, and B-folds is summarized in Fig. 9. The first node on the pathway is a functional switch from RNA binding protein to protease inhibitor without a fold switch. The α/β plait is a common fold, and proteins with this basic topology carry out many different functions⁴². Engineering the SI and S’I nodes illustrates how protease inhibitor function can arise in the α/β plait topology with a few mutations. Replacing only C-terminal amino acids in the S6 protein creates an interaction with the substrate binding cleft of the protease (Fig. 2a, b). This C-terminal interaction plus adventitious contact between the β-sheet surface of the α/β plait and two α-helices in the protease result in protease inhibition in the 50 nM range. Based on the structure of S6 in the 30S complex, the C-terminal modification may not have major effects on binding interactions with ribosomal RNA and the S15 protein (Fig. 2a)⁴³. Thus, the transition from RNA binding protein to protease inhibitor is likely uninterrupted. An insertion in the β1–α1 loop and a deletion in the β2–β3 loop of the SI-inhibitor create a topology that more closely resembles natural prodomain-type inhibitors^44,46,61 and create an α1β2β3 motif in the S’-fold that is similar to the α1β3β4 motif of the B-fold. This topological similarity brings S’I closer to an intersection with the B-fold. Thus, SI and S’I nodes are both functional switches and branch points for switching the S-fold into the A- and B-folds, respectively.

**Fig. 9: Sequence-fold relationships of engineered S/A, S/B, and A/B nodes.**

Engineering nodes at fold intersections required designing sequences that are compatible with native interactions in two different folds. We used simple rules to do this. The first rule was to align topologies rather than maximizing sequence similarities. Identifying a common topology can help determine a register that has fewer irreconcilable clashes. For example, topological alignment of the α1 helix of the SI fold and the α1 helix of the A-fold facilitated engineering the fold switch, because the regions flanking α1 of the SI-fold can encode two different fold motifs. When topological alignment is poor, as was the case with S- and B-folds, it was helpful to look for natural variations in the turns of the longer fold to create better alignment. Variation in loops and turns in a larger fold creates more freedom of design and a higher probability of switches. Once an alignment is chosen, the basic rule in resolving catastrophic clashes is to conserve original amino acids when possible. This reduces the uncertainties involved in computational design. The Rosetta energy function was not used to predict a favorable alignment but was important in evaluating mutations to resolve clashes once an alignment was chosen.

Selecting mutations compatible with two sets of native interactions required tradeoffs in the native state energetics of each individual fold^5,11. A node may be produced in cases in which both alternative folds are stable relative to the unfolded state. Stability relative to the unfolded state (i.e. a state with little secondary structure) was determined by CD melting (Fig. 5). It was informative to examine the stability of both short (56 residues) and longer forms of a putative node sequence. The independent stability of the G-fold can be determined in the short form without the antagonism from the S-fold that is present in the longer sequence. The stabilities of the A₁ and A₂ proteins are about −4 kcal/mol at 25 °C⁵⁸ compared to −5.6 kcal/mol for the native G_A protein⁵⁶. The stabilities of B₃ and B₄ are −1.2 and −4.1 kcal/mol, respectively, at 25 °C⁵⁸ compared to −6.7 kcal/mol for the native G_B protein⁶². For the longer sequences, the ∆G_folding of S_a1 and S_b3 are −5.3 and −3.5 kcal/mol, respectively, at 25 °C⁵⁸ compared to −8.5 kcal/mol for the native S6 protein⁴⁰.

In the case of the S-folds, however, the energetic effects of the stable, embedded G-fold must also be considered. Since the equilibria between both folded states and the unfolded state are thermodynamically linked, the free energy of a switch to a G-fold from an S-fold (∆G_{G-fold/S-fold}) is approximated by the difference in ∆G_folding (∆∆G_folding) between the short and long forms of a node protein. For example, based on ∆G_folding for A₁ and S_a1, the predicted ∆G_{A-fold/S-fold} of S_a1 is 1.3 kcal/mol. This is consistent with the structure of the predominant S-fold determined by NMR but also with the small population of 3α fold suggested by weak HSA-binding. From the thermal denaturation profiles of B₃ and S_b3, the predicted ∆G_{B-fold/S-fold} of S_b3 is 2.3 kcal/mol, a value consistent with the stable S-fold observed in NMR experiments. The S_b3 sequence is also approaching a critical point, however. A substitution in S_b3 that stabilizes the B-fold (Y5L) shifts the equilibrium of S_b4 to an approximately equal mixture of B- and S-folds. That is, ∆G_{B-fold/S-fold} of S_b4 is ~0 kcal/mol at 25 °C. One further substitution that destabilizes the S-fold (L67R) shifts the population of S_b5 to a stable B-fold (∆G_{B-fold/S-fold} ≤ −5 kcal/mol) (Fig. 9).
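The linkage argument above can be sketched numerically. The following minimal Python example is not part of the original analysis: it takes the ∆G_folding values quoted in the text, computes ∆G_{G-fold/S-fold} as the difference between the short and long forms, and converts that value to an estimated fraction of folded molecules in the smaller fold. The function names and the two-folded-state Boltzmann conversion are illustrative assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def switch_free_energy(dg_short, dg_long):
    """Approximate dG(G-fold/S-fold) as dG_folding(short) - dG_folding(long)."""
    return dg_short - dg_long


def g_fold_fraction(dg_switch, temp_c=25.0):
    """Fraction of folded molecules in the G-fold.

    Assumption: only two folded states (G-fold and S-fold) are populated,
    so the ratio [G]/[S] equals exp(-dG_switch / RT).
    """
    rt = R_KCAL * (temp_c + 273.15)
    return 1.0 / (1.0 + math.exp(dg_switch / rt))


# dG_folding values (kcal/mol, 25 C) quoted in the text: (short form, long form).
nodes = {"S_a1": (-4.0, -5.3), "S_b3": (-1.2, -3.5)}

for name, (dg_short, dg_long) in nodes.items():
    ddg = switch_free_energy(dg_short, dg_long)
    print(f"{name}: dG_switch = {ddg:+.1f} kcal/mol, "
          f"G-fold fraction ~ {g_fold_fraction(ddg):.3f}")
```

Under these assumptions, a switch free energy of 1.3 kcal/mol corresponds to roughly one molecule in ten occupying the smaller fold, and a value near 0 kcal/mol corresponds to an equal mixture, which matches the qualitative picture described for S_a1 and S_b4.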

The existence of nodes between folds has implications for the evolution of new functions. In the case of the S/A node, all contact amino acids for HSA exist within the S-fold of the protease inhibitor S_a2I albeit in a cryptic topology. Deletion of amino acids 67–99 (A₂) results in loss of inhibitor function and a fold switch from α/β plait to 3α. Acquisition of HSA binding activity (K_D < 1 µM) results from unmasking the cryptic HSA binding amino acids via the fold switch (Fig. 2e). This level of binding affinity could be biologically relevant since the concentration of HSA in serum is >500 µM⁶³. In the case of the S/B node, the α1β3 motif contains all IgG contact amino acids and S_b3I has some affinity for both IgG (K_D = 10 µM) and protease (K_I = 50 nM). In this case, the Y5L mutation (S_b4) or a deletion of 57–91 (B₄) causes a fold switch from α/β plait to the β-grasp and results in tighter IgG binding (K_D ≤ 1 µM) (Fig. 2f). This level of binding affinity could also be biologically relevant since the concentration of IgG in serum is >50 µM (or >100 µM Fc binding sites)⁶⁴. We have previously shown that an A-fold with HSA binding function can be switched to a B-fold with IgG-binding function via single amino acid substitutions that switch the folds and unmask cryptic contact amino acids for the two ligands^29,32.
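To illustrate why these affinities could matter at serum concentrations, the sketch below estimates the fraction of a binding protein that would be ligand-bound. It assumes simple 1:1 binding with the ligand in large excess over the protein, which is an assumption made here rather than a statement from the text. The concentrations and K_D values are the bounds quoted above, and the case labels are only descriptive.

```python
def fraction_bound(ligand_um, kd_um):
    """Fraction of protein bound for 1:1 binding with ligand in large excess.

    Both arguments are in micromolar; free ligand is approximated by total ligand.
    """
    return ligand_um / (ligand_um + kd_um)


# (label, ligand concentration in uM, K_D in uM), using the bounds quoted in the text.
cases = [
    ("3-alpha fold with serum HSA", 500.0, 1.0),
    ("S_b3I with serum IgG Fc sites", 100.0, 10.0),
    ("beta-grasp fold with serum IgG Fc sites", 100.0, 1.0),
]

for label, ligand_um, kd_um in cases:
    print(f"{label}: fraction bound ~ {fraction_bound(ligand_um, kd_um):.3f}")
```

Because the text gives these values as bounds (K_D < 1 µM, HSA > 500 µM), the computed fractions for the tight binders should be read as lower limits.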

In conclusion, it was possible to connect three common folds in a network of high-identity nodes that form critical points between two folds. As in other complex systems, a small change in a protein near a critical point can have a “butterfly effect” on how the folds are populated. This property of the protein folding code means that proteins with multiple folds and functions can exist in highly identical amino acid sequences. This suggests that the evolution of new folds and functions sometimes can follow uninterrupted mutational pathways.

Methods

Mutagenesis, protein expression and purification

Mutagenesis was carried out using Q5® Site-Directed Mutagenesis Kits (NEB). G_A and G_B variants were cloned into a vector (pH0720) encoding the sequence:

MEAVDANSLA QAKEAAIKEL KQYGIGDKYI KLINNAKTVE GVESLKNEIL KALPTEGSGN TIRVIVSVDK AKFNPHEVLG IGGHIVYQFK LIPAVVVDVP ANAVGKLKKM PGVEKVEFDH QYRGL

as an N-terminal fusion domain⁵⁶. Cell growth was carried out by auto-induction^29,65. Cells were harvested by centrifugation at 3750 × g for 20 min and lysed by sonication on ice in 0.1 M KPi, pH 7.2. Cellular debris was pelleted by centrifugation at 10,000 × g for 15 min. Supernatant was clarified by centrifugation at 45,000 × g for 30 min. Proteins were purified using a second generation of the affinity-cleavage tag system employed previously to purify switch proteins^29,66. The second-generation tag results in high-level soluble expression of the switch proteins and also enables the capture of the fusion protein by binding tightly to an immobilized processing protease via the C-terminal EFDHQYRGL sequence. Loading and washing were at 5 mL/min for a 5 mL Im-Prot column using a running buffer of 20 mM KPi, pH 6.8. The amount of washing required for high purity depends on the stickiness of the target protein and how much of it is bound to the column. We typically wash with 10 column volumes (CV) of wash solution followed by 3 CV 0.5 M NaCl and then ~10 CV running buffer. This can be repeated as necessary. The 0.5 M NaCl shots are repeated until the amount of absorbance released with each high-salt shot becomes small and constant. All the high-salt solution is washed out before initiating the cleavage. The target protein was cleaved from the Im-Prot column by injecting 15 mL of imidazole solution (0.1 mM) at 1 mL/min, 22 °C. The cleaved protein typically elutes as a sharp peak in 2–3 CV. The purified protein was then concentrated to 0.2–0.3 mM, as required for NMR analysis. The columns were regenerated by injecting 15 mL of 0.1 N H₃PO₄ (0.227 mL concentrated phosphoric acid (85%) per 100 mL) at a flow rate of ~1 CV/min. The wash solution was neutralized immediately after stripping. The purification system is available from Potomac Affinity Proteins.
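As a quick check of the 0.1 N H₃PO₄ recipe above, the following sketch converts a volume of concentrated acid into normality. The density of 85% phosphoric acid (about 1.685 g/mL) and the molar mass (about 98.0 g/mol) are assumed reference values that are not given in the text, and normality is taken as three equivalents per mole.

```python
def h3po4_normality(ml_conc_acid, final_ml, w_frac=0.85,
                    density_g_ml=1.685, molar_mass=98.0, protons=3):
    """Normality of a dilution of concentrated phosphoric acid.

    Assumed (not from the text): density of the 85% (w/w) stock and the
    molar mass of H3PO4. Normality counts all three acidic protons.
    """
    grams_h3po4 = ml_conc_acid * density_g_ml * w_frac
    molarity = (grams_h3po4 / molar_mass) / (final_ml / 1000.0)
    return protons * molarity


# Recipe from the text: 0.227 mL of 85% phosphoric acid per 100 mL.
print(f"{h3po4_normality(0.227, 100.0):.3f} N")
```

With these assumed constants the recipe comes out within about 1% of the stated 0.1 N.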

Protease inhibitor proteins were purified by binding to Im-Prot media and then stripping off the purified inhibitor with 0.1 N H₃PO₄. Samples were then immediately neutralized by adding 1/10 volume 1 M K₂HPO₄.

Rosetta calculations

Rosetta energies of all designed structures were generated using the Slow Relax routine⁵⁴. One thousand decoys were calculated for each design. PDB coordinates and energy parameters for the lowest energy decoy for each design are included as supplemental files.
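Selecting the lowest energy decoy from a set of relax runs can be sketched as follows. This is a hedged illustration, not the authors' script: it assumes a Rosetta-style score file in which data lines begin with `SCORE:` and the header line names `total_score` and `description` columns. The function name and the three example lines are hypothetical placeholders, not data from the study.

```python
def lowest_energy_decoy(score_lines):
    """Return (total_score, description) of the lowest-scoring decoy, or None.

    Assumes a score-file layout where every relevant line starts with
    'SCORE:' and the first such line is the column header.
    """
    header = None
    best = None
    for line in score_lines:
        if not line.startswith("SCORE:"):
            continue
        fields = line.split()[1:]
        if header is None:
            header = fields  # first SCORE: line holds the column names
            continue
        if len(fields) != len(header):
            continue  # skip malformed lines
        row = dict(zip(header, fields))
        try:
            score = float(row["total_score"])
        except (KeyError, ValueError):
            continue  # skip repeated headers or missing columns
        if best is None or score < best[0]:
            best = (score, row["description"])
    return best


# Made-up placeholder lines in the assumed format (not real results).
example = [
    "SEQUENCE:",
    "SCORE: total_score rms description",
    "SCORE: -101.5 1.2 decoy_0001",
    "SCORE: -110.2 0.9 decoy_0002",
    "SCORE: -98.7 1.5 decoy_0003",
]
print(lowest_energy_decoy(example))
```

In practice the lines would be read from the score file written alongside the 1000 decoys for a given design.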

Circular dichroism (CD)

CD measurements were performed in 100 mM KPi, pH 7.2 with a Jasco spectropolarimeter, model J-1100 with a Peltier temperature controller. Quartz cells with path lengths of 0.1 and 1 cm were used for protein concentrations of 3 and 30 µM, respectively. The ellipticity results were expressed as mean residue ellipticity, [θ], in deg cm² dmol⁻¹. Ellipticities at 222 nm were continuously monitored at a temperature scanning rate of 0.5 °C/min. Reversibility of denaturation was confirmed by comparing the CD spectra at 20 °C before melting and after heating to 100 °C and cooling to 20 °C.
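The conversion from a raw CD signal to mean residue ellipticity can be sketched as below. This uses the common convention [θ] = θ(mdeg) / (10 × c(M) × l(cm) × N), with N taken as the number of residues (some authors use the number of peptide bonds, N − 1). The function name and the example reading of −20 mdeg are hypothetical and are not measurements from this work.

```python
def mean_residue_ellipticity(theta_mdeg, conc_um, path_cm, n_residues):
    """Mean residue ellipticity in deg cm^2 dmol^-1.

    theta_mdeg : observed ellipticity in millidegrees
    conc_um    : protein concentration in micromolar
    path_cm    : cell path length in centimeters
    n_residues : number of residues (convention assumed here)
    """
    conc_molar = conc_um * 1e-6
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical reading for a 56-residue protein at 30 uM in a 1 cm cell.
mre = mean_residue_ellipticity(-20.0, conc_um=30.0, path_cm=1.0, n_residues=56)
print(f"[theta] = {mre:.0f} deg cm^2 dmol^-1")
```

Normalizing by concentration, path length, and residue count is what allows melts recorded in different cells and at different concentrations to be compared on one scale.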

Measuring HSA and IgG binding affinity

Affinity of proteins to HSA and IgG was determined by their retention on the immobilized ligands. HSA and rabbit IgG were immobilized by reaction with NHS-activated Sepharose 4 Fast Flow (Cytiva) according to the manufacturer’s instructions. The concentration of immobilized HSA was 100 µM. The concentration of immobilized IgG was 50 µM (i.e. 100 µM Fc binding sites). Generally, 0.2 mL of a 5 µM solution of the test protein was injected into a 5 mL column at a flow rate of 0.5 mL/min. Determination of binding affinity assumes that binding is in rapid equilibrium such that the elution volume is proportional to the fraction of test protein bound to 100 µM of binding sites. Proteins that are completely retained after 20 column volumes (CV) are assessed to have K_D ≤ 1 µM. Completely retained proteins are stripped from the column with 0.1 N H₃PO₄ at the end of the run.

Measuring protease inhibition

Competitive inhibition constants (K_I) were determined using the fluorogenic peptide substrate QEEYSAM-AMC (7-amino-4-methylcoumarin) purchased from AnaSpec Inc. and a highly specific, engineered protease known as RASProtease(I)⁴⁹. K_I values were obtained by determining K_M(apparent) in the presence of 0, 50, and 100 nM of each inhibitor protein. The reactions were carried out in 100 mM KPi, 10 mM imidazole, 0.005% Tween-20, pH 7.0 at 25 °C with 1 nM RASProtease(I). The QEEYSAM-AMC concentrations used to determine K_M and K_M(apparent) were 0.1, 0.5, 1, 2, 5, and 10 µM. Initial rates were determined with a BioTek Synergy MT fluorescence microplate reader (Ex: 360/40, Em: 460/40) by measuring the release of the fluorescent AMC group via hydrolysis of the amide bond. Highly pure (≥98%) protease and inhibitor proteins were used for all kinetic experiments.
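For a competitive inhibitor, the standard relation K_M(apparent) = K_M × (1 + [I]/K_I) lets K_I be recovered from the shift in K_M. The sketch below applies that relation; it is an illustration of the textbook formula rather than the authors' fitting procedure, and the K_M values in the example are hypothetical placeholders chosen only to show the arithmetic.

```python
def ki_from_apparent_km(km, km_app, inhibitor_nm):
    """K_I (nM) from K_M and K_M(apparent) under competitive inhibition.

    Rearranged from K_M(app) = K_M * (1 + [I] / K_I).
    km and km_app must share the same units; inhibitor_nm is in nM.
    """
    if km_app <= km:
        raise ValueError("K_M(apparent) must exceed K_M for a competitive inhibitor")
    return inhibitor_nm / (km_app / km - 1.0)


# Hypothetical K_M values in uM (placeholders, not measured data).
km_no_inhibitor = 2.0
apparent = {50.0: 4.0, 100.0: 6.0}  # [I] in nM -> K_M(apparent) in uM

estimates = [ki_from_apparent_km(km_no_inhibitor, km_app, conc)
             for conc, km_app in apparent.items()]
ki_mean = sum(estimates) / len(estimates)
print(f"K_I estimates (nM): {estimates}, mean = {ki_mean:.1f}")
```

Using two inhibitor concentrations, as in the protocol above, gives an internal consistency check: both should return a similar K_I if inhibition is purely competitive.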

NMR spectroscopy

Isotope-labeled samples were prepared at 0.2–0.3 mM concentrations in 100 mM potassium phosphate buffer (pH 7.0) containing 5% D₂O. NMR spectra were collected using TopSpin 3.6.1 software on Bruker AVANCE III 600 and 900 MHz spectrometers fitted with Z-gradient ¹H/¹³C/¹⁵N triple resonance cryoprobes. Standard double and triple resonance experiments (HNCACB, CBCA(CO)NH, HNCO, HN(CA)CO, and HNHA) were utilized to determine main chain NMR assignments. Inter-proton distances were obtained from 3D ¹⁵N-edited NOESY and 3D ¹³C-edited NOESY spectra with a mixing time of 150 ms. NMRPipe⁶⁷ was used for data processing and analysis was done with Sparky⁶⁸. Two-dimensional {¹H}-¹⁵N steady-state heteronuclear NOE experiments were acquired with a 5 s relaxation delay between experiments. Errors in heteronuclear NOEs were estimated based on the background noise level. Chemical shift perturbations were calculated using Δδ_total = ((W_HΔδ_H)² + (W_NΔδ_N)²)^(1/2), where W_H is 1, W_N is 0.2, and Δδ_H and Δδ_N represent ¹H and ¹⁵N chemical shift changes, respectively. For PRE experiments on S_b1, single-site cysteine mutant samples were incubated with 10 equivalents of (1-oxyl-2,2,5,5-tetramethylpyrroline-3-methyl) methanethiosulfonate (MTSL; Santa Cruz Biotechnology) at 25 °C for 1 h and completion of labeling was confirmed by MALDI mass spectrometry. Control samples were reduced with 10 equivalents of sodium ascorbate. Backbone amide peak intensities of the oxidized and reduced states were analyzed using Sparky. Three-dimensional structures were calculated with CS-Rosetta 3.2 using experimental backbone ¹⁵N, ¹H_N, ¹Hα, ¹³Cα, ¹³Cβ, and ¹³CO chemical shift restraints. The structures were either validated by comparison with experimental backbone NOE patterns (A₁, B₁, B₄, S_b1) or calculated with interproton NOEs (S_a1, S_b2) or PREs (S_b1) as additional restraints. One thousand CS-Rosetta structures were calculated, from which the 10 lowest energy structures were chosen.

For S_b3, CS-Rosetta failed to converge to a unique low-energy topology, producing an approximately even mixture of S- and B-type folds despite the chemical shifts and NOE pattern indicating an S-fold. In this case, CNS 1.1⁶⁹ was employed to determine the structure⁵⁶, including backbone dihedral restraints from chemical shift data using TALOS-N⁷⁰. The backbone resonances for the S-state of S_b4 were assigned using triple resonance methods as above, under conditions where the S-state is more favorably populated (30 °C, 100 mM KPi, 200 mM sodium chloride, pH 7.0). Amide assignments were then transferred to the two-dimensional ¹H-¹⁵N HSQC spectrum of S_b4 at 25 °C in 100 mM KPi, pH 7.0. Inter-proton NOEs for the S-state of S_b4 were obtained at the 30 °C/high salt condition, employing a 3D ¹⁵N-edited NOESY spectrum with a 150 ms mixing time. A two-dimensional ZZ-exchange ¹H–¹⁵N HSQC spectrum was recorded on S_b4 using a mixing time of 300 ms (25 °C, 100 mM KPi, pH 7.0)^71,72. Protein structures were displayed and analyzed utilizing PROCHECK-NMR⁷³, MOLMOL⁷⁴ and PyMOL (Schrödinger)⁵⁵.
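The chemical shift perturbation expression given in the NMR methods above can be written as a short function. The weights W_H = 1 and W_N = 0.2 come from the text; the function name and the example shift changes are hypothetical and serve only to show the calculation.

```python
import math


def chemical_shift_perturbation(delta_h, delta_n, w_h=1.0, w_n=0.2):
    """Combined 1H/15N chemical shift perturbation in ppm.

    Implements sqrt((W_H * d_H)^2 + (W_N * d_N)^2) with the weights
    stated in the text as defaults.
    """
    return math.hypot(w_h * delta_h, w_n * delta_n)


# Hypothetical shift changes for one backbone amide (ppm).
csp = chemical_shift_perturbation(delta_h=0.03, delta_n=0.2)
print(f"CSP = {csp:.3f} ppm")
```

The smaller weight on ¹⁵N compensates for the wider ppm range of nitrogen shifts, so that neither nucleus dominates the combined value.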

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

References

Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871–876 (2021).
Huang, P. S., Boyken, S. E. & Baker, D. The coming of age of de novo protein design. Nature 537, 320–327 (2016).
Ambroggio, X. I. & Kuhlman, B. Design of protein conformational switches. Curr. Opin. Struct. Biol. 16, 525–530 (2006).
Bryan, P. N. & Orban, J. Proteins that switch folds. Curr. Opin. Struct. Biol. 20, 482–488 (2010).
Dishman, A. F. et al. Evolution of fold switching in a metamorphic protein. Science 371, 86–90 (2021).
Wei, K. Y. et al. Computational design of closely related proteins that adopt two well-defined but structurally divergent folds. Proc. Natl Acad. Sci. USA 117, 7208–7215 (2020).
Anderson, W. J., Van Dorn, L. O., Ingram, W. M. & Cordes, M. H. Evolutionary bridges to new protein folds: design of C-terminal Cro protein chameleon sequences. Protein Eng. Des. Sel. 24, 765–771 (2011).
Burmann, B. M. et al. An α helix to β barrel domain switch transforms the transcription factor RfaH into a translation factor. Cell 150, 291–303 (2012).
Kulkarni, P. et al. Structural metamorphism and polymorphism in proteins on the brink of thermodynamic stability. Protein Sci. 27, 1557–1567 (2018).
Dishman, A. F. & Volkman, B. F. Design and discovery of metamorphic proteins. Curr. Opin. Struct. Biol. 74, 102380 (2022).
Alberstein, R. G., Guo, A. B. & Kortemme, T. Design principles of protein switches. Curr. Opin. Struct. Biol. 72, 71–78 (2022).
Rackovsky, S. Nonlinearities in protein space limit the utility of informatics in protein biophysics. Proteins 83, 1923–1928 (2015).
Chen, S. H., Meller, J. & Elber, R. Comprehensive analysis of sequences of a protein switch. Protein Sci. 25, 135–146 (2016).
Li, W., Kinch, L. N., Karplus, P. A. & Grishin, N. V. ChSeq: A database of chameleon sequences. Protein Sci. 24, 1075–1086 (2015).
Wolynes, P. G. Evolution, energy landscapes and the paradoxes of protein folding. Biochimie 119, 218–230 (2015).
Holzgräfe, C. & Wallin, S. Smooth functional transition along a mutational pathway with an abrupt protein fold switch. Biophys. J. 107, 1217–1225 (2014).
Scheraga, H. A. & Rackovsky, S. Homolog detection using global sequence properties suggests an alternate view of structural encoding in protein sequences. Proc. Natl Acad. Sci. USA 111, 5225–5229 (2014).
Ha, J. H. & Loh, S. N. Protein conformational switches: from nature to design. Chemistry 18, 7984–7999 (2012).
Yadid, I., Kirshenbaum, N., Sharon, M., Dym, O. & Tawfik, D. S. Metamorphic proteins mediate evolutionary transitions of structure. Proc. Natl Acad. Sci. USA 107, 7287–7292 (2010).
Lichtarge, O. & Wilkins, A. Evolution: a guide to perturb protein function and networks. Curr. Opin. Struct. Biol. 20, 351–359 (2010).
Rollins, N. J. et al. Inferring protein 3D structure from deep mutation scans. Nat. Genet. 51, 1170–1176 (2019).
Sikosek, T., Chan, H. S. & Bornberg-Bauer, E. Escape from Adaptive Conflict follows from weak functional trade-offs and mutational robustness. Proc. Natl Acad. Sci. USA 109, 14888–14893 (2012).
Chen, N., Das, M., LiWang, A. & Wang, L. P. Sequence-based prediction of metamorphic behavior in proteins. Biophys. J. 119, 1380–1390 (2020).
Porter, L. L. & Looger, L. L. Extant fold-switching proteins are widespread. Proc. Natl Acad. Sci. USA 115, 5968–5973 (2018).
Bedford, J. T., Poutsma, J., Diawara, N. & Greene, L. H. The nature of persistent interactions in two model β-grasp proteins reveals the advantage of symmetry in stability. J. Comput. Chem. 42, 600–607 (2021).
Sykes, J., Holland, B. R. & Charleston, M. A. A review of visualisations of protein fold networks and their relationship with sequence and function. Biol. Rev. Camb. Philos. Soc. https://doi.org/10.1111/brv.12905 (2022).
Ambroggio, X. I. & Kuhlman, B. Computational design of a single amino acid sequence that can switch between two distinct protein folds. J. Am. Chem. Soc. 128, 1154–1161 (2006).
Alexander, P. A., He, Y., Chen, Y., Orban, J. & Bryan, P. N. A minimal sequence code for switching protein structure and function. Proc. Natl Acad. Sci. USA 106, 21149–21154 (2009).
Davey, J. A., Damry, A. M., Goto, N. K. & Chica, R. A. Rational design of proteins that exchange on functional timescales. Nat. Chem. Biol. 13, 1280–1285 (2017).
He, Y., Chen, Y., Alexander, P., Bryan, P. N. & Orban, J. NMR structures of two designed proteins with high sequence identity but different fold and function. Proc. Natl Acad. Sci. USA 105, 14412–14417 (2008).
He, Y., Chen, Y., Alexander, P. A., Bryan, P. N. & Orban, J. Mutational tipping points for switching protein folds and functions. Structure 20, 283–291 (2012).
Falkenberg, C., Bjorck, L. & Akerstrom, B. Localization of the binding site for streptococcal protein G on human serum albumin. Identification of a 5.5-kilodalton protein G binding albumin fragment. Biochemistry 31, 1451–1457 (1992).
Frick, I. M. et al. Convergent evolution among immunoglobulin G-binding bacterial proteins. Proc. Natl Acad. Sci. USA 89, 8532–8536 (1992).
Myhre, E. B. & Kronvall, G. Heterogeneity of nonimmune immunoglobulin Fc reactivity among gram-positive cocci: description of three major types of receptors for human immunoglobulin G. Infect. Immun. 17, 475–482 (1977).
Reis, K. J., Ayoub, E. M. & Boyle, M. D. P. Streptococcal Fc receptors. II. Comparison of the reactivity of a receptor from a group C streptococcus with staphylococcal protein A. J. Immunol. 132, 3098–3102 (1984).
Lindberg, M. O., Haglund, E., Hubner, I. A., Shakhnovich, E. I. & Oliveberg, M. Identification of the minimal protein-folding nucleus through loop-entropy perturbations. Proc. Natl Acad. Sci. USA 103, 4083–4088 (2006).
Haglund, E., Lindberg, M. O. & Oliveberg, M. Changes of protein folding pathways by circular permutation. Overlapping nuclei promote global cooperativity. J. Biol. Chem. 283, 27904–27915 (2008).
Haglund, E. et al. The HD-exchange motions of ribosomal protein S6 are insensitive to reversal of the protein-folding pathway. Proc. Natl Acad. Sci. USA 106, 21619–21624 (2009).
Haglund, E. et al. Trimming down a protein structure to its bare foldons: spatial organization of the cooperative unit. J. Biol. Chem. 287, 2731–2738 (2012).
Lindahl, M. et al. Crystal structure of the ribosomal protein S6 from Thermus thermophilus. EMBO J. 13, 1249–1254 (1994).
Day, R., Beck, D. A., Armen, R. S. & Daggett, V. A consensus view of fold space: combining SCOP, CATH, and the Dali Domain Dictionary. Protein Sci. 12, 2150–2160 (2003).
Schluenzen, F. et al. Structure of functionally activated small ribosomal subunit at 3.3 angstroms resolution. Cell 102, 615–623 (2000).
Gallagher, T. D., Gilliland, G., Wang, L. & Bryan, P. The prosegment-subtilisin BPN’ complex: crystal structure of a specific foldase. Structure 3, 907–914 (1995).
Tangrea, M. A. et al. Stability and global fold of the mouse prohormone convertase 1 pro-domain. Biochemistry 40, 5488–5495 (2001).
Tangrea, M. A., Bryan, P. N., Sari, N. & Orban, J. Solution structure of the pro-hormone convertase 1 pro-domain from Mus musculus. J. Mol. Biol. 320, 801–812 (2002).
Sari, N. et al. Hydrogen-deuterium exchange in free and prodomain-complexed subtilisin. Biochemistry 46, 652–658 (2007).
Orengo, C. A. & Thornton, J. M. Alpha plus beta folds revisited: some favoured motifs. Structure 1, 105–120 (1993).
Chen, Y. et al. Engineering subtilisin proteases that specifically degrade active RAS. Commun. Biol. 4, 299 (2021).
Lejon, S., Frick, I. M., Bjorck, L., Wikstrom, M. & Svensson, S. Crystal structure and biological implications of a bacterial albumin binding module in complex with human serum albumin. J. Biol. Chem. 279, 42924–42928 (2004).
Sauer-Eriksson, A. E., Kleywegt, G. J., Uhlen, M. & Jones, T. A. Crystal structure of the C2 fragment of streptococcal protein G in complex with the Fc domain of human IgG. Structure 3, 265–278 (1995).
Alexander, P. A., Rozak, D. A., Orban, J. & Bryan, P. N. Directed evolution of highly homologous proteins with different folds by phage display: implications for the protein folding code. Biochemistry 44, 14045–14054 (2005).
Alexander, P. A., He, Y., Chen, Y., Orban, J. & Bryan, P. N. The design and characterization of two proteins with 88% sequence identity but different structure and function. Proc. Natl Acad. Sci. USA 104, 11963–11968 (2007).
Leaver-Fay, A. et al. ROSETTA3: an object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 487, 545–574 (2011).
Delano, W. L. The PyMOL Molecular Graphics System (DeLano Scientific, San Carlos, CA, 2002).
He, Y. et al. Structure, dynamics, and stability variation in bacterial albumin binding modules: implications for species specificity. Biochemistry 45, 10102–10109 (2006).
Shen, Y. et al. De novo structure generation using chemical shifts for proteins with high-sequence identity but different folds. Protein Sci. 19, 349–356 (2010).
Chen, Y. et al. Rules for designing protein fold switches and their implications for the folding code. Preprint at bioRxiv https://doi.org/10.1101/2021.05.18.444643 (2021).
Rozak, D. A., Orban, J. & Bryan, P. N. G148-GA3: a streptococcal virulence module with atypical thermodynamics of folding optimally binds human serum albumin at physiological temperatures. Biochim. Biophys. Acta 1753, 226–233 (2005).
He, Y., Chen, Y., Rozak, D. A., Bryan, P. N. & Orban, J. An artificially evolved albumin binding module facilitates chemical shift epitope mapping of GA domain interactions with phylogenetically diverse albumins. Protein Sci. 16, 1490–1494 (2007).
He, Y. et al. Solution NMR structure of a sheddase inhibitor prodomain from the malarial parasite Plasmodium falciparum. Proteins 80, 2810–2817 (2012).
Alexander, P., Fahnestock, S., Lee, T., Orban, J. & Bryan, P. Thermodynamic analysis of the folding of the Streptococcal protein G IgG-binding domains B1 and B2: why small proteins tend to have high denaturation temperatures. Biochemistry 31, 3597–3603 (1992).
Chien, S.-C., Chen, C.-Y., Lin, C.-F. & Yeh, H.-I. Critical appraisal of the role of serum albumin in cardiovascular disease. Biomark. Res. 5, 31 (2017).
Gonzalez-Quintela, A. et al. Serum levels of immunoglobulins (IgG, IgA, IgM) in a general adult population and their relationship with alcohol consumption, smoking and common metabolic abnormalities. Clin. Exp. Immunol. 151, 42–50 (2008).
Studier, F. W. Protein production by auto-induction in high density shaking cultures. Protein Expr. Purif. 41, 207–234 (2005).
Ruan, B., Fisher, K. E., Alexander, P. A., Doroshko, V. & Bryan, P. N. Engineering subtilisin into a fluoride-triggered processing protease useful for one-step protein purification. Biochemistry 43, 14539–14546 (2004).
Delaglio, F. et al. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293 (1995).
Goddard, D. & Kneller, D. G. SPARKY 3 Vol. 3 (University of California, San Francisco, 2004).
Brunger, A. T. et al. Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr. D (Biol. Crystallogr.) 54, 905–921 (1998).
Shen, Y. & Bax, A. Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks. J. Biomol. NMR 56, 227–241 (2013).
Farrow, N. A., Zhang, O., Forman-Kay, J. D. & Kay, L. E. A heteronuclear correlation experiment for simultaneous determination of ¹⁵N longitudinal decay and chemical exchange rates of systems in slow equilibrium. J. Biomol. NMR 4, 727–734 (1994).
Montelione, G. T. & Wagner, G. 2D Chemical exchange NMR spectroscopy by proton-detected heteronuclear correlation. J. Am. Chem. Soc. 111, 3096–3098 (1989).
Laskowski, R. A., Rullmann, J. A., MacArthur, M. W., Kaptein, R. & Thornton, J. M. AQUA and PROCHECK-NMR: Programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR 8, 477–486 (1996).
Koradi, R., Billeter, M. & Wuthrich, K. MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. Model. 14, 51–55 (1996).

Acknowledgements

This work was supported by National Institutes of Health Grant GM62154 (to P.B. and J.O.) and 5R44GM126676 (to P.B.). The NMR facility is supported by the University of Maryland, the National Institute of Standards and Technology, and a grant from the W. M. Keck Foundation. We also thank Drs. Nese Sari and Louisa Wu for critically reading the manuscript and for many thoughtful comments. Mention of commercial products does not imply recommendation or endorsement by NIST.

Author information

These authors contributed equally: Biao Ruan, Yanan He, Yingwei Chen.

Authors and Affiliations

Potomac Affinity Proteins, 11305 Dunleith Pl, North Potomac, MD, 20878, USA
Biao Ruan, Yingwei Chen, Eun Jung Choi, Dana Motabar, Richard Simmerman & Philip N. Bryan
Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD, 20850, USA
Yanan He, Yihong Chen, Tsega Solomon, Thomas Kauffman, D. Travis Gallagher, John Orban & Philip N. Bryan
Department of Bioengineering, University of Maryland, College Park, MD, 20742, USA
Dana Motabar
Department of Chemistry and Biochemistry, University of Maryland, College Park, MD, 20742, USA
Tsega Solomon, Thomas Kauffman & John Orban
National Institute of Standards and Technology and the University of Maryland, 9600 Gudelsky Drive, Rockville, MD, 20850, USA
D. Travis Gallagher

Contributions

Protein design: Yw.C., B.R., E.C., J.O., P.B.; Performed thermodynamic and binding analyses: B.R., Yw.C., D.M., R.S., P.B.; Performed dynamic light scattering experiments: T.G.; Performed NMR experiments/structural analysis: Y.H., Yh.C., T.S., T.K., J.O.; Wrote the paper: J.O. (NMR and structural analysis), Yw.C., B.R., P.B. (remaining sections).

Corresponding authors

Correspondence to John Orban or Philip N. Bryan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ruan, B., He, Y., Chen, Y. et al. Design and characterization of a protein fold switching network. Nat Commun 14, 431 (2023). https://doi.org/10.1038/s41467-023-36065-3

Received: 14 July 2022
Accepted: 13 January 2023
Published: 26 January 2023
DOI: https://doi.org/10.1038/s41467-023-36065-3
