Introduction

Sequences encoding natural proteins constitute a tiny fraction of the enormous variety of sequences that can be derived from the combination of 20 amino acids. This variety is a major barrier to the elucidation of how a protein structure is encoded in its sequence1. However, artificial sequences encoding functional and stably folded proteins have been designed from a reduced set of amino acids2,3 or by specifying a reduced number of sites along the amino acid sequence4,5,6. This suggests that the information redundancy in a natural protein sequence can be experimentally minimized without compromising its native structure. For example, a chorismate mutase variant encoded with nine types of amino acids2 and a bovine pancreatic trypsin inhibitor (BPTI) variant in which 47% of the residues are alanines retain their native functional structures4,5. Simplified protein sequences that retain their native-like properties are thus expected to allow us to explore the determinants of the thermodynamic stability of globular proteins.

Protein stability remains difficult to rationalize, as most examples of protein stabilization/destabilization result from multiple mutations whose effects are intertwined. A well-documented factor determining protein stability is entropy-driven stabilization7, achieved by restricting the conformational freedom of the polypeptide chain in the denatured state by inserting disulfide bonds into it8,9. However, entropy stabilization can also originate from increased conformational freedom in the native state or even be overturned by enthalpy loss10. Enthalpy-driven stabilization may be easier to analyze and control when a highly accurate protein structure, determined at atomic level, is available11,12. Thus, in practice, the stabilization of protein structure is often achieved through a mixture of rational design and semi-random mutational analysis as the effects of even a modest backbone displacement on protein stability are difficult to quantify.

Here, we report a structural and thermodynamic analysis of six extensively simplified BPTI variants, in which 19–24 of the 58 residues are alanines, using differential scanning calorimetry (DSC) and X-ray crystallography. We expected that the replacement of long or bulky side chains with the small methyl side chain of alanine would create destabilizing cavities. However, the alanine substitutions unexpectedly produced a small enthalpy stabilization, the magnitude of which increased in a nearly additive fashion with the number of alanine substitutions but was over-compensated by entropy destabilization. A structural analysis indicated that new water molecules were recruited into the spaces created by the alanine substitutions, facilitating protein–water hydrogen-bonded interactions as well as interactions with the methyl group and creating hydration networks around the substitution sites. This observation suggested that enthalpy stabilization originates from hydration rather than from chain enthalpy. This is the very first molecular perspective on hydration enthalpy/entropy, and it suggests a generic strategy for a water-mediated enthalpy stabilization of a protein.

Results and Discussion

Design and thermodynamic stability of BPTI variants

We used a stabilized BPTI-[5,55] (BPTI-[5,55] A14GA38V12), containing only the 5–55 disulfide bond, as the reference molecule instead of wild-type BPTI, which contains three disulfide bonds13, and a BPTI variant containing 19 alanines (BPTI-19A) as the template for further simplification14. We mutated the exposed and partially exposed residues (P8, K15, R17, R39, K41, R42, and D50) to alanines because we expected that these substitutions would minimally affect the densely packed interior of the native protein4,5. The mutants were named according to the number of alanines in their sequences (e.g., BPTI-19A stands for a BPTI variant containing 19 alanines), and a suffix ‘a’ or ‘b’ was added to distinguish variants containing the same number of alanines at different sites4,5 (Fig. 1a).

Figure 1: Thermodynamic properties of the simplified BPTI variants.

(a) Sequences of the simplified BPTIs. Alanines are shown in black, and the other residues are in gray. (b) CD measurements at 20 μM concentration in 20 mM acetate buffer (pH4.7) and at 4 °C. The dotted line represents the reference BPTI-[5,55] A14GA38V12, and the continuous lines represent the BPTI-19A, -22A and -25A variants. Similar patterns were also observed at 40 °C and 70 °C (Suppl. Figure S1). (c) Thermal stability measurements using differential scanning calorimetry at pH4.7. The symbols represent the experimental data and the continuous lines represent curves fitted with a two-state model. (d) Temperature dependence of the enthalpy change (∆H) upon thermal unfolding of BPTI-[5,55] A14GA38V and the extensively simplified BPTIs. The pH values are 4.1, 4.7 and 5.5, from the lowest to the highest Tm values. 5,55* stands for BPTI-[5,55]A14GA38V.

Circular dichroism measurements indicated that all the simplified BPTI variants had the same secondary structure contents as the reference BPTI-[5,55] A14GA38V (Fig. 1b). Thermal denaturation curves, measured with DSC, were all cooperative and two-state, typical of a small natively folded protein with a densely packed interior (Fig. 1c,d; Suppl. Figure S1). Substitution with alanine slightly reduced the melting temperature (Tm) of the variants, as anticipated. Interestingly, the ∆Cp values also increased slightly with the number of alanines in the sequences (Table 1). However, unexpectedly, the enthalpy changes at both Tm (∆HTm) and 37 °C (∆H37 °C) increased as the number of alanine substitutions increased (Tables 1 and 2). This increase in enthalpy was counter-intuitive, because we imagined that the substitution of bulky residues with the small alanine side chain would create voids, thus reducing the van der Waals interactions between side-chain atoms and consequently the enthalpy of denaturation. Finally, the thermodynamic parameters estimated at 37 °C clearly indicated that the enthalpy stabilization was over-compensated by entropy destabilization (Suppl. Figure S2b), slightly reducing the melting temperatures (Suppl. Figure S2a; Table 2; and Suppl. Table S1).
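
The balance between enthalpy and entropy at 37 °C described above can be illustrated with the standard extrapolation from Tm. The following is a minimal sketch, not the authors' analysis: it assumes a temperature-independent ∆Cp (a simplification), and the function name and all input values are hypothetical placeholders rather than values from Table 1.

```python
import math

def unfolding_thermodynamics(t_celsius, tm_celsius, dh_tm, dcp):
    """Return (dH, T*dS, dG) of unfolding in kJ/mol at t_celsius.

    dh_tm: unfolding enthalpy at Tm (kJ/mol)
    dcp:   heat capacity change of unfolding (kJ/(K mol)),
           ASSUMED constant with temperature in this sketch
    """
    t = t_celsius + 273.15
    tm = tm_celsius + 273.15
    dh = dh_tm + dcp * (t - tm)                # Kirchhoff's law
    ds = dh_tm / tm + dcp * math.log(t / tm)   # dG(Tm) = 0, so dS(Tm) = dH(Tm)/Tm
    tds = t * ds
    return dh, tds, dh - tds

# Hypothetical inputs chosen only for illustration (not data from this study)
dh37, tds37, dg37 = unfolding_thermodynamics(37.0, 50.0, 200.0, 3.0)
print(f"dH = {dh37:.1f}, TdS = {tds37:.1f}, dG = {dg37:.1f} kJ/mol")
```

In this picture, a variant is enthalpy-stabilized when its ∆H at 37 °C exceeds that of the reference, and the net stability still falls if T∆S increases by a larger amount, since ∆G = ∆H − T∆S.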

Table 1 Thermodynamic parameters for reference BPTI and extensively simplified BPTI sequences.
Table 2 Effects of individual alanine substitution on specific thermodynamic parameters.

Crystallization and structures of the simplified BPTIs

To provide a structural view of this peculiar instance of enthalpy stabilization, we crystallized and solved the structures of six simplified BPTI variants containing 19, 20, 21, 22, 23, and 24 alanines (Fig. 2; Table 2). In contrast, despite repeated trials, BPTI variants containing 25–27 alanines formed only needle crystals, unsuitable for diffraction. All six variants crystallized under the same conditions, indicating a minimal disturbance of the surface properties, even upon multiple alanine substitutions. The alanines were spread almost uniformly over the entire BPTI structure (Fig. 2a). In all of the structures, the overall backbone and side-chain conformations were almost perfectly retained, with a backbone root mean square deviation (RMSD) of 0.3–0.4 Å, which is similar to the RMSD among the wild-type BPTI structures solved by different research groups (Table 3; Fig. 3; Suppl. Figure S3). Therefore, despite the large number of alanine substitutions, the native-like backbone structure and the densely packed protein interior remained essentially unaffected (Fig. 3 and Suppl. Figure S4).
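
For readers unfamiliar with the RMSD figure quoted above, the quantity can be sketched as follows. This is an illustrative helper only: it assumes the two structures have already been superimposed and that the atom lists are matched one to one; the function name and the coordinates are hypothetical and not taken from the deposited structures.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two matched lists of (x, y, z) tuples.

    Assumes the structures are ALREADY superimposed; no fitting is done here.
    The result has the same length unit as the input (e.g. Å).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical backbone atom positions (Å), for illustration only
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
variant = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
print(round(rmsd(reference, variant), 3))
```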

Figure 2: Structures of the simplified BPTI variants.

(a) Structures of wild-type BPTI, BPTI-[5,55]A14GA38V, BPTI-19A and BPTI-24A, from left to right. Alanines are shown as blue spheres in a surface model. New hydration networks around the K15AR17A (b) and D50A (c) sites. Wild-type BPTI and BPTI-24A are shown as ribbon models in blue and violet, respectively. Spheres represent water molecules around the alanine substitution sites.

Table 3 Structure determination and refinement details.
Figure 3: Structural details of the simplified BPTI variants.

(a) Superimposition of simplified BPTIs onto 2SS-BPTI (7pti.pdb). The overall structures remained almost unchanged with RMSD < 0.4 Å. Side-chain conformation of surface exposed (ASA > 50%), partially buried (ASA 30–50%) and buried (ASA < 30%) residues are shown in panels b, c, and d, respectively. In panels b-d the alanines as well as the residues substituted to alanines are encircled. The side-chain conformations of almost all residues were retained in all simplified BPTIs, indicating that multiple alanine substitutions did not affect the native-like densely packed protein interior. In all panels color codes are the same (7pti: orange, BPTI-19A: red, BPTI-21A: green, BPTI-22Ab: blue, BPTI-23A: yellow, and BPTI-24A: violet).

For the purpose of discussion, we examined the fine structural changes induced near the substitution sites by the mutations. The K15A and R17A substitutions, which are located in a loop, did not affect either the local main-chain or the side-chain structures (Figs 2 and 3; Suppl. Figure S4a), but two new water molecules were recruited near the amide nitrogens of Tyr10, Ala11, Gly12, and the Gly14/Val38 pair to fill the spaces left by the large side chains after the alanine substitutions (Fig. 4). Seven novel water molecules appeared near residues 10–14 and the Gly14/Val38 pair, forming an elongated hydration network involving water molecules hydrogen-bonded to the backbone atoms of the protein (Fig. 4). The R39A substitution also recruited 1–2 new water molecules close to the amide nitrogen of Ala39, extending the hydration networks observed in BPTI-21A. Similarly, the P8A substitution did not affect the local structure, but recruited a water molecule and a sulfate ion, which formed new hydrogen bonds with the amide nitrogen of Ala8 (Fig. 4). Ala8 was also hydrogen-bonded to the ε-oxygen atom (OE1) of Glu7, which was further hydrogen-bonded to Asn43 and two new water molecules (Fig. 4a,b). This hydration structure is absent from the wild-type BPTI structures and from all the simplified BPTIs containing Pro at the 8th position (Fig. 4b). The D50A substitution in the α-helix did not affect the backbone conformation around residue 50 (Fig. 2c), but four new water molecules were recruited: two near the amide nitrogen of Ala48 and two near the amide nitrogen of Ala49 (Fig. 4). To date, only a single water molecule has been observed at these sites in the wild-type and 2SS-BPTI structures, whereas in the simplified structures, three water molecules were hydrogen-bonded to Ala48 and Ala49 (Figs 2c and 4). Finally, both the intramolecular hydrogen bonds and the hydration structures at other positions were essentially unchanged from those in the wild-type BPTIs (Suppl. Figure S4).
These observations clearly indicate that the multiple alanine substitutions barely affected the native-like structure of BPTI, but that new water molecules appeared in the vicinity of the main-chain atoms near the substitution sites (Figs 2b,c and 4).

Figure 4: Hydration structures in 2-SS BPTI and simplified BPTIs.

(a) New hydration networks around the sites where larger side chains were substituted with the smaller alanine. Alanine substitutions introduced in this study are listed on the left of the panel and BPTI variants are indicated at the top of the panel. Chain A of BPTI-19A, BPTI-22Ab and BPTI-24A was globally superimposed onto the structure of 2SS-BPTI (7PTI.pdb) using PyMol (www.pymol.com), and the individual alanine substitution sites are shown at identical scale. Inter-atomic distances between backbone atoms (amide nitrogen and carbonyl oxygen) and water molecules are given in Å. (b) New hydrogen bond forming water molecules. Arrows indicate the direction of the hydrogen bonds, from donor to acceptor atoms. The number of protein-water hydrogen bonds increased at and around the substitution sites with increasing number of alanine substitutions. At sites far from the alanine substitutions the hydrogen bonding and hydration structures remained unchanged (Suppl. Figure S4). Protein-water hydrogen bonds were calculated using HBAT (26). Residue identities and inter-atomic distances (Å) are indicated.

Structural interpretation of the thermodynamic parameters

Let us consider possible structural features that could account for the increase in enthalpy change arising from the multiple alanine substitutions (Fig. 5). X-ray crystallographic analyses indicated that all of the simplified BPTI structures fully overlapped (Fig. 3) and that the main change that occurred as the number of alanines increased was an increase in the solvent content in the asymmetric units (Table 3). The new water molecules were recruited around the alanine substitution sites and were involved in novel hydration networks (Figs 2, 4 and 5). Thus, favorable protein–water interactions appear to be the most likely factor responsible for the increased unfolding enthalpy of the simplified BPTIs, rather than the relaxation of atomic clashes or the creation of new intramolecular hydrogen bonds or van der Waals contacts (Figs 4 and 5), assuming that the hydration structures remain the same for all alanine mutants in their denatured states.

Figure 5: Correlation between thermodynamic parameters and hydration networks around the alanine substitution sites.

(a) Specific thermodynamic parameters estimated at 37 °C. Along the horizontal axis, open bars show the changes in enthalpy (∆H37 °C kJ/mol) and gray bars show the changes in entropy (T∆S37 °C kJ/deg.mol); along the vertical axis, open squares (□) show the changes in free energy (∆G37 °C kJ/mol). (b) Correlation plot of thermodynamic parameters [:enthalpy change at Tm (∆HTm); ■: enthalpy change at 37 °C (∆H37 °C); and ▲: entropy change at 37 °C (T∆S37 °C)] versus the protein-water hydrogen bonds observed around the alanine substitution sites in their crystal structures (see also Materials and methods; and Fig. 4 legend). Correlation coefficients are shown within the panel. ∆H37 °C kJ/mol versus the number of protein-water hydrogen bonds to the backbone atoms and versus the number of water molecules close to Cβ-atoms are shown in panels c and d, respectively. Water molecules residing within 3.4 to 4.0 Å from side-chain Cβ-atoms were considered as ‘close’. In panel c, we considered 7pti as wild-type BPTI, and the bars, ■ and □ squares represent, respectively, ∆H37 °C kJ/mol, the number of protein-water H-bonds around the alanine substitution sites, and those at sites not substituted with alanines. In panel d, we considered the 5pti, 6pti and 7pti structures as wild-type BPTI, and for the simplified variants we included all the chains (monomers) in their asymmetric unit. The gray bars stand for ∆H37 °C kJ/mol. The ■ and □ squares represent the number of water molecules close to Cβ atoms of, respectively, alanines and amino acids other than alanines. The number of water-water H-bonds and the number of water molecules close to the Cβ-atoms of alanines increased with increasing alanine substitutions, while the numbers remained almost the same at residues not substituted with alanines. A similar correlation was also observed in the plot of specific entropy versus hydration structures (Suppl. Figure S5). 5,55* stands for BPTI-[5,55]A14GA38V.

Generally, an increase in entropy change is interpreted as either an enlargement of the conformational space in the denatured state or as a loss of it in the native state, in which both the chain and hydration terms must be accounted for. We first imagined that replacing bulky side chains with small alanine side chains would create voids, increasing the flexibility of the local chain in the native state, and thus reducing the entropy of unfolding, and consequently stabilizing the native state in terms of entropy. Another possibility is that reducing the size of the side chains increases the conformational space, and thus the entropy, of the denatured state, which would destabilize the native state. DSC experiments indicated that the entropy of unfolding increased with the number of alanine replacements. Moreover, the crystal structures indicated that the voids were filled with water molecules and that the flexibility or dynamics of the residues surrounding the mutations were unchanged, as assessed with the B-factors (Suppl. Figure S3). The entropy destabilization observed upon alanine substitution may thus originate from the enlargement of the conformational space in the denatured state. For example, a P → A mutation is estimated to reduce the melting temperature by 3.5 °C, mainly through entropy destabilization7, and this figure is roughly consistent with the Tm reduction of 2.24 °C observed upon the P8A substitution (Table 2). The difference could be accounted for by entropy loss associated with the increased number of water molecules released to the bulk water upon the denaturation of the simplified BPTIs (Fig. 5c,d). Finally, a strong correlation between the thermodynamic parameters and both the number of protein–water hydrogen bonds (Fig. 5b,d) and the number of water molecules interacting with the newly added methyl groups15 (Fig. 5c) further substantiated the notion that these water molecules represent the molecular origin of hydration enthalpy/entropy. In principle, measuring the thermodynamics of mutants in which a large buried hydrophobic residue is replaced with alanine could provide further proof of this effect; however, such substitutions nearly completely unfold BPTI-[5,55]16, making such analysis impractical.
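
The kind of correlation discussed above (a thermodynamic parameter versus a count of protein–water hydrogen bonds) is commonly quantified with a Pearson coefficient. The sketch below shows the calculation only; the source does not state which correlation measure was used, the function name is hypothetical, and the numbers are made-up placeholders, not the values plotted in Fig. 5.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# PLACEHOLDER values for illustration only (not data from this study):
hbond_counts = [2, 4, 5, 7, 9]                 # protein-water H-bonds per variant
dh_37 = [150.0, 158.0, 160.0, 171.0, 176.0]    # unfolding enthalpy at 37 °C, kJ/mol
print(round(pearson_r(hbond_counts, dh_37), 2))
```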

Concluding remarks

The enthalpy stabilization introduced into a protein by multiple alanine substitutions is novel and unexpected. A comparison of the thermodynamic parameters and structural data suggests that the enthalpy stabilization of the simplified BPTIs probably arises from improved interactions between water molecules and the protein. This observation sheds new light on the molecular nature of the hydration term of the enthalpy/entropy of unfolding. Further analyses, such as DSC performed with D2O protein solutions, might enable decomposition of the water contribution into electrostatic and hydrogen-bonding terms.

The rational design of enthalpy stabilization usually requires high-resolution structures for designing novel hydrogen bonds17 or salt bridges18 to fill cavities19 or to relax steric clashes11,12, which is difficult. On the other hand, the substitution of surface-exposed and semi-exposed residues with alanine could provide a new and generic strategy for increasing a protein’s stability if we can determine the exact nature of the entropy destabilization and reduce its extent.

Finally, it is noteworthy that a protein in which more than 40% of the residues are alanines can fold into a native-like, well packed structure that can be crystallized and solved at high resolution. This observation implies that the determinants of a protein fold lie in residues deeply buried in the template structure. Indeed this study and our previous studies demonstrate that a substantial number of surface and semi-exposed residues do not actively contribute to specifying the native structure of proteins with densely packed interiors, which translates to highly cooperative thermal denaturation, a biophysical hallmark of natively folded proteins.

Materials and Methods

Protein expression and purification

All simplified BPTI variants were over-expressed using the pMMHA expression vector in the Escherichia coli JM109(DE3)pLysS cell line, and collected by Ni-NTA chromatography. After removal of the Trp tag by CNBr cleavage, the BPTI variants were further purified by reverse phase HPLC as previously described20. Purified proteins were lyophilized and preserved at −30 °C until use. Protein identities were confirmed by ESI-TOF mass spectrometry.

Thermodynamic analysis

Sample preparation

Samples for circular dichroism (CD) and differential scanning calorimetry (DSC) were prepared by dissolving lyophilized proteins in 20 mM sodium acetate buffer (pH4.1, pH4.7, and pH5.5). All samples were filtered with a 0.20 μm membrane filter to remove aggregates that might have accumulated during dialysis. Protein concentrations and pHs were confirmed after dialysis and the samples were thoroughly degassed just before DSC measurements.

CD measurements

The CD measurements were performed at 20 μM protein concentration in 20 mM acetate buffer (pH4.7) at 4 °C, 40 °C and 70 °C using a JASCO J-820 spectrophotometer. The reversibility of the thermal denaturation was assessed by measuring the CD at 222 nm while heating (forward) the samples to 80 °C, cooling (backward) them to 4 °C, and then re-heating the samples from 10 to 80 °C. All variants showed almost completely reversible thermal denaturation curves (Suppl. Figure S1).

DSC measurements

Samples at a concentration of 1 mg/mL were dialyzed for 18 hours at 4 °C, as previously described14. DSC measurements were performed using a VP-DSC MicroCalorimeter (Microcal, MA, USA) at a scan rate of 1.0 °C/min in the temperature range of 5 to 90 °C. The individual apparent heat capacity curves were analyzed with a two-state model using a non-linear least-squares fitting method and by assuming a linear temperature dependence of the heat capacity for the native and denatured states21,22,23 (Fig. 1).
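
To make the two-state model concrete, the sketch below computes the excess heat capacity curve that such a model predicts for a given Tm, ∆H(Tm) and ∆Cp. It is a forward model only and not the authors' fitting procedure: it assumes a constant ∆Cp and leaves out the native-state baseline, whereas the analysis above uses linear native and denatured baselines. The function name and parameter values are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(K mol)

def two_state_excess_cp(t_kelvin, tm_kelvin, dh_tm, dcp):
    """Two-state (N <-> D) model at temperature t_kelvin.

    Returns (fraction unfolded, excess heat capacity over the native
    baseline in kJ/(K mol)). dcp is ASSUMED temperature-independent.
    """
    dh = dh_tm + dcp * (t_kelvin - tm_kelvin)
    ds = dh_tm / tm_kelvin + dcp * math.log(t_kelvin / tm_kelvin)
    dg = dh - t_kelvin * ds
    k = math.exp(-dg / (R * t_kelvin))           # unfolding equilibrium constant
    f_d = k / (1.0 + k)                          # fraction of unfolded molecules
    # Heat absorbed by the shift of the N <-> D equilibrium with temperature
    transition = dh ** 2 * k / (R * t_kelvin ** 2 * (1.0 + k) ** 2)
    return f_d, f_d * dcp + transition

# Hypothetical parameters for illustration: Tm = 50 °C, dH(Tm) = 200 kJ/mol
temps = [278.15 + i for i in range(86)]          # 5 °C to 90 °C in 1 K steps
curve = [two_state_excess_cp(t, 323.15, 200.0, 3.0)[1] for t in temps]
peak_t = temps[curve.index(max(curve))]
print(f"peak of the simulated curve at {peak_t - 273.15:.0f} °C")
```

A fit would vary Tm, ∆H(Tm), ∆Cp and the baseline terms until a curve of this form matches the measured apparent heat capacity in the least-squares sense.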

Structure analysis

Crystallization

A stock solution containing 10–15 mg/ml protein was prepared in 15 mM Tris-HCl, pH7.0. Crystals of all simplified BPTI variants were grown at 20 °C using the hanging drop vapor diffusion technique in 20–30% PEG4000, 0.2 M lithium sulfate and 0.1 M Tris-HCl (pH8.5).

Structure determination

The X-ray diffraction data were recorded from single crystals using a synchrotron beam line at the Photon Factory (KEK, Tsukuba, Japan). The data were processed with the HKL2000 program package, using DENZO for the integration and SCALEPACK for the merging and statistical analysis of the diffraction intensities24. The structures were determined by molecular replacement using 5PTI13 as a template with the program Molrep and refined using Refmac525, as previously described4. Structures were validated using Molprobity26 and visualized using Coot27.

Identification of novel water molecules

The protein-protein and protein-water hydrogen bonds in the crystal structures were calculated using Molprobity26 and HBAT28. In short, hydrogen atoms were added to the X-ray structures using Molprobity and then hydrogen bonds were calculated using the HBAT program. Water molecules within 2 to 4 Å of the amide nitrogen and carbonyl oxygen were considered to have strong protein-water hydrogen bonds (Figs 4 and 5), while water molecules within 3.4 to 4 Å of Cβ-atoms were considered as methyl side-chain hydration (Fig. 5d). As a reference, we also calculated the hydrogen bonds in 2SS-BPTI29, which contains SS-bonds at the 5–55 and 14–38 sites, while the cysteines at sites 30 and 51 are substituted with alanines.
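
The distance windows given above can be applied with a simple geometric filter, sketched below. This is only a distance-based illustration and does not reproduce the Molprobity/HBAT workflow, which also places hydrogens and evaluates hydrogen-bond geometry; the function name and coordinates are hypothetical.

```python
import math

def classify_waters(waters, backbone_polar, cbeta):
    """Select waters by the distance windows quoted in the text.

    waters:         water oxygen positions, list of (x, y, z) in Å
    backbone_polar: backbone amide N and carbonyl O positions
    cbeta:          side-chain C-beta positions
    Returns (waters 2.0-4.0 Å from a backbone N/O,
             waters 3.4-4.0 Å from a C-beta).
    """
    def near(water, atoms, low, high):
        return any(low <= math.dist(water, atom) <= high for atom in atoms)

    hbond_like = [w for w in waters if near(w, backbone_polar, 2.0, 4.0)]
    methyl_hydration = [w for w in waters if near(w, cbeta, 3.4, 4.0)]
    return hbond_like, methyl_hydration

# Hypothetical coordinates (Å), for illustration only
waters = [(0.0, 0.0, 3.0), (10.0, 0.0, 0.0), (5.0, 0.0, 3.7)]
backbone_polar = [(0.0, 0.0, 0.0)]
cbeta = [(5.0, 0.0, 0.0)]
hbond_like, methyl_hydration = classify_waters(waters, backbone_polar, cbeta)
print(len(hbond_like), len(methyl_hydration))
```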

Data Availability

The coordinates and structure factors of the BPTI-21A, BPTI-22Ab, BPTI-23A and BPTI-24A variants have been deposited in the Protein Data Bank under the PDB entry codes 4YPK, 4YPP, 4YR4 and 4YR5, respectively. We previously reported the structures of BPTI-19A (3AUB) and BPTI-20A (3CI7).

Additional Information

How to cite this article: Islam, M. M. et al. Crystal structures of highly simplified BPTIs provide insights into hydration-driven increase of unfolding enthalpy. Sci. Rep. 7, 41205; doi: 10.1038/srep41205 (2017).
