Introduction

Enzymes are flexible macromolecules that sample multiple structural states, described by a conformational energy landscape1. It is the relative stability of these conformational states, and the ability of enzymes to transition between them, that ultimately dictates enzymatic function2,3,4. Analyses of directed evolution trajectories have shown that evolution can reshape enzyme conformational landscapes by enriching catalytically productive states and depopulating non-productive ones5,6,7, leading to enhanced catalytic activity. Similar mechanisms contribute to the evolution of substrate selectivity, where transient conformations responsible for activity on non-cognate substrates become enriched8,9,10. Thus, it should be possible to harness the pre-existing conformational plasticity of enzymes to tailor their catalytic properties by reshaping their conformational landscapes. However, predicting the effect of mutations on these energy landscapes remains challenging, and current computational enzyme design protocols, which focus on a single structural state, are poorly optimized for this task. New design methodologies are therefore required for the targeted alteration of subtle conformational states and equilibria, which would in turn facilitate the design of biocatalysts with customized activity and selectivity.

Here, we report a computational procedure for rationally tuning enzyme conformational landscapes that is based on multistate computational protein design, a methodology that allows protein sequences to be optimized on multiple structural states11. As a case study, we remodel the conformational landscape of aspartate aminotransferase, an enzyme that switches between open and closed conformations via hinge movement, which involves the rotation of a protein domain relative to another around an axis between two planes. Using our approach, we enrich the less populated but catalytically active closed conformation in order to increase catalytic efficiency (kcat/KM) with the non-native substrate l-phenylalanine, leading to altered substrate selectivity. Steady-state kinetics reveal kcat/KM increases of up to 100-fold towards this aromatic amino acid, resulting in a selectivity switch of up to 1900-fold, and structural analyses by room-temperature X-ray crystallography and multitemperature nuclear magnetic resonance (NMR) spectroscopy confirm that the conformational landscape is remodeled to favor the target state. Our methodology for altering conformational equilibria should be applicable to many enzymes and proteins that undergo hinge-mediated domain motions.

Results

Computational remodeling of conformational landscape

E. coli aspartate aminotransferase (AAT) is a pyridoxal phosphate (PLP)-dependent enzyme that catalyses the reversible transamination of l-aspartate with α-ketoglutarate, yielding oxaloacetate and l-glutamate (Supplementary Fig. 1). During its catalytic cycle, AAT undergoes hinge movement to switch between an open conformation, in the ligand-free form, and a closed conformation upon association with substrates or inhibitors (Fig. 1a,b)12. Previously, AAT was redesigned to change its substrate specificity to allow transamination of aromatic amino acids13. To do so, six of the 19 residues that are strictly conserved in AAT enzymes were replaced by those found at corresponding positions in the homologous E. coli tyrosine aminotransferase. These residues were selected because they line the substrate channel leading to the catalytic pocket or are located near the cofactor phosphate moiety. This process yielded a hexamutant (HEX) of AAT that is approximately two orders of magnitude more catalytically efficient than the wild type (WT) with the aromatic amino acid l-phenylalanine (Table 1), and similarly efficient for transamination of l-aspartate. Unexpectedly, it was found that unlike WT, HEX was closed in its ligand-free form (Fig. 1c)14, demonstrating that those six mutations shifted its conformational equilibrium to favor the closed state (Fig. 1d), which is the active conformation of the enzyme15. This equilibrium shift was accompanied by enhanced l-phenylalanine transamination activity resulting from a larger increase in affinity for this non-native substrate than for the native l-aspartate substrate13. 
Since domain closure in AAT is driven by the excess binding energy of the native substrate l-aspartate, which forms strong polar interactions with two active-site arginines15,16, the observed equilibrium shift towards the closed state in the absence of ligand was proposed to facilitate access to this active conformation for non-native substrates whose binding may not provide sufficient energy to induce domain closure due to their lower structural complementarity to the active site14. Based on these observations, we postulated that we could rationally remodel the conformational landscape of AAT by using multistate computational protein design11 to identify novel mutation combinations that can preferentially stabilize the closed conformation over the open conformation, and in doing so, increase catalytic efficiency and selectivity for l-phenylalanine.

Fig. 1: AAT conformational landscape.

a E. coli AAT is a 90 kDa homodimer that undergoes a conformational change from an open (green, PDB ID: 1ARS) to a closed (dark blue, PDB ID: 1ART) state upon substrate binding. This conformational transition involves rotation of a small moving domain (colored) relative to a fixed domain (white and gray for chains A and B, respectively), which causes a 2.4 ± 0.4 Å displacement (mean Cα distance ± s.d.) of the helix formed by residues K355–F365 (indicated by an asterisk). The PLP cofactor bound at the active site is shown as spheres (salmon). b Hinge movement analysis of chain A reveals a 7.1-degree rotation of the moving domain relative to the fixed domain about an axis between two planes (dotted line). Hinge-bending residues and PLP are shown as orange spheres and white sticks, respectively. c Superposition of HEX structures in the absence (yellow, PDB ID: 1AHE) and presence (blue, PDB ID: 1AHY) of bound inhibitor with that of the WT closed state (white, PDB ID: 1ART) shows that this mutant is closed in both cases. Cα displacement (mean ± s.d.) of residues K355–F365 (asterisk) is indicated. d These results demonstrate that the six mutations (mut.) of HEX remodel its conformational landscape to favor the closed conformation.

Table 1 Apparent kinetic parameters of E. coli AAT and its mutants for transamination of various amino-acid donors with α-ketoglutarate as acceptor

To test this hypothesis, we implemented a computational strategy (Fig. 2) that proceeds in five steps: (1) identification of hinge-bending residues involved in transition between open and closed conformations; (2) generation of structural ensembles approximating backbone flexibility to model open and closed conformational states; (3) optimization of side-chain rotamers for all allowed amino-acid combinations at key hinge-bending residues and neighboring positions, on each ensemble; (4) calculation of energy differences between open and closed states to predict preferred conformation, and (5) combinatorial library design using computed energy differences to select mutant sequences for experimental testing.

Fig. 2: Computational remodeling of AAT conformational landscape by multistate design.

To remodel the AAT conformational landscape, we followed a 5-step process: (1) identification of hinge-bending residues involved in transition between open (green) and closed (dark blue) conformational states; (2) generation of structural ensembles approximating backbone flexibility to model open and closed states; (3) optimization of rotamers for mutant sequences on both open- and closed-state ensembles; (4) calculation of energy differences between conformational states (ΔE = Eclosed – Eopen) to predict equilibrium of each mutant, and (5) combinatorial library design using ΔE values to generate Closed, OpenLow, and OpenHigh libraries for experimental testing. Designed residues V35, K37, T43, and N64 correspond to V39, K41, T47, and N69 in the previously published crystal structures of wild-type AAT (PDB ID: 1ARS and 1ART).

To identify hinge-bending residues for the open/closed conformational transition, crystal structures of WT AAT in its open and closed forms (PDB ID: 1ARS and 1ART, respectively12) were used as input for hinge movement analysis with DynDom, a program that identifies domains, hinge axes, and hinge-bending residues in proteins for which two conformations are available17. DynDom analysis (Supplementary Table 1) revealed a small moving domain (Fig. 1b) that rotates by 7.1 degrees about the hinge axis relative to the larger fixed domain, and identified 25 hinge-bending residues. We selected two of these residues for design, Val35 and Lys37 (numbering based on Uniprot sequence P00509), because they are found on the flexible loop connecting the moving and fixed domains (Fig. 2). We also selected for design residues Thr43 and Asn64, which are not part of the hinge, but whose side chains form tight packing interactions with those of Val35 and Lys37. Interestingly, these residues comprise four of the six positions that were mutated in HEX (Table 1), demonstrating that our analysis using only WT structures led to the identification of positions that contribute to controlling the open/closed conformational equilibrium in AAT.
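The rotation angle reported by a hinge analysis can be illustrated with a least-squares superposition. The following is a minimal sketch and not DynDom itself: it assumes the two conformations were already superposed on the fixed domain, that `P` and `Q` hold matched Cα coordinates of the moving domain, and it uses the Kabsch algorithm as a stand-in. The function names and the synthetic coordinates are hypothetical.

```python
import numpy as np

def kabsch_rotation(P, Q):
    """Return the 3x3 rotation R that best maps centered P onto centered Q."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)          # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                    # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

def rotation_angle_deg(R):
    """Rotation angle (degrees) encoded by a proper rotation matrix."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))

# Synthetic demonstration (hypothetical coordinates, not AAT data):
# rotate random points by 7.1 degrees about z and recover the angle.
rng = np.random.default_rng(0)
P = rng.normal(size=(30, 3))
theta = np.radians(7.1)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
Q = P @ Rz.T + np.array([1.0, -2.0, 0.5])
angle = rotation_angle_deg(kabsch_rotation(P, Q))
```

A dedicated hinge analysis additionally partitions the protein into domains and locates the hinge axis, which this sketch does not attempt.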

Next, we generated backbone ensembles from the open- and closed-state crystal structures to approximate the intrinsic flexibility of these two conformational states using the PertMin algorithm18. We previously showed that such ensembles improve the accuracy of protein stability predictions when used as templates in multistate design19. Using the protein design software Phoenix20,21, we optimized rotamers for all combinations of proteinogenic amino acids with the exception of proline at the four designed positions on each backbone ensemble, yielding Boltzmann-weighted average energies for 130,321 (19^4) AAT sequences that reflect their predicted stability on each conformational state. To identify mutant sequences that preferentially stabilized the closed conformation, we computed the energy difference between closed and open state ensembles (ΔE = Eclosed – Eopen) for each sequence using the Phoenix potential energy function developed for protein design. Phoenix energies correspond to the sum of pairwise nonbonded interaction energies between rotamers and between rotamers and template in the folded state, without consideration of bonded interactions, entropy or the energy of the unfolded state. As a final step, we used these ΔE values as input to the CLEARSS library design algorithm20 to generate a 24-member combinatorial library of AAT mutants predicted to favor the closed state (Closed library, Supplementary Table 2) with a range of ΔE values (–86.0 to –9.5 kcal mol–1) encompassing that of HEX (–76.8 kcal mol–1, Table 2).
As controls, we also generated two libraries of sequences predicted to favor the open conformation (Supplementary Table 2): the OpenLow library, which contains 18 sequences predicted to stabilize the open state with ΔE values (0.8–16.4 kcal mol–1) comparable to that of the WT (14.2 kcal mol–1, Table 2), and the OpenHigh library, which contains 24 sequences predicted to more strongly favor the open state due to substantial destabilization of the closed state by >120 kcal mol–1. While we postulated that the OpenLow library would yield mutants with wild-type-like conformational landscapes and therefore similar catalytic efficiency and substrate selectivity, we hypothesized that OpenHigh library mutants would be less efficient than WT with both native and non-native substrates due to their strong destabilization of the closed conformation, which is the active form of the enzyme15. Thus, experimental characterization of these three mutant libraries, which comprise non-overlapping sequences (Fig. 2), allowed us to assess the ability of the ΔE metric to predict sequences with conformational landscapes favoring the open or closed states.
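The size of the sequence space and the ΔE metric can be sketched in a few lines. This is an illustration under stated assumptions, not the Phoenix implementation: the Boltzmann weighting formula, the `kT` value (about 298 K), the function names, and the per-member energies are all hypothetical placeholders.

```python
import itertools
import numpy as np

# 19 proteinogenic amino acids (proline excluded, as in the text).
AMINO_ACIDS = "ACDEFGHIKLMNQRSTVWY"

def boltzmann_average(energies, kT=0.593):
    """Boltzmann-weighted average energy over ensemble members.

    kT in kcal/mol is an assumed value; the source does not state
    the temperature factor used.
    """
    e = np.asarray(energies, dtype=float)
    w = np.exp(-(e - e.min()) / kT)   # shift by the minimum for stability
    return float(np.sum(w * e) / np.sum(w))

def delta_e(closed_energies, open_energies, kT=0.593):
    """dE = E_closed - E_open; negative values predict a closed preference."""
    return boltzmann_average(closed_energies, kT) - boltzmann_average(open_energies, kT)

# Four designed positions, 19 choices each.
n_sequences = sum(1 for _ in itertools.product(AMINO_ACIDS, repeat=4))

# Hypothetical per-member energies for one sequence on each ensemble.
example_delta_e = delta_e([-10.0, -10.0], [-5.0, -5.0])
```

Because low-energy members dominate the weighted average, a single well-packed backbone in an ensemble can determine the score for that state.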

Table 2 Conformational equilibrium of AAT variants

Kinetic analysis of designs

We screened the three mutant libraries for transamination activity with the non-native substrate l-phenylalanine (“Methods” section) and selected the most active mutants from each library for kinetic analysis. All selected mutants catalyzed transamination of l-phenylalanine or l-aspartate with α-ketoglutarate, and displayed substrate inhibition with this acceptor substrate, as is the case for WT (Table 1, Supplementary Tables 3–4, and Supplementary Figs. 2–5). All Closed library mutants displayed catalytic efficiencies towards l-phenylalanine that were improved by approximately two orders of magnitude relative to WT (Table 1), comparable to HEX, and all were similarly or more active with this non-native substrate than with l-aspartate, in stark contrast with WT AAT, which prefers the native substrate by a factor of 100 (Supplementary Fig. 6). Furthermore, all Closed library mutants were less catalytically efficient with l-aspartate than the WT despite having lower KM values for this substrate, similar to HEX. These results are consistent with previous studies that showed that release of the oxaloacetate product resulting from l-aspartate transamination is dependent on a conformational change from the closed to the open state that is partially rate determining22. Meanwhile, OpenLow and OpenHigh mutants favored the native over the non-native substrate by up to 50-fold, similar to WT. OpenLow mutants had lower KM values and were more catalytically efficient than WT with both l-phenylalanine and l-aspartate, which could be due to the fact that these mutants have ΔE values similar to the WT but stabilize both the open and closed conformations by >10 kcal mol‒1 (Table 2 and Supplementary Table 2).
This is not the case for OpenHigh mutants, which have similar KM values but are less catalytically efficient with the native substrate than WT, in agreement with our hypothesis that strong destabilization of the catalytically active closed conformation would result in less efficient catalysis. Overall, these kinetic results support the hypothesis that the change in substrate selectivity is linked to the conformational equilibrium previously suggested by the data from the HEX mutant13,14.

Structural analysis of designs

To provide structural information on the conformations adopted by AAT mutants, we turned to room-temperature X-ray crystallography, which provides insight into enzyme conformational ensembles under conditions that are relevant to catalysis23 and free of potential distortions or conformational bias introduced by sample cryocooling24. We crystallized WT, HEX, and select variants from the Closed (VFIT and VFIY), OpenLow (VFCS), and OpenHigh (AIFS) libraries. All six enzymes yielded crystals under similar conditions (Supplementary Table 5), which could only be obtained in the presence of maleate (Supplementary Fig. 1d), an inhibitor that stabilizes the closed conformation when bound by both WT25 and HEX14. To obtain structures in the absence of maleate, we applied a rigorous crystal soaking method to serially dilute and extract the inhibitor from the crystallized enzymes (Methods). We collected X-ray diffraction data at room temperature (278 K) for all variants with the exception of AIFS, which could only be measured at cryogenic temperature (100 K) because we could only obtain small crystals that were not robust to radiation damage at non-cryogenic temperatures. We applied statistical criteria (Supplementary Table 6) to assign high-resolution cut-offs of 1.37–2.31 Å for our data sets, and all structures were subsequently determined by molecular replacement in space group P 63 with an enzyme homodimer in the asymmetric unit. Given the known impact of crystal packing and crystallization conditions on domain movement amplitude in AAT25,26, we confirmed that crystals of all variants with or without soaking had similar unit cell dimensions and identical space group. In the inhibitor-bound state, all structures were closed as expected (Supplementary Fig. 7) due to electrostatic interactions between maleate and the side chains of Arg280 and Arg374 (Supplementary Fig. 8). 
Upon soaking crystals of the WT enzyme to remove the bound inhibitor, we observed that one subunit (chain A) within the enzyme homodimer was in the open conformation (Fig. 3), confirming that the soaking procedure was able to remove the bound maleate, and that the crystal lattice could accommodate the domain rotation required for opening and closing of this subunit. Chain B however remained closed upon maleate removal, likely due to crystal packing interactions hindering domain rotation of this subunit. These results are consistent with a previous study that showed that only one subunit of a homologous AAT closed upon soaking of the crystal in a substrate solution26. In the maleate-free structures, a sulfate ion and one or more ordered water molecules occupy the inhibitor binding site of both subunits (Supplementary Fig. 9), as was previously observed in WT25 and HEX14 structures at cryogenic temperatures.

Fig. 3: Crystal structures.

Overlay of AAT structures (chain A) in the presence and absence of maleate shows transition from open to closed states upon inhibitor binding for WT, VFCS, and AIFS, but not for HEX, VFIT, and VFIY, which already adopt the closed conformation in the absence of bound inhibitor. Average displacements of the helix formed by residues K355–F365 upon maleate binding are reported as the average pairwise distance of corresponding Cα atoms for the 11 residues comprising this helix (mean ± s.d.).

Superposition of bound and unbound structures for each variant confirmed that Closed library mutants VFIT and VFIY remained in the closed conformation in the inhibitor-free form, similar to HEX, while OpenLow variant VFCS and OpenHigh variant AIFS adopted the open conformation, similar to WT (Fig. 3). AIFS is unique in that the electron density is especially weak in the regions corresponding to the helix formed by Pro12–Leu19 and the loop connecting moving and fixed domains (Leu31–Thr43), even at cryogenic temperatures, demonstrating that these structural segments are disordered when the enzyme adopts the open conformation (Supplementary Fig. 10). This result suggests that the four mutations of AIFS, three of which are found within the Leu31–Thr43 loop, contribute to destabilizing the open conformation, consistent with the calculated Eopen value of this variant (Table 2) being >7 kcal mol–1 higher than that of the other variants that favor the open state (WT and VFCS). Interestingly, the amplitude of the open/closed conformational transition that occurs in open variants upon maleate binding (Fig. 3) correlated with their computed ΔE values (VFCS < WT < AIFS), suggesting that this metric can be used to fine-tune this enzyme conformational landscape. Furthermore, DynDom analyses of all variants confirmed that only WT, VFCS, and AIFS undergo hinge motion upon maleate binding (Supplementary Table 7), which rotates the moving domain relative to the fixed domain by 4.6, 2.6, and 5.9 degrees, respectively.
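The helix displacement metric (mean ± s.d. of distances between corresponding Cα atoms) can be sketched as follows. This is an illustrative snippet only: it assumes the bound and unbound structures are already superposed and that the two arrays list the same 11 Cα atoms in the same order. The use of the sample standard deviation, the function name, and the toy coordinates are assumptions.

```python
import numpy as np

def ca_displacement(ca_unbound, ca_bound):
    """Mean and sample s.d. of per-residue Calpha distances (in Å)."""
    a = np.asarray(ca_unbound, dtype=float)
    b = np.asarray(ca_bound, dtype=float)
    if a.shape != b.shape:
        raise ValueError("coordinate arrays must have matching shapes")
    d = np.linalg.norm(a - b, axis=1)   # one distance per residue
    return float(d.mean()), float(d.std(ddof=1))

# Toy example (hypothetical coordinates): an 11-residue segment
# translated rigidly by 2.4 Å along x.
helix_unbound = np.zeros((11, 3))
helix_bound = helix_unbound + np.array([2.4, 0.0, 0.0])
mean_disp, sd_disp = ca_displacement(helix_unbound, helix_bound)
```

A rigid translation gives a zero spread, whereas a rotation about a hinge axis gives distances that grow with distance from the axis, hence a nonzero s.d.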

NMR analysis of conformational landscapes

Having demonstrated crystallographically that our designed variants adopted the target conformation in the absence of ligand, we turned to NMR spectroscopy to gain insights into their conformational equilibria in solution, and compared results against those obtained for WT and HEX. We first measured 1H-15N HSQC spectra for WT in the presence and absence of l-aspartate (Supplementary Fig. 11a). As expected for a protein of this size, the spectrum consisted of a large number of peaks that were broad with a high degree of overlap. It was nonetheless possible to assign the unique chemical shifts of peaks from the indole NH group for three native Trp residues by comparison of these spectra with those acquired with single Trp mutants (Supplementary Fig. 11b,c). This allowed assignment of a peak that was significantly broadened in spectra of both HEX and Closed library mutant VFIY to the indole NH from Trp307 (Supplementary Fig. 11d), whose side chain is closest to the designed hinge residues. Moreover, the Trp307 indole peak was no longer detectable when the l-aspartate substrate was present for both WT and mutant enzymes, most likely being broadened beyond detection. Since all conditions that favor the closed state (i.e., HEX and Closed library mutations and/or l-aspartate binding) broaden the Trp307 indole resonance, this exchange appears to be associated with the closed state, potentially due to conformational dynamics around the hinge.

In order to characterize the thermodynamics of the exchange processes around the hinge region, we labeled AAT variants at a single site with 19F using site-specific incorporation of the noncanonical amino acid 4-trifluoromethyl-l-phenylalanine27. Phe217 was chosen as the incorporation site since it is proximal to hinge residues but is not in direct contact with the substrate (Fig. 4a). The 19F spectrum of WT at 278 K showed a single peak centered at approximately ‒60 ppm that shifts downfield as temperature is increased (Fig. 4b), similar to what is observed for free 4-trifluoromethyl-l-phenylalanine in solution (Supplementary Fig. 12). By contrast, the HEX spectrum at 278 K showed a large broad peak centered at around ‒61.6 ppm, with another peak of substantially lower intensity also appearing at a similar shift to that observed in the WT spectrum (‒60.4 ppm). The relative intensity of these two peaks changed as the temperature was increased, with the low-intensity peak increasing as the major peak decreased. This is characteristic of two-state exchange, with the equilibrium between the two states being shifted by the temperature change. Given that the crystal structure of HEX in its ligand-free form showed a closed conformation (Fig. 3), it is likely that the major peak reflects a local chemical environment created by the closed state, with a small population in an open state similar to that seen in the WT spectrum.

Fig. 4: Conformational landscape analysis by NMR.

a To evaluate conformational equilibrium of AAT variants, we introduced the fluorinated amino acid 4-trifluoromethyl-l-phenylalanine at position F217 (orange), which is located closer to hinge-bending and designed residues (green) than to the bound maleate inhibitor (blue). Crystal structure shown is that of wild-type (WT) AAT at 278 K (PDB ID: 8E9K). b 19F NMR spectra of AAT variants in the absence of ligand show dynamic equilibrium between 278 K and 308 K for HEX, VFIT, VFIY, and AIFS, confirming that these proteins are undergoing exchange. This is not the case for WT and OpenLow library mutant VFCS, both of which predominantly adopt the open conformation within this temperature range. For HEX and Closed library mutants VFIT and VFIY, the open conformation is enriched as temperature increases. For AIFS, spectra suggest that this OpenHigh library mutant samples conformations distinct from those sampled by the other variants.

Using peak deconvolution and integration, it was possible to calculate relative populations for each species for variants undergoing the observed two-state exchange, along with the free energy difference between states (Supplementary Fig. 13 and Supplementary Table 8). We calculated ΔG at 278 K for HEX to be –1.72 kcal mol–1 (Table 2), a small difference that would be compatible with the interconversion between these two states being part of the catalytic cycle28. These peak volumes could also be calculated over the entire temperature range tested, giving rise to a van’t Hoff relationship with a small degree of curvature (Supplementary Fig. 14). Deviations from linearity can occur when there are differences in heat capacity between the two states, as would be expected for a process involving a change in conformational states over the temperature range tested29. By contrast, the presence of a single peak in WT spectra suggests that the closed state is not significantly populated under these conditions, as expected from its open-state crystal structures obtained under both cryogenic12,25 and room-temperature conditions.
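The link between integrated peak populations and a two-state free energy difference can be sketched as below. This is a minimal illustration that assumes the convention ΔG = Gclosed – Gopen (negative values favor the closed state) and a strict two-state model; the function names are hypothetical and the snippet does not reproduce the deconvolution itself.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def delta_g(pop_closed, pop_open, temperature_k):
    """Two-state free energy difference (kcal/mol), closed relative to open."""
    k_eq = pop_closed / pop_open
    return -R_KCAL * temperature_k * math.log(k_eq)

def closed_fraction(delta_g_kcal, temperature_k):
    """Fraction of molecules in the closed state implied by a given dG."""
    k_eq = math.exp(-delta_g_kcal / (R_KCAL * temperature_k))
    return k_eq / (1.0 + k_eq)

# Using the dG reported for HEX at 278 K (-1.72 kcal/mol):
hex_closed_fraction = closed_fraction(-1.72, 278.0)
# Round trip back to dG as a consistency check.
round_trip = delta_g(hex_closed_fraction, 1.0 - hex_closed_fraction, 278.0)
```

Under these assumptions, a ΔG of –1.72 kcal mol–1 at 278 K corresponds to roughly 96% of molecules in the closed state, which illustrates how small an energy gap separates a largely closed ensemble from a largely open one.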

We next analyzed two mutants from the Closed library (VFIT and VFIY). 19F NMR spectra at 278 K for both mutants showed a narrow peak centered at approximately ‒60.2 ppm that is similar to that of WT (Fig. 4). Spectra of these variants also showed another broad peak centered at approximately ‒61.2 ppm that is similar to the HEX peak characteristic of the closed conformation. The relative intensity of the two peaks showed similar temperature dependence, with an increase in the relative intensity of the WT-like peak as temperature was increased to 308 K. Peak deconvolution and integration (Supplementary Figs. 13–14 and Supplementary Table 8) were used to evaluate the population of the two states, and confirmed that at low temperatures both VFIT and VFIY favor the state resembling that adopted by the closed HEX mutant (T < 288 K or 283 K, respectively). However, unlike HEX, at higher temperatures the alternate conformation with the WT-like peak becomes the favored state (Table 2). Interestingly, ΔE values for these mutants were smaller in magnitude than that calculated for HEX, supporting the predictive nature of the calculated energy differences between open and closed states for these sequences.
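The temperature at which the favored state switches can be estimated from a van’t Hoff fit. The sketch below uses the linear form ln K = –ΔH/(RT) + ΔS/R, which ignores the heat-capacity term responsible for curvature, and it is run on synthetic data: the 5 K temperature steps, the ΔH value, and the function name are hypothetical and are not taken from the measurements.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def vant_hoff_fit(temps_k, ln_k):
    """Linear van't Hoff fit; returns (dH in kcal/mol, dS in kcal/mol/K)."""
    inv_t = 1.0 / np.asarray(temps_k, dtype=float)
    slope, intercept = np.polyfit(inv_t, np.asarray(ln_k, dtype=float), 1)
    return -slope * R_KCAL, intercept * R_KCAL

# Synthetic two-state system (hypothetical parameters) constructed so
# that closed and open states are equally populated at 288 K.
temps = np.array([278.0, 283.0, 288.0, 293.0, 298.0, 303.0, 308.0])
true_dh = -20.0                 # kcal/mol, closed relative to open
true_ds = true_dh / 288.0       # makes dG = 0 at 288 K
ln_k = -true_dh / (R_KCAL * temps) + true_ds / R_KCAL

fit_dh, fit_ds = vant_hoff_fit(temps, ln_k)
crossover_k = fit_dh / fit_ds   # temperature where dG = dH - T*dS = 0
```

With an exothermic closing transition (negative ΔH and ΔS), the closed state is favored below the crossover temperature and the open state above it, matching the qualitative behavior described for VFIT and VFIY.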

To determine if these differences in the temperature dependence of exchange could be observed in the crystal state, we also solved the maleate-free structures of WT, HEX, and VFIT at 303 K (Supplementary Table 9 and Supplementary Fig. 9), and calculated isomorphous difference density maps by subtracting electron density at 278 K (Supplementary Fig. 15). Comparing VFIT data obtained at 278 K and 303 K results in substantial difference density throughout chain A, the chain that opens when WT crystals are soaked to remove maleate. By contrast, similar comparisons for WT and HEX showed relatively little difference density. This analysis confirms that VFIT undergoes larger local conformational changes than either HEX or WT when the temperature of the crystal is increased to 303 K, in agreement with our van’t Hoff analysis of NMR data (Table 2, Supplementary Table 8). The agreement between temperature-dependent X-ray crystallography and NMR data provides strong evidence that the conformational exchange detected in the NMR experiments reflects a dynamic equilibrium between closed and open conformations, with mutations that favor the closed conformation also increasing selectivity toward l-phenylalanine.

We next analyzed two mutants designed to favor the open conformation in their ligand-free forms. OpenLow mutant VFCS showed 19F NMR spectra that were very similar to those of WT within the tested temperature range (Fig. 4b and Supplementary Fig. 13), consistent with its preference for the open conformation (Fig. 3 and Table 2). However, OpenHigh mutant AIFS gave rise to spectra that were distinct from those of all other variants (Fig. 4b), but could be deconvoluted to two exchanging peaks (Supplementary Fig. 13), to allow estimation of populations (Supplementary Table 8 and Supplementary Fig. 14). We postulate that those peaks correspond to alternate open conformations distinct from the one sampled by the other variants. This hypothesis is supported by our observations that ligand-free AIFS is open at low temperature (Fig. 3) but contains disordered segments around the Pro12–Leu19 helix and the loop containing three of the four designed positions (Supplementary Fig. 10), which are located close to the Phe217 position where the 19F label was introduced. The observed heterogeneity could therefore correspond to a mixture of these alternate open conformations. Additional support that the exchange detected in AIFS differed from that of HEX and closed mutants was provided by the van’t Hoff analysis, which showed no curvature for AIFS unlike for the closed mutants (Supplementary Fig. 14), suggesting that the OpenHigh variant does not undergo the open/closed conformational transition within this temperature range.

Discussion

Here, we successfully remodeled the conformational landscape of an enzyme via targeted alterations to the equilibrium between two distinct conformational states related by a hinge-bending motion. The resulting equilibrium shift promoted activity toward a non-native substrate, leading to a selectivity switch of up to 1900-fold. Our approach should be applicable to the redesign of other enzymes where alterations to known conformational equilibria, including those mediated by hinge-bending motions, are expected or hypothesized to increase catalytic efficiency7,30 or alter substrate selectivity31. As many enzymes undergo hinge-mediated domain motions during their catalytic cycles32, the multistate design approach presented here should be straightforward to implement for such enzymes. Given the ability of our multistate design procedure to distinguish between closed and open states whose free energy difference is on the order of a single hydrogen bond, our approach could, in principle, also be applied to preferentially stabilize catalytically competent substates involving more subtle structural changes, such as backbone carbonyl flips33 or side-chain rotations34. In such cases, it will be important to have structural information available for the target states. This methodology could therefore help to tailor catalytic efficiency or substrate selectivity by mimicking, in silico, the processes of evolution that harness altered conformational equilibria to tune function7,35,36.

The predictive capacity of our multistate design framework could only be achieved by evaluating the energy of sequences on multiple conformational states. For example, the VFCS variant that prefers the open state is predicted to be more stable on the closed state than Closed library mutants VFIT and VFIY (Table 2), and more stable on the open state than AIFS even though it is less open than this variant (Fig. 3). Furthermore, there were no obvious trends in designed mutations that could explain their effect on the conformational landscape, as none of these introduced bulkier or smaller amino acids at all or specific residue positions to cause or alleviate steric clashes in the open or closed conformations so as to shift the equilibrium towards one of these states, which has been the approach others have used to shift conformational equilibria37. Thus, subtle effects of mutation combinations on the relative stability of each conformational state were likely responsible for the observed preference of mutants for the open or closed conformations.

Our results demonstrate the utility of multistate design, with ΔE values calculated from ensemble energies of open and closed states, for the targeted alteration of subtle conformational equilibria, an approach that represents a useful alternative to heuristic methods that others have used to tune the relative stability of protein conformational states38. Extending this concept, we envision that de novo design of artificial enzymes with native-like catalytic efficiency and selectivity for complex multistep chemical transformations will require a holistic approach where every conformational state and/or substate required to stabilize reaction intermediates and transition states is explicitly modeled, and their relative energies optimized. The multistate design method for conformational landscape remodeling presented here opens the door to altering this common type of enzyme conformational equilibrium to facilitate the creation of designer biocatalysts with tailored functionality.

Methods

Structure preparation and ensemble generation

Crystal structures of wild-type Escherichia coli AAT in its internal aldimine form (PDB ID: 1ARS12) or complexed with N-phosphopyridoxyl-l-glutamic acid (PDB ID: 1X2815) were used to model the open and closed states, respectively. To eliminate biases that could arise during ensemble generation from the presence of substrate bound in the active site, we deleted coordinates for the N-phosphopyridoxyl-l-glutamic acid and catalytic K246 residue in the 1X28 structure and replaced them with coordinates for K246 and the pyridoxal 5′-phosphate (PLP) cofactor extracted from the 1ARS structure. These structures were then prepared for ensemble generation using the Molecular Operating Environment (MOE) software39. Hydrogens were added with the Protonate3D utility and manually adjusted to ensure that the protonation states of PLP and K246 were consistent with the aminotransferase catalytic mechanism40. The resulting structures were then solvated in a rectangular box of water with counter ions (Na+ and Cl–) under periodic boundary conditions with a box cut-off of 6 Å, and energy minimized by conjugate gradient energy minimization to a root mean square gradient <7 kcal mol–1 Å–1 using the AMBER99 force field41 with a combined explicit solvent and implicit reaction field solvent model set up using the MOE software package. These structures were used as input templates to generate backbone ensembles with the PertMin algorithm18,19. Briefly, two 50-member PertMin ensembles were created by randomly perturbing the coordinates of all heavy atoms of the two prepared AAT structures by ±0.001 Å along each Cartesian coordinate axis, and energy minimizing them using a truncated Newton42 minimization algorithm for 100 iterations. The “Open” and “Closed” ensembles thus obtained displayed diversities (i.e., average backbone root mean square deviations between pairs of ensemble members) of 0.29 ± 0.03 Å and 0.32 ± 0.03 Å, respectively, and backbone root mean square deviations from the starting structure of 0.51 ± 0.02 Å.
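The perturbation step and the diversity metric described above can be illustrated with a minimal Python sketch. This is not the PertMin implementation: it assumes that each coordinate is shifted by exactly +0.001 or –0.001 Å (the text could also be read as a shift drawn from within that range), it computes root mean square deviations without superposition, and the function names perturb, rmsd, and diversity are hypothetical. The energy minimization that follows each perturbation is omitted.

```python
import itertools
import math
import random


def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) tuples, no superposition."""
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def perturb(coords, magnitude=0.001, rng=random):
    """Shift every Cartesian coordinate by +magnitude or -magnitude (assumed reading)."""
    return [tuple(c + rng.choice((-magnitude, magnitude)) for c in atom)
            for atom in coords]


def diversity(ensemble):
    """Mean and sample standard deviation of RMSD over all pairs of members.

    Requires at least three ensemble members.
    """
    values = [rmsd(a, b) for a, b in itertools.combinations(ensemble, 2)]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, math.sqrt(variance)
```

Under this reading, a perturbed structure differs from its template by only 0.001 × √3 Å, so the ensemble diversities reported above arise from the subsequent energy minimization rather than from the perturbation itself.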

Computational protein design

All calculations were performed using the Phoenix protein design software21,43 with the fast and accurate side-chain topology and energy refinement (FASTER) algorithm44 for sequence optimization. The 2002 backbone-dependent Dunbrack rotamer library45 with expansions of ±1 standard deviation around χ1 and χ2 was used to provide side-chain conformations of AAT residues to be threaded onto each fixed backbone template. Sequences were scored using the Phoenix energy function, a five-term potential energy function consisting of a Lennard-Jones 12–6 van der Waals term from the Dreiding II force field46 with atomic radii scaled by 0.9, a direction-dependent hydrogen bond term with a well depth of 8.0 kcal mol–1 and an equilibrium donor-acceptor distance of 2.8 Å47, an electrostatic energy term modeled using Coulomb’s law with a distance-dependent dielectric of 10, an occlusion-based solvation penalty term21, and a secondary structural propensity term48. Side-chain rotamers of residues 35, 37, 43, and 64 were optimized on each backbone template using all proteinogenic amino acids with the exception of proline. Side-chain rotamers of residues within 5 Å of the designed residues were also optimized but their identities were not changed. The searched sequence space thus consisted of 130,321 (19^4) sequences, resulting in >13 million individual sequence energies (130,321 sequences × 50 backbones × 2 ensembles).

Boltzmann-weighted average potential energies at 300 K were computed for each sequence on each ensemble, yielding energy values for the Closed and Open states (Eclosed and Eopen). Energy differences between conformational states (ΔE = Eclosed – Eopen) were then computed for use in library design. To avoid selecting sequences that achieve a favorable ΔE (e.g., high, low, or high absolute ΔE values for Closed, OpenLow, or OpenHigh libraries, respectively) only through unfavorable Eclosed and/or Eopen values, and which would therefore be expected to be unstable, sequences whose Eclosed and/or Eopen value fell outside of the 75th percentile were discarded.
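The scoring steps in this paragraph can be sketched as follows. This is an illustrative sketch, not the Phoenix code. It assumes that lower energies are more favorable, that "outside of the 75th percentile" means an energy above the 75th percentile of the respective distribution, and that percentiles are taken by linear interpolation. The function names and the dictionary layout (sequence mapped to a list of per-backbone energies in kcal mol–1) are hypothetical.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal mol^-1 K^-1


def boltzmann_average(energies, temperature=300.0):
    """Boltzmann-weighted mean of per-backbone energies for one ensemble."""
    kt = K_B * temperature
    e_min = min(energies)  # shift energies so the exponentials cannot overflow
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def percentile(values, q):
    """q-th percentile by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def score_sequences(energies_closed, energies_open, cutoff=75.0):
    """Return {sequence: (E_closed, E_open, dE)} for sequences passing the filter."""
    e_closed = {s: boltzmann_average(v) for s, v in energies_closed.items()}
    e_open = {s: boltzmann_average(v) for s, v in energies_open.items()}
    max_closed = percentile(list(e_closed.values()), cutoff)
    max_open = percentile(list(e_open.values()), cutoff)
    return {
        s: (e_closed[s], e_open[s], e_closed[s] - e_open[s])  # dE = E_closed - E_open
        for s in e_closed
        if e_closed[s] <= max_closed and e_open[s] <= max_open
    }
```

Because the filter is applied to Eclosed and Eopen separately, a sequence is retained only if it is acceptably stable on both ensembles, regardless of how favorable its ΔE is.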

Library design

Library design was performed with the CLEARSS algorithm20 using as input the ΔE values obtained as described above. For a specific library size configuration, which is the specific number of amino acids at each position in the protein (e.g., 4 amino acids at position 1, 3 amino acids at position 2, etc.), the highest probability set of amino acids at each position was included in the library. To identify the optimal library size configuration, all configurations that lead to a combinatorial library of the target size (in this case, 20 ± 4 sequences) were scored by taking the sum of all partition functions of the chosen amino acid sets over all positions, and the highest scoring library was selected. Eclosed, Eopen, and ΔE values for mutants from each library are reported in Supplementary Table 2.
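One possible reading of this procedure is sketched below. It is not the CLEARSS implementation: the per-position partition functions, the value of kT (0.596 kcal mol–1, i.e., 300 K), the convention that lower input scores are better (so ΔE would be negated for a library that favors high ΔE), and the function names are all assumptions made for illustration.

```python
import itertools
import math


def position_partition_functions(scores, kt=0.596):
    """z[pos][aa]: summed Boltzmann factors of all sequences carrying aa at pos.

    scores maps each sequence (a string) to a score where lower is better.
    """
    best = min(scores.values())
    n_pos = len(next(iter(scores)))
    z = [dict() for _ in range(n_pos)]
    for seq, score in scores.items():
        weight = math.exp(-(score - best) / kt)
        for pos, aa in enumerate(seq):
            z[pos][aa] = z[pos].get(aa, 0.0) + weight
    return z


def design_library(scores, target=20, tolerance=4, kt=0.596):
    """Pick the size configuration whose top-ranked amino acid sets score highest."""
    z = position_partition_functions(scores, kt)
    # Amino acids at each position, ranked from most to least probable.
    ranked = [sorted(zp, key=zp.get, reverse=True) for zp in z]
    best_score, best_library = -1.0, None
    for config in itertools.product(*(range(1, len(r) + 1) for r in ranked)):
        if abs(math.prod(config) - target) > tolerance:
            continue  # combinatorial library size outside target +/- tolerance
        library = [r[:n] for r, n in zip(ranked, config)]
        score = sum(z[p][aa] for p, aas in enumerate(library) for aa in aas)
        if score > best_score:
            best_score, best_library = score, library
    return best_library
```

With four designed positions and 19 amino acids per position, the exhaustive loop visits 19^4 size configurations, which remains tractable.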

Chemicals

All reagents used were of the highest available purity. Synthetic oligonucleotides were purchased from Eurofins MWG Operon. Restriction enzymes and DNA-modifying enzymes were purchased from New England Biolabs. Ni-NTA agarose resin was purchased from Bio-Rad Laboratories. All aqueous solutions were prepared using water purified with a Barnstead Nanopure Diamond system.

Mutagenesis

The wild-type E. coli AAT gene (Uniprot ID: P00509) with an N-terminal His-tag cloned into plasmid pET-45b (Novagen) via the NcoI/PacI restriction sites49 was a generous gift from Michael D. Toney (University of California, Davis). Mutations were introduced into the AAT gene by overlap extension mutagenesis50 using VentR DNA Polymerase. Briefly, external primers containing NdeI or BamHI restriction sites were used in combination with sets of complementary pairs of oligonucleotides containing mutated codons (individual codons for HEX and single point mutants, codon mixtures for the Closed, OpenLow, and OpenHigh libraries) in individual polymerase chain reactions (PCRs). The resulting overlapping fragments were gel-purified (Omega Biotek) and recombined by overlap extension PCR. The resulting amplicons were digested with NdeI/BamHI, gel-purified, and ligated into the pET-11a expression vector (Novagen) with T4 DNA ligase. pBAD vectors (Invitrogen) harboring selected AAT genes flanked by NcoI/EcoRI restriction sites were prepared using a similar procedure. All constructs were verified by sequencing the entire open reading frame. Amino-acid sequences of all AAT variants are listed in Supplementary Table 10, and a multiple sequence alignment is shown in Supplementary Fig. 16.

Preparation of clarified cell lysates

DNA libraries prepared as described above were transformed into chemically competent E. coli BL21-Gold (DE3) cells (Agilent). Colonies (180 per library) were picked into individual wells of V96 MicroWell polypropylene plates (Nunc) containing 300 μL of lysogeny broth (LB) supplemented with 100 µg mL–1 ampicillin and 10% glycerol. The plates were covered with a sterile breathable rayon membrane (VWR) and incubated overnight at 37 °C with shaking. After incubation, these mother plates were used to inoculate sterile Nunc V96 MicroWell polypropylene plates (“daughter” plates) containing 300 μL per well of Overnight Express Instant TB medium (Novagen) supplemented with ampicillin. Daughter plates were sealed with breathable membranes and incubated overnight (37 °C, 250 rpm). After incubation, cells were harvested by centrifugation (3000×g, 30 min, 4 °C) and pellets were washed twice with phosphate-buffered saline (pH 7.4). Washed cell pellets were resuspended in lysis buffer (100 mM potassium phosphate buffer pH 8.0 containing 1× Bug Buster Protein Extraction Reagent [Novagen], 5 U mL–1 Benzonase Nuclease [EMD], and 1 mg mL–1 lysozyme). Clarified lysates were collected following centrifugation and stored at 4 °C until used in the screening assay.

Library screening

All assays were performed in 200-µL reactions at 37 °C in 100 mM potassium phosphate buffer (pH 8.0). The standard reaction mixture contained final concentrations of either 3 or 40 mM l-phenylalanine, 16 µM PLP, 0.2 mM α-ketoglutarate, 1 U of glutamate dehydrogenase (GDH) from bovine liver (Sigma), and 5 mM NAD+. Plates containing the standard reaction mixture were incubated at 37 °C for 5 min prior to initiation of the reaction by addition of 10 µL of clarified cell lysates prepared as described above. Enzyme reactions were monitored by measuring absorbance of NADH at 340 nm every 12 sec for 30 or 60 min in individual wells of 96-well plates (Greiner Bio-One) using a SpectraMax 384 Plus plate reader (Molecular Devices). The four or five most active variants from each library were selected for further characterization.

Protein expression and purification

Selected mutants were expressed and purified as described by Mironov et al.51. Briefly, E. coli BL21-Gold (DE3) cells harboring expression vectors containing aminotransferase genes were grown at 37 °C in 500 mL LB medium supplemented with 100 μg mL–1 ampicillin until they reached an OD600 of 0.6. Isopropyl β-d-1-thiogalactopyranoside (1 mM) was added to the flasks to induce protein expression, followed by shaking overnight at 16 °C. Cells were harvested by centrifugation, resuspended in 10 mL lysis buffer (5 mM imidazole in 100 mM potassium phosphate buffer, pH 8.0), and lysed with an EmulsiFlex-B15 cell disruptor (Avestin). Proteins were purified by immobilized metal affinity chromatography using Ni–NTA agarose pre-equilibrated with lysis buffer in individual Econo-Pac gravity-flow columns (Bio-Rad). Columns were washed twice, first with 10 mM imidazole in 100 mM potassium phosphate buffer (pH 8.0), and then with the same buffer containing 20 mM imidazole. Bound proteins were eluted with 250 mM imidazole in 100 mM potassium phosphate buffer (pH 8.0) and exchanged into 100 mM sodium phosphate buffer (pH 8.0) using Econo-Pac 10DG desalting pre-packed gravity flow columns (Bio-Rad). For crystallography, proteins were further purified by gel filtration in 20 mM potassium phosphate buffer (pH 7.5) using an ENrich SEC 650 size-exclusion chromatography column (Bio-Rad). Purified samples were concentrated using Microsep Advance 10 K centrifugal devices (Pall) to a final concentration of 170 μM. Protein concentrations were quantified using a modified version of the Bradford assay, where the calibration curve is constructed as a plot of the ratio of the absorbance measurements at 590 and 450 nm versus concentration52.

Steady-state kinetics

To measure steady-state kinetics for the α-ketoglutarate acceptor substrate, assays were performed by varying the α-ketoglutarate concentration from 0.002 to 20 mM in the presence of 10–20 mM l-phenylalanine or l-aspartate (Supplementary Table 4), 5 mM NAD+, 16 µM PLP, 1 U GDH, and approximately 10 mU of aminotransferase in 100 mM potassium phosphate buffer (pH 8, 37 °C). To measure steady-state kinetics for the l-aspartate and l-phenylalanine donor substrates, assays were performed by varying the amino-acid concentration from 0.002 to 40 mM in the presence of 0.1563–1.25 mM α-ketoglutarate (Supplementary Table 3), 5 mM NAD+, 16 µM PLP, 1 U GDH, and approximately 10 mU of aminotransferase in 100 mM potassium phosphate buffer (pH 8, 37 °C). The pH of all reaction mixtures was adjusted to 8.0 prior to initiation of the reaction. Enzyme reactions were monitored by measuring absorbance of NADH at 340 nm (ε = 6220 M–1 cm–1) every 12 sec for 30 or 60 min in individual wells of 96-well plates (Greiner Bio-One) using a SpectraMax 384 Plus plate reader (Molecular Devices). Path lengths for each well were calculated ratiometrically using the difference in absorbance of potassium phosphate buffer at 900 and 998 nm. Linear phases of kinetic traces were used to measure initial reaction rates. Initial reaction rates at different substrate concentrations were fit to the Michaelis-Menten equation using Python ver. 2.7.15 with the scipy.optimize.curve_fit function (scipy ver. 1.1.0). For mutant and substrate combinations resulting in substrate inhibition, fitting of the kinetic data was done with a rate equation that takes into account this type of inhibition: v0 = (vmax[S])/(KM + [S] + [S]^2/Ki).
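The two rate equations and the conversion of an absorbance slope into an initial rate can be written as a short Python sketch. The function names are hypothetical, and the conversion assumes a simple Beer-Lambert relationship using the NADH extinction coefficient given above and the ratiometrically determined path length of the well.

```python
EPSILON_NADH = 6220.0  # M^-1 cm^-1 at 340 nm


def initial_rate(slope_abs_per_s, path_cm):
    """Convert an A340 slope (absorbance units per second) to a rate in M s^-1."""
    return slope_abs_per_s / (EPSILON_NADH * path_cm)


def michaelis_menten(s, vmax, km):
    """v0 = vmax[S] / (KM + [S])."""
    return vmax * s / (km + s)


def substrate_inhibition(s, vmax, km, ki):
    """v0 = vmax[S] / (KM + [S] + [S]^2 / Ki)."""
    return vmax * s / (km + s + s ** 2 / ki)
```

Both model functions take the independent variable as their first argument followed by the parameters, which is the calling convention that scipy.optimize.curve_fit expects, so they could be passed to it directly together with arrays of substrate concentrations and measured initial rates.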

Preparation of 15N labeled proteins

Proteins for NMR spectroscopy were expressed using M9 minimal expression medium supplemented with 1 g L–1 15N-labeled ammonium chloride (15NH4Cl) for isotopic enrichment. Cultures were grown at 37 °C with shaking to an optical density at 600 nm of approximately 0.6, after which protein expression was initiated with 1 mM isopropyl β-d-1-thiogalactopyranoside. Following overnight incubation at 16 °C with shaking (275 rpm), cells were harvested by centrifugation and lysed with an EmulsiFlex-B15 cell disruptor (Avestin). Proteins were purified by immobilized metal affinity chromatography as described above, which was followed by gel filtration in 10 mM sodium phosphate buffer (pH 6.0) using an ENrich SEC 650 size exclusion chromatography column (Bio-Rad). Purified samples were concentrated using Amicon Ultracel-10K centrifugal filter units (EMD Millipore).

Preparation of 19F site-specific labeled proteins

Site-specific incorporation of 4-trifluoromethyl-l-phenylalanine into selected AAT variants was performed with a protocol adapted from Hammill et al.27. Briefly, chemically competent E. coli DH10B cells (Thermo Fisher) harboring the pDule-4-tfmF A65V S158A plasmid (Addgene plasmid #85484)53, which encodes an orthogonal aminoacyl-tRNA synthetase and cognate amber suppressing tRNA for site-specific incorporation of 4-trifluoromethyl-l-phenylalanine, were transformed with pBAD vectors (Invitrogen) containing mutated genes of AAT variants in which an amber stop codon was introduced at residue position 217 to allow direct incorporation of the fluorinated amino acid. Labeled AAT variants were expressed in 2 L of LB containing 100 μg mL–1 ampicillin and 15 μg mL–1 tetracycline at 37 °C. After cells were grown for 1 h, 468 mg of 4-trifluoromethyl-l-phenylalanine (SynQuest Laboratories) was added to the flask to give a final concentration of 1 mM. Once the cell culture reached an OD600 of 0.6, l-arabinose was added to a final concentration of 0.2% to induce protein expression. Following overnight incubation at 16 °C with shaking, cells were harvested by centrifugation, resuspended in 10 mL lysis buffer, and lysed with an EmulsiFlex-B15 cell disruptor (Avestin). Proteins were then extracted and purified by immobilized metal affinity chromatography, as described above. Elution fractions containing the aminotransferases were concentrated through centrifugation (Pall Microsep Advance Centrifugal Device 10 K), resuspended in 10 mM potassium phosphate buffer pH 6.0 and further purified through size exclusion chromatography, as described above. Elution fractions containing the purified aminotransferase were combined and concentrated through centrifugation.

1H-15N heteronuclear single quantum coherence spectroscopy

15N-labeled AAT samples for NMR (wild type, HEX, VFIY, and single point mutants Trp124Phe, Trp194Phe, Trp307Phe) consisted of 0.2–0.6 mM protein in 10 mM sodium phosphate buffer (pH 7.4), 10 μM EDTA, 0.02% sodium azide, and 10% D2O. HSQC experiments were performed on a Bruker AVANCEIII HD 600 MHz spectrometer equipped with a triple resonance cryoprobe. 128 scans were accumulated to acquire each spectrum.

19F nuclear magnetic resonance spectroscopy

Protein samples used for 19F NMR analysis were diluted to concentrations of 75–280 μM in 500 μL of 10 mM potassium phosphate buffer (pH 6.0), 100 μM EDTA, 0.02% sodium azide, and 10% D2O. All 19F NMR spectra were acquired with a Bruker Avance 500 MHz spectrometer with 10 sec acquisition delays. 512 scans were accumulated for one-dimensional 19F chemical shift analysis at 5, 10, 15, 20, 25, 30, and 35 °C. Data were processed with an exponential window function (20 Hz line-broadening) using TopSpin 3.6.1 (Bruker Biospin). The 19F NMR spectra of fluorinated HEX, VFIT, and VFIY showed two resonances, which we interpret to correspond to two distinct conformations that do not rapidly exchange with each other on the 19F-NMR timescale. All spectra were deconvoluted into two Lorentzian curves using Python v3.7.9 with the LMFIT package for nonlinear least-squares minimization and curve-fitting54. The resulting two curves were integrated to measure the relative populations of each conformational state. The ratio of the two populations was used to calculate equilibrium constants at each temperature (Supplementary Table 8), which were fit using the Solver function of Excel to the nonlinear van’t Hoff equation55 (Equation 1), where enthalpy and entropy are not assumed to be temperature-independent (i.e. ΔCP ≠ 0), to extract thermodynamic parameters of the equilibrium. When fitting to the linear van’t Hoff equation, ΔCP was simply set to zero.

$$\ln\left(K_{eq}\right) = -\frac{\Delta H^{\circ}_{ref}}{R}\left(\frac{1}{T}\right) + \frac{\Delta S^{\circ}_{ref}}{R} - \frac{\Delta C_{P}}{R}\left[\left(\frac{T - T_{ref}}{T}\right) + \ln\left(\frac{T_{ref}}{T}\right)\right]$$

Equation 1. Nonlinear van’t Hoff equation. Tref is an arbitrary reference temperature (set to 298 K), ΔH°ref and ΔS°ref are ΔH°(T) and ΔS°(T) evaluated at Tref, respectively.
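As an illustration of Equation 1, the following Python sketch evaluates the model for a given set of thermodynamic parameters. It only evaluates the equation and does not reproduce the Excel Solver fit. The function names are hypothetical, energies are assumed to be in kcal mol–1 (with entropy and heat capacity in kcal mol–1 K–1), and the choice of which resonance area goes in the numerator of the equilibrium constant is an assumption.

```python
import math

R = 1.9872e-3  # gas constant in kcal mol^-1 K^-1


def equilibrium_constant(area_state_a, area_state_b):
    """K_eq as the ratio of integrated peak areas (state B over state A, assumed)."""
    return area_state_b / area_state_a


def ln_keq(t, dh_ref, ds_ref, dcp=0.0, t_ref=298.0):
    """Nonlinear van't Hoff equation (Equation 1); dcp=0 gives the linear form."""
    return (-dh_ref / (R * t)
            + ds_ref / R
            - (dcp / R) * ((t - t_ref) / t + math.log(t_ref / t)))
```

At T = Tref the bracketed term vanishes, so ln(Keq) reduces to –(ΔH°ref – TrefΔS°ref)/(RTref), which provides a quick consistency check on fitted parameters.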

Crystallization

Purified AAT variants were prepared in 20 mM potassium phosphate buffer (pH 7.5) with 2 mM EDTA and 10 µM PLP to a final concentration indicated in Supplementary Table 5. Maleic acid was dissolved in water to make a 1 M stock solution, and then added to each protein solution yielding a final maleate concentration of 20 mM. For each enzyme variant, we carried out initial crystallization trials in 15-well hanging drop format using EasyXtal crystallization plates (NeXtal) and a crystallization screen that was designed to explore parameter space around the crystallization conditions reported by Islam et al.15. Crystallization drops were prepared by mixing 1 µL of protein solution with 1 µL of the mother liquor and sealing the drop inside a reservoir containing 500 µL of mother liquor. The mother liquor solutions contained ammonium sulfate as a precipitant and the specific growth conditions that yielded the crystals used for X-ray data collection are provided in Supplementary Table 5. For all six enzymes, a microseeding protocol was required to obtain high-quality crystals. Microseeds were prepared by crushing initial crystals in their mother liquor using a glass rod, and were subsequently streaked into the crystallization drops using a cat whisker. Because maleate was required for crystallization, ligand-free structures were obtained by soaking the crystal in new drops of 1 μL mother liquor containing 10 μM PLP but no maleate, allowing maleate in the crystal to diffuse out. Each crystal was treated in this way six successive times, 12 hours apart, to achieve removal of bound maleate (Supplementary Fig. 9).

X-ray data collection and processing

Prior to X-ray data collection, crystals were mounted on polymer MicroMounts (MiTeGen) and sealed using a MicroRT tubing kit (MiTeGen). Single-crystal X-ray diffraction data were collected on beamline 8.3.1 at the Advanced Light Source. The beamline was equipped with a Pilatus3 S 6M detector and was operated at a photon energy of 11111 eV. Crystals were maintained at either 100 K, 278 K, or 303 K throughout the course of data collection. Each data set was collected using a total X-ray dose of 50–100 kGy and covered a 180° wedge of reciprocal space. Multiple data sets were collected for each enzyme variant either from different crystals, or if their size permitted, from unique regions of single large crystals.

X-ray data were processed with the Xia2 program56, which performed indexing and integration with DIALS57, followed by scaling with DIALS.SCALE58. The resolution cut-off was taken where the CC1/2 and <I/σI> values for the intensities fell to approximately 0.5 and 1.0 respectively.

Structure determination

We obtained initial phase information for calculation of electron density maps by molecular replacement using the program Phaser59, as implemented in v1.17.1.3660 of the PHENIX suite60, with the crystal structure of wild-type Escherichia coli AAT complexed with N-phosphopyridoxyl-l-glutamic acid (PDB ID: 1X2815) as search model. All AAT variants crystallized in the same crystal form, containing two chains of the molecule in the crystallographic asymmetric unit. Next, we rebuilt the initial model using the electron density maps calculated from molecular replacement. We then performed additional, iterative refinement of atomic positions, individual atomic displacement parameters (B-factors), and occupancies using a translation-libration-screw (TLS) model, a riding hydrogen model, and automatic weight optimization, until the model reached convergence. All model building was performed using Coot 0.8.9.261 and refinement steps were performed with phenix.refine (v1.17.1.3660) within the PHENIX suite60,61. Restraints for PLP in its internal aldimine form were generated using phenix.elbow62, starting from coordinates available in the Protein Data Bank63 (PDB ligand ID: PLP), and manually edited to ensure planarity of the pyridine ring and correct geometry of the imine bond between PLP and K246 (restraints available as Supplementary Data Files). Further information regarding model building and refinement, as well as PDB accession codes for the final models, is presented in Supplementary Table 6 and Supplementary Table 9.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.