Kemp elimination catalysts by computational enzyme design

Röthlisberger, Daniela; Khersonsky, Olga; Wollacott, Andrew M.; Jiang, Lin; DeChancie, Jason; Betker, Jamie; Gallaher, Jasmine L.; Althoff, Eric A.; Zanghellini, Alexandre; Dym, Orly; Albeck, Shira; Houk, Kendall N.; Tawfik, Dan S.; Baker, David

doi:10.1038/nature06879

Article
Published: 19 March 2008

Nature volume 453, pages 190–195 (2008)

Abstract

The design of new enzymes for reactions not catalysed by naturally occurring biocatalysts is a challenge for protein engineering and is a critical test of our understanding of enzyme catalysis. Here we describe the computational design of eight enzymes that use two different catalytic motifs to catalyse the Kemp elimination—a model reaction for proton transfer from carbon—with measured rate enhancements of up to 10⁵ and multiple turnovers. Mutational analysis confirms that catalysis depends on the computationally designed active sites, and a high-resolution crystal structure suggests that the designs have close to atomic accuracy. Application of in vitro evolution to enhance the computational designs produced a >200-fold increase in k_cat/K_m (k_cat/K_m of 2,600 M^-1s^-1 and k_cat/k_uncat of >10⁶). These results demonstrate the power of combining computational protein design with directed evolution for creating new enzymes, and we anticipate the creation of a wide range of useful new catalysts in the future.

Main

Naturally occurring enzymes are extraordinarily efficient catalysts¹. They bind their substrates in a well-defined active site with precisely aligned catalytic residues to form highly active and selective catalysts for a wide range of chemical reactions under mild conditions. Nevertheless, many important synthetic reactions lack a naturally occurring enzymatic counterpart. Hence, the design of stable enzymes with new catalytic activities is of great practical interest, with potential applications in biotechnology, biomedicine and industrial processes. Furthermore, the computational design of new enzymes provides a stringent test of our understanding of how naturally occurring enzymes work. In the past several years, there has been exciting progress in designing new biocatalysts^2,3.

Here we describe the use of our recently developed computational enzyme design methodology⁴ to create new enzyme catalysts for a reaction for which no naturally occurring enzyme exists: the Kemp elimination^5,6. The reaction, shown in Fig. 1a, has been extensively studied as an activated model system for understanding the catalysis of proton abstraction from carbon—a process that is normally restricted by high activation-energy barriers^7,8.

Figure 1: **Reaction scheme and catalytic motifs used in design.**

Computational design method

The first step in our protocol for designing new enzymes is to choose a catalytic mechanism and then to use quantum mechanical transition state calculations to create an idealized active site with protein functional groups positioned so as to maximize transition state stabilization (Fig. 1b). The key step for the Kemp elimination is deprotonation of a carbon by a general base. We chose two different catalytic bases for this purpose: first, the carboxyl group of an aspartate or glutamate side chain, and, second, the imidazole of a histidine positioned and polarized by the carboxyl group of an aspartate or glutamate (we refer to this combination as a His–Asp dyad). The two choices have complementary strengths and weaknesses. The advantage of the carboxylate is that it is likely to be in the basic (deprotonated) form, but partial desolvation of the charged group in an apolar environment (to increase its relatively weak basicity) could destabilize the protein and further desolvation by the substrate could oppose binding. Although histidine is a better general base than a carboxylate, it is necessary to regulate both its pK_a and its tautomeric state. Coupling the histidine with a base such as aspartate in a dyad serves to both position the histidine and increase its basicity. If the pK_a of histidine is raised too high, however, it can become doubly protonated, rendering it ineffective as a base.

For both the carboxylate- and histidine-based mechanisms, we included additional functional groups in the idealized active sites to further facilitate catalysis using both quantum mechanical and classical methods⁹. A hydrogen bond donor was used to stabilize the developing negative charge on the phenolic oxygen in the otherwise hydrophobic active site. Catalytic motifs lacking the H-bond donor were also tested, because the developing negative charge is relatively small in the transition state and can be easily solvated by water^9,10. For each choice of catalytic site composition, density functional theory quantum-mechanical methods^11,12,13 were used to optimize the placement and orientations of the catalytic groups around the transition state for maximal stabilization (see Methods). Finally, because stabilization of the transition state by charge delocalization is a key factor in catalysis of the Kemp elimination^5,6,7,10,14, we chose to stack aromatic amino acid side chains on the planar transition state (Fig. 1b) using idealized π-stacking geometries¹⁵.

We next used the RosettaMatch hashing algorithm⁴ to search for constellations of protein backbone positions capable of supporting these idealized active sites in a large set of stable protein scaffolds with ligand-binding pockets and high-resolution crystal structures. As described in the Methods, the His–Asp dyad required generalizing RosettaMatch to handle side chains, such as the Asp, for which the range of allowed positions is referenced to another catalytic side chain rather than to the transition state; this was accomplished by identifying, for each His rotamer in a scaffold, the set of Asp rotamers that can provide the supporting hydrogen bond. The scaffold set spans a broad range of protein folds, including TIM barrels, β-propellers, jelly rolls, Rossmann folds and lipocalins, amongst others (Supplementary Table 3). In a typical search, more than 100,000 possible realizations of the input idealized active site were found in the scaffold set. For each of these ‘matches’, gradient-based minimization¹⁶ was used to optimize the rigid body orientation of the transition state and the torsional degrees of freedom of the catalytic side chains to best satisfy all catalytic geometrical constraints. Subsequently, residues surrounding the transition state were redesigned both to maximize the stability of the active site conformation and the affinity to the transition state and to maintain protein stability using the Rosetta design methodology for proteins¹⁷ and small molecules¹⁸. Designs were screened for compatibility with substrate and product and were ranked on the basis of the catalytic geometry and the computed transition-state-binding energy.

A steady enrichment of the fraction of designs in the TIM barrel scaffold was observed throughout the enzyme design process. TIM barrel scaffolds represent 25% of the proteins in the input scaffold set, 43% of the initial matches, and 71% of the low-energy designs. Inspection of the designs suggests that the binding pockets in TIM barrel scaffolds were favoured because of the large number of take-off positions (all positions around the barrel pointing towards the cavity) for both the catalytic residues and the additional transition-state-binding residues optimized in the design process; the former favoured TIM barrel matches, and the latter favoured low-energy designs in TIM scaffolds. The TIM barrel is the most widespread and catalytically diverse fold in naturally occurring enzymes; our in silico design process seems to be drawn towards the same structural features as naturally occurring enzyme evolution.
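
The enrichment described above can be expressed as a ratio of each stage's TIM barrel fraction to the fraction in the input scaffold set. The following is a minimal sketch using only the three percentages quoted in the text; the function and variable names are illustrative and not part of the design software.

```python
# Fractions of TIM barrel scaffolds at each stage, as quoted in the text.
tim_fraction = {
    "input scaffold set": 0.25,
    "initial matches": 0.43,
    "low-energy designs": 0.71,
}


def fold_enrichment(fractions, baseline_key):
    """Divide each stage's fraction by the fraction at the baseline stage."""
    baseline = fractions[baseline_key]
    return {stage: value / baseline for stage, value in fractions.items()}


enrichment = fold_enrichment(tim_fraction, "input scaffold set")
for stage, factor in enrichment.items():
    print(f"{stage}: {factor:.2f}x relative to the input set")
```

On these numbers, TIM barrels are roughly 1.7-fold over-represented among the initial matches and roughly 2.8-fold among the low-energy designs.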

Experimental characterization

Following the active site design, a total of 59 designs in 17 different scaffolds were selected for experimental characterization. Out of the 59 designs, 39 use an Asp or Glu as the general base and 20 use a His–Asp or His–Glu dyad. Eight of the designs showed measurable activity in Kemp elimination assays in an initial activity screen (Table 1; see Supplementary Table 4 for sequence information and Methods for experimental details). For each of these eight designs, mutation of the catalytic base (to Ala or Gln/Asn) markedly decreased the activity or abolished catalysis completely, suggesting that the observed activity results from the designed active site (Table 1; for some examples, see Fig. 2a). The designs have k_cat/K_m values in the range of 6 to 160 M^-1 s^-1 (Table 1 and Fig. 2b); it was not possible to obtain saturation kinetics in all cases (for example, see KE10 (open squares) and KE61 (open triangles) in Fig. 2b) owing to low substrate solubility. Both catalytic motifs were used in active designs; of the two most active catalysts, which show a rate acceleration of roughly 10⁵ and a k_cat/K_m of about 100 M^-1 s^-1, one uses Glu as the base and the other uses the His–Asp dyad. All designs exhibited multiple turnovers (≥7), a prerequisite for efficient catalysis.

Table 1 Kinetic parameters of designed enzymes

Figure 2: **Kinetic characterization of designed catalysts.**

Models for these two most active designs are shown in Fig. 3. In the KE59 design (Fig. 3a), which is in a TIM barrel scaffold, Glu 231 is the catalytic base and Trp 110 facilitates charge delocalization by π-stacking to the transition state. Additionally, Leu 108, Ile 133, Ile 178, Val 159 and Ala 210 create a tightly packed hydrophobic pocket that envelops the non-polar substrate. The polar residues Ser 180 and Ser 211 provide hydrogen-bonding interactions with the nitro group of the transition state. Mutation of the catalytic base Glu 231 to Gln abolished catalytic activity (Table 1 and Fig. 2a, open triangles). Attempts to add a hydrogen bond donor to stabilize the negative charge developing at the phenolic oxygen through a Gly 131 to Ser mutation caused a ninefold reduction in k_cat/K_m (Table 1), perhaps owing to unfavourable electrostatic interactions between the oxygen atoms on the serine and substrate; this large effect suggests that the transition-state-binding site is quite well defined. The aromatic-rich pocket and carboxylate base are reminiscent of the active site of the Kemp catalytic antibody 34E4 (ref. 10).

Figure 3: **Computational design models of the two most active catalysts.**

The KE70 design (Fig. 3b) uses the His–Asp dyad mechanism. Asp 44 positions and polarizes His 16 to optimally deprotonate the substrate. Tyr 47 π-stacks above the transition state, and together with Ile 201, Ile 139, Val 167, Ala 18, Ala 102 and Trp 71 creates a tight hydrophobic pocket around the transition state. The active site is again in a TIM barrel scaffold with the His–Asp dyad near the bottom of the site. Mutation of the catalytic base His 16 to Ala abolished catalytic activity (Table 1 and Fig. 2a, filled triangles), whereas mutating Asp 44 of the catalytic dyad to Asn produced an approximately 2.5-fold reduction (Table 1 and Fig. 2a, filled squares). In another design using a His–Asp dyad as general base (KE71), the analogous Asp-to-Asn mutation reduced activity sixfold (Table 1) whereas the His-to-Ala mutation abolished catalysis (Table 1).

High-resolution structural information on designed proteins is essential to validate the accuracy of the design methodology. We were able to grow crystals and obtain a high-resolution structure of one of the early Glu-based designs, KE07 (see Supplementary Information for details). As shown in Fig. 4, the crystal structure and design model are virtually superimposable, with an active site (6.0 Å around the transition state) root mean squared deviation (r.m.s.d.) of 0.95 Å mostly reflecting modest side-chain rearrangements. The similarity between the design model and the crystal structure suggests that the active sites in our new enzymes resemble those in the corresponding design models. The subtle deviations in the backbone indicate loop regions in which explicitly modelling backbone flexibility may yield improved designs.
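
To make the active-site r.m.s.d. concrete, the sketch below computes a root mean squared deviation over only those atoms that lie within a cutoff of the transition state. It assumes the two structures are already superimposed and that atoms are listed in the same order; how atoms were actually selected and superimposed for the 0.95 Å value is not stated here, so the selection rule (distance to a single reference point) and the toy coordinates are assumptions.

```python
import math


def active_site_rmsd(model, crystal, ts_centre, cutoff=6.0):
    """RMSD over atoms whose model position lies within `cutoff` Å of ts_centre.

    `model` and `crystal` are equal-length lists of (x, y, z) tuples for the
    same atoms in the same order, assumed to be superimposed already.
    """
    selected = [
        (m, c) for m, c in zip(model, crystal)
        if math.dist(m, ts_centre) <= cutoff
    ]
    if not selected:
        raise ValueError("no atoms within the cutoff")
    squared = sum(math.dist(m, c) ** 2 for m, c in selected)
    return math.sqrt(squared / len(selected))


# Toy coordinates (hypothetical): the third atom is outside the 6 Å shell
# and therefore does not contribute, however far it has moved.
model_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
crystal_atoms = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (10.0, 5.0, 0.0)]
rmsd = active_site_rmsd(model_atoms, crystal_atoms, ts_centre=(0.0, 0.0, 0.0))
print(f"active-site r.m.s.d. = {rmsd:.3f} Å")
```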

Figure 4: **Comparison of the designed model of KE07 and the crystal structure.**

The crystal structure also revealed that Lys 222 makes a salt bridge to the catalytic Glu 101 in the absence of substrate, whereas in the designed model the ammonium of the lysine stabilizes the developing phenoxide in the transition state. Forming the productive transition state complex thus requires breaking of the salt bridge, and therefore elimination of the salt bridge in the unbound state would be expected to improve catalysis. We tested this prediction by substituting the lysine with an alanine, and this resulted in a 2.5-fold increase in k_cat/K_m (Table 1).

Directed evolution

In vitro evolution has been shown to markedly improve the stability, expression and activity of enzymes, and is currently the most widely used and successful approach for refining biocatalysts¹⁹. However, in vitro enzyme evolution generally requires a starting point with at least a low level of the desired activity, which is then optimized by repeated rounds of mutation and selection (for a notable exception, see ref. 20). We reasoned that in vitro evolution would be an excellent complement to our computational design efforts. The design calculations ensure that key catalytic functional groups are correctly positioned around the transition state, and, as demonstrated above, can generate active catalysts without requiring any starting activity. Thus, computational design can potentially provide excellent starting points for in vitro evolution. In contrast, the design process does not explicitly model configurational entropy changes, longer range second-shell interactions, and dynamics effects that can be important for efficient turnover; these shortcomings can potentially be remedied by directed evolution. Directed evolution can be valuable both in improving the designed catalysts and in stimulating improvements in the computational design methodology by shedding light on what is missing from the designs.

To investigate the extent to which in vitro evolution methods can improve computationally designed enzymes, we initiated evolution experiments on KE07—the early design for which the crystal structure was determined. Seven rounds of random mutagenesis and shuffling (also including synthetic oligonucleotides that expanded the diversity at selected residues), followed by screens in microtitre plates, yielded variants that had 4–8 mutations relative to KE07 and an improvement of >200-fold in k_cat/K_m (Table 2). Notably, the key aspects of the computational design, including the identities of the catalytic side chains, were not altered by the evolutionary process (indeed, mutating the catalytic base Glu 101 abolished the catalytic activity of both the designed template and its evolved variants; Table 2). Instead, the mutations were often seen in residues adjacent to designed positions (for example, Val 12, Ile 102, Gly 202), and thus provide subtle fine-tuning of the designed enzyme. Some mutations, such as Gly202Arg, are likely to increase the flexibility of regions neighbouring the active site. The hydrophobic residues Ile 7 and Ile 199 at the bottom of the active site were frequently mutated to polar or charged residues (the most common mutation being Ile7Asp), which may hold Lys 222 in position to stabilize the developing negative charge in the transition state while preventing interaction of Lys 222 with Glu 101. Consistent with this idea, the pK_a of the catalytic Glu 101 shifts from <4.5 to 5.9 in the evolved variant with the Ile7Asp mutation (for details, see Supplementary Information). Although the Lys222Ala mutation increases the activity of the original KE07, it significantly decreases the activity of the evolved variants, perhaps owing to the uncompensated additional negative charge.

Table 2 Kinetic parameters of KE07 variants

Conclusions

The marked increase in catalytic activity and in turnover (>1,000 catalytic cycles were observed for the evolved variants), achieved by screening a number of variants that is small by molecular evolution standards (800–1,600 clones per round), bodes well for future combinations of computational design and molecular evolution. In particular, the in vitro evolution of the most active of the computational designs, for example, KE59 or KE70, has the potential to yield highly active catalysts for the Kemp elimination reaction. We anticipate the successful use of the combination of computational design and molecular evolution that we have described here for a wide range of important reactions in the years to come.

The challenge of generating new biocatalysts has led to several successful experimental strategies^20,21,22. In particular, the Kemp elimination comprises a well-defined model for catalysis of proton transfer from carbon—a highly demanding reaction and a rate-determining step in numerous enzymes. It has therefore been the subject of several attempts to generate enzyme-mimics and models (such as catalytic antibodies²³, promiscuous protein catalysts²⁴ and enzyme-like polymers¹⁴). The catalytic parameters of the new enzymes described here are comparable to the most active catalysts of the Kemp elimination of 5-nitro-benzisoxazole described thus far, and provide further insights into the makings of an enzyme. Comparison with the catalytic antibodies²³ highlights the major shortcoming of many of the designs noted above—that is, their relatively weak binding of the substrate. Although the computational design methodology has the advantage of being able to explicitly place key catalytic residues, this may come at a cost of overall substrate and transition-state binding affinity. Consistently achieving high affinity to the transition state and high turnover numbers is a challenge that we are currently approaching by introducing scaffold backbone flexibility into the design process. This should enable us to create higher affinity binding sites formed by more precisely positioned constellations of binding and catalytic residues.

The computational methodology described here can be readily generalized to design catalysts for more complex multistep reactions²⁵. The combination of computational enzyme design to create the overall active site framework for catalysing a synthetic chemical reaction with molecular evolution to fine-tune and incorporate subtleties not yet modelled in the design methodology is a powerful route to create new enzyme catalysts for the very wide range of chemical reactions for which naturally occurring enzymes do not exist. Equally importantly, computational design provides a critical testing ground for evaluating and refining our understanding of how enzymes work.

Methods Summary

Computational design

Transition state geometries were computed at the B3LYP/6-31G(d) level for idealized active sites containing either a carboxylate or an imidazole-carboxylate dyad as the general base. Aromatic side chains were placed above and below the transition state using idealized π-stacking geometries¹⁵. A six-dimensional hashing procedure⁴ was applied to find transition state placements in a large set of protein scaffolds (Supplementary Table 3) that were consistent with the catalytic geometry. Residues surrounding the catalytic side chains and transition state were repacked and redesigned^17,18 to optimize steric, coulombic and hydrogen-bonding interactions with the transition state and associated catalytic residues.

Experimental characterization

The proteins were expressed in Escherichia coli BL21(DE3) using pET29b (Novagen) and purified over a Ni-NTA column (Qiagen). The proteins (1 μM to 10 μM) were assayed in 25 mM HEPES (pH 7.25) and 100 mM NaCl at 250 μM substrate concentration for the initial screening, and substrate dilutions from 1 mM to 11 μM were used for kinetic characterization. Kinetic parameters were determined in at least three independent measurements. Fitted K_m values above 1 mM (and their corresponding k_cat values) are necessarily approximate. Site-directed mutagenesis of catalytic residues and independent protein purifications by different protocols/laboratories were carried out to exclude possible contaminating enzymes (Supplementary Information).

In vitro evolution

Gene libraries of KE07 were created by random mutagenesis using error-prone PCR with ‘wobble’ base analogues dPTP and 8-oxo-dGTP²⁶ using the Genemorph PCR mutagenesis kit (Stratagene), and by DNA shuffling of the most active variants²⁷. In certain rounds, shuffling included the spiking of synthetic oligonucleotides that expanded the diversity at selected residues²⁸. In each round, the cleared lysates of 800–1,600 individual colonies were assayed for hydrolysis of 5-nitrobenzisoxazole (0.125 mM) by following product formation at 380 nm. The most active clones were sub-cloned and sequenced, and the encoding plasmids were used as templates for subsequent rounds of mutagenesis and screening.

Online Methods

Quantum mechanical transition state calculation

Quantum mechanical calculations using density functional theory with the B3LYP functional and the 6-31G(d) basis set^11,12 were used to locate transition structures (confirmed by vibrational frequency analysis) for the acetate- and imidazole/acetate-catalysed reactions in the gas phase. Lysine, serine, threonine and tyrosine functional groups were included in the calculations as hydrogen bond donors to stabilize the developing negative charge on the phenolic oxygen of the transition state. All calculations were carried out using Gaussian03 (ref. 13).

Aromatic side chains (Phe, Tyr and Trp) were also modelled to stabilize charge delocalization of the transition state and to provide favourable π-stacking interactions. These side chains were placed using idealized π-stacking geometries¹⁵ in a parallel configuration (4 Å separation) with the aromatic centre offset from the transition state rings by 1 Å. The aromatic groups were placed above either the five- or the six-membered ring and were allowed on both the top and the bottom faces of the transition state. Full rotation about the normal to the aromatic plane was permitted, allowing for variable Cβ–Cγ bond vector placement. The optimal catalytic geometry and the associated constraints for both reaction mechanisms are shown in Supplementary Fig. 1.
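
The stacking geometry described above (parallel rings, 4 Å separation, 1 Å lateral offset, both faces allowed) can be sketched as a small enumeration of candidate ring centres. This is an illustration only: the direction of the 1 Å offset is not specified in the text, so the sketch samples it on a circle, and the function name, the `n_directions` parameter and the example vectors are hypothetical. The `in_plane` vector is assumed to be perpendicular to `normal`.

```python
import math


def _unit(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def stacking_centres(ring_centre, normal, in_plane,
                     separation=4.0, offset=1.0, n_directions=4):
    """Candidate aromatic-ring centres on both faces of a planar ring.

    Each centre sits `separation` Å along the ring normal (top or bottom
    face) and is shifted `offset` Å in the ring plane along one of
    `n_directions` evenly spaced directions (an assumed sampling scheme).
    """
    n = _unit(normal)
    u = _unit(in_plane)
    v = _cross(n, u)  # second in-plane axis, perpendicular to n and u
    centres = []
    for face in (1.0, -1.0):
        for k in range(n_directions):
            theta = 2.0 * math.pi * k / n_directions
            shift = tuple(
                math.cos(theta) * u[i] + math.sin(theta) * v[i] for i in range(3)
            )
            centres.append(tuple(
                ring_centre[i] + face * separation * n[i] + offset * shift[i]
                for i in range(3)
            ))
    return centres


centres = stacking_centres((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
print(len(centres), "candidate ring centres")
```

Rotation of the aromatic ring about its own normal, which the text also permits, would be a further sampled angle on top of each centre returned here.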

Scaffold selection

A large set of protein scaffolds was chosen as candidates for transition state placement. The selection criteria for these scaffolds were as follows: that a high-resolution crystal structure is available; that expression in E. coli is possible; that they are stable proteins; that they contain a pre-existing pocket; and that they span a variety of protein folds. The protein scaffolds used in this study are listed in Supplementary Table 3.

For each scaffold, a three-dimensional grid representing the pre-existing pocket was mapped out using an in-house pymol plugin (Supplementary Fig. 2). This was used to reduce the extremely large search space for transition state placement (see below). The positions of potential catalytic residues near the active site were then compiled for each scaffold. In addition, a three-dimensional grid representing the protein backbone was created for each scaffold to allow for a fast clash check.
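
The idea of a backbone grid for fast clash checking can be illustrated as follows: voxels near any backbone atom are marked once, after which testing a candidate atom position is a single set lookup. This is a sketch under stated assumptions, not the implementation used in the study; the grid spacing, the clash radius and all function names are hypothetical.

```python
import math


def voxel_of(point, spacing):
    """Integer index of the voxel containing a point."""
    return tuple(int(math.floor(c / spacing)) for c in point)


def build_backbone_grid(backbone_atoms, spacing=1.0, clash_radius=2.5):
    """Mark every voxel whose centre lies within clash_radius of a backbone atom."""
    occupied = set()
    reach = int(math.ceil(clash_radius / spacing)) + 1
    for atom in backbone_atoms:
        ix, iy, iz = voxel_of(atom, spacing)
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                for dz in range(-reach, reach + 1):
                    voxel = (ix + dx, iy + dy, iz + dz)
                    centre = tuple((i + 0.5) * spacing for i in voxel)
                    if math.dist(centre, atom) <= clash_radius:
                        occupied.add(voxel)
    return occupied


def clashes(grid, point, spacing=1.0):
    """Constant-time check: does the point fall in an occupied voxel?"""
    return voxel_of(point, spacing) in grid


# Toy example with a single backbone atom at the origin (hypothetical data).
grid = build_backbone_grid([(0.0, 0.0, 0.0)])
print(clashes(grid, (0.2, 0.2, 0.2)), clashes(grid, (5.0, 5.0, 5.0)))
```

The cost of building the grid is paid once per scaffold, which is what makes it suitable for pruning very large numbers of transition state placements.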

Transition state placement

To find active site placements in the input scaffolds, it is necessary to consider many alternative geometries for each catalytic motif. As described below, by varying the precise orientations of the catalytic side chains relative to the transition state, we generated very large ensembles of active site geometries. For each of these active site geometric variants, RosettaMatch⁴ was used to simultaneously position the transition state and catalytic residues into the set of pre-selected protein scaffolds so as to satisfy all catalytic constraints without steric overlap (only scaffold backbone atoms were modelled for clashes). Supplementary Figs 3–5 show the geometric descriptors used for catalytic side chain–transition state placement and the corresponding number of alternative conformations to be sampled. The His-based mechanism is shown as an example. The Glu/Asp-based mechanism was diversified similarly.

The geometric parameters for the catalytic base–transition state interaction were sampled much more finely because the relative geometry of the general base was considered to be more important than π-stacking or negative charge stabilization. Using the geometric parameters specified in Supplementary Fig. 3, there were 77,472,288 histidine–transition state conformations per position, 52,488 serine–transition state conformations per position, and 27,216 π-stacking–transition state conformations per position.

For a typical matching run, such as the TIM barrel protein scaffold, histidines were sampled at 41 positions around the barrel, and serine and π-stacking residues were placed at 119 residues to allow for catalytic side chains at second-shell residues. For this example, there are more than 1.5 × 10²¹ possible combinations for creating the catalytic motif, which would be computationally intractable to enumerate. By using the linear-scaling RosettaMatch algorithm, this number was reduced to a much more manageable number (8.7 × 10⁷). The use of three-dimensional grids allows for rapid pruning of this large number of transition state conformations, as described above.
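
The reason hashing avoids enumerating every combination can be shown with a simplified sketch: each catalytic constraint independently records the transition state poses it can support in a table keyed by a discretized six-dimensional pose, and a match is any bin that every constraint has reached. This is an illustration of the general geometric-hashing idea and not the RosettaMatch implementation; the bin sizes, the Euler-angle pose representation, the labels and the example poses are all assumptions.

```python
from collections import defaultdict


def bin_key(pose, pos_bin=1.0, ang_bin=30.0):
    """Discretize a 6D pose: x, y, z in Å and three angles in degrees."""
    x, y, z, a, b, c = pose
    return (
        int(x // pos_bin), int(y // pos_bin), int(z // pos_bin),
        int(a // ang_bin), int(b // ang_bin), int(c // ang_bin),
    )


def find_matches(placements_by_constraint):
    """Return bins reached by every constraint.

    `placements_by_constraint` holds one list per catalytic constraint;
    each entry is (label, pose), where pose is the transition state pose
    implied by building that side chain at that scaffold position.
    Work grows with the total number of placements, not their product.
    """
    table = defaultdict(lambda: defaultdict(list))
    for index, placements in enumerate(placements_by_constraint):
        for label, pose in placements:
            table[bin_key(pose)][index].append(label)
    required = len(placements_by_constraint)
    return {
        key: dict(hits) for key, hits in table.items() if len(hits) == required
    }


# Hypothetical placements for a base constraint and a stacking constraint.
base = [("pos_A:His", (1.2, 0.3, 0.1, 10.0, 20.0, 30.0)),
        ("pos_C:His", (9.0, 9.0, 9.0, 0.0, 0.0, 0.0))]
stack = [("pos_B:Tyr", (1.7, 0.9, 0.4, 25.0, 5.0, 59.0))]
matches = find_matches([base, stack])
print(matches)
```

A production matcher would also have to handle poses that fall near bin boundaries, which this sketch ignores.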

For the catalytic mechanism using histidine as the base, we prefiltered each scaffold to identify pairs of positions at which histidine and aspartate/glutamate rotamers can be placed to achieve the dyad geometry. Rotamer pairs with a van der Waals repulsive energy less than 2.0 kcal mol^-1 and hydrogen bonding energy less than -0.5 kcal mol^-1 were stored in an ‘interaction graph’. Matching was carried out using histidine as the catalytic residue, iterating only over histidine rotamers in the interaction graph of His–Asp and His–Glu pairs. For a given match, each Asp or Glu rotamer in the interaction graph that interacts with the matched His rotamer was grafted onto the match, and the result screened to remove clashes between the transition state and the backing-up residue. Using the interaction graph decreases the number of potential histidine rotamers that must be modelled in the active site, and thus allows for even finer sampling of ligand rotamer sets. In the TIM barrel example described above, the number of histidine rotamers sampled at the 41 residue positions was decreased from 3,321 (81 × 41) to 253 by precalculating and filtering only the subset of histidine rotamers that can form hydrogen bonds to Asp/Glu. This reduces the number of histidine–transition state conformations from 7.7 × 10⁷ to 5.9 × 10⁶.
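
The prefiltering step and the resulting reduction in sampling can be sketched as below. The two energy thresholds and the counts (81 rotamers, 41 positions, 253 retained rotamers, 77,472,288 conformations) come from the text; the data layout, the rotamer labels and the assumption that every histidine rotamer contributes the same number of transition state conformations are illustrative.

```python
def build_interaction_graph(pair_energies, max_repulsion=2.0, max_hbond=-0.5):
    """Keep His/acid rotamer pairs that pass both energy thresholds.

    `pair_energies` yields (his_rotamer, acid_rotamer, vdw_repulsive,
    hbond_energy) with energies in kcal/mol. The result maps each
    retained His rotamer to the acid rotamers that can back it up.
    """
    graph = {}
    for his, acid, repulsion, hbond in pair_energies:
        if repulsion < max_repulsion and hbond < max_hbond:
            graph.setdefault(his, []).append(acid)
    return graph


# Hypothetical rotamer pairs: only the first passes both thresholds.
pairs = [
    ("His_r1", "Asp_r4", 0.3, -1.2),
    ("His_r1", "Glu_r2", 3.1, -0.9),  # too much steric repulsion
    ("His_r2", "Asp_r1", 0.1, -0.2),  # hydrogen bond too weak
]
graph = build_interaction_graph(pairs)

# Arithmetic for the TIM barrel example quoted in the text.
all_his_rotamers = 81 * 41                    # rotamers per position x positions
conformations_before = 77_472_288
per_rotamer = conformations_before // all_his_rotamers  # assumes equal share
conformations_after = per_rotamer * 253       # rotamers kept by the graph
print(all_his_rotamers, per_rotamer, conformations_after)
```

Under the equal-share assumption, the 253 retained rotamers give 5,901,984 conformations, consistent with the 5.9 × 10⁶ quoted above.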

Geometric filters were applied to remove matches unlikely to produce good designs. Matches for which transition state poses clashed with more than four modelled Cβ atoms were removed as they would require too many Gly mutations to be introduced to accommodate the bound pose, potentially destabilizing the folded state. Matches with an insufficient number of neighbouring residues around the transition state would be expected to lead to underpacking during the design stage and were also removed.
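
The Cβ clash filter can be written as a simple count. The limit of four clashing Cβ atoms is from the text; the clash distance, the function names and the toy coordinates are assumptions made for illustration.

```python
import math


def count_cbeta_clashes(ts_atoms, cbeta_atoms, clash_distance=3.0):
    """Number of Cβ atoms closer than clash_distance Å to any transition state atom."""
    return sum(
        1 for cb in cbeta_atoms
        if any(math.dist(cb, atom) < clash_distance for atom in ts_atoms)
    )


def passes_clash_filter(ts_atoms, cbeta_atoms, max_clashes=4, clash_distance=3.0):
    """Keep a match unless its pose clashes with more than max_clashes Cβ atoms."""
    return count_cbeta_clashes(ts_atoms, cbeta_atoms, clash_distance) <= max_clashes


# Toy pose (hypothetical): one transition state atom, six nearby Cβ atoms.
ts = [(0.0, 0.0, 0.0)]
crowded = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
           (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (8.0, 0.0, 0.0)]
print(count_cbeta_clashes(ts, crowded), passes_clash_filter(ts, crowded))
```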

Protein design

Residues surrounding the transition state and catalytic residues were selected for redesign, and the Rosetta protein design methodology^17,18,30 was used to create a pocket with high affinity for the transition state. Residue selection was carried out using a shell-based method. Residues with Cβ atoms within 8 Å were redesigned, those within 10 Å for which the Cα–Cβ bond vector pointed towards the transition state were redesigned, and all other residues within 12 Å were repacked. A rigid-body minimization of the transition state as well as side-chain relaxation of the protein was performed for each designed model.
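
The shell-based selection can be sketched as a per-residue classification. The 8, 10 and 12 Å shells are those stated above; measuring distance from Cβ to the nearest transition state atom, and treating a positive dot product as "pointing towards the transition state", are assumptions, as are the function name and the example coordinates.

```python
import math


def classify_residue(ca, cb, ts_atoms, inner=8.0, middle=10.0, outer=12.0):
    """Return 'design', 'repack' or 'fixed' for one residue.

    `ca` and `cb` are the Cα and Cβ coordinates; `ts_atoms` is a list of
    transition state atom coordinates.
    """
    nearest = min(ts_atoms, key=lambda atom: math.dist(cb, atom))
    distance = math.dist(cb, nearest)
    if distance <= inner:
        return "design"
    # Cα→Cβ vector compared with the Cβ→transition state direction.
    points_inward = sum(
        (cb[i] - ca[i]) * (nearest[i] - cb[i]) for i in range(3)
    ) > 0.0
    if distance <= middle and points_inward:
        return "design"
    if distance <= outer:
        return "repack"
    return "fixed"


ts = [(0.0, 0.0, 0.0)]
print(classify_residue((6.0, 0.0, 0.0), (5.0, 0.0, 0.0), ts))    # inside 8 Å
print(classify_residue((10.0, 0.0, 0.0), (9.0, 0.0, 0.0), ts))   # 9 Å, pointing in
print(classify_residue((8.0, 0.0, 0.0), (9.0, 0.0, 0.0), ts))    # 9 Å, pointing away
print(classify_residue((21.0, 0.0, 0.0), (20.0, 0.0, 0.0), ts))  # beyond 12 Å
```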

Design filtering

A geometric filter was applied to choose models for which catalytic geometry was consistent with the specified constraints (tables in Supplementary Figs 3–5). The van der Waals interaction energy for the transition state and catalytic residues was a useful filter for choosing designs that were roughly well packed; designs with a transition state–protein van der Waals energy greater than -5.0 kcal mol^-1 were removed. Filters were used to select for high transition state–protein shape complementarity²⁹, and to choose models with minimal small cavities surrounding the transition state (W. Sheffler and D.B., submitted). Solvent accessibility measures were used to remove models that completely buried the transition state. For the His–Asp dyad mechanism, an additional filter was added, requiring that the His–Asp hydrogen bond remain on repacking of all residues in the presence of the transition state.
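
Those filters for which the text gives an explicit criterion can be combined into a single predicate, sketched below. Only the -5.0 kcal mol^-1 van der Waals cutoff is a stated number; the dictionary field names are hypothetical, and the shape-complementarity and cavity filters are omitted because no thresholds are given for them.

```python
def passes_design_filters(design, max_ts_vdw=-5.0, dyad_mechanism=False):
    """Apply the explicitly described filters to one design record.

    `design` is a dict with hypothetical keys:
      geometry_ok      - catalytic geometry satisfies the constraints
      ts_vdw_energy    - transition state/protein van der Waals energy (kcal/mol)
      ts_fully_buried  - transition state has no solvent accessibility
      dyad_hbond_kept  - His-Asp hydrogen bond survives repacking
    """
    if not design["geometry_ok"]:
        return False
    if design["ts_vdw_energy"] > max_ts_vdw:   # poorly packed
        return False
    if design["ts_fully_buried"]:
        return False
    if dyad_mechanism and not design["dyad_hbond_kept"]:
        return False
    return True


example = {
    "geometry_ok": True,
    "ts_vdw_energy": -6.3,
    "ts_fully_buried": False,
    "dyad_hbond_kept": False,
}
print(passes_design_filters(example), passes_design_filters(example, dyad_mechanism=True))
```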

Protein expression and purification

Genes encoding the designs in the pET29b expression vector (Novagen) were purchased from Codon Devices, Inc. The catalytic-side-chain knockout mutations to Ala or Asn/Gln were introduced by site-directed mutagenesis as described³¹. After transformation into BL21 Star (Invitrogen), a one litre culture of auto-induction media³² was inoculated with a single colony and shaken at 37 °C for 8 h. Expression was continued at 18 °C for 24 h. The cells were harvested, resuspended in 25 mM HEPES (pH 7.5) and 100 mM NaCl, and lysed by sonication. The soluble fraction was applied to a Ni-NTA column (Qiagen), washed with 20 mM imidazole, and the protein was eluted with 250 mM imidazole. The proteins were concentrated and the buffer was exchanged to 25 mM HEPES (pH 7.25) and 100 mM NaCl using a 5 ml Hi-Trap desalting column (GE Healthcare). For KE59, an additional size-exclusion chromatography step (Superdex75 10/300 GL from GE Healthcare) was performed. Protein concentrations were determined by measuring the absorbance at 280 nm using the calculated extinction coefficient³³. To eliminate the possibility of observing the activity from a contaminating natural enzyme, further purification steps were carried out for KE07 and the evolved KE07 variants, for KE59 and for KE70 as described in the Supplementary Information section 10, validating the Kemp elimination activity of the designed and evolved enzymes.

Kinetic measurements

For the initial activity screen, 100 μl of the designed proteins (10 μM final concentration) were mixed with 100 μl of 500 μM substrate (freshly diluted from a 50 mM stock solution in acetonitrile) in 25 mM HEPES (pH 7.25) and 100 mM NaCl in a 96-well plate. For the kinetic characterization, the reactions were started by adding 150 μl of substrate dilutions (1 mM to 11 μM final concentration) in 25 mM HEPES (pH 7.25), 100 mM NaCl and 2% acetonitrile to 50 μl of protein (1 μM to 10 μM final concentration) in 25 mM HEPES (pH 7.25) and 100 mM NaCl (or no protein for the background reaction) in a 96-well plate. Product formation was followed at 380 nm in a SpectraMax M5e (Molecular Devices) plate reader at 27 °C in at least three independent experiments. The initial rates divided by the catalyst concentration were plotted against substrate concentration, and k_cat and K_m were determined by fitting the data to the Michaelis–Menten equation (equation (1)) using KaleidaGraph.

v₀/[E]₀ = k_cat[S]/(K_m + [S])    (1)

v_0/[E]_0 = k_cat[S]/(K_m + [S])   (1)

If saturation kinetics were not observed, k_cat/K_m values were calculated from a linear fit to the data.
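The two fitting cases above can be sketched in a few lines. This is an illustrative stand-in rather than the procedure used here (the fits in this work were done in Kaleidagraph): the example swaps nonlinear regression for the Hanes–Woolf linearization, [S]/v = [S]/k_cat + K_m/k_cat, and assumes a line through the origin for the non-saturating case. All function names and the synthetic data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_michaelis_menten(substrate, rate_per_enzyme):
    """Estimate (k_cat, K_m) from the Hanes-Woolf plot of [S]/v against [S]."""
    ratios = [s / v for s, v in zip(substrate, rate_per_enzyme)]
    slope, intercept = least_squares_line(substrate, ratios)
    k_cat = 1.0 / slope            # slope is 1 / k_cat
    return k_cat, intercept * k_cat  # intercept is K_m / k_cat


def fit_kcat_over_km(substrate, rate_per_enzyme):
    """Slope of v/[E] against [S] through the origin (assumes [S] << K_m)."""
    num = sum(s * v for s, v in zip(substrate, rate_per_enzyme))
    den = sum(s * s for s in substrate)
    return num / den


if __name__ == "__main__":
    # Synthetic, noise-free data; concentrations in M, rates in s^-1.
    conc = [11e-6, 33e-6, 111e-6, 333e-6, 1e-3]
    rates = [0.5 * s / (5e-4 + s) for s in conc]
    k_cat, k_m = fit_michaelis_menten(conc, rates)
    print(f"k_cat = {k_cat:.3g} s^-1, K_m = {k_m:.3g} M, "
          f"k_cat/K_m = {k_cat / k_m:.3g} M^-1 s^-1")
```

With noisy measurements a direct nonlinear fit of equation (1) is generally preferable, because the linearized form weights the data points unevenly.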

Screening procedure

The libraries were screened by growing the cultures of E. coli BL21 cells in 96-deep-well plates and checking the activity of the lysates with 5-nitrobenzisoxazole. In brief, E. coli BL21 cells transformed with the libraries were grown on Luria broth (LB) agar plates (containing 100 µg ml^-1 kanamycin). Individual colonies were inoculated into 2YT supplemented with 50 µg ml^-1 kanamycin (300 µl) in 96-deep-well plates, and grown for ∼15 h at 37 °C. Overnight cultures (20 µl) were inoculated into 2YT supplemented with 50 µg ml^-1 kanamycin (500 µl) in 96-deep-well plates and grown to A_600 nm of ∼0.6. Overexpression was induced by adding 1 mM isopropyl-β-D-thiogalactoside (IPTG), and the cultures were grown for another 5 h and centrifuged, and the pellets were frozen overnight at -20 °C. The cells were lysed with lysis buffer, 250 µl well^-1 (50 mM HEPES (pH 7.25), 0.2% Triton, 0.1 mg ml^-1 lysozyme), and the lysates were cleared by centrifugation and assayed for hydrolysis of 5-nitrobenzisoxazole (0.125 mM) by following the release of the phenol product at 380 nm (Power HT microtitre scanning spectrophotometer). Overnight cultures of the most active clones were plated on LB agar plates containing 100 µg ml^-1 kanamycin. To ensure monoclonality, and to verify the activity of the selected variants, the hydrolysis rates were re-assayed after growing two sub-clones from each original colony in the same conditions. Plasmids were extracted and used for sequencing and as templates for subsequent mutagenesis and screening rounds. Variants subjected to detailed analysis were re-transformed into E. coli BL21 cells, and the protein was overexpressed and purified as described above.

Round 1

First-generation libraries were constructed from the designed KE07 gene by an error-prone PCR method using the ‘wobble’ base analogues dPTP and 8-oxo-dGTP²⁶. The rate of mutations was 5 ± 3 per gene, and mutations were mainly of the transition type. The first round of KE07 evolution yielded active variants with lysate activity up to fivefold higher than that of the starting point, KE07.

Round 2

The 23 most active variants isolated in the first round of screening were subjected to DNA shuffling in the presence of the designed template (20%)²⁷ to yield second-generation libraries. The most active variants of round 2 had lysate activities up to 15-fold higher than that of KE07. Analysis by SDS–PAGE demonstrated that the improvements in activity were partly due to enhanced expression of KE07. Four active variants from round 2 were purified, and their kinetic parameters determined (Supplementary Table 1). Several dominant mutations in round 2 clones were identified; these can be divided into three groups: (1) Lys19Glu/Thr or Lys146Glu/Thr, mutations on the surface of the protein that seem to increase the expression levels of KE07; (2) Gly202Arg or Asn224Asp, mutations at the active site, probably interacting with the substrate-binding residues (two other mutations in the helix 223–233, Val226Ala and Phe229Ser, which are adjacent to Asp 224, were also obtained); and (3) Ile7Thr or Ile199Thr, residues located at the bottom of the active site, but not in direct contact with the substrate.

Round 3

The third-generation libraries were created by shuffling the four active variants of round 2 while randomizing various positions by incorporating spiking oligonucleotides during assembly of DNA fragments²⁸:

Library 1: positions Ile 7 and Ile 199 were randomized (to Ile, Thr, Val, Ala, Phe, Ser, Glu, Asp, Gln, His), with the aim of finding the optimal combination of these residues at the bottom of the active site.

Library 2: positions Tyr 128 and His 201 were randomized (His 201 to Cys, Ser, Tyr, Ser, Thr, Asn; Tyr 128 to Leu, Pro, Ile, Thr, Val, Ala, Phe, Ser) to probe other residues at these designed positions that are responsible for benzisoxazole ring stacking.

Library 3: one or two amino acids were inserted between residues 224–225 and 225–226 to probe the variations of the helix 223–233, which seemed to be a target of many round 2 mutations.

Library 1 yielded clones with lysate activity up to 70-fold higher than that of KE07. Libraries 2 and 3 did not yield any improved variants, thus demonstrating that the designed stacking residues His 201 and Tyr 128 are at their optimal configurations, and that the length of the helix 223–233 does not need to be further optimized.
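To give a sense of scale for such targeted libraries, the sketch below computes the number of protein-level combinations for Library 1 (two positions, ten listed residues each) and the number of clones that would need to be screened to see any given combination with a chosen probability. It is a back-of-the-envelope estimate only: it assumes every combination is equally represented and ignores codon bias, shuffling and background mutations. The function names and the 95% target are illustrative assumptions; the number of clones actually screened is not stated here.

```python
import math


def library_size(choices_per_position):
    """Number of distinct combinations when positions vary independently."""
    return math.prod(choices_per_position)


def clones_for_coverage(n_variants, probability=0.95):
    """Clones to pick so that one given variant is seen at least once.

    Uses P = 1 - (1 - 1/L)^N for a library of L equally likely variants.
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be between 0 and 1 (exclusive)")
    if n_variants <= 1:
        return 1
    return math.ceil(math.log(1 - probability) / math.log(1 - 1 / n_variants))


if __name__ == "__main__":
    # Library 1: Ile 7 and Ile 199, each randomized to the ten listed residues.
    size = library_size([10, 10])
    needed = clones_for_coverage(size, probability=0.95)
    print(f"{size} combinations; about {needed} clones for 95% coverage of any one")
```

The roughly threefold oversampling that falls out of this formula is a common rule of thumb for screening small, focused libraries.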

Round 4

At round 4, randomization of Ile 199 was continued because it was not changed in most of the clones of round 3. Positions Ile 173 and Leu 176 were randomized as well (to Ala, Asp, Glu, Val, Leu, Ile, Thr, Asn, Lys, Pro, His and Gln) because these residues interact with Gly 202, which in most of the improved variants was mutated into arginine.

Round 4 yielded active variants with crude lysate activities up to 200-fold higher than that of KE07. The most active variants of rounds 3 and 4 were purified, and their catalytic parameters determined (Supplementary Table 1).

Sequencing of round 3 and 4 variants confirmed the importance of the mutations found in round 2. Lys19Glu/Thr and Lys146Glu/Thr mutations increased the expression levels, and Gly202Arg and Asn224Asp optimized the top part of the active site. Randomization of positions Ile 7 and Ile 199 at the bottom of the active site demonstrated that, in the optimal combination, Ile 7 is changed to a more polar residue and Ile 199 remains intact. In several improved variants, the residues Ile 173 and Leu 176 were mutated as well, but their effect is relatively minor.

Because the mutation Asn224Asp was found in all the improved variants of rounds 3 and 4 (with the exception of R4 2F/2G), we wanted to ensure that this mutation did not alter the initial design, by acting, for example, as a general base, thus replacing the designed base Glu 101. We therefore created Glu101Ala mutants of the variants R3 I3/10A, R4 1E/11H and R4 2F/2G, and of KE07. Mutagenesis of Glu 101 caused a significant decrease in the activity of all the variants (up to 1%). These results demonstrated that the initial design, in which Glu 101 acts as a general base, was maintained (Supplementary Table 2).

Round 5

The active variants from round 4 were subjected to random mutagenesis by error-prone PCR with mutazyme (Genemorph PCR mutagenesis kit, Stratagene³⁴) to yield the fifth-generation libraries, which contained 1 ± 1 mutations per gene and a large portion of shuffled genes. Mild lysate activity improvements (up to 1.5-fold) were observed, and the 12 most active variants from round 5 were subjected to another round of mutagenesis at a higher mutational load.

Round 6

At round 6, the 12 most active variants from round 5 were subjected to random mutagenesis by error-prone PCR with mutazyme (Genemorph PCR mutagenesis kit, Stratagene³⁴) to yield the sixth-generation libraries, which contained 3 ± 1 mutations per gene and a large portion of shuffled genes. Lysate activity improvements of up to 1.5-fold were observed.

Round 7

Seventh-generation libraries were created by shuffling the 20 active variants of round 6, and lysate activity improvements of up to threefold were observed.

The xyz coordinates of the designs KE07, KE59 and KE70 are available in the Supplementary Information.

Accession codes

Data deposits

The crystal structure of KE07 has been deposited in the RCSB Protein Data Bank (http://www.rcsb.org) under the accession number 2rkx.

References

1. Radzicka, A. & Wolfenden, R. A proficient enzyme. Science 267, 90–93 (1995)
2. Bolon, D. N. & Mayo, S. L. Enzyme-like proteins by computational design. Proc. Natl Acad. Sci. USA 98, 14274–14279 (2001)
3. Kaplan, J. & DeGrado, W. F. De novo design of catalytic proteins. Proc. Natl Acad. Sci. USA 101, 11566–11570 (2004)
4. Zanghellini, A. et al. New algorithms and an in silico benchmark for computational enzyme design. Protein Sci. 15, 2785–2794 (2006)
5. Casey, M. L., Kemp, D. S., Paul, K. G. & Cox, D. D. The physical organic chemistry of benzisoxazoles I. The mechanism of the base-catalyzed decomposition of benzisoxazoles. J. Org. Chem. 38, 2294–2301 (1973)
6. Kemp, D. S. & Casey, M. L. Physical organic chemistry of benzisoxazoles II. Linearity of the Brønsted free energy relationship for the base-catalyzed decomposition of benzisoxazoles. J. Am. Chem. Soc. 95, 6670–6680 (1973)
7. Hu, Y., Houk, K. N., Kikuchi, K., Hotta, K. & Hilvert, D. Nonspecific medium effects versus specific group positioning in the antibody and albumin catalysis of the base-promoted ring-opening reactions of benzisoxazoles. J. Am. Chem. Soc. 126, 8197–8205 (2004)
8. Hollfelder, F., Kirby, A. J., Tawfik, D. S., Kikuchi, K. & Hilvert, D. Characterization of proton-transfer catalysis by serum albumins. J. Am. Chem. Soc. 122, 1022–1029 (2000)
9. Na, J., Houk, K. N. & Hilvert, D. Transition state of the base-promoted ring-opening of isoxazoles. Theoretical prediction of catalytic functionalities and design of haptens for antibody production. J. Am. Chem. Soc. 118, 6462–6471 (1996)
10. Debler, E. W. et al. Structural origins of efficient proton abstraction from carbon by a catalytic antibody. Proc. Natl Acad. Sci. USA 102, 4984–4989 (2005)
11. Lee, C., Yang, W. & Parr, R. G. Development of the Colle–Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B Condens. Matter 37, 785–789 (1988)
12. Becke, A. D. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys. Rev. A 38, 3098–3100 (1988)
13. Frisch, M. J. et al. Gaussian 03, revision C.02 (Gaussian, Inc., Wallingford, Connecticut, 2004)
14. Hollfelder, F., Kirby, A. J. & Tawfik, D. S. Efficient catalysis of proton transfer by synzymes. J. Am. Chem. Soc. 119, 9578–9579 (1997)
15. Misura, K. M., Morozov, A. V. & Baker, D. Analysis of anisotropic side-chain packing in proteins and application to high-resolution structure prediction. J. Mol. Biol. 342, 651–664 (2004)
16. Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. Numerical Recipes in C++ 2nd edn (Cambridge Univ. Press, Cambridge, UK, 2002)
17. Kuhlman, B. et al. Design of a novel globular protein fold with atomic-level accuracy. Science 302, 1364–1368 (2003)
18. Meiler, J. & Baker, D. ROSETTALIGAND: protein-small molecule docking with full side-chain flexibility. Proteins 65, 538–548 (2006)
19. Chica, R. A., Doucet, N. & Pelletier, J. N. Semi-rational approaches to engineering enzyme activity: combining the benefits of directed evolution and rational design. Curr. Opin. Biotechnol. 16, 378–384 (2005)
20. Seelig, B. & Szostak, J. W. Selection and evolution of enzymes from a partially randomized non-catalytic scaffold. Nature 448, 828–831 (2007)
21. Cesaro-Tadic, S. et al. Turnover-based in vitro selection and evolution of biocatalysts from a fully synthetic antibody library. Nature Biotechnol. 21, 679–685 (2003)
22. Varadarajan, N., Gam, J., Olsen, M. J., Georgiou, G. & Iverson, B. L. Engineering of protease variants exhibiting high catalytic activity and exquisite substrate selectivity. Proc. Natl Acad. Sci. USA 102, 6855–6860 (2005)
23. Thorn, S. N., Daniels, R. G., Auditor, M. T. & Hilvert, D. Large rate accelerations in antibody catalysis by strategic use of haptenic charge. Nature 373, 228–230 (1995)
24. Hollfelder, F., Kirby, A. J. & Tawfik, D. S. Off-the-shelf proteins that rival tailor-made antibodies as catalysts. Nature 383, 60–62 (1996)
25. Jiang, L. et al. De novo computational design of retro-aldol enzymes. Science 319, 1387–1391 (2008)
26. Vartanian, J. P., Henry, M. & Wain-Hobson, S. Hypermutagenic PCR involving all four transitions and a sizeable proportion of transversions. Nucleic Acids Res. 24, 2627–2631 (1996)
27. Abecassis, V., Pompon, D. & Truan, G. High efficiency family shuffling based on multi-step PCR and in vivo DNA recombination in yeast: statistical and functional analysis of a combinatorial library between human cytochrome P450 1A1 and 1A2. Nucleic Acids Res. 28, E88 (2000)
28. Herman, A. & Tawfik, D. S. Incorporating synthetic oligonucleotides via gene reassembly (ISOR): a versatile tool for generating targeted libraries. Protein Eng. Des. Sel. 20, 219–226 (2007)
29. Collaborative Computational Project, Number 4. The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760–763 (1994)
30. Dantas, G., Kuhlman, B., Callender, D., Wong, M. & Baker, D. A large scale test of computational protein design: folding and stability of nine completely redesigned globular proteins. J. Mol. Biol. 332, 449–460 (2003)
31. Kunkel, T. A., Roberts, J. D. & Zakour, R. A. Rapid and efficient site-specific mutagenesis without phenotypic selection. Methods Enzymol. 154, 367–382 (1987)
32. Studier, F. W. Protein production by auto-induction in high density shaking cultures. Protein Expr. Purif. 41, 207–234 (2005)
33. Pace, C. N., Vajdos, F., Fee, L., Grimsley, G. & Gray, T. How to measure and predict the molar absorption coefficient of a protein. Protein Sci. 4, 2411–2423 (1995)
34. Barlow, M. & Hall, B. G. Predicting evolutionary potential: in vitro evolution accurately reproduces natural evolution of the TEM β-lactamase. Genetics 160, 823–832 (2002)

Acknowledgements

We thank R. Stanfield and I. Wilson for providing D-2-deoxyribose-5-phosphate aldolase wild-type protein (PDB code 1jcl) and W. A. Greenberg and C.-H. Wong for providing the expression plasmid. We thank Rosetta@home participants for their contributions of computing power. This work was supported by a postdoctoral fellowship from the Swiss National Science Foundation to D.R., an Adams Fellowship (Israel Academy of Science) to O.K., research grants from the Minerva Foundation and the Fannie Sherr Estate to D.S.T., and NSF and NIH-CBI grants to K.N.H. We are also thankful for financial support from the Defense Advanced Research Projects Agency (DARPA) and the Howard Hughes Medical Institute (HHMI) for this research.

Author Contributions

D.R. performed computational design using carboxylate and the His–Asp motif, and purified and experimentally characterized designed catalysts; O.K. synthesized the substrate, performed in vitro evolution and experimentally characterized evolved variants; A.M.W. performed computational design on the His–Asp motif; L.J. performed initial computational design on the carboxylate motif; J.D. and K.N.H. computed idealized active sites using quantum mechanics; J.B. and J.L.G. expressed and purified designed catalysts; E.A.A. helped with enzyme design set-up; A.Z. wrote RosettaMatch and helped with computational set-up; O.D. and S.A. crystallized KE07; and D.R., A.M.W., D.B., K.N.H., O.K. and D.S.T. designed the experiment and wrote the manuscript.

Author information

Daniela Röthlisberger, Olga Khersonsky and Andrew M. Wollacott: These authors contributed equally to this work.

Authors and Affiliations

Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
Daniela Röthlisberger, Andrew M. Wollacott, Lin Jiang, Eric A. Althoff, Alexandre Zanghellini & David Baker
Biomolecular Structure and Design, University of Washington, Seattle, Washington 98195, USA
Lin Jiang, Alexandre Zanghellini & David Baker
Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
Jamie Betker, Jasmine L. Gallaher & David Baker
Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
Olga Khersonsky & Dan S. Tawfik
Israel Structural Proteomics Center, Weizmann Institute of Science, Rehovot 76100, Israel
Orly Dym & Shira Albeck
Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, USA
Jason DeChancie & Kendall N. Houk

Corresponding authors

Correspondence to Dan S. Tawfik or David Baker.

Supplementary information

Supplementary Information

The file contains Supplementary Discussion with Supplementary Figures 1-25 and Legends, Supplementary Tables 1-10 and additional references. (PDF 1656 kb)

Supplementary Data

The file contains Supplementary Data 1/KE07.pdb with xyz coordinates of KE07 design model in pdb format; Supplementary Data 2/KE59.pdb with xyz coordinates of KE59 design model in pdb format and Supplementary Data 3/KE70.pdb with xyz coordinates of KE70 design model in pdb format. (ZIP 151 kb)

About this article

Cite this article

Röthlisberger, D., Khersonsky, O., Wollacott, A. et al. Kemp elimination catalysts by computational enzyme design. Nature 453, 190–195 (2008). https://doi.org/10.1038/nature06879

Received: 25 October 2007
Accepted: 03 March 2008
Published: 19 March 2008
Issue Date: 08 May 2008
DOI: https://doi.org/10.1038/nature06879
