Over the past ten years, scientific and technological advances have established biocatalysis as a practical and environmentally friendly alternative to traditional metallo- and organocatalysis in chemical synthesis, both in the laboratory and on an industrial scale. Key advances in DNA sequencing and gene synthesis are at the base of tremendous progress in tailoring biocatalysts by protein engineering and design, and the ability to reorganize enzymes into new biosynthetic pathways. To highlight these achievements, here we discuss applications of protein-engineered biocatalysts ranging from commodity chemicals to advanced pharmaceutical intermediates that use enzyme catalysis as a key step.
Biocatalysis is the application of enzymes and microbes in synthetic chemistry, and uses nature’s catalysts for new purposes: applications for which enzymes have not evolved1,2,3,4,5. The field of biocatalysis has reached its present industrially proven level through several waves of technological research and innovations.
During the first wave of biocatalysis (Fig. 1), which started more than a century ago, scientists recognized that components of living cells could be applied to useful chemical transformations (in contrast to the fermentation processes, which had been commonplace for millennia already). For example, Rosenthaler synthesized (R)-mandelonitrile from benzaldehyde and hydrogen cyanide using a plant extract6; hydroxylation of steroids7 occurring within microbial cells was also known. More recent examples are the use of proteases in laundry detergents8, glucose isomerase to convert glucose to the sweeter-tasting fructose9, and penicillin G acylase to make semisynthetic antibiotics10. The main challenge for these applications is the limited stability of the biocatalyst, and such shortcomings were primarily overcome by immobilization of the enzyme, which also facilitated the reuse of the enzyme.
During the second wave of biocatalysis, in the 1980s and 1990s, initial protein engineering technologies, typically structure based, extended the substrate range of enzymes to allow the synthesis of unusual synthetic intermediates. This change expanded biocatalysis to the manufacture of pharmaceutical intermediates and fine chemicals. Examples include the lipase-catalysed resolution of chiral precursors for synthesis of diltiazem (a blood pressure drug), hydroxynitrile-lyase-catalysed synthesis of intermediates for herbicides11, carbonyl-reductase-catalysed synthesis of enantiopure alcohols for cholesterol-lowering statin drugs, lipase-catalysed synthesis of wax esters such as myristyl myristate or cetyl ricinoleate for cosmetics12, and nitrile-hydratase-catalysed hydration of acrylonitrile to acrylamide for polymers13 (where nitrile hydratase was obtained from whole cells of Rhodococcus rhodochrous). Apart from stabilization, the challenges now included optimizing the biocatalyst for the non-natural substrates.
The third, and present, wave of biocatalysis started with the work of Pim Stemmer and Frances Arnold in the mid and late 1990s. They pioneered molecular biology methods that rapidly and extensively modify biocatalysts via an in vitro version of Darwinian evolution. The methods are now commonly called directed evolution, although this term has been in use since whole-cell experiments in 1972 (ref. 14). The initial versions of this technology involve iterative cycles of random amino-acid changes in a protein, followed by selection or screening of the resulting libraries for variants with improved enzyme stability, substrate specificity and enantioselectivity. Subsequent developments, discussed here, have focused on improving the efficiency of directed evolution to create ‘smarter’ libraries. Industrial-scale biocatalysis focused primarily on hydrolases, a few ketoreductases (KREDs), and cofactor regeneration and protein stability in organic solvents. In some cases, metabolic pathways were optimized; for example, combining genes from various natural strains to produce 1,3-propanediol (a monomer for polyesters) in a new host made it possible to switch from glycerol to the more convenient glucose as the feedstock15.
As a result of the advances made during the present wave of biocatalysis, remarkable new capabilities can now be engineered into enzymes, such as the ability to accept previously inert substrates (a KRED for montelukast16 or a transaminase for sitagliptin17,18) or to change the nature of the product that is formed (terpene cyclase variants that favour different terpenes19 or amino-acid metabolism that makes alcohols for biofuels20). Novel enzymes are needed today to convert biomass into second- and third-generation biofuels21,22, materials23 and chemicals24. Key developments that enabled this third wave are advanced protein engineering25,26,27 (including directed evolution), gene synthesis, sequence analysis, bioinformatics tools28 and computer modelling, and the conceptual advance that improvements in enzymes can be more pronounced than previously expected. Engineered enzymes can remain stable at 60 °C in solutions containing organic solvents, can accept new substrates and can catalyse new non-natural reactions. This engineering may now take only a few months, thus greatly expanding the potential applications. In the past, an enzyme-based process was designed around the limitations of the enzyme; today, the enzyme is engineered to fit the process specifications.
About ten years ago, articles in Nature29,30 and Science31 reviewed the first and second waves of biocatalysis and provided a glimpse at what the third wave might bring. Today it is timely to assess the impact of this third wave and to speculate what the next decade might bring (Box 1). Although biocatalysis involves metabolic engineering22,32,33 and synthetic biology, this Review focuses on enzymatic and whole-cell reactions.
Engineering enzymes to fit the manufacturing process
To minimize costs, chemical manufacture requires stable, selective and productive catalysts that operate under the desired process conditions. Engineering enzymes for such a process starts by defining the engineering goal, such as increased stability, selectivity, substrate range or, typically, a combination thereof. In 2000, before the third wave, only a few strategies were available to meet these goals. Enzyme immobilization could increase the stability of a protein, but the increases in stability were moderate and often insufficient for most chemical transformations. Directed evolution was also possible, but was still slow because it required construction and screening of large libraries that mostly contained variants with reduced, or even no, activity. Examples of drastic improvements were rarely of industrial relevance. The slow pace meant that the evolved proteins contained only a few changes and, thus, that the enzyme properties changed only slightly. Although several hundred enzymatic processes already had industrial uses4, most involved enzymes and whole cells that had been marginally altered genetically29.
In the past decade, our understanding of proteins and the number of available directed evolution strategies have both increased, making it possible to make large changes in enzyme properties. By and large, enzyme engineering continues to be a collection of case studies, each resulting from applying one of various possible approaches to the problem at hand, rather than a quantitative discipline like civil, electrical, software or chemical engineering. Converting these case studies into engineering principles will require using free energy to connect the design goals to the structural changes needed (Fig. 2). Large changes in properties require large changes in free energy. For example, large changes in stability will require large free-energy changes in the folding–unfolding equilibrium. (Even irreversible protein unfolding starts with a reversible partial unfolding.) A molecular-level understanding of proteins suggests strategies that could be used for the improvements. For example, surface residues contribute to the folding–unfolding equilibrium and adding a proline residue in a loop lowers the entropy of the unfolded form. These strategies replace large libraries of random variants (mostly with poorer properties) with smaller, more focused protein libraries containing a high fraction of active and potentially improved variants (Fig. 1). Finally, by estimating the strength of various interactions (ion pairs on the surface or entropic contributions of adding a proline residue), researchers can estimate the changes needed to reach the goal. Few researchers explicitly use free-energy-based measures to plan protein-engineering strategies today, but converting case studies into engineering principles requires a quantitative approach.
New and improved methodologies
Over the past ten years, major advances in DNA technologies and in bioinformatics have provided critical support to the field of biocatalysis. These tools have promoted the discovery of novel enzymes in natural resources and have substantially accelerated the redesign of existing biocatalysts.
Advanced DNA technologies
Next-generation DNA sequencing technology has allowed parallel sequence analysis on a massive scale and at dramatically reduced cost. Whereas the cost of a human genome sequence analysis in 2002 was estimated at ∼US$70,000,000, the price in 2012 has decreased more than 1,000-fold to less than US$10,000 (ref. 34), and LifeTechnologies, Illumina and Oxford Nanopore Technologies have announced that sequencing machines that are designed to sequence the entire human genome in a matter of hours will be available later in 2012 and will lower the cost per genome to less than US$1,000. Sequences of entire genomes from organisms from different environments, as well as environmental DNA samples that include unculturable organisms (metagenomes), have created a rich resource in which to search for novel biocatalysts35, and will continue to do so. Massive high-throughput sequencing (>10,000,000 sequence reads) using the Illumina technology also facilitated the exploration and understanding of protein sequence–function relationships36.
Low-cost DNA synthesis has replaced isolation of genomic DNA as the starting point for protein engineering. Whole-gene DNA synthesis further allows the codons to be optimized for the host organism and molecular architectural structures such as promoters, terminators, enhancers, restriction sites and so on to be introduced at convenient sites. This DNA synthesis uses traditional phosphoramidite chemistry, but optimized reaction conditions have improved coupling efficiency, increasing the overall quality and quantity of the polymer to make sequences even 200–250 nucleotides long. Parallel DNA synthesis using photolithographic and inkjet printing techniques further cuts costs and speeds synthesis37. DNA synthesis has been used to make entire sections of chromosomal DNA and even complete genomes for metabolic pathway engineering38. Whole-gene synthesis can also be used to make high-quality DNA libraries ranging from small, focused, site-saturation libraries to large, comprehensive gene collections. Customized genes and even gene libraries are becoming commodity chemicals similar to reagents and solvents found in today’s research laboratories.
Novel tools in bioinformatics
Complementing the experimental advances, bioinformatics tools have become an integral part of modern protein engineering39. Multiple sequence alignments across large enzyme families and homology searches have identified genes with similar catalytic activities, leading to novel, potent biocatalysts40. The same sequence information allows the reconstruction of ancestral biocatalysts40, which may have broader substrate range and catalytic promiscuity (see below). Multiple sequence alignments identify the most common amino acids at each position (the consensus sequence) and amino-acid substitutions that yield stable, functional enzymes. These data help in the design of small libraries with a high proportion of catalytically active variants. These libraries have been used to discover biocatalysts with enhanced stabilities, catalytic functions and altered stereoselectivities41.
Paralleling the advances in sequence-based protein engineering, structure-guided approaches have benefited from a rapid increase in protein structure coordinates deposited in the RCSB Protein Data Bank (http://www.pdb.org). Over the past decade, the repository has grown by over 450% to contain more than 77,000 protein structures. This facilitates both rational protein design and directed evolution, because structural alignment of related proteins helps to identify distinct similarities and differences guiding the more reliable design of mutant libraries.
The utility of smaller libraries was demonstrated in two different approaches to increasing the enantioselectivity of an esterase for resolution of methyl 3-bromo-2-methylpropionate, a chiral synthon42. Random mutagenesis followed by screening of 200 out of thousands of variants increased the E-value for the enzyme (the selectivity of the enzyme for one enantiomer over the other) from 12 to 19 (ref. 43). Recognizing that only a relatively small increase (0.5 kcal mol−1) in the difference in activation energy (ΔΔΔG‡) for the two enantiomers was needed to generate a practical enzyme (E > 30), researchers then focused mutagenesis on the active site. A library containing all possible single mutations at four positions (76 variants) yielded an enzyme with E = 61 (ΔΔΔG‡ = 0.96 kcal mol−1). Understanding the nature of the problem to be solved focuses the enzyme optimization approach on smaller libraries and gives bigger improvements. In this context, it is worth also mentioning a new method for continuous directed evolution using a combination of a phage infection system and a mutator plasmid in Escherichia coli44.
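The link between E-values and activation free energy quoted above can be checked with a short calculation. The sketch below assumes the standard relation ΔΔG‡ = RT ln E and a temperature of 298.15 K; the temperature and the function names are assumptions made for illustration and are not taken from the cited studies.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def ddg_from_e(e_value, temp_k=298.15):
    """Difference in activation free energy (kcal/mol) implied by an E-value.

    Assumes ddG = R * T * ln(E); the temperature is an assumed default.
    """
    return R_KCAL * temp_k * math.log(e_value)

def e_from_ddg(ddg, temp_k=298.15):
    """Inverse relation: E-value implied by a free-energy difference."""
    return math.exp(ddg / (R_KCAL * temp_k))

# Change in ddG relative to the starting esterase (E = 12)
start = ddg_from_e(12)
for label, e_value in [("random mutagenesis", 19),
                       ("practical target", 30),
                       ("focused library", 61)]:
    change = ddg_from_e(e_value) - start
    print(f"{label}: E = {e_value}, change = {change:.2f} kcal/mol")
```

Under these assumptions, the step from E = 12 to E = 30 corresponds to roughly 0.5 kcal mol−1 and the step to E = 61 to roughly 0.96 kcal mol−1, in line with the values quoted in the text.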
Examples of engineered enzymes in industrial biocatalysis
As predicted by Schmid et al. in their forward-looking review in 2001 (ref. 29), continuous regeneration of cofactors and a wider range of enzymes have been reported in the past ten years. However, the predicted applications of biochips and combinatorial biocatalysis have not yet materialized. The use of non-metabolizing cells for biocatalysis has proven to be more difficult than predicted and preference has instead shifted towards engineered enzymes used in crude and semipurified form. Whereas historically whole cells offered a simple and effective option for cofactor regeneration and enhanced enzyme stability, protein engineering and the use of single enzymes are now considered more economic and practical. The use of isolated enzymes has other advantages: they are easier to remove (less is added because they have more activity per unit mass), they tolerate harsher conditions, they eliminate potential diffusion limitations caused by cell membranes and they are easier to ship around the world. For example, KRED-based processes have now replaced whole-cell reductions and metal–ligand-based chemocatalysis, which were the industry standards during the past decade45,46. One exception is a whole-cell process to convert racemic hydantoins into optically pure, non-natural α-amino acids. Recombinant E. coli concurrently expressing hydantoinase, carbamoylase and racemase was found to be a simple and efficient production system, replacing the original process based on three immobilized enzymes in consecutive fixed-bed reactors47,48. Moreover, the whole-cell process required no metabolic flux controls and proceeded without undesired side reactions.
KREDs and other enzymes have been widely investigated for the manufacture of chiral intermediates for pharmaceuticals such as atorvastatin, the active ingredient in Lipitor, which is a cholesterol-lowering drug that had global sales of US$11,900,000,000 in 2010. Seven enzymatic approaches2,49,50 (Fig. 3), differing not only in the choice of enzyme and starting material but also as to whether the product is a raw material (with a single chiral centre) or an advanced intermediate (with two chiral centres), have been developed. In all cases, success requires protein engineering to improve the reaction rate, the enantioselectivity, the stability to high substrate concentrations (up to 3 M, as in the nitrilase process51) or the stability to high solvent concentrations (20% butyl acetate in the KRED process52). Apart from a highly active biocatalyst, a low-cost process also requires inexpensive raw materials and simple isolation of pure product in high yield. One current process leading to the advanced intermediate uses three biocatalytic steps: first, the combination of KRED and glucose dehydrogenase; second, the combination of this with a halohydrin dehalogenase to make the ethyl (R)-4-cyano-3-hydroxybutanoate intermediate (Fig. 3) at a rate of >100 t yr−1; and, third, the enzymatic reduction for the advanced diol intermediate52.
Recent engineering17 expanded the substrate range of transaminases to ketones with two bulky substituents. The enzyme engineering started with a small ketone substrate, created more space in the active site and used increasingly larger ketones. Several rounds of directed evolution increased the activity ∼40,000-fold and yielded an engineered amine transaminase (Fig. 4) that can replace the transition-metal-based hydrogenation catalyst for sitagliptin manufacture. Starting from ATA-117, a close homologue of the wild-type enzyme, which had no detectable activity on the substrate, the first variant provided very low activity (0.2% conversion of 2 g l−1 substrate using 10 g l−1 enzyme) towards prositagliptin; the final variant converts 200 g l−1 ketone to sitagliptin with 99.95% e.e. at 92% yield. The biocatalytic process not only reduced the total waste and eliminated all transition metals, but increased the overall yield and the productivity by 53% by comparison with the metal-catalysed process18. The numerous biocatalytic routes scaled up for pharmaceutical manufacturing (Table 1) demonstrate their competitiveness with traditional chemical processes.
Enzyme variants resulting from optimization studies are a unique source of starting points for future programmes, and by using the more stable enzymes engineered for one process the next optimization programme can be even faster. For instance, engineering KREDs to make R3HT (3) (see Table 1 for abbreviations and numbering of compounds) created many stable enzyme variants including some that were unsuitable due to low enantioselectivity. However, one of these unsuitable variants was the starting enzyme in engineering a KRED for DCFPE (4). One of the DCFPE enzymes was then the starting point for a montelukast (5) KRED, which in turn was a starting point for the duloxetine (6) KRED. Similarly, the transaminases generated during the evolution of the prositagliptin (18) transaminase can make other amines and may serve as starting points for new engineered enzymes for amine synthesis. Starting from a non-natural stabilized enzyme variant that already works in one process thus accelerates catalyst and process development in unprecedented ways.
Enzymatic conversions that simultaneously set two stereocentres are especially efficient ways to make complex molecules. For example, reduction of a ketone catalysed by a KRED sets the alcohol stereocentre. However, if a second stereocentre next to the ketone carbonyl racemizes rapidly in solution and the KRED is highly selective for one configuration, then the reduction reaction can set two stereocentres in one step. Examples include a penem intermediate (8), pseudoephedrine (12) and phenylephrine (13), as well as similar processes for chiral amines (20). Also, aldolases are now being used for subsequent reaction steps to generate multiple chiral centres, for example in the synthesis of statin intermediates (33). The single-enzyme cascade reactions as catalysed by aldolases have been further expanded to multi-enzyme cascade processes, for instance in the synthesis of 2′-deoxyinosine (34) in vitro or of complex molecules such as artemisinin (42) or Taxol in vivo.
Environmental advantages of biocatalytic processes
In the context of concerns about the environmental aspects of chemical manufacturing, biocatalysis provides an attractive alternative. The US Environmental Protection Agency awards five prizes each year in the Presidential Green Chemistry Award Challenge. The nominations emphasize the 12 Principles of Green Chemistry53, which consider environmental factors as well as use of renewable feedstocks, energy efficiency and worker safety. Biocatalysis, using either enzyme technology or whole cells, has won 16 awards since 2000 (Table 2). Biocatalysts are made from renewable sources and are biodegradable and non-toxic, and their high selectivities simplify reaction work-ups and provide product in higher yields. Biocatalytic processes are also safe as they typically run at ambient temperature, atmospheric pressure and neutral pH. Hence, it is not surprising that so many of the awards go to biocatalysis.
The broad range of awards in Table 2 shows the application of biocatalysts beyond the pharmaceutical industry. Several applications involve polymers, especially polyesters. The optimized fermentation of lactic acid is the basis for a polylactic acid plant in Nebraska with a capacity of 141,000 tonnes per year, and a new non-natural metabolic route allows synthesis of 1,3-propanediol for the manufacture of SORONA polymer. The 1,4-butanediol fermentation yields a component of another polymer, spandex. In the synthesis of polyhydroxyalkanoates, the biocatalyst catalyses not just the monomer synthesis but also its polymerization. Yang and co-workers recently engineered these polyhydroxyalkanoate synthases to polymerize lactic acid into polylactic acid54. Several other awards involve new metabolic pathways to manufacture biofuels. For example, LS9, Inc. engineered E. coli bacteria to produce biodiesel. Adding the genes for plant thioesterases to E. coli diverted normal fatty-acid biosynthesis into synthesis of several fatty acids suitable for biodiesel. Then genes were added for enzymes to make ethanol and an enzyme to couple the ethanol and fatty acids to make fatty-acid ethyl esters, which can be used for biodiesel22. The amount of biodiesel produced is at least tenfold too low for the process to be commercially viable, but further engineering will probably increase the yield.
New concepts in protein engineering
Large changes in enzyme properties usually require multiple amino-acid substitutions because they make larger changes to the protein structure. However, simultaneous multiple amino-acid substitutions create exponentially more variants for testing. There are 7,183,900 possibilities for two substitutions anywhere in a 200 amino-acid protein and 9,008,610,600 possibilities for three substitutions. Many of these variants are inactive, and either all are created and tested to find the improved variants, or the library is screened only partially and incremental improvements in subsequent rounds of evolution are required.
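The library sizes quoted above follow from choosing which positions to change and then choosing one of the 19 alternative amino acids at each chosen position. A minimal sketch of that count is given below; the function name `count_variants` is introduced here only for illustration.

```python
from math import comb

def count_variants(length, n_subs, alphabet=20):
    """Number of distinct variants with exactly n_subs substitutions.

    comb(length, n_subs) picks the positions; each chosen position can
    take any of the (alphabet - 1) residues that differ from the parent.
    """
    return comb(length, n_subs) * (alphabet - 1) ** n_subs

# Library sizes for a 200-residue protein
for k in (1, 2, 3):
    print(f"{k} substitution(s): {count_variants(200, k):,} variants")
```

For two and three substitutions this reproduces the 7,183,900 and 9,008,610,600 possibilities given in the text, which shows how quickly exhaustive screening becomes impractical.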
The simplest solution to this problem is more efficient screening. Changes in substrate specificity may be monitored by high-throughput methods, such as fluorescence-activated cell sorting55,56,57, which can screen tens of millions of variants in a short amount of time. Whittle and Shanklin made six simultaneous substitutions in the active site of a desaturase and then screened for growth on a different substrate. Only those variants with altered substrate specificities could grow58. Seelig and Szostak used very large random libraries (up to 10¹³ variants), from which they could select variants that catalysed an RNA ligation59 based on binding of the product, but not the starting materials, to a column.
At present, the best approach to creating multiple mutations is to add them simultaneously but to limit the choices using statistical or bioinformatic methods. One statistical correlation approach is based on the ProSAR (protein structure activity relationship) algorithm used by Codexis researchers to improve the reaction rate of a halohydrin dehalogenase >4,000-fold60. Researchers made random amino-acid substitutions (an average of ten) in the dehalogenase and measured the rate of catalysis by the variants. Then statistical methods identified whether a particular substitution was beneficial. For example, variants that contained a Phe 186 Tyr substitution were, on average, better than those that did not. Some variants that contained such a substitution were not beneficial, owing to the detrimental effects of other mutations, but the statistical analysis identified that, on average, Phe 186 Tyr is a beneficial mutation. The final improved enzyme contained 35 amino-acid substitutions among its 254 amino acids.
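The idea of judging a substitution by its average effect across many multiply substituted variants can be illustrated with a toy calculation. The sketch below is not the ProSAR algorithm itself; it swaps in a plain difference of mean activities as a stand-in for the statistical model, and every activity value and every substitution label other than Phe 186 Tyr (written F186Y) is hypothetical.

```python
from statistics import mean

def mutation_effects(variants):
    """Mean activity of variants carrying a substitution minus the mean
    activity of variants lacking it, for every substitution observed.

    variants: list of (set_of_substitutions, relative_activity) pairs.
    """
    all_subs = set().union(*(subs for subs, _ in variants))
    effects = {}
    for sub in sorted(all_subs):
        with_sub = [act for subs, act in variants if sub in subs]
        without = [act for subs, act in variants if sub not in subs]
        if with_sub and without:
            effects[sub] = mean(with_sub) - mean(without)
    return effects

# Hypothetical screening data (activity relative to the parent enzyme).
variants = [
    ({"F186Y", "A10V"}, 2.0),
    ({"F186Y"}, 3.0),
    ({"F186Y", "K50E"}, 0.5),  # carries F186Y but is dragged down by K50E
    ({"A10V"}, 1.0),
    ({"K50E"}, 0.2),
    (set(), 1.0),              # parent
]

effects = mutation_effects(variants)
for sub, effect in effects.items():
    print(f"{sub}: {effect:+.2f}")
```

In this toy data set one variant carrying F186Y performs poorly because of a second, detrimental substitution, yet the averaged comparison still scores F186Y as beneficial, which is the behaviour described in the text.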
γ-Humulene synthase catalyses the cyclization of farnesyl diphosphate via cationic intermediates to γ-humulene in 45% yield, but forms 51 other sesquiterpenes in smaller amounts. Keasling and co-workers substituted amino-acid residues in the active site stepwise and identified the contribution of each one to the product distribution61. Substitutions were combined to favour formation of one of the other sesquiterpenes. For example, one triple substitution created an enzyme that formed 78% sibirene; the natural enzyme forms 23% sibirene.
Another approach is to limit the location of changes to the active site and the type of changes to those known from sequence comparisons to occur often at these sites. Jochens and Bornscheuer used this approach to increase the enantioselectivity of a Pseudomonas fluorescens esterase. There were 160,000 (20⁴) ways to vary simultaneously the four amino acids adjacent to the substrate in the active site. The researchers aligned the amino-acid sequences of >2,800 related enzymes to identify which amino acids are most common at these locations. This analysis limited the possibilities to several hundred variants, which were tested to find a double and a triple mutant with the desired selectivities41. Another important advance that allows multiple mutations is the recognition that mutations often destabilize proteins62,63,64 and that starting with a very stable protein therefore allows it to tolerate a greater number and range of changes65,66.
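How an alignment shrinks a four-position library can be sketched as follows. The alignment columns, the frequency cut-off and the function name are hypothetical and chosen only to show the principle; they are not the data or the procedure of the cited study.

```python
from collections import Counter
from math import prod

def allowed_residues(column, min_count=2):
    """Amino acids seen at least min_count times in one alignment column."""
    counts = Counter(column)
    return sorted(aa for aa, n in counts.items() if n >= min_count)

# Hypothetical alignment columns for four active-site positions
# (one letter per aligned sequence; real alignments are far larger).
columns = [
    "GGGGGAAASS",
    "LLLLLIIIVM",
    "FFFFYYYWWL",
    "TTTTTTSSSA",
]

full_size = 20 ** len(columns)                  # every residue at every position
choices = [allowed_residues(col) for col in columns]
focused_size = prod(len(c) for c in choices)    # only frequently observed residues

print(f"full saturation: {full_size:,} variants")
print(f"allowed residues per position: {choices}")
print(f"focused library: {focused_size} variants")
```

With these made-up columns the library drops from 160,000 to a few dozen combinations; the design choice is to trade completeness for a much higher fraction of variants that are likely to fold and function.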
Because the workload for screening larger libraries containing multiple mutations increases exponentially with library size, most researchers work on the assumption that beneficial mutations are mostly additive67 and that synergistic effects are rare, except for nearby changes. Indeed, combining beneficial single mutations often yields additive improvements (for example increasing the stability of an esterase from Bacillus subtilis to an organic solvent68 or increasing the enantioselectivity of a lipase from Pseudomonas aeruginosa69). However, often the contributions do not add up exactly or have unexpected behaviour. For example, mutations A and B by themselves may be deleterious, but together they may be beneficial. Weinreich and co-workers70 investigated such cooperative interactions in the evolution of a β-lactamase with higher activity. Mutation A increased the reaction rate but destabilized the β-lactamase. The overall effect was slightly beneficial. Mutation B did not affect rate but stabilized the β-lactamase; by itself it had no effect. Together mutations A and B were highly beneficial because the β-lactamase was faster and maintained its stability, but adding mutations stepwise will most likely miss these types of synergy. Reetz and Sanchis came to similar conclusions in testing the stepwise addition of mutations to increase enantioselectivity71. Synergistic effects are important when one of the mutations is a stabilizing mutation and when the mutations are near one another. The complication of non-additivity due to stabilizing mutations can be minimized by stabilizing the protein before starting mutagenesis, but the complication of non-additivity due to nearby mutations is the most common one and not easily avoided.
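The stability-mediated synergy described for mutations A and B can be reproduced with a simple two-state folding model. The sketch below assumes that observed activity equals the catalytic rate multiplied by the fraction of folded protein; this model, the temperature and all numerical values are hypothetical illustrations and are not taken from the β-lactamase study.

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol at an assumed 298.15 K

def fraction_folded(dg_fold):
    """Two-state model: dg_fold is the folding free energy in kcal/mol
    (negative means the folded state is favoured)."""
    return 1.0 / (1.0 + math.exp(dg_fold / RT))

def activity(rate, dg_fold):
    """Assumed observable: catalytic rate times fraction of folded enzyme."""
    return rate * fraction_folded(dg_fold)

# Hypothetical parent and mutation effects
parent_rate, parent_dg = 1.0, -3.0
rate_gain_a, ddg_a = 10.0, +4.0   # A: faster but strongly destabilizing
rate_gain_b, ddg_b = 1.0, -2.0    # B: same rate, stabilizing

parent = activity(parent_rate, parent_dg)
rel_a = activity(parent_rate * rate_gain_a, parent_dg + ddg_a) / parent
rel_b = activity(parent_rate * rate_gain_b, parent_dg + ddg_b) / parent
rel_ab = activity(parent_rate * rate_gain_a * rate_gain_b,
                  parent_dg + ddg_a + ddg_b) / parent

print(f"A alone:  {rel_a:.2f}-fold")
print(f"B alone:  {rel_b:.2f}-fold")
print(f"A and B:  {rel_ab:.2f}-fold")
print(f"expected if independent: {rel_a * rel_b:.2f}-fold")
```

With these numbers A alone gives a modest gain, B alone is nearly neutral, and the double mutant is several-fold better than the product of the single effects. A stepwise search that discards B for having no effect would therefore miss the combination.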
Consequently, the extent of useful changes made during the improvement of a protein has increased drastically in the past decade. In the early 2000s, 1–5 mutations were typical, whereas by 2010, 30–40 amino-acid substitutions were not unusual. For example, directed evolution of the halohydrin dehalogenase for manufacture of the atorvastatin (Lipitor) side chain (Fig. 3) changed at least 35 of the 254 amino acids60 (∼14%) and directed evolution of the transaminase for sitagliptin manufacture (Fig. 4) changed 27 of the 330 amino acids17 (8.2%). Similarly, computational design of a retro aldolase required 8 or 12 amino acid substitutions (4–6%) in the starting enzyme, which was a xylanase composed of 197 amino acids72.
A second advance made in the past ten years is the creation of new, often non-natural, catalytic activities. The starting point for this new activity is usually a catalytically promiscuous reaction. Catalytic promiscuity is the ability of one active site to catalyse more than one reaction type. Typically, the enzyme catalyses one normal reaction and additional side reactions, which may involve common catalytic steps. The new reaction type is not just a substituent added to the substrate, but involves a different transition state and/or forms different types of chemical bond. For example, pyruvate decarboxylase normally converts pyruvate to acetaldehyde and carbon dioxide. However, a promiscuous catalytic activity of pyruvate decarboxylase is the coupling of this acetaldehyde to another aldehyde in an acyloin condensation. Such a non-natural pyruvate-decarboxylase-catalysed condensation of acetaldehyde with benzaldehyde is the basis for an industrial process developed at BASF in the 1920s to make a precursor of the drug ephedrine. Recent protein engineering enhanced the promiscuous ability of pyruvate decarboxylase to catalyse the acyloin condensation73. The normal reaction requires a proton transfer, but the promiscuous reaction does not. A single amino-acid substitution to remove the proton donor disabled the natural activity and increased the promiscuous activity about fivefold.
The method of disabling unwanted pathways to increase flux to the desired product is further developed by the third advance, metabolic pathway engineering. This allows more complex pathways from secondary metabolism to be transferred into new organisms and entirely new biochemical pathways to be created to make pharmaceutical intermediates and biofuels. The normal metabolisms of terpenes, amino acids and fatty acids have been re-engineered to make hydrocarbons, alcohols and polyesters for use as fuels, bulk chemicals and plastics (see above).
Challenges remaining in biocatalyst engineering
Despite the advances, there remain major challenges to harnessing the advantages of biocatalysis fully. Enzyme engineering is much faster than it was ten years ago, but changing 30–40 amino acids and screening tens of thousands of candidates still requires a large research team. Many, if not all, engineering strategies will yield improved variants, but some will yield better variants and find them faster. Which ones are the better strategies is still unclear. Directly comparing strategies for the same problem and testing the assumptions behind different strategies will identify the most efficient ones.
The first assumption is that the goal can be achieved using enzyme engineering. The thermodynamics of reactions involving non-natural substrates may be less favourable than that of reactions involving natural substrates, and attaining certain enzyme activities may be thermodynamically impossible. Diffusion sets an upper limit to reaction rates. A closer integration of thermodynamics and biocatalytic process development is highly desirable in designing new processes.
Second, protein engineering often relies on knowing the quaternary structure of the enzyme because residues at the protein–protein interface can contribute to stability. Researchers assume that the structure of the enzyme under reaction conditions (low enzyme concentrations, high substrate concentration, organic solvents and so on) is very similar to that of the crystallized enzyme (high enzyme concentration, no substrate and/or organic solvent). Because proteins crystallize only under narrow conditions found by extensive experimentation, in solution they probably adopt many conformations besides the ones seen in the crystal structures. Furthermore, our understanding of protein dynamics is still very limited and this makes predictions difficult.
Third, enzyme engineering assumes that individual mutations are additive67. Although mutations are mostly non-interactive, many interactive mutations are highly useful but difficult to study. One way of identifying cooperative effects involves statistical analysis using the ProSAR algorithm60, but better techniques are needed to predict at an early stage of protein engineering which additional mutations are possibly additive and which lead to a dead end.
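The additivity assumption can be made concrete with a small check on double mutants. The sketch below is not the ProSAR algorithm; it is a minimal illustration that assumes activities are expressed as fold-changes relative to wild type, so that additive free-energy contributions correspond to multiplied fold-changes. The mutation labels, activity values and the 1.5-fold tolerance are all hypothetical.

```python
from itertools import combinations

# Hypothetical fold-changes in activity relative to wild type (illustrative only).
single_effects = {"A": 2.0, "B": 3.0, "C": 1.5}
observed_doubles = {("A", "B"): 6.1, ("A", "C"): 0.4, ("B", "C"): 9.0}


def additive_prediction(m1, m2, singles):
    """Predict a double mutant's fold-change if the two mutations do not interact.

    Additive free-energy contributions correspond to multiplied fold-changes.
    """
    return singles[m1] * singles[m2]


def classify_pairs(singles, doubles, tolerance=1.5):
    """Label each measured double mutant as additive, synergistic or antagonistic.

    A pair is treated as interactive when the observed fold-change differs from
    the additive prediction by more than `tolerance`-fold in either direction.
    """
    result = {}
    for pair in combinations(sorted(singles), 2):
        if pair not in doubles:
            continue  # this double mutant was not measured
        predicted = additive_prediction(pair[0], pair[1], singles)
        ratio = doubles[pair] / predicted
        if ratio > tolerance:
            label = "synergistic"
        elif ratio < 1.0 / tolerance:
            label = "antagonistic"
        else:
            label = "additive"
        result[pair] = (predicted, label)
    return result


if __name__ == "__main__":
    for pair, (predicted, label) in classify_pairs(single_effects, observed_doubles).items():
        print(pair, f"predicted {predicted:.1f}-fold,", label)
```

With these made-up numbers, the A/B pair behaves additively, A/C is antagonistic (a possible dead end) and B/C is synergistic. A pairwise check of this kind scales poorly as the number of mutations grows, which is why statistical approaches across whole libraries are attractive.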
Fourth, computer design of new enzyme activities is not accurate. Design still requires testing 10–20 predictions and usually results in an enzyme with low activity, which then requires substantial further engineering. For example, the initial computer-based design of an enzyme for the manufacture of sitagliptin17 yielded an enzyme that converted only 0.1 substrate molecule per day, yet that substrate fits well in the active site within the computer model derived from the crystal structure. New enzymes can be designed to catalyse reactions not found in nature (Kemp-elimination28, new Diels–Alder reactions74), but such activities are so far too low for practical use. Better understanding of the mechanistic, dynamic and structural aspects of enzymatic catalysis is needed.
Technical challenges also limit biocatalysis. The current DNA synthesis methods are close to their efficiency limits, but still cost approximately US$0.35 per base (∼US$300 per 1,000-nucleotide gene), which is too high for large-scale applications requiring thousands of genes. Longer and cheaper DNA fragments would simplify and speed up experiments. Next-generation DNA synthesis methods may involve synthesis of oligodeoxynucleotides by codons (trinucleotides) rather than individual nucleotides. This approach was first suggested two decades ago but never reached the mainstream, presumably owing to instrumental limitations (synthesis starts with 64 phosphoramidite trinucleotides rather than four mononucleotides). Nevertheless, the concept was recently used in the rapid assembly of entire genes in a single synthesis and in the preparation of high-quality mutagenesis libraries75, and thus seems feasible today.
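The trade-off behind codon-based synthesis can be illustrated with simple counting. The sketch below is only a back-of-the-envelope model: it enumerates the 64 trinucleotide building blocks and compares how many building blocks must be added to make a sequence of a given length one nucleotide or one codon at a time. The function names are hypothetical, and real synthesis chemistry (coupling yields, reagent cost) is not modelled.

```python
import math
from itertools import product

NUCLEOTIDES = "ACGT"


def trinucleotide_building_blocks():
    """Return all possible trinucleotide (codon) building blocks."""
    return ["".join(codon) for codon in product(NUCLEOTIDES, repeat=3)]


def blocks_needed(length_nt, block_size):
    """Number of building blocks added to reach `length_nt` nucleotides.

    `block_size` is 1 for nucleotide-by-nucleotide synthesis and 3 for
    codon-by-codon synthesis; a partial final block still counts as one addition.
    """
    return math.ceil(length_nt / block_size)


if __name__ == "__main__":
    print(len(trinucleotide_building_blocks()), "distinct trinucleotide reagents")
    for length in (60, 1000):
        print(length, "nt:",
              blocks_needed(length, 1), "single-nucleotide additions vs",
              blocks_needed(length, 3), "codon additions")
```

The model shows the trade-off in its simplest form: roughly one-third as many additions per sequence, at the price of stocking 64 reagents instead of four.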
New ideas for the integration of biocatalysts with nanodevices and in complex multi-enzyme assemblies hold promise for the future. Enzyme immobilization has been a strategy since the early days of biocatalysis, but it may be more effective when the biocatalyst’s surface orientation is controlled76. Similarly, using proteins and nucleic-acid scaffolds to control the number and orientation of enzymes within multi-enzyme pathways also improves efficiency77. Separately, functional matrices such as carbon nanotubes and quantum dots can substitute for complex biological electron transfer systems, offering new methods for regenerating redox catalysts and interfacing enzymes with semiconductors78. Nevertheless, the integration of enzymes with non-biological matrices and nanomaterials, and as part of metabolic engineering, is still inefficient. Future protein engineering has to address challenges emerging through the interfacing of individual biocatalysts with other proteins in a metabolic pathway or support matrices.
Protein engineering solves the previous weaknesses of biocatalysts: low stability and low activity towards unusual substrates. In the past, large amounts of protein were used to compensate for low activity, and this caused emulsions that hampered work-up and reduced yield. Highly active enzymes solve this problem because emulsions do not form when smaller amounts of protein are used. Training chemists in both biocatalysis and chemocatalysis will help them choose the best solution in each case. Improved enzymes with a long shelf life and good activity and stability in organic solvents should help biocatalysis to spread further into industrial laboratories.
Recent advances in protein engineering have achieved the equivalent of converting mouse proteins into human proteins. The amino-acid sequences of similar proteins in mice and humans typically differ by ∼13% (ref. 79). Today’s advanced protein engineering makes similar changes in converting a wild-type enzyme into an enzyme suitable for chemical process applications. This protein engineering is equivalent to compressing the 75-million-year evolution from an early mammal to modern-day mice and humans into several months of laboratory work. Consistent with the more extensive changes made in these proteins, the properties have also changed more dramatically. The catalytic properties of the enzymes have improved quantitatively by factors of thousands to millions80, and the engineered enzymes can now act in unusually harsh conditions. The understanding of protein engineering built over the third wave of biocatalysis allows dramatic improvements in enzymatic performance to be realized in parallel with the development of chemical syntheses requiring these catalysts, allowing biocatalysis to develop as an increasingly important tool in chemical synthesis.
We thank H.-P. Meyer and R. Fox for discussions. R.J.K. thanks the US National Science Foundation (CBET-0932762) and the Korea Science and Engineering Foundation funded by the Ministry of Education, Science and Technology (WCU programme R32-2008-000-10213-0). U.T.B. and S.L. thank the German Research Foundation (SPP 1170, Bo1864/4-1) and the US National Science Foundation (CBET-0730312), respectively, for financial support.