Nature provides a fantastic array of catalysts extremely well suited to supporting life, but usually not so well suited for technology. Whether biocatalysis will have a significant technological impact depends on our finding robust routes for tailoring nature's catalysts or redesigning them anew. Laboratory evolution methods are now used widely to fine-tune the selectivity and activity of enzymes. The current rapid development of these combinatorial methods promises solutions to more complex problems, including the creation of new biosynthetic pathways. Computational methods are also developing quickly. The marriage of these approaches will allow us to generate the efficient, effective catalysts needed by the pharmaceutical, food and chemicals industries and should open up new opportunities for producing energy and chemicals from renewable resources.
Biological systems are masterful chemists — a fact long appreciated by those who study how living things build complex molecules and systems from simple compounds. Enzymes catalyse the interconversion of a vast number of molecular structures, achieving tasks that range from the fixation of nitrogen to the synthesis of large and intricately structured molecules that ward off predators or attract mates. Such catalysts are models of energy-efficient, environmentally benign chemical agents, as virtually all do their work under mild conditions — in water, at room temperature and atmospheric pressure — and generate few waste products.
In view of increasing environmental and economic pressure to use renewable sources for energy and chemical feedstocks in industry, biocatalysts look like potentially attractive technological tools. But enzymes have evolved to contribute to the survival and reproduction of the organisms that make them; that they might also be useful in laundry detergents or to synthesize a new drug is simply serendipitous. In fact, attempts to use enzymes or whole organisms in applied chemical processes or products reveal some severe disadvantages of biocatalysts. Some of them turn off when a little product accumulates. This feature, so useful in regulating the flow of metabolites inside a cell, quickly derails implementation of a biocatalytic process to make that product. The process engineer is also unlikely to favour a delicate catalyst that must be replaced every few hours or must be coddled to keep it going. And of course nature does not conveniently provide a catalyst for any transformation we wish to conduct.
For many years the identification of new biocatalysts depended on labour-intensive screening of microbial cultures for the desired activities. Almost all the biocatalysts in use today came from the small fraction of organisms that can be grown under controlled conditions, the 'microbial weeds'. (By most counts fewer than 1% of all microorganisms can be cultured.) Some of these organisms live in harsh environments and their catalysts exhibit remarkable and useful properties, including the ability to function under extreme conditions of temperature, salt or pH (ref. 1). Other organisms have potentially useful new catalysts or enzyme pathways that allow them to produce valuable, biologically active compounds (ref. 2). These catalysts could be recruited to make the natural products in more tractable organisms such as Escherichia coli.
Efforts to comb natural biodiversity for useful activities have been greatly facilitated by high-throughput screening technologies and by new methods for collecting genes from the environment and expressing them in recombinant organisms2,3,4. These processes allow faster access to useful catalytic activities from organisms that cannot be cultured. But natural diversity cannot address all practical biocatalytic problems. Screening larger libraries of DNA or microbes may not even be the fastest or most efficient route to obtaining a good catalyst. Some problems can be solved by the right method of implementation — immobilization or crystallization can stabilize weak protein structures, for example. But many problems are best attacked by engineering the catalyst itself, whether it be a single enzyme, multiple enzymes or even a whole cell.
A revolution in biological design possibilities was unleashed by the advent of recombinant DNA technology, with which one can manipulate DNA sequences in a highly specific fashion and express their protein products in a variety of organisms, from animals to bacteria. This provides a means to redesign nature's catalysts at the molecular level according to detailed specifications, and to produce them in large quantities in fast-growing microorganisms. In this review I consider the ways in which biotechnological methods permit the restructuring of enzymes to adapt their functions for applied ends. Broadly speaking, one can identify two philosophies: either existing biocatalysts can be fine-tuned by rational redesign, or combinatorial techniques can be used to search for useful functionality in libraries generated at random and improved by suitable selection methods.
Our ability to manipulate the structures and functions of biological molecules and even whole organisms at will carries the prospect of applications previously not considered in the realm of biocatalysis, such as very large scale chemical production. In the not-too-distant future we can expect custom-made enzymes for gene therapies5,6, new reagents for basic science and clinical diagnostics7,8, and even new designs of the cellular machinery for making proteins in vivo9.
Rational redesign of natural biocatalysts
To take full advantage of recombinant DNA technology for making new enzymes, we need to know the connection between protein sequence and function. In other words, redesigning nature's catalysts rationally — that is, by specifying the sequence — usually requires detailed understanding of structures and mechanisms. This information is unavailable for the vast majority of enzymes. Even if the target enzyme is well characterized, the molecular basis for the desired function may not be. With hundreds and even thousands of atoms that interact weakly with each other in an ensemble of closely related and interconverting folded conformations, the complex and finely tuned enzyme fades easily in the clumsy hands of the protein engineer.
Despite these challenges, biological design is now going through its most exciting period since the introduction of recombinant DNA methods and the invention of site-directed mutagenesis over two decades ago. One factor contributing to this capability is the exponential growth of the databases of protein structures and sequences. By arranging proteins in family trees we have learnt that those proteins existing today evolved from a much smaller set of ancient molecules. Shuffling of segments, fusions with other proteins and accumulation of random mutations all contributed to their diversification. Sometimes the sequences remained sufficiently conserved during this process that we can compare the sequence of a new biocatalyst identified in a screening programme to the thousands that have been deposited in databases and identify related proteins whose functions, and maybe even structures, are already known. With this information we can make inferences about the new catalyst's structure and activity.
There is now ample evidence that new enzymes evolved in nature by relatively minor modification of active-site structures10,11. Thus sequence and structure information can sometimes be used to good effect in transferring activities from one enzyme to another related one. Shanklin and co-workers have exploited this notion to re-engineer membrane-bound di-iron enzymes that exhibit distinct hydroxylase, epoxidase, acetylenase and conjugase activities during fatty-acid biosynthesis in plants12,13. It is believed that all these activities arose from a common progenitor enzyme through modification of the active site to allow direction of a common free-radical reaction intermediate into the different end products14 (Fig. 1). Comparing the sequences of five related oleate desaturases with those of two hydroxylases, Shanklin and co-workers identified seven positions that were strictly conserved within the five desaturases but which differed from the equivalent positions in the two hydroxylases12. Reciprocal amino-acid changes between one of the desaturases and one of the hydroxylases at the seven sites yielded pronounced shifts in the ratio of desaturase to hydroxylase activity. Four amino-acid substitutions were sufficient to convert the desaturase to a hydroxylase, and as few as six substitutions turned the hydroxylase into a desaturase.
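The sequence comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the aligned fragments below are invented toy strings, not the real desaturase or hydroxylase sequences, and the function name `discriminating_positions` is hypothetical. The sketch reports alignment columns that are strictly conserved in one family and that carry a different residue in every member of the other family.

```python
def discriminating_positions(group_a, group_b):
    """Return 0-based alignment columns that are strictly conserved within
    group_a and where no sequence in group_b carries the group_a residue."""
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("all sequences must be aligned to the same length")
    hits = []
    for i in range(length):
        residues_a = {seq[i] for seq in group_a}
        if len(residues_a) != 1:
            continue  # column varies within group_a, so it is not conserved
        conserved = next(iter(residues_a))
        if all(seq[i] != conserved for seq in group_b):
            hits.append(i)
    return hits


# Toy, invented aligned fragments (assumption: a pre-computed alignment).
desaturases = ["MATGL", "MATGL", "MSTGL", "MATGL", "MATGL"]
hydroxylases = ["MAIGV", "MAVGV"]
positions = discriminating_positions(desaturases, hydroxylases)
```

Columns found this way are only candidates for reciprocal substitution experiments; the comparison itself says nothing about which of them actually control the activity.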
Another example comes from work on a dehalogenase from the 2-enoyl-coenzyme A (CoA) hydratase/isomerase superfamily11. Comparison of the sequences and structures of various family members revealed that their active sites can be viewed as derivatives of a single active-site structure that provides CoA binding, an oxyanion pocket and a chamber containing stations at which substrate binding and catalytic groups have been strategically positioned. Experiments with site-directed mutagenesis show that chemical diversification can be achieved through placement of one or more polar residues along the stations: grafting eight amino-acid substitutions from crotonyl hydratase conferred on its relative, 4-chlorobenzoyl-CoA-dehalogenase, the new ability to catalyse the hydration of crotonyl-CoA.
This comparative approach can be useful for identifying amino acids that control particular enzyme behaviours and demonstrating mechanisms for the diversification of catalytic functions in nature. Catalytic plasticity has clearly contributed to the evolution of chemical diversity. But such comparisons are likely to be of limited use for designing new biocatalysts, because the changes produced by altering the identified amino acids often do not extend beyond the range encoded in the parental genes13. One of the main goals of biocatalyst engineering is to endow enzymes with new features that are not found in the natural sequences because they confer no evolutionary advantage.
A further problem is that the amino acids care about their context — their neighbours influence their contributions to an enzyme's activity. Also, many important biocatalyst properties are not localized in a small number of catalytic residues, but reflect contributions from many residues distributed over large parts of the protein. Even when large functional changes can be obtained with a few amino-acid substitutions, it will often be difficult or impossible to discern the specific mutations responsible. Most sequence changes accumulated during evolution have little or no effect on the property of interest, and their presence makes it difficult to pick out the key positions15. A good example is stability. Hundreds of amino acids using many types of interactions can contribute to the stability of a protein, and useful design rules for stabilization have not yet been extracted from sequence comparisons16. The factors determining stability have, however, been good targets for powerful computational methods of protein modelling that can handle the large numbers of competing interactions17,18.
Nearly all engineered enzymes that are used today came out of structure-based protein-engineering efforts of the 1980s. The successes have been notable, but the results were costly and came far too slowly. Although some properties, notably enzyme specificity, respond relatively well to structure-based design and site-directed mutagenesis19, this approach is often cumbersome and unsatisfactory for engineering industrial biocatalysts which must meet a long list of performance specifications and for which the windows of opportunity are all too brief. In the pharmaceuticals industry, new catalysts must be selected and implemented often within a few months. Predictive capabilities are still rudimentary for catalysis, and even when successful, the desired changes in activity and specificity often come at the cost of other, equally important properties, such as stability or expression level.
Breeding a better catalyst
Another key factor contributing to expanding biological design capabilities is the development of 'evolutionary' protein design methods20 using random mutagenesis, gene recombination and high-throughput screening.
Unlike natural evolution, laboratory evolution is directed — more like breeding21,22. A 'generation' of molecules can be bred in a few days, with large numbers of progeny subject to selective pressures not encountered in nature. Because the molecules are produced in recombinant cells and are decoupled from their biological functions, they can be bred for non-natural but useful properties, including the ability to carry out reactions on substrates not encountered in nature, or to function under highly unusual conditions. Because molecules can be bred for multiple traits simultaneously by changing the conditions of the screen or selection, this approach is particularly attractive for engineering industrial biocatalysts.
Although there are many ways to evolve a biocatalyst in the laboratory, they all involve two main steps: making a set of mutant biocatalysts and searching that set for mutants with the desired properties. The process can be iterative, so that large changes in function are obtained by accumulating small changes over many generations.
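The two-step cycle just described (make a library of mutants, then search it) can be written as a short loop. The following is a minimal sketch, not a model of any real experiment: the scoring function, the `TARGET` sequence, the starting sequence and the library size are all hypothetical stand-ins for a laboratory screen, and each mutant here carries exactly one substitution.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen position substituted."""
    pos = rng.randrange(len(seq))
    new = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
    return seq[:pos] + new + seq[pos + 1:]


def evolve(parent, score, generations=10, library_size=200, seed=0):
    """Iterate mutagenesis and screening, keeping the best variant found."""
    rng = random.Random(seed)
    history = [score(parent)]
    for _ in range(generations):
        # Step 1: make a library of single mutants of the current parent.
        library = [mutate(parent, rng) for _ in range(library_size)]
        # Step 2: screen the library and keep the top variant if it improves.
        best = max(library, key=score)
        if score(best) > score(parent):
            parent = best
        history.append(score(parent))
    return parent, history


TARGET = "MKVLAAGIW"  # hypothetical optimum, used only to define a toy screen


def toy_score(seq):
    """Toy 'activity': number of positions that match the hypothetical target."""
    return sum(a == b for a, b in zip(seq, TARGET))


start = "MAAAAAAAA"
evolved, history = evolve(start, toy_score)
```

Because a new parent is adopted only when it outscores the old one, `history` never decreases, which mirrors the accumulation of small improvements over generations.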
Carrying out sequential rounds of random mutagenesis on ever-improved mutants is a simple and highly effective strategy that has been applied successfully to a number of catalyst design problems. Particularly relevant to biocatalysis, and particularly difficult to manipulate using structure-based design, is enzyme enantioselectivity23. Subtle changes in enzyme structure and even changes in reaction conditions can influence enantioselectivity, but these effects are almost impossible to predict. Enantioselectivity can, however, be tuned by laboratory evolution. Starting from a naturally occurring lipase with almost no selectivity for the hydrolysis of racemic 2-methyldecanoic ester, Liebeton et al.24 used several rounds of mutagenesis and screening to evolve an enzyme that catalysed the reaction at more than 90% enantiomeric excess. In another recent study, three generations of mutagenesis and screening actually inverted the enantioselectivity of a hydantoinase to prefer L- over D-5-(2-methylthioethyl)hydantoin and increased its activity fivefold25. Degussa AG is currently evaluating a whole-cell catalyst incorporating the evolved enzyme for commercial production of enantiopure L-methionine.
Laboratory evolution has also been effective in altering other key biocatalyst properties, including stability, function in non-natural environments (such as organic solvents; see accompanying review by Klibanov, pages 241–246), product inhibition, expression in a recombinant host and substrate specificity21,23,26. A particularly impressive example is the evolution of an aspartate transaminase to have 2.6 × 10⁶-fold higher activity towards the non-native substrate valine27,28. The crystal structure of the evolved enzyme shows how the active site was remodelled through the cumulative effects of mutations distributed over much of the enzyme structure27. Yet only one of the 17 mutated residues contacts the substrate, and none contact the pyridoxal 5′-phosphate cofactor. This study illustrates well how complex the solutions to enzyme design problems can be, a point echoed in structural analyses of other laboratory-evolved enzymes29. In the right places, amino acids serve as 'molecular shims'13 to tune substrate and reaction specificity; beneficial amino-acid substitutions easily identified by random mutagenesis and screening may have minute structural consequences, beyond the resolution of structural analysis and certainly beyond our ability to predict.
Accumulating point mutations is an effective fine-tuning mechanism, but nature also uses other means to create new molecular diversity on which evolution can act. One of those is recombination. Recent studies show that recombination is an extremely useful operation for laboratory evolution. So-called DNA shuffling methods30 pioneered by Stemmer create hybrid gene libraries by homologous recombination of related parent genes (ref. 31 and Fig. 2). This 'molecular sex' creates new genes that code for proteins with sequence information from any or all parents. Genes from multiple parents and even from different species can be shuffled in a single step, operations that are forbidden in nature but may be very useful for rapid adaptation. DNA shuffling is used widely to generate highly improved biocatalysts, as well as ones with features not present in the parent enzymes and not known to occur in nature21,22.
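The outcome of shuffling, a library of hybrids that draw sequence from any or all parents, can be imitated in silico. The sketch below is an abstraction rather than the laboratory procedure: it assumes the parents are already aligned to equal length, places crossovers at random positions, and uses invented parent strings and a hypothetical function name, `shuffle_parents`.

```python
import random


def shuffle_parents(parents, n_crossovers, rng):
    """Build one chimera from aligned, equal-length parent sequences.

    Crossover points are drawn at random; each resulting segment is copied
    from a randomly chosen parent, so a chimera may mix any or all parents.
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned to the same length")
    points = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + points + [length]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)
        pieces.append(donor[start:end])
    return "".join(pieces)


# Invented toy parents; real parents would be related genes or proteins.
rng = random.Random(1)
parents = ["AAAAAAAAAA", "CCCCCCCCCC", "GGGGGGGGGG"]
library = [shuffle_parents(parents, 3, rng) for _ in range(50)]
```

Every position in a chimera is inherited from one of the parents at the same aligned position, so the library explores combinations of existing diversity without introducing new point mutations.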
This molecular breeding concept extends nicely to more complex problems involving many interacting genes. A good example is creating new, multienzyme pathways for making chemicals. Microorganisms, plants and animals produce a wide range of compounds that could function as new drugs, dyes, fragrances, flavourings and cosmetics. But many are found only in trace quantities in their natural sources and are difficult or impossible to synthesize chemically. An important goal for biocatalysis is to produce these compounds in fast-growing organisms suitable for large-scale production.
Genes encoding the enzymes that catalyse the series of chemical reactions necessary to make a particular compound can be transferred to more amenable host organisms, conferring on them the new ability to synthesize the desired product32,33. Molecular breeding can optimize the engineered pathways, and it can also create new pathways, capable of synthesizing novel compounds.
Schmidt-Dannert et al.34 have evolved the pathways that synthesize carotenoid pigments. Using a small set of bacterial genes that produce β-carotenes, they were able to exploit the remarkable plasticity of carotenoid biosynthetic pathways to generate pathways for a number of related carotenoids and precursors (Fig. 3). The two genes from Erwinia sp. that produce phytoene were engineered into E. coli, together with a large library of gene hybrids created by shuffling two versions of a third Erwinia gene encoding a desaturase, which normally introduces double bonds into phytoene to make lycopene. Among the thousands of coloured bacterial progeny, they found some that were more yellow and pink than the orange E. coli containing the three naturally occurring carotenoid biosynthetic genes. Different members of the bacterial library made one or more of the carotenoids that contained double bonds at the various positions, and all the possible desaturation products were represented.
Combination of the genes that made the pink carotenoid (tetradehydrolycopene) with a new library of mated gene hybrids of a fourth (cyclase) gene generated an even greater variety of coloured bacteria: yellow, orange, pink and bright red (Fig. 3). The bright red cells produce torulene, a carotenoid not made by Erwinia, and not known in any bacteria but found in some red yeasts. Yet the pathway created by molecular breeding is not the same as that used by yeast to make torulene.
The combination of gene assembly (pathway engineering) and molecular evolution can solve very complex problems of biological design. By generating efficient pathways to make natural and non-natural products, it can greatly extend the applications of biocatalysis into the discovery and production of new biologically active compounds.
De novo catalyst design
There are likely to be many problems for which natural molecules cannot even offer a suitable starting point for evolution. In some cases, the whole enzyme frameworks are not suitable because, with many hundreds of amino acids, they are too unwieldy to produce or use in large quantities or, in the case of protein-based drugs, cannot be delivered efficiently to their targets. In other cases, the biological pathways are too cumbersome for practical use. For example, biological oxidation reactions are usually catalysed by large multiprotein complexes and use expensive cofactors that few would consider for an industrial process. In general, the many, sometimes conflicting demands and the contingent nature of evolution mean that enzyme structures are not necessarily optimized as chemical reagents for a specific transformation, and there may be much better functional solutions that use completely different sequences and structures. How can one find them?
Some possible routes are evolutionary. Catalytic function can be coaxed out of protein frameworks evolved for different, non-catalytic roles. In one of the first evolutionary approaches to making new biocatalysts, catalytic antibodies or 'abzymes' were generated in response to molecules that mimic the transition state of a reaction35. But the development of commercially useful antibody catalysts has been hampered by their low expression, limited stability and generally low turnover rates, although there are a few notable exceptions36. The basic idea of targeting a transition-state analogue can be extended to generate catalytic activity from other, perhaps more tractable frameworks. But the activities of these new enzymes may still be low, reflecting the fact that transition-state binding is only one aspect of the catalytic process.
Other approaches to designing new protein catalysts use different breeding practices. There are many ways to create molecular diversity beyond point mutation and homologous recombination. Several groups are, for example, investigating nonhomologous recombination of distantly related37,38 and even unrelated sequences39 as a means to generate new functional proteins. Others are developing techniques to generate40,41 and screen42 larger libraries so as to be able to identify rarer but possibly more useful solutions.
But the number of possible protein sequences inevitably dwarfs any existing or even conceivable technology for searching it experimentally. So one must make intelligent choices about what and how to search. This may be where 'rational' design will be crucial: identifying the most likely places to search combinatorially for desired functions.
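The scale of the search problem follows from simple counting. The sketch below assumes the 20 standard amino acids and, purely for illustration, a protein of 300 residues (a hypothetical length, not one taken from the text); the function names are likewise illustrative.

```python
from math import comb


def sequence_space(length, alphabet=20):
    """Number of distinct sequences of the given length."""
    return alphabet ** length


def mutant_neighbours(length, n_mutations, alphabet=20):
    """Number of variants that differ from one parent at exactly n positions."""
    return comb(length, n_mutations) * (alphabet - 1) ** n_mutations


length = 300  # hypothetical protein length
total = sequence_space(length)            # 20**300, a number with 391 digits
singles = mutant_neighbours(length, 1)    # 300 * 19
doubles = mutant_neighbours(length, 2)    # C(300, 2) * 19**2
```

Even the double mutants of a single parent run to millions of variants, which suggests why choosing where to search matters as much as how many variants can be screened.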
That rudimentary structure-based designs can be improved through evolutionary tuning is well accepted, if not yet widely practised. This blend of approaches was demonstrated by Altamirano and co-workers, who converted an α/β-barrel enzyme with one activity (indole-3-glycerol phosphate synthase) into another with equally efficient activity (phosphoribosylanthranilate isomerase)43.
Conversely, structure-based computational methods can be used to identify likely sites for evolutionary improvement, thereby supporting the generation of specific 'targeted' libraries and greatly reducing the experimental search. Voigt et al. (ref. 44 and unpublished data) have used powerful computational methods17 to search vast regions of sequence space to identify the most probable solutions to protein-design problems. They use the computational methods where they work best — solving the generic problems of identifying protein sites that are tolerant to mutation or that will tolerate crossover without significant disruption — and the evolutionary methods to find specific solutions within the generic ones.
The ideal would be to specify a catalyst de novo: purely from its primary sequence. In principle this should be possible, and indeed the first de novo proto-enzymes are now being reported45 — although they are not particularly impressive catalysts. Primitive iron- and oxygen-binding sites introduced into the small protein thioredoxin by computational design show varying selectivities in oxidation processes45. Such designed sites might be adequate starting points for evolutionary methods.
Biocatalysts need to become predictable and routine tools. At present they are neither, and biocatalyst design is still more of an art than a science. But things are changing. Laboratory evolution methods are now sufficiently robust that improved biocatalysts can be obtained with confidence on a reasonable timescale. Further developments, especially miniaturization and automation of high-throughput screening, will accelerate the acceptance and widespread application of biocatalysis.
Today, evolutionary methods seem the most fertile approach for developing new commercial biocatalysts. But the capabilities of rational design, particularly computational techniques and de novo design, are expanding too. And emerging design methods that marry the best of the computational and the combinatorial approaches promise to make biocatalysis a key tool for synthetic chemistry in the century ahead.
I thank the many talented students and postdocs who have contributed to the development of new biocatalyst engineering tools in my laboratory, and the following organizations for their financial support: the US Office of Naval Research, the US National Science Foundation, the Army Research Office, Maxygen, Inc., The Biotechnology Research & Development Corporation, British Petroleum, Degussa AG and Procter & Gamble Co. I also thank C. Voigt and J. Shanklin for thoughtful comments, and J. Shanklin for Fig. 1.