Biocatalysis

Bell, Elizabeth L.; Finnigan, William; France, Scott P.; Green, Anthony P.; Hayes, Martin A.; Hepworth, Lorna J.; Lovelock, Sarah L.; Niikura, Haruka; Osuna, Sílvia; Romero, Elvira; Ryan, Katherine S.; Turner, Nicholas J.; Flitsch, Sabine L.

doi:10.1038/s43586-021-00044-z


Primer
Published: 24 June 2021

Nature Reviews Methods Primers volume 1, Article number: 46 (2021)

Abstract

Biocatalysis has become an important aspect of modern organic synthesis, both in academia and across the chemical and pharmaceutical industries. Its success has been largely due to a rapid expansion of the range of chemical reactions accessible, made possible by advanced tools for enzyme discovery coupled with high-throughput laboratory evolution techniques for biocatalyst optimization. A wide range of tailor-made enzymes with high efficiencies and selectivities can now be produced quickly and on a gram to kilogram scale, with dedicated databases and search tools aimed at making these biocatalysts accessible to a broader scientific community. This Primer discusses the current state-of-the-art methodology in the field, including route design, enzyme discovery, protein engineering and the implementation of biocatalysis in industry. We highlight recent advances, such as de novo design and directed evolution, and discuss parameters that make a good reproducible biocatalytic process for industry. The general concepts will be illustrated by recent examples of applications in academia and industry, including the development of multistep enzyme cascades.

Introduction

Enzymes have been employed for a wide variety of chemical processes for decades (Fig. 1). For example, nitrile hydratases are used to make acrylamide on a scale of thousands of tons, and enzymes have been added to detergents for more than 30 years^1,2. More recently, the use of proteins as catalysts for chemical synthesis of more complex molecules, such as pharmaceuticals, has become increasingly widespread. Enzymes are particularly powerful because they merge the advantages of a directing group controlling selectivity and a catalyst in a single reagent³, which can also be used with other enzymes in a one-pot reaction. Over the past 20 years, combined synthetic–enzymatic systems have enabled multiple total synthesis endeavours, and the use of enzymes is becoming routine in some process chemistry groups in industry⁴. Until recently, only a subset of enzymes, such as lipases or ketoreductases (KREDs), were available for chemical synthesis applications⁵. However, the growth of potential sources of enzymes for process chemistry applications has accelerated, resulting in a diverse toolkit of enzymes now available to researchers. In 2014, the development of a total enzymatic synthesis of the nucleoside didanosine highlighted the possibility of ‘bio-retrosynthesis’⁶. Based on the principles of retrosynthesis, where the target molecule is transformed into simple precursors by ‘breaking’ bonds that can be formed by synthetic transformations, ‘bio-retrosynthesis’ involves the design of an artificial enzyme cascade (a synthetic biochemical pathway) that offers a possible route towards the desired target molecule by choosing enzymes as catalysts for the required chemistry. The fully biocatalyst-driven synthesis of the HIV inhibitor islatravir (Fig. 1), which will be discussed in more detail in the Applications section, demonstrates the power of combining modern approaches towards designing new enzyme cascades, including repurposing of known biosynthetic pathways, screening of saturation mutagenesis libraries of enzyme variants and directed evolution targeting selected residues to increase enzyme stability and turnover⁷.

**Fig. 1: Examples of different products synthesized using biocatalysis.**

In this Primer, we discuss the different development stages (reaction design, biocatalyst choice and optimization, and bioprocess development) that can lead to a range of industrial products as shown in Fig. 1. These stages are interdependent and need to be closely integrated. Starting with a target molecule, a single or multistep biocatalytic process needs to be designed, often manually using expertise and literature precedent from organic synthesis and biocatalysis. More recently, programmes such as RetroBioCat⁸, which include biocatalyst databases, are being developed to speed up this step and enable the automatic design of de novo biosynthetic pathways. Once a process has been designed, suitable enzymes need to be selected for each step and tested (Experimentation). The increasing adoption of biocatalysis by the pharmaceutical industry has been driven by innovative tools in protein engineering, which allow fast optimization of catalyst activity, including laboratory evolution and computational design (Experimentation). As a result, strict reaction parameters (Results) can now be met within reasonable timescales for successful bioprocess development (Applications). These parameters include high activity on non-natural substrates and tolerance of non-physiological reaction conditions such as high temperature, high substrate concentration, organic solvents and wide pH ranges. Alongside protein engineering tools, databases of available biocatalysts with their reaction profiles are starting to be established (Reproducibility and data deposition). We also detail the current limitations of biocatalysis and areas of importance for further advancing this method to expand the breadth of applications (Limitations and optimizations). Finally, we highlight what the future holds for biocatalysis and the impact it will likely have in the next decade (Outlook).

Experimentation

In this section, we highlight several sources available to scientists looking for an enzyme as a starting point to develop a new biocatalyst. We discuss how one can optimize biocatalysts using directed evolution and computational design as well as how to incorporate non-canonical amino acids to enable novel chemistries.

Sources of biocatalysts

Enzymes can be obtained from several sources, including commercial suppliers, adaptation of enzymes from biosynthesis, screening of metagenomic libraries and in silico mining of databases.

Commercial sources of biocatalysts

Purified enzymes or lyophilized crude cell lysates are often available for direct purchase through chemical vendors (Fig. 2a). For example, one can purchase specific dehydrogenases, reductases or carbohydrate-active enzymes, and directly employ them in chemical synthesis. Libraries of commercially produced enzymes can be screened against specific substrates to identify candidate biocatalysts. Available libraries include oxidoreductases (KREDs, imine reductases (IREDs), ene reductases, Baeyer–Villiger monooxygenases, monoamine oxidases), transferases (transaminases), hydrolases (nitrilases, halohydrin dehalogenases, acylases), carbon–carbon bond-forming enzymes and carbon–nitrogen bond-forming enzymes⁹ (Box 1).

In addition to these commercial sources and kits, the wide availability of commercial gene synthesis means that, in principle, any enzyme with a known amino acid sequence can be obtained. Researchers can order the synthetic gene encoding the protein, express it recombinantly in a suitable host organism or by cell-free protein synthesis and then purify the protein for testing, much as one would test a catalyst in a chemical synthesis. Databases ranging from SciFinder (with a licence) to UniProt¹⁰ and BioCatNet¹¹ allow researchers to identify enzymes that catalyse desired chemical transformations. Thus, through the combination of publicly available sequence data and commercial gene synthesis, any enzyme with a reported sequence is available to the researcher, at a cost.

Box 1 Common biocatalysis enzymes and their associated chemical transformations

The interconversion between alcohols and aldehydes or ketones is catalysed by alcohol dehydrogenases, ketoreductases (KREDs) and alcohol oxidases (reaction i). Imine reductases catalyse the reduction of imines to amines (reaction ii). Reductive aminases and transaminases transfer amino groups to prochiral ketones or aldehydes, giving rise to chiral amine products. Monoamine oxidases catalyse the oxidation of compounds containing one amino group and the resulting imine spontaneously forms a ketone in water (reaction iii). Ene reductases reduce C–C double bonds (reaction iv). Nitrile hydratases catalyse the addition of water across nitriles producing amides (reactions v, vi). The hydrolysis of nitriles to carboxylic acids and ammonia is catalysed by nitrilases (reaction vii). Baeyer–Villiger monooxygenases oxidize ketones to esters (reaction viii). Halohydrin dehalogenases produce enantiopure epoxides and corresponding ring-opening products (reaction ix). Acylases hydrolyse amide bonds, whereas lipases and esterases hydrolyse ester bonds (reaction x). Thioesterases, which are part of the esterase group, catalyse the hydrolysis of thioester bonds. Carboxylic acid reductases reduce aromatic or aliphatic acids to aldehydes (reaction xi). Aldolases catalyse aldol reactions (reaction xii). Amine dehydrogenases catalyse the reductive amination of carbonyl compounds (reaction xiii).

Natural product biosynthetic enzymes

Enzymes involved in the synthesis of specialized metabolites, or natural products, are particularly useful as starting points for biocatalysis. Natural products tend to have diverse chemical structures, and studies on the biosynthesis of such natural products have unveiled a correspondingly diverse set of biosynthetic enzymes. Therefore, natural product biosynthetic enzymes are a potential source for diverse catalysts. A recent review discusses the wide-ranging chemical and enzymatic diversity found in natural product biosynthesis¹². From a biocatalytic point of view, the most important criteria in selecting a potential biosynthetic enzyme include its substrate specificity, cofactor dependence, turnover, stability, functional recombinant expression and ability to perform a stand-alone function outside its natural pathway within a cell.

One group of biosynthetic enzymes widely found in natural product biosynthesis are oxidative Fe(II) and 2-oxoglutarate-dependent enzymes, which can catalyse challenging reactions such as hydroxylation, halogenation and oxidative cyclizations, typically for C(sp³)–H functionalization, due to the reactive intermediates generated in the catalytic cycle¹³. These powerful enzymes have emerged as useful biocatalysts for chemical synthesis¹⁴. One example is GriE — an enzyme involved in the synthesis of the peptide griselimycin in Streptomyces¹⁵ that hydroxylates the δ-carbon of l-leucine and was employed for a key step in the total synthesis of manzacidin C¹⁶ (Fig. 2b). Other examples include halogenases such as BesD, which chlorinate free amino acids to generate non-proteinogenic amino acids¹⁷. A final example is KabC, which catalyses an oxidative cyclization and was employed in a biocatalytic synthesis of the neurochemical kainic acid¹⁸. These types of Fe(II) and 2-oxoglutarate-dependent enzymes represent a promising group of biocatalysts for further development as they can be reconstituted either using Escherichia coli cell lysates or in vitro¹⁹.

Other known natural product biosynthetic enzymes with demonstrated use in chemical synthesis include methyltransferases²⁰, Diels–Alderases that catalyse cycloadditions²¹, halogenases²², uridine diphosphate-dependent glycosyltransferases²³, laccases²⁴ and pyridoxal phosphate-dependent enzymes²⁵. However, biosynthetic enzymes are typically sluggish catalysts, with catalytic constant (turnover number) k_cat values typically about 30-fold lower than those of primary metabolism²⁶, probably owing to the constraints at work in their evolutionary histories^26,27. Biosynthetic enzymes are therefore viewed as starting points for directed evolution efforts to improve rates of catalysis, substrate tolerance as well as stability and solubility in desired solvents, which include ionic liquids and water-soluble organic solvents²³.
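For orientation, the kinetic parameters mentioned here can be read off the standard Michaelis–Menten rate law. This is a textbook relation rather than something stated in this Primer, and the symbols $[E]_0$ (total enzyme concentration) and $[S]$ (substrate concentration) are introduced only for this illustration:

```latex
v \;=\; \frac{k_{\mathrm{cat}}\,[E]_0\,[S]}{K_{\mathrm{M}} + [S]},
\qquad
v_{\max} \;=\; k_{\mathrm{cat}}\,[E]_0 ,
\qquad
\text{catalytic efficiency} \;=\; \frac{k_{\mathrm{cat}}}{K_{\mathrm{M}}}
```

In this form, $k_{\mathrm{cat}}$ (units of s⁻¹) is the number of turnovers per active site per unit time at saturating substrate, whereas $K_{\mathrm{M}}$ (units of concentration) is the substrate concentration giving half-maximal rate. Under these assumptions, a 30-fold lower $k_{\mathrm{cat}}$ means a 30-fold lower maximal rate for the same amount of enzyme.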

Metagenomic and in silico screening

Metagenomic libraries are an additional source for new biocatalysts²⁸; these are genomic libraries of DNA obtained from environmental samples such as marine sponges, soil or faeces²⁹. Functional-based screening or sequence-based screening is often used to search through a genomic library to find enzymes of interest (Fig. 2c). Functional-based screening can incorporate the use of colorimetric assays^30,31,32, mechanism-based probes³³ and/or droplet microfluidics³⁴. Researchers are looking for enzymes in these libraries that demonstrate catalysis of the desired target reaction within these screening platforms. For example, mammalian microbiomes are a rich source of carbohydrate-active enzymes, which are often encoded in polysaccharide utilization loci in specific bacterial genomes. Screening of a human metagenomic library using a fluorogenic substrate identified a pair of enzymes that convert the A antigen into the H antigen of O-type blood, enabling a biocatalytic approach to produce universal donor O-type blood from A-type blood³⁵.

By contrast, sequence-based screening is based on finding new enzymes with sequence similarity to known enzymes. One approach involves the use of PCR amplification of genomic sequences with degenerate primers. These primers are a mixture of oligonucleotide sequences that code for highly conserved regions of a desired type of catalyst, and therefore allow amplification of genes encoding for this catalyst from naturally derived DNA libraries. For example, sponge microbiomes are known for their biosynthetic capabilities, and in a case of sequence-based screening with degenerate primers, a new halogenase Krml was isolated from a sponge microbiome and employed for the regioselective halogenation of tryptophan and a range of indole-derived peptide substrates³⁶.
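The idea that a degenerate primer is a mixture of oligonucleotides can be made concrete with a short sketch. The code below expands a primer written with the standard IUPAC ambiguity codes into every sequence it represents and reports its degeneracy. The example primer is hypothetical and chosen only for illustration; it is not taken from the study cited above.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides in the degenerate primer mixture."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every unambiguous sequence represented by the degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    example_primer = "GGNTGGAAYGAR"  # hypothetical primer, illustration only
    print(degeneracy(example_primer))
    for sequence in expand(example_primer)[:4]:
        print(sequence)
```

Degeneracy grows multiplicatively with each ambiguous position, which is one reason such primers are usually designed against short, highly conserved regions.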

Another approach is the direct analysis of sequencing data, where specific types of desired enzymes can be found through in silico analysis³⁷. For example, sequencing data from a domestic drain metagenome were analysed in silico to find transaminase candidates, which were then tested for their ability to carry out transaminations on diverse substrates. One enzyme from this set of candidates retained activity in 50% dimethyl sulfoxide (DMSO), a solvent tolerance that had not been reported in a transaminase previously. As these examples demonstrate, once a gene encoding a desired enzyme is identified, the gene can be cloned and the enzyme can be expressed recombinantly to establish its function³⁸.

Directed evolution

Wild-type enzymes are often not suitable for direct use in industrial applications and must first undergo optimization to improve properties such as substrate specificity and selectivity as well as catalytic efficiency and stability. Directed evolution is a powerful and versatile technology for adapting these enzymes to perform new functions (as highlighted by the award of the 2018 Nobel Prize for Chemistry)^39,40,41. The directed evolution cycle involves iterative rounds of DNA library design and generation, gene expression and screening of enzyme library members (Fig. 3). Multiple properties can be optimized in parallel and improved variants can be isolated, characterized and used as templates for further rounds of evolution.
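The iterative cycle described above can be sketched as a toy simulation. This is a minimal illustration, not a model of any real enzyme: the "assay" is a hypothetical scoring function that counts matches to an arbitrary target sequence, and the mutation rate, library size and number of rounds are placeholder values.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(sequence: str, rate: float, rng: random.Random) -> str:
    """Random mutagenesis: each position is re-drawn with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else residue
        for residue in sequence
    )


def assay(sequence: str, target: str) -> int:
    """Hypothetical screen readout: positions matching an arbitrary optimum."""
    return sum(a == b for a, b in zip(sequence, target))


def directed_evolution(parent, target, rounds=10, library_size=200,
                       rate=0.05, seed=0):
    """Iterate library generation, screening and selection of the best variant."""
    rng = random.Random(seed)
    history = [assay(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rate, rng) for _ in range(library_size)]
        best = max(library, key=lambda variant: assay(variant, target))
        # Only an improved variant becomes the template for the next round.
        if assay(best, target) > assay(parent, target):
            parent = best
        history.append(assay(parent, target))
    return parent, history


if __name__ == "__main__":
    start = "MKTAYIAKQR"   # arbitrary starting sequence
    optimum = "MRTWYLAKHR"  # arbitrary "ideal" sequence for the toy assay
    final, scores = directed_evolution(start, optimum)
    print(final, scores)
```

Because only improved variants are carried forward, the recorded score never decreases from round to round. Real campaigns differ in that the assay is an experimental measurement and several properties may be tracked at once.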

Following identification of a suitable starting template, DNA libraries are generated using numerous standard molecular biology techniques, such as random mutagenesis or site saturation mutagenesis. The chosen method of library generation depends on factors such as the availability of structural information and screening capacity. The design of smaller, more focused libraries (10²–10⁴ variants) often employs computational modelling and bioinformatics to guide the selection of amino acid residues for randomization. These libraries are generated using techniques such as saturation mutagenesis or iterative combinatorial active site testing⁴² and often employ reduced codons^43,44. Larger library diversity (10⁵–10⁸ variants) can be generated using techniques such as error-prone PCR⁴⁵ and gene shuffling⁴⁶. It is common to use multiple mutagenesis techniques during enzyme evolution to target different regions of the protein structure. For example, focused active site mutations can be beneficial for reshaping substrate binding pockets and improving activity towards non-native substrates, and mutations to the protein surface and flexible loop regions can often result in improved solvent tolerance and thermostability. Beneficial mutations are typically combined during evolutionary optimization using DNA shuffling and can be guided by computational algorithms.

Transforming cells with DNA libraries leads to spatial separation of library members and establishes a link between genotype and phenotype that must be maintained during protein production and screening to allow characterization of individual library members. Arraying colonies into multiwell plates for protein production and screening offers the greatest versatility, as variants can be evaluated using a wide range of chromatographic, spectrophotometric and spectroscopic techniques. Although chromatographic methods are of relatively low throughput, they are commonly employed for applications in industrial biocatalysis as the assays are compatible with screening under process conditions, which often employ high substrate loadings (>100 g l^–1), co-solvents (for example, up to 50% DMSO in aqueous media) and high temperatures (40–50 °C). This workflow can also be automated to improve speed, accuracy and throughput. For example, colony pickers allow users to array 10³ colonies per hour into 96-well plates, liquid handling robots accelerate aliquoting and transfer steps required for library generation and protein production, and reaction analysis using state-of-the-art ultra-high performance liquid chromatography systems allows evaluation of 10³ clones per instrument per day. GSK used this approach to engineer an enantioselective IRED with a 38,000-fold improvement over the wild-type enzyme⁴⁷. This IRED variant was employed in a reductive amination and kinetic resolution step to manufacture the lysine-specific demethylase 1 (LSD1) inhibitor GSK2879552, a treatment for small cell lung cancer and acute leukaemia. Following a similar approach, Codexis and Merck have engineered five different enzymes, which form part of an impressive nine-enzyme cascade process to manufacture the HIV treatment islatravir⁷ (see Applications section for a more detailed discussion and reaction scheme).
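The library sizes and throughput figures quoted above can be combined into a rough screening budget. The sketch below is an illustration only: the codon schemes NNK (32 codons) and NDT (12 codons) are common reduced-codon choices assumed here rather than named in the text, all variants are assumed to be equally likely, and the 95% coverage target is an arbitrary choice. The oversampling estimate T = −V ln(1 − F) follows from treating clone picking as random sampling with replacement.

```python
import math

# Distinct codons per randomized position for some common degenerate
# codon schemes (assumed examples, not taken from the text).
CODONS_PER_SITE = {"NNN": 64, "NNK": 32, "NDT": 12}


def library_size(scheme: str, n_positions: int) -> int:
    """Number of codon combinations when n_positions are randomized at once."""
    return CODONS_PER_SITE[scheme] ** n_positions


def clones_for_coverage(n_variants: int, coverage: float = 0.95) -> int:
    """Clones to screen so a given variant is sampled with probability `coverage`."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be between 0 and 1 (exclusive)")
    return math.ceil(-n_variants * math.log(1.0 - coverage))


def instrument_days(n_clones: int, clones_per_day: int = 1_000) -> float:
    """Screening time at the roughly 10^3 clones per instrument per day quoted above."""
    return n_clones / clones_per_day


if __name__ == "__main__":
    for scheme in ("NNK", "NDT"):
        size = library_size(scheme, 3)
        clones = clones_for_coverage(size)
        print(scheme, size, clones, round(instrument_days(clones), 1))
```

Under these assumptions, three NNK positions give 32³ = 32,768 codon combinations and require roughly three times that many clones for 95% coverage, which is on the order of 100 instrument-days of chromatographic screening. The same three positions with NDT give 1,728 combinations, which illustrates why reduced codons are attractive for focused libraries.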

In order to evaluate larger libraries of 10⁵–10⁸ variants, more specialized screening approaches can be employed, including colony-based assays^48,49,50, fluorescence-activated cell sorting⁵¹, phage display⁵², microfluidic-based screening^53,54, selection-based approaches⁵⁵ and continuous evolution^56,57. For example, monoamine oxidase from Aspergillus niger (MAO-N) has been extensively engineered for the selective oxidation of a wide range of amine substrates using a colorimetric colony-based assay that relies on the detection of the hydrogen peroxide co-product by horseradish peroxidase (HRP) and a reactive dye⁵⁸. The throughput of oxidase evolution based on hydrogen peroxide detection can be further increased by screening variants in picolitre droplets. Indeed, in a recent study, an ultra-high-throughput microfluidic assay was used to screen a library of 10⁷ cyclohexylamine oxidase variants and, after only a single round of evolution, the most improved variant isolated had a 960-fold improvement in catalytic efficiency⁵⁹. Although the screening capacity in this example is greatly increased, picolitre droplet sorting is currently restricted to fluorescence as a detection method, which limits the versatility of this approach. Ongoing research is focused on coupling microfluidics with alternative methods of detection, such as mass spectrometry, to provide approaches for a wide host of chemistries⁶⁰.

Selection-based approaches including continuous evolution platforms^55,56,57, where improved catalyst performance is linked to cell viability, offer ultra-high-throughput screening capabilities (10⁶–10¹⁰ variants). These methods are highly specialized as improvements in the enzyme activity of interest must be linked to cell survival. Optimization of these multicomponent systems requires considerable effort and can take up to several years; it is important to have control over the stringency of the selection pressure and to ensure that the host organism is not able to evolve alternative mechanisms of survival. However, for enzymes whose native activities can be associated with organism fitness, this type of assay is particularly powerful as it allows rapid evaluation of broad sequence space. A key example is the development of a selection-based method for engineering pyrrolysyl-tRNA synthetases (PylRS) for the genetic incorporation of non-canonical amino acids into proteins⁵⁵. This versatile approach has been applied by numerous research groups to the evolution of a panel of PylRS variants, which are now able to accept more than 150 different non-canonical amino acids⁶¹.

Computational enzyme design and engineering

For the most part, the high efficiency of enzymes in accelerating chemical reactions has been attributed to their highly pre-organized active site pockets that precisely position the catalytic residues for transition state stabilization⁶². This precise arrangement in the active site pocket to optimize the chemical steps is complemented by the inherent flexibility of the enzyme structure. Enzymes can adopt multiple conformations that often play critical roles in equally important processes, such as substrate binding and/or product release for restarting the catalytic cycle⁶³. To this end, computational enzyme design protocols should propose specific amino acid changes (located in the active site but also in remote positions) to achieve highly pre-organized active site pockets for transition state stabilization, and optimize the enzyme conformational ensemble to favour substrate binding and product release⁶⁴.

In practice, available computational protocols focus only on a selected set of the complex features of enzyme catalysis; that is, they design enzymes based on either the chemical steps of the desired chemical transformations (see Fig. 4A), the substrate binding/product release process or the enzyme conformational dynamics (Fig. 4B). Different computational techniques are needed for each of the above features (see Fig. 4).

**Fig. 4: Computational approaches to enzyme design.**

Initial attempts to rationally design enzymes were focused on the chemical steps of the process (Fig. 4A, and selected examples Fig. 4Aa–c). The transition state of the desired transformation in the theoretical enzyme or ‘theozyme’ active site pocket is modelled with quantum mechanics to assess the potential rate acceleration and the ideal geometric constraints for optimal transition state stabilization (Fig. 4Aa). This optimal arrangement, which contains only a few active site residues owing to the high computational cost of quantum mechanics calculations, is then grafted onto an existing protein scaffold and further optimized by means of Rosetta or other related protein design software^65,66. Further refinements to this original formulation (named inside-out) can be made by incorporating data on protein conformational dynamics by means of short molecular dynamics simulations. In these simulations, the enzyme variant is immersed in a water solvent box, and the trajectory is analysed to assess whether the optimal arrangement of the catalytic residues, also known as the near attack conformation, is maintained throughout the simulation time. A higher number of near attack conformations explored during the molecular dynamics simulation is associated with a higher catalytic activity and/or selectivity, as the catalytic residues are properly arranged for catalysis for most of the simulation time. These observations resulted in the development of computational methodologies based on Rosetta and molecular dynamics simulations for enhancing enzyme activity and selectivity (catalytic selectivity by computational design (CASCO), as shown in Fig. 4Ab) or thermostability (framework for rapid enzyme stabilization by computational libraries (FRESCO))⁶⁷. Additional refinements such as the use of an ensemble of closely related enzyme conformations from either normal mode analysis, short molecular dynamics simulations or small perturbations in the enzyme backbone angles for multi-state design (Fig. 4Ac) were proposed to include some limited protein flexibility in the design process⁶⁸. Although these strategies include some protein flexibility during the design, the conformations in the ensemble are rather similar as they usually come from short (picosecond to nanosecond) molecular dynamics simulations. Other strategies are based on computing the direct effect of the included mutations on the activation barrier of the enzyme-catalysed process (with the computationally more demanding quantum mechanics/molecular mechanics or empirical valence bond (EVB) strategies, as shown in Fig. 4A), rather than estimating the effect by means of some key geometrical constraints (as in the near attack conformation analysis)⁶⁹.
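The near attack conformation analysis mentioned above amounts to counting the fraction of simulation frames that satisfy a set of geometric criteria. The sketch below shows that bookkeeping for one distance and one angle per frame. The cut-off values and the toy per-frame data are hypothetical placeholders; in practice the criteria would come from the quantum mechanical model of the transition state, and the geometries from a trajectory analysis package.

```python
def nac_fraction(distances, angles, max_distance=3.5, min_angle=150.0):
    """Fraction of frames in which both geometric criteria are met.

    distances: per-frame distance between the reacting atoms (angstrom)
    angles:    per-frame attack angle (degrees)
    The default cut-offs are illustrative assumptions, not recommended values.
    """
    if len(distances) != len(angles):
        raise ValueError("distances and angles must have the same length")
    if not distances:
        return 0.0
    hits = sum(
        1
        for distance, angle in zip(distances, angles)
        if distance <= max_distance and angle >= min_angle
    )
    return hits / len(distances)


if __name__ == "__main__":
    # Toy data for four frames of two hypothetical variants.
    variant_a = nac_fraction([3.0, 4.0, 3.2, 3.4], [160, 170, 140, 155])
    variant_b = nac_fraction([3.1, 3.3, 3.2, 3.4], [158, 165, 152, 149])
    print(variant_a, variant_b)
```

Comparing this fraction between variants gives a cheap geometric proxy for catalytic competence. As the text notes, it estimates the effect of mutations indirectly rather than computing the activation barrier.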

The importance of enzyme conformational dynamics for enzyme design has gained recognition in recent years^70,71,72 (Fig. 4B). Conformationally flexible loops adjacent to the active site pocket can regulate substrate binding and/or product release, and some studies have shown these loops to be crucial for enhanced enzymatic activity in many enzyme families⁷³. Bioinformatic tools such as CAVER have been developed to identify tunnels and channels, and to suggest potential mutational hotspots for novel catalytic activity⁷⁴ (Fig. 4Ba). The analysis of some natural and laboratory evolution pathways demonstrated that increased enzymatic activity is often achieved by introducing mutations that alter the enzyme’s conformational ensemble⁷⁵. These mutations can be located at the active site or at distal positions, where they induce a long-range effect that impacts the enzyme active site pocket and, thus, catalysis. This impact on enzymatic activity is often achieved by favouring the enzyme conformational states that are key for the novel functionality (catalytically productive conformations), while disfavouring non-productive conformational states, thus converting computational enzyme design into a population shift problem⁶⁴. In this direction, conformationally driven computational approaches that identify such long-range allosteric networks of interactions, such as the shortest path map (SPM) tool, have been used recently for predicting distal and active site mutations⁷⁶ (Fig. 4Bb). Multistate computational design based on ensembles of enzyme conformations taken from room-temperature X-ray crystallography has proved to be a successful strategy for efficient computational enzyme design⁷⁷. The reconstruction of ancestral enzymes that display a higher degree of flexibility than their modern counterparts and their use as initial scaffolds for enzyme design has additionally yielded interesting new insights⁷⁸. The higher flexibility observed in many ancestral variants was key for achieving high catalytic activity with only a few mutations located at the active site. Ancestrally reconstructed enzymes are usually less specialized than their modern counterparts, thus often presenting higher levels of substrate and catalytic promiscuity⁷⁹, which makes them excellent starting protein scaffolds for enzyme design (Fig. 4Bc). These examples indicate that both the selection of a conformationally rich scaffold and the consideration of multiple enzyme conformations are crucial for successful computational enzyme design.
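The population shift picture can be illustrated with Boltzmann statistics: if each conformational state is assigned a relative free energy, its equilibrium population follows directly, and a mutation that stabilizes the productive state shifts the populations accordingly. The sketch below uses a two-state model with hypothetical free energies chosen only for illustration; real ensembles contain many states, and their free energies would have to be estimated from simulation or experiment.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def populations(free_energies, temperature=298.15):
    """Boltzmann populations from relative free energies (kcal/mol) per state."""
    rt = R_KCAL * temperature
    reference = min(free_energies.values())  # shift for numerical stability
    weights = {
        state: math.exp(-(g - reference) / rt)
        for state, g in free_energies.items()
    }
    total = sum(weights.values())
    return {state: weight / total for state, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical values: the productive state lies 1.5 kcal/mol above the
    # non-productive state in the starting enzyme and 1.0 kcal/mol below it
    # in an evolved variant.
    starting = populations({"productive": 1.5, "non-productive": 0.0})
    evolved = populations({"productive": -1.0, "non-productive": 0.0})
    print(round(starting["productive"], 3), round(evolved["productive"], 3))
```

With these placeholder numbers, a net stabilization of 2.5 kcal/mol moves the productive state from a minor to the dominant population. This is the sense in which distal mutations can raise activity without changing the chemistry of the active site.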

Biocatalysts with non-canonical amino acids

The design and engineering of enzymes with an expanded amino acid alphabet is a nascent and rapidly developing area of biocatalysis. Enzymes are exceptionally powerful catalysts capable of promoting chemical transformations with efficiencies and selectivities that are difficult to achieve with small-molecule systems. However, enzymes are typically biosynthesized from the 20 canonical amino acids that contain a limited number of functional groups, restricting the range of catalytic mechanisms that can be installed into designed active sites. The emergence of powerful genetic code expansion methodologies has enabled the site-specific installation of hundreds of structurally and functionally diverse non-canonical amino acids into proteins^80,81. Careful selection of a suitable non-natural amino acid and its positioning within the target protein scaffold is required to address the application of interest (Fig. 5). For instance, a key active site residue is often replaced with a non-canonical amino acid that is a close structural analogue to modulate catalytic function for mechanistic investigations of natural enzymes^{82,83,84,85,86}. Alternatively, to design enzymes with new functions, the selection of amino acid takes inspiration from structural motifs present in small-molecule catalysts with positioning within the protein guided by computation^87,88.

**Fig. 5: Design and engineering of biocatalysts using non-canonical amino acids.**

The favoured method for encoding a non-canonical amino acid exploits an engineered aminoacyl-tRNA synthetase/tRNA pair that is orthogonal to the host’s translation machinery to direct the incorporation of the non-canonical amino acid by suppression of a nonsense codon, which is most commonly the UAG stop codon. The aminoacyl-tRNA synthetase of the orthogonal pair is typically engineered towards a desired non-canonical amino acid through iterative rounds of positive and negative selections, which link cell viability to aminoacyl-tRNA synthetase activity and selectivity^80,89. Introducing the non-canonical functionality directly through the cellular translation machinery offers significant advantages over alternative methods of chemically modifying protein structures. For instance, this approach facilitates the homogeneous production of precisely edited proteins, enables the introduction of non-canonical amino acids at diverse sites in any protein scaffold and, perhaps most significantly, allows for rapid optimization of enzyme properties using directed evolution workflows adapted to an expanded genetic code^87,90.
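The effect of nonsense suppression on the translated product can be sketched with a toy translator in which the amber codon (UAG in mRNA, TAG in the coding DNA strand) is either read as a stop or reassigned to a non-canonical amino acid. The single-letter symbol "X" for the non-canonical residue and the example gene are assumptions made for illustration. The model is idealized, because in cells suppression competes with normal termination and some truncated product is typically also formed.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, amber_as=None) -> str:
    """Translate a coding sequence; optionally reassign TAG to `amber_as`."""
    protein = []
    for start in range(0, len(dna) - 2, 3):
        codon = dna[start:start + 3].upper()
        if codon == "TAG" and amber_as is not None:
            protein.append(amber_as)  # suppressed: non-canonical residue inserted
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # unsuppressed stop codon terminates translation
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCTTAGAAATGGTAA"  # hypothetical gene with an in-frame TAG
    print(translate(gene))                # amber read as stop
    print(translate(gene, amber_as="X"))  # amber suppressed
```

Placing the TAG codon at the desired position is therefore what makes the incorporation site-specific. The other two stop codons still terminate translation, so the full-length product is obtained only when the amber codon is suppressed.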

The availability of an expanded set of amino acid building blocks offers exciting new opportunities for biocatalysis. Genetically encoded non-canonical amino acids have been used to improve both biocatalyst activity and stability⁹¹ as well as provide new tools to understand how enzymes function at the molecular level^82,83,84. Key recent examples include the replacement of serine and cysteine catalytic nucleophiles with 2,3-diaminopropionic acid as a means of trapping acyl-enzyme intermediates for structural characterization⁸⁵, and the use of non-canonical axial haem ligands to unravel the active site features that control the reactivities of high-energy metal-oxo intermediates⁸⁶. The availability of an increased repertoire of covalently embedded functional groups also provides exciting opportunities to design de novo enzymes with catalytic mechanisms inspired by small-molecule catalysis. This approach was recently showcased through the design of an artificial hydrolase OE1 (ref.⁸⁷) that employs Nδ-methyl histidine (Me-His) as a catalytic nucleophile, which operates with a similar mode of action to the widely employed small-molecule catalyst dimethylaminopyridine (DMAP)⁹². Histidine methylation is integral to catalytic function and leads to the generation of reactive acyl-imidazolium intermediates, which are readily hydrolysed to regenerate the catalytic nucleophile (Fig. 5). By contrast, the catalytic function of de novo hydrolases employing canonical histidine, serine or cysteine nucleophiles was compromised by the formation of unreactive acyl-enzyme intermediates^93,94,95,96. The modest initial hydrolysis activity of OE1 was subsequently enhanced via iterative rounds of directed evolution, giving rise to variant OE1.3 containing six active site mutations⁸⁷. OE1.3 accelerates ester hydrolysis by more than 9,000-fold and 2,800-fold compared with free Me-His and DMAP, respectively. Further rounds of evolution led to the enantioselective hydrolase OE1.4. This study showcases how the interplay of genetic code expansion, computational design and directed evolution can provide a truly versatile platform for building de novo biocatalysts with new and improved catalytic functions.

Cascade development

Combining multiple enzyme-catalysed steps in one pot is an active research area. Biocatalysis is particularly well suited to these cascade processes, as enzymes possess inherent chemoselectivity, regioselectivity and stereoselectivity and operate in a common aqueous medium. Akin to natural biosynthetic pathways, fully de novo non-natural biocatalytic cascades can be designed and developed for the synthesis of complex targets. Biocatalytic cascades have become more commonly used only because of the advances in biocatalyst design and build discussed elsewhere in this Primer.

Biocatalytic cascades^97,98 typically feature two or more steps (functional group interconversions or bond forming) with at least one enzymatic transformation and without intermediate isolations (Fig. 6a). Despite attempts to impart order on the nomenclature^{99,100,101,102}, the term ‘cascade’ is applied broadly within the biocatalysis community to describe not only concurrent, multienzyme processes in one pot but also reactions in which components are added sequentially or process steps are telescoped.

**Fig. 6: Biocatalytic cascades and their development.**

The development of novel biocatalytic cascades can be broadly described by a design–build–optimize cycle (Fig. 6b) until a final process is achieved^3,103. Initially, retrosynthetic analysis is performed using the principles of biocatalytic retrosynthesis^{104,105,106,107} and/or retrobiosynthesis¹⁰⁸ to make key bond disconnections and plan the forward route. This can be performed manually or complemented by more recent computer-aided synthesis planning tools that are becoming the focus of increased interest^8,109,110. Additionally, selecting a cascade design that will enable the planned synthesis is required, which can range from simple linear to orthogonal or cyclic processes⁹⁹. Any cofactor requirements or potential compound incompatibilities should be considered at this stage.

Once a process design is in place, enzymes need to be identified to fulfil each cascade step. Enzymes can be identified from the literature, from screening of enzyme libraries or from enzyme discovery efforts. When it comes to building the cascade, there is a choice of operating the enzymatic steps with purified or crude cell-free extracts (in vitro), with viable whole cells (in vivo) or with a combination of the two (hybrid)¹⁰³. A multitude of factors will help determine which system is best to use, such as enzyme availability, cofactor recycling requirements and reactor/facility infrastructure. Often, each step in the cascade is validated individually before any single-pot combinations are tested.

Finally, optimization of the process can help maximize throughput and product titre. Several rounds of protein engineering are typically required, especially for industrial application, to improve enzyme activity and stability to overcome any bottlenecks in the pathway and maximize pathway flux. General process engineering optimizations also complement cascade development; for example, enzyme immobilization strategies to simplify the workup and/or improve a biocatalyst’s lifetime, recoverability and reuse^111,112.

Further understanding of the full process can then influence subsequent design–build–optimize cycles in an iterative fashion to streamline the entire synthetic route to the desired compound.

Results

What makes a good industrial biocatalyst?

Before scientists embark on the challenge of discovering a good (or ideally excellent) industrial biocatalyst, they need to define which properties the biocatalyst must have for efficiently performing a commercially interesting target reaction under select industrial conditions. Here, we describe the beneficial characteristics that are usually found in industrial biocatalysts, metrics to assess their performance in industrial processes and a few exciting examples. Other illustrative examples can be found in excellent recent articles^{4,113,114,115}. New biocatalytic processes aim to generate new molecules of considerable commercial interest. They may also be designed to replace or complement existing non-optimal chemical or biocatalytic syntheses in industry. In either case, the viability and possible bottlenecks of a biocatalytic process can be assessed using both economic and green chemistry process performance metrics^{116,117,118,119} (Table 1). Both high substrate concentration and conversion are desired in industrial reactions to achieve high product concentrations and so reduce the product recovery cost. Reactions resulting in low product concentrations may require additional concentration steps or large volumes of extraction solvent, which will increase costs associated with a rise in energy consumption and/or waste production. Thus, an ideal biocatalyst’s activity should not be inhibited by high substrate concentrations (>50 g l^–1) or the amount of co-solvent required for substrate solubilization. It is worth mentioning that substrate loadings as high as >1 kg l^–1 for aldoxime dehydratase, which catalyses the synthesis of linear aliphatic nitriles, have been reported¹²⁰. Nevertheless, frequently observed detrimental effects at high substrate concentration or by organic solvents on biocatalysts can be alleviated using fed-batch strategies¹²¹. 
In examples where inhibition of the enzyme by the product, unfavourable thermodynamic reaction equilibria or product side reactions are problematic, in situ product removal can be applied¹²². To remove a product resulting from an ongoing enzyme-catalysed reaction, various techniques can be used, such as in situ product crystallization, adsorption, distillation and extraction^122,123. For example, in situ product crystallization can be achieved by forming a product salt via inclusion of an appropriate counter-ion in the reaction medium. Similarly, another option is to perform the bioconversion in the presence of a resin that selectively adsorbs the product from the solution.
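
As a minimal sketch of how a few process metrics of this kind might be computed for a single batch, consider the Python functions below. The function names and every input value are hypothetical, and the definitions are the commonly used ones (product titre in g l^–1, space-time yield in g l^–1 h^–1, and E-factor as mass of waste per mass of product) rather than values or formulas taken from Table 1.

```python
def product_titre(substrate_g_per_l, conversion, mw_substrate, mw_product):
    """Product concentration (g/l) reached at a given fractional conversion."""
    if not 0.0 <= conversion <= 1.0:
        raise ValueError("conversion must be a fraction between 0 and 1")
    # Correct for the change in molecular weight from substrate to product.
    return substrate_g_per_l * conversion * mw_product / mw_substrate


def space_time_yield(titre_g_per_l, reaction_time_h):
    """Volumetric productivity (g/l/h): product titre divided by batch time."""
    if reaction_time_h <= 0:
        raise ValueError("reaction time must be positive")
    return titre_g_per_l / reaction_time_h


def e_factor(total_mass_in_kg, product_kg):
    """Mass of waste per mass of product; everything that is not product is waste."""
    if product_kg <= 0:
        raise ValueError("product mass must be positive")
    return (total_mass_in_kg - product_kg) / product_kg


# Hypothetical batch: 100 g/l substrate, 95% conversion in 10 h,
# substrate and product of equal molecular weight.
titre = product_titre(100.0, 0.95, mw_substrate=200.0, mw_product=200.0)
print("titre (g/l):", titre)
print("space-time yield (g/l/h):", space_time_yield(titre, 10.0))
print("E-factor:", e_factor(total_mass_in_kg=10.0, product_kg=2.0))
```

Whether water is counted in the total input mass changes the E-factor considerably, so the convention used should be stated alongside any reported value.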

Table 1 Metrics used to evaluate efficacy of biocatalytic processes

High stability under industrial process conditions is an essential property of a good biocatalyst. Numerous robust enzymes of industrial interest have been discovered or redesigned over the past decade by enzyme engineering, computational methods, genome mining, ancestral sequence reconstruction or combinations thereof. A recent example using FRESCO generated an alcohol dehydrogenase mutant with a melting temperature — the temperature at which half of the protein is unfolded at equilibrium — of 94 °C (close to water’s boiling point) and this has previously been applied successfully to other enzyme classes¹²⁴. Sequence reconstruction of a robust ancestor has been achieved for an increasing number of biocatalysts including cytochrome P450 monooxygenases¹²⁵, carboxylic acid reductases¹²⁶, flavin-containing monooxygenases¹²⁷ and laccases¹²⁸, made available in a recently created database of resurrected proteins with 211 members (Revenant)¹²⁹. Enzyme immobilization, which facilitates repeated enzyme reuse, has also been used to enhance enzyme operational stability in industrial processes^130,131,132. As there is great interest in the utility of enzyme immobilization, especially in continuous flow systems¹³³, tolerance to immobilization without significant loss of activity or selectivity is an appealing property for a biocatalyst¹³⁴.
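
The definition of melting temperature used above can be made concrete with a simple two-state (folded to unfolded) model. The sketch below assumes a temperature-independent unfolding enthalpy, which is a simplification, and the enthalpy value is hypothetical; it is not the analysis used in the cited study.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def fraction_unfolded(temp_k, tm_k, delta_h_j_per_mol):
    """Equilibrium fraction of unfolded protein in a two-state model.

    Assumes a constant unfolding enthalpy, so that the unfolding free
    energy is delta_h * (1 - T / Tm) and vanishes at T = Tm.
    """
    exponent = (delta_h_j_per_mol / R) * (1.0 / temp_k - 1.0 / tm_k)
    return 1.0 / (1.0 + math.exp(exponent))


tm = 94.0 + 273.15   # melting temperature quoted in the text, in kelvin
delta_h = 4.0e5      # hypothetical unfolding enthalpy, J/mol

for celsius in (25.0, 84.0, 94.0, 104.0):
    f_u = fraction_unfolded(celsius + 273.15, tm, delta_h)
    print(f"{celsius:6.1f} degC -> fraction unfolded {f_u:.3g}")
```

By construction, the fraction unfolded is exactly one half at the melting temperature, which is the definition given in the text.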

Biocatalytic processes outcompete their chemical counterparts regarding sustainability, as illustrated when comparing the chemical and biocatalytic synthetic routes for pregabalin, atorvastatin intermediate, sitagliptin and ambrox¹³⁵. In contrast to chemical catalysts, biocatalysts are derived from renewable resources, are biodegradable, act in aqueous solvent under mild reaction conditions and generate low amounts of waste by-products. Furthermore, biocatalytic synthetic routes obviate the need for hazardous chemicals, high energy usage and additional reagents for functional group activation, protection or deprotection steps.

Biocatalytic processes requiring whole-cell fermentations (for either enzyme production or substrate conversion) generate waste biomass, which can be reused as a source of energy or animal feed. To reduce water usage and carbon feedstocks required for cell growth, biotransformations with isolated enzymes or cell lysates can be performed instead of whole-cell fermentations at increased concentrations. A reduction in biocatalyst loading, without reducing productivity as measured by yield and speed, can be accomplished by using engineered biocatalysts that offer improved properties such as higher turnover rates and/or stability for reuse. Energy consumption due to biocatalyst recovery from reaction solutions can be minimized by enzyme immobilization. Importantly, inexpensive renewable carriers for enzyme immobilization, such as rice husk, are being developed to replace organic fossil-based carriers¹³⁶. However, a significant expansion of enzyme-based technologies in the production of bulk chemicals (high volume, low priced) must be achieved to increase the impact of biocatalysis on sustainability¹³⁷. So far, biocatalysts are more frequently used to synthesize high-price low-volume products such as pharmaceuticals.

Various companies (for example, Merck, Pfizer, GlaxoSmithKline and AstraZeneca) have become active in the development of new biocatalytic processes and often collaborate with academic groups to accelerate progress in this research area. Examples of some of the enzymatic processes developed by industry with biocatalysts including KREDs, transaminases, hydroxylases and IREDs are described in recent review articles^138,139. When selecting a biocatalyst for process development, it is often desirable to select enzymes that will enable freedom to operate to avoid infringing intellectual property rights or to access desired patented biocatalysts during the early stages of process design. To this end, industries and universities often provide experts in the complex and rapidly evolving field of intellectual property to guide research scientists.

A good industrial biocatalyst should combine numerous beneficial properties to deliver higher-value molecules under demanding industrial conditions while achieving satisfactory economic and green metrics for various applications (Fig. 7). A few of the most desired characteristics of efficient industrial biocatalysts have been highlighted above, which include high activity, stability, ease of immobilization, environmental sustainability and accessibility. The importance of other relevant properties of a good biocatalyst, such as substrate selectivity, evolvability and affordability, will be illustrated through various examples in the following sections.

**Fig. 7: Beneficial properties of an excellent biocatalyst under industrial process conditions.**

Applications

An ideal catalyst converts renewable, cheap and readily available raw materials such as plant-derived feedstocks, generates few to no undesired by-products, is safe and exhibits a reduced environmental footprint (low energy consumption and waste). These characteristics are not often observed for industrial chemical catalysts. Also, biocatalysts usually act under mild reaction conditions and can be engineered towards the desired substrate scope. Thus, biocatalysis paves the way for a bio-based economy, less reliant on fossil fuels¹¹⁷. Here, we highlight the utility of biocatalysis in various applications, first according to different reaction metrics or enzyme properties that are of importance in biocatalysis followed by an overview of enzyme cascade development.

Activity and productivity

Biocatalysts should exhibit high activity under the desired industrial conditions to achieve a high reaction productivity. Heterogeneous chemical catalysts are challenging rivals for biocatalysts in terms of productivity: they often reach production rates of 1–10 kg l^–1 h^–1, compared with 0.001–0.3 kg l^–1 h^–1 for biocatalysts¹⁴⁰. High productivities of 50–100 g l^–1 h^–1 have been achieved using free-resting Rhodococcus cells containing nitrile hydratase for the synthesis of acrylamide from acrylonitrile, considered to be one of the most successful industrial biocatalytic processes^140,141. Acrylamide is used to produce polyacrylamide, which is used in water treatment, oil exploitation and the textile industry sector, as well as many others. The potential of nitrile hydratase as an industrial biocatalyst for the hydration of nitriles to form higher-value amides was demonstrated in the 1980s¹. The vast market for acrylamide and the lack of an efficient chemical process for its production have propelled the improvement of the biocatalytic process over the past few years. The selection and optimization of a robust microbial host for nitrile hydratase was instrumental in preventing the enzyme inactivation caused by the high acrylamide concentrations required in the industrial process (300–500 g l^–1) and the exothermic nature of the reaction¹⁴¹. A selective robust transaminase was obtained, by combining rational mutagenesis, directed evolution and a substrate walking approach, for the large-scale manufacture of the antidiabetic drug sitagliptin under demanding industrial conditions (200 g l^–1 substrate loading, 50% DMSO and 45 °C)¹⁴². This is an impressive example of an excellent industrial biocatalytic approach that outcompeted the previously used rhodium-catalysed sitagliptin synthesis in terms of selectivity, productivity, sustainability and cost.

A relatively high productivity (13 g l^–1 h^–1) was recently achieved for IREDs by testing a commercially available IRED collection and various reaction conditions at the pilot plant scale, which was facilitated by a design of experiments strategy¹²¹. IREDs are of great interest for the industrial synthesis of cyclic and acyclic amines via the reduction of C=N bonds. This study identified reaction bottlenecks (for example, enzyme stability) and exposed possible strategies to overcome them (for example, using a fed-batch process) for a model reaction. Importantly, the first industrial synthesis catalysed by an IRED (on a 20-l scale) was recently reported⁴⁷, highlighting an excellent industrial biocatalyst after three rounds of directed evolution, which outcompeted the corresponding chemical process with respect to green metrics such as lower catalyst requirement. This engineered IRED is used for the industrial synthesis of the LSD1 inhibitor GSK2879552. In contrast to the IRED used as starting point in this study, the engineered IRED is an excellent biocatalyst due to its increased stability under the required reaction conditions (moderately acidic pH and 20 g l^–1 substrate concentration) showing a 38,000-fold improvement in turnover. In this case, the selectivity, another requirement for a good biocatalyst, needed no further improvement. The preparation of the fragrance ingredient (−)-ambrox using an engineered squalene hopene cyclase is another example of a successful industrial biocatalytic process, which achieved relatively high productivity (12 g l^–1 h^–1) for catalysing the cyclization of (E,E)-homofarnesol to yield (−)-ambrox¹⁴³. The enzyme variant used in this study, which exhibited a 10-fold increase in productivity over the wild type, was discovered by random mutagenesis. This cyclase whole-cell biotransformation in E. coli was carried out under conditions that were optimized using a design of experiments strategy, in which the optimized parameters included the cell, sodium dodecyl sulfate (SDS) and (E,E)-homofarnesol concentrations, temperature and pH. SDS was required in this process to ensure substrate solubilization and access to the enzyme through the cell membrane.

Selectivity and substrate scope

Enzymes with excellent regioselectivity, chemoselectivity and/or stereoselectivity and the desired substrate scope for industrial applications can be obtained by either mining the enormous diversity evolved by nature or performing protein engineering campaigns in the laboratory. Studies that have uncovered the extraordinary diversity of enzymes involved in natural product biosynthetic pathways have provided promising industrial biocatalysts with complementary selectivity as well as substrate scope. For example, the recent comparison of three similar FAD-dependent monooxygenases, which catalyse the oxidative dearomatization of phenol and resorcinol in different biosynthetic pathways, has revealed their complementary site selectivities and stereoselectivities by testing a diverse panel of unnatural substrates¹⁴⁴. This approach enabled the identification of an optimal biocatalyst for specific asymmetric transformations of phenols into ortho-quinols, a chemical reaction of great value in the synthesis of various bioactive natural products¹⁴⁴. In another example that highlights the importance of enzyme discovery and characterization, the substrate scope of 87 putative flavin-dependent halogenases was determined using a high-throughput mass spectrometry-based screen²². Various halogenases discovered in this study exhibited complementary regioselectivity on relatively complex substrates. Thus, this enzyme library is attractive for late-stage C−H functionalization of drug leads, leading to diverse drug candidates from common intermediates. Furthermore, this study enabled the discovery of new halogenases for biotechnology applications, which exhibited beneficial properties such as regioselectivity, substrate scope and stability that were engineered in other previously discovered halogenases²².

An increasing number of studies demonstrate that required selectivities can be readily engineered into different enzyme classes¹⁴⁵. A recent example is the synthesis of a Janus kinase (JAK) inhibitor, which involved engineering IRED variants with markedly improved selectivity and activity compared with the wild type¹⁴⁶. Synthesis of enantiomerically pure compounds is a key driver for the implementation of enzymes in the pharmaceutical industry³. Enantioselective enzymes are also used industrially for the production of target molecules required in food supplements, flavourings, fragrances and agrochemicals¹⁴⁷. To this end, a wild-type cytochrome P450 monooxygenase catalyses the enantioselective and regioselective C5 hydroxylation of decanoic acid to form (S)-5-hydroxydecanoic acid, which is subsequently converted by chemical lactonization into the high-value fragrance compound (S)-δ-decalactone¹⁴⁸. In the food industry, small-scale reactions using an engineered ethylenediamine-N,N′-disuccinic acid lyase have demonstrated its utility for the enantioselective synthesis of chiral synthons for artificial dipeptide sweeteners¹⁴⁹. The lyase used as a starting scaffold exhibited excellent enantioselectivity for the target substrate but had low activity, which was increased 1,140-fold by rational protein engineering.

Enzyme cascades

From an industrial perspective, biocatalytic cascade processes are especially attractive as they eliminate the need for intermediate isolation steps, reducing waste, saving time and costs as well as streamlining the overall synthesis¹⁵⁰. Some intermediates can be unstable to isolation or have inhibitory effects on the enzymes present in the system, and therefore the use of a cascade process can be beneficial to overcome these challenges and avoid the build up of problematic intermediates.

Several recent reviews have been published on enzymatic cascades that reveal the potential and scope of these processes^{99,101,102,151,152}. Some examples⁹⁷ of industrially applied systems are highlighted here (Fig. 8). Evonik Degussa GmbH described a whole-cell cascade to produce diamines — which are valuable building blocks in the polymer industry — from renewably sourced dicarboxylic acids (Fig. 8a). The patented process¹⁵³ details the co-expression of a carboxylic acid reductase and a transaminase to enable the desired cascade. An alanine dehydrogenase was also incorporated to provide a source of l-alanine, required for the transaminase step, from ammonia as an input nitrogen source. Additional process considerations for the in vivo implementation of the cascade included co-expression of fatty acid transporters to improve substrate uptake or the incorporation of an initial esterase step enabling the use of esters as starting materials.

**Fig. 8: Industrial examples of biocatalytic cascades.**

A hydrogen-borrowing, redox-neutral cascade was developed by GSK for the production of GSK2879552 (ref.⁴⁷) (Fig. 8b). A KRED–IRED system was evaluated to take the desired alcohol to the chiral amine via an aldehyde, with internal cofactor recycling between the two enzymes. The main synthetic focus of the work was the engineering of the IRED step, involving reductive amination and concurrent resolution of the racemic amine substrate. The cascade synthesis enabled generation of the desired product in 48% yield with high enantiopurity (99.5% enantiomeric excess). Although the IRED step can operate as a stand-alone process and achieve higher yields, the proof of concept for the cascade was established. Process development and a more active KRED were highlighted as areas of potential focus to further improve the cascade and realize its potential for manufacturing.
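
To put the enantiopurity quoted above in terms of product composition, the following Python sketch converts between enantiomeric excess and the percentages of the two enantiomers. The function names are hypothetical; the formula is the standard definition of enantiomeric excess as the difference between the two enantiomer amounts divided by their sum.

```python
def enantiomeric_excess(major, minor):
    """Enantiomeric excess (%) from the amounts of the two enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * abs(major - minor) / total


def composition_from_ee(ee_percent):
    """Percentages of the major and minor enantiomer for a given ee (%)."""
    if not 0.0 <= ee_percent <= 100.0:
        raise ValueError("ee must be between 0 and 100")
    major = (100.0 + ee_percent) / 2.0
    return major, 100.0 - major


# Composition corresponding to the 99.5% ee reported for the cascade product.
major, minor = composition_from_ee(99.5)
print(f"major enantiomer: {major}%, minor enantiomer: {minor}%")
```

An enantiomeric excess of 99.5% therefore corresponds to a product in which the undesired enantiomer makes up a quarter of a percent of the material.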

Recently, Merck & Co.⁷ developed a total enzymatic synthesis of the HIV drug islatravir built on five key enzymatic steps (Fig. 8c). The selected enzymes were subjected to multiple rounds of protein engineering to achieve either the desired activity, stability or selectivity for operation of the cascade. A single aqueous reaction stream was employed throughout the entire process, in which the galactose oxidase (GOase) and pantothenate kinase (PanK) steps operated sequentially to avoid cross-reactivity between substrates. The final deoxyribose phosphate aldolase (DERA), phosphopentamutase (PPM) and purine nucleoside phosphorylase (PNP) steps were then run concurrently, and the equilibrium of these steps was pulled through to product formation by an orthogonal sucrose phosphorylase (SP) step that removed phosphate from the reaction mixture. The cascade synthesis of islatravir (and, more recently, of molnupiravir)¹⁵⁴ replaced alternative chemical routes to this drug that required more than double the step count with protecting group manipulations, thereby vastly improving the efficiency of synthesis.

Reproducibility and data deposition

Databases for biocatalysis

Over the past decade, the cost of DNA sequencing and synthesis has fallen rapidly, a trend commonly referred to as the Carlson curve¹⁵⁵. This associated abundance of protein sequence data provides a rich seam for mining for new biocatalysts. The National Center for Biotechnology Information (NCBI) maintains databases of both DNA and protein sequences, regularly updated with new sequencing data, and with the option to search for sequences of interest using tools such as BLAST (Basic Local Alignment Search Tool)¹⁵⁶. Other databases, such as UniProt, InterPro or Pfam, offer further analysis of protein sequences, structures or families.

As the amount of data collected for an increasing number of enzymes and enzymatic transformations rises, it becomes prohibitive for interested researchers to efficiently scour the literature in search of ideal/appropriate candidates to analyse. Catalyst and enzyme selection, for use in organic chemistry syntheses or synthetic biology pathways, respectively, already benefit from numerous well-developed databases. Reaxys¹⁵⁷ and SciFinder¹⁵⁸ contain a plethora of searchable information related to reaction conditions, choice of catalyst, substrate scope, percentage conversions and analytical information, among others, for use when designing a synthetic chemistry route towards a target molecule, whereas BRENDA¹⁵⁹ and KEGG¹⁶⁰ hold data on the natural substrate specificity, and sequence information, of biosynthetic enzymes to be used in a synthetic biology pipeline. A comparable repository, comprising information collected for synthetic enzyme reactions in biocatalysis, would be of great use for the biocatalysis community.

Despite the fact that several databases for the biocatalysis community have been developed, none of them contains information related to the whole biocatalytic toolbox, and the majority do not provide such critical information as the substrate scope of specific enzymes, successful reaction conditions or reaction yields (Table 2). In general, the majority of the resources listed in Table 2 rely on data extracted from pre-existing databases such as BRENDA and PDB (Protein Data Bank) and, as such, are restricted to solely utilizing the sequence and/or structural information contained within them. Additionally, the curation and maintenance of substantial databases is often laborious and challenging, and so most biocatalyst databases focus on a specific reaction type or enzyme type of interest, rather than compiling data on the field as a whole. One of the few examples of a database recording information related to substrates, products and reaction outcomes in a biocatalysis context has been developed for the prenyltransferase enzyme class (PrenDB)¹⁶¹. PrenDB aims to collect data in the literature concerned with prenyltransferase enzymes and use them in various algorithms to achieve wider application of this family of synthetically useful enzymes. The compilation of a biocatalysis database, similar in scope to PrenDB but covering a broad spectrum of the different enzyme classes available in the biocatalytic toolbox, would unquestionably enhance the development of new enzymatic (cascade) reactions.

Table 2 Databases containing information specifically on enzymes for biocatalysis

An ideal database dedicated to biocatalytic transformations would capture both successful and unsuccessful transformations on an enzyme by enzyme basis and would broadly collect both enzyme activity data and enzyme sequence data. For example, data regarding the substrate scope, reaction temperature and length, buffer choice and pH, cofactor use, co-solvent use, substrate concentrations, reaction outcomes including percentage conversions and selectivities would all need to be collected to maximize the applicability of such a database. Enzyme homologue information, such as the amino acid sequence, structural information, mutant information and accession codes, would also need to be obtained. Additionally, integration with existing databases, such as those outlined above for chemistry and synthetic biology applications, would allow for extremely powerful synthesis planning towards target molecules. A fully functioning and searchable biocatalyst database could be used to augment tools designed to automate synthesis planning and would, ultimately, benefit researchers from both the chemistry and biocatalysis fields. In related fields such as natural product discovery, crowdsourcing has been successfully utilized for the construction of similar databases¹⁶². Indeed, a platform for curation of biocatalysis data has recently been made available to the community with this in mind, as part of the computer-aided synthesis planning tool RetroBioCat⁸.
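
One way to picture a single entry in such a database is as a structured record that holds the reaction conditions, the outcome and the enzyme information together. The Python sketch below is purely illustrative: the class name, field names and example values are assumptions and do not reflect the schema of RetroBioCat or of any existing database.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class BiotransformationRecord:
    """One reported biotransformation, whether successful or not."""
    enzyme_name: str
    enzyme_sequence: str            # amino acid sequence, one-letter code
    substrate_smiles: str
    temperature_c: float
    reaction_time_h: float
    buffer: str
    ph: float
    substrate_conc_mm: float
    active: bool                    # False records an unsuccessful reaction
    product_smiles: Optional[str] = None
    accession: Optional[str] = None
    mutations: List[str] = field(default_factory=list)
    cofactor: Optional[str] = None
    cosolvent: Optional[str] = None
    conversion_percent: Optional[float] = None
    ee_percent: Optional[float] = None


# Hypothetical negative result; every value here is invented for illustration.
record = BiotransformationRecord(
    enzyme_name="example reductase variant",
    enzyme_sequence="MKTAYIAK",
    substrate_smiles="CC(=O)c1ccccc1",
    temperature_c=30.0,
    reaction_time_h=24.0,
    buffer="potassium phosphate",
    ph=7.5,
    substrate_conc_mm=10.0,
    active=False,
    cofactor="NADPH",
    conversion_percent=0.0,
)
print(asdict(record))
```

Keeping the `active` flag separate from the conversion value lets unsuccessful reactions be stored explicitly instead of being omitted.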

Reproducibility issues in the field

A successful biocatalyst database requires a system that captures all useful information on biocatalyst performance reported in the literature. However, the diverse scientific communities that work with and characterize biocatalysts have varying standards when it comes to recording reaction parameters and outcomes, with some favouring kinetic data and others preferring to record percentage conversions and overall yields, for example. These different approaches have resulted in a wealth of information for many different enzyme classes and homologues that may not be directly comparable with one another, and so it becomes necessary to standardize the data collected in order to obtain a better overall picture of where select developments stand. One such way of standardizing data reported in the literature would be to categorize reactions qualitatively for enzyme activities with respect to a given substrate (for example, high, medium, low, none). Different data sources, such as percentage conversions and specific activities, could be categorized in this way and then compared against each other.
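
The qualitative categorization described above could be sketched as follows. The threshold values are arbitrary assumptions chosen only for illustration, and in practice they would need to be agreed by the community for each data type; the function and dictionary names are likewise hypothetical.

```python
# Hypothetical cut-offs: (start of "medium", start of "high") per data type.
THRESHOLDS = {
    "conversion_percent": (20.0, 60.0),
    "specific_activity_u_per_mg": (0.1, 1.0),
}


def categorise(value, kind):
    """Map a quantitative measurement onto none / low / medium / high."""
    medium_from, high_from = THRESHOLDS[kind]
    if value <= 0:
        return "none"
    if value < medium_from:
        return "low"
    if value < high_from:
        return "medium"
    return "high"


# Two reports on the same substrate, recorded in different ways,
# become comparable once both are reduced to a category.
print(categorise(85.0, "conversion_percent"))
print(categorise(0.5, "specific_activity_u_per_mg"))
```

The price of this comparability is a loss of resolution, so the original quantitative values would ideally be stored alongside the category.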

Alternatively, biocatalytic experiments could be standardized in the laboratory prior to data deposition. For numerous years, the STRENDA (Standards for Reporting Enzyme Data) commission has sought to provide guidelines on the experimental detail required when reporting enzyme activities and kinetics¹⁶³. Recently, these guidelines have been incorporated into an online storage and validation tool, where enzyme data can be deposited and checked for compliance with the STRENDA guidelines¹⁶⁴. This serves as a useful blueprint for reporting biotransformations in biocatalysis, but likely must be extended to include the additional datatypes often reported in biocatalysis papers, for example percentage conversion.

Recently, numerous start-up companies have emerged across biology and chemistry to develop smart-laboratory infrastructures, aiming to make research more reproducible by capturing data on all of the possible variables in an experiment¹⁶⁵. Others offer platforms to structure the collected data in a process, allowing machine learning to pull insight out of the vast data sets that smart-laboratories might produce¹⁶⁶. Experimentally, this can allow trends to be observed that might otherwise be missed — for example, a new batch of a reagent causing a drop in yields, or a shift in pH causing improved enantioselectivity. The digitization of experimental procedures and data collection should greatly improve the reproducibility of experiments across biology and chemistry. In particular, this may allow methods sections in journals to offer links to a more atomized record of the experimental procedures carried out. However, uptake by academia may be slow in comparison with industry laboratories, where electronic laboratory notebooks are more commonly employed.

Limitations and optimizations

Cost and accessibility

Biocatalyst cost usually has an influence on the viability of an industrial process, but especially in the synthesis of low-priced products. Currently, a wide variety of affordable enzyme collections (kits) are accessible from various vendors (for example, Prozomix, Almac, Codexis and Gecco). Enzyme discovery and production in-house is the alternative approach. Advantages and limitations of these options, ‘the buy or build operating models’, have been previously discussed¹⁶⁷. The choice of biocatalyst format (for example, purified, whole-cell or crude preparations) varies, depending on the particular application and enzyme class. Obviously, well-expressed enzymes are highly desired to reduce costs and effort. Access to an increasingly diverse platform of molecular biology tools allows the tailoring of enzyme performance to meet demanding industrial requirements and to efficiently convert non-natural substrates. Generation of improved biocatalysts by enzyme engineering is possible simply because enzymes are able to tolerate in vitro mutation. Thus, evolvability is another highly desirable property of a good biocatalyst. Engineering one enzyme may take only a few months, but building complex cellular metabolic networks may take years and demand considerable economic investment¹⁶⁸. These timescales are not fast enough to meet ‘the need for speed’ in industry¹⁶⁹. A recent example of a three-step route including two enzymatic steps, which was developed in just 6 months, is the synthesis of the COVID-19 direct-acting antiviral molnupiravir¹⁵⁴. Development of highly efficient biocatalysts by either rational or evolution techniques will be accelerated in the near future by expanding both the use of machine learning¹⁷⁰ and ultra-high-throughput screening¹⁷¹ technologies for protein engineering.

Machine-learning algorithms use the sequence-function data resulting from experimental work to predict which new enzyme mutants may exhibit the desired property. Thus, DNA sequences of both improved and unimproved variants are valuable in generating initial data sets. Importantly, machine-learning methods reduce the number of mutants that have to be produced and tested in the laboratory to discover a significant fraction of improved enzymes, and are particularly attractive in cases that require expensive or labour-intensive screening methods¹⁷⁰. The additional costs of implementing machine-learning algorithms in a traditional protein engineering laboratory include computation and DNA sequencing; these costs are decreasing and are therefore affordable for numerous research groups in both academia and industry. To explore a vast protein sequence space (library sizes >10⁶ variants), ultra-high-throughput screening technologies have been developed. Many academic and industrial researchers have access to flow cytometers to perform fluorescence-based screens of up to 10⁸ enzyme variants per day¹⁷². Complementary or improved technologies are rapidly emerging in this field. Miniaturization of the reaction volume is generally pursued because it increases the speed of screening and reduces associated costs and waste. Label-free detection methods are also highly desired, as they allow screening without a reporter molecule. A recent example meeting both objectives allowed the analysis of around 15,000 samples in 6 h using droplet microfluidics (nanolitre scale) coupled to electrospray ionization mass spectrometry for detection⁶⁰. The development of, and wide access to, novel technologies for biocatalysis have been propelled by recent investments from, for example, the European Commission and the UK Biotechnology and Biological Sciences Research Council (BBSRC) to facilitate collaborations between industry and academia. The recent establishment of a Global Biofoundry Alliance represents another example¹⁷³. Biofoundries are facilities that automate the design–build–test iteration cycle for engineering biology, which allows the fast delivery of genetically reprogrammed organisms for biotechnology⁷⁷. Access to biocatalysts is also facilitated by other strategies such as Science Exchange (an online marketplace of research services) and collaborations established between the Centre of Excellence for Biocatalysis, Biotransformations and Biocatalytic Manufacture (CoEBio3) and various companies.
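The machine-learning-guided selection of variants described in this section can be sketched in a few lines. The example below is a minimal illustration, not the method of any cited study: it assumes a one-hot encoding of the residues at three targeted positions, an additive ridge-regression model fitted in closed form, and an invented training set of six variants with activities in arbitrary units. Both improved and unimproved variants are included in the training data, and the model is then used to rank untested candidates so that only the most promising need to be made and screened.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(seq):
    """Encode the residues at the targeted positions as a flat one-hot vector."""
    vec = np.zeros(len(seq) * len(AMINO_ACIDS))
    for pos, aa in enumerate(seq):
        vec[pos * len(AMINO_ACIDS) + AA_INDEX[aa]] = 1.0
    return vec


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


# Hypothetical sequence-function data (residues at three positions -> activity).
train = {
    "AVL": 1.0,  # parent enzyme
    "GVL": 0.4,  # unimproved variant, still informative
    "AIL": 1.8,
    "AVF": 1.5,
    "GIL": 0.9,
    "AVW": 0.3,
}
X = np.array([one_hot(seq) for seq in train])
y = np.array(list(train.values()))
y_mean = y.mean()
weights = fit_ridge(X, y - y_mean)  # fit on mean-centred activities


def predict(seq):
    """Predicted activity of an untested variant under the additive model."""
    return float(one_hot(seq) @ weights + y_mean)


# Rank untested combinations; only the top-ranked ones would be built and assayed.
candidates = ["AIF", "GIF", "AIW", "GVF"]
ranked = sorted(candidates, key=predict, reverse=True)
print(ranked)
```

An additive linear model is the simplest possible choice and ignores epistasis between positions; in practice, richer encodings and non-linear models are commonly used, but the workflow of training on measured variants and ranking unmeasured ones is the same.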

Expanding the range of biocatalysis

Biocatalytic transformations, particularly those routinely applied in industry, often effect functional group interconversions with high conversion and selectivity. However, one of the biggest gaps is the lack of broad enzyme platforms for C–C bond formation. Despite the plethora of enzymes used in nature for C–C bond formation in primary and secondary metabolism, they are often challenging to repurpose for non-natural substrates. Only a handful of C–C bond-forming enzymes have been utilized for industrial applications, mainly limited to aldol reactions, acyloin condensations or cyanohydrin formation catalysed by lyases^174,175. A recent review highlights progress made in this space to diversify the toolbox of enzymes and the C–C bond-forming reactions they catalyse¹⁷⁶. Another industrial gap is the lack of scalable and robust oxidative enzymes. Despite their potential to catalyse remote and unactivated C–H oxidations, which are chemically challenging, enzymes such as cytochrome P450s and other oxygenases are problematic to scale up owing to low activity, instability and promiscuity, which results in a mixture of products¹⁷⁷. However, these features are well suited to small-scale, late-stage diversification of biologically active compounds, in which enzyme promiscuity is advantageous for generating new libraries of compounds for evaluation^178,179. The synthetic utility of the transformations afforded by these enzymes encourages continued efforts to find solutions and realize the potential of these biocatalysts for large-scale manufacture¹⁸⁰.

Speeding up synthesis

In the pharmaceutical industry, accelerating the drug development process is crucial to deliver new medicines to patients as quickly as possible and to maximize patent lifetimes for approved drugs. As such, time pressures for synthetic development are increasingly tight, which is driving advances in the speed of protein engineering rounds and the establishment of biocatalysis earlier in synthetic route planning^3,169. These advances include improvements in DNA library synthesis, smart library design and high-throughput screening. On the horizon are technologies such as cell-free expression, which removes the need to grow and harvest cells that contain enzyme mutants and thereby further reduces cycle times^181,182. Although the acceleration of development timelines is often associated with the pharmaceutical industry, these improvements are also beneficial to the wider chemical industry, making development more efficient and cost-effective¹⁴⁰.

Outlook

Advances in protein engineering, genomic database mining and computational methods have enabled a step change in biocatalysis over the past 20 years, and have led to its increasing application in the chemical and pharmaceutical industries, as highlighted in this Primer. Adoption of biocatalysis is also driven by reduced cost, the need to develop environmentally friendly processes and the use of renewable resources¹⁸³.

The number of chemical reactions recognized as amenable to biocatalysis has increased dramatically, as new enzyme classes become accessible and non-natural biocatalytic reactions are developed¹⁸⁴. However, the range of reactions is still small compared with those used in organic synthesis, and some obvious gaps in the repertoire of biocatalytic reactions are currently being identified, including halogenation, amide-bond formation, C–C bond formation and cleavage, ether formation, carbonylation, C=C bond functionalization, isomerization and reduction of isolated C=C bonds¹⁴⁰. Some of these issues are being addressed by combining chemical and enzymatic reactions¹⁸⁵. Biocatalysis also offers opportunities to develop reactions that are chemically difficult, such as remote and selective C–H activation, which is often observed in nature^148,178, but scale and substrate scope remain limitations for biocatalytic C–H activation¹⁸⁶.

The use of enzyme cascades is a particularly attractive aspect of biocatalysis, because of general reaction compatibility and the ability to telescope several reactions either in cell-free systems or whole cells¹⁰³, akin to biosynthetic pathways. The design of such cascades is already starting to become automated using dedicated computational tools and databases that provide rich resources to the scientific community⁸. The availability of biocatalysts from commercial sources or from synthetic genes continues to lower the barrier to entry for biosynthesis, and the bottleneck for reaction screening is now often at the assay stage, where more label-free high-throughput analytical methods are needed⁵⁰. Current successes for compounds such as islatravir⁷ and molnupiravir¹⁵⁴ have demonstrated the application of biocatalysis to multistep syntheses of small molecules. The next challenges will be to extend the application scope to targets of increasing molecular complexity and size, as well as to decrease the time required to develop efficient biocatalytic industrial processes. Examples of the production of bulk chemicals and polymers by biocatalysis are still rare and offer a rich opportunity in terms of green chemistries. Biocatalysis also has a role to play in generating new modalities more efficiently and selectively for the biopharmaceutical industries, such as producing biomacromolecules and antibody–drug conjugates.
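To give a flavour of how cascade design can be automated, the sketch below runs a breadth-first search backwards from a target compound class through a small rule table until it reaches an available starting material. It is a toy illustration only: the rule table, the compound classes and the set of available materials are invented for this example, and the search is not the algorithm used by any published tool. Real planners operate on molecular structures and curated reaction rules rather than on class names.

```python
from collections import deque

# Hypothetical, highly simplified rule table:
# product class -> list of (enzyme class, precursor class).
RULES = {
    "amine": [("transaminase", "ketone"), ("imine reductase", "imine")],
    "ketone": [("alcohol dehydrogenase", "alcohol")],
    "imine": [],
}
AVAILABLE = {"alcohol"}  # assumed purchasable starting materials


def find_cascade(target, rules, available, max_steps=5):
    """Search backwards from the target; return the shortest forward cascade or None.

    Each step is a tuple (substrate, enzyme, product).
    """
    queue = deque([(target, [])])
    seen = {target}
    while queue:
        compound, steps = queue.popleft()
        if compound in available:
            # Steps were collected from target to start, so reverse them.
            return list(reversed(steps))
        if len(steps) >= max_steps:
            continue
        for enzyme, precursor in rules.get(compound, []):
            if precursor not in seen:
                seen.add(precursor)
                queue.append((precursor, steps + [(precursor, enzyme, compound)]))
    return None


cascade = find_cascade("amine", RULES, AVAILABLE)
for substrate, enzyme, product in cascade:
    print(f"{substrate} --[{enzyme}]--> {product}")
```

Breadth-first search returns the route with the fewest enzymatic steps; a practical tool would additionally score routes by, for example, enzyme availability or reported substrate scope.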

Looking to the future, numerous key trends and scientific breakthroughs promise to have a significant impact on accelerating the discovery, development and application of biocatalysts. First, the range of new-to-nature biocatalytic reactions is rapidly expanding through de novo design¹⁸⁷ and/or directed evolution¹⁸⁸. Increasingly powerful computational tools will allow for better de novo design and will also provide better selection tools for identifying suitable biocatalysts from the rich protein primary sequence information already accessible in databanks. Advances in computational methods to predict protein structure from sequence through artificial intelligence¹⁸⁹, and the subsequent prediction of function and physicochemical properties, will provide access to biocatalysts that are finely tuned to the requirements of a desired target reaction and/or product^190,191,192. To maximize synthetic utility, these tools will need to be integrated with the design of new biocatalytic cascade processes. Many individual steps of biocatalyst development can already be automated at the implementation stage, including desktop DNA printing, cell-free protein expression, enzyme immobilization and analysis, which hints at the potential for ‘fully automated biocatalytic synthesizers’ being available to individual laboratories³ within the next decade.

In conclusion, biocatalysis has made essential contributions to the safe, cheap and sustainable production of high-value chemicals and pharmaceuticals, but still presents many exciting challenges and considerable scope for further advancement.

References

Yamada, H. & Kobayashi, M. Nitrile hydratase and its application to industrial production of acrylamide. Biosci. Biotechnol. Biochem. 60, 1391–1400 (1996).
Kirk, O., Borchert, T. V. & Fuglsang, C. C. Industrial enzyme applications. Curr. Opin. Biotechnol. 13, 345–351 (2002).
Devine, P. N. et al. Extending the application of biocatalysis to meet the challenges of drug development. Nat. Rev. Chem. 2, 409–421 (2018).
Wu, S., Snajdrova, R., Moore, J. C., Baldenius, K. & Bornscheuer, U. T. Biocatalysis: enzymatic synthesis for industrial applications. Angew. Chem. Int. Ed. 60, 88–119 (2021).
Sheldon, R. A., Brady, D. & Bode, M. L. The hitchhiker’s guide to biocatalysis: recent advances in the use of enzymes in organic synthesis. Chem. Sci. 11, 2587–2605 (2020). This article presents an excellent recent overall review of biocatalysis.
Birmingham, W. R. et al. Bioretrosynthetic construction of a didanosine biosynthetic pathway. Nat. Chem. Biol. 10, 392–399 (2014). This article develops the concept of ‘bio-retrosynthesis’ and its application to afford an important biomolecule.
Huffman, M. A. et al. Design of an in vitro biocatalytic cascade for the manufacture of islatravir. Science 366, 1255–1259 (2019). This article presents the very impressive development of a multistep enzyme cascade with multiple enzyme engineering challenges by an industrial team.
Finnigan, W. et al. RetroBioCat as a computer-aided synthesis planning tool for biocatalytic reactions and cascades. Nat. Catal. 4, 98–104 (2021). This article establishes a database and computer-aided synthesis planning tool that allows scientists to design enzyme cascades.
Charnock, S., Bernardini, A. M., Monza, E., Lucas, M. F. & Sutton, P. W. in Applied Biocatalysis (eds Whittall, J. & Sutton, P. W.) 27–133 (Wiley, 2020).
Bateman, A. et al. UniProt: the Universal Protein Knowledgebase. Nucleic Acids Res. 45, D158–D169 (2017).
Buchholz, P. C. F. et al. BioCatNet: a database system for the integration of enzyme sequences and biocatalytic experiments. ChemBioChem 17, 2093–2098 (2016).
Scott, T. A. & Piel, J. The hidden enzymology of bacterial natural product biosynthesis. Nat. Rev. Chem. 3, 404–425 (2019).
Martinez, S. & Hausinger, R. P. Catalytic mechanisms of Fe(II)- and 2-oxoglutarate-dependent oxygenases. J. Biol. Chem. 290, 20702–20711 (2015).
Zwick, C. R. & Renata, H. Harnessing the biocatalytic potential of iron- and α-ketoglutarate-dependent dioxygenases in natural product total synthesis. Nat. Prod. Rep. 37, 1065–1079 (2020).
Lukat, P. et al. Biosynthesis of methyl-proline containing griselimycins, natural products with anti-tuberculosis activity. Chem. Sci. 8, 7521–7527 (2017).
Zwick, C. R. & Renata, H. Remote C–H hydroxylation by an α-ketoglutarate-dependent dioxygenase enables efficient chemoenzymatic synthesis of manzacidin C and proline analogs. J. Am. Chem. Soc. 140, 1165–1169 (2018).
Neugebauer, M. E. et al. A family of radical halogenases for the engineering of amino-acid-based products. Nat. Chem. Biol. 15, 1009–1016 (2019).
Chekan, J. R. et al. Scalable biosynthesis of the seaweed neurochemical, kainic acid. Angew. Chem. Int. Ed. 58, 8454–8457 (2019).
Chakrabarty, S., Wang, Y., Perkins, J. C. & Narayan, A. R. H. Scalable biocatalytic C–H oxyfunctionalization reactions. Chem. Soc. Rev. 49, 8137–8155 (2020). This article presents an excellent review on the current state of the art of C–H oxyfunctionalizations of organic molecules using biocatalysis.
Liao, C. & Seebeck, F. P. S-Adenosylhomocysteine as a methyl transfer catalyst in biocatalytic methylation reactions. Nat. Catal. 2, 696–701 (2019).
Marsh, C. O. et al. A natural Diels–Alder biocatalyst enables efficient [4 + 2] cycloaddition under harsh reaction conditions. ChemCatChem 11, 5027–5031 (2019).
Fisher, B. F., Snodgrass, H. M., Jones, K. A., Andorfer, M. C. & Lewis, J. C. Site-selective C–H halogenation using flavin-dependent halogenases identified via family-wide activity profiling. ACS Cent. Sci. 5, 1844–1856 (2019).
Galanie, S., Entwistle, D. & Lalonde, J. Engineering biosynthetic enzymes for industrial natural product synthesis. Nat. Prod. Rep. 37, 1122–1143 (2020).
Schultz, B. J., Kim, S. Y., Lau, W. & Sattely, E. S. Total biosynthesis for milligram-scale production of etoposide intermediates in a plant chassis. J. Am. Chem. Soc. 141, 19231–19235 (2019).
Chen, M., Liu, C. T. & Tang, Y. Discovery and biocatalytic application of a PLP-dependent amino acid γ-substitution enzyme that catalyzes C–C bond formation. J. Am. Chem. Soc. 142, 10506–10515 (2020).
Bar-Even, A. et al. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry 50, 4402–4410 (2011).
Goldsmith, M. & Tawfik, D. S. Enzyme engineering: reaching the maximal catalytic efficiency peak. Curr. Opin. Struct. Biol. 47, 140–150 (2017).
Handelsman, J. Metagenomics: application of genomics to uncultured microorganisms. Microbiol. Mol. Biol. Rev. 68, 669–685 (2004).
Iqbal, H. A., Feng, Z. & Brady, S. F. Biocatalysts and small molecule products from metagenomic studies. Curr. Opin. Chem. Biol. 16, 109–116 (2012).
Coscolín, C. et al. Bioprospecting reveals class ω-transaminases converting bulky ketones and environmentally relevant polyamines. Appl. Environ. Microbiol. 85, e02404-18 (2019).
Green, A. P., Turner, N. J. & O’Reilly, E. Chiral amine synthesis using ω-transaminases: an amine donor that displaces equilibria and enables high-throughput screening. Angew. Chem. Int. Ed. 53, 10714–10717 (2014).
Baud, D., Ladkau, N., Moody, T. S., Ward, J. M. & Hailes, H. C. A rapid, sensitive colorimetric assay for the high-throughput screening of transaminases in liquid or solid-phase. Chem. Commun. 51, 17225–17228 (2015).
Nasseri, S. A., Betschart, L., Opaleva, D., Rahfeld, P. & Withers, S. G. A mechanism-based approach to screening metagenomic libraries for discovery of unconventional glycosidases. Angew. Chem. Int. Ed. 57, 11359–11364 (2018).
Colin, P. Y. et al. Ultrahigh-throughput discovery of promiscuous enzymes by picodroplet functional metagenomics. Nat. Commun. 6, 1–12 (2015).
Rahfeld, P. et al. An enzymatic pathway in the human gut microbiome that converts A to universal O type blood. Nat. Microbiol. 4, 1475–1485 (2019). This article identifies a two-enzyme system from a large gut microbiome library that enables generation of universal O-type blood from A-type donors.
Smith, D. R. M. et al. An unusual flavin-dependent halogenase from the metagenome of the marine sponge Theonella swinhoei WA. ACS Chem. Biol. 12, 1281–1287 (2017).
Baud, D., Jeffries, J. W. E., Moody, T. S., Ward, J. M. & Hailes, H. C. A metagenomics approach for new biocatalyst discovery: application to transaminases and the synthesis of allylic amines. Green Chem. 19, 1134–1143 (2017).
Armstrong, Z. et al. Metagenomics reveals functional synergy and novel polysaccharide utilization loci in the Castor canadensis fecal microbiome. ISME J. 12, 2757–2769 (2018).
Bornscheuer, U. T. et al. Engineering the third wave of biocatalysis. Nature 485, 185–194 (2012).
Arnold, F. H. Directed evolution: bringing new chemistry to life. Angew. Chem. Int. Ed. 57, 4143–4148 (2018). This article presents a review of directed evolution by the 2018 Nobel Prize Laureate for Chemistry.
Qu, G., Li, A., Acevedo-Rocha, C. G., Sun, Z. & Reetz, M. T. The crucial role of methodology development in directed evolution of selective enzymes. Angew. Chem. Int. Ed. 59, 13204–13231 (2020).
Reetz, M. T., Bocola, M., Carballeira, J. D., Zha, D. & Vogel, A. Expanding the range of substrate acceptance of enzymes: combinatorial active-site saturation test. Angew. Chem. Int. Ed. 44, 4192–4196 (2005).
Reetz, M. T., Kahakeaw, D. & Lohmer, R. Addressing the numbers problem in directed evolution. ChemBioChem 9, 1797–1804 (2008).
Kille, S. et al. Reducing codon redundancy and screening effort of combinatorial protein libraries created by saturation mutagenesis. ACS Synth. Biol. 2, 83–92 (2013).
Cadwell, R. C. & Joyce, G. F. Mutagenic PCR. CSH Protoc. https://doi.org/10.1101/pdb.prot4143 (2006).
Stemmer, W. P. C. Rapid evolution of a protein in vitro by DNA shuffling. Nature 370, 389–391 (1994). This article establishes DNA shuffling as a method for generation of gene libraries for directed evolution.
Schober, M. et al. Chiral synthesis of LSD1 inhibitor GSK2879552 enabled by directed evolution of an imine reductase. Nat. Catal. 2, 909–915 (2019).
Heath, R. S., Pontini, M., Bechi, B. & Turner, N. J. Development of an R-selective amine oxidase with broad substrate specificity and high enantioselectivity. ChemCatChem 6, 996–1002 (2014).
Weiß, M. S., Pavlidis, I. V., Vickers, C., Hohne, M. & Bornscheuer, U. T. Glycine oxidase based high-throughput solid-phase assay for substrate profiling and directed evolution of (R)- and (S)-selective amine transaminases. Anal. Chem. 86, 11847–11853 (2014).
Yan, C. et al. Real-time screening of biocatalysts in live bacterial colonies. J. Am. Chem. Soc. 139, 1408–1411 (2017). This article presents recent approaches to using label-free screening methods based on mass spectrometry.
Becker, S., Schmoldt, H. U., Adams, T. M., Wilhelm, S. & Kolmar, H. Ultra-high-throughput screening based on cell-surface display and fluorescence-activated cell sorting for the identification of novel biocatalysts. Curr. Opin. Biotechnol. 15, 323–329 (2004).
Chen, T. et al. Evolution of thermophilic DNA polymerases for the recognition and amplification of C2′-modified DNA. Nat. Chem. 8, 556–562 (2016).
Agresti, J. J. et al. Ultrahigh-throughput screening in drop-based microfluidics for directed evolution. Proc. Natl Acad. Sci. USA 107, 4004–4009 (2010).
Obexer, R. et al. Emergence of a catalytic tetrad during evolution of a highly active artificial aldolase. Nat. Chem. 9, 50–56 (2017).
Wang, L. & Schultz, P. G. A general approach for the generation of orthogonal tRNAs. Chem. Biol. 8, 883–890 (2001).
Bryson, D. I. et al. Continuous directed evolution of aminoacyl-tRNA synthetases. Nat. Chem. Biol. 13, 1253–1260 (2017).
Ravikumar, A., Arrieta, A. & Liu, C. C. An orthogonal DNA replication system in yeast. Nat. Chem. Biol. 10, 175–177 (2014).
Ghislieri, D. et al. Engineering an enantioselective amine oxidase for the synthesis of pharmaceutical building blocks and alkaloid natural products. J. Am. Chem. Soc. 135, 10863–10869 (2013).
Debon, A. et al. Ultrahigh-throughput screening enables efficient single-round oxidase remodelling. Nat. Catal. 2, 740–747 (2019).
Holland-Moritz, D. A. et al. Mass activated droplet sorting (MADS) enables high-throughput screening of enzymatic reactions at nanoliter scale. Angew. Chem. Int. Ed. 59, 4470–4477 (2020). This article applies droplet sorting as one of the most successful methods for ultra-high-throughput screening of enzyme libraries.
Wan, W., Tharp, J. M. & Liu, W. R. Pyrrolysyl-tRNA synthetase: an ordinary enzyme but an outstanding genetic code expansion tool. Biochim. Biophys. Acta 1844, 1059–1070 (2014).
Warshel, A. et al. Electrostatic basis for enzyme catalysis. Chem. Rev. 106, 3210–3235 (2006).
Boehr, D. D., Nussinov, R. & Wright, P. E. The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 5, 789–796 (2009).
Osuna, S. The challenge of predicting distal active site mutations in computational enzyme design. WIREs Comput. Mol. Sci. 11, e1502 (2021).
Kiss, G., Çelebi-Ölçüm, N., Moretti, R., Baker, D. & Houk, K. N. Computational enzyme design. Angew. Chem. Int. Ed. 52, 5700–5725 (2013).
Privett, H. K. et al. Iterative approach to computational enzyme design. Proc. Natl Acad. Sci. USA 109, 3790–3795 (2012).
Wijma, H. J. et al. Enantioselective enzymes by computational design and in silico screening. Angew. Chem. 127, 3797–3801 (2015).
Davey, J. A. & Chica, R. A. Multistate approaches in computational protein design. Protein Sci. 21, 1241–1252 (2012).
Mondal, D., Kolev, V. & Warshel, A. Combinatorial approach for exploring conformational space and activation barriers in computer-aided enzyme design. ACS Catal. 10, 6002–6012 (2020).
Maria-Solano, M. A., Serrano-Hervás, E., Romero-Rivera, A., Iglesias-Fernández, J. & Osuna, S. Role of conformational dynamics in the evolution of novel enzyme function. Chem. Commun. 54, 6622–6634 (2018).
Campbell, E. C. et al. Laboratory evolution of protein conformational dynamics. Curr. Opin. Struct. Biol. 50, 49–57 (2018).
Crean, R. M., Gardner, J. M. & Kamerlin, S. C. L. Harnessing conformational plasticity to generate designer enzymes. J. Am. Chem. Soc. 142, 11324–11342 (2020).
Kreß, N., Halder, J. M., Rapp, L. R. & Hauer, B. Unlocked potential of dynamic elements in protein structures: channels and loops. Curr. Opin. Chem. Biol. 47, 109–116 (2018).
Vavra, O. et al. CaverDock: a molecular docking-based tool to analyse ligand transport through protein tunnels and channels. Bioinformatics 35, 4986–4993 (2019).
Otten, R. et al. How directed evolution reshapes the energy landscape in an enzyme to boost catalysis. Science 370, 1442–1446 (2020).
Romero-Rivera, A., Garcia-Borràs, M. & Osuna, S. Role of conformational dynamics in the evolution of retro-aldolase activity. ACS Catal. 7, 8524–8532 (2017).
Casini, A. et al. A pressure test to make 10 molecules in 90 days: external evaluation of methods to engineer biology. J. Am. Chem. Soc. 140, 4302–4316 (2018).
Gardner, J. M., Biler, M., Risso, V. A., Sanchez-Ruiz, J. M. & Kamerlin, S. C. L. Manipulating conformational dynamics to repurpose ancient proteins for modern catalytic functions. ACS Catal. 10, 4863–4870 (2020).
Pabis, A., Risso, V. A., Sanchez-Ruiz, J. M. & Kamerlin, S. C. Cooperativity and flexibility in enzyme evolution. Curr. Opin. Struct. Biol. 48, 83–92 (2018).
Liu, C. C. & Schultz, P. G. Adding new chemistries to the genetic code. Annu. Rev. Biochem. 79, 413–444 (2010). This article reviews the methods to expand the genetic code for the introduction of non-natural amino acids into proteins.
Chin, J. W. Expanding and reprogramming the genetic code. Nature 550, 53–60 (2017).
Seyedsayamdost, M. R., Xie, J., Chan, C. T. Y., Schultz, P. G. & Stubbe, J. Site-specific insertion of 3-aminotyrosine into subunit α2 of E. coli ribonucleotide reductase: direct evidence for involvement of Y730 and Y731 in radical propagation. J. Am. Chem. Soc. 129, 15060–15071 (2007).
Faraldos, J. A. et al. Probing eudesmane cation–π interactions in catalysis by aristolochene synthase with non-canonical amino acids. J. Am. Chem. Soc. 133, 13906–13909 (2011).
Wu, Y. & Boxer, S. G. A critical test of the electrostatic contribution to catalysis with noncanonical amino acids in ketosteroid isomerase. J. Am. Chem. Soc. 138, 11890–11895 (2016).
Huguenin-Dezot, N. et al. Trapping biosynthetic acyl-enzyme intermediates with encoded 2,3-diaminopropionic acid. Nature 565, 112–117 (2019).
Ortmayer, M. et al. Rewiring the ‘push–pull’ catalytic machinery of a heme enzyme using an expanded genetic code. ACS Catal. 10, 2735–2746 (2020).
Burke, A. J. et al. Design and evolution of an enzyme with a non-canonical organocatalytic mechanism. Nature 570, 219–223 (2019). This article is a recent example of using non-canonical amino acids in protein evolution to obtain new enzyme activities.
Drienovská, I., Mayer, C., Dulson, C. & Roelfes, G. A designer enzyme for hydrazone and oxime formation featuring an unnatural catalytic aniline residue. Nat. Chem. 10, 946–952 (2018).
Santoro, S. W., Wang, L., Herberich, B., King, D. S. & Schultz, P. G. An efficient system for the evolution of aminoacyl-tRNA synthetase specificity. Nat. Biotechnol. 20, 1044–1048 (2002).
Pott, M. et al. A noncanonical proximal heme ligand affords an efficient peroxidase in a globin fold. J. Am. Chem. Soc. 140, 1535–1543 (2018).
Li, J. C., Liu, T., Wang, Y., Mehta, A. P. & Schultz, P. G. Enhancing protein stability with genetically encoded noncanonical amino acids. J. Am. Chem. Soc. 140, 15997–16000 (2018).
Wurz, R. P. Chiral dialkylaminopyridine catalysts in asymmetric synthesis. Chem. Rev. 107, 5570–5595 (2007).
Bolon, D. N. & Mayo, S. L. Enzyme-like proteins by computational design. Proc. Natl Acad. Sci. USA 98, 14274–14279 (2001).
Richter, F. et al. Computational design of catalytic dyads and oxyanion holes for ester hydrolysis. J. Am. Chem. Soc. 134, 16197–16206 (2012).
Moroz, Y. S. et al. New tricks for old proteins: single mutations in a nonenzymatic protein give rise to various enzymatic activities. J. Am. Chem. Soc. 137, 14905–14911 (2015).
Burton, A. J., Thomson, A. R., Dawson, W. M., Brady, R. L. & Woolfson, D. N. Installing hydrolytic activity into a completely de novo protein framework. Nat. Chem. 8, 837–844 (2016).
Nazor, J., Liu, J. & Huisman, G. Enzyme evolution for industrial biocatalytic cascades. Curr. Opin. Biotechnol. 69, 182–190 (2021). This review focuses on recent industrial biocatalytic cascades.
McIntosh, J. A. & Owens, A. Enzyme engineering for biosynthetic cascades. Curr. Opin. Green Sustain. Chem. 29, 100448 (2021).
Schrittwieser, J. H., Velikogne, S., Hall, M. & Kroutil, W. Artificial biocatalytic linear cascades for preparation of organic molecules. Chem. Rev. 118, 270–348 (2018).
Mayer, S. F., Kroutil, W. & Faber, K. Enzyme-initiated domino (cascade) reactions. Chem. Soc. Rev. 30, 332–339 (2001).
García-Junceda, E., Lavandera, I., Rother, D. & Schrittwieser, J. H. (Chemo)enzymatic cascades — nature’s synthetic strategy transferred to the laboratory. J. Mol. Catal. B Enzym. 114, 1–6 (2015).
Rudroff, F. et al. Opportunities and challenges for combining chemo- and biocatalysis. Nat. Catal. 1, 12–22 (2018).
France, S. P., Hepworth, L. J., Turner, N. J. & Flitsch, S. L. Constructing biocatalytic cascades: in vitro and in vivo approaches to de novo multi-enzyme pathways. ACS Catal. 7, 710–724 (2017). This article reviews the literature on a wide range of multienzyme de novo cascades using isolated enzymes and whole-cell systems.
Turner, N. J. & O’Reilly, E. Biocatalytic retrosynthesis. Nat. Chem. Biol. 9, 285–288 (2013). This review develops the concept of biocatalytic retrosynthesis.
Hönig, M., Sondermann, P., Turner, N. J. & Carreira, E. M. Enantioselective chemo- and biocatalysis: partners in retrosynthesis. Angew. Chem. Int. Ed. 56, 8942–8973 (2017).
Green, A. P. & Turner, N. J. Biocatalytic retrosynthesis: redesigning synthetic routes to high-value chemicals. Perspect. Sci. 9, 42–48 (2016).
de Souza, R. O. M. A., Miranda, L. S. M. & Bornscheuer, U. T. A retrosynthesis approach for biocatalysis in organic synthesis. Chem. A Eur. J. 23, 12040–12063 (2017).
Bachmann, B. O. Biosynthesis: is it time to go retro? Nat. Chem. Biol. 6, 390–393 (2010).
Hadadi, N. & Hatzimanikatis, V. Design of computational retrobiosynthesis tools for the design of de novo synthetic pathways. Curr. Opin. Chem. Biol. 28, 99–104 (2015).
Coley, C. W., Green, W. H. & Jensen, K. F. Machine learning in computer-aided synthesis planning. Acc. Chem. Res. 51, 1281–1289 (2018).
Mohamad, N. R., Marzuki, N. H. C., Buang, N. A., Huyop, F. & Wahab, R. A. An overview of technologies for immobilization of enzymes and surface analysis techniques for immobilized enzymes. Biotechnol. Biotechnol. Equip. 29, 205–220 (2015).
Basso, A. & Serban, S. Industrial applications of immobilized enzymes — a review. Mol. Catal. 479, 110607 (2019).
Fryszkowska, A. & Devine, P. N. Biocatalysis in drug discovery and development. Curr. Opin. Chem. Biol. 55, 151–160 (2020).
Latham, J. et al. in Applied Biocatalysis (eds Whittall, J. & Sutton, P. W.) 1–25 (Wiley, 2020).
Prier, C. K. & Kosjek, B. Recent preparative applications of redox enzymes. Curr. Opin. Chem. Biol. 49, 105–112 (2019).
Woodley, J. M. New frontiers in biocatalysis for sustainable synthesis. Curr. Opin. Green Sustain. Chem. 21, 22–26 (2020).
Sheldon, R. A. & Woodley, J. M. Role of biocatalysis in sustainable chemistry. Chem. Rev. 118, 801–838 (2018).
Sheldon, R. A. Metrics of green chemistry and sustainability: past, present, and future. ACS Sustain. Chem. Eng. 6, 32–48 (2018).
Tieves, F. et al. Energising the E-factor: the E+-factor. Tetrahedron 75, 1311–1314 (2019).
Hinzmann, A., Glinski, S., Worm, M. & Gröger, H. Enzymatic synthesis of aliphatic nitriles at a substrate loading of up to 1.4 kg/l: a biocatalytic record achieved with a heme protein. J. Org. Chem. 84, 4867–4872 (2019).
Bornadel, A. et al. Technical considerations for scale-up of imine-reductase-catalyzed reductive amination: a case study. Org. Process Res. Dev. 23, 1262–1268 (2019).
Hülsewede, D., Meyer, L. & von Langermann, J. Application of in situ product crystallization and related techniques in biocatalytic processes. Chem. A Eur. J. 25, 4871–4884 (2019).
Fellechner, O., Blatkiewicz, M. & Smirnova, I. Reactive separations for in situ product removal of enzymatic reactions: a review. Chem. Ing. Tech. 91, 1522–1543 (2019).
Aalbers, F. S. et al. Approaching boiling point stability of an alcohol dehydrogenase through computationally-guided enzyme engineering. eLife 9, e54639 (2020).
Gumulya, Y. et al. Engineering highly functional thermostable proteins using ancestral sequence reconstruction. Nat. Catal. 1, 878–888 (2018).
Thomas, A., Cutlan, R., Finnigan, W., van der Giezen, M. & Harmer, N. Highly thermostable carboxylic acid reductases generated by ancestral sequence reconstruction. Commun. Biol. 2, 1–12 (2019).
Nicoll, C. R. et al. Ancestral-sequence reconstruction unveils the structural basis of function in mammalian FMOs. Nat. Struct. Mol. Biol. 27, 14–24 (2020).
Gomez-Fernandez, B. J., Risso, V. A., Rueda, A., Sanchez-Ruiz, J. M. & Alcalde, M. Ancestral resurrection and directed evolution of fungal mesozoic laccases. Appl. Environ. Microbiol. 86, e00778-20 (2020).
Carletti, M. S. et al. Revenant: a database of resurrected proteins. Database. 2020, 31 (2020).
Truppo, M. D., Strotman, H. & Hughes, G. Development of an immobilized transaminase capable of operating in organic solvent. ChemCatChem 4, 1071–1074 (2012).
Mattey, A. P. et al. Natural heterogeneous catalysis with immobilised oxidase biocatalysts. RSC Adv. 10, 19501–19505 (2020).
Böhmer, W. et al. Highly efficient production of chiral amines in batch and continuous flow by immobilized ω-transaminases on controlled porosity glass metal-ion affinity carrier. J. Biotechnol. 291, 52–60 (2019).
Britton, J., Majumdar, S. & Weiss, G. A. Continuous flow biocatalysis. Chem. Soc. Rev. 47, 5891–5918 (2018).
Rodrigues, R. C., Ortiz, C., Berenguer-Murcia, Á., Torres, R. & Fernández-Lafuente, R. Modifying enzyme activity and selectivity by immobilization. Chem. Soc. Rev. 42, 6290–6307 (2013).
Sheldon, R. A. in Green Biocatalysis (ed. Patel, R. N.) 1–15 (Wiley, 2016).
Cespugli, M. et al. Rice husk as an inexpensive renewable immobilization carrier for biocatalysts employed in the food, cosmetic and polymer sectors. Catalysts 8, 471 (2018).
Woodley, J. M. Towards the sustainable production of bulk-chemicals using biotechnology. N. Biotechnol. 59, 59–64 (2020).
Hughes, D. L. Biocatalysis in drug development — highlights of the recent patent literature. Org. Process. Res. Dev. 22, 1063–1080 (2018).
de María, P., de Gonzalo, G. & Alcántara, A. Biocatalysis as useful tool in asymmetric synthesis: an assessment of recently granted patents (2014–2019). Catalysts 9, 802 (2019).
Hauer, B. Embracing nature’s catalysts: a viewpoint on the future of biocatalysis. ACS Catal. 10, 8418–8427 (2020). This article presents an insightful review of the future challenges of biocatalysis in academia and industry.
Jiao, S., Li, F., Yu, H. & Shen, Z. Advances in acrylamide bioproduction catalyzed with Rhodococcus cells harboring nitrile hydratase. Appl. Microbiol. Biotechnol. 104, 1001–1012 (2020).
Savile, C. K. et al. Biocatalytic asymmetric synthesis of chiral amines from ketones applied to sitagliptin manufacture. Science 329, 305–309 (2010).
Eichhorn, E. et al. Biocatalytic process for (−)-ambrox production using squalene hopene cyclase. Adv. Synth. Catal. 360, 2339–2351 (2018).
Baker Dockrey, S. A., Lukowski, A. L., Becker, M. R. & Narayan, A. R. H. Biocatalytic site- and enantioselective oxidative dearomatization of phenols. Nat. Chem. 10, 119–125 (2018).
Li, G., Wang, J.-B. & Reetz, M. T. Biocatalysts for the pharmaceutical industry created by structure-guided directed evolution of stereoselective enzymes. Bioorg. Med. Chem. 26, 1241–1251 (2018).
Arora, K. K. et al. Manufacturing Process and Intermediates for a Pyrrolo[2,3-d]Pyrimidine Compound and Use Thereof. US Patent 10,815,240 (2020).
Jaeger, K. E., Eggert, T., Eipper, A. & Reetz, M. T. Directed evolution and the creation of enantioselective biocatalysts. Appl. Microbiol. Biotechnol. 55, 519–530 (2001).
Manning, J. et al. Regio- and enantio-selective chemo-enzymatic C–H-lactonization of decanoic acid to (S)-δ-decalactone. Angew. Chem. Int. Ed. 58, 5668–5671 (2019).
Zhang, J. et al. Engineered C–N lyase: enantioselective synthesis of chiral synthons for artificial dipeptide sweeteners. Angew. Chem. 132, 437–443 (2020).
Bruggink, A., Schoevaart, R. & Kieboom, T. Concepts of nature in organic synthesis: cascade catalysis and multistep conversions in concert. Org. Process. Res. Dev. 7, 622–640 (2003).
Sperl, J. M. & Sieber, V. Multienzyme cascade reactions — status and recent advances. ACS Catal. 8, 2385–2396 (2018).
Lenz, M., Borlinghaus, N., Weinmann, L. & Nestl, B. M. Recent advances in imine reductase-catalyzed reactions. World J. Microbiol. Biotechnol. 33, 199 (2017).
Schaffer, S. et al. Producing amines and diamines from a carboxylic acid or dicarboxylic acid or a monoester thereof. US Patent 9,725,746 (2017).
Benkovics, T. et al. Evolving to an ideal synthesis of molnupiravir, an investigational treatment for COVID-19. Preprint at https://doi.org/10.26434/chemrxiv.13472373.v1 (2020).
Yin, Z. et al. Computing platforms for big biological data analytics: perspectives and challenges. Comput. Struct. Biotechnol. J. 15, 403–411 (2017).
Agarwala, R. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 44, D7–D19 (2016).
Goodman, J. Computer software review: Reaxys. J. Chem. Inf. Model. 49, 2897–2898 (2009).
Garritano, J. R. Evolution of SciFinder, 2011–2013: new features, new content. Sci. Technol. Libr. 32, 346–371 (2013).
Jeske, L., Placzek, S., Schomburg, I., Chang, A. & Schomburg, D. BRENDA in 2019: a European ELIXIR core data resource. Nucleic Acids Res. 47, D542–D549 (2019).
Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y. & Morishima, K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45, D353–D361 (2017).
Gunera, J., Kindinger, F., Li, S. M. & Kolb, P. PrenDB, a substrate prediction database to enable biocatalytic use of prenyltransferases. J. Biol. Chem. 292, 4003–4021 (2017).
Van Santen, J. A. et al. The natural products atlas: an open access knowledge base for microbial natural products discovery. ACS Cent. Sci. 5, 1824–1833 (2019).
Tipton, K. F. et al. Standards for reporting enzyme data: the STRENDA consortium: what it aims to do and why it should be helpful. Perspect. Sci. 1, 131–137 (2014).
Swainston, N. et al. STRENDA DB: enabling the validation and sharing of enzyme kinetics data. FEBS J. 285, 2193–2204 (2018).
Perkel, J. M. The Internet of Things comes to the lab. Nature 542, 125–126 (2017).
Jennings-Antipov, L. D. & Gardner, T. S. Digital publishing isn’t enough: the case for ‘blueprints’ in scientific communication. Emerg. Top. Life Sci. 2, 755–758 (2018).
Goodwin, N. C., Morrison, J. P., Fuerst, D. E. & Hadi, T. Biocatalysis in medicinal chemistry: challenges to access and drivers for adoption. ACS Med. Chem. Lett. 10, 1363–1366 (2019).
Nielsen, J. & Keasling, J. D. Engineering cellular metabolism. Cell 164, 1185–1197 (2016).
Truppo, M. D. Biocatalysis in the pharmaceutical industry: the need for speed. ACS Med. Chem. Lett. 8, 476–480 (2017). This article presents an excellent review of the adaptation and challenges of biocatalysis in the pharmaceutical industry, with particular focus on timescales of process development.
Yang, K. K., Wu, Z. & Arnold, F. H. Machine-learning-guided directed evolution for protein engineering. Nat. Methods 16, 687–694 (2019).
Markel, U. et al. Advances in ultrahigh-throughput screening for directed enzyme evolution. Chem. Soc. Rev. 49, 233–262 (2020).
Bunzel, H. A., Garrabou, X., Pott, M. & Hilvert, D. Speeding up enzyme discovery and engineering with ultrahigh-throughput methods. Curr. Opin. Struct. Biol. 48, 149–156 (2018).
Hillson, N. et al. Building a global alliance of biofoundries. Nat. Commun. 10, 1–4 (2019).
Windle, C. L., Müller, M., Nelson, A. & Berry, A. Engineering aldolases as biocatalysts. Curr. Opin. Chem. Biol. 19, 25–33 (2014).
Brovetto, M., Gamenara, D., Saenz Méndez, P. & Seoane, G. A. C–C bond-forming lyases in organic synthesis. Chem. Rev. 111, 4346–4403 (2011).
Zetzsche, L. E. & Narayan, A. R. H. Broadening the scope of biocatalytic C–C bond formation. Nat. Rev. Chem. 4, 334–346 (2020).
Li, Z. et al. Engineering cytochrome P450 enzyme systems for biomedical and biotechnological applications. J. Biol. Chem. 295, 833–849 (2020).
Fessner, N. D. P450 monooxygenases enable rapid late-stage diversification of natural products via C–H bond activation. ChemCatChem 11, 2226–2242 (2019).
Lall, M. S. et al. Late-stage lead diversification coupled with quantitative nuclear magnetic resonance spectroscopy to identify new structure–activity relationship vectors at nanomole-scale synthesis: application to loratadine, a human histamine H1 receptor inverse agonist. J. Med. Chem. 63, 7268–7292 (2020).
Dong, J. J. et al. Biocatalytic oxidation reactions: a chemist’s perspective. Angew. Chem. Int. Ed. 57, 9238–9261 (2018).
Silverman, A. D., Karim, A. S. & Jewett, M. C. Cell-free gene expression: an expanded repertoire of applications. Nat. Rev. Genet. 21, 151–170 (2020).
Khambhati, K. et al. Exploring the potential of cell-free protein synthesis for extending the abilities of biological systems. Front. Bioeng. Biotechnol. 7, 248 (2019).
Zimmerman, J. B., Anastas, P. T., Erythropel, H. C. & Leitner, W. Designing for a green chemistry future. Science 367, 397–400 (2020).
Hammer, S. C., Knight, A. M. & Arnold, F. H. Design and evolution of enzymes for non-natural chemistry. Curr. Opin. Green Sustain. Chem. 7, 23–30 (2017).
DeHovitz, J. S. et al. Static to inducibly dynamic stereocontrol: the convergent use of racemic β-substituted ketones. Science 369, 1113–1118 (2020).
O’Reilly, E., Köhler, V., Flitsch, S. L. & Turner, N. J. Cytochromes P450 as useful biocatalysts: addressing the limitations. Chem. Commun. 47, 2490–2501 (2011).
Basler, S. et al. Efficient Lewis acid catalysis of an abiological reaction in a de novo protein scaffold. Nat. Chem. 13, 231–235 (2021).
Liu, Z. & Arnold, F. H. New-to-nature chemistry from old protein machinery: carbene and nitrene transferases. Curr. Opin. Biotechnol. 69, 43–51 (2021). This review describes recent developments of new to nature reactions catalysed by engineered enzymes.
Callaway, E. ‘It will change everything’: DeepMind’s AI makes gigantic leap in solving protein structures. Nature 588, 203–204 (2020). This article presents a recent breakthrough in computational protein structure prediction from the primary sequence.
Dou, J. et al. De novo design of a fluorescence-activating β-barrel. Nature 561, 485–491 (2018).
Senior, A. W. et al. Improved protein structure prediction using potentials from deep learning. Nature 577, 706–710 (2020).
Huang, P. S., Boyken, S. E. & Baker, D. The coming of age of de novo protein design. Nature 537, 320–327 (2016).
Fischer, M. & Pleiss, J. The Lipase Engineering Database: a navigation and analysis tool for protein families. Nucleic Acids Res. 31, 319–321 (2003).
Lombard, V., Golaconda Ramulu, H., Drula, E., Coutinho, P. M. & Henrissat, B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 42, D490–D495 (2014). This article discusses the curated comprehensive database of carbohydrate-active enzymes (CAZymes) that has been a useful tool to the glycoscience community.
Savelli, B. et al. RedoxiBase: a database for ROS homeostasis regulated proteins. Redox Biol. 26, 101247 (2019).

Acknowledgements

The authors are grateful for funding from the European Research Council (ERC) (S.L.F., 788231; A.P.G., 757991; S.O., 679001; N.J.T., 742987), the Engineering and Physical Sciences Research Council (EPSRC) (S.L.F. and N.J.T., EP/S005226/1), the Biotechnology and Biological Sciences Research Council (BBSRC) (S.L.F. and N.J.T., BB/M027791/1, BB/M028836/1; A.P.G., BB/M027023/1), the Spanish Ministry of Economy and Competitiveness (MINECO) (S.O., PGC2018-102192-B-I00), Generalitat de Catalunya (S.O., SGR 2017 1707) and the University of Manchester (Presidential Fellowship to S.L.L.).

Author information

Authors and Affiliations

School of Chemistry and MIB, The University of Manchester, Manchester, UK
Elizabeth L. Bell, William Finnigan, Anthony P. Green, Lorna J. Hepworth, Sarah L. Lovelock, Nicholas J. Turner & Sabine L. Flitsch
Pfizer Worldwide Research and Development, Groton, CT, USA
Scott P. France
Compound Synthesis and Management, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
Martin A. Hayes & Elvira Romero
Department of Chemistry, The University of British Columbia, Vancouver, British Columbia, Canada
Haruka Niikura & Katherine S. Ryan
CompBioLab Group, Institut de Quimica Computacional i Catalisi, Departament de Química, Universitat de Girona, Girona, Spain
Sílvia Osuna
ICREA, Barcelona, Spain
Sílvia Osuna

Authors

Elizabeth L. Bell
William Finnigan
Scott P. France
Anthony P. Green
Martin A. Hayes
Lorna J. Hepworth
Sarah L. Lovelock
Haruka Niikura
Sílvia Osuna
Elvira Romero
Katherine S. Ryan
Nicholas J. Turner
Sabine L. Flitsch

Contributions

Introduction (S.L.F., H.N. and K.S.R.); Experimentation (E.L.B., A.P.G., H.N., S.O., K.S.R., N.J.T., S.L.L., L.J.H. and W.F.); Results (M.A.H. and E.R.); Applications (S.P.F., M.A.H. and E.R.); Reproducibility and data deposition (W.F. and L.J.H.); Limitations and optimizations (W.F., L.J.H., M.A.H. and E.R.); Outlook (S.L.F.); overview of Primer (S.L.F.). All authors contributed equally to planning and revision of the manuscript as described. Please note that co-authors have been listed in alphabetical order.

Corresponding author

Correspondence to Sabine L. Flitsch.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information

Nature Reviews Methods Primers thanks L. Betancor, A. Fryszkowska and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Enzyme cascade: Within the biocatalysis community this term is used broadly for concurrent, multienzyme one-pot biocatalytic reactions as well as reactions in which components are added sequentially or process steps are telescoped.
Metagenomic libraries: Genomic libraries constructed by directly cloning large fragments of environmental DNA into an appropriate vector, which is then transformed into host bacteria.
C(sp³)–H functionalization: A type of reaction in which a C–H bond, in which the carbon is sp³ hybridized, is cleaved and a new C–X bond is formed (where X is usually carbon, oxygen, nitrogen or a halide).
Diels–Alderases: Enzymes that catalyse a [4 + 2] cycloaddition reaction between a conjugated diene and a substituted alkene forming a cyclohexene derivative.
Rates of catalysis: The rates by which substrates are converted into products in catalytic reactions.
Saturation mutagenesis: A method that allows the randomization of a target codon or set of codons in a gene.
Iterative combinatorial active site testing: A method that allows the generation of DNA libraries where active site positions are randomized in pairs.
Error-prone PCR: A PCR (polymerase chain reaction) that is run under reaction conditions that introduce random mutations into the target DNA sequence.
Gene shuffling: A method that allows for the generation of chimeric libraries of genes.
DNA shuffling: A method that allows the recombination of beneficial mutations in a directed evolution experiment.
High performance liquid chromatography: An analytical technique that allows for the rapid separation and quantification of compound mixtures using pressurized liquid solvent passed through chromatographic columns.
Nonsense codon: A codon within the genetic code that does not encode an amino acid but is recognized as a stop signal during translation of the mRNA.
Regioselectivity: The property that favours bond formation or breaking at a particular atom over all other possible atoms in a molecule.
Enzyme operational stability: Retention of enzyme activity when the enzyme is in use.
Evolvability: Capacity of an enzyme to acquire beneficial properties or functions through genetic modification.
Design of experiments: A statistical approach to analyse the influence of various factors in a system to predict the optimal operating conditions.
BLAST (Basic Local Alignment Search Tool): A tool that compares nucleotide or protein sequences of interest (most commonly to sequences within a database), and finds regions of statistically significant similarity.
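
As a rough illustration of the saturation mutagenesis entry above, the following Python sketch expands a degenerate codon into its concrete codons and counts the amino acids and stop codons it encodes. The NNK scheme (N = any base, K = G or T) is an assumption chosen for illustration, as are the function names `expand_codon` and `summarize`; the glossary does not prescribe a particular degenerate codon. The translation table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons ordered by base T, C, A, G at each
# position (third position varying fastest). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Subset of the IUPAC nucleotide ambiguity codes needed for this example.
# Assumption: only N and K are used; other codes could be added the same way.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "K": "GT",    # G or T
}


def expand_codon(degenerate_codon):
    """Return every concrete codon matched by a three-letter degenerate codon."""
    choices = (DEGENERATE[symbol] for symbol in degenerate_codon)
    return ["".join(bases) for bases in product(*choices)]


def summarize(degenerate_codon):
    """Count codons, distinct amino acids and stop codons for one randomized position."""
    codons = expand_codon(degenerate_codon)
    residues = [CODON_TABLE[codon] for codon in codons]
    return {
        "codons": len(codons),
        "amino_acids": len(set(residues) - {"*"}),
        "stops": residues.count("*"),
    }


if __name__ == "__main__":
    # Compare the hypothetical NNK library with full NNN randomization.
    for scheme in ("NNK", "NNN"):
        print(scheme, summarize(scheme))
```

Under these assumptions, the comparison shows why a reduced codon set can be attractive: NNK still covers all 20 amino acids while halving the number of codons per position and leaving a single stop codon.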

About this article

Cite this article

Bell, E.L., Finnigan, W., France, S.P. et al. Biocatalysis. Nat Rev Methods Primers 1, 46 (2021). https://doi.org/10.1038/s43586-021-00044-z

Accepted: 20 May 2021
Published: 24 June 2021
DOI: https://doi.org/10.1038/s43586-021-00044-z
