Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

# Engineering the protein dynamics of an ancestral luciferase

## Abstract

Protein dynamics are often invoked in explanations of enzyme catalysis, but their design has proven elusive. Here we track the role of dynamics in evolution, starting from the evolvable and thermostable ancestral protein AncHLD-RLuc which catalyses both dehalogenase and luciferase reactions. Insertion-deletion (InDel) backbone mutagenesis of AncHLD-RLuc challenged the scaffold dynamics. Screening for both activities reveals InDel mutations localized in three distinct regions that lead to altered protein dynamics (based on crystallographic B-factors, hydrogen exchange, and molecular dynamics simulations). An anisotropic network model highlights the importance of the conformational flexibility of a loop-helix fragment of Renilla luciferases for ligand binding. Transplantation of this dynamic fragment leads to lower product inhibition and highly stable glow-type bioluminescence. The success of our approach suggests that a strategy comprising (i) constructing a stable and evolvable template, (ii) mapping functional regions by backbone mutagenesis, and (iii) transplantation of dynamic features, can lead to functionally innovative proteins.

## Introduction

Contemporary biocatalysts may originate from a small number of possibly multifunctional common ancestors and a limited number of structural folds. Natural enzymes have undergone billions of years of evolution to generate vast functional diversity and strikingly precise and efficient activities. This diversity resulted from complex evolutionary processes1, which have been grouped into two complementary mechanisms: creeping and leaping evolution2. The former involves relatively minor functional changes that generally increase specificity or activity, whereas the latter involves radical shifts that introduce functional innovations such as the ability to bind a completely different substrate or change an enzyme’s mechanism. While small adaptive changes may result from point substitutions introduced during evolution3, larger functional leaps may require more profound rearrangements of the protein backbone4. However, the potential for innovation comes at the price of disruption and destabilization. One way to harvest the effects of profound modifications is to infer ancestral enzymes5 with features that enhance evolvability: stability and promiscuity6,7. This enables the creation of robust generalist scaffolds that may be more catalytically versatile because they are less burdened by adaptive pressure towards a specific function than their modern-day counterparts8,9. If ancestors are indeed more stable, their capacity to buffer the large mutational load arising from backbone modifications4 may be enhanced.

Most directed evolution efforts rely on point substitutions. There have been far fewer reports of experimental insertions or deletions (InDels), despite their relatively frequent and beneficial occurrence in natural evolution10. Various genomic analyses show that the ratio of InDels to point substitutions in protein-coding regions typically ranges from 1:5 in primates to 1:20 in bacteria, indicating that InDels are typically subjected to stronger purifying selection than substitutions during evolution. This is due to their potentially more deleterious effects, including losses of stability, disruptions of secondary structure elements and/or perturbations of folding pathways11. To explore the potential effects of InDel mutagenesis on functional proteins, we previously developed TRIAD (transposition-based random insertions and deletions)12,13, a method for generating random InDel libraries that provides ready access to variants that cannot be obtained by substitution mutagenesis https://doi.org/10.21203/rs.3.pex-1448/v1. TRIAD is transposon-based and was shown to have only minimal sequence bias, with >85% of all possible sites shown to be targeted by the transposon12. In the present work, TRIAD is applied to a particularly stable ancestral protein14, the recently designed and characterized AncHLD-RLuc that was reconstructed from the catalytically distinct but evolutionarily and structurally related haloalkane dehalogenases15 (HLD, EC 3.8.1.5) and Renilla luciferase16 (RLuc, EC 1.13.12.5). This ancestor14 is bifunctional and catalytically versatile, with a dehalogenase activity comparable to contemporary HLDs and a promiscuous luciferase activity almost 7000-fold lower than that of the stabilized RLuc8, a popular molecular probe. The light-producing reaction catalysed by the Renilla-type luciferase is one of the most widely used biochemical reactions in molecular and cell biology research. Based on conformational differences of the RLuc8 backbone in crystal structures17, it had been suggested that the opening of the L9 loop might be relevant for luciferase substrate binding. The availability of AncHLD-RLuc, a stable bi-functional scaffold14, enabled investigation of structure-function relationships of luciferase activity.

Here we present a three-step protein engineering strategy designed to exploit the effects of InDels as promotors of evolutionary innovation: (i) constructing a robust and evolvable template, (ii) mapping functional regions by backbone mutagenesis and multivariate statistics and (iii) transplantation of a dynamic structural (loop-helix) feature. Its application to the bifunctional protein AncHLD-RLuc (Fig. 1) results in an engineered 7000-fold more efficient catalyst with 100-fold longer glow-type bioluminescence applicable as a molecular probe for use in bacterial as well as mammalian cells.

## Results

### Structure–function relationships by analysis of InDel libraries

Libraries of AncHLD-RLuc with random single amino acid insertions (1st round, R1I) and deletions (1st round, R1D) distributed over the length of the protein were constructed using TRIAD12 and screened for LUC and HLD activities (Supplementary Note 1). All hits with significantly improved LUC activity had backbone alterations localized in one of three regions of the enzyme cap domain: the L9 loop, the α4 helix or the L14 loop (Fig. 2a, Supplementary Table 1). We selected 11 variants (10 from R1I, 1 from R1D) for further analysis and complemented them with 15 variants with low LUC activity that retained HLD activity (8 from R1D and 7 from R1I, Source data file). All variants were expressed, purified and subjected to activity and stability measurements and thermodynamic analyses using a microfluidic approach (Supplementary Fig. 1). To understand the role of cooperativity and long-range interactions after mutagenesis, we used an anisotropic network model (ANM, Supplementary Note 2) to calculate the cross-correlation of motions of selected structural fragments (Supplementary Tables 2, 3, Supplementary Note 3) to the regions carrying the InDel mutations of the selected variants.

Multivariate partial least squares (PLS) regression (Supplementary Note 4) revealed that the initiation of protein unfolding (Tonset) and its midpoint (Tm) were the strongest predictors for both LUC and HLD activities. At Tonset (0.5% deviation from a linear fit on the 330 nm/350 nm vs. temperature curve) a protein starts to unfold, while at Tm (inflection point on the 330 nm/350 nm vs. temperature curve) 50% of the protein is unfolded. The directionality of the relationships was the opposite for HLD and LUC activities (Fig. 2b). This suggested that stabilities are linked, positively and negatively, to the two activities with their distinct chemical mechanisms and structural requirements. Moreover, cross-correlated motions of the mutated positions to the L9 loop were identified as very strong contributors to LUC activity, and cross-correlated motions to the L3 and L18 loops (which encompass catalytic residues14,18) as key contributors to HLD activity.

All the positive hits had significantly reduced thermal stability compared to the ancestral protein (up to –20 °C lower Tonset and Tm values; Source data file), in agreement with the notion that a stable template is crucial for obtaining functional mutants. Moreover, we assumed that using a highly stable starting point for engineering would promote evolvability by allowing the protein to accept a wider range of mutations while retaining its native fold19, thereby enabling multiple rounds of directed evolution. To probe this hypothesis, the best variant from R1I library (AncINS) carrying an insertion and substitution in the α4 helix was used as the template for construction of another amino acid insertion library (2nd round, R2I) that was screened for further improvement in LUC activity. Statistical analysis of activity data shows that R1I and R1D yielded populations of variants with similar LUC activity patterns, and most of the increases in relative activity obtained in these rounds were compromised in R2I (Fig. 2c, d). The proportions of variants showing at least doubled activity of the starting template generated in R1I, R1D and R2I were 13%, 15.5% and just 0.5%, respectively. As no significant activity improvements were obtained in the second round of insertion mutagenesis, no further rounds were pursued.

The finding that a single round of InDel mutagenesis of the stabilized template already provided mutants with >100-fold increases in LUC activity raises the question: why are the structural elements (L9 and L14 loops and α4 helix), highlighted by directed evolution experiments, crucial for LUC activity? To address this question, the mutant AncINS, carrying an insertion in the α4 helix, was selected as a representative and studied by transient and steady-state kinetics.

### Kinetic analysis underlines the importance of conformational flexibility for substrate binding

To gain insight into the catalytic mechanism, steady-state kinetics of LUC activity of the three luciferases (AncINS, the template AncHLD-RLuc and the modern variant RLuc8; Fig. 1). were measured with the substrate coelenterazine (CTZ). Complete luminescence progress curves (Supplementary Fig. 5) gave, by numerical simulation, estimates of the turnover number (kcat) and the enzyme specificity constant (kcat/Km). kcat provides a lower limit for each first-order rate constant following binding of a substrate through product release, while kcat/Km indicates a lower limit of the rate for substrate binding (Fig. 3a). The parameters obtained for RLuc8 correspond well with previously reported values derived from initial rate measurements combined with quantum yield calibration (Supplementary Table 4)20. Moreover, following a time course of the substrate-to-product conversion, our data provide a sensitive estimate of the equilibrium dissociation constants for each enzyme–product complex (Kp) that cannot be obtained by conventional initial rate analysis. Both kcat and kcat/Km values of AncINS were in-between those of AncHLD-RLuc and RLuc8, indicating a 124-fold enhancement of its catalytic efficiency relative to that of the ancestral enzyme.

We also investigated by pre-steady-state kinetic analysis, whether conformational changes of the protein are essential for effective binding of the bulky CTZ substrate (Fig. 3b). The combination of analytical data processing and global fitting by numerical integration revealed a two-step induced fit substrate-binding mechanism: initial collision of the enzyme with the substrate, followed by an induced conformational change of the enzyme (Fig. 3b, Supplementary Fig. 6). Details and the rationale of the mechanistic analysis are described in the Supplementary Information (Supplementary Note 5). The kinetic constants are consistent with a model in which the reaction catalysed by AncHLD-RLuc involves a slow simple binding mechanism, with no sign of conformational flexibility. By contrast, we detected conformational flexibility of AncINS, accompanied by 1000-fold faster binding, based on determined k+1 and k-1. The changes in AncINS increased rates of the initial collision step to match those of RLuc8, but a subsequent conformational change was still 10-fold slower. Faster kinetics of substrate binding after backbone modification in the α4 helix suggest that this region plays a significant role in the initial step of the catalytic cycle.

### Backbone conformational differences in crystal structures

To investigate the relationship between the structural changes induced in AncHLD-RLuc by InDel mutagenesis and enhanced kinetics of substrate binding, we crystallized AncINS (PDB ID 6S6E, Supplementary Table 6) and compared its structure to previously published structures of AncHLD-RLuc (PDB ID 6G75) and RLuc8 (PDB ID 2PSF). These proteins have identical α/β-hydrolase folds but differ in their cap domains. Important changes were identified in the conformation of the α4 helix and L9 loop in the cap domain, size of the active site cavity, width of the tunnel mouth, and active site accessibility (Fig. 4).

The asymmetric unit of the crystal lattice of AncINS contains two monomers (chains A and B). Chain A has a structure similar to the AncHLD-RLuc template but features a π-helix bulge in the α4 helix where the L162 insertion and F163P substitution occurred. In AncINS chain B, the α4 helix is markedly distorted towards the α5 helix. Moreover, the electron density map for the L9-α4 fragment is not perfectly resolved. Some side chains are poorly visible or invisible, suggesting that appreciable internal motion occurs in this region. RLuc8 also has two monomers in the asymmetric unit; chain B is similar to that of AncHLD-RLuc, while chain A is in an open conformation with the α4 helix pointing away from the α5 helix.

In AncINS, the inserted L162 occupies the same position as the bulky F162 in AncHLD-RLuc, making the active site cavity of AncINS bigger than that of AncHLD-RLuc. The F163P substitution in AncINS disrupts the α4 helix because proline acts as a helix breaker, introducing a kink into the polypeptide backbone. The resulting distortion closes the main access tunnel in the chain B of AncINS. In RLuc8, the situation is a little bit more complex. The sequence alignment (Supplementary Fig. 9) indicates that I163 is a conserved amino acid, but it does not occupy the same position as I161 in AncHLD-RLuc. Instead, it is flipped down into the position occupied by the bulky F162 in AncHLD-RLuc. The outward conformation of the α4 helix in the cap domain of RLuc8 chain A gives it the largest active site cavity of the crystal structures studied.

To ascertain the role of the L9-α4 fragment in substrate binding, we tried to co-crystallize RLuc8 with CTZ. Despite our best efforts, we did not obtain sufficiently well diffracting crystals. In contrast, mixing the RLuc8-W121F/E144Q mutant with CTZ yielded a high-resolution complex structure (Fig. 5a; Supplementary Table 6) with coelenteramide (CEI), which is the product of the LUC reaction. Unlike in an RLuc8-CEI complex structure17, the CEI molecule in our complex is inverted by nearly 180°, which allows its accommodation in the luciferase active site cavity (Fig. 5b, Supplementary Fig. 10). The rest of the CEI molecule occupies the main enzyme access tunnel, where it is tightly wrapped by multiple, predominantly hydrophobic, residues: V146, I150, I159, V185, F181, F261, F262, H285 and I266.

Interestingly, many of the residues directly interacting with the CEI are located on the loop L9 (V146, I150, W156 and I159) and the loop L14 (I223 and P224), identified as important for catalysis by our directed evolution experiments. The α4 helix is in an open conformation relative to the α5 helix (Fig. 5c). Ultimately, the L9-α4 fragment directly affects the opening and closing of the access tunnels and we deduce that it is involved in substrate binding and product release. Thus, our structural data support conclusions from InDel mutagenesis and kinetic studies that the L9-α4 fragment is the key hotspot region for the introduction of LUC activity.

### Profiling protein dynamics by mass spectrometry and molecular dynamics

To analyse the effects of dynamics on LUC activity, we recorded hydrogen-deuterium exchange mass spectrometry (HDX-MS) time courses and complementary molecular dynamics (MD) simulations (Fig. 6). In this context, it should be emphasized that HDX-MS reveals changes occurring over seconds or minutes, whereas MD simulations cover a few microseconds (in this case 4.8 μs). Kinetic analyses were carried out with the substrate, while HDX-MS and MD simulations were performed in the absence of any ligand. HDX-MS captures both dynamics and solvation.

Backbone amide hydrogen-deuterium exchange is particularly sensitive to hydrogen bonding and thus reflects hydrogen bonding strength, conformational dynamics, and the solvent accessibility of protein structures. The HDX-MS profiles (Fig. 6c, Supplementary Fig. 11) show that AncINS was deuterated more rapidly overall than the template AncHLD-RLuc, and that the modification of its α4 helix backbone leads to greater solvation and more pronounced dynamics, particularly in the cap domain region (Supplementary Fig. 11). Moreover, AncINS exhibited very high overall deuteration within 10 and 60 s, with the amino acids between residues W151 (L9 loop) and F180 (α5 helix) of the cap domain being most extensively deuterated. Residues S212-E235 (which comprise the L14 loop) were also appreciably more deuterated than their counterparts in AncHLD-RLuc. The deuteration of the α4-α5 fragment in RLuc8 reached a deuteration level comparable to the level observed in AncINS within 60 s, but it increased further in 300 s. In addition, the L9 loop of RLuc8 was less extensively deuterated than the corresponding loop in AncHLD-RLuc and the region S212-E235 (L14 loop) of RLuc8 was more heavily deuterated than in AncHLD-RLuc. These deuteration patterns in HDX-MS indicate a possible correlation between the dynamics of hot spot regions comprising the α4 and α5′ helices and L14 loop and high LUC activity. The protein dynamics are coupled to efficient binding of the bulky substrate and release of the structurally similar product, based on the kinetic constants k+1 and k-1 obtained by the pre-steady state kinetics (Fig. 3b, Supplementary Fig. 6).

The regions encompassing the L9-α4 fragment and L14 loop are important structural elements that line the main access tunnels connecting the buried active site to the surrounding solvent. Their motions are concerted based on ANM and significantly affect the active site’s volume, while fully preserving the catalytic residues’ geometry. Since HDX-MS experiments measure the exchange of backbone amide hydrogens with deuterium, we used backbone B-factors from MD simulations to characterize the enzymes’ dynamics14,21. The overall B-factors obtained for the cap domain of RLuc8 were highest in the α4 helix, followed by the α5′ helix and L9 loop (Fig. 6b). The B-factors for the L14 loop were similar to those of the L9 loop, but values for the rest of the cap domain were below average (Supplementary Fig. 12). The dynamic profile of the AncHLD-RLuc template differs significantly: unlike in RLuc8, the dynamics are most pronounced in the L9 loop. Interestingly, the AncINS mutant exhibits markedly higher B-factors in the α4 helix, making it the most dynamic region together with the L9 loop (Fig. 6b).

### Validation by fragment transplantation yields stable glow-type bioluminescence

To verify the importance of the hotspot regions highlighted by biophysical analysis, we transplanted the sequence corresponding to the L9-α4 fragment from the modern RLuc8 enzyme into the ancestral AncHLD-RLuc. The resulting AncFT protein had lower thermal stability than the ancestral variant (with ca. 7 °C lower Tm value) but folded correctly (with a CD spectrum similar to that of AncHLD-RLuc and AncINS; Supplementary Fig. 13). We also constructed a second variant with L14 transplanted from RLuc8 onto the ancestral scaffold, but this protein easily aggregated, preventing its biochemical characterization.

Transplantation of the L9-α4 flexible region from RLuc8 into the scaffold of AncHLD-RLuc affected substrate-binding kinetics (Fig. 3): AncFT was found to have comparable initial collision kinetics to AncINS and RLuc8, but the fragment transplantation enhanced the following conformational change and (unlike AncINS) AncFT reached the overall binding efficiency of the modern luciferase. These findings correspond well with the results of the steady-state analysis. Similar values of the specificity constant kcat/Km were obtained for AncFT and RLuc8, indicating comparable efficiency of substrate binding. Thus, AncFT’s lower LUC activity is due to a lower rate of the chemical transformation (i.e., the step after binding), as indicated by a 10-fold lower kcat value for AncFT compared to RLuc8 (Fig. 3a, Supplementary Table 4). On the other hand, AncFT exhibited markedly weaker product inhibition than RLuc8. This was apparent from a significantly lower Km/Kp ratio, quantifying a binding of the substrate and the product, which has an important effect on the bioluminescence signal’s stability. The significant product inhibition of RLuc8, indicated by the high Km/Kp, is one of the reasons why the bioluminescence signal provided by the luciferase rapidly decays after a strong initial flash. The fast inactivation and signal instability are major limitations of this popular molecular probe. Re-engineering of RLuc8 by ancestral reconstruction and subsequent backbone modification changed the unstable flash-type of bioluminescence to a significantly more stable glow-type bioluminescence. The half-life (t1/2) of AncFT bioluminescence was two orders of magnitude longer than that of RLuc8 (Supplementary Table 7, Fig. 7a, Supplementary Fig. 14), being consistent with a significantly lower Km/Kp ratio determined for AncFT, in comparison to RLuc8. The highly stable glow-type bioluminescence of AncFT persisted when this protein was expressed heterologously in mammalian (mouse fibroblast) cells (Fig. 7b, Supplementary Fig. 14). This result indicates that AncFT could be used as a reporter protein22 for in vivo experiments.

Crystallographic analysis of AncFT (PDB ID 6S97, Supplementary Table 6) revealed a canonical α/β-hydrolase fold similar to that of AncHLD-RLuc, but with some structural features reminiscent of RLuc8. Most importantly, both the active site cavity and the α4 helix conformation of AncFT are more open than in AncHLD-RLuc, AncINS, and even the RLuc8 chain B (Fig. 4). Chain A of RLuc8 (PDB ID 2PSF) has the most open conformation seen in any of the structures (Figs. 4, 6a).

HDX-MS experiments showed that the deuteration pattern of AncFT was almost identical to that of AncHLD-RLuc, with differences only in the transplanted region (Fig. 6c, Supplementary Fig. 11). Specifically, the end of the L9 loop was less extensively deuterated than in AncHLD-RLuc while the α4-α5′ helices were more heavily deuterated. A similar trend was observed for RLuc8, providing experimental evidence that cap domain dynamics of the mutant with the transplanted fragment mimic those of RLuc8 (Fig. 6c).

MD simulations of AncFT revealed that the B-factor profile of its cap domain closely resembled that of RLuc8 (Fig. 6b). The B-factors were highest in the α5′ and α4 helices, while those of the L9 and L14 loops were close to the average for the rest of the enzyme and the rest of the cap domain had lower values (Supplementary Fig. 12). Tunnel analysis using Caver23 showed that tunnel dynamics of AncHLD-RLuc and AncINS clearly differ from those in AncFT and RLuc8 (Fig. 6d). The former two enzymes can adopt conformations with closed tunnels (average bottleneck radii: 0.9 and 1.0 Å, respectively), and conformations with open tunnels (average bottleneck radii: 1.7 Å in both cases). We consider a tunnel to be closed when its bottleneck radius drops below the radius of a water molecule (1.4 Å). In contrast, there is a clear overlap in bottleneck values calculated for the extremes of the conformational spectra of AncFT and RLuc8: the conformations that form the narrowest tunnels have identical average bottleneck radii of 1.5 Å, while those forming open tunnels have average bottleneck radii of 1.7 and 1.9 Å, respectively (Fig. 6d).

In addition, we performed an analysis of the cavity size of the proteins to test whether changes in the catalytic properties correlate with the changes in the volume of the active site. When analysing 100 random snapshots from MD simulations we noticed that cavity volumes of all proteins can adopt wide ranges, but they also converge to similar minimal volume of 200 Å3 in the closed state. The volumes calculated from the simulations starting from the open and closed conformations showed different absolute values, except for AncFT, which does not have multiple monomers in the asymmetric unit (Supplementary Fig. 15). Neither analysis of volumes of the static structures nor 100 representative snapshots obtained from MD simulations show any sign of correspondence between the catalytic characteristics and the active site volumes. Furthermore, on interpreting whether the system dynamics or the cavity volume is more important for the catalytic activity, the analyses of the ANM presented above agree with the importance of the dynamics.

## Discussion

The present work validates a strategy for engineering enzyme backbone dynamics based on InDel mutagenesis of a stable (high melting temperature) and catalytically promiscuous (evolvable) template. We used TRIAD12 mutagenesis to generate single amino acid insertion and deletion libraries of a stabilized bi-functional ancestral enzyme14 of haloalkane dehalogenase and Renilla-type luciferase enzymes (Fig. 1). Alternatively, other published methods for introducing insertions and deletions could be used24,25,26,27. The most potent insertion variant, AncINS, has both an insertion and a substitution in the α4 helix. These mutations markedly improved substrate binding, linked to changes in dynamics of the cap domain, but not the putative catalytic pentad. Remarkably, this was achieved in a single round of mutagenesis. In contrast, routinely used substitution mutagenesis protocols often require multiple time-consuming rounds of directed evolution and high-throughput screenings28,29,30. Systematic mutagenesis strategies, that allow a single-round selection to identify regions of interest for more focused study, are available31. In the future, maximizing information obtained from the selection through deep sequencing of gene libraries32 may also provide information to predict beneficial mutations based on a single round.

The crystal structure of the insertion variant AncINS (PDB ID 6S6E) revealed that its asymmetric unit contains two monomers with different spatial arrangements of the α4 helix (implying they represent two conformers), and MD simulations confirmed that the mutations caused substantial structural rearrangements, most probably because the substituted residue P163 acts as a helix breaker. We conclude that the LUC activity benefits from conformational dynamics of the protein’s rigid scaffold to allow binding of a bulky substrate. The outward movement of the α4 helix has an immediate effect on the opening of the access tunnel and increasing of the buried cavity’s volume, thus facilitating binding of the bulky substrate CTZ. However, once CTZ is bound inside of the active site cavity, the protein has to provide tight binding of the molecule inside it and provide a desolvated environment for the subsequent monooxygenation reaction and bioluminescence to occur33,34. If the excited product of the reaction, CEI, would not be tightly bound, the energy would probably not be released in the form of light during decay to the ground state, but in another form, e.g., heat. Accordingly, we propose that the evolutionary processes that generated the modern Renilla luciferase altered its dynamics in a way that facilitated substrate binding to the open state. While the movement of the α4 helix was shown to be important for CTZ binding, the dynamic profile of AncINS, as determined by HDX-MX experiments and MD simulations, still differs from that of RLuc8. To validate the proposed conclusions, we transplanted the α4 helix and L9 loop from the highly efficient modern RLuc8 into the poorly active ancestral AncHLD-RLuc. The resulting enzyme, AncFT, was highly efficient, with significantly decreased product inhibition compared to RLuc8, providing glow-type luminescence. Crystallographic analysis revealed that its structure (PDB ID 6S97) was more open than those of both the ancestral protein and B chain of RLuc8, facilitating the entrance of the bulky substrate. Based on MD simulations and HDX-MS experiments, we conclude that the transplantation reduced motions of the L9 loop, but increased motion of the α4-α5′ helices, making the mutant protein’s dynamics more similar to RLuc8.

Our results provide experimental evidence for the role of protein dynamics in enzymatic catalysis. An ancestral protein (AncHLD-RLuc) with sub-optimal dynamics has its catalytic activity limited by substrate binding, while mutants with tailored dynamism resulting from insertion/substitution (AncINS) or fragment transplantation (AncFT) have catalytic activity limited by the subsequent step of the chemical reaction. Transplantation of the most dynamic region of the extant RLuc onto the ancestral AncHLD-RLuc improved its catalytic efficiency 7000-fold to match the catalytic efficiency of RLuc, while extending the half-life of its light output 100-fold. Long-lasting glow-type light emission of AncFT was confirmed in mammalian cells, which paves the way towards its use as an efficient molecular reporter or biosensor22. Given the growing sensitivity of optical devices, signal stability is becoming a key requirement for modern molecular probes. In comparison with the short-term flash type of signal, a stable glow signal enables continuous detection in long-term biological experiments or maintenance of a stable response required during high-throughput screening campaigns22.

Thus, both the introduction of InDels and transplantation of dynamic elements into stable ancestors appear to be viable protein engineering strategies for improving enzymes or introducing novel functions. The potential effectiveness of grafting dynamic loops to introduce novel enzymatic functions has been discussed35,36,37 and experimentally demonstrated by the successful introduction of β-lactamase activity into the αβ/βα metallohydrolase scaffold38, thereby generating an enzyme that lacked its original activity but catalysed hydrolysis of cefotaxime. The potential of loop remodelling for engineering enzyme functions has also been illustrated by the deletion of a specific loop and introduction of a point mutation, leading to emergence of homoserine lactonase activity in a phosphotriesterase39. The potential of loop modification in enzyme engineering has also been discussed in several recent reviews35,36,37.

There are several preconditions necessary for the application of our engineering strategy. Ancestral sequence reconstruction requires about 150 sequences40, but a phylogenetic tree can also be reconstructed with a lower number of sequences. Mutational events leading to insertions and deletions make the reconstruction less reliable41. The InDel mutagenesis strategy is applicable to any protein of interest, but a screening assay is necessary for probing a sufficient number of mutants to identify the regions of interest and to allow robust statistical analysis. For the loop transplantation strategy, structural information about the proteins is necessary. The strategy is applicable for engineering catalytic efficiency, substrate specificity, and selectivity, and may enable the design and evolution of multifunctional enzymes. It is specifically suitable for enzymes with active sites flanked by loops and enzymes with reaction rates limited by substrate binding or product release.

In summary, we have developed a strategy for engineering backbone dynamics based on InDel mutagenesis of a stable template, multivariate statistics of the data from microscale and microfluidic experiments, and transplantation of a highly dynamic element. We validated its utility by an engineering effort in which catalytic efficiency was increased, and the bioluminescence was stabilized to achieve a long-lasting glow-type signal. The results achieved support the conceptual ideas, which guided our experiments and stepwise implementation of the strategy. The developed catalyst can serve as a molecular probe in bacterial and mammalian cells. The strategy may provide a useful addition to the repertoire of methods available for engineering catalytically efficient enzymes40,41 by exploring less travelled parts of protein sequence space. The loop transplantation strategy is being implemented in the web application LoopGrafter, which will make it accessible to a broad community (https://loschmidt.chemi.muni.cz/loopgrafter/).

## Methods

### Reagents and procedures used in TRIAD

FastDigest restriction endonucleases, MuA transposase and T4 DNA ligase were purchased from Thermo Fisher Scientific. DNA Polymerase I, Large (Klenow) Fragment, was purchased from New England Biolabs. All DNA modifying enzymes were used according to the manufacturer’s conditions. All DNA purification procedures were performed according to the manufacturers’ instructions using kits, including GeneJET Plasmid Miniprep kit (Thermo Fischer Scientific) for plasmid extractions from E. coli cells, Zymoclean Gel DNA Recovery kit (Zymo Research) for agarose gel extraction of DNA fragments upon electrophoresis and DNA Clean & Concentrator kit (Zymo Research) for DNA purification and concentration. Bacterial transformations were performed by electroporation using E. cloni 10G ELITE electrocompetent cells (Lucigen).

### Generation of insertion and deletion libraries of AncHLD-RLuc using TRIAD

InDel libraries were prepared following the TRIAD12,42 method. The sequence encoding AncHLD-RLuc was subcloned from pET21b::ancHLD-RLuc to the TRIAD-dedicated vector pID-Tet using NdeI and BamHI and yielding pID-Tet::ancHLD-RLuc. This construct was used as template for the generation of transposition insertion libraries with engineered TRIAD transposons TransDel and TransIns. The transposons (~1 kbp) were extracted from pUC57 by BglII digestion and recovered by gel electrophoresis and purification. Insertion of TransDel or TransIns in pID-Tet::ancHLD-RLuc (pID-Tet plasmid: ~2.7 kbp; ancHLD-RLuc: ~950 bp) was performed by in vitro transposition using ~300 ng of plasmid, ~50 ng of transposon and 0.22 μg MuA transposase in a 20 μL reaction volume. After incubation for 2 h at 30 °C, the MuA transposase was heat-inactivated for 10 min at 75 °C. DNA products were purified and concentrated in 7 μL deionized water. Two microlitres of the purified DNA was used to transform E. cloni 10G ELITE electrocompetent cells by electroporation. The transformants (~30,000–50,000 CFU) were selected on LB agar containing ampicillin (amp; 100 μg/mL) and chloramphenicol (cam; 34 μg/mL). The resulting colonies were pooled, and their plasmid DNA extracted. The fragments corresponding to ancHLD-RLuc containing the inserted transposon (~2 kbp) were obtained by double restriction digestion (NdeI/BamHI) followed by gel extraction and ligated back into pID-Tet (50–100 ng). The ligation products were then transformed into E. cloni 10G cells. Upon selection on LB-agar-amp-cam, transformants (~1–2 × 106 CFU) were pooled and their plasmid DNA extracted, yielding TransDel or TransIns insertion libraries depending on the TRIAD transposon used at the start. For the generation of the triplet nucleotide deletion library of AncHLD-RLuc, TransDel insertion library plasmids were first digested with MlyI to remove TransDel. The fragments corresponding to linearized pID-Tet::ancHLD-RLuc (with a −3 bp deletion in AncHLD-RLuc) were isolated by gel electrophoresis and purified. Self-circularization was then performed using T4 DNA ligase and 10–50 ng linearized plasmid in 50 μL reaction volume (final DNA concentration: ≥1 ng/μL). Upon purification and concentration, the ligation products were transformed into electrocompetent into E. cloni 10G cells subsequently selected on LB-agar-amp, yielding a library of gene of interest variants with random triplet nucleotide deletions. For the generation of the triplet nucleotide insertion library of AncHLD-RLuc, TransIns insertion library plasmids were first digested with NotI and MlyI to remove TransIns. The linearized pID-Tet::ancHLD-RLuc plasmids were recovered by gel electrophoresis and purification. TRIAD cassette Ins1 (containing randomized 3 bp at one extremity and a kanamycin-resistance gene) was extracted from pUC57 by NotI/MlyI digestion, isolated by gel electrophoresis, purified and inserted into the linearized pID-Tet::ancHLD-RLuc plasmid (50–100 ng) in a 1:3 molar ratio. After purification and concentration, these ligation products were transformed into E. cloni 10G cells and the transformants (~2 × 106 CFU) were selected on LB-agar supplemented with 100 μg/mL ampicillin and 50 μg/mL kanamycin. The resulting plasmid ins1 insertion library was then extracted from the transforming colonies and subsequently digested with AcuI. The linearized pID-Tet::ancHLD-RLuc plasmids (with a 3 bp insertion) were recovered by gel electrophoresis, purified and subsequently treated with the Klenow fragment of DNA Polymerase I to remove 3′ overhangs created by AcuI digestion. After that blunting step, the plasmids were self-circularized. The resulting ligation products were transformed into electrocompetent E. cloni 10G cells subsequently plated on LB-agar-amp, yielding libraries of AncHLD-RLuc variants with random triplet nucleotide insertions (one per variant). The same procedure was applied to generate a second-round triplet nucleotide insertion library of AncINS. All the libraries were purified and stored in the form of plasmid solutions prior to transformation into E. coli for protein variant expression and screening. Plasmid DNA from 10 randomly selected colonies from all three libraries were sequenced for quality control (Eurofins Genomics, Germany).

### Point mutagenesis of RLuc8

Site-directed PCR-based mutagenesis was applied to create rLuc8-W121F/E144Q in two steps using QuikChange site-directed mutagenesis kit via manufacturer’s protocol (Agilent, USA). First, rLuc8-W121F was created using oligonucleotides RLuc8-W121F-FWD1 and RLuc8-W121F-RVS1 with rLuc8 gene serving as a template. Then rLuc8-W121F gene was used as a template to create rLuc8-W121F/E144Q using RLuc8-E144Q-FWD2 and RLuc8-E144Q-RVS2. Error-free clones were confirmed by DNA sequencing (Eurofins Genomics, Germany). Mutagenic primers are available in Supplementary Table 8.

### Transplantation of secondary structure elements

The AncHLD-RLuc and RLuc8 sequences were aligned using Jalview43 and T-coffee44 with default settings (Supplementary Fig. 9). The construct pET21b::ancHLD-RLuc (NdeI/BamHI) was chosen as a template for mutagenesis using Phusion® High-Fidelity DNA Polymerase according to the manufacturer’s protocol (New England BioLabs, USA). Mutagenic primers (Supplementary Table 8) (Sigma-Aldrich, USA) were designed manually for two separate PCR runs (Supplementary Fig. 16). In the next step, a standard fusion PCR protocol was used to fuse DNA fragments from the first two PCR runs. The resulting fused DNA fragment was cloned into pET21b or pID-Tet vectors. The error-free status of the clones was confirmed by sequencing (Eurofins Genomics, Germany).

### Cultivation of InDel libraries in 96-well plates

Chemocompetent Escherichia coli BL21 cells (♯C2530H, New England BioLabs, USA) were transformed with the ancHLD-RLuc insertion/deletion library (in the pID-Tet vector) and grown on LB agar plates containing 100 μg/ml ampicillin overnight at 37 °C. The cells transformed with the vector pID-Tet (lacking an insert) and pID-Tet::ancHLD-RLuc were used as negative and positive controls, respectively, when screening the libraries. Single colonies of transformed cells carrying InDel variants of ancHLD-RLuc were transferred into sterile 96-well plates containing 150 μl LB medium with 100 μg/ml ampicillin in each well. The plates were covered with AeraSealTM film (Sigma-Aldrich, USA) and incubated for 15 h at 37 °C under shaking at 200 rpm. After cultivation, 100 μl of the culture was transferred into new microtiter plates (MTPs) and 100 μl of fresh LB medium with 100 μg/ml ampicillin and anhydrotetracycline (to a final concentration 200 ng/ml) was added to each well. To make replica plates to be stored at −70 °C, 50 μl of 30% glycerol was added to the remaining 50 μl of the culture. The MTP was incubated at 30 °C and 200 rpm for 4 h. Cell cultures were harvested by centrifugation at 1600 × g for 20 min. The supernatant was discarded and the MTP was frozen at −70 °C. Before screening of the libraries, MTPs were defrosted and kept at laboratory temperature for 10 min. Then, 70 μl of lysis buffer (20 mM potassium phosphate, 20 mM Na2SO4 and 1 mM EDTA, pH 8.0) containing lysozyme (1 mg/ml) was added to each well. Cell debris was removed from the lysate by centrifugation at 1600 × g for 20 min after incubation at 23 °C and 100 rpm for 1 h. Robotic cultivation and cell lysis were performed analogously to the manual screening described above using a colony picking robot (Colony Picker, Molecular Devices QPix 420), a liquid handling robot (Bravo, Agilent) with a 96-tip head and the LARA robotic system (Greifswald, Germany).

### Screening of luciferase activity in 96-well plates

Luciferase activity was measured using a previously described procedure14, with adaptation, in both the manual and robotic screenings. For manual screening, 30 μl of cell lysate was transferred into a new MTP and 220 μl of assay buffer (100 mM potassium phosphate, 1 mM Na2SO4, pH 7.5) with 2.2 μM coelenterazine (CTZ) was added. The luminescence signal was immediately measured for 22 s with the gain value set to 3250. Before the addition of the assay buffer with CTZ, the sample’s baseline luminescence signal was measured for 10 s. Luminescence was measured at 30 °C in a FLUOstar OPTIMA Microplate Reader (BMG Labtech, Germany). Luciferase activity was expressed in relative light units (RLU) s−1 mg−1 of an enzyme14. The RLUs were integrated over the first 72.5 s immediately after injection of the substrate into the enzyme solution.

For robotic screening, 50 μl of cell lysate was pipetted by a Liquid Handling Robot into a new white MTP (Nunc™ F96 MicroWell™ White Polystyrene Plate, Nunclon Delta Surface, Thermo Fisher Scientific) and the reaction was initiated by adding 50 μl of a CTZ stock solution (1 mg of CTZ dissolved in 2 ml of pure EtOH) diluted 1000x in MilliQ water. The luminescence signal was measured using a VarioscanLUX reader (Thermo Fisher Scientific, SkanIt Software 4.1 for Microplate Readers RE, ver. 4.1.0.43). Each measurement consisted of 50 readings, each of which took 200 ms (standard optics, automatic dynamic range). The CTZ solution was dispensed during reading 15 at a moderate dispensation speed. Luciferase activity was expressed in relative light units (RLU) s−1 mg−1 of an enzyme14.

### Screening of dehalogenase activity in 96-well plates

Haloalkane dehalogenase activity measurements were based on a pH assay45 according to Holloway and co-workers45. The principle of the assay is based on the detection of protons produced during the dehalogenation reaction. Substrate 1-bromobutane (100 μl) was incubated in the reaction buffer (200 ml; 1 mM HEPES, 20 mM Na2SO4, 1 mM EDTA and 25 μg/ml phenol red, pH 8.2) at 37 °C for 30 min. Fifteen microlitres of cell lysate was transferred into new MTP and 185 μl of assay buffer with 1-bromobutane were added. The MTP plate was properly closed by the Adhesive film for microplates (VWR, USA) and incubated at laboratory temperature for 15 h. The change in colour of pH indicator was measured at 540 nm by using an Eon spectrometer (BioTek, USA).

### Overproduction and purification of AncHLD-RLuc variants

AncHLD-RLuc variants were overexpressed from pID-Tet (ampR) in E. coli BL21 cells (#C2530H, New England BioLabs, USA) cultivated in LB medium supplemented with ampicillin (100 μg/ml) at 37 °C. Protein production was induced under the TET promotor at 20 °C once the OD600 reached ~0.5 by adding anhydrotetracycline (Cayman Chemical, USA) to a final concentration of 200 ng/ml. At small scale, proteins were purified using the MagneHisTM Protein Purification System (Promega, USA), dialysed using a Slide-A-LyzerTM MINI Dialysis Device (Thermo Scientific, USA), and their purity was verified by SDS-polyacrylamide gel electrophoresis. At large scale, proteins were purified by affinity chromatography targeting their C-terminal hexahistidine tags. The monomer fraction was separated on a HiLoadTM 16/600 SuperdexTM 200 pg column (GE Healthcare, UK) equilibrated with 100 mM potassium phosphate buffer (pH 7.5). All enzymes were concentrated using AmiconR Ultra-15 UltracelR−10K Centrifugal Filter Units (Merck Millipore Ltd., Ireland). The purity of all enzyme preparations was checked by SDS-polyacrylamide gel electrophoresis; in all cases, only one band corresponding to the monomer fraction was visible.

### Large-scale overproduction and purification of RLuc8-W121F/E144Q

The E. coli BL21 (DE3) (C2527H, New England BioLabs, USA) cells were transformed by the heat shock method with the plasmid pET21b::rLuc8-W121F/E144Q. Ampicillin-resistant colonies were inoculated in LB medium (10 ml) (1xLB medium, ampicillin 100 μg/ml), which was incubated (200 rpm) overnight at 37 °C. On the next day, a bacterial culture was used to inoculate 5-liter Erlenmeyer flasks containing 1 liter of 1xLB medium with ampicillin (100 μg/ml) where cells were grown (200 rpm, 37 °C) until the culture reached OD600 = 0.5. Induction of expression was done at 22 °C by adding 0.5 ml of 1 M IPTG, and the culture was then incubated overnight (typically 12–16 h) at 22 °C/115 rpm. Next day, the bacterial biomass was harvested by centrifugation (3000 × g/10 min/4 °C) and re-suspended in a purification buffer (500 mM NaCl, 20 mM potassium phosphate buffer pH 7.5, 10 mM imidazole) and lysed by sonication (50% amplitude, 32 min (5 s pulse/5 s pause) using a sonicator Sonic Dismembrator Model 705 (Fisher Scientific, USA). The sonicated lysate was clarified by centrifugation (21,000 ×g, 60 min, 4 °C). The supernatant was then collected, filtered and applied on Ni-NTA Superflow Cartridge (Qiagen, Germany) column equilibrated with the purification buffer. The target enzyme was eluted by a linear gradient of the purification buffer supplemented with 300 mM imidazole. The eluted protein fractions of RLuc8-W121F/E144Q were pooled and dialysed overnight against 100 mM NaCl, 10 mM Tris-HCl pH = 7.0. The dialysed protein was subsequently loaded onto 16/60 Superdex 200 gel filtration column (GE Healthcare, UK) pre-equilibrated with the corresponding buffer. The purified RLuc8-W121F/E144Q protein was concentrated with an Amicon Ultra centrifugal filter units (Millipore, USA), and protein concentrations were assayed by the DS-11 Spectrophotometer (DeNovix, USA).

### Mammalian cells experiments

NIH/3T3 mouse fibroblast cells (ATCC® CRL-1658™) were transfected according to manufacturer’s protocol using Lipofectamine 2000 (Thermo Fisher, USA) with pcDNA3.1(+) plasmids containing genes codon-optimized for expression in mammalian cells (Gene Art, Thermo Fisher, USA). Cells were lysed 24 h after transfection and luciferase activity was measured in lysate via Microplate Reader FLUOstar Omega (BMG Labtech, Germany) using a commercial Renilla Luciferase Assay System (Promega, USA) and also using an in-house prepared assay buffer 100 mM PBS pH = 7.5 with 4.5 μM CTZ (final concentration in the reaction mixture). Cells transfected with pcDNA3.1(+) plasmid were used as a negative control. The measurements were done in triplicates.

### Circular dichroism spectroscopy

Circular dichroism (CD) spectra were recorded at 20 °C using a Chirascan spectropolarimeter (Applied Photophysics, UK). Data were collected from scans of the focal proteins from 185 to 260 nm at 100 nm/min with a 1 s response time and 1 nm bandwidth in 0.1 cm quartz cuvettes. Each presented spectrum (Fig. S13) is an average of five individual scans, corrected for absorbance of the buffer. The CD data were expressed in terms of mean residue ellipticity ΘMRE described by Eq. (1).

$${\Theta }_{{\rm{MRE}}}=\frac{{\Theta }_{{\rm{obs}}}.{M}_{{\rm{W}}}.100}{n.c.l}$$
(1)

where Θobs is the observed ellipticity in degrees, Mw is the scanned protein’s molecular weight, n is the number of residues, l is the cell path length, c is the protein concentration (in mg/ml) and the factor of 100 originates from the conversion of the molecular weight to mg/dmol.

### Thermal stability

Thermal unfolding was studied by using a NanoDSF Prometheus instrument (NanoTemper, Germany) to monitor Trp fluorescence during heating at 1 °C/min from 20 to 90 °C. The melting temperatures (Tonset and Tm) were evaluated directly by ThermControl v2.0.2.

### Microfluidic determination of temperature profiles and thermodynamics

Temperature profiles of specific activities of individual enzyme variants towards the substrate 1,3-dibromopropane were measured in 2.5 °C increments from 20 to 40 °C using a capillary-based droplet microfluidic platform, enabling characterization of enzymatic activity within droplets for multiple enzymes in a single run46. The droplets were generated using the Mitos Dropix (Dolomite, UK). A custom sequence of droplets (150 nl aqueous phase, 300 nl oil spacing) was generated using negative pressure (microfluidic pump) and the droplets were guided through polyethylene tubing to the incubation chamber. Within the incubation chamber, the halogenated substrate was delivered to the droplets via a combination of microdialysis and partitioning between the oil (FC 40) and the aqueous phase. The reaction solution consisted of a weak buffer (1 mM HEPES, 20 mM Na2SO4, pH 8.2) and a complementary fluorescent indicator 8-hydroxypyrene-1,3,6-trisulfonic acid (50 μM HPTS). The fluorescence signal was obtained by using an optical setup with excitation laser (450 nm), a dichroic mirror with a cut-off at 490 nm filtering the excitation light and a Si-detector. By employing a pH-based fluorescence assay, small changes in the pH were observed and enabling monitoring of the enzymatic activity. The reaction progress was analysed as an end-point measurement recorded after passing of 7 or 10 droplets/sample through the incubation chamber. The reaction time was 4 min. The raw signal was processed by a droplet detection script46 written in MATLAB 2017b (Mathworks, USA) to obtain the specific activities. Natural logarithms of the specific activities were used in subsequent thermodynamic analysis to generate both Arrhenius and Eyring plots and to derive thermodynamic parameters (Supplementary Fig. 1).

### Anisotropic network modelling

Secondary structure elements were defined in AncHLD-RLuc and RLuc8 based on their respective crystal structures (PDB ID 6G75, chain A; and 2PSF, chain B) using DSSP47 followed by manual edition after visual inspection (Supplementary Tables 2, 3). Anisotropic network models were computed using the Prody 1.10.8 standalone package48 and the position-specific vector of squared fluctuations and matrix of motion cross-correlations were obtained. These matrices were further extended by calculating the averaged cross-correlation values corresponding to each secondary structure element. To facilitate distinction of differences in predicted motions, the matrix calculated for RLuc8 was subtracted from that calculated for AncHLD-RLuc. The final matrix M encompassed values ranging from −2 (motions more cross-correlated in RLuc8) to 2 (motions more cross-correlated in AncHLD-RLuc), where values close to 0 indicate similar motions in the two structures. To improve understanding of effects of mutations on the 25 selected variants, the positions and secondary structure elements of mutations resulting in extreme values were mapped to both reference crystals (AncHLD-RLuc and RLuc8) based on their alignment (Supplementary Fig. 9). Then, three groups of interesting regions were defined and used to annotate each variant with 18 values obtained from matrix M: nine for the cross-correlation of motions of the specific position and nine for the secondary structure element.

### Calculation of segment-centred cross-correlation values from ANM

Let i1 and i2 be the indices of the beginning and ending position of a given secondary structure element (SSEi) in the structure for which the ANM is computed; j1 and j2 be the indices of the beginning and ending position of a different secondary structure element (SSEj) on the same structure; var(x) and var(y) be the squared fluctuation value for the xth and yth residue on the structure, respectively; and r(x,y) the cross-correlation value for the pair formed by the xth and yth residues on the structure obtained from the same ANM calculation. The averaged cross-correlation value for the pair (SSEi, SSEj) can be calculated according to Eq. (2).

$${Ccor}\left({{{\mathrm{SSE}}}}_{i},{{{\mathrm{SSE}}}}_{j}\right)\frac{\mathop{\sum }\nolimits_{x={i}_{1}}^{x={i}_{2}}\mathop{\sum }\nolimits_{y={j}_{1}}^{y={j}_{2}}\left(r\left(x,y\right)\sqrt{{var}\left(x\right)\cdot {var}\left(y\right)}\right)}{\sqrt{\mathop{\sum }\nolimits_{x={i}_{1}}^{x={i}_{2}}\mathop{\sum }\nolimits_{y={i}_{1}}^{y={i}_{2}}\left(r\left(x,y\right)\sqrt{{var}\left(x\right)\cdot {var}\left(y\right)}\right).\mathop{\sum }\nolimits_{x={j}_{1}}^{x={j}_{2}}\mathop{\sum }\nolimits_{y={j}_{1}}^{y={j}_{2}}r\left(x,y\right)\sqrt{{var}\left(x\right)\cdot {var}\left(y\right)}}}$$
(2)

Note that if additionally to the E SSEx elements defined in each of the Supplementary Tables S2, S3, each of the P amino-acids of each analysed protein is considered as one of such elements, the dimension of the resulting cross-correlation matrix is (E + P)2. Such matrix would be formed by two diagonal matrices with dimensions E2 and P2 representing the SSE to SSE and amino-acid to amino-acid cross-correlations, respectively; and by two mirroring rectangular matrices (E × P and P × E) representing the amino-acid to SSE cross-correlations.

### Multivariate statistical analysis

Partial least squares (PLS) regression analysis49 was used to explore relationships between structural and molecular variables (X) of the proteins and their enzymatic activities (Y), all of which were autoscaled and centred. The complete data matrix used in the PLS analysis is provided in Source data file. Variable importance in the projection (VIP)50 and variable weight plots51 were used to assess the importance of every descriptor in the model, with further validation by cross-validation49 and permutation testing. During the cross-validation procedure, parts of the Y data are not considered during model building, and the ability of the resulting model to predict those data is assessed by comparing predicted and actual values, providing cross-validated Q2 values. In this study, 1/7 of the mutants were deleted in each cross-validation round. In the permutation testing, the model was recalculated 999 times with a randomly re-ordered dependent variable. SIMCA-P version 12 (Umetrics, Umea, Sweden) was used for statistical analyses.

### Success rate frequency analysis

Luminescence readings were processed to determine each tested variant’s luciferase activity relative to the mutagenesis template. Each microtiter plate used for this analysis included negative and positive controls (in sets of four wells) as well as the tested variants, and transformations were applied to obtain relative activities of the preparation in each well. First, the negative control wells were averaged time-point by time-point. Second, the rest of the wells were blanked using subtracting from each of their time-point values of the corresponding averaged time-point from the negative controls. Third, two segments for each data series were defined: pre- and post-injection of the coelenterazine substrate, which occurred ten seconds after the beginning of the readings on the FLUOstar OPTIMA instrument. The pre-injection series consisted of all data points before injection time. The post-injection series consisted of all data points collected from the moment the reading of the signal was stable (0.2 s after the injection time) until the end of the readings. From each data series, outliers defined as Q1 − 1.5*(Q3 − Q1) and Q3 + 1.5*(Q3 − Q1) (where Q1 and Q3 represent the first and third quartile values of each distribution) were omitted to remove noise from the data. Fourth, the average value of the pre-injection series was subtracted from the average value of the post-injection one, to obtain the raw intensity value of each well. Fifth, the reference intensity value of the plate was calculated by averaging the raw intensity values of the positive control wells. Sixth, the relative intensity value was calculated as the ratio of the raw intensity value of each well over the reference intensity value of the plate.

### Steady-state kinetics measurement and data analysis

Solid coelenterazine was dissolved in ice-cold ethanol and stored under nitrogen atmosphere in dark glass vials at –20 °C. Before measurement, concentration and quality of the ethanol stock solution were verified spectrophotometrically. Series of buffer solutions with different coelenterazine concentration was prepared by manual injection of an appropriate volume of the ethanol stock solution into 10 ml of 100 mM phosphate buffer pH 7.5 immediately before the measurement. The reaction mixture was composed of 10% (v/v) enzyme solution in 100 mM phosphate buffer pH 7.5 and 90% (v/v) of buffer solution of coelenterazine. All reactions were carried out at 37 °C in microtiter plates using the microplate reader FLUOstar OPTIMA (BMG Labtech, Germany) set to broad-spectrum luminescence reading. Microplate well with pre-pipetted 25 μl of enzyme solution was first monitored for background light for 10 s, after which, 225 μl of a buffer solution with coelenterazine was added via an automatic syringe. The luminescence of the reaction mixture was then measured for the desired time until the luminescence intensity decreased under 0.5% of its maximal measured value. Each reaction was performed in three repetitions. The gain of the reader was tailored to each enzyme separately, however, for each enzyme, all readings at different substrate concentrations were obtained using the same gain value.

An updated protocol applying new standards for collecting and fitting steady-state kinetic data52,53 was used. Unlike the classical initial velocity analysis, which requires a sophisticated luminometer calibration and quantum yield evaluation to obtain complete kinetic data54, luminescence data were recorded when letting the reaction proceed to completion beyond the initial linear phase. The recorded luminescence traces (rate vs. time) were transformed to reaction progress curves corresponding to cumulative luminescence in time (Supplementary Fig. 5). The transformed steady-state kinetic data (product vs. time) were fitted globally by numerical methods with the KinTek Explorer (KinTek Corporation, USA) to obtain direct estimates of turnover number kcat, Michaelis constant Km, specificity constant kcat/Km, and equilibrium dissociation constant for enzyme-product complex Kp with no need for luminometer quantum yield calibration. The software allows for the input of a given kinetic model via a simple text description, and the programme then derives the differential equations needed for numerical integration automatically. Numerical integration of rate equations searching a set of kinetic parameters that produce a minimum χ2 value was performed using the Bulirsch–Stoer algorithm with adaptive step size, and nonlinear regression to fit data was based on the Levenberg–Marquardt method55. To account for fluctuations in experimental data, enzyme or substrate concentrations were slightly adjusted (±5%) to derive best fits. Residuals were normalized by sigma value for each data point. The standard error (S.E.) was calculated from the covariance matrix during nonlinear regression. In addition to S.E. values, more rigorous analysis of the variation of the kinetic parameters was accomplished by confidence contour analysis by using FitSpace Explorer (KinTek Corporation, USA). In these analyses, the lower and upper limits for each parameter were derived (Supplementary Table 4) from the confidence contour obtained from setting χ2 threshold at 0.956. The scaling factor, relating luminescence signal to product concentration, was applied as one of the fitted parameters, well constrained by end-point levels of kinetic traces recorded at particular substrate concentrations. Depletion of the available substrate after the reaction was ensured by repeated injection of the fresh enzyme. The steady-state model (Eqs. (3)–(6)) was used to obtain the values of turnover number kcat, Michaelis constant Km and equilibrium dissociation constant for enzyme-product complex Kp. A conservative estimate for diffusion-limited substrate and product binding k+1 and k−3 (100 μM−1.s−1) was used as a fixed value to mimic rapid equilibrium assumption. An alternative form of the model (Eqs. (7) and (8)) was used to obtain estimates of kcat/Km directly by setting k−1 = 052,53.

$${\rm{E}}+{\rm{S}}\begin{array}{c}\mathop{\to }\limits^{100}\\ \mathop{\leftarrow }\limits_{{k}_{-1}}\end{array}{\rm{E}}.{\rm{S}}\mathop{\to }\limits^{{k}_{+2}}{\rm{E}}.{\rm{P}}\begin{array}{c}\mathop{\to }\limits^{{k}_{+3}}\\ \mathop{\leftarrow }\limits_{100}\end{array}{\rm{E}}+{\rm{P}}$$
(3)
$${k}_{{\rm{cat}}}={k}_{+2}$$
(4)
$${K}_{{\rm{m}}}={k}_{-1}/100$$
(5)
$${K}_{{\rm{p}}}={k}_{+3}/100$$
(6)
$${\rm{E}}+{\rm{S}}\mathop{\to }\limits^{{k}_{+1}}{\rm{E}}.{\rm{S}}\mathop{\to }\limits^{{k}_{+2}}{\rm{E}}.{\rm{P}}\begin{array}{c}\mathop{\to }\limits^{{k}_{+3}}\\ \mathop{\leftarrow }\limits_{100}\end{array}{\rm{E}}+{\rm{P}}$$
(7)
$${k}_{{\rm{cat}}}/{K}_{{\rm{m}}}={k}_{+1}$$
(8)

### Bioluminescence decay kinetics

Bioluminescence decay kinetics were monitored in the same conditions as the steady-state kinetics, with final concentrations of 2.2 μM CTZ and 50 nM enzyme. The obtained luminescence kinetic data were fitted to an exponential decay or logistic model using Origin 6.1 (OriginLab, USA) to obtain the half-life t1/2 of the luminescence signal.

### Transient kinetics of substrate binding experiments and data analysis

Transient kinetic traces of the protein variants were monitored after rapidly mixing components in 100 mM potassium phosphate buffer (pH 7.5) using the Stopped-Flow SFM 3000 mixing system and the MOS-500 spectrometer (BioLogic, France). In each case, the reaction was initiated by mixing 75 µl of an enzyme solution with 75 µl of CTZ solution and then monitored by the changes of fluorescence at 340 ± 13 nm after excitation at 295 nm. In experiments with each protein variant, concentrations of the CTZ substrate and an enzyme were separately varied in sets of seven consecutive replicates and the resulting kinetic traces were averaged. The experiments were based on monitoring changes in native tryptophan fluorescence that was quenched by the coelenterazine substrate upon its binding by an enzyme. The drop of the initial fluorescence level caused by the non-specific interactions and inner filter effects of coelenterazine was corrected based on experiments with bovine serum albumin, which exhibited the same initial decrease of the fluorescence signal. At room temperatures, the kinetics of the substrate binding was too fast, and most of the kinetic information was lost in the dead time of the instrument (0.3 ms). After decreasing the temperature down to 15 °C, kinetic traces collected for AncINS, AncFT and RLuc8 exhibited a triple-exponential decay of the fluorescence signal.

The two initial kinetic phases related to the binding of the substrate were fit with a double exponential equation (Eq. (9)) for each fluorescence trace (Supplementary Fig. 8). The dependence of the observed rates on the substrate concentration for both the fast and the slow kinetic phases were fit according to the induced-fit (IF) mechanism (Eqs. (10)–(12)) and the conformational selection (CS) mechanism (Eqs. (13)–(15)), assuming that k−1 >> k−2, k+257. The obtained values were used as initial estimates of the elementary rate constants describing the binding process.

$$F={F}_{{\rm{SS}}}+{A}_{1}\cdot {e}^{-{k}_{{\rm{fast}}}\cdot t}+{A}_{2}\cdot {e}^{-{k}_{{\rm{slow}}}\cdot t}$$
(9)
$${\rm{E}}+{\rm{S}}\begin{array}{c}\mathop{\to }\limits^{{k}_{+1}}\\ \mathop{\leftarrow }\limits_{{k}_{-1}}\end{array}{\rm{E}}.{\rm{S}}\begin{array}{c}\mathop{\to }\limits^{{k}_{+2}}\\ \mathop{\leftarrow }\limits_{{k}_{-2}}\end{array}{{\rm{E}}}^{\ast }.{\rm{S}}$$
(10)
$${k}_{{\rm{fast}},{\rm{IF}}}={k}_{+1}\cdot \left[S\right]+{k}_{-1}$$
(11)
$${k}_{{\rm{slow}},{\rm{IF}}}=\frac{{k}_{+2}\cdot \frac{{k}_{+1}}{{k}_{-1}}\cdot [S]}{\frac{{k}_{+1}}{{k}_{-1}}\cdot \left[S\right]+1}+{k}_{-2}$$
(12)
$${\rm{E}}+{\rm{S}}\begin{array}{c}\mathop{\to }\limits^{{k}_{+1}}\\ \mathop{\leftarrow }\limits_{{k}_{-1}}\end{array}{{\rm{E}}}^{\ast }+{\rm{S}}\begin{array}{c}\mathop{\to }\limits^{{k}_{+2}}\\ \mathop{\leftarrow }\limits_{{k}_{-2}}\end{array}{{\rm{E}}}^{\ast }.{\rm{S}}$$
(13)
$${k}_{{\rm{fast}},{\rm{CS}}}={k}_{+2}\cdot \left[S\right]+{k}_{-1}$$
(14)
$${k}_{{\rm{slow}},{\rm{CS}}}=\frac{{k}_{+1}\cdot \left(\frac{{k}_{+1}}{{k}_{-1}}\cdot {k}_{+2}\cdot \left[S\right]+{k}_{-2}\right)}{{k}_{+1}\cdot \frac{{k}_{+1}}{{k}_{-1}}\cdot {k}_{+2}\cdot \left[S\right]+{k}_{-2}}$$
(15)

In the case of AncHLD-RLuc, the kinetic traces exhibited slow equilibration with low signal amplitudes, resulting in a limited precision of analysis (Supplementary Fig. 8). The data could be fit with only a single exponential equation (Eq. 16), and the dependence of the observed rate constant and the amplitude on the substrate concentration were fit according to a one-step binding mechanism (Eqs. (17)–(19)). Such a result pointed out that the rigidified ancestral enzyme shows no profound conformational flexibility and that only a simple substrate binding could be detected.

$$F={F}_{{\rm{SS}}}+A\cdot {e}^{-{k}_{{\rm{obs}}}\cdot t}$$
(16)
$${\rm{E}}+{\rm{S}}\begin{array}{c}\mathop{\to }\limits^{{k}_{+1}}\\ \mathop{\leftarrow }\limits_{{k}_{-1}}\end{array}{\rm{E}}.{\rm{S}}$$
(17)
$${k}_{{\rm{obs}}}={k}_{+1}\cdot \left[S\right]+{k}_{-1}$$
(18)
$$A=\frac{{A}_{{\rm{lim}}}\cdot [S]}{{K}_{{\rm{d}}}+\left[S\right]}$$
(19)

The original raw kinetic data with residuals normalized by sigma values were subsequently analysed globally by numerical simulation using the KinTek Explorer software (KinTek Corporation, USA)52,55,56 and following the same protocol as described in the steady-state kinetics section. The time-course of the fluorescence quenching upon substrate binding was described with Eqs. (20)–(22) for the induced-fit, conformational-selection, and one-step binding mechanisms, respectively. The estimates of the rate constants obtained by analytical fitting were used as starting points, and the final values were obtained by fitting the mechanism directly to the collected kinetic data. The quality of the best fits for different mechanisms was compared in terms of χ2 value, “chi-by-eye” assessment, and confidence contour analysis assessing constraint and independency of the determined parameters. Based on this analysis, the correct mechanism of the substrate binding and the respective elementary rate constants were identified. Lower and upper limits for each parameter were derived for the χ2 threshold of 0.9556.

$${F}_{{\rm{IF}}}=f\cdot \left([E]+a\cdot [{ES}]+b\cdot [{E}^{\ast }S]\right)$$
(20)
$${F}_{{\rm{CS}}}=f\cdot \left([E]+a\cdot [{E}^{\ast }]+b\cdot [{E}^{\ast }S]\right)$$
(21)
$${F}_{{\rm{one}}-{\rm{step}}}=f\cdot \left([E]+a\cdot [{ES}]\right)$$
(22)

### Hydrogen/deuterium exchange (HDX) mass spectrometry

Enzyme samples were diluted with 100 mM phosphate buffer in H2O (pH 7.5) to prepare undeuterated controls and for peptide mapping, or with 100 mM phosphate buffer in D2O (pD 7.1) to prepare deuterated samples. The final concentrations of the enzyme samples used for analysis was 2 µM. HDX was carried out at room temperature and was quenched after 10, 60 or 300 s by adding 1 M HCl in 1 M glycine with pepsin. Each sample was directly injected into an LC-system (UltiMate 3000 RSLCnano, Thermo Fisher Scientific, USA) with an immobilized nepenthesin enzymatic column (Affipro, CZ; 15 µl bed volume, flow rate 20 µl/min, 2% acetonitrile/0.05% trifluoroacetic acid). Peptides were trapped and desalted on-line on a peptide microtrap (Michrom Bioresources, CA) for 3 min at a flow rate of 20 µl/min. Next, the peptides were eluted onto an analytical column (Jupiter C18, 1.0 × 50 mm, 5 µm, 300 Å, Phenomenex, CA) and separated by linear gradient elution starting with 10% buffer A in buffer B and rising to 40% buffer A over 2 min. This was followed by 31 min isocratic elution at 40% B. Buffers A and B consisted of 0.1% formic acid in water and 80% acetonitrile/0.08% formic acid, respectively. The immobilized nepenthesin column, trap cartridge, and analytical column were kept at 1 °C using a refrigerated system (Science Instruments and Software, CZ).

Mass spectrometric analysis was carried out using an Orbitrap Elite mass spectrometer (Thermo Fisher Scientific, USA) with ESI ionization connected on-line to a robotic system based on the HTS-XT platform (CTC Analytics, Switzerland). The instrument was operated in a data-dependent mode for peptide mapping (HPLC-MS/MS). Each MS scan was followed by MS/MS scans of the three most intensive ions from both CID and HCD fragmentation spectra. Tandem mass spectra were searched using SequestHT against the cRap protein database (ftp://ftp.thegpm.org/fasta/cRAP) containing the sequences of Rluc8, AncHLD-RLuc and the studied mutants with the following search settings: mass tolerance for precursor ions of 10 ppm, mass tolerance for fragment ions of 0.6 Da, no enzyme specificity, two maximum missed cleavage sites and no-fixed or variable modifications. The false discovery rate at the peptide identification level was set to 1%. Sequence coverage was analysed with Proteome Discoverer version 1.4 (Thermo Fisher Scientific, USA) and graphically visualized with the MS Tools application58. Analysis of deuterated samples was done in HPLC-LC-MS mode with ion detection in the orbital ion trap. The MS raw files together with the list of peptides (peptide pool) identified with high confidence characterized by requested parameters (amino acid sequence of each peptide, its retention time, XCorr, and ion charge) were processed using HDExaminer version 2.2 (Sierra Analytics, CA). The software used to analyse protein and peptide behaviour, creates the uptake plots that measured peptide deuteration over time with the calculated confidence level.

### HDX mass spectrometry data analysis

The acquired data were analysed as follows. The raw MS files together with the list of peptides (peptide pool) identified with high confidence characterized by requested parameters (retention time, XCorr, and charge) were processed using HDExaminer version 2.2 (Sierra Analytics, Modesto, CA). The software analysed the proteins’ and peptides’ behaviour and generated uptake data that were mapped to the proteins’ amino acid sequences via the following procedure. Each residue was assigned the uptake data from any peptide solved with high confidence. Medium confidence peptides were also accepted for positions without previously assigned data. Low confidence peptides were rejected. The final uptake value (expressed as % of deuteration) assigned to each amino acid corresponded to the average of all assigned values for its position.

### MD simulations

MD simulations were carried out on the crystal structures of RLuc8 (PDB ID 2PSF17), AncHLD-RLuc (PDB ID 6G7514), AncINS (PDB ID 6S6E) and AncFT (PDB ID 6S97). Hydrogen atoms were added to the structures using the H++ web server59, at pH 7.5. Water molecules from the crystal structures, which did not overlap with the protonated structures, were retained. The systems were solvated using the solvate module of high-throughput molecular dynamics (HTMD) in a cubical water box of TIP3P water molecules so that all atoms were at least 10 Å from the surface of the box60. Cl and Na+ ions were added to neutralize the protein’s charge and get a final concentration of 0.1 M. In all the simulations periodic boundary conditions were applied, the particle mesh Ewald method was used to treat interactions beyond a 9 Å cut-off, electrostatic interactions were suppressed >4 bond terms away from each other, and the smoothing and switching of van der Waals and electrostatic interactions were cut-off at 7.5 Å61. HTMD was used for adaptive sampling of the RMSD of the Cα atoms of residues 10–290. The production runs (20 ns) were started with the files resulting from the equilibration using settings from the final equilibration step. The trajectories were saved every 100 ps. The adaptive epochs were updated every 6 h to run with a minimum of five simulations and a maximum of 10 simulations for the total maximum of 110 epochs using TICA in 1 dimension62.

### Equilibration of systems for MD simulations

The systems were equilibrated using the Equilibration_v2 module of HTMD 1.23.459. The system was first minimized using the conjugate-gradient method for 500 steps, after which the system was heated and minimized as follows (i) 500 steps (2 ps) of NVT heating, with the Berendsen barostat, to 310 K, with constraints on all heavy atoms of the protein; (ii) 2.5 ns of NPT equilibration, with the Langevin thermostat, with 1 kcal mol−1 Å−2 constraints on all the heavy atoms of the protein; and (iii) 2.5 ns of NPT equilibration, with the Langevin thermostat, without constraints. During the equilibration simulations, holonomic constraints were applied to all hydrogen-heavy atom bond terms and the mass of hydrogen atoms was scaled by a factor of 4, enabling the use of a 4 fs timestep60,61. This approach transfers some mass from the heavy atom to the hydrogen to slow the fastest vibrations in the hydrogen atom, which is acceptable because the thermodynamic properties of biological systems are insensitive to the distribution of atomic masses63,64.

Adaptive sampling here refers to the MD method that performs simulations in small sequential batches of 10 simulations. Before the next batch of simulations is submitted, the first batch is analysed considering a certain parameter (RMSD in this study) and the less sampled conformational regions are used to start a new batch. This way the method ensures that the sampled regions are maximized. This sampling analysis is done using Markov state models making a list of metastable states distributed through the conformational space.

### Calculation of B-factors

B-factors were calculated from all snapshots obtained from the MD simulations for the backbone atoms of each enzyme. The Metric Fluctuation tool from the HTMD package was used to calculate the RMSF, from which B-factors were calculated using Eq. (23).

$$B=\left[\frac{8* {\pi }^{2}}{3}\right]* {{{\mathrm{RMSF}}}}^{2}$$
(23)

The B-factors obtained from the MD simulations were standardized for each enzyme to enable comparisons between enzymes. This was done using Eq. (24).

$$Z=\frac{X-\mu }{\delta }$$
(24)

Here, Z is the standardized value, X is the original B-factor value, μ is the average of the B-factors, and δ is the sample standard deviation. The simulations were made into a simulation list using HTMD, water was filtered out, and crashed simulations with lengths below 20 ns were omitted59. The total simulation time required for the calculations was in excess of 4 μs for each enzyme. The dynamics of the enzymes were evaluated based on the RMSD of the Cα atoms of residues 10–290 to avoid problems due to the misleadingly high RMSD values of the dynamically free residues at each end of the protein. The data were clustered into 200 clusters using the MiniBatchKmeans algorithm. The implied timescale plot (based on a Markov state model with various lag times) was constructed to select a lag time for Markov model construction, using the RMSD of the full protein as a metric. Because the timescales mostly stabilized after 15 ns lag time, this value was used in the models to construct the 2 Markov states (open and closed). The final data were calculated from 1000 bootstrapping runs using 50% of the data.

### Tunnel calculations

The tunnels were calculated with Caver 3.0223 using 1000 random frames belonging to each state using Caver Analyst 2.0.3165, with Probe radius: 0.9 Å, clustering threshold: 5, shell depth: 5 Å and shell radius: 5 Å. For the AncHLD-RLuc the starting point of the tunnel was defined by: ND2 of the Asn51 and OE1 of Glu142. For other enzymes, the starting point was defined by ND2 and OE1 in analogous residues.

### Cavities calculations

The cavity calculations for the four proteins were carried out in the duplicates using the software Caver Analyst 2.0.3165. We have used 100 random snapshots to obtain the cavity volumes from the open conformations and 100 random snapshots from the closed conformations. A probe of 1.4 Å was used for the cavities and a maximum probe size of 3.0 Å was used to delimit the interface between solvent and cavity. The same parameters for the probes were used when calculating the cavity volume of the crystal structures. When two chains were present, the cavities were calculated for both chains. Cavities were also calculated for the minimized crystal structures, both chains when present.

### Crystallizations

Diffraction-quality crystals of AncINS were obtained using the sitting-drop vapour diffusion technique in a 96-well crystallization plate (UVXPO 2 Lens Crystallization Plate, SWISSCI, Switzerland) at 20 °C after 5 days by mixing equal volumes of AncINS (4.5 or 10.5 mg/ml) with the reservoir solution from Morpheus crystallization screen condition C5, which consists of 0.09 M NPS additive mix (sodium nitrate, sodium phosphate dibasic, ammonium sulfate) and 0.1 M buffer system 2 (sodium HEPES, MOPS) at pH 7.5 with 30% v/v precipitant mix 1 (PEG 500 MME, PEG 20,000). The crystals were fished out and directly flash-frozen in liquid nitrogen for diffraction analysis at ESRF beamline ID23-1 in Grenoble (France).

AncFT crystals were obtained using the hanging-drop vapour diffusion technique in EasyXtal 15-well plates (Qiagen, Germany) with seeding at 20 °C after 2 days by mixing 0.5 µl of seed, 2 µl of AncFT (9.2 mg/ml), and 1 µl of reservoir solution (0.2 M sodium acetate, 0.1 M Tris pH 8.5, 19% PEG 3350). Crystals were cryo-protected with 20% glycerol and flash-cooled in liquid nitrogen. Crystallographic data were collected at cryogenic temperature at SLS beamline PXIII in Villigen (Switzerland), with wavelength of 1 Å.

Prior to the co-crystallization screening, the purified protein RLuc8-W121F/E144Q was concentrated to 6.0 or 8.5 mg/ml and mixed with 3–5 molar excess of coelenterazine (stock solution of coelenterazine was 11.8 mM in isopropanol). The mixture was incubated 1 h at 4 °C, and then precipitated material was removed by centrifugation (13,000 rpm/10 min/4 °C), and the supernatant containing enzyme-ligand complexes was directly crystallized. The crystallization was performed in Easy-Xtal 15-well crystallization plates in a hanging drop vapour diffusion technique, where enzyme-ligand drops (typically 1.5 μl) were mixed with the reservoir solution (2.2 M NaHPO4/K2HPO4, 0.1 M sodium acetate pH = 4.5) in the ratio 1:1 and equilibrated against 500 μl of the reservoir solution. Diffraction quality crystals of RLuc8-W121F/E144Q were obtained at 20 °C after 3–5 days. All crystals used for X-ray diffraction analysis were flash-frozen in liquid nitrogen in the corresponding reservoir solutions supplemented with 20–25% glycerol. All data obtained in this project were collected at 100 K on SLS beamline PX3 (Villigen, Switzerland).

### Structure determination, model building and refinement

The crystallographic data were processed and scaled using XDS66 and Aimless67. Structures of AncINS and AncFT were solved by molecular replacement using AncHLD-RLuc (PDB ID 6G7514) as a search model with the help of Phaser68 as implemented in the Phenix package69. The structure of RLuc8-W121F/E144Q was solved by molecular replacement with Phenix2 using the RLuc8 structure (PDB ID 2PSD) as a search model. Multiple cycles of automated refinement were performed in the phenix.refine programme69 and manual model building was performed in Coot70. The final models were validated using tools provided in Coot70 and Molprobity71. Ramachandran favoured 95.3%, 95.9% and 96.6%, Ramachandran allowed 4.4%, 3.8% and 3.4%, Ramachandran outliers 0.3%, 0.3% and 0% for AncINS, AncFT and RLuc8-W121F/E144Q, respectively. Structural data were graphically visualized with PyMol Molecular Graphics System, version 1.8 (Schrödinger, LLC). Atomic coordinates and structure factors for AncINS, AncFT and RLuc8-W121F/E144Q were deposited in the Protein Data Bank under the codes 6S6E, 6S97, and 6YN2, respectively.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

## Data availability

Atomic coordinates and experimental data have been deposited in the Protein Data Bank (www.wwpdb.org). They are publicly available for accession codes 6S6E and 6S97; 6YN2 will be released upon article publication. The validation reports are provided in Supplementary Notes 7, 8 and 9. The primary data from microfluidics are available in figshare with the identifier https://doi.org/10.6084/m9.figshare.14453700.v1Source data are provided with this paper.

## References

1. Baier, F., Copp, J. N. & Tokuriki, N. Evolution of enzyme superfamilies: comprehensive exploration of sequence–function relationships. Biochemistry 55, 6375–6388 (2016).

2. Tyzack, J. D., Furnham, N., Sillitoe, I., Orengo, C. M. & Thornton, J. M. Understanding enzyme function evolution from a computational perspective. Curr. Opin. Struct. Biol. 47, 131–139 (2017).

3. Arnold, F. H. Directed evolution: bringing new chemistry to life. Angew. Chem. Int. Ed. Engl. 57, 4143–4148 (2018).

4. Tóth-Petróczy, Á. & Tawfik, D. S. Protein insertions and deletions enabled by neutral roaming in sequence space. Mol. Biol. Evol. 30, 761–771 (2013).

5. Hochberg, G. K. A. & Thornton, J. W. Reconstructing ancient proteins to understand the causes of structure and function. Annu. Rev. Biophys. 46, 247–269 (2017).

6. Thornton, J. W. Resurrecting ancient genes: experimental analysis of extinct molecules. Nat. Rev. Genet. 5, 366–375 (2004).

7. Harms, M. J. & Thornton, J. W. Evolutionary biochemistry: revealing the historical and physical causes of protein properties. Nat. Rev. Genet. 14, 559–571 (2013).

8. Zou, T., Risso, V. A., Gavira, J. A., Sanchez-Ruiz, J. M. & Ozkan, S. B. Evolution of conformational dynamics determines the conversion of a promiscuous generalist into a specialist enzyme. Mol. Biol. Evol. 32, 132–143 (2014).

9. Risso, V. A., Gavira, J. A., Mejia-Carmona, D. F., Gaucher, E. A. & Sanchez-Ruiz, J. M. Hyperstability and substrate promiscuity in laboratory resurrections of precambrian β-lactamases. J. Am. Chem. Soc. 135, 2899–2902 (2013).

10. Chothia, C., Gough, J., Vogel, C. & Teichmann, S. A. Evolution of the protein repertoire. Science 300, 1701 (2003).

11. Pascarella, S. & Argos, P. Analysis of insertions/deletions in protein structures. J. Mol. Biol. 224, 461–471 (1992).

12. Emond, S. et al. Accessing unexplored regions of sequence space in directed enzyme evolution via insertion/deletion mutagenesis. Nat. Commun. 11, 1–14 (2020).

13. Skamaki, K. et al. In vitro evolution of antibody affinity via insertional scanning mutagenesis of an entire antibody variable region. Proc. Natl Acad. Sci. USA 117, 27307–27318 (2020).

14. Chaloupkova, R. et al. Light-emitting dehalogenases: reconstruction of multifunctional biocatalysts. ACS Catal. 9, 4810–4823 (2019).

15. Koudelakova, T. et al. Haloalkane dehalogenases: biotechnological applications. Biotechnol. J. 8, 32–45 (2013).

16. Lorenz, W. W., McCann, R. O., Longiaru, M. & Cormier, M. J. Isolation and expression of a cDNA encoding Renilla reniformis luciferase. Proc. Natl Acad. Sci. USA 88, 4438–4442 (1991).

17. Loening, A. M., Fenn, T. D. & Gambhir, S. S. Crystal structures of the luciferase and green fluorescent protein from Renilla reniformis. J. Mol. Biol. 374, 1017–1028 (2007).

18. Woo, J., Howell, M. H. & Arnim, A. Gvon Structure–function studies on the active site of the coelenterazine-dependent luciferase from Renilla. Protein Sci. 17, 725–735 (2008).

19. Bloom, J. D., Labthavikul, S. T., Otey, C. R. & Arnold, F. H. Protein stability promotes evolvability. Proc. Natl Acad. Sci. USA 103, 5869 (2006).

20. Loening, A. M., Fenn, T. D., Wu, A. M. & Gambhir, S. S. Consensus guided mutagenesis of Renilla luciferase yields enhanced stability and light output. Protein Eng. Des. Sel. 19, 391–400 (2006).

21. Bradshaw, R. T. et al. Neurotransmitter transporter conformational dynamics using HDX-MS and molecular dynamics simulation. Biophys. J. 114, 207a (2018).

22. Yeh, H.-W. & Ai, H.-W. Development and applications of bioluminescent and chemiluminescent reporters and biosensors. Annu. Rev. Anal. Chem. Palo Alto Calif. 12, 129–150 (2019).

23. Chovancova, E. et al. CAVER 3.0: a tool for the analysis of transport pathways in dynamic protein structures. PLOS Comput. Biol. 8, e1002708 (2012).

24. Kipnis, Y., Dellus-Gur, E. & Tawfik, D. S. TRINS: a method for gene modification by randomized tandem repeat insertions. Protein Eng. Des. Sel. 25, 437–444 (2012).

25. Jones, D. D. Triplet nucleotide removal at random positions in a target gene: the tolerance of TEM-1 β-lactamase to an amino acid deletion. Nucleic Acids Res. 33, e80–e80 (2005).

26. Fujii, R., Kitaoka, M. & Hayashi, K. in Directed Evolution Library Creation: Methods and Protocols (eds. Gillam, E. M. J., Copp, J. N. & Ackerley, D.) 151–158 (Springer, 2014).

27. Jones, D. D., Arpino, J. A. J., Baldwin, A. J. & Edmundson, M. C. in Directed Evolution Library Creation: Methods and Protocols (eds Gillam, E. M. J., Copp, J. N. & Ackerley, D.) 159–172 (Springer, 2014).

28. Obexer, R. et al. Emergence of a catalytic tetrad during evolution of a highly active artificial aldolase. Nat. Chem. 9, 50–56 (2017).

29. Khare, S. D. et al. Computational redesign of a mononuclear zinc metalloenzyme for organophosphate hydrolysis. Nat. Chem. Biol. 8, 294 (2012).

30. Rockah-Shmuel, L. & Tawfik, D. S. Evolutionary transitions to new DNA methyltransferases through target site expansion and shrinkage. Nucleic Acids Res. 40, 11627–11637 (2012).

31. Qu, G., Li, A., Acevedo‐Rocha, C. G., Sun, Z. & Reetz, M. T. The crucial role of methodology development in directed evolution of selective enzymes. Angew. Chem. Int. Ed. 59, 13204–13231 (2020).

32. Mazurenko, S., Prokop, Z. & Damborsky, J. Machine learning in enzyme engineering. ACS Catal. 10, 1210–1223 (2020).

33. Lourenço, J. M., Esteves da Silva, J. C. G. & Pinto da Silva, L. Combined experimental and theoretical study of Coelenterazine chemiluminescence in aqueous solution. J. Lumin. 194, 139–145 (2018).

34. Magalhães, C. M., Esteves da Silva, J. C. G. & Pinto da Silva, L. Comparative study of the chemiluminescence of coelenterazine, coelenterazine-e and Cypridina luciferin with an experimental and theoretical approach. J. Photochem. Photobiol. B 190, 21–31 (2019).

35. Tokuriki, N. & Tawfik, D. S. Protein dynamism and evolvability. Science 324, 203–207 (2009).

36. Kreß, N., Halder, J. M., Rapp, L. R. & Hauer, B. Unlocked potential of dynamic elements in protein structures: channels and loops. Energy Mech. Biol. 47, 109–116 (2018).

37. Nestl, B. M. & Hauer, B. Engineering of flexible loops in enzymes. ACS Catal. 4, 3201–3211 (2014).

38. Park, H.-S. et al. Design and evolution of new catalytic activity with an existing protein scaffold. Science 311, 535–538 (2006).

39. Afriat-Jurnou, L., Jackson, C. J. & Tawfik, D. S. Reconstructing a missing link in the evolution of a recently diverged phosphotriesterase by active-site loop remodeling. Biochemistry 51, 6047–6055 (2012).

40. Bornscheuer, U. T. et al. Engineering the third wave of biocatalysis. Nature 485, 185–194 (2012).

41. Romero, P. A. & Arnold, F. H. Exploring protein fitness landscapes by directed evolution. Nat. Rev. Mol. Cell Biol. 10, 866–876 (2009).

42. Emond, S. & Hollfelder, F. TRIAD: a transposition-based approach for gene mutagenesis by random short in-frame insertions and deletions for directed protein evolution. Protoc. Exch. https://doi.org/10.21203/rs.3.pex-1448/v1 (2021).

43. Waterhouse, A. M., Procter, J. B., Martin, D. M. A., Clamp, M. & Barton, G. J. Jalview Version 2—a multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189–1191 (2009).

44. Notredame, C., Higgins, D. G. & Heringa, J. T-coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205–217 (2000).

45. Holloway, P., Trevors, J. T. & Lee, H. A colorimetric assay for detecting haloalkane dehalogenase activity. J. Microbiol. Methods 32, 31–36 (1998).

46. Buryska, T. et al. Controlled oil/water partitioning of hydrophobic substrates extending the bioanalytical applications of droplet-based microfluidics. Anal. Chem. 91, 10008–10015 (2019).

47. Joosten, R. P. et al. A series of PDB related databases for everyday needs. Nucleic Acids Res. 39, D411–D419 (2011).

48. Bakan, A. et al. Evol and ProDy for bridging protein sequence evolution and structural dynamics. Bioinforma. Oxf. Engl. 30, 2681–2683 (2014).

49. Wold, S. Validation of QSAR’s. Quant. Struct. -Act. Relatsh. 10, 191–193 (1991).

50. Wold, S., Johansson, E. & Cocchi, M. in 3D QSAR in Drug Design. Theory, Methods, and Applications (ed. Kubinyi, H.) 523–550 (ESCOM Science Publisher, 1993).

51. Wold, S. & Dunn, W. J. Multivariate quantitative structure-activity relationships (QSAR): conditions for their applicability. J. Chem. Inf. Comput. Sci. 23, 6–13 (1983).

52. Johnson, K. A. in Methods in Enzymology Vol. 467 (eds. Johnson, M. L. & Brand, L.) Ch. 23, 601–626 (Academic Press, 2009).

53. Johnson, K. A. New standards for collecting and fitting steady state kinetic data. Beilstein J. Org. Chem. 15, 16–29 (2019).

54. O’Kane, D. J. & Lee, J. Absolute calibration of luminometers with low-level light standards. Methods Enzymol. 305, 87–96 (2000).

55. Johnson, K. A., Simpson, Z. B. & Blom, T. Global Kinetic Explorer: a new computer program for dynamic simulation and fitting of kinetic data. Anal. Biochem. 387, 20–29 (2009).

56. Johnson, K. A., Simpson, Z. B. & Blom, T. FitSpace Explorer: an algorithm to evaluate multidimensional parameter space in fitting kinetic data. Anal. Biochem. 387, 30–41 (2009).

57. Bagshaw, C. R. Biomolecular Kinetics: A Step-by-Step Guide (CRC Press, 2017).

58. Kavan, D. & Man, P. MSTools—Web based application for visualization and presentation of HXMS data. Hydrog. Exch. Mass Spectrom. 302, 53–58 (2011).

59. Doerr, S., Harvey, M. J., Noé, F. & De Fabritiis, G. HTMD: high-throughput molecular dynamics for molecular discovery. J. Chem. Theory Comput. 12, 1845–1852 (2016).

60. Harvey, M. J., Giupponi, G. & Fabritiis, G. D. ACEMD: accelerating biomolecular dynamics in the microsecond time scale. J. Chem. Theory Comput. 5, 1632–1639 (2009).

61. Harvey, M. J. & De Fabritiis, G. An implementation of the smooth particle mesh ewald method on GPU hardware. J. Chem. Theory Comput. 5, 2371–2377 (2009).

62. Naritomi, Y. & Fuchigami, S. Slow dynamics in protein fluctuations revealed by time-structure based independent component analysis: the case of domain motions. J. Chem. Phys. 134, 065101 (2011).

63. Feenstra, K. A., Hess, B. & Berendsen, H. J. C. Improving efficiency of large time-scale molecular dynamics simulations of hydrogen-rich systems. J. Comput. Chem. 20, 786–798 (1999).

64. Hopkins, C. W., Le Grand, S., Walker, R. C. & Roitberg, A. E. Long-time-step molecular dynamics through hydrogen mass repartitioning. J. Chem. Theory Comput. 11, 1864–1874 (2015).

65. Jurcik, A. et al. CAVER Analyst 2.0: analysis and visualization of channels and tunnels in protein structures and molecular dynamics trajectories. Bioinformatics 34, 3586–3588 (2018).

66. Kabsch, W. XDS. Acta Crystallogr. D. Biol. Crystallogr 66, 125–132 (2010).

67. Evans, P. R. & Murshudov, G. N. How good are my data and what is the resolution?. Acta Crystallogr. D. Biol. Crystallogr 69, 1204–1214 (2013).

68. McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).

69. Adams, P. D. et al. The Phenix software for automated determination of macromolecular structures. Methods Struct. Proteomics 55, 94–106 (2011).

70. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D. Biol. Crystallogr 66, 486–501 (2010).

71. Chen, V. B. et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. D. Biol. Crystallogr. 66, 12–21 (2010).

## Acknowledgements

The authors would like to express thanks to Dr. Chris Badenhorst (University Greifswald) for his very helpful comments on the manuscript. This work was supported by grants from the Czech Ministry of Education (LQ1605, CZ.02.1.01/0.0/0.0/16_013/0001761, LM2015051, LM2015047, LM2015055), the Grant Agency of the Czech Republic (17-24321S), the Ministry of Health of the Czech Republic MH CZ-DRO MMCI, 00209805 and the European Union (720776, 722610, 814418). J.P.-I. is supported by Postdoc@MUNI (CZ.02.2.69/0.0/16_027/0008360), M.T. and M.V. by Brno Ph.D. Talent. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 792772. M.M. acknowledges financial support from GAMU of the Masaryk University (MUNI/H/1561/2018). Computational resources were provided by CERIT-SC and Meta Centrum (LM2015042, LM2015085). F.H. is an H2020 ERC Advanced Investigator [695669] and work in Cambridge was supported by grant BB/L002469/1 from the BBSRC (Biological Sciences Research Council). We thank members of the European Synchrotron Radiation Facility (ESRF) and the Swiss Light Source (SLS) synchrotrons for the use of their beamline facilities and for help during data collection. Core Facility Biomolecular Interactions and Crystallization of CEITEC MU is gratefully acknowledged for obtaining scientific data presented in this paper. A.S. is on leave from the Institute of Chemistry, Slovak Academy of Sciences, 845 38 Bratislava, Slovak Republic. We are grateful to Klaudia Chmelova (Masaryk University) for her kind assistance during X-ray data collection and Dr. Stanislav Mazurenko (Masaryk University, Brno) for help with mathematical analysis.

## Author information

Authors

### Contributions

V.D.L. and S.E. constructed libraries and performed manual screening; A.S., M.T., R.C., V.D.L., M.V. and M.M. conducted biochemical and biophysical characterization of protein variants; A.S. and M.D. performed robotic screening; A.S. and M.M. crystallized proteins, collected X-ray data and determined structures; A.S. and M.M. designed and performed experiments with mammalian cell lysates; G.P.P., J.P.-I. and D.B. performed and analysed MD simulations; D.P., M.T. and Z.P. performed steady-state and transient kinetic experiments and analysed data; L.H. conducted HDX-MS experiments and analysed data; J.D. and J.P.-I. conducted bioinformatics analysis; J.D., F.H. and U.B.T. designed the study and contributed to data interpretation. All authors contributed to the writing of the manuscript and preparation of tables and figures.

### Corresponding authors

Correspondence to Florian Hollfelder, Uwe T. Bornscheuer or Jiri Damborsky.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Peer review information Nature Communications thanks Kirill Alexandrov, Vitor Pinheiro and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

Reprints and Permissions

Schenkmayerova, A., Pinto, G.P., Toul, M. et al. Engineering the protein dynamics of an ancestral luciferase. Nat Commun 12, 3616 (2021). https://doi.org/10.1038/s41467-021-23450-z

• Accepted:

• Published:

• DOI: https://doi.org/10.1038/s41467-021-23450-z