Strategic single point mutation yields a solvent- and salt-stable transaminase from Virgibacillus sp. in soluble form

A new transaminase (VbTA) was identified from the genome of the halotolerant marine bacterium Virgibacillus sp. 21D. Following heterologous expression in Escherichia coli, it was located entirely in the insoluble fraction. After a single mutation, identified via sequence homology analyses, the VbTA T16F mutant was successfully expressed in soluble form and characterised. VbTA T16F showed high stability towards polar organic solvents and salt exposure, accepting mainly hydrophobic aromatic amine and carbonyl substrates. The 2.0 Å resolution crystal structure of VbTA T16F is reported here and, together with computational calculations, reveals that this mutation is crucial for correct dimerisation and thus correct folding, leading to soluble protein expression.

studied. To address this problem, we successfully adopted a protein engineering approach, based on the mutation of a single residue (T16F) that resulted in sufficiently soluble expression of recombinant VbTA T16F in Escherichia coli. Here, we report the functional and structural characterisation of the mutant enzyme. Analysis of the VbTA T16F crystal structure, together with molecular modelling, provides an insight into the role of F16 in stabilising the tertiary and quaternary structure of VbTA T16F.

Results and Discussion
Production of soluble VbTA after protein engineering. Protein sequences of TAs from Vibrio fluvialis (VfTA), Chromobacterium violaceum (CvTA), and Halomonas elongata (HeTA) were used to search for homologous genes in the genome of Virgibacillus sp. 21D; this analysis detected a sequence (VbTA) with 38% identity and 55% similarity to the HeTA sequence, and 36% similarity to the VfTA and CvTA genes. The selected 1350 bp gene encodes a protein of 448 amino acid residues. Classification of PLP-dependent enzymes is primarily based on fold type 19 ; fold type I, also referred to as the 'aspartate aminotransferase superfamily', combines the highest quantity and diversity of members 20 . Transaminases are additionally divided into six classes depending on common structural features and sequence similarity; VbTA, according to its sequence, belongs to fold type I and the class III subgroup. Based on sequence alignments, it was possible to identify three highly conserved residues (see ConSurf analysis, Supplementary Figure 1) that bind PLP: D24 and Y148, involved in salt bridge and hydrogen bond formation with PLP, respectively, and K273, which forms a Schiff base with PLP (Fig. 1).
The codon-optimised VbTA gene sequence was cloned into different conventional E. coli expression systems (see Materials for details) and expressed under different fermentation settings (temperature, expression time, and inducer concentration) using a Multisimplex optimization design 21 ; under all conditions tested, the expressed protein accumulated as insoluble inclusion bodies. The halophilic archaeon host Haloferax volcanii 22 was also tested to optimise expression of the marine transaminase, without any significant improvement.
While insoluble expression can be due to incompatibility of the host cytoplasmic environment, it is also known that misfolding can be linked to specific mutations in the sequence 23 . All fold type-I TAs reported in literature exist as dimers or tetramers and are generally produced as soluble proteins 1 . Therefore, the VbTA sequence was further analysed to gain insight into its structure and lack of solubility. Previous structural studies, whereby the active site of a HeTA homology model was probed by site-directed mutagenesis to investigate the enzyme affinity for specific aromatic substrates 24 , highlighted that F18 (corresponding to T16 in VbTA) is a key active site residue in the catalytic pocket. It forms a strong interaction with a second phenylalanine (F84, contributed by the second monomer), thus stabilising the compact dimeric structure. HeTA F18A was generated at the time to increase the size of the small pocket and shift the activity of the enzyme towards bulky-bulky substrates; however, this variant was found to be completely insoluble (data not shown), suggesting that the aromatic side chain at this position (F18 in HeTA, T16 in VbTA) is essential for the correct folding of the protein. Furthermore, using the program ConSurf (http://consurf.tau.ac.il) 25 , sequence alignments made with around 500 homologous protein sequences showed that this position is highly conserved for aromatic residues (76.8% Phe, 16.3% Tyr, see Supplementary Figures 1 and 2). Therefore, based on the hypothesis that dimerisation is critical for solubility, and on the fact that amino acids involved in the structural stabilisation of the functional homodimer must be evolutionarily preserved, to some extent, the mutant VbTA T16F was produced.
Expression of the N-terminal His-tagged transaminase was performed in a conventional BL21(DE3) E. coli strain transformed with the expression vector pET100-D-TOPO, and expression under different fermentation conditions was compared by SDS-PAGE (Supplementary Figure 3). The best results were achieved in ZYM-5052 auto-induction medium, with incubation at 30 °C and shaking at 150 rpm for 24 h. VbTA T16F migrated with a MW of 50-55 kDa, in agreement with its calculated MW of 53.2 kDa (including 4.1 kDa corresponding to the His-tag). In order to determine the quaternary arrangement of VbTA T16F, size-exclusion chromatography experiments were carried out, resulting in an estimated MW of 104 kDa, corresponding to a dimeric quaternary structure (theoretical value 106 kDa).
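The assignment of a dimer from the size-exclusion result is simple arithmetic, sketched below in Python. The function name is hypothetical, and rounding the MW ratio to the nearest integer is an assumption made for illustration; size-exclusion estimates also depend on molecular shape, so this is only a consistency check on the reported numbers.

```python
def oligomeric_state(sec_mw_kda, monomer_mw_kda):
    """Return the SEC/monomer MW ratio and the nearest whole number of subunits."""
    ratio = sec_mw_kda / monomer_mw_kda
    return ratio, round(ratio)


MONOMER_MW_KDA = 53.2  # calculated MW, His-tag included (value from the text)
SEC_MW_KDA = 104.0     # MW estimated by size-exclusion chromatography (from the text)

ratio, n_subunits = oligomeric_state(SEC_MW_KDA, MONOMER_MW_KDA)
theoretical_dimer_kda = 2 * MONOMER_MW_KDA

print(f"SEC/monomer ratio: {ratio:.2f}; nearest state: {n_subunits} subunits")
print(f"Theoretical dimer MW: {theoretical_dimer_kda:.1f} kDa")
```

The ratio falls just below 2, which is why the estimated 104 kDa is read as a dimer rather than a monomer or tetramer.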
Characterization of transaminase activity. A standard assay, using (S)-1-phenylethylamine as the amino donor and pyruvate as the amino acceptor, was used to determine the transaminase activity of VbTA T16F 26 . The highest activity was observed at pH 8.0 and 45 °C, with 60% of the initial activity still maintained at 60 °C. As described for the first time by Ikai et al. 27 , this mesophilic profile could be related to the high aliphatic index (94.44) of VbTA T16F. Stability tests at different pH values and temperatures showed that the enzyme is highly stable at temperatures up to 45 °C when stored at pH 8.0 (see Supplementary Figure 4 for details).
Virgibacillus sp. strain 21D was isolated from a deep hypersaline anoxic basin characterised by a high concentration of MgCl2 (5 M) 18 . This strain also grows in the presence of NaCl concentrations up to 1.5 M, which may imply halotolerant behaviour of its enzymes. As previously observed, enzymes from halotolerant microorganisms are stable in conditions of high ionic strength and in the presence of co-solvents, which are often crucial for the reaction or storage medium in biocatalysis. Stability towards organic solvents (Fig. 2) was relatively high in most solvents tested; the enzyme, stored at 25 °C for 2 days in the presence of 20% (v/v) methanol (MeOH), maintained 94-95% of its initial activity. Activity of VbTA T16F in the presence of 10% (v/v) MeOH was nearly unaffected and around 50% activity was observed in the presence of 20% (v/v) DMSO. This tolerance towards polar organic solvents, commonly employed for solubilising hydrophobic substrates, is quite remarkable and significantly better than previously reported for other TAs 1 .
Stability of VbTA T16F in the presence of different concentrations of NaCl and KCl is reported in Fig. 3. VbTA T16F was stable and exhibited 63-65% of its initial activity after 7 days in the presence of 1 M NaCl (5.84%) and up to 80% in the presence of 1 M KCl (7.45%); even at 3 M NaCl (17.5%), 40% residual activity was still retained after 1 week.
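The percentages given in parentheses next to the molar salt concentrations are consistent with a weight/volume conversion. The following Python sketch reproduces them under two assumptions that are not stated in the text: that the percentages are % (w/v), i.e. grams per 100 mL, and that the molar masses are 58.44 g/mol for NaCl and 74.55 g/mol for KCl. The function and constant names are illustrative.

```python
# Assumed molar masses in g/mol (standard values, not taken from the text).
MOLAR_MASS_G_PER_MOL = {"NaCl": 58.44, "KCl": 74.55}


def molar_to_percent_wv(salt, molarity):
    """Convert a molar concentration to % (w/v), i.e. grams of salt per 100 mL."""
    grams_per_litre = molarity * MOLAR_MASS_G_PER_MOL[salt]
    return grams_per_litre / 10.0


for salt, molarity in [("NaCl", 1.0), ("KCl", 1.0), ("NaCl", 3.0)]:
    percent = molar_to_percent_wv(salt, molarity)
    print(f"{molarity:.0f} M {salt} is about {percent:.3f}% (w/v)")
```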
Under optimal reaction conditions, the maximum turnover number (kcat) of VbTA T16F was 0.099 s−1 and the measured Michaelis-Menten constants (KM) were 1.9 mM and 10.7 mM for (S)-1-phenylethylamine and pyruvate, respectively (see Supporting Information for further details).

Substrate scope. Different amino acceptors and donors were tested under standard conditions (Table 1) to evaluate the substrate scope of VbTA T16F.
Hydrophobic substrates bearing aromatic groups were the preferred substrates, both as amino donors and acceptors, whereas small aliphatic substrates and keto-sugars were not generally converted; pyruvate/alanine were among the only small polar substrates accepted. VbTA T16F showed a complete S-enantioselectivity toward 1-phenylethylamine.
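To put the kinetic constants reported for the standard assay (kcat = 0.099 s−1; KM = 1.9 mM for (S)-1-phenylethylamine and 10.7 mM for pyruvate) into perspective, the Python sketch below evaluates the apparent single-substrate Michaelis-Menten expression. This is a simplification made for illustration: it assumes the co-substrate is saturating and ignores the two-substrate mechanism of transaminases and any substrate inhibition. The function names are hypothetical.

```python
def turnover_rate(substrate_mm, kcat_per_s, km_mm):
    """Apparent turnover per enzyme (s^-1): kcat * [S] / (KM + [S])."""
    return kcat_per_s * substrate_mm / (km_mm + substrate_mm)


def catalytic_efficiency(kcat_per_s, km_mm):
    """kcat/KM in M^-1 s^-1, with KM supplied in mM."""
    return kcat_per_s / (km_mm / 1000.0)


KCAT_PER_S = 0.099  # from the text
KM_PEA_MM = 1.9     # (S)-1-phenylethylamine, from the text
KM_PYR_MM = 10.7    # pyruvate, from the text

for s_mm in (1.0, 1.9, 10.0):
    rate = turnover_rate(s_mm, KCAT_PER_S, KM_PEA_MM)
    print(f"[amine] = {s_mm} mM: apparent turnover {rate:.4f} s^-1")

print(f"kcat/KM (amine): {catalytic_efficiency(KCAT_PER_S, KM_PEA_MM):.1f} M^-1 s^-1")
print(f"kcat/KM (pyruvate): {catalytic_efficiency(KCAT_PER_S, KM_PYR_MM):.1f} M^-1 s^-1")
```

At a substrate concentration equal to KM the expression returns half of kcat, which is a quick sanity check on any fitted constants.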
The crystal structure of VbTA T16F. In order to investigate the structural basis of the improved solubility of the heterologously expressed T16F mutant, the 3D structure of VbTA T16F was solved at 2 Å resolution by X-ray crystallography. VbTA T16F crystallised with one monomer in the asymmetric unit (Matthews coefficient (VM) of 2.68 Å³/Da, with an estimated solvent content of 54.1%). Electron density was evident for residues 2 to 444 with no gaps. One molecule of PLP was bound at the active site, although it does not form a covalent aldimine bond with the catalytic lysine residue (K273), as seen in the crystal structures of many other PLP-dependent enzymes (see Supplementary Figure 5). The VbTA T16F monomer possesses the canonical class III (S)-selective ω-aminotransferase fold, comprising two domains: i) a PLP-dependent transferase-like domain (residues 81 to 313) that hosts a central, seven-stranded mixed β-sheet (β4-β10-β9-β8-β7-β5-β6), with β-strand 10 being antiparallel to the rest, surrounded by 8 α-helices and three 3₁₀ helices (η1-3) (Fig. 4a); ii) domain 2 (residues 2-80; 314-444) that contains a small N-terminal sub-domain (residues 2 to 49; α1-α2-β1-β2-β3) (Fig. 4b).
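The solvent content quoted alongside the Matthews coefficient can be checked with the standard Matthews relation, solvent fraction = 1 − 1.23/VM, sketched below in Python. The constant 1.23 assumes a typical protein partial specific volume of about 0.74 cm³/g; the text does not state which value or program was used, so this is an assumption, and the function name is illustrative.

```python
def solvent_fraction(matthews_coefficient):
    """Estimate crystal solvent fraction from VM (in cubic angstroms per Da).

    Uses the usual approximation 1 - 1.23 / VM, which assumes a protein
    partial specific volume of roughly 0.74 cm^3/g.
    """
    if matthews_coefficient <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - 1.23 / matthews_coefficient


VM_VBTA_T16F = 2.68  # value reported in the text

print(f"Estimated solvent content: {100 * solvent_fraction(VM_VBTA_T16F):.1f}%")
```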
In agreement with the majority of class III TAs, VbTA T16F forms a dimer, generated between VbTA T16F chain A and its symmetry-related monomer (180° rotation) (Fig. 5a). Dimerisation covers an interaction surface area of 5463 Å², involving 103 residues from each monomer and 48 hydrogen bonds (as calculated using PDBsum) 28 .
The closest structural homolog to VbTA T16F is the apo-form of the omega transaminase from Chromobacterium violaceum (PDB entry 4A6R), with a sequence identity of 37.5% over 432/443 aligned residues (RMSD value = 1.3 Å) 11 . The sequence and structural conservation of VbTA T16F with all homologous structures deposited in the PDB was assessed using ENDscript 2.0 (http://endscript.ibcp.fr) 29 . A total of 125 unique deposited structures (sequence conservation > 30%) were superimposed with VbTA T16F. As expected, the highest conservation was found in the PLP-binding domain residues; the lowest conservation was observed in flexible loop regions, and in particular in the N-terminal domain that mediates dimerisation (Fig. 5b).
Based on the availability of a holo-form, two apo-forms and intermediate states of CvTA, Humble et al. demonstrated for the first time (for a transaminase) that, upon PLP binding, the enzyme undergoes extensive structural rearrangements in three loop regions that form the architecture of the active site 11 . In particular, the N-terminal subdomain of domain 2 completely refolds, as indicated by the lack of electron density in the apo-form of the CvTA crystal structure (PDB entry 4A6R). Correct structural reorganisation of these loops was shown to be essential to form the catalytic pocket at the dimer interface.
F16 is located in a critical position, at the center of the 14-residue loop connecting α1 to α2 of subdomain 2 (Fig. 5c,d). This loop clasps between α11 (domain 1) and α12 (domain 2) of the opposing monomer, locking the dimer together (Fig. 5c). F16 has a clear role in stabilising the dimer interface, making 26 contacts (<4 Å) with main- and side-chain atoms from six residues in the opposing monomer, including three hydrogen bonds with the carbonyl oxygen atoms of L303 (3.4 Å) and H305 (2.97 Å), and the nitrogen atom of the α-amine group of S84 (2.84 Å) (Fig. 5d). Such interactions result in the stabilisation of a second loop (residues 298-313) that contributes to the active site. The correct positioning of this loop is essential as it houses the highly conserved residue T308 that binds PLP via two hydrogen bonds between the phosphate oxygen atom (O2P) of PLP and the main chain amide nitrogen atom (length 2.78 Å) and side chain hydroxyl atom (length 2.64 Å) of T308. Furthermore, T308, in turn, also hydrogen bonds (length 2.68 Å) to the catalytic lysine residue K273 (NZ atom) via its side chain hydroxyl atom (see Supplementary Figure 5). These observations indicate that the loss of the F16 interaction network by substitution with threonine would result in significant destabilisation of the dimer interface, as confirmed in structural bioinformatics calculations described below.

Stability and affinity calculations (wt versus T16F). To evaluate the stability of monomeric and homodimeric forms of VbTA, we performed in silico mutagenesis simulations and evaluated protein stability and affinity (Table 2).
The introduction of the T16F mutation significantly increases the protein stability, with a contribution of 4.82 kcal/mol computed for the monomeric forms. Since the T16F mutation is located at the dimerisation interface of VbTA, its impact on the homodimerisation process is significant. The affinity of the protomers for one another differs by 14.23 kcal/mol between the wild type VbTA and the T16F mutant, suggesting that the latter has a greater propensity to dimerise.
The stability of the VbTA T16F dimer is 10.24 kcal/mol higher than that of the mutant monomer (2 protomers × 4.82 kcal/mol), suggesting that dimerisation can increase T16F mutant stabilisation not only through intramolecular but also via intermolecular interactions (5.12 kcal/mol per protomer).
In silico solvent analysis. The surface area regions of VbTA T16F with significant hydration free energy contributions (Fig. 6, Panel a, solid green surface) correspond to superficial, negatively charged, glutamic and aspartic acid residues (46 out of 443 total residues). The presence of salt at high concentrations can significantly stabilise VbTA T16F, specifically via extensive interactions with positively charged sodium ions (Fig. 6, Panel b, dotted cyan surface), as can be appreciated from the almost perfect superposition between the hydration free energy and the relative sodium density surfaces. In contrast, chloride ions (Fig. 6, Panel b, dotted yellow surface) do not seem to concentrate, to the same extent, around the solvent shell close to VbTA T16F and thus do not play a similarly strong stabilisation role as sodium ions. The Lennard-Jones probe particles (Fig. 6, Panel c, orange surface lines) concentrate all around the VbTA T16F surface, generating a diffuse and extended interaction network, but they avoid the high hydration free energy areas (Fig. 6, Panel a, solid green surface) that are preferentially occupied by sodium ions (Fig. 6, Panel b, dotted yellow surface and Panel c, yellow surface lines).

Conclusion
Our data illustrate how the strategic mutation of a single residue can improve the heterologous expression of an otherwise unstable/insoluble protein, a strategy that may be adopted for other TA enzymes. This approach was employed to prepare a new transaminase (VbTA) from the halotolerant marine bacterium Virgibacillus sp. 21D. The mutant VbTA T16F was obtained as soluble protein in E. coli, showed remarkable stability towards organic solvents, and displayed activity towards hydrophobic aromatic substrates (both primary amines and ketones/aldehydes). The crystal structure of VbTA T16F and related computational calculations reveal the crucial structural role of the N-terminal subdomain and the T16F mutation for correct active site architecture and stable dimerisation.
Table 1. Substrate scope of VbTA T16F. Each reaction was performed in triplicate and the results are reported as the average of the data obtained after 24 h. Substrates were prepared in methanol solution to guarantee the correct concentration of the substrate in the reaction mixture (final concentration 10% (v/v) MeOH). Final conversions were determined by HPLC. a Amino donor concentration was kept constant at 10 mM, and 10 mM pyruvate was used as the amino acceptor; b amino acceptor concentration was kept constant at 10 mM; L-alanine was used as the amino donor. n.d. = non-detectable in the tested condition.
Figure 5. [...] shown in sausage representation, as automatically generated by ENDscript 2.0 (http://endscript.ibcp.fr) 29 . Structure conservation with 125 structure homologs deposited in the PDB is indicated by ribbon thickness, with regions of low conservation being thicker than highly conserved regions (thin regions). Sequence identity is indicated by red coloring; the redder the residue, the more conserved it is. The N- and C-termini are indicated and PLP is shown in sticks; (c) the F16 mutation (orange sticks) and bound PLP (sticks, atom colouring) in chain A are highlighted. The active site loop region (residues 298-313) in chain B, located between α11 and α12, is shown in yellow; (d) detailed view of the interaction network (<4 Å; grey lines) between F16 (orange sticks) and residues F83, S84 and 302-305 (grey sticks) of the opposing monomer. The active site loop region (residues 298-313) in chain B is coloured yellow. Bond distances (Å) for the three hydrogen bonds are indicated. Panels A, C and D were generated using Chimera 32 , whereas panel B was generated using MacPymol 2.0.6.
Table 2. In silico mutagenesis and affinity/stability analysis and calculations. Δaffinity values report the change in binding affinity within protomers between the wild type and VbTA T16F mutant; Δstability values report the changes in protein (monomer or homodimer) stability occurring after the mutation; stability is defined as the difference in energy between the folded and unfolded states. Δaffinity and Δstability data are calculated using an implicit solvent MM-GBSA method.
Figure 6. In silico solvent analysis of the VbTA T16F surface: (a) shows the hydration free energy iso-level density (solid green) with a ΔG cutoff value of −5.0 kcal/mol/Å³; (b) shows the relative sodium and chloride ion iso-level densities relative to a 1 M bulk concentration of NaCl; (c) shows both the relative sodium and chloride ion iso-level densities relative to a 1 M bulk concentration of NaCl and the relative hydrophobe (a Lennard-Jones particle, i.e. a probe of the size of a neutral Cl atom) iso-level density relative to a 50 mM bulk concentration.
The best expression conditions were observed using BL21(DE3) cells transformed with the pET100-D-TOPO vector in auto-induction medium 33 . Pellets derived from 300 mL cultures were resuspended in approx. 12 mL (2 mL per g pellet) of wash buffer (50 mM Tris-HCl pH 8.0, 100 mM NaCl, 0.1 mM PLP, 30 mM imidazole) and lysed by sonication, as previously described 17 . The bacterial lysate was clarified by centrifugation at 13,000 × g for 1 h at 4 °C, and filtered using a 0.45 μm filter (Millipore). Using an ÄKTA Start System (GE Healthcare), crude extract was loaded at a flow rate of 1 mL/min into a 1 mL HisTrap HP column packed with NiSO4 (0.1 M). The column was washed with wash buffer for 10 column volumes (CVs) and the purified protein was eluted with elution buffer (50 mM Tris-HCl pH 8.0, 100 mM NaCl, 0.1 mM PLP, 300 mM imidazole), following an intermediate wash step with 10 CVs of buffer, prepared by mixing 15% elution buffer with wash buffer, to remove non-specifically bound proteins.
Fractions containing the purified enzyme were desalted via overnight dialysis against 50 mM phosphate buffer pH 8, containing 0.1 mM PLP. The purified enzyme was quantified by Epoch Take3 and stored at 4 °C.
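The imidazole concentration of the intermediate wash step in the purification follows from the mixing ratio of the two buffers. The Python sketch below works it out, assuming that "15% elution buffer" means 15% by volume with the remainder being wash buffer; the function name is hypothetical.

```python
def mixed_concentration(fraction_b, conc_a_mm, conc_b_mm):
    """Concentration (mM) after mixing buffer A with a volume fraction of buffer B."""
    if not 0.0 <= fraction_b <= 1.0:
        raise ValueError("fraction_b must lie between 0 and 1")
    return (1.0 - fraction_b) * conc_a_mm + fraction_b * conc_b_mm


WASH_IMIDAZOLE_MM = 30.0      # wash buffer, from the text
ELUTION_IMIDAZOLE_MM = 300.0  # elution buffer, from the text

intermediate_mm = mixed_concentration(0.15, WASH_IMIDAZOLE_MM, ELUTION_IMIDAZOLE_MM)
print(f"Intermediate wash imidazole: {intermediate_mm:.1f} mM")
```

Because Tris-HCl, NaCl and PLP are at the same concentration in both buffers, only the imidazole concentration changes during this step.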
Enzymatic assay. The purified recombinant enzyme was quantified and stored at 4 °C. A kinetic assay derived from Schätzle et al. 26

HPLC analysis. The final conversion of the different amino acceptors was determined using a Thermo Scientific HPLC instrument equipped with an Accucore C18 LC column (particle size 2.6 μm, diameter 4.6 mm, length 150 mm). Substrates were detected at 210, 245 and 280 nm using the following mobile phases: A, formic acid (0.1% in water); B, ACN. The gradient elution method adopted was 15% B (10 min), increasing to 80% B (over 8 min), decreasing to 15% B (over 2 min) at 25 °C at a flow rate of 1 mL/min. The depletion of aromatic amines and aldehydes and the formation of acetophenone were evaluated using a calibration curve. Samples were injected after a 1:50 dilution with 0.2% HCl in the quenching step.
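The gradient programme described for the HPLC method can be written as a piecewise-linear function of run time, as in the Python sketch below. Two points are assumptions rather than statements from the text: that "15% B (10 min)" is an isocratic hold, and that both ramps are linear. The names are illustrative.

```python
# Each segment is (duration in min, %B reached at the end of the segment).
START_PERCENT_B = 15.0
SEGMENTS = [
    (10.0, 15.0),  # assumed isocratic hold at 15% B
    (8.0, 80.0),   # ramp up to 80% B
    (2.0, 15.0),   # ramp back down to 15% B
]


def percent_b(t_min):
    """Return %B of the mobile phase at run time t_min (linear interpolation)."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    seg_start, b_start = 0.0, START_PERCENT_B
    for duration, b_end in SEGMENTS:
        seg_end = seg_start + duration
        if t_min <= seg_end:
            fraction = (t_min - seg_start) / duration
            return b_start + fraction * (b_end - b_start)
        seg_start, b_start = seg_end, b_end
    raise ValueError("time is beyond the end of the gradient programme")


total_run_min = sum(duration for duration, _ in SEGMENTS)
for t in (0.0, 10.0, 14.0, 18.0, 20.0):
    print(f"t = {t:4.1f} min: {percent_b(t):.1f}% B")
```

At the stated flow rate of 1 mL/min, the total run time in minutes equals the mobile phase volume in mL consumed per injection.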
Crystallisation. 400 34,35 . Data were processed using XDS and assigned to a trigonal (P3₂21) space group using POINTLESS and scaled using SCALA, both included in the CCP4i suite. The 3D structure of VbTA T16F was solved using Molrep and the structure of a putrescine aminotransferase from Pseudomonas aeruginosa (PDB entry 5TI8; 43% sequence identity over 433 aligned residues) as a search model 36 . The structure was manually built using COOT and refined using phenix.refine until satisfactory refinement parameters were achieved (Rwork = 17.9%; Rfree = 21.8%). All residues are located in allowed regions of the Ramachandran plot, except for three residues located in geometrically-restrained regions: S53, located between two β-turns, A283, and the catalytic K284. Data collection parameters and refinement statistics are shown in Supplementary [...]
In silico mutagenesis and affinity/stability analysis and calculations. In silico mutagenesis simulations and evaluation of protein stability and affinity were carried out using the BioLuminate suite and the OPLS3 force field (Schrödinger, LLC), using the Residue Scanning panel. After residue substitution, the new side chain, surrounding residues and backbone were sampled to the nearest energy minimum and an implicit solvent MM-GBSA method was used to retrieve the solvation energy and the binding affinity of the mutated monomers. The wild type VbTA monomer was generated from the experimental crystallographic structure of the VbTA T16F mutant, crystallised as a monomer in the asymmetric unit. The dimer was generated using chain A and the coordinates for its symmetry-related monomer (180° rotation).
In silico solvent analysis. In order to estimate the effect of the ionic strength on VbTA T16F stability, we carried out solvent analyses to characterise the interplay between the solvent (mainly water and salt) and the solute. Calculations were run using the three-dimensional reference interaction site model (3D-RISM) of the Molecular Operating Environment (MOE 2018.01). This application computes a time-averaged distribution of water H and O densities, along with free-energy maps for analysing solvent stability and solvation contributions to binding free-energy. Calculations were run using the "Solvent analysis" program of the MOE suite and the AMBER10:EHT forcefield. Crystallographic water molecules were removed and VbTA T16F was submitted to the MOE QuickPrep program before 3D-RISM calculations with 1 M NaCl.
Cut-off values for both the relative sodium and chloride ion densities were set as four-fold denser than bulk and the cut-off value for hydrophobe density was set as two-fold denser than the bulk. In order to favour a graphical comparison between hydration free energy and relative ion and hydrophobe densities, a multiplication factor of 2 was applied to both the relative ion and hydrophobe density grid iso-levels.

Database.
Coordinates and structure factors have been deposited in the Protein Data Bank (www.rcsb.org) under accession number 6FYQ.