Zymophore identification enables the discovery of novel phenylalanine ammonia lyase enzymes

The suite of biological catalysts found in Nature has the potential to contribute immensely to scientific advancements, ranging from industrial biotechnology to innovations in bioenergy and medical intervention. The endeavour to obtain a catalyst of choice is, however, fraught with challenges. Herein we report the design of a structure-based annotation system for the identification of functionally similar enzymes from diverse sequence backgrounds. Focusing on an enzymatic activity with demonstrated synthetic and therapeutic relevance, five new phenylalanine ammonia lyase (PAL) enzymes were discovered and characterised with respect to their potential applications. The variation and novelty of the desirable traits seen in these previously uncharacterised enzymes demonstrate the importance of effective sequence annotation in unlocking the potential diversity that Nature provides in the search for tailored biological tools. This new method has commercial relevance as a strategy for assaying the ‘evolvability’ of certain enzyme features, thus streamlining and informing protein engineering efforts.


Database searches
Discovery of uncharacterised potential enzyme sequences was undertaken using an appropriate query (protein sequences of ammonia lyases from Anabaena variabilis, Photorhabdus luminescens and Streptomyces sp.) to perform a sequence similarity search within the knowledge base of the Universal Protein Resource (UniProt). In each case the basic local alignment search tool (BLAST) was used to find regions of sequence similarity between an amino acid sequence and the in silico translated DNA sequences of all available genomes and metagenomes (tBLASTn). All searches were gapped and unfiltered with a statistical significance value threshold of E = 10. The choice of protein substitution matrix was set to automatic and thus assigned computationally based on the length of sequence. Sequences were downloaded in FASTA format for inspection and alignment. All sequence alignments were performed using the command line interface of the ClustalW2 multiple sequence alignment programme, as available online. All alignments made use of the Gonnet protein weight matrix with the 'gap open' penalty score set to 10 in all cases. The initial pairwise alignment type was set to slow with a gap extension score of 0.1. For subsequent multiple alignments the gap extension score was set to 0.2, with a gap distance penalisation value of 5 and without end gap penalisation or iteration. Sequences were clustered via the neighbour-joining method. Alignments of all putative PALs were performed against the primary sequence of AvPAL to allow accurate mapping of the zymophore motif onto homologous positions.
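The mapping of motif positions from AvPAL onto a putative PAL can be illustrated with a minimal sketch, assuming a pairwise alignment is already available as two gapped strings of equal length. The function name, the toy sequences and the chosen positions are hypothetical and are not taken from the alignments described here; the sketch only shows the bookkeeping needed to translate reference numbering into target numbering.

```python
def map_reference_positions(ref_aln, target_aln, ref_positions):
    """Map 1-based positions of the ungapped reference onto an aligned target.

    ref_aln, target_aln: gapped sequences from the same alignment ('-' = gap).
    Returns {ref_pos: (ref_residue, target_residue, target_pos)}; target_pos is
    None when the target has a gap at that alignment column.
    """
    if len(ref_aln) != len(target_aln):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(ref_positions)
    mapping = {}
    ref_idx = 0  # residues seen so far in the reference (ungapped numbering)
    tgt_idx = 0  # residues seen so far in the target (ungapped numbering)
    for ref_res, tgt_res in zip(ref_aln, target_aln):
        if tgt_res != "-":
            tgt_idx += 1
        if ref_res == "-":
            continue  # insertion in the target; no reference position here
        ref_idx += 1
        if ref_idx in wanted:
            mapping[ref_idx] = (ref_res, tgt_res,
                                tgt_idx if tgt_res != "-" else None)
    return mapping


# Toy alignment (hypothetical sequences, not real PAL fragments).
toy_ref = "MK-FLASGT"
toy_target = "MRAYL-SGS"
toy_mapping = map_reference_positions(toy_ref, toy_target, [3, 5, 6])
```

In this toy case reference position 3 lines up with target residue 4, and reference position 5 falls in a gap of the target, which is the kind of case that would need manual inspection.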

HPLC analysis
Reverse phase HPLC analyses were performed on an Agilent 1200 Series system equipped with a G1379A degasser, G1312A binary pump, a G1329 autosampler unit, a G1316A temperature controlled column compartment and a G1315B diode array detector.
Lysozyme (500 μL, 10 mg mL⁻¹) was added and the mixture was incubated at 37°C and 220 rpm for 45 min. The suspension was sonicated (20 s on, 20 s off, 20 cycles, Soniprep 150, MSE UK Ltd) on ice and treated with DNase (100 μL, 1 mg mL⁻¹) at 37°C and 220 rpm for 45 min. The mixture was centrifuged (18,000 rpm, 30 min, 4°C) and the supernatant was filtered (0.2 μm syringe filter) and loaded onto a prepacked HisTrap FF column (GE Healthcare, 1 mL solid phase). The column was washed with the wash buffer (5-10 mL) and the protein was eluted with the elution buffer (10 mL, 50 mM KPi, 500 mM NaCl, 250 mM imidazole, pH 7.4), collecting the eluate in separate fractions.
Fractions of sufficient purity (judged by SDS-PAGE analysis) were pooled according to protein concentration (measured by Bradford assay) and used without further processing.

Calculation of specific activities
The purified PAL (50 μL of a 0.5 mg mL⁻¹ solution in NaPi buffer, pH 7.4) was added to a solution of L-phenylalanine (20 mM, 225 μL) and borate buffer (675 μL, 100 mM, pH 10.0). The mixture was incubated at 37°C and 220 rpm for 1 h. Samples were analysed by HPLC on a non-chiral stationary phase. For the calculation, 1 unit is defined as the amount of enzyme converting 1 μmol of L-2a to 1a in 1 min.
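The unit definition above can be turned into a short worked calculation. The following is a minimal sketch, assuming that the HPLC conversion is available as a fraction between 0 and 1 and that the rate is treated as constant over the 1 h incubation (an endpoint approximation, not something stated in the text). The function name and the example conversion value are hypothetical; the default volumes and concentrations are those given in the procedure.

```python
def specific_activity(conversion,
                      substrate_mM=20.0, substrate_uL=225.0,
                      enzyme_mg_per_mL=0.5, enzyme_uL=50.0,
                      time_min=60.0):
    """Return specific activity in U per mg (1 U = 1 umol converted per min).

    conversion: fraction of substrate converted (0-1), e.g. from HPLC peak areas.
    Assumes a constant rate over the whole incubation (endpoint approximation).
    """
    if not 0.0 <= conversion <= 1.0:
        raise ValueError("conversion must be a fraction between 0 and 1")
    # mM x uL gives nmol; divide by 1000 to obtain umol of substrate added.
    substrate_umol = substrate_mM * substrate_uL / 1000.0
    product_umol = conversion * substrate_umol
    # mg/mL x uL gives ug; divide by 1000 to obtain mg of enzyme added.
    enzyme_mg = enzyme_mg_per_mL * enzyme_uL / 1000.0
    return product_umol / time_min / enzyme_mg


# Hypothetical example: 50% conversion under the stated conditions.
example_activity = specific_activity(0.5)  # 2.25 umol / 60 min / 0.025 mg
```

With these defaults the reaction contains 4.5 μmol of substrate and 0.025 mg of enzyme, so a hypothetical 50% conversion corresponds to 1.5 U mg⁻¹.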

Purified enzyme plate reader assay
A solution containing 0.5 mg mL⁻¹ enzyme (20 µL) was added to a 96-well plate, followed by addition of substrate solution (L-Tyr or L-His, 180 µL, 5 mM) to a total volume of 200 µL. The assay was performed at 37°C for 20 min, with readings taken at 30 s intervals. Detection wavelengths: coumaric acid 380 nm, urocanic acid 320 nm.
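How the absorbance traces were processed is not stated here. One common approach, shown below purely as an assumption, is to estimate the rate of product formation as the least-squares slope of absorbance against time. The function name and the synthetic trace are hypothetical; converting the slope to a concentration would additionally require an extinction coefficient and path length, which are not given in the text.

```python
def absorbance_slope_per_min(times_s, absorbances):
    """Least-squares slope of absorbance versus time, in absorbance units per min."""
    n = len(times_s)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_s, absorbances))
    return (sxy / sxx) * 60.0  # slope per second converted to per minute


# 20 min at 30 s intervals gives 41 readings (0 s to 1200 s).
read_times = [30 * i for i in range(41)]
# Synthetic, perfectly linear trace for illustration only (not measured data).
synthetic_trace = [0.05 + 0.0002 * t for t in read_times]
example_slope = absorbance_slope_per_min(read_times, synthetic_trace)
```

In practice only the initial, linear part of a trace would normally be passed to such a function, since substrate depletion flattens the curve at later time points.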

Purified enzyme analytical scale assay
A solution containing 0.5 mg mL⁻¹ enzyme (50 µL) was added to a solution of either borate buffer (430 µL, 100 mM, pH 8.0-10.0) or NaPi (430 µL, 100 mM, pH 6.0), followed by addition of L-phenylalanine solution (20 µL, 250 mM in the same buffer) to a final substrate concentration of 5 mM and a final enzyme concentration of 0.05 mg mL⁻¹. Temperature stability tests were conducted by incubating the enzyme in borate buffer pH 8.0 for the specified time (1, 24 and 48 h) at 37°C, followed by addition of L-phenylalanine (20 µL, 250 mM) and incubation for a further 16 h. Results for these were reported as conversions relative to the maximum obtained with the untreated cells.

... S8 and Anabaena variabilis (AvPAL), S1 the (R)-selective PAM from Taxus canadensis (TcPAM), S9 one of the PAL paralogues from Petroselinum crispum (PcPAL1), S10 the bifunctional PAL/TAL from Rhodosporidium toruloides (RtPAL) S11 and two distinct TALs from Streptomyces sp. (BagA) S12 and from Saccharothrix espanaensis (Sam8). S13

Figure S4. The full amino acid sequences of the 5 new PALs selected for characterisation in FASTA format.

Figure S6. A percentage identity plot of the new PALs as inferred from the multiple sequence alignment.