Reconstitution and optimisation of the biosynthesis of bacterial sugar pseudaminic acid (Pse5Ac7Ac) enables preparative enzymatic synthesis of CMP-Pse5Ac7Ac

Pseudaminic acids present on the surface of pathogenic bacteria, including the gut pathogens Campylobacter jejuni and Helicobacter pylori, are postulated to play influential roles in the etiology of associated infectious diseases by modulating flagella assembly and recognition of bacteria by the human immune system. Yet they are underexplored compared to other areas of glycoscience; in particular, the enzymes responsible for the glycosyltransfer of these sugars in bacteria are still to be unambiguously characterised. This can be largely attributed to a lack of access to nucleotide-activated pseudaminic acid glycosyl donors, such as CMP-Pse5Ac7Ac. Herein we reconstitute the biosynthesis of Pse5Ac7Ac in vitro using enzymes from C. jejuni (PseBCHGI), in the process optimising coupled turnover with PseBC using deuterium wash-in experiments and establishing a method for co-factor regeneration in PseH turnover. Furthermore, we establish conditions for purification of a soluble CMP-Pse5Ac7Ac synthetase enzyme, PseF from Aeromonas caviae, and utilise it in combination with the C. jejuni enzymes to achieve practical preparative synthesis of CMP-Pse5Ac7Ac in vitro, facilitating future biological studies.


Results
Pseudaminic acid biosynthetic enzyme production. Cognisant that scalable in vitro enzymatic synthesis of CMP-Pse5Ac7Ac 3 would be dependent on ready access to the six biosynthetic enzymes, PseBCHGIF (Scheme 1) 20 , we initially set out to attain large scale production of the recombinant enzymes. Expression trials in E. coli BL21 (DE3) cells, using plasmids encoding N-terminal His-tagged PseC, PseH, PseG, PseI, and C-terminal His-tagged PseB genes from C. jejuni, were imaged on SDS-PAGE; these displayed overexpressed enzymes at the predicted molecular weights and allowed for identification of induction conditions (Table 1, Supplementary Fig. SI.1). The production of the PseB, PseH, PseG and PseI enzymes routinely afforded mg/L yields of protein post-purification, greater than or equal to those previously reported (Table 1, Fig. 2) 20 . However, the PseC enzyme displayed a propensity to precipitate during purification when expressed at an induction temperature of 37 °C over 4 h. We therefore explored reducing the temperature to 16 °C, which reduced the concentration of protein produced but also reduced precipitation post-purification. These conditions were therefore used in future large scale protein preparations. Unfortunately the N-terminal His-tagged H. pylori PseF enzyme was largely insoluble in all expression conditions trialled in our hands (Supplementary Fig. SI.2). We therefore turned our attention to the PseF homologue from A. caviae, a Gram-negative bacterium which presents Pse5Ac7Ac 1 on its flagella 21 . Expression of A. caviae PseF was investigated in E. coli BL21 (DE3) cells, with induction conditions of 0.1 mM IPTG followed by three hours incubation at 30 °C found to yield soluble protein at ~13 mg L −1 post-purification. The enzyme was characterised by mass spectrometry, circular dichroism and size exclusion chromatography-multi-angle laser light scattering (SEC-MALS) (Supplementary Fig. SI.4-6), which confirmed it existed as a homodimer, consistent with homologous CMP-Neu5Ac synthetase 22 and CMP-Kdo synthetase 23 . Furthermore the activity of the enzyme as a CMP-Pse5Ac7Ac synthetase was confirmed in small scale negative ion ESI-LCMS assays with Pse5Ac7Ac 1 and CTP (Supplementary Fig. SI.7). Notably all enzymes could be stored in the freezer without cryoprotectant prior to use.
(Fig. 3). Significant apparent starting material peak at 606 m/z also remained after 45 min, in addition to some seem-   , with no increase in conversion noted over 6 h. This result therefore implied that turnover by the PseB enzyme may be the limiting step in the coupled transformation; however, we found that increasing the PseB enzyme concentration had no effect on conversion. Consideration of the PseB mechanism in more detail, and noting previous biochemical characterisation, revealed that in addition to acting as a 5-inverting 4,6-dehydratase, PseB can also catalyse a further C5 epimerisation (highlighted with an asterisk in Scheme 2) of the initial product 7 to afford UDP-4-keto-6-deoxy GlcNAc 8 20,26 , albeit at a lower rate. The GlcNAc configured 8 is the first intermediate en route to the biosynthesis of the bacterial sugar UDP-diNAcBac, integral to N-linked protein glycosylation in C. jejuni, and is no longer a substrate for PseC in the Pse5Ac7Ac pathway 25,27 . We hypothesised that our transformation might have stalled due to C5-epimerisation, and that the epimeric ketone products of PseB 7 and 8 (with identical ESI-LCMS peaks at [M−H] − = 588 m/z, red) would exist in equilibrium with their hydrated counterparts 9 and 10 26 . This would complicate ESI analysis as the hydrates would have the same [M−H] − peak (606 m/z) as the UDP-GlcNAc 4 starting material; the apparent remaining starting material in the PseB/C coupled reaction could therefore instead represent epimeric hydrated PseB products 9 and 10.
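The mass coincidence described above can be checked with a short calculation. The following is a minimal sketch, assuming the free-acid molecular formula C17H27N3O17P2 for UDP-GlcNAc 4 and treating the PseB ketone products 7 and 8 as that formula minus one water; the function and variable names are illustrative and not taken from the original work.

```python
# Monoisotopic masses (u) of the elements needed here.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "P": 30.9737620}
PROTON = 1.00727646  # mass lost on forming the [M-H]- ion


def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())


def mz_neg(formula):
    """m/z of the singly charged [M-H]- ion."""
    return mono_mass(formula) - PROTON


def add(formula, delta):
    """Return a new formula with the element counts in delta added."""
    out = dict(formula)
    for el, n in delta.items():
        out[el] = out.get(el, 0) + n
    return out


# Assumption: free-acid formula of UDP-GlcNAc 4.
UDP_GLCNAC = {"C": 17, "H": 27, "N": 3, "O": 17, "P": 2}
# 4,6-dehydration removes one water to give the keto products 7/8.
KETONE = add(UDP_GLCNAC, {"H": -2, "O": -1})
# Hydration of the ketone adds the water back (hydrates 9/10).
HYDRATE = add(KETONE, {"H": 2, "O": 1})

for name, f in [("UDP-GlcNAc 4", UDP_GLCNAC),
                ("ketone 7/8", KETONE),
                ("hydrate 9/10", HYDRATE)]:
    print(f"{name}: [M-H]- = {mz_neg(f):.3f}")
```

Under these assumptions the hydrates are isomers of the starting material, so they cannot be separated from it by mass alone, which is the ambiguity the text describes.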
This suggested an excess of PseC enzyme rather than PseB may be required to drive the reaction forward prior to epimerisation occurring. To explore this hypothesis, however, it was necessary to clarify the ESI-LCMS analysis of the reaction and distinguish between the intermediates in the reaction, particularly hydrates 9 and 10 and the UDP-GlcNAc starting material 4. We therefore opted to perform the reaction in deuterated buffer, as during the PseB catalysed transformation incorporation of a non-exchangeable C5-proton from bulk solvent occurs 26
however the optimised ratio of PseB:PseC determined here would be particularly useful in the absence of PseH, i.e. when isolated PseC product UDP-4-amino-4,6-dideoxy-β-l-AltNAc 6 is the desired target, and/or when a non-enzymatic acylation procedure is utilised to install natural or unnatural N-acyl groups 24 .
biosynthesis is the transfer of an acetyl group to N4 of the PseC product 6 affording UDP-4-acetamido-4,6-dideoxy-β-l-AltNAc 17 (Scheme 1). Enzymatically this transformation is catalysed by PseH 29,30 , an aminoglycoside N-acetyl transferase from the GNAT superfamily, and as such utilises Ac-CoA 5 as a co-factor 31 . Despite being used ubiquitously as an acetyl transfer group in vivo, the complex structure of Ac-CoA 5 makes it a high cost reagent, limiting its use for in vitro preparative enzymatic synthesis of Pse5Ac7Ac 1. Therefore, to make the synthesis more economically viable, we considered strategies to reduce the amount of Ac-CoA 5 required in the acylation of PseC product 6. Notably chemical acetylation has been used previously to complete this transformation, and offers an enticing opportunity to install unnatural functionality into the Pse5Ac7Ac 1 backbone 18,24,25 . However, the components of such chemical acetylation reactions are invariably incompatible with subsequent enzymatic transformations, necessitating extra purifications of biosynthetic intermediates.
Therefore, we instead sought to explore a method for recycling Ac-CoA 5 which would be compatible with a one-pot multienzymatic synthesis of Pse5Ac7Ac 1 and its derivatives. Previously the thioester acetyl thiocholine iodide 16 had been reported as a low cost acetyl transfer agent in the regeneration of sub-stoichiometric amounts of Ac-CoA 5 for the synthesis of citric acid 32 . We therefore opted to apply this system for recycling Ac-CoA 5 in the PseH catalysed acetylation of 6, wherein any catalytic CoA thiol 18 liberated after acetylation would undergo in situ thioester exchange with the water soluble acetyl thioester 16, regenerating the co-factor 5 (Scheme 3). In order to ascertain the efficiency of this method, a number of small scale three-enzyme one-pot reactions were set up with UDP-GlcNAc 4 (1 mM), PseB (25 μM), PseC (125 μM), PseH (50 μM), and either sub-stoichiometric Ac-CoA 5 (0.15 mM) with increasing concentrations of acetyl thiocholine iodide 16 (0-20 mM) or, as a control, no Ac-CoA 5 (0 mM) with acetyl thiocholine iodide 16 (20 mM). The reactions were once more monitored by negative ESI-LCMS and the relative conversion to acetylated PseH product 17 ([M−H] − = 631 m/z, purple) calculated as a percentage of the total biosynthetic intermediates remaining. With 20 mM acetyl thiocholine iodide 16 but no Ac-CoA 5 (Fig. 5a), no PseH product 17 is observed, indicating that 16 cannot itself act as a co-factor for the reaction. Addition of sub-stoichiometric Ac-CoA 5 (0.15 mM) alone does yield PseH product 17 (Fig. 5b) but, as expected, at a lower overall conversion. However, this conversion could be increased to 66% upon addition of 2 mM acetyl thiocholine iodide 16 (Fig. 5c), and to 72% when 16 was included at 20 mM (Fig. 5d), indicating that regeneration of the catalytic Ac-CoA 5 does occur through in situ thioester exchange.
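The relative conversion metric described above (product signal as a percentage of all biosynthetic intermediates) can be illustrated as follows. This is a sketch only: the intensity values are invented for illustration and are not data from the study, and the helper name `relative_conversion` is hypothetical. It also assumes that all species ionise with comparable efficiency, which ESI does not guarantee.

```python
def relative_conversion(intensities, product):
    """Percentage of the summed ion intensity that belongs to `product`.

    `intensities` maps a species label to its extracted-ion intensity.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 100.0 * intensities[product] / total


# Hypothetical intensities (arbitrary units), keyed by the [M-H]- values
# quoted in the text: 606 (UDP-GlcNAc 4 / hydrates), 588 (ketones 7/8),
# 631 (PseH product 17).
example = {"606": 18.0, "588": 10.0, "631": 72.0}

print(f"{relative_conversion(example, '631'):.0f}% conversion to 17")
```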
Indeed, even using lower cost sub-stoichiometric CoA thiol 18 at 0.15 mM, as opposed to Ac-CoA 5, could yield 65% conversion to the PseH product 17 in the presence of 20 mM 16, and a conversion of 44% could still be achieved at 0.0015 mM CoA thiol 18 (Supplementary Fig. SI.8a). Similarly, increasing the concentration of acetyl thiocholine iodide 16 from 20 to 100 mM in the presence of 0.0015 mM CoA thiol 18 resulted in an increased 61% conversion to the PseH product 17 (Supplementary Fig. SI.8b). These conditions represent a 1000-fold decrease in the level of Ac-CoA 5/CoA 18 previously required for PseH turnover 19 .
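One way to read these figures is as the number of acetyl transfers each co-factor molecule must mediate. The sketch below estimates this under two assumptions that are not stated in the text: that the reported relative conversions can be treated as the fraction of the 1 mM UDP-GlcNAc 4 converted to 17, and that the supplementary experiments used the same 1 mM substrate concentration. The function name is illustrative.

```python
def cofactor_turnovers(substrate_mM, conversion_pct, cofactor_mM):
    """Approximate acetyl transfers per co-factor molecule."""
    if cofactor_mM <= 0:
        raise ValueError("co-factor concentration must be positive")
    product_mM = substrate_mM * conversion_pct / 100.0
    return product_mM / cofactor_mM


# (label, conversion in %, co-factor in mM), values quoted in the text.
conditions = [
    ("0.15 mM Ac-CoA 5, 20 mM 16", 72, 0.15),
    ("0.15 mM CoA 18, 20 mM 16", 65, 0.15),
    ("0.0015 mM CoA 18, 20 mM 16", 44, 0.0015),
    ("0.0015 mM CoA 18, 100 mM 16", 61, 0.0015),
]

for label, conv, cof in conditions:
    n = cofactor_turnovers(1.0, conv, cof)
    print(f"{label}: ~{n:.0f} turnovers per co-factor")
```

Any value above one means the co-factor was recycled rather than consumed stoichiometrically. On this rough estimate, the lowest co-factor loadings correspond to several hundred cycles per CoA molecule.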
Optimised 'one-pot' multienzyme preparative synthesis of CMP-Pse5Ac7Ac 3. The vast reduction in co-factor requirement and cost for PseH turnover, allied to the optimisation of the PseB/C coupled transformation now made a 'one-pot' two-step multienzyme synthesis more economically viable and practical for production of activated CMP-Pse5Ac7Ac 3. To demonstrate, we completed the preparative scale synthesis and purification of CMP-Pse5Ac7Ac 3 using the optimised conditions starting from 90 mg UDP-GlcNAc 4 (Scheme 4). We utilised the purified C. jejuni enzymes PseB, PseC (in excess), PseH (using Ac-CoA 5 regeneration), PseG which hydrolyses the UDP group, and the Pse5Ac7Ac synthase PseI which condenses phosphoenolpyruvate (PEP) with the newly formed reducing terminus in the PseG product 19, to afford Pse5Ac7Ac 1 in one-pot over 12 h. Subsequently, the newly characterised soluble CMP-Pse5Ac7Ac synthetase PseF from A. caviae, was added to the mixture catalysing conversion to the activated Leloir glycosyl donor CMP-Pse5Ac7Ac 3.
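For orientation on scale, the 90 mg starting quantity can be converted to moles and to a theoretical maximum mass of product. This is a sketch under stated assumptions: UDP-GlcNAc 4 is taken as the free acid (average molecular weight about 607.35 g/mol; a salt form would lower the molar amount), and CMP-Pse5Ac7Ac 3 is taken as C22H34N5O15P (about 639.5 g/mol, consistent with the [M−H]− of 638.2 reported in this article). No isolated yield is implied.

```python
MW_UDP_GLCNAC = 607.35      # g/mol, assumed free acid
MW_CMP_PSE5AC7AC = 639.5    # g/mol, assumed C22H34N5O15P


def micromoles(mass_mg, mw):
    """Convert a mass in mg to micromoles."""
    return mass_mg / mw * 1000.0


def theoretical_max_mg(start_mg, mw_start, mw_product):
    """Product mass if every mole of starting material were converted."""
    return start_mg / mw_start * mw_product


def percent_yield(isolated_mg, start_mg, mw_start, mw_product):
    """Isolated yield as a percentage of the theoretical maximum."""
    return 100.0 * isolated_mg / theoretical_max_mg(start_mg, mw_start, mw_product)


start_umol = micromoles(90.0, MW_UDP_GLCNAC)
max_mg = theoretical_max_mg(90.0, MW_UDP_GLCNAC, MW_CMP_PSE5AC7AC)
print(f"Starting material: {start_umol:.0f} umol")
print(f"Theoretical maximum product: {max_mg:.1f} mg")
```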

Discussion
In order to optimise the in vitro reconstitution of the biosynthesis of the bacterial nonulosonic acid sugar Pse5Ac7Ac 1 we have explored the relationship between the transformations catalysed by the first two enzymes in the biosynthetic pathway from C. jejuni, PseB and PseC. Notably PseB catalyses an undesired secondary epimerisation, which poses a challenge for in vitro enzymatic synthesis of Pse5Ac7Ac 1 as the resulting epimeric product is no longer a substrate for PseC 20,26 , but rather for the PglE enzyme in the biosynthesis of UDP-diNAcBac, a precursor to N-linked glycoproteins in C. jejuni 25,27 . Although the enzymatic epimerisation reaction has been previously disclosed, the optimisation of the coupled PseB/C reaction to maximise flux through the Pse5Ac7Ac pathway was unexplored. We have demonstrated that deuterium wash-in experiments, monitored by ESI-LCMS, enable optimisation of the relative PseB and PseC enzyme concentrations for this transformation and thus maximise the desired conversion. It would be beneficial in further investigations to focus on determining the residues involved in the PseB catalysed epimerisation to reduce the need for excess PseC enzyme. Indeed, previous mutagenesis studies have highlighted PseB active site residues which are essential for the initial inversion and dehydration but seemingly play no role in the secondary epimerisation 26 , thus implying that rational mutagenesis studies may be used to eliminate undesired byproduct formation with minimal effect on the rate of the desired transformation. Furthermore, in the third step in Pse5Ac7Ac biosynthesis, PseH catalysed acetylation of the 4-amino group, we established a method for in situ regeneration of the expensive co-factor Ac-CoA 5 using acetyl thiocholine iodide 16 as an acetyl transfer reagent.
This advance significantly increases the economic viability of in vitro enzymatic synthesis of Pse5Ac7Ac derivatives, and eliminates the need for the multiple purification steps that are required when chemical acetylation is employed.
We showcased the benefits of these optimisation studies by combining PseB, PseC and PseH with the final two steps in the Pse5Ac7Ac pathway, PseG and PseI, in the process establishing standard conditions for large scale production and storage of the enzymes. The CMP-Pse5Ac7Ac synthetase PseF from H. pylori 26695 has previously 19 been purified, so we were curious as to why the majority of the expressed protein remained insoluble in our hands. Consideration of the construct's physicochemical parameters identified that, although not classified as hydrophobic (GRAVY score = −0.34) 33 , the resulting protein sequence was classified as unstable in vitro with an instability index of 45.3. Therefore our attention turned to other CMP-Pse5Ac7Ac synthetases such as the enzyme encoded by Cj1311 from C. jejuni 81-176 34 and a putative enzyme in A. caviae, which both share similar sequence identity with HpPseF, 37.2% and 35.9% respectively (aligned in Clustal Omega). All three proteins lack transmembrane regions 35 and are predicted to be cytoplasmic 36 , which is concordant with their negative GRAVY scores (CjPseF −0.32 and AcPseF −0.22). However, considering their function as carbohydrate-active enzymes, they may be membrane associated, with Pse5Ac7Ac 1 activation occurring in the cytoplasm near the inner membrane prior to utilisation by Pse5Ac7Ac glycosyltransferases. The instability index of CjPseF was calculated as 53.6 and hence it is predicted to be even less stable than the H. pylori counterpart. Therefore we focussed on AcPseF 21 as it has the lowest instability index (40.4) and successfully purified soluble protein, obtaining a yield of 13 mg L −1 . Preliminary characterisation of AcPseF using SEC-MALS confirmed it existed predominantly (> 99%) as a homodimer in solution, and ESI-LCMS studies confirmed its activity as a bona fide CMP-Pse5Ac7Ac synthetase, with further biochemical characterisation a subject of future work.
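The GRAVY score used above is the mean Kyte-Doolittle hydropathy of all residues in a sequence, with negative values indicating an overall hydrophilic protein. The following minimal sketch computes it and ranks the three PseF homologues by the instability indices quoted in the text. The peptide passed to `gravy` is a made-up example, not a PseF sequence, and the instability index itself is not recomputed here because it requires a dipeptide weight table.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(KD[aa] for aa in seq) / len(seq)


# Instability indices as reported in the text (not recomputed here).
INSTABILITY = {"HpPseF": 45.3, "CjPseF": 53.6, "AcPseF": 40.4}


def rank_by_stability(indices):
    """Protein names ordered from lowest to highest instability index."""
    return sorted(indices, key=indices.get)


print(f"GRAVY of toy peptide: {gravy('MKTAYIAKQR'):.2f}")  # hypothetical peptide
print("Predicted most to least stable:", rank_by_stability(INSTABILITY))
```

By the commonly used ProtParam convention, an instability index above 40 is taken to predict an unstable protein, so AcPseF sits only marginally above that threshold while remaining the best of the three candidates.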
Additionally, CD studies of the protein indicated it may also be amenable to crystallisation, with over 86% secondary structure, consistent with the computational data obtained for homologous HpPseF 37 . AcPseF also shows 27% sequence identity to the CMP-Neu5Ac synthetase from Neisseria meningitidis (NmCNS), for which a 2 Å X-ray crystal structure has been solved with the substrate analogue CDP present in the active site and Neu5Ac 2 docked 36 . Unsurprisingly, alignment of these sequences alongside HpPseF and CjPseF, in addition to CMP-Kdo synthetase homologues 22,23,38,39 , revealed conservation of several key residues, such as those involved in binding to the cytosine moiety. However, differences in the sequence between AcPseF and NmCNS are also apparent at residues predicted to bind the NHAc substituent at C5, which is equatorial in Neu5Ac 2 as opposed to axial in Pse5Ac7Ac 1, and at the residues which are proposed to bind the C6 propyl chain in Neu5Ac 2. These differences in sequence may in part account for the altered specificity for carbohydrate substrates between these enzymes. Importantly, we further demonstrated that the soluble AcPseF enzyme was suitable for a "one-pot" multienzymatic synthesis with the biosynthetic enzymes from C. jejuni, which enabled the preparative synthesis of purified CMP-Pse5Ac7Ac 3 from UDP-GlcNAc 4. With multimilligram quantities of the activated Leloir glycosyl donor now in hand and practically
A. caviae PseF characterisation 40 . Aliquots of AcPseF were further purified by gel filtration in 25 mM Tris-HCl pH 7.3 buffer containing 50 mM NaCl and 2 mM MgCl 2 . Following SDS-PAGE analysis, pure protein was extracted from bands at the expected PseF construct molecular weight and subjected to trypsin digest. The resultant peptides were analysed by MALDI-MS and MS/MS, and the spectral data were compared to the Mascot database to identify the protein as AcPseF (Supplementary Fig. SI.4).
Following gel filtration, aliquots of AcPseF were dialysed into 25 mM sodium phosphate buffer pH 7.4 and analysed by circular dichroism at 30 °C, from 180 to 260 nm at a final concentration of 0.2 mg mL −1 . Under these conditions 86.5% of AcPseF was predicted to have a fixed secondary structure, suggesting that it is amenable to crystallisation studies (Supplementary Fig. SI.5). Secondary structure predictions were made from circular dichroism data using K2D3 (http://cbdm-01.zdv.uni-mainz.de/~andrade/k2d3/).
Aliquots of AcPseF were dialysed into 20 mM Tris pH 7.8 buffer containing 50 mM NaCl and 2 mM MgCl 2 and concentrated to 4 mg mL −1 . 100 µL samples were applied to a Superdex S200 size-exclusion column (G.E. Healthcare) pre-equilibrated with the same buffer, attached to a system comprising a Wyatt HELEOS-II multi-angle light scattering detector and a Wyatt rEX refractive index detector linked to a Shimadzu HPLC system (SPD-20A UV detector, LC20-AD isocratic pump system, DGU-20A3 degasser and SIL-20A autosampler). A 2.5 mg mL −1 BSA sample was run as a standard and all data were analysed using Astra V software (Supplementary Fig. SI.6).
Reaction mixtures containing 130 μg mL −1 AcPseF, 0.5 mM Pse5Ac7Ac (Sussex Research), 1.5 mM CTP, 1 mM MgCl 2 , 50 mM NaCl and 25 mM sodium phosphate, pH 7.4, were incubated at 25 °C, alongside control reactions containing the reaction mixture as described but lacking either Pse5Ac7Ac 1, CTP or AcPseF. Reactions were analysed by ESI-LCMS, and a peak indicating the formation of CMP-Pse5Ac7Ac 3 ([M−H] − 638.2) was only observed in the reaction containing all components, thus suggesting that AcPseF functions as a CMP-Pse5Ac7Ac synthetase. However, a peak corresponding to Pse5Ac7Ac 1 was also observed, even after 6.5 h, indicating that the reaction had not gone to completion and/or that hydrolysis of CMP-Pse5Ac7Ac 3 was occurring (Supplementary Fig. SI.7).

Data availability
All data generated and/or analysed in this study are included in this published article (and its Supplementary Information).