Introduction

Bacterial microcompartments (BMCs) are protein-walled metabolic compartments that sequester pathways for the catabolism of various carbon sources1,2. The protein shells of BMCs are formed through the interaction of thousands of copies of different proteins belonging to the BMC protein family3 to produce a pseudo-icosahedral container of around 150 nm in diameter. The enzyme contents of BMCs are directed to the interior of the compartment as it forms, through interactions between short peptides appended to their functional domains and the BMC shell proteins4,5.

BMCs are found in species with diverse ecological niches, from pathogenic strains of Escherichia coli and Salmonella6,7, to the cellulose degrading Clostridium phytofermentans8, and marine bacteria including Haliangium ochraceum and Rhodospirillum rubrum9. As a consequence of this wide species distribution, BMCs have diverged in substrate specificity and the enzyme activities that they encapsulate to be able to utilise carbon sources such as 1,2-propanediol7,10, ethanolamine11,12 fucose/rhamnose13, and choline14. A common feature of the metabolic pathways found within BMCs is the production of a toxic aldehyde intermediate in the breakdown of their substrates6,10,15. In the well-characterised propanediol utilisation BMC, a vitamin B12 dependent lyase enzyme acts on the 1,2-propanediol substrate to produce propionaldehyde10; while the lyase in the ethanolamine utilisation BMC produces acetaldehyde and ammonia as its products16. A recently discovered family of BMCs use a glycyl-radical enzyme (GRE) to break down their primary substrates producing an aldehyde intermediate, such as the production of acetaldehyde from the choline utilisation BMC found in Desulfovibrio desulfuricans14 and the C. phytofermentans fucose/rhamnose BMC13. To detoxify the aldehyde produced by these enzymes, and to ultimately generate ATP through substrate level phosphorylation, all BMCs identified to date possess an acylating-aldehyde dehydrogenase enzyme that produces an acyl-CoenzymeA and NADH15.

The aldehyde dehydrogenase enzymes (AldDH) encoded with BMC loci are invariably accompanied by an alcohol dehydrogenase enzyme17, the activity of which appears to be necessary to recycle the NADH produced by the activity of the aldehyde dehydrogenase. Genetic knockout of the aldehyde dehydrogenase in the propanediol17 and ethanolamine18 BMC loci leads to reduced growth on their respective substrates. These results imply that BMCs contain private NAD+/NADH cofactor pools that are not exchanged with the bulk cytosol of the host cell and explain the requirement for both the aldehyde and alcohol dehydrogenase enzymes to be present within the BMC. As a consequence of the need to maintain the balance of the cofactor pool within the BMC, two substrate molecules are required to produce one acyl-CoA molecule, with the second substrate molecule used to oxidise the NADH produced by the AldDH through the action of the alcohol dehydrogenase to produce alcohol from the aldehyde (Supplementary Fig. 1). This is consistent with the production of almost equimolar amounts of the carboxylic acid product from substrate-level phosphorylation via an acyl-phosphate intermediate, and the alcohol product of the alcohol dehydrogenase13.

The aldehyde dehydrogenase superfamily are well studied and are active against a range of aldehydes, including the short chain fatty aldehyde products of the lyase enzymes in BMCs19,20,21. These enzymes have a common architecture, with a Rossman-fold nucleotide-binding domain that positions the NAD(P)+ cofactor required for hydride transfer from the aldehyde substrate22; the catalytic domain has a substrate-binding tunnel with a catalytic cysteine residue and glutamic acid residue that acts as a general base in the hydrolysis of the acyl-enzyme intermediate23. The acylating aldehyde dehydrogenase enzymes do not possess the glutamic acid general base residue, presumably because the CoA cofactor acts to resolve the acyl-enzyme intermediate to produce the acyl-CoA product via a bi-uni-uni-uni-ping-pong mechanism24. Although a recent study of the kinetics of the PduP enzyme from Lactobacillus reuterii with its native substrate presented a model of CoA binding to the enzyme in the same binding pocket as NAD+, there are no experimental structures of acylating-aldehyde dehydrogenase enzymes in complex with CoA in the PDB25.

Here, we study the kinetics and substrate specificity of an AldDH enzyme from the Clostridium phytofermentans fucose/rhamnose utilisation BMC, Cphy1178. We present the X-ray crystal structure of an N-terminal truncation of this enzyme and show by native mass spectrometry that the quaternary structure of the protein is a tetramer. We have determined the structure of the protein in complex with its cofactors NAD+ and CoA and show that the adenine nucleotides of these co-factors adopt different conformations within the Rossman fold domain.

Table 1 The catalytic activity of Cphy1178 against aldehyde substrates.

Results

Aldehyde dehydrogenase activity of Cphy1178

Using recombinantly produced protein, we tested the NAD(P)+ dependent in vitro activity of the putative aldehyde dehydrogenase enzymes from the three Clostridium phytofermentans BMC loci (Cphy1178, Cphy1428, Cphy2642) and the Clostridium difficile ethanolamine utilisation locus (CD630_19170). The full-length recombinant enzymes produced for this study were unstable in various common buffer systems and aggregated rapidly upon purification; therefore, we produced truncations to remove the proposed BMC localisation sequences (Supplementary Fig. 2). These truncated enzymes were more stable in solution than their full-length counterparts and were used in subsequent activity assays.

By monitoring the hydride transfer step of the aldehyde dehydrogenase enzyme reaction by spectrophotometric measurement of the production of NADH in the presence of aldehyde substrates, we were only able to detect NAD+ dependent activity for Cphy1178. The other enzymes displayed no distinguishable activity with either NAD+, or NADP+, for the substrates used in this study. This may be due to issues with the instability of these proteins outwith their native environment within the BMC, or a requirement for some additional factor not present in our assays. The Cphy1178 aldehyde dehydrogenase displayed negligible activity with glyceraldehyde as a substrate. However, we were able to detect activity against short-chain fatty aldehydes with up to seven carbon atoms, although longer chain aldehydes were difficult to assay reliably due to their insolubility in biological buffers and we were unable to determine accurate kinetic parameters for C7 and higher aldehydes (Table 2 and Supplementary Fig. 3). Cphy1178 enzyme displayed the lowest KM and the highest kcat/KM values for the substrates tested. This data is consistent with the hypothesis that this protein acts as a propionaldehyde dehydrogenase within the C. phytofermentans fucose/rhamnose BMC. It is noteworthy that this enzyme displays substrate inhibition at high concentrations of aldehydes; interestingly this is more marked with aldehydes with an odd number of carbon atoms. This substrate inhibition is consistent with previous reports on the activity of yeast ALDH2 enzyme19.

Table 2 Primers for constructs used in this study.

Our observation that this enzyme displays good activity against longer chain aldehydes offers promise for engineering BMCs to produce alkanes and make use of the native aldehyde dehydrogenase enzymes found within them26. Based on identification of the catalytic cysteine by sequence alignment, mutagenesis of C269A completely abolished enzymatic activity, as did the mutagenesis of the putative base (H387)27. The structures of these two mutants were essentially unchanged compared to the wild-type enzyme, indicating that changes to their catalytic activity were not a consequence of protein misfolding (see below for detail of the structure of this protein).

It is important to note, that our spectrophotometric assay only monitors the production of NADH which occurs in the first step of the multistep reaction. In order to confirm the acyl-transfer from the acyl-enzyme intermediate to CoA we used mass spectrometry to detect the final product. This MS assay was performed using propionaldehyde as the substrate, with reaction mixtures containing NAD+ and CoA. LC-MS analysis confirmed the presence of the propionyl-CoA product (Fig. 1) at a mass consistent with the monoisotopic mass of the standard used as a control.

Figure 1: Detection of propionyl-CoA assay product by LC-MS.
figure 1

Main Figure, Extracted ion chromatogram at 824 m/z. Top, the Cphy1178(20–462) assay mixture containing (50 nM Cphy1178(20–462), 75 μM NAD+, 100 μM CoA, 10 mM propionaldehyde, 100 mM Tris.HCl pH 8.0, 100 mM KCl); middle, 10 ug propionyl-CoA control; bottom, negative control. Insert, mass spectrum of the Propionyl-CoA obtained by summing the spectra between 4.8 to 5.8 minutes. Top, Cphy1178(20–462) assay; bottom, propionyl-CoA control. (predicted monoisotopic mass [M+H]+ C24H41N7O17P3S; 824.1487 Da).

Structure of Cphy1178(20–462)

To understand the co-factor binding properties of Cphy1178 we determined the structure of an N-terminal truncation, comprising residues 20–462, with bound NAD+ cofactor to 1.6 Å resolution. In these structures, the asymmetric unit contains a single chain with residues 28 to 462 visible in the electron density map. The overall structure corresponds to other members in the aldehyde dehydrogenase family (Fig. 2A), with a catalytic domain (residues 238–427, highlighted pink), a cofactor-binding domain with the Rossman-fold type nucleotide binding architecture (residues 28–108, 127–237, and 428–447, highlighted grey), and an oligomerisation domain (residues 109–126 and 448–462, highlighted green). The catalytic and nucleotide binding domains come together to form an extended nucleotide and ligand-binding tunnel that is open at both ends, with the catalytic cysteine (C269) at the centre of the tunnel and the NAD+ cofactor binding at one side (Fig. 2B). The ligand-binding tunnel is about 5 Å in diameter at its widest point and spans 16 Å from the solvent exposed entry point to the catalytic cysteine and is lined with hydrophobic residues. The tunnel is long enough to accommodate up to a C10 aldehyde, although in vivo the enzyme is unlikely to encounter such a substrate. The structures of Cphy1178 were also solved with mutations in active site residues, C269A with bound CoA (1.77 Å resolution) and H387A (2.08 Å resolution) to ensure that mutation of these residues did not destabilise the structure of the protein. Both of these proteins display essentially identical structures to the wild-type protein. In the structure of the Cphy1178C269A mutant the position of the loop between 335 and 339 was not clear in the electron density maps, so was omitted from the final structure refinement. This structure was determined with CoA bound in the active site and is discussed below in the section on cofactor binding.

Figure 2: Structure of Cphy117820–462.
figure 2

(A) Secondary structure cartoon of a monomer of Cphy117820–462 showing bound NAD+ as spheres coloured yellow for carbon, red for oxygen, blue for nitrogen and orange for phosphorus. The catalytic domain is highlighted in pink, rossman fold nucleotide binding domain in grey and oligomerisation domain in green. (B) Active site tunnel and nucleotide binding pocket. The surface of Cphy117820–462 is shown coloured by electrostatic potential (blue for positive, red for negative) and the secondary structure cartoon is shown in blue. The catalytic cysteine and histidine residues are shown as sticks. Bound cofactors, NAD+ and CoA are shown with yellow and pink carbons respectively. (C) Cphy1178 forms a dimer of dimers quaternary structure. Two subunits are shown as cartoons, with NAD+ shown as spheres to show the relative orientation of the nucleotide-binding cleft; two further subunits are shown as grey surface representations.

Cphy1178 forms a tetrameric quaternary structure generated by crystal symmetry, with molecules related by D2 symmetry (Fig. 2C). A tetrameric assembly is also observed when analysing Cphy1178 by native (nondenaturing) electrospray ion-mobility mass spectrometry (IM-MS)28; using this technique a single charge state distribution is observed which corresponds to tetrameric Cphy1178 in the +26 to +29 charge states (191.3 kDa assembly mass, in agreement with the predicted molecular mass of 4 × 47,452 Da; Fig. 3A). Ion-mobility of the Cphy1178(20–426) tetrameric assembly reveals that the complex exists as a single conformer with a collision cross section (CCS) of ~11,000 Å2 (Fig. 3A). This gas-phase value is in agreement with the calculated CCS of the tetrameric structure determined by crystallography (Cphy117828–426; CCS calculated as 9599 Å2 by IMPACT v 0.9.1); suggesting that the overall solution architecture of the enzyme is retained during MS analysis. This tetramer also corresponds to the most probable stable arrangement proposed by the PISA server29 and is consistent with previously published structures of this family of enzymes23.

Figure 3: Native mass spectrometry analysis of Cphy1178.
figure 3

(A) Ion-mobility MS analysis of recombinant Cphy1178 displays a charge state distribution consistent with the +26 to +29 ions of a tetrameric assembly of Cphy1178 (Top, 191.3 KDa). Ion-mobility data, displayed as a greyscale heat map (bottom) reveals that each charge states exhibit a single drift time. When corrected for charge, the drift times of all charge states are all consistent with a collision cross section of 11,000 Å2 (hashed red line). (A , Insert) A topological cartoon of the Cphy1178 tetramer using ball-and-stick representations (ball, monomer; stick, interface). Interface areas in Å2 (black) and number of salt bridges (red) are marked; and calculated collision cross section (Ω) in Å2 is given. (B) Gas phase dissociation of the Cphy1178 using collision induced dissociation leads exclusively to the appearance of monomeric Cphy1178; a 1 KDa adduct is also observed (*) (C) Partial disruption of the Cphy1178 assembly by solution-phase dissociation prior to MS analysis (using 40% v/v MeOH) leads to the appearance of both monomeric and dimeric subcomplexes.

From the crystal structure it is apparent that intersubunit interactions are primarily mediated by contacts between the oligomerisation domain and the catalytic domain, which produces a dimer with 2350 Å2 buried surface per monomer (out of a total solvent accessible surface of 18,229 Å2) and is stabilised by 15 hydrogen bonds and 14 salt bridges (interface between red/blue and grey/grey monomers in Fig. 2C). A second dimerisation interface is found between symmetry related oligomerisation domains, this is stabilised by 3 hydrogen bonds and 4 salt bridges and buries 1148 Å2 of surface area per monomer (red/grey and blue/grey monomers in Fig. 2C). These dimers are further stabilised by contacts between an α-helix in the catalytic domain (residues 395–407) and the oligomerisation domain (500 Å2 buried SA, one hydrogen bond and two salt bridges); the C-terminus of the protein is buried within this interface and the terminal carboxylic acid group is visible in the electron density map.

Further evidence for the existence of two interfaces of differing strengths within the Cphy1178 tetramer came from topology-mapping mass spectrometry studies30,31. Collision induced dissociation (CID) of the protein tetramer led to the appearance of a charge state distribution consistent with monomeric Cphy1178 (Fig. 3B) – an observation consistent with the typical mechanism of dissociation using this technique (i.e. ejection of a highly charged monomer from the assembly)32. In contrast, titration of increasing organic solvent to the protein solution prior to MS analysis, resulted in the appearance of both monomer and dimer charge state distribution in the mass spectrum (Fig. 3C). Solution-phase dissociation of protein complexes is known to occur via dissociation of the weakest interface in the complex first. Therefore using this technique, subcomplexes of the protein assembly can be formed, their composition determined by MS, and relative interface binding strengths can be inferred33. We interpret our observation as cleavage of the second smaller interface, whilst the larger interface (area 2350 Å2; which includes extensive salt-bridge interactions) remains intact.

Co-factor binding

The structure of Cphy117820–462 was determined with NAD+ bound in the nucleotide binding cleft and active-site pocket after soaking apo-crystals with crystallization solution supplemented with 10 mM NAD+. The ligand displayed excellent electron density (Fig. 4A) and was refined to an occupancy of 0.83 in the final structure using phenix.refine. The NAD+ is found in the hydride transfer conformation34,35 and participates in 4 direct hydrogen bonding interactions with the enzyme and a number of other interactions mediated by ordered solvent molecules. The adenine ring does not directly participate in any hydrogen bonding interactions, but is positioned between Leucine 198 and Valine 221. The N7 of the adenine ring accepts a hydrogen bond from an ordered solvent molecule that is also coordinated by an oxygen atom in the adenosine phosphate group and the backbone nitrogen of Valine 221. Both O2 and O3 of the adenosine ribose interact with an ordered water molecule that bridges them to the carbonyl oxygen of Proline 161. Similarly, two oxygen atoms from the phosphate groups are bridged to Histidine 162 via a solvent molecule. Glutamine 357 participates in hydrogen bonding interactions with O2 and O3 of the nicotinamide ribose. The nicotinamide ring is positioned next to the catalytic Cysteine 269 through hydrogen bonding interactions between N7 and O7 and the peptide backbone of Isoleucine 433. The absence of enzymatic activity in the presence of the NADP+ co-factor is explained by the presence of a histidine and proline residue blocking the position that the 2′-phosphate adopts in structures of aldehyde dehydrogenase enzymes determined with NADP+ bound (Supplementary Fig. 4A/B).

Figure 4: Electron density maps of bound NAD+ and CoA cofactors.
figure 4

(A) Structure of NAD+ soaked Cphy117820–462. NAD+ and protein residues within 4 Å shown as stick representations and 2mFo-DFc map shown as a blue mesh contoured at 1σ. (B) Structure of CoA soaked Cphy117820–462(C269A) displayed as in (A).

The structure of Cphy117820–462(C269A) was determined using crystals soaked with 10 mM CoA. The cofactor showed good electron density for the adenosine group, but poor density for the pantothenic acid group, particularly the terminal region with the sulphydryl group in the active site, which has no visible electron density (Fig. 4B). A comparison of the structure of the NAD+ and CoA bound forms of the protein show that the structures are essentially identical, with an rmsd Cα of 0.24 Å over 431 residues. The two structures differ only in the position of the loop between residues 215 and 223, where P219, G220 and V221 are shifted toward the adenine ring in the NAD+-bound structure (Supplementary Fig. 5).

In contrast to the models of CoA binding to the Pseudomonas DmpF aldehyde dehydrogenase36 and Lactobacillus reuteri PduP aldehyde dehydrogenase25 our crystal structure shows distinct adenine-binding modes for NAD+ and CoA (Fig. 4B and Supplementary Fig. 4C–F). The adenine ring is flipped by 180 degrees relative to its position in the NAD+ structure and makes hydrogen-bonding contacts with both the side chain and carbonyl oxygen of Asparagine 160, and the carbonyl oxygen of Threonine 134, placing the phospho-ribose group above both of the nucleotide binding pockets. The phosphate groups between the nucleotide and pantothenate group are held in place by hydrogen bonds formed between the side-chain nitrogen atoms of Histidine 162 and Arginine 318. While there is clear electron density for the nucleotide and phosphate-proximal portion of the pantothenate group of the CoA, the β-mercapto ethylamine group is not visible, indicating that the distal arm of CoA is flexible in the absence of thio-acyl substrates bound to the catalytic cysteine for acyl transfer.

Discussion

Substrate Discrimination

Our initial enzyme assays indicate that Cphy117820–426 is active against a range of short chain aldehydes (C2–6). This observation can be rationalised by analysis of the structure of the enzyme, which reveals that the substrate-binding cleft of Cphy117820–426 is lined with hydrophobic residues and can comfortably accommodate an aldehyde substrate with up to six carbon atoms and could potentially fit a chain of up to nine or ten carbon atoms in the substrate tunnel (Fig. 5A). A phenyl alanine residue (F423) found at the solvent exposed side of the tunnel appears in multiple conformations in the crystal structure, this may indicate that it acts to control substrate access at this entry site. The tunnel is enlarged close to the catalytic cysteine, which may allow substrate rearrangement during acyl-transfer. The lack of activity against glyceraldehyde is explained by the presence of hydrophobic residues in this region (I421 and I433), which would make unfavourable interactions with the hydroxyl groups of this substrate. Cphy1178 is present in the C. phytofermentans fucose/rhamnose BMC, which has a fucose aldolase enzyme that produces lactaldehyde (2-hydroxypropanal) as its product, alongside a propanediol dehydratase that produces propionaldehyde13. Comparison of the ligand-binding tunnel of Cphy1178 with the lactaldehyde dehydrogenase enzyme from Escherichia coli (PDBID: 2HG2), shows that the two isoleucine residues in the substrate-binding tunnel of Cphy1178 are replaced by a glutamic acid and a histidine residue to accommodate the polar hydroxyl group of its lactaldehyde substrate23. These key amino acid differences and the preferential activity of Cphy1178 against short-chain fatty aldehydes indicate that propionaldehyde is the most likely natural substrate for Cphy1178.

Figure 5: Cphy117820–462 active site architecture.
figure 5

(A) Surface slice through Cphy117820–462 showing the position of NAD+ and modelled hexanal substrate. Protein residues and substrates shown as stick representations with different coloured carbon atoms, the active-site histidine is shown in red for clarity. The ligand/substrate-binding tunnel is open at both ends and accommodates the NAD+/CoA cofactors at one end and the aldehyde ligand at the other. The aldehyde-binding portion of the tunnel is lined with hydrophobic residues and is gated by F423, which is present in multiple conformations in the crystal structure. (B) Binding of CoA in the active site showing distances between H387 and C269 and the sulphur of the CoA and E357. CoA shown in magenta.

Acylation

We have shown that h1178 shows a higher level of activity with propionaldehyde as a substrate compared to other aldehydes tested and is capable of acyl-transfer to CoA using this substrate. We have also determined the structure of this enzyme with CoA bound in the active site. The acylating aldehyde dehydrogenase family enzymes do not possess the glutamic acid general-base that is required to activate the catalytic cysteine and to deprotonate water for deacylation of the acyl-enzyme intermediate37. In Cphy1178 this residue is replaced by an alanine (A235) (Fig. 5A) and is a small hydrophobic residue in all of the acylating aldehyde dehydrogenases in this study (Supplementary Fig. 2). Mutagenesis of a strictly conserved Histidine residue (H387), that is within 2.5 Å of the catalytic cysteine (Fig. 5B) completely abolishes activity of the enzyme. This residue is not found in any of the non-acylating aldehyde dehydrogenase enzymes20. The proximity of this residue to the catalytic cysteine is incompatible with the formation of the correct geometry for deprotonation of water for resolution of the acyl-intermediate to produce a carboxylic acid product34. In light of our biochemical data showing that the H387A mutant is inactive, we suggest that this residue acts as a base to activate the catalytic cysteine and to stabilise the acyl-transfer intermediate between the enzyme and CoA cofactor. Due to the proximity of a glutamic acid residue (E357) to the cysteine group of CoA (Fig. 5B) and the strict conservation of this residue, we propose a model in which this residue activates the CoA cofactor for acyl transfer from the acyl-enzyme intermediate (Fig. 6). In our reaction scheme H387 deprotonates the catalytic cysteine to allow nucleophilic attack on the substrate to form tetrahedral intermediate. Consistent with previous reports the oxyanion would be N13834. After hydride transfer to NAD+, the NADH product leaves the cofactor-binding pocket in the rate-limiting step, followed by entry of CoA, which is deprotonated by E357 and subsequently attacks the acyl-enzyme intermediate to produce the thioester product and free enzyme (Fig. 6). The extra proton on E357 is likely transferred to bulk water via a proton-relay system comprising the conserved residues N138 and K9438.

Figure 6: Proposed catalytic mechanism of Cphy1178.
figure 6

The catalytic cycle proceeds via a bi-uni-uni-uni-ping-pong mechanism involving acylation of the catalytic cysteine (C269) and hydride transfer to NAD+, followed by trans-thioesterification between the enzyme and CoA to produce the Acyl-CoA product.

In this study we have determined the activity of Cphy1178 against a range of aldehyde substrates and present the first structure of CoA bound to an acylating aldehyde dehydrogenase and suggest a role for a conserved histidine found in place of the glutamic acid that acts as the general base in non-acylating aldehyde dehydrogenases. The capture of acyl-transfer intermediates in future crystal structural studies will shed further light on the mechanism of acyl transfer in this important class of enzymes that are essential to the function of bacterial microcompartments.

Materials and Methods

Cloning

Genes encoding aldehyde dehydrogenase enzymes from the Clostridium phytofermentans and Clostridium difficile bacterial microcompartments were amplified by PCR using the primers detailed in Table 2 using the KOD HotStart DNA polymerase as described in the user protocol. PCR products were run on a 0.8% Agarose/TAE gel and visualised by SybrSafe staining (Life Technologies), bands of the appropriate size for each construct were excised and subjected to gel-cleanup using a Qiagen kitand following the protocol described in the user manual. Purified PCR products were digested with appropriate restriction enzymes (Fermentas) for ligation into the pET28a vector (Novagen). Site directed mutagenesis was performed by the QuickChange method (Agilent) for the Cphy1178 active site mutants. All constructs were sequence confirmed by Sanger sequencing at the Edinburgh Genomics facility.

Protein production

Plasmids for recombinant aldehyde dehydrogenases were transformed into chemically competent Escherichia coli BL21(DE3) cells as follows, 1 μl of plasmid (100 ng/ul) was mixed with 50 μl cells and incubated on ice for 30 minutes. Cells were subsequently subjected to heat shock for 45 seconds at 42 °C and returned to ice for 2 minutes. 200 μl LB media was added to cells before incubation at 37 °C for 1 hour. Cells were then plated onto LB agar supplemented with 10 μg/ml kanamycin and incubated at 37 °C overnight.

Single colonies of transformed cells were transferred to 500 μl LB medium supplemented with 10 ug/ml kanamycin and incubated with shaking to OD600 at 37 °C. Cells were induced with 1 mM IPTG and grown overnight at 18 °C. Cells were harvested by centrifugation at 4,000 × g for 30 minutes, pelleted cells were washed in PBS and pelleted by centrifugation at 4,000 × g for 15 minutes.

Protein purification

Purification of Hexa-histidine tagged proteins

Cells expressing hexa-histidine tagged proteins were resuspended in 10 ml/g (wet weight) buffer HisA (50 mM Tris.HCl, pH 8.0, 500 mM NaCl, 50 mM imidazole). Cells were lysed by sonication on ice, with pulses of 10 μm amplitude for 10 seconds with 10 seconds between pulses for 5 minutes. The resulting cell lysate was clarified by centrifugation at 35,000 × g for 30 minutes and the resulting supernatant was filtered with a 0.45 μm syringe filter (Sartorius). Clarified lysate was loaded onto a 5 ml HisTrap column (GE Healthcare) equilibrated with buffer HisA on an Åkta FPLC system (GE Healthcare). Unbound protein was washed from the column with HisA until the A280 returned to the baseline value. His6-tagged protein was eluted from the column with a step gradient of 5 column volumes of 50% HisA/ 50% HisB (50 mM Tris.HCl, pH 8.0, 500 mM NaCl, 500 mM Imadazole), followed by 5 column volumes of 100% HisB. Protein fractions were analysed by 15% SDS-PAGE to identify the fractions containing the protein of interest. These fractions were loaded onto an S200 16/60 size-exclusion gel-filtration column (GE Healthcare) equilibrated with buffer GF (50 mM Tris.HCl, pH 8.0, 150 mM NaCl) on an Åkta FPLC system. Fractions corresponding to recorded A280 peaks were analysed by SDS-PAGE and those containing the protein of interest were pooled for subsequent experiments. A protein purification summary is shown in supplementary table 1.

Purification of untagged proteins

Cells expressing untagged proteins were resuspended in 10 ml/g (wet weight) buffer Q-A (50 mM Tris.HCl, pH 8.0) and lysed and clarified as detailed above for His6-tagged proteins. Clarified lysate was loaded on a 12 ml Q-sepharose column (GE Healthcare) equilibrated with buffer Q-A. Unbound protein was washed from the column with buffer Q-A until the A280 returned to the baseline value. Bound proteins were eluted from the column with a 0–80% linear gradient of 25 column volumes of buffer Q-B (50 mM Tris.HCl, pH 8.0, 1 M NaCl), followed by 5 column volumes of 100% Q-B. Protein fractions were analysed by 15% SDS-PAGE to identify those containing the protein of interest. Fractions containing the protein of interest were pooled and subjected to size-exclusion gel-filtration as described above for His6-tagged protein. Fractions containing protein of interest were pooled for subsequent experiments. A protein purification summary is shown in supplementary table 1.

Aldehyde dehydrogenase assay

Kinetic assays were performed using a multimode plate reader (Molecular Devices, M5), in flat bottomed 96 well plates (Corning) at 21 °C. Each 300 μl reaction contained 100 mM Tris.HCl, pH 8.0; 0.66 mM NAD+; 100 mM KCl; 10 mM 2-mercaptoethanol; 100 nM protein; and various concentrations of aldehyde substrates, using both 100 mM and 10 mM stock solutions to increase assay accuracy. The enzyme activity was monitored by measuring the formation of NADH at 340 nm using an absorption coefficient of 6.22 mM−1 cm−1. Enzyme activity is expressed as mM NADH produced per second. Steady state kinetic data, obtained with five technical repeats, were analysed by non-linear regression to the Michaelis-Menten equation using GraphPad Prism. Datasets for these assays are available at http://doi.org/10.6084/m9.figshare.2067375.

Detection of Propionyl-CoA product formation by LC-MS

To confirm formation of propionyl-CoA, assay mixtures (50 nM Cphy1178(20–462) (reduced with 4 times excess 2-mercaptoethanol and subsequently dialysed into 4000 times excess buffer GF), 75 μM NAD+, 100 μM CoA, 10 mM propionaldehyde, 100 mM Tris.HCL pH8.0, 100 mM KCl) were prepared and incubated at room temperature for 2 hours before analysis by LC-MS. Analysis was performed on an ESI Synapt-G2 Q-ToF mass spectrometer equipped with an Acquity UPLC system. For chromatography, samples were separated on a reverse phase Kinetixs C18 50 × 2.1 mm column (Phenomenex), using a gradient of 5 to 95% methanol (0.1% formic acid) over 12 minutes and a flow rate of 250 ul/min. For detection of propionyl-CoA, single ion monitoring was employed with the mass resolving quadrupole set to 824 m/z. The resulting mass spectra were calibrated by applying a single point calibration using Leucine Enkephalin as a reference.

Protein mass spectrometry

All mass spectrometry (MS) experiments were performed on a Synapt G2 ion-mobility equipped Q-ToF instrument (Waters). LC-MS experiments were performed using an Acquity UPLC equipped with a reverse phase C4 Aeris Widepore 50 × 2.1 mm HPLC column (Phenomenex) and a gradient of 595% MeOH (0.1% Formic Acid) over 10 minutes was employed. Samples were typically analysed at 5 μM, and data analysis was performed using MassLynx v4.1 and MaxEnt deconvolution.

For native MS and ion-mobility mass spectrometry (IM-MS) experiments, nano-ESI ionisation was performed using a Nanomate robot operating in infusion mode (Advion Biosciences). Immediately prior to analysis, protein samples were buffer-exchanged into 100 mM ammonium acetate (pH 7.0) using micro BioSpin 6 columns (Biorad). Instrument parameters were carefully tuned to preserve protein complexes and a backing pressure of 4 mbar was used. For IM-MS helium was used as the drift gas. Typically, the IMS wave velocity was set to 300 m/s; wave height to 15 V; and the IMS pressure was 1.8 mbar. For collision cross section determination, IM-MS data was calibrated using denatured equine myoglobin and data was processed using Driftscope v2.5 and MassLynx v4.1 (Waters Corp., UK). For gas-phase disassembly experiments, collision induced dissociation (CID) was employed by applying a Trap collision cell voltage of 75 to 100 V. For solution phase disassembly experiments, The protein tetramer was disrupted in solution by addition of 40% methanol (v/v) prior to MS analysis. Theoretical collision cross sections (CCS) were calculated from PDB files using IMPACT software v. 0.9.1 39.

Protein crystallisation and data collection

Purified Cphy1178(20–462) and the C269A and H378A mutants were concentrated to between 8-12 mg/ml in 30 mM NaCl, 50 mM Tris.HCl pH8.0 and crystallised by sitting drop vapour diffusion in drops of 2 μl protein plus 2 μl crystallisation solution, over 1 ml of the latter. Crystals were obtained in 0.1 M sodium acetate pH 4.7, 1.8 M ammonium sulphate and grew as bi-pyramids of 50–100 μm in size. Crystals were harvested from the well using a LithLoop (Molecular Dimensions), transferred briefly to a saturated ammonium sulphate solution and then paratone oil (Molecular Dimensions), excess oil was wicked off and the crystals were subsequently flash cooled in liquid nitrogen. Structures of Cphy1178(20–462) with NADH and CoA ligands were obtained by soaking harvested crystals in a solution of 0.1 M sodium acetate pH 4.7, 1.8 M ammonium sulphate, supplemented with 10 mM ligand and cryo-protected as described above. All crystallographic datasets were collected on beamlines I02 and I04 at Diamond Light Source (Didcot, UK) at 100 K and using ADSC CCD, or Pilatus 6 M detectors. Diffraction data were integrated and scaled using XDS40 and merged with Aimless41. Data collection and refinement statistics are shown in Table 3.

Table 3 Crystal parameters and data collection statistics for C. phytofermentans Cphy1178.

Structure solution and analysis

All structures were solved by molecular replacement using Phaser42, using PDBID:3K9D as the molecular replacement model. Refinement of the coordinates, TLS parameters and atomic temperature factors was carried out using Phenix.refine in the Phenix suite43. Rounds of iterative model building were performed using Coot44. The secondary structure and stereochemistry of the models were analysed using MolProbity45. Cofactors present in CoA and NAD+ soaked structures were subjected to occupancy refinement for the whole molecule and individual B-factors were refined to accommodate positional uncertainty, rather than using the alternative strategy of setting the occupancy of disordered regions to zero. The oligomeric states of the protein and corresponding buried surface areas were calculated using the PISA server29. Structural superimpositions were calculated using Coot. Crystallographic figures were generated with PyMOL (www.pymol.org).

Bioinformatics analysis of BMC localisation sequences

Sequences of encapsulated aldehyde dehydrogenase enzymes homologous to Cphy1178 were aligned using Clustalω46 and an alignment figure produced using ESPript47. Sequence regions falling outside the conserved domain of non-encapsulated aldehyde dehydrogenase enzymes were identified as putative BMC localisation sequences and omitted from the constructs used to produce the aldehyde dehydrogenase proteins for crystallization and activity assays.

Additional Information

How to cite this article: Tuck, L. R. et al. Insight into Coenzyme A cofactor binding and the mechanism of acyl-transfer in an acylating aldehyde dehydrogenase from Clostridium phytofermentans. Sci. Rep. 6, 22108; doi: 10.1038/srep22108 (2016).