Protein-pyridinol thioester precursor for biosynthesis of the organometallic acyl-iron ligand in [Fe]-hydrogenase cofactor

The iron-guanylylpyridinol (FeGP) cofactor of [Fe]-hydrogenase contains a prominent iron centre with an acyl-Fe bond and is the only acyl-organometallic iron compound found in nature. Here, we identify the functions of HcgE and HcgF, which are involved in the biosynthesis of the FeGP cofactor, using a structure-to-function strategy. Analysis of the HcgE and HcgF crystal structures with and without bound substrates suggests that HcgE catalyses the adenylylation of the carboxy group of guanylylpyridinol (GP) to afford AMP-GP, and that HcgF subsequently catalyses the transesterification of AMP-GP to afford a Cys (HcgF)-S-GP thioester. Both enzymatic reactions are confirmed by in vitro assays. The structural data also offer plausible catalytic mechanisms. This strategy of thioester activation corresponds to that used for ubiquitin activation, a key event in the regulation of multiple cellular processes. It further implicates a nucleophilic attack onto the acyl carbon, presumably via an electron-rich Fe(0)- or Fe(I)-carbonyl complex, in the Fe-acyl formation.

Nature expands the catalytic capability of proteins by developing metal cofactors 1. Some of the most attractive representatives are those of [NiFe]-, [FeFe]- and [Fe]-hydrogenases, which catalyse the production and consumption of H₂ in microorganisms [2][3][4][5]. The unusual Fe centres of [NiFe]- and [FeFe]-hydrogenases contain thiolate, CO and CN⁻ ligands. Instead of CN⁻ ligands, the Fe centre of the iron-guanylylpyridinol (FeGP) cofactor of [Fe]-hydrogenase has a guanylylpyridinol (GP), which is bidentately chelated to Fe by its nitrogen and acyl-carbon 6,7. The Fe-acyl bond constitutes the only stable acyl-organometallic compound found in nature, although an acyl-Ni species has been reported to be an intermediate in the acetyl-CoA formation reaction catalysed by acetyl-CoA synthase 8,9. The metal cofactors of hydrogenases and their biosynthesis have been extensively explored to unravel their fascinating chemistry [10][11][12] and to use their H₂ activation capabilities for biotechnological applications 13,14. In contrast to studies of [NiFe]- and [FeFe]-hydrogenases, convincing hypotheses and experimental data have not been reported for the biosynthesis of the Fe centre of [Fe]-hydrogenase, including the formation of the unique acyl group, the greatest challenge in FeGP cofactor biosynthesis.
The seven conserved hmd co-occurring (hcgA-G) genes in the genomes of hydrogenotrophic methanogens are involved in the biosynthesis of the FeGP cofactor 15,16. Recent findings using a structure-to-function strategy 17,18 indicated that the formation of 6-carboxymethyl-guanylylpyridinol from GTP and a pyridinol compound is catalysed by HcgB, which strongly suggested that GP is an intermediate in the FeGP biosynthetic pathway and, consequently, that its carboxy group is converted to the acyl-iron ligand in subsequent enzymatic reactions 19.
In the current study, we find that HcgE and HcgF catalyse the formation of a thioester-activated acyl group using a two-step mechanism reminiscent of that of ubiquitin activation 20. HcgE catalyses the adenylylation of the carboxy group of GP to afford adenosine 5′-monophosphate (AMP)-GP. HcgF catalyses the transesterification of AMP-GP to afford a Cys-S-GP thioester. On the basis of these findings, possible Fe-acyl formation mechanisms are proposed.

Results
Structural identification of substrates of HcgE. The functional characterization of HcgE was inspired by the relationship between the primary structures of HcgE and E1-like ubiquitin-activating enzymes (E1 enzymes) (Supplementary Figs 1 and 2) 20. E1 enzymes activate the C-terminal carboxy group of ubiquitin or ubiquitin-like proteins and of a precursor heptapeptide of an antibiotic by adenylylation with adenosine 5′-triphosphate (ATP). This essential activation reaction is performed by various eukaryotic E1 enzymes/ubiquitin-like proteins 20,21, and by the prokaryotic protein pairs MoeB/MoaD 22, ThiF/ThiS 23 and MccB/MccA 24, which are involved in the modulation of protein function in the regulation of multiple cellular processes and in the biosynthesis of molybdopterin, thiamine and the antibiotic microcin C7, respectively. HcgE might likewise catalyse the adenylylation of a carboxy group of an as yet unidentified compound, potentially GP, in FeGP cofactor biosynthesis. Structures of [Fe]-hydrogenase, the FeGP cofactor and GP are shown in Fig. 1.
To explore this hypothesis experimentally, we determined the crystal structures of homotrimeric HcgE and of the HcgE-ATP complex from Methanothermobacter marburgensis (Fig. 2a; Supplementary Figs 3 and 4) at 1.6 and 1.8 Å resolution, respectively. ATP binds to the canonical ATP-binding site of E1 enzymes located in the conserved N-terminal half. In contrast, the fold of the C-terminal half differs substantially between HcgE and the structurally known E1 enzymes, resulting in a completely altered substrate-binding site (Fig. 2b; Supplementary Fig. 3). We identified GP as a possible substrate of HcgE on the basis of the ternary HcgE-ATP-GP complex structure determined at 2.8 Å resolution (Fig. 2c,d). Notably, GP binding was only achievable when HcgE and GP were crystallized in the presence of ATP. Despite the different substrate-binding sites, the carboxy group of GP in HcgE and the carboxy group of the C-terminal glycine of MoaD occupy equivalent positions.
In vitro activity assay of HcgE. To complement the structure-guided functional prediction, we analysed a reaction mixture containing HcgE, ATP, GP and Mg²⁺ by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MS) (Supplementary Fig. 5). The obtained peak at m/z 872.2 corresponded to the expected product AMP-GP (calculated mass = 871.2) and its fragmentation peaks at m/z 348.1 […] followed Michaelis-Menten kinetics with an apparent Km for ATP of 27.0±3.5 μM and a kcat of 7.2±0.2 min⁻¹ based on determination of the pyrophosphate concentration (Supplementary Fig. 6). Thus, structural, mass spectrometry and kinetic data identified HcgE as an adenylyltransferase that forms AMP-GP from ATP and GP.
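As a rough illustration of what the reported kinetic constants imply, the Michaelis-Menten rate law can be evaluated for the apparent Km and kcat given above. This is a minimal sketch only: the function name is hypothetical, the Km is taken as a micromolar value (an assumption consistent with the 0-0.25 mM ATP range used in the assay), and the default enzyme concentration of 1 μM is likewise an assumption taken from the assay description in the Methods.

```python
def michaelis_menten_rate(atp_um, km_um=27.0, kcat_per_min=7.2, enzyme_um=1.0):
    """Initial rate (uM product per min) at a given ATP concentration (uM).

    km_um and kcat_per_min default to the apparent values reported for HcgE;
    the micromolar unit for Km and the 1 uM enzyme concentration are assumptions.
    """
    if atp_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return kcat_per_min * enzyme_um * atp_um / (km_um + atp_um)


if __name__ == "__main__":
    # Rates across the ATP range used in the assay (0-250 uM).
    for atp in (0.0, 27.0, 100.0, 250.0):
        rate = michaelis_menten_rate(atp)
        print(f"[ATP] = {atp:6.1f} uM -> v = {rate:.2f} uM/min")
```

At an ATP concentration equal to Km the rate is half of the maximal rate, kcat times the enzyme concentration, which is the defining property of the apparent Km.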
Catalytic mechanism of HcgE. The ternary structure of the HcgE-ATP-GP complex revealed that the carboxy group of GP points towards the α-phosphate group of ATP (Fig. 2c,d). This transition state-like arrangement not only strongly argues for an adenylylation reaction catalysed by HcgE but also provides clues to its catalytic mechanism (Fig. 2e). In this proposed mechanism, the carboxy group of GP attacks the α-phosphate of ATP via an SN2 reaction, thereby adenylylating GP via a pentavalent intermediate. The negative charge of the carboxy oxygen of GP is compensated by a hydrogen bond to Leu24-NH located at the N-terminal side of helix 24:34. The guanidinium group of Arg23 and the imidazole group of His36 of the partner subunit of the trimer compensate the negative charge of the α-phosphate oxygen of ATP (Fig. 2c,d) […] pyrophosphate is delocalized by interactions with Mg²⁺ (not visible in the electron density map), which probably interacts with its β- and γ-phosphates and with several polypeptide side chains (Fig. 2e).
Possible function of HcgF provided by protein structure. We determined the function of HcgF using the structure-to-function strategy used for HcgE. HcgF does not show sequence similarity to any enzyme of known function (Supplementary Figs 7 and 8), but the crystal structure determined for homodimeric HcgF from Methanocaldococcus jannaschii is strongly related to that of the nicotinamide mononucleotide (NMN) deamidase PncC 25 (Supplementary Fig. 9). HcgF does not exhibit NMN hydrolase activity, but the profiles of the proposed NMN-binding cavity of PncC and the equivalent site of HcgF are compatible with the demands for accommodating an NMN-like compound, that is, GP (Supplementary Fig. 9). Indeed, the X-ray crystal structure of HcgF co-crystallized with GP, determined at 2.0 Å resolution, revealed bound GP (Fig. 3a-d). Interestingly, GP in one HcgF monomer of the asymmetric unit partially forms a covalent thioester bond between its carboxy carbon and the thiolate of Cys9 (Fig. 3c,d), while in the other monomer, Cys9 and GP are in van der Waals contact (Fig. 3b; Supplementary Fig. 10). In the substrate-free HcgF, the strictly conserved Cys9 is activated by the neighbouring Lys79-Nζ. Upon binding of GP, conformational changes of up to 6 Å in the segments after strands 46:53 and 125:131 enlarge the substrate-binding site and, concomitantly, induce a movement of the side chain of Lys79 by more than 10 Å from the largely buried Cys9 towards the protein surface (Fig. 3b,c). The key residue Cys9, positioned at the C-terminal end of the shortened strand 4:9, is strictly conserved in HcgF.
AMP-GP is the substrate of HcgF. Consideration of the obtained structural data and the established reaction of HcgE led to the proposal that the same two-step carboxylate activation strategy reported for ubiquitin activation is used in the biosynthesis of the FeGP cofactor, namely adenylylation of the carboxy group of GP by HcgE, followed by thioester formation by HcgF (Fig. 4). This hypothesis implies that not GP but rather AMP-GP formed by the HcgE reaction is the natural substrate of HcgF. Unfortunately, the instability of AMP-GP precludes its use in crystallographic studies. Docking simulation of AMP on GP covalently bound to HcgF, however, indicated that it is spatially possible for AMP-GP to be accommodated in the binding site adjacent to Cys9 (Supplementary Fig. 9c). To demonstrate the transesterification reaction from AMP-GP to Cys9 (HcgF)-GP in solution, we added HcgF to the HcgE reaction assay described above. The production of AMP confirmed the proposed function of HcgF (Supplementary Fig. 11).
Catalytic mechanism of HcgF. A proposed reaction mechanism for the thioester bond formation between Cys9 of HcgF and GP, using AMP-GP as a substrate, is shown in Fig. 3e. In this mechanism, the thiolate of Cys9 attacks the adenylylated carboxy group of AMP-GP in an SN2 reaction, resulting in the formation of a thioester bond and the release of AMP. The oxygen of the thioester group interacts with Phe10-NH and Ala111-NH (see also Fig. 3d), which most likely also stabilize the anionic tetrahedral transition state. Notably, the GP-HcgF adduct is the first example of a ubiquitin activation system for which the thioester product has been structurally characterized.

Discussion
The identified GP carboxylate activation involved in Fe-acyl formation corresponds to the mechanism established for ubiquitin activation and thereby expands this versatile strategy from (poly)peptides to other carboxylated substrates. Activation via a thioester bond suggests a nucleophilic substitution reaction for Fe-acyl formation to allow the release of Cys9 (HcgF) as thiolate, which is a favourable leaving group (Fig. 4). An iron-carbonyl precursor, so far uncharacterized in the biosynthesis of the Fe-acyl bond of FeGP cofactor model compounds 28. However, the formation of an Fe(0)-carbonyl complex precursor requires chemistry that is challenging in a water-rich biological environment, such as that of a novel iron(II)-complex reduction system in methanogenic archaea (Fig. 4). Alternatively, a one-electron reduction system might formally generate an Fe(I)-carbonyl complex for nucleophilic attack, followed by a second one-electron reduction step such as that proposed for the catalytic mechanism of acetyl-coenzyme A synthase 8. The enzymatic system involved in the Fe-acyl formation is unknown. HcgF could also function as a scaffold protein on which biosynthesis of the FeGP cofactor is completed after an electron-rich Fe-carbonyl complex is accepted. The intact cofactor biosynthesized on HcgF could be transferred to the [Fe]-hydrogenase apoenzyme to form its holoenzyme. However, a scaffold function of the other enzymes cannot be excluded 17,29. Nevertheless, the established thioester activation offers first insights into the chemistry of the subsequent Fe-acyl bond formation and places this reaction in the pathway of Fe centre biosynthesis after iron-carbonyl formation.

Methods
Heterologous production and purification of HcgF. The synthesized M. jannaschii hcgF gene (Supplementary Fig. 13), inserted into the NdeI and HindIII sites of pET28b, was introduced into E. coli BL21 (DE3) Star (Invitrogen) by transformation. The transformed cells were grown in LB medium containing 50 μg ml⁻¹ kanamycin at 37 °C to an OD600 of 1.0. Protein expression was induced by addition of 1.0 mM IPTG, followed by further incubation for 4-6 h.
The N-terminal His6-tagged HcgF protein was purified, and the His6-tag was cleaved as described for the His6-tagged HcgE protein. To remove the His6-tag and thrombin, the reaction mixture of HcgF and thrombin was loaded onto a Ni²⁺-charged HiTrap chelating column and a HiTrap Benzamidine FF column (GE Healthcare) equilibrated with buffer A. After the fractions containing the HcgF protein were isocratically eluted and concentrated, the protein solution was loaded onto a Sephacryl S-200 column (GE Healthcare) equilibrated with 50 mM potassium phosphate buffer (pH 7.0) containing 0.3 M KCl and eluted with the same buffer as used for the equilibration. For the SeMet-labelled HcgF preparation, E. coli B834 (DE3) (Novagen) cells transformed with pET28b carrying the synthesized M. jannaschii hcgF gene inserted into its NdeI and HindIII sites were incubated in M9 medium supplemented with 2.5 mM MgSO₄, 2% (w/v) D-(+)-glucose, 0.01% (w/v) thiamine, 0.025 mM FeCl₃, 50 μg ml⁻¹ L-selenomethionine and 50 μg ml⁻¹ kanamycin at 37 °C to an OD600 of 0.5. SeMet-HcgF was purified as described for HcgF, except that all buffers contained 1 mM DTT.
X-ray data collection and refinement. The crystal of HcgE was immersed in a reservoir solution containing 20% (v/v) ethylene glycol for cryoprotection before freezing the crystal under a cryo-stream of N₂ at 100 K. The crystals of ATP-bound HcgE, ATP-GP-bound HcgE, HcgF, SeMet-labelled HcgF and GP-bound HcgF were frozen under a cryo-stream of N₂ at 100 K without adding a cryoprotectant. Diffraction data were collected at 100 K on beamline X10SA, equipped with a PILATUS 6M detector, at the Swiss Light Source (Villigen, Switzerland). Data were processed using X-ray Detector Software (XDS) 31. The HcgE structure was solved by molecular replacement with Molrep 32 using a ThiF (PDB code: 1ZUD)-based model created by the Robetta server 33.
Automated model building was then performed by ARP/wARP 34 to model amino-acid fragments and loops of HcgE. ATP-bound and ATP-GP-bound HcgE structures were solved by molecular replacement with Molrep using the solved HcgE structure. The HcgE models after molecular replacement and automated model building were completed and refined using COOT 35, REFMAC5 (ref. 36) and PHENIX 37. For the HcgE structure, one out of three monomers in the asymmetric unit showed electron density at the position equivalent to that of ATP in ATP-bound HcgE, although no ligands were added during the crystallization. On the basis of the shape of the density, ADP and not ATP could be fitted. A possible explanation is that small amounts of ADP or ADP-like compounds co-purify with HcgE from E. coli cells. Phenix Xtriage 37 indicated that the data of HcgE co-crystallized with ATP or with both ATP and GP were merohedrally twinned, with twin fractions of 0.43 and 0.44 and with twin operators of k, h, -l and -h, -k, l; k, h, -l; -k, -h, -l, respectively. Twin refinement was performed with REFMAC5 in the amplitude-based mode.
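The twin fractions quoted above are close to 0.5, which helps explain why twin refinement against the twinned data is preferable to algebraic detwinning. The following minimal sketch illustrates this for the simplest case of two twin domains (hemihedral twinning) with twin fraction alpha; it is a textbook relation, not the procedure used by REFMAC5 or Xtriage, it does not cover the three-operator case reported for the ATP-GP data set, and all function names are hypothetical.

```python
import math


def twin_pair(i1, i2, alpha):
    """Observed intensities of two twin-related reflections with true
    intensities i1 and i2, for a two-domain twin with twin fraction alpha."""
    return (1.0 - alpha) * i1 + alpha * i2, alpha * i1 + (1.0 - alpha) * i2


def detwin_pair(i_obs1, i_obs2, alpha):
    """Invert twin_pair algebraically; fails for a perfect twin (alpha = 0.5)."""
    denom = 1.0 - 2.0 * alpha
    if abs(denom) < 1e-12:
        raise ValueError("perfect twin (alpha = 0.5): cannot detwin algebraically")
    i1 = ((1.0 - alpha) * i_obs1 - alpha * i_obs2) / denom
    i2 = ((1.0 - alpha) * i_obs2 - alpha * i_obs1) / denom
    return i1, i2


def error_amplification(alpha):
    """Ratio sigma(detwinned) / sigma(observed), assuming equal and
    independent errors on the two observed intensities."""
    return math.sqrt((1.0 - alpha) ** 2 + alpha ** 2) / abs(1.0 - 2.0 * alpha)


if __name__ == "__main__":
    for alpha in (0.10, 0.43, 0.44):
        print(f"alpha = {alpha:.2f}: error amplification = "
              f"{error_amplification(alpha):.1f}")
```

The amplification factor grows without bound as alpha approaches 0.5, so at fractions of 0.43 to 0.44 measurement errors would be magnified several-fold by explicit detwinning.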
To determine the crystal structure of HcgF, multiple-wavelength anomalous dispersion data were measured at the selenium edge of the SeMet-HcgF crystal. Selenium atom sites were detected with SHELX C/D 38. The selenium sites were refined and the phases were determined using the program SHARP and improved by the solvent-flattening procedure of SOLOMON implemented in SHARP 39. The electron density after phasing and solvent flattening is shown as a blue mesh contoured at 1σ (Supplementary Fig. 14). Automatic model building was performed using ARP/wARP 34. GP-HcgF was solved by molecular replacement with Molrep using the solved HcgF structure as a search model. Further modelling and refinement of HcgF and GP-HcgF were performed using COOT, REFMAC5 and PHENIX. The thioesterified GP model was generated by the Dundee PRODRG server 40.
TLS refinement was performed in the final stage of refinement 41. Structures were validated using MolProbity 42. Simulated-annealing-refined 2Fo − Fc omit maps for ATP-bound HcgE, ATP-GP-bound HcgE and GP-bound HcgF were calculated using PHENIX (Supplementary Fig. 15 […]
Docking simulation. Molecular docking of AMP to the GP-thioester-binding site of HcgF was performed with AutoDock Vina 51. For docking calculations, an AMP molecule was placed in a pocket adjacent to the GP-binding site. The calculation converged on an orientation of AMP in which its phosphate is close to the carboxy group of GP. In the case of NMN docking to CinA, the NMN molecule was placed close to an NMN-binding site proposed previously 25. The calculation converged, and the most probable binding mode of NMN to CinA was selected.
Assay for adenylylation of GP catalysed by HcgE. The HcgE-catalysed adenylylation reaction of GP was carried out in 1 μM HcgE, 1 μM inorganic pyrophosphatase, 0.25 mM GP, 0-0.25 mM ATP, 1 mM MgCl₂ and 10 mM MOPS/KOH, pH 7.0, at 30 °C for 10 min. The activity of HcgE was determined by analysing the amount of inorganic pyrophosphate released from the HcgE-catalysed reaction; the pyrophosphate was converted to monophosphate by inorganic pyrophosphatase from S. cerevisiae (Sigma-Aldrich Co. LLC.), and the monophosphate was analysed using Pi ColorLock Gold (Innova Biosciences). The AMP-GP product in the HcgE-catalysed reaction was analysed by matrix-assisted laser desorption/ionization time-of-flight MS and MS/MS.
Assay for GP-HcgF thioester bond formation. The GP-HcgF thioester bond formation was assayed by monitoring the amount of AMP resulting from a plausible reaction of HcgF with AMP-GP that was produced by HcgE-catalysed adenylylation of GP with ATP. The reaction conditions were as follows: 5 mM MgCl₂, 5 mM ATP, 0.25 mM GP, 27 μM HcgE and 0-49 μM HcgF in 11 mM potassium phosphate, pH 7.0, containing 63 mM KCl and 0.05 mM DTT at 65 °C for 10 min. The amount of AMP was analysed with an HPLC system (JASCO Co.) equipped with a Synergi 4 μm Fusion-RP column (4.6 × 250 mm; Phenomenex), which was equilibrated with a mixture of 5 mM ammonium acetate buffer, pH 4.5, and acetonitrile at a ratio of 97:3 (v/v). HPLC peaks were monitored at 252 nm. The amount of AMP was estimated from the AMP peak area using a standard calibration curve, which was constructed by HPLC analysis of the peak areas of different concentrations of AMP.
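The calibration step described above can be sketched as an ordinary least-squares line through the standards, which is then inverted to convert a sample peak area into an AMP concentration. This is only an illustrative sketch: the function names are hypothetical, and the standard concentrations and peak areas are invented placeholder values, not data from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired calibration points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amp_from_area(area, slope, intercept):
    """Convert an HPLC peak area to an AMP concentration via the calibration line."""
    return (area - intercept) / slope


if __name__ == "__main__":
    # Hypothetical AMP standards (uM) and their peak areas (arbitrary units).
    standards_um = [0.0, 10.0, 20.0, 40.0]
    peak_areas = [0.0, 50.0, 100.0, 200.0]
    slope, intercept = fit_line(standards_um, peak_areas)
    sample_area = 75.0  # hypothetical sample peak area
    print(f"estimated AMP: {amp_from_area(sample_area, slope, intercept):.1f} uM")
```

In practice the sample peak areas should fall within the range spanned by the standards, since the linear relation is only verified there.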