Structural Insights into Substrate Specificity of Feruloyl-CoA 6’-Hydroxylase from Arabidopsis thaliana

Coumarins belong to an important class of plant secondary metabolites. Feruloyl-CoA 6’-hydroxylase (F6’H), a 2-oxoglutarate dependent dioxygenase (2OGD), catalyzes a pivotal step in the biosynthesis of a simple coumarin scopoletin. In this study, we determined the 3-dimensional structure of the F6’H1 apo enzyme by X-ray crystallography. It is the first reported structure of a 2OGD enzyme involved in coumarin biosynthesis and closely resembles the structure of Arabidopsis thaliana anthocyanidin synthase. To better understand the mechanism of enzyme catalysis and substrate specificity, we also generated a homology model of a related ortho-hydroxylase (C2’H) from sweet potato. By comparing these two structures, we targeted two amino acid residues and verified their roles in substrate binding and specificity by site-directed mutagenesis.

In this study, we determined the crystal structure of the F6'H1 apo enzyme by molecular replacement and generated a homology model of C2'H. By comparing the two structures, we targeted two amino acid residues and verified their roles in enzyme activity and substrate selectivity by site-directed mutagenesis.

Results and Discussion
Diffraction analysis of the colorless, plate-shaped F6'H1 crystals indicated that they belonged to space group C2, with unit-cell parameters a = 193.22 Å, b = 54.55 Å, c = 78.82 Å and γ = 111.5°. Based on these unit-cell dimensions and assuming two molecules per crystallographic asymmetric unit, the calculated Matthews coefficient is 2.38 Å³ Da⁻¹, giving an estimated solvent content of ~48.2% 15.

The Overall F6'H1 Structure. The F6'H1 structure (Fig. 2A) consists of residues A15-A343, B15-B141 and B144-B344, 28 solvent molecules modeled as water, and two sodium ions. The histidine purification tags, residues A1-A14, A344-A361 and residues B1-B14, B142-B143, B345-B361, were not observed in the electron density maps and are presumed to be disordered. As expected, the F6'H1 structure closely resembles the structure of the A. thaliana anthocyanidin synthase (ANS) search model. The two structures share a β-sandwich topology and can be superimposed with an RMSD of superposition (242 α carbons) of 1.363 Å 16. Like other members of this class of enzymes 17-21, the structure contains an N-terminal DIOX_N (PF14226) domain, residues 62-172, linked to a C-terminal 2OG-FeII_OXY (PF03171) domain, residues 212-312, that contains the catalytic site 22. Major features of the structure are the 15 helices and 14 β strands (Table 1). Strands β1, β2, β10, β7, β12, β5, β4 and β3 form an 8-stranded mixed β sheet (sheet S1), which assumes a β jellyroll fold common to this family of enzymes. Strands β6, β11, β8 and β9 form an antiparallel β sheet (sheet S2), while strands β13 and β14 form a second antiparallel β sheet (sheet S3). Sheets S1 and S2 together form a large (2,309 Å³) hydrophobic pocket that contains the catalytic site 23.

There are two enzyme molecules in the crystallographic asymmetric unit. A superposition of the two chains gives an RMSD of superposition of 0.670 Å for 320 Cα pairs 16, with the largest deviations observed in the region that contains α10 and spans the C-terminus of α9 to the N-terminus of strand S1β6.
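The Matthews coefficient and solvent content quoted for the crystal can be checked with a few lines of arithmetic. The sketch below is illustrative only: the unit-cell parameters come from the text, space group C2 has four symmetry-equivalent positions per cell, and 1.23 Å³ Da⁻¹ is the constant from the standard Matthews estimate. The monomer mass of about 40.6 kDa is an assumption (it is not stated in the text) chosen as plausible for a 361-residue tagged construct.

```python
import math

def monoclinic_cell_volume(a, b, c, angle_deg):
    """Volume (Å^3) of a monoclinic cell whose single non-90° angle is angle_deg."""
    return a * b * c * math.sin(math.radians(angle_deg))

def matthews_coefficient(cell_volume, n_sym_ops, n_mol_per_asu, mol_weight_da):
    """V_M (Å^3/Da): cell volume divided by the total protein mass in the cell."""
    return cell_volume / (n_sym_ops * n_mol_per_asu * mol_weight_da)

def solvent_fraction(v_m):
    """Matthews estimate of the solvent fraction: 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / v_m

# Unit-cell parameters as reported in the text.
volume = monoclinic_cell_volume(193.22, 54.55, 78.82, 111.5)

# ASSUMPTION: ~40.6 kDa per tagged monomer; this value is not given in the text.
v_m = matthews_coefficient(volume, n_sym_ops=4, n_mol_per_asu=2,
                           mol_weight_da=40_600)

print(f"cell volume = {volume:.0f} Å^3")
print(f"V_M = {v_m:.2f} Å^3/Da, solvent = {solvent_fraction(v_m) * 100:.1f}%")
```

Under these assumptions the result lands close to the reported 2.38 Å³ Da⁻¹ and roughly 48% solvent. A different monomer mass shifts both numbers proportionally.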
The Substrate Binding Pocket. The F6'H1 substrate binding pocket is contained within the C-terminal 2OG-FeII_OXY domain, with its dominant feature being the 2-HIS-1-carboxylate facial triad (residues HIS 235, ASP 237 and HIS 293) that is involved in iron binding (Fig. 2B). However, in the apo structure reported here the iron site is occupied by a sodium ion. Strands S1β3-S1β7 form the back of the binding pocket. The catalytic triad is positioned at the front of the binding pocket facing sheet S1, with HIS 235 and ASP 237 located in the long meandering loop connecting strands S1β4 to S1β6, and HIS 293 located at the N-terminus of strand S2β1.
The binding pocket is similar to that observed for the ANS search model, giving an RMSD of superposition of 0.838 Å for 97 of the 113 Cα atoms comprising strand S1β7 and the 2OG-FeII_OXY domain. Figure 2C shows a theoretical model 16 of the F6'H1 active site based on the A. thaliana ANS crystal structures (PDB entries 1GP4, 1GP5 and 1GP6) 24. Many of the key residues (ASN 218, TYR 220, ARG 303 and SER 305) involved in binding the 2OG co-substrate are structurally conserved in the F6'H1 structure. Interactions with the side chains of ARG 303, SER 305 and TYR 220 were used to anchor the 5-carboxylate terminus of 2OG. This process required only slight rearrangements of the side chains involved. Interactions of the 2-keto and 1-carboxylate groups of 2OG with the sodium ion occupying the iron-binding site were used to anchor the other end of the molecule. The ferulic acid fragment of the feruloyl-CoA substrate was modeled using the ferulic acid molecule from PDB entry 1JT2 25, with O2 of the carboxyl group replaced by sulfur to reflect the CoA linkage. The ferulic acid fragment was placed using the (2R,3R)-trans-dihydroquercetin substrate from the 1GP5 crystal structure and the structurally conserved residues TYR 151, ASN 237 and PHE 309 as guides. In this process, interactions with the side chains of TYR 151 and ASN 237 were used to anchor the thiocarboxyl oxygen and para-hydroxyl groups of the substrate, respectively, while PHE 309 stabilized the substrate via π stacking with the feruloyl ring. This arrangement places ARG 214 in position to interact with the oxygen lone pairs of the ferulic acid methoxy group, while residues SER 153 and ASN 216 are in position (with slight side chain movements) to hydrogen bond with the thiocarboxyl oxygen.

Structural Comparison of F6'H1 and C2'H Structures. The C2'H homology model closely resembles the F6'H1 template used in the modeling. The two structures can be superimposed 16 to give an RMSD of superposition of 0.407 Å for 320 α carbon pairs (Fig. 3A). The largest deviations between the two structures are observed for C2'H residues 174-177 (LKSC), which disrupt helix α9. The sequence of the corresponding residues in the F6'H1 structure (residues 179 to 182) is NKSK.
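The RMSD values quoted for the superpositions are root-mean-square distances over paired Cα atoms. The sketch below shows only that final step, assuming the two coordinate sets have already been optimally superimposed and paired. It does not perform the fitting itself and is not the software used in the study. The coordinates are hypothetical and are not taken from either structure.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) over paired, already-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two atoms of this pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical Cα positions (Å) for three paired residues.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0)]

value = rmsd(chain_a, chain_b)
print(f"RMSD = {value:.3f} Å over {len(chain_a)} Cα pairs")
```

Because the value is an average over the chosen pairs, it should always be read together with the number of pairs, as the text does (for example, 0.407 Å for 320 pairs).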
The binding pocket in the C2'H homology model is also hydrophobic but is slightly larger, at 2,938 Å³. A comparison of the residues comprising the active site pocket (Fig. 3B) of the two enzymes shows that 86 of the 113 residues are structurally conserved, including key residues involved in iron (HIS 231, ASP 233 and HIS 289), 2OG (ASN 216, TYR 220, ARG 299 and SER 301) and substrate (PHE 305) binding. However, in the C2'H structure TYR 151 which anchors the substrate's thiocarboxyl oxygen in the active […]

Site-directed mutagenesis was carried out to verify the roles of the above-described amino acid residues in enzyme activity and specificity. For F6'H1 we introduced three single mutations (TYR151HIS, TYR151PHE and VAL238ILE) and a double mutation (TYR151HIS, VAL238ILE). Similarly, three single mutations (HIS147TYR, HIS147PHE and ILE234VAL) and one double mutation (HIS147TYR, ILE234VAL) were introduced into C2'H. As shown in Fig. 4A, the activity of all the F6'H1 mutants towards feruloyl-CoA was decreased compared with the wild type F6'H1 (F6'H1 WT), while C2'H HIS147TYR showed improved activity towards feruloyl-CoA. These results partly explain the higher activity of F6'H1 towards feruloyl-CoA than C2'H.
Interestingly, the F6'H1 TYR151PHE and C2'H HIS147PHE mutants still showed relatively high activity towards feruloyl-CoA (Fig. 4A). In contrast, the activity of C2'H HIS147PHE towards 4-coumaroyl-CoA decreased significantly compared with that of the wild type C2'H (Fig. 4B). These results indicate that the interaction between the TYR/HIS residue and the substrate thiocarboxyl oxygen is critical for hydroxylation of 4-coumaroyl-CoA but not for hydroxylation of feruloyl-CoA.
In addition, the activity of F6'H1 towards 4-coumaroyl-CoA was not improved by introducing those mutations, indicating that besides TYR 151 and VAL 238, the subtle differences between the overall structures of F6'H1 and C2'H are also important for catalytic activity and substrate specificity. In future studies, we plan to determine the crystal structure of C2'H to better understand the mechanisms of these important enzymes.

Methods
pETDuet1-F6'H1 was constructed by inserting the F6'H1 gene into the BamHI/NdeI restriction sites of a pETDuet1 vector. E. coli strain BL21 Star (DE3) was then transformed with plasmid pETDuet1-F6'H1, which encodes an N-terminal 6×His-tag (GSSHHHHHHSQD) to aid in purification. A fresh colony was inoculated into 50 mL of LB medium containing 100 μg/mL ampicillin and grown aerobically at 37 °C overnight. The whole overnight culture was then used to inoculate 1 L of LB medium supplemented with 100 μg/mL ampicillin and grown at 37 °C with shaking (250 rpm). When the OD600 reached around 0.6, the culture was induced with 0.25 mM IPTG and cultivated at 30 °C for an additional 3 hours.
Selenomethionine-substituted F6'H1 (Se-F6'H1) was produced using a metabolic inhibition protocol 26. Briefly, 2 mL of cells from an overnight culture grown in LB medium containing 100 μg/mL ampicillin were collected by centrifugation and resuspended in 50 mL of M9 minimal medium containing 100 μg/mL ampicillin, 0.4% glucose, 2 mM MgSO4, vitamins, and trace elements. The culture was allowed to grow overnight and was used to inoculate 4 × 1 L of the minimal medium described above. The large-scale cell culture was shaken at 37 °C until the OD600 reached around 0.6, at which point 100 mg each of lysine, threonine and phenylalanine, and 50 mg each of selenomethionine, leucine, isoleucine, and valine were added as solids to each liter of culture. Cells were then allowed to grow for an additional 20 min before IPTG was added to a final concentration of 1 mM. The resulting culture was grown overnight at 30 °C.
For both the native and selenomethionine-substituted protein, the cells were harvested by centrifugation at 6000 × g for 15 min at 4 °C. The cell pellet was then suspended in 30 mL of lysis buffer (20 mM phosphate buffer, pH 7.4, 500 mM NaCl, 20 mM imidazole, 10 μg/mL phenylmethylsulfonyl fluoride (PMSF)). The cell suspension was lysed by sonication on ice and cleared by centrifugation at 25,000 × g for 30 min. The supernatant was then loaded onto a HisTrap HP column (5 mL, GE Healthcare) connected to an ÄKTAprime plus (GE Healthcare) and pre-equilibrated with binding buffer (20 mM phosphate buffer, pH 7.4, 500 mM NaCl, 20 mM imidazole). The column was washed with 50 mL of binding buffer and the F6'H1 proteins were eluted with a linear (20 to 500 mM) imidazole concentration gradient. The resulting purified proteins were dialyzed against 20 mM Tris-HCl, pH 7.4, containing 50 mM NaCl and 1 mM DTT, and concentrated to approximately 12 mg/mL for crystallization.
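As a small illustration of the linear elution gradient, the nominal imidazole concentration at any point can be interpolated between the 20 mM and 500 mM endpoints given in the text. The gradient volume is not stated, so the sketch below works with the fraction of the gradient completed. The function name is hypothetical, and real column profiles lag the programmed value because of dead volume and mixing.

```python
def imidazole_mM(fraction, start_mM=20.0, end_mM=500.0):
    """Nominal imidazole concentration at a given fraction (0-1) of a linear gradient."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return start_mM + fraction * (end_mM - start_mM)

# Programmed concentration at the start, midpoint and end of the gradient.
for f in (0.0, 0.5, 1.0):
    print(f"{f:.0%} of gradient: {imidazole_mM(f):.0f} mM imidazole")
```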
Crystallization, X-ray Data Collection and Structure Determination. Crystals of Se-F6'H1 were grown by sitting drop vapor diffusion at 291 K using 2 μL drops containing equal volumes of protein concentrate and a precipitant cocktail containing 20% (w/v) PEG-8000, 0.1 M MES, 0.3 M Ca(OAc)2, pH 6.0. Crystals appeared in ~3 days and grew to usable size in 9-10 days.
For data collection, a crystal measuring 200 × 200 × 50 microns was harvested from the well and briefly immersed in a drop of cryoprotectant solution containing the above precipitant cocktail with 20% (v/v) glycerol. The cryoprotected crystal was then flash cooled 27 in liquid nitrogen and stored at cryogenic temperatures for data collection. A data set to 2.7 Å resolution was collected at 100 K on beamline 22ID, SER-CAT, Advanced Photon Source, Argonne National Laboratory, using a 50 micron beam, a MAR300 CCD detector and 0.979 Å X-rays. A total of 360 one-degree images were recorded using a crystal-to-detector distance of 380 mm and an exposure time of 1 second. The data were indexed, integrated and scaled using HKL-2000 28.
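The reported wavelength and crystal-to-detector distance set a geometric limit on resolution through Bragg's law, d = λ / (2 sin θ), where 2θ is the scattering angle at the detector edge. The sketch below is a rough check only. It assumes a flat detector centered on and perpendicular to the beam, and an active radius of 150 mm (an assumption based on a nominal 300 mm detector, not a value given in the text).

```python
import math

def edge_resolution(wavelength_A, distance_mm, radius_mm):
    """Bragg d-spacing (Å) reaching the detector edge for a centered flat detector."""
    two_theta = math.atan(radius_mm / distance_mm)  # scattering angle at the edge
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))

# Wavelength and distance from the text; the 150 mm radius is an ASSUMPTION.
d_edge = edge_resolution(wavelength_A=0.979, distance_mm=380.0, radius_mm=150.0)
print(f"resolution at detector edge ≈ {d_edge:.2f} Å")

# Total rotation range and exposure implied by 360 one-degree, 1 s images.
total_degrees = 360 * 1.0
total_seconds = 360 * 1.0
```

Under these assumptions the edge of the detector corresponds to roughly 2.6 Å, which is consistent with a data set processed to 2.7 Å.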
Initial attempts to solve the structure using SelenoMet SAD (single-wavelength anomalous diffraction) 29 were unsuccessful, and the structure was determined by molecular replacement (MR) using PHENIX 30. The structure of anthocyanidin synthase (ANS) from A. thaliana (PDB entry 1GP4) 24, the closest PDB sequence homologue (34% identity), was used as the search model. Phaser-MR gave a single molecular replacement solution containing a dimer in the asymmetric unit. Using this solution and two rounds of AutoBuild, the second employing noncrystallographic symmetry, 647 residues out of 746 (including His-tag) were built, giving a map-model correlation of 0.78 and initial R and R free values of 0.23 and 0.31, respectively. The model was further improved using iterative rounds of validation 31, model building 32 and refinement (using torsion angle noncrystallographic symmetry restraints). Since SelenoMet SAD was unsuccessful, the occupancies for the 6 selenium atoms were also refined. During the latter stages of refinement, solvent molecules, modelled as water, were added to the model based on their environment and hydrogen-bonding scheme. Density was also observed at the iron-binding site and was modelled as a sodium ion, since energy dispersive fluorescence scans of the crystal did not indicate the presence of iron. As outlined in Table 2 […]

In vivo Assay of F6'Hs and C2'Hs. A coupled enzyme assay was used to evaluate the relative activity of the F6'H1 and C2'H enzymes used in the analysis. Plasmids pZE-F6'H1-Pc4CL2 and pZE-C2'H-Pc4CL2 were constructed in our previous study 14. Other plasmids were constructed by replacing the F6'H1-encoding gene or the C2'H-encoding gene with the corresponding mutated gene. E. coli strain BW25113 was transformed with each of these plasmids. Overnight cultures were inoculated into 20 mL of M9Y medium containing 100 μg/L ampicillin. Cell cultures were grown at 37 °C with shaking. When the OD600 reached 0.4, cells were induced with 0.25 mM IPTG for 3 h, at which point the substrate ferulic acid or coumaric acid was fed into the cell cultures. After another 15 hours of incubation, samples were taken for OD600 measurement, and the supernatants obtained after centrifugation were used for HPLC analysis. The enzyme activity was calculated based on the formation of the corresponding product (scopoletin or umbelliferone) and was expressed as mg/L/OD600. The ingredients of the M9Y medium and the HPLC analysis method were described in our previous study 14.
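The activity unit used here, mg/L/OD600, is simply the product titer divided by the culture density. A minimal sketch of that normalization follows. The function name and the example numbers are hypothetical and are not data from this study.

```python
def normalized_activity(product_mg_per_L, od600):
    """Product titer (mg/L) normalized by cell density, in mg/L/OD600."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return product_mg_per_L / od600

# Hypothetical example: 30 mg/L of product measured in a culture at OD600 = 5.
example = normalized_activity(30.0, 5.0)
print(f"activity = {example:.1f} mg/L/OD600")
```

Normalizing by OD600 lets strains that grow to different densities be compared on a per-cell-mass basis, which matters when a mutation affects growth as well as catalysis.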