Identification and characterization of a novel β-D-galactosidase that releases pyruvylated galactose

Pyruvyl modification of oligosaccharides is widely seen in both prokaryotes and eukaryotes. Although the biosynthetic mechanisms of pyruvylation have been investigated, enzymes that metabolize and degrade pyruvylated oligosaccharides are not well known. Here, we searched for a pyruvylated galactose (PvGal)-releasing enzyme by screening soil samples. We identified a Bacillus strain, as confirmed by the 16S ribosomal RNA gene analysis, that exhibited PvGal-ase activity toward p-nitrophenyl-β-D-pyruvylated galactopyranose (pNP-β-D-PvGal). Draft genome sequencing of this strain, named HMA207, identified three candidate genes encoding potential PvGal-ases, among which only the recombinant protein encoded by ORF1119 exhibited PvGal-ase activity. Although ORF1119 protein displayed broad substrate specificity for pNP sugars, pNP-β-D-PvGal was the most favorable substrate. The optimum pH for the ORF1119 PvGal-ase was determined as 7.5. A BLAST search suggested that ORF1119 homologs exist widely in bacteria. Among two homologs tested, BglC from Clostridium but not BglH from Bacillus showed PvGal-ase activity. Crystal structural analysis together with point mutation analysis revealed crucial amino acids for PvGal-ase activity. Moreover, ORF1119 protein catalyzed the hydrolysis of PvGal from galactomannan of Schizosaccharomyces pombe, suggesting that natural polysaccharides might be substrates of the PvGal-ase. This novel PvGal-catalyzing enzyme might be useful for glycoengineering projects to produce new oligosaccharide structures.

While the molecular mechanisms underlying the biosynthesis of Pv-oligosaccharides have been relatively well analyzed, enzymes that metabolize and degrade Pv-containing sugar chains remain to be identified. As described above, there are several reports about the existence of PvGal in bacteria; here, therefore, we screened soil samples for microorganisms that exhibit PvGal-ase activity to release PvGal. As a result, we identified a novel gene that encodes a PvGal-ase and characterized its enzymatic activity.

Materials and Methods
Microbial culture and microscopy. Soil samples were collected in Yame city, Fukuoka prefecture, Japan.
Enzymatic analysis. PvGal-ase activity was determined by using pNP-β-D-PvGal as a substrate. To screen for microbes with PvGal-ase activity, 5 μL of culture supernatant was mixed with 4 mM substrate in 10 μL of 50 mM acetate buffer, pH 6.0. To determine substrate specificity, 3 μg of ORF1119 protein was mixed with each pNP substrate at 3 mM in 5 μL of 200 mM phosphate buffer, pH 7.5. To determine the optimum pH, 10 μg of ORF1119 protein was mixed with 2 mM pNP-β-D-PvGal in 10 μL of 400 mM buffers, which varied in pH from 4.0 to 9.5 in increments of 0.5 as follows: acetate buffer, pH 4.0-6.0; phosphate buffer, pH 6.0-8.0; Tris-HCl buffer, pH 7.0-8.5; and MOPS-NaOH buffer, pH 7.0-9.5. To investigate thermal stability, the enzyme was incubated at 4, 30, 37, 42, 50 or 55 °C for 10 min. After incubation, 10 μL of 1 M NaOH was added to terminate the reaction, and the released pNP was quantified from the absorbance at 405 nm. One unit of enzyme activity was defined as the activity producing 1 μmol of pNP per min. α-D-Xyl and β-D-Xyl were purchased from Seikagaku; pNP-β-D-Glc, pNP-β-D-Gal and pNP-β-D-Fuc were purchased from Sigma.
Genomic DNA analysis. Genomic DNA of strain HMA207 was prepared by using the ISOPLANT extraction kit (Wako) in accordance with the manufacturer's instructions. The 16S rRNA gene sequence was amplified from the extracted genomic DNA of strain HMA207 by PCR using the universal primers listed in Supplementary Table 1. The DNA sequence of the PCR product was submitted to a BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi) search to determine the species of the strain. Whole-genome shotgun sequencing of strain HMA207 was conducted by using a MiSeq sequencer (Illumina). The program Platanus version 1.2.1 was used for sequence assembly. Both Glimmer version 3.02b and BLAST 2.2.26 were used for annotation of the genome. Glycoside hydrolase (GH) families were searched according to the CAZy website (http://www.cazy.org/Glycoside-Hydrolases.html).
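The unit definition given under Enzymatic analysis (1 μmol of pNP released per min) can be illustrated with a short calculation. The following Python sketch converts a blank-corrected absorbance at 405 nm into units and specific activity by the Beer-Lambert law. The extinction coefficient, path length, and example reading are assumptions introduced here for illustration and are not given in the text; in practice a pNP standard curve measured under the same alkaline stop conditions would replace them.

```python
# Assumed molar extinction coefficient of p-nitrophenolate at 405 nm under
# alkaline conditions (M^-1 cm^-1). This value is NOT taken from the paper;
# replace it with one derived from your own pNP standard curve.
ASSUMED_EPSILON_PNP = 18_000.0


def pnp_released_umol(a405, volume_ul, path_cm=1.0, epsilon=ASSUMED_EPSILON_PNP):
    """Convert a blank-corrected A405 reading into micromoles of pNP."""
    conc_molar = a405 / (epsilon * path_cm)        # Beer-Lambert law: c = A / (eps * l)
    return conc_molar * (volume_ul * 1e-6) * 1e6   # mol/L * L = mol, then mol -> umol


def specific_activity(a405, volume_ul, minutes, enzyme_mg, **kwargs):
    """Units (umol pNP released per min) per mg of protein."""
    units = pnp_released_umol(a405, volume_ul, **kwargs) / minutes
    return units / enzyme_mg


if __name__ == "__main__":
    # Hypothetical reading: A405 = 0.90 in a 20 uL stopped reaction,
    # 10 min incubation, 3 ug (0.003 mg) of enzyme.
    print(specific_activity(0.90, 20.0, 10.0, 0.003))
```

The path length matters for small-volume instruments, so `path_cm` should be set to the value the spectrophotometer actually uses.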
Preparation of recombinant proteins. To prepare recombinant expression plasmids, the three candidate PvGal-ase genes and the BglC and BglH genes were amplified by PCR using the DNA polymerase PrimeSTAR GXL (Takara), the primers listed in Supplementary Table 1, and genomic DNA of HMA207, Clostridium saccharoperbutylacetonicum and Bacillus subtilis, respectively, as templates. The amplified DNA was ligated into the pET-32b vector, which incorporated a His 5 sequence at the N-terminus, by using an In-Fusion HD Cloning Kit (Takara). Escherichia coli BL21(DE3)CodonPlusΔlacZ strain harboring each candidate PvGal-ase expression plasmid was precultured in 3 mL of LB medium at 30 °C overnight. The OD600 was adjusted to 0.05 and the cells were cultured until OD600 = 0.5. Next, 100 mM IPTG was added and the cells were cultured at 160 rpm and 15 °C for 48 h. Cells were lysed by ultrasonication on ice and the cell lysate was centrifuged at 20,600 × g for 10 min at 4 °C. To purify the recombinant protein, the resultant supernatant was applied to a HisTrap FF 1 mL column (GE Healthcare) in accordance with the manufacturer's instructions.
Preparation of mutant enzymes. Mutations were introduced into ORF1119 by PCR using the PrimeSTAR Mutagenesis Basal Kit (Takara), the primers listed in Supplementary Table 1, and the pET-32b-based expression plasmid as a template. The mutated genes were confirmed by DNA sequencing (Macrogen, Japan) to ensure that only the desired mutations were introduced. The mutants were expressed and purified by using a procedure similar to that described for the wild-type ORF1119 enzyme.
TLC analysis. The purified ORF1119 protein (3 μg) was incubated with each pNP sugar at 3 mM in 5 μL of 200 mM phosphate buffer (pH 7.5) at 37 °C overnight. The reaction samples were separated by TLC by using a TLC Silica gel 60 plate (Millipore) and 1-butanol/ethanol/water (2:1:1, v/v/v) solvent. The TLC plate was sprayed with 0.2% orcinol and 10% methanol/sulfuric acid, baked at 120 °C, and visualized for spots.
Crystallography. For crystallization trials, immobilized metal chromatography-purified protein was con- Healthcare, Fairfield, CT) in 50 mM HEPES-NaOH (pH 7.4) containing 150 mM NaCl. Ligand-free crystals were obtained at 20 °C using the sitting drop vapor diffusion method. A 0.5-µL aliquot of protein solution containing 12 mg/mL of ORF1119 and 50 mM pNP-β-D-PvGal was mixed with an equal volume of a reservoir solution containing 0.1 M phosphate-citrate buffer (pH 5.2) and 50% PEG 300. Despite supplementation of the protein solution with substrate, the crystal structure was obtained as a ligand-free form. Crystals of the E163A mutant complexed with PvGal were obtained by the co-crystallization method. A 0.5-µL aliquot of protein solution containing 17 mg/mL of ORF1119 E163A and 50 mM pNP-β-D-PvGal was mixed with an equal volume of a reservoir solution containing 0.1 M phosphate-citrate buffer (pH 5.2) and 32% PEG 300. Crystals appropriate for X-ray diffraction experiments grew in at least 1 week. Crystals were flash-cooled at 100 K in a stream of nitrogen gas. X-ray diffraction data were collected by using a charge-coupled device camera on beamline BL-17A at the Photon Factory of the High Energy Accelerator Research Organization (KEK, Japan) or beamline BL-26B1 at SPring-8 (Japan). The dataset was indexed, integrated, and scaled by using HKL2000 21. The initial phase was determined by the molecular replacement method using MOLREP 22. Manual model building and refinement were carried out by using Coot 23 and Refmac5 24. A model structure of PvGal was built by using JLigand 25. Molecular graphics were prepared by using PyMOL (DeLano Scientific, Palo Alto, CA).

Kinetic analysis.
The kinetic parameters of the wild-type and mutant enzymes of ORF1119 toward pNP-β-D-PvGal, pNP-β-D-Gal, pNP-β-D-Glc and pNP-β-D-Fuc were determined by using a discontinuous assay in which the liberation of pNP was measured as the absorbance at 405 nm. To determine the kinetic parameters for pNP-β-D-PvGal, different concentrations of substrate (1.0-10 mM) and enzyme (S427E, 23 μg; Y436A, 1.6 μg; and others, 0.14-0.54 μg) in 50 mM HEPES-NaOH (pH 7.5) were separately warmed at 37 °C, and the reaction was initiated by mixing the substrate (4.0 μL) and enzyme (6.0 μL) solutions. After an appropriate time interval (typically, 3 to 12 min), a 2.0 μL aliquot was removed and 4.0 μL of 1 M Na2CO3 was added to stop the reaction. The absorbance was measured by using a NanoDrop ND-1000 Spectrophotometer (Thermo Fisher Scientific). To determine the kinetic parameters for the other three substrates, 50 μL of substrate solution (0.25-100 mM) and 75 μL of enzyme solution (0.012-0.54 μg) in 50 mM HEPES-NaOH (pH 7.5) were mixed to initiate the reaction. After an appropriate time interval (typically, 5 to 20 min), a 25 μL aliquot was sampled, and 50 μL of 1 M Na2CO3 was added to stop the reaction. The absorbance was measured by using a Synergy H1 microplate reader (BioTek).
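The text does not state how Km and kcat were extracted from the initial-rate data. As one possible approach, the sketch below fits the Michaelis-Menten equation through the Hanes-Woolf linearization, [S]/v = [S]/Vmax + Km/Vmax, using ordinary least squares. The function names and the synthetic data are hypothetical and serve only to show the procedure; a nonlinear fit would usually be preferred for noisy measurements.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Fit [S]/v = [S]/Vmax + Km/Vmax by least squares; return (Vmax, Km)."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope = 1 / Vmax
    intercept = mean_y - slope * mean_x    # intercept = Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


if __name__ == "__main__":
    # Synthetic, noise-free rates over the 1.0-10 mM range used for pNP-beta-D-PvGal.
    # Vmax = 2.0 and Km = 3.0 are made-up values, not results from the paper.
    conc_mm = [1.0, 2.0, 4.0, 6.0, 10.0]
    rates = [michaelis_menten(s, 2.0, 3.0) for s in conc_mm]
    print(fit_hanes_woolf(conc_mm, rates))
```

Dividing the fitted Vmax by the molar enzyme concentration in the assay gives kcat, from which kcat/Km follows.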

Alcian blue staining.
To determine the level of extracellular negatively charged glycans on S. pombe, alcian blue staining was performed as described previously 18.

Results
Identification of a soil microorganism with PvGal-ase activity. To search for a PvGal-ase, we isolated more than 200 bacterial strains from soil samples. Culture supernatants of one isolated strain, named HMA207, exhibited PvGal-ase activity when pNP-β-D-PvGal was used as a substrate. HMA207 appeared to be a Gram-positive, bacillary bacterium (Fig. 1A). To identify the strain, we performed a BLAST search based on the 16S rRNA gene sequence, and confirmed that it belongs to the genus Bacillus (Fig. 1B).
Exploration of candidate PvGal-ase genes in strain HMA207. Next, to search for genes encoding a potential PvGal-ase, we conducted whole-genome shotgun sequencing of strain HMA207. As a result, 7.78 Gbp were generated from 3.38 × 10^7 sequencing reads, yielding 1,470-fold coverage, and 18 contigs were assembled. We determined most of the genome sequence, the details of which will be reported elsewhere. Because pNP-β-D-PvGal can be cleaved by β-D-Gal-ase, we searched for β-D-Gal-ase-related genes in the gene annotation generated from our genome analysis; however, no β-D-Gal-ase-related genes were predicted. In addition, we did not find any genes encoding a putative GH2 β-galactosidase or GH53 β-galactanase. Thus, among 31 putative glycosidases in the genome of strain HMA207, we selected three genes (ORF1119, ORF4395 and ORF4971) (Table 1). The three genes were selected for the following reasons: ORF1119 is a GH1-like β-D-Gal-ase, and ORF4395 and ORF4971 are predicted to be disaccharidases.
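The sequencing figures quoted above are related by simple arithmetic: fold coverage is the total number of sequenced bases divided by the genome size. The Python sketch below works backwards from the reported yield and coverage to the genome size and mean read length they imply. Both derived quantities are inferences made here, not values stated in the text, and the function names are hypothetical.

```python
def fold_coverage(total_bases, genome_size_bp):
    """Average sequencing depth: total sequenced bases per genome base."""
    return total_bases / genome_size_bp


def implied_genome_size(total_bases, coverage):
    """Genome size implied by a reported yield and fold coverage."""
    return total_bases / coverage


def mean_read_length(total_bases, n_reads):
    """Average number of bases per sequencing read."""
    return total_bases / n_reads


# Values reported in the text.
TOTAL_BASES = 7.78e9   # 7.78 Gbp
N_READS = 3.38e7       # 3.38 x 10^7 reads
COVERAGE = 1470        # 1,470-fold

if __name__ == "__main__":
    genome_bp = implied_genome_size(TOTAL_BASES, COVERAGE)
    print(f"implied genome size: {genome_bp / 1e6:.2f} Mbp")
    print(f"mean read length: {mean_read_length(TOTAL_BASES, N_READS):.0f} bases")
```

The reported numbers imply a genome of roughly 5.3 Mbp and reads averaging about 230 bases.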

Enzymatic activities of recombinant ORF1119 protein.
We introduced the ORF1119, ORF4395 and ORF4971 sequences into an expression vector and expressed them in an E. coli strain lacking lacZ to circumvent the potential risk of β-Gal-ase contamination in subsequent enzymatic assays. Crude samples from E. coli cells expressing each ORF were tested for PvGal-ase activity using pNP-β-D-PvGal as substrate, and only ORF1119 was found to exhibit PvGal-ase activity. The amino acid sequence of ORF1119 was predicted to lack a signal peptide, suggesting that ORF1119 is an intracellular protein. We prepared recombinant ORF1119 protein and purified it by using a Ni affinity column (Fig. 2A). Using TLC, we then analyzed the reaction between recombinant ORF1119 protein and pNP-β-D-PvGal as substrate. The reaction sample showed a spot that was not identical to either pNP-β-D-PvGal or pNP-β-D-Gal, suggesting that ORF1119 protein cleaves the linkage between Gal and pNP to release PvGal (Fig. 2B,C).

Enzymatic properties of ORF1119 PvGal-ase.
To determine the substrate specificity of the recombinant ORF1119 protein, we measured its hydrolytic activity over a 1-h reaction time at 37 °C using a variety of pNP-glycosides. In addition to pNP-β-D-PvGal, ORF1119 exhibited hydrolase activity toward pNP-β-D-Gal, pNP-β-D-Glc and pNP-β-D-Xyl (Fig. 3A). To analyze substrate specificity in more detail, we monitored the time course of hydrolytic activity and found that ORF1119 exhibits 2.27 and 1.62 U/mg protein for pNP-β-D-PvGal and pNP-β-D-Gal, respectively (Fig. 3B). Collectively, ORF1119 protein has the highest activity toward pNP-β-D-PvGal, confirming that this enzyme preferentially hydrolyzes β-D-PvGal.
The optimum pH for ORF1119 PvGal-ase activity was found to be 7.5. Examination of the thermal stability of the enzyme by heating at various temperatures for 10 min indicated that the enzyme was stable at temperatures up to 42 °C.
Next, we examined the Km values for pNP-β-D-PvGal and pNP-β-D-Gal, and unexpectedly found that they were 0.81 mM and 3.3 mM, respectively. We then determined the kcat/Km values for pNP-β-D-PvGal and pNP-β-D-Gal, which were 6.3 mM−1·s−1 and 0.06 mM−1·s−1, respectively. Taken together, these findings suggest that the stronger activity toward pNP-β-D-PvGal is due to the high kcat value.
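The comparison of the two substrates can be checked with a few lines of arithmetic. The sketch below takes the Km and kcat/Km values exactly as reported in the preceding paragraph and derives the fold difference in catalytic efficiency and the kcat values they imply (kcat = (kcat/Km) × Km). The derived kcat values are calculations made here and are not figures quoted by the authors; the dictionary keys are shorthand labels.

```python
# Values as reported in the text (Km in mM, kcat/Km in mM^-1 s^-1).
KM = {"PvGal": 0.81, "Gal": 3.3}
KCAT_OVER_KM = {"PvGal": 6.3, "Gal": 0.06}


def kcat(substrate):
    """Turnover number implied by the reported kcat/Km and Km (s^-1)."""
    return KCAT_OVER_KM[substrate] * KM[substrate]


def fold_difference(a, b):
    """How many times larger a is than b."""
    return a / b


if __name__ == "__main__":
    efficiency_ratio = fold_difference(KCAT_OVER_KM["PvGal"], KCAT_OVER_KM["Gal"])
    kcat_ratio = fold_difference(kcat("PvGal"), kcat("Gal"))
    print(f"kcat/Km ratio (PvGal vs Gal): {efficiency_ratio:.0f}-fold")
    print(f"implied kcat: PvGal {kcat('PvGal'):.2f} s^-1, Gal {kcat('Gal'):.2f} s^-1")
    print(f"implied kcat ratio: {kcat_ratio:.1f}-fold")
```

On these numbers the efficiency differs by about 105-fold, of which roughly 26-fold comes from kcat and the remaining factor of about 4 from Km.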
Analysis of ORF1119 homologs. Next, to obtain information on potential homologs, we performed a BLAST search based on the amino acid sequence of ORF1119 and found that ORF1119 homologs widely exist in bacteria (Fig. 4). Among these homologs, Bacillus subtilis BglH belongs to the GH1 family, similar to ORF1119, and exhibits 29.4% identity to ORF1119 (Fig. 5). The catalytic nucleophile (E374) and the catalytic acid/base (E163) residues of ORF1119 were conserved with other GH1 enzymes 26,27 . We prepared the recombinant BglH from E. coli cells, tested its specificity using pNP substrates, and found that BglH does not hydrolyze pNP-β-D-PvGal (Table 2). We further analyzed BglC, a closer homolog of ORF1119 from Clostridium saccharoperbutylacetonicum (Figs 4 and 5). Using the same methods, we revealed that BglC exhibits PvGal-ase activity similar to that of ORF1119 protein (Table 2).
Crystal structure of ORF1119 PvGal-ase. Because ORF1119 protein exhibited hydrolase activity toward both pNP-β-D-PvGal and pNP-β-D-Gal, we investigated which amino acid residues are responsible for the recognition of these substrates. The crystal structure of ORF1119 was determined by molecular replacement using the structure of Halothermothrix orenii BGL 28,29 (HoBGL, PDB code 3TA9) as a search model. Structures of the ligand-free form and PvGal-bound form were determined at 2.5 Å and 2.45 Å resolution, respectively (Table 3). The complex crystal was prepared by co-crystallization using the catalytic acid/base residue mutant E163A and pNP-β-D-PvGal. The crystals belonged to space group P3₂21 and contained two protein molecules in the asymmetric unit. Part of the structure was disordered (residues 311-320 in chain A of the PvGal complex) and was not included in the final model (red dotted line in Fig. 6A). Because the structures of all four chains (chains A and B in both the ligand-free and complex forms) were almost the same (root mean square deviation for Cα atoms, <0.6 Å), below we describe chain A of the PvGal-bound form. The overall structure of ORF1119 was similar to those of other GH1 enzymes, which have a typical (β/α)8 barrel fold (Fig. 6A). The active site is located at the bottom of the pocket formed by loops of the barrel, a characteristic of most GH1 enzymes.
Active site architecture. In the complex structure, clear electron density is observed for the PvGal ligand, apart from the methyl group in the pyruvate moiety of PvGal (Fig. 6B). Although the complex structure was obtained by co-crystallization with pNP-β-D-PvGal, electron density of the pNP moiety is not observed in the structure, probably due to slow hydrolysis of the substrate during crystal growth. It was shown that a catalytic acid residue mutant (E170G) of β-glucosidase from Agrobacterium faecalis exhibited weak but significant remaining activity (kcat = 0.015 s−1) toward pNP-β-D-glucoside, which has a "good" leaving group (alcohol of low pKa) 27 . The remaining activity of the catalytic acid/base residue mutant was considered to be enough to release PvGal during the long time scale of protein crystallization (typically 1-4 weeks in this study). The catalytic residues of ORF1119, E374 (nucleophile) and E163A (acid/base mutant), are in close proximity to the anomeric C1 and O1 atoms of PvGal, respectively (Fig. 7A). The O-2 and O-3 hydroxyl groups of PvGal are extensively recognized, forming one or more direct hydrogen bonds with the side chains of Q17, H118, N162 and W428. The carboxyl group of pyruvate forms hydrogen bonds with S427, K434 and Y436. W420 forms a stacking interaction with the sugar ring. Figure 7B and C show the active site structures of two GH1 enzymes: 6-phospho-β-glucosidase/galactosidase Gan1D from Geobacillus stearothermophilus 30 (PDB code 4ZEN, sequence identity 44.6%), and β-glucosidase HoBGL from Halothermothrix orenii 28,29 (PDB code 4PTX, sequence identity 39.1%). As shown in Fig. 7, the three-dimensional positions of the catalytic residues (E374 and E163) of ORF1119 are also conserved in other GH1 enzymes. Almost all of the residues that recognize the sugar, except for residues around the O-6 hydroxyl group, are conserved in these enzymes.
Residues around the O-6 hydroxyl group are conserved between ORF1119 and Gan1D, both of which have substrate preference toward sugars carrying a negatively charged group near the O-6 atom. In HoBGL, three different residues (E408, Y411 and F417) are located around the O-6 hydroxyl group, and the side chain conformation of K415 is different from that of K434 in ORF1119. These results indicate that the four residues in this region of ORF1119 (S427, N430, K434 and Y436) seem to be involved in binding to the pyruvate moiety.
Mutational analysis. We constructed seven mutant proteins in which the four residues identified above were substituted with either the corresponding amino acid of HoBGL or Ala (S427A, S427E, N430A, N430Y, K434A, Y436A and Y436F). Table 4 lists the kinetic parameters of the wild-type ORF1119 and mutant enzymes toward pNP-β-D-PvGal. As expected, the activity of the two S427 mutant enzymes (S427A and S427E) toward pNP-β-D-PvGal was significantly reduced owing to an increased Km value. In particular, the kcat/Km value of S427E was reduced 200-fold relative to that of the wild-type enzyme, indicating that the elongated side chain of Glu probably causes steric hindrance with the large pyruvyl group of PvGal. The Y436A enzyme also showed reduced activity (approximately 10-fold in kcat/Km) toward pNP-β-D-PvGal owing to a reduced kcat value, while substitution of Y436 with a residue with a similar-sized side chain (Y436F) did not largely affect the activity. This result suggests that the large aromatic side chain of Y436 supports recognition of the pyruvyl group. Substitution of the other residues (N430 and K434) led to similar or slightly increased activity toward pNP-β-D-PvGal, suggesting that these residues do not play important roles in substrate recognition.
We also measured the kinetic parameters of enzyme activity toward three other substrates, pNP-β-D-Gal (without the pyruvyl group), pNP-β-D-Glc (4-epimer of Gal), and pNP-β-D-Fuc (6-deoxy variant of Gal). Wild-type ORF1119 showed relatively higher activity toward pNP-β-D-Fuc as compared with pNP-β-D-Gal or pNP-β-D-Glc, indicating that the enzyme prefers a hydrophobic group at the C-6 position. The S427E mutant enzyme exhibited higher activity toward these three substrates (2.4- to 7.0-fold increase in kcat/Km) as compared with the wild-type enzyme. The longer side chain of Glu might alter the active site so that it is suitable for non-pyruvylated substrates. Other mutations also slightly affected the activity toward non-pyruvylated substrates, but the changes in kcat/Km values were smaller than those observed for the S427E mutant enzyme. Collectively, these results indicate that S427 is the most crucial residue for recognition of the PvGal moiety.
ORF1119 protein releases PvGal from fission yeast galactomannan. Finally, we tested whether ORF1119 PvGal-ase can hydrolyze not only the artificial substrate pNP-β-D-PvGal but also a natural PvGal-containing oligosaccharide. PvGal is present in glycans on the cell surface of S. pombe 6,7 . We stained the cells with alcian blue, which binds to negatively charged oligosaccharides, including the pyruvate moiety of PvGal. The addition of ORF1119 to S. pombe cells significantly reduced the level of alcian blue staining (Fig. 8A). We confirmed that the reduction in alcian blue staining was time-dependent, suggesting that the ORF1119 reaction occurred enzymatically and that ORF1119 can hydrolyze a natural PvGal-containing oligosaccharide (Fig. 8B).

Discussion
In this study, we isolated a Bacillus strain that harbors a PvGal-ase encoded by ORF1119. This protein belongs to the GH1 family, which generally has broad substrate specificity. Indeed, ORF1119 protein exhibited hydrolytic activity not only toward pNP-β-PvGal but also weakly toward pNP-β-Gal, suggesting that it hydrolyzes these substrates at the same active site but discriminates between them. Moreover, ORF1119 protein could not hydrolyze pNP-β-Lac or pNP-β-PvLac, both of which include a Glc residue between the Gal and pNP moieties, suggesting that ORF1119 protein exclusively recognizes a galactose-type monosaccharide group at subsite −1 and does not accept a Glc moiety at subsite +1.
Table 4. Kinetic parameters of wild-type and mutant ORF1119 enzymes toward pNP-substrates.
Our BLAST search revealed that ORF1119 protein-like PvGal-ases are present in a variety of microorganisms (Fig. 4). Among them, we further examined two ORF1119 homologs: the putative β-D-glucosidase BglC from Clostridium and the aryl-phospho-β-D-glucosidase BglH from Bacillus. Although the catalytic residues (E163 and E374 in ORF1119) are conserved among these homologs, BglH did not exhibit PvGal-ase activity toward pNP-β-PvGal as a substrate, suggesting that ORF1119 homologs contain similar active sites but have different substrate specificities.
On the basis of our structural and mutational analyses, we propose that S427 is responsible for recognition of PvGal, and that the aromatic side chain of Y436 supports this recognition. In contrast, N430 and K434 are not crucial residues. In addition, we confirmed that ORF1119 PvGal-ase hydrolyzes not only artificial pNP substrates but also the natural substrate S. pombe galactomannan. In the genome of S. pombe, there is no obvious homolog of ORF1119. Therefore, it is currently hard to predict how S. pombe galactomannan is metabolized. Previously, we generated pyruvylated human-type complex glycopeptides by using a Pvg1p H168C mutant that can attach Pv to the terminal β-1,4-linked Gal residue of human-type complex glycopeptide 20 . Thus, we also tested whether ORF1119 PvGal-ase can cleave PvGal from pyruvylated human-type complex glycopeptide; in this case, however, we did not detect enzymatic activity (data not shown). This might be because ORF1119 PvGal-ase can cleave β-1,3-linked PvGal in S. pombe galactomannan but cannot cleave β-1,4-linked PvGal in human-type complex glycopeptide. Further molecular dissection of the substrate recognition mechanisms of ORF1119 PvGal-ase will be required to engineer an enzyme that can hydrolyze pyruvylated human-type complex oligosaccharides.
In summary, we have characterized a novel PvGal-ase encoded by ORF1119 in Bacillus strain HMA207. Given that PvGal may exist in oligosaccharide structures of diverse organisms, more PvGal-ases are likely to be identified in addition to ORF1119 protein and BglC. Because Pv has features similar to those of mammalian sialic acid, PvGal-releasing enzymes would be beneficial for novel glycoengineering to produce new oligosaccharides with sialic acid-mimicking properties.