Insights into the κ/ι-carrageenan metabolism pathway of some marine Pseudoalteromonas species

Pseudoalteromonas is a globally distributed marine-associated genus that can be found in a broad range of aquatic environments, including in association with macroalgal surfaces, where its members may take advantage of these rich sources of polysaccharides. The metabolic systems that confer the ability to metabolize this abundant form of photosynthetically fixed carbon, however, are not yet fully understood. Through genomics, transcriptomics, microbiology, and specific structure-function studies of pathway components we address the capacity of newly isolated marine pseudoalteromonads to metabolize the red algal galactan carrageenan. The results reveal that the κ/ι-carrageenan specific polysaccharide utilization locus (CarPUL) confers on isolates possessing this locus the ability to grow on this substrate. Biochemical and structural analysis of the enzymatic components of the CarPUL supported the development of a detailed model of the κ/ι-carrageenan metabolic pathway deployed by pseudoalteromonads, thus furthering our understanding of how these microbes have adapted to a unique environmental niche.

[Figure residue: gel panels with lanes labelled standards (κ-NC8, κ-NC4, β-NC2, βκ-NC4, κ-NC2), κ-carrageenan only, κ-carrageenan + S1_19A, ι-carrageenan only, and ι-carrageenan + S1_19A.] The superscript "p" in the sample label indicates that the carrageenan was pretreated overnight with the sulfatase, followed by heat inactivation of the sulfatase then digestion with the GH16. The asterisk (*) indicates the band corresponding to excess ANTS fluorophore.

[...] bound in the active site with a focus on the DA2S and G4S residues in the potential +1 and +2 subsites, respectively. Representative electron density for S1_NC active site residues. The gray mesh shows the electron density map as a maximum likelihood/σa-weighted 2Fo-Fc map contoured at 1.0 σ. The green mesh shows the unbiased electron density map as a maximum likelihood/σa-weighted Fo-Fc map (contoured at 3.0 σ) produced by refinement with the atoms for residue 84 omitted. Residues are coloured as gray lines. In panels b-g, representative electron density for S1_NC residue 84 is shown when modeled as FGly [...]

[...] amino acid sequence identity with both GH16A and GH16C, but ~78% amino acid sequence identity with a Paraglaciecola hydrolytica S66ᵀ GH16 enzyme that is classified as a β-carrageenan specific endo-hydrolase 2. The gene encoding GH16A, which is predicted to be a periplasmic protein, is also present in the Pseudoalteromonas strains, though the gene is fragmented in P. carrageenovora 9ᵀ and U2A. With 92% amino acid sequence identity over their entire lengths, GH16C is the orthologue of the well characterized κ-carrageenanase CgkA (PCAR9_p0048), which is responsible for the secreted κ-carrageenanase activity in P. carrageenovora 9ᵀ 1,3-6 and therefore possibly also in PS47. GH16A and GH16C have 76% amino acid sequence identity; however, GH16A lacks the ~100 amino acid C-terminal domain present in GH16C and CgkA (Supplementary Figure 4a). Modeling the structures of GH16A and GH16C using CgkA as a template indicates 100% conservation of the active sites, pointing to conserved specificity amongst these enzymes (Supplementary Figure 4b).
3. Structural analysis of S1_19B. The structure of the wild type S1_19B enzyme at 2.50 Å resolution was solved using S1_19A from PS47 (PDB ID code: 6BIA) 7 as the search model. The final refined model contained four molecules in the asymmetric unit arranged as a tetramer (Supplementary Figure 7b). PISA analysis 8 of the S1_19B structure predicts the tetramer to be composed of two dimers, with buried surface areas of ~1460 Å² between the chains participating in the dimer interface (chains A and B, or C and D), and only ~362 Å² between chains from the different dimers (i.e. between chains A and C). The molecular interactions of the dimer interfaces show an extensive and direct hydrogen bond network (Supplementary Figure 7b inset). To assess the oligomeric state of S1_19B in solution, the protein was passed through a calibrated size exclusion chromatography column. The elution volume for S1_19B corresponds to an experimental molecular weight of 91.3 kDa (Supplementary Figure 7c). This is between the expected molecular weight of the S1_19B monomer (55.828 kDa) and dimer (112 kDa), but is closer to that of a dimer. The PISA analysis and the elution volume results support the dimer as being the biological unit for S1_19B. In contrast, the other S1_19 family member in the CarPUL, S1_19A, adopts a stable trimeric structure 7.
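The comparison between the apparent mass and the monomer and dimer masses can be checked with a few lines of arithmetic. The following is a minimal sketch, using only the two masses quoted above; the function name `closest_oligomer` and the upper limit of four subunits are illustrative assumptions, not part of the original analysis.

```python
# Masses are taken from the text above; everything else is illustrative.
MONOMER_KDA = 55.828   # expected mass of the S1_19B monomer
OBSERVED_KDA = 91.3    # apparent mass from size exclusion chromatography


def closest_oligomer(observed_kda, monomer_kda, max_subunits=4):
    """Return (n, expected_mass) for the n-mer whose mass is nearest the observed mass."""
    candidates = [(n, n * monomer_kda) for n in range(1, max_subunits + 1)]
    return min(candidates, key=lambda c: abs(c[1] - observed_kda))


n_subunits, expected_kda = closest_oligomer(OBSERVED_KDA, MONOMER_KDA)
print(f"apparent subunit count: {OBSERVED_KDA / MONOMER_KDA:.2f}")
print(f"nearest oligomer: n = {n_subunits} ({expected_kda:.1f} kDa)")
```

A nearest-mass comparison of this kind only ranks the candidates; it does not account for molecular shape, which also affects elution volume, so it is best read alongside the PISA interface analysis.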
4. Structural analysis of S1_NC. The final refined model of native S1_NC contained two molecules in the asymmetric unit arranged as a homodimer with C2 symmetry [...] 9. Despite extensive efforts, we were unable to circumvent generation of this inappropriately matured form of the protein. Therefore, to provide insight into the specificity of this enzyme we generated C84A and C84S mutants, which prevented post-translational modification of the catalytic site, and attempted to determine structures of these mutants in complex with neocarrageenan oligosaccharides and fragments thereof. These initial efforts led to separate structures of S1_NC C84A and S1_NC C84S in complex with a sulfate and κ-NC2, respectively (Supplementary Figures 10a and 10b). Superimposition of these structures revealed a composite mimicking a product complex of ι-NC2 hydrolysis whereby 0 and +1 subsites were occupied by the DA and G4S residues, respectively, of the κ-NC2 "product," while the S-subsite was occupied by a free sulfate found in proximity to the O2 of the DA residue (Supplementary Figure 10c). Informed by this, we subsequently determined the structure of S1_NC C84S in complex with ι-NC4 (Supplementary Figure 10d, see also main text).

5. Characterization of the 3,6-anhydro-D-galactose dehydrogenase, DauA. Recombinant DauA from PS47 had no significant activity on DA when using NAD+ as a co-factor. However, it displayed activity on DA when using NADP+ as a co-factor and had optimum activity between pH 7. [...] DauA was crystallized in two distinct crystal forms, each in the space group P2₁. Native DauA, whose structure was determined by molecular replacement, crystallized with four molecules in the asymmetric unit, while DauA in complex with NADP+ crystallized with six molecules in the asymmetric unit. The common quaternary structure of the two crystal forms, which is predicted by PISA analysis to be stable, is a dimer.
In the NADP+ complex, the 2′-phosphate group of the adenosine moiety is accommodated in a pocket where two direct and two water mediated hydrogen bonds are made between the phosphate and the protein (Supplementary Figure 13b). The preference of DauA for NADP+ over NAD+ distinguishes the Pseudoalteromonas enzyme from ZgDauA, which prefers NAD+ 10. Notably, S174 of DauA, which provides the end-wall of the phosphate binding pocket, is substituted by a glutamic acid in ZgDauA. Additionally, D204, which in DauA provides a hydrogen bond with the phosphate, is substituted by a valine in ZgDauA (Supplementary Figure 13c). These changes may provide steric hindrance with the 2′-phosphate of NADP+ and a loss of H-bonding potential in ZgDauA, possibly explaining the difference in co-factor preference between the two enzymes.
6. Activity of S1_19A on ι-NC4. Our previous characterization of S1_19A indicated that this enzyme is primarily an endo-acting 4S-sulfatase that prefers ι-carrageenan 7. However, the enzyme displayed the ability to process ι-NC4, but not ι-NC2, and had very low activity on κ-carrageenan oligosaccharides with a degree of polymerization of four or more. X-ray crystallographic and NMR analyses of the interaction of S1_19A with substrate indicated its specificity for G4S residues. Furthermore, the poise of ι-NC4 in the crystal structure of the S1_19A ι-NC4 complex showed recognition of the non-reducing end ι-neocarrabiose motif in all three of the protein monomers in the asymmetric unit. Given the ability to process an ι-carrageenan oligosaccharide and the role this could potentially play in the Pseudoalteromonas pathway of carrageenan processing, we further probed ι-NC4 processing by mass spectrometry. Profiling of oligosaccharide species in ι-NC4 and its digestion products treated with sulfatase for 1 or 2 h was carried out using an LC-HRMS/DAD/ELSD method, with a graphitized carbon column as reported 11. The starting ι-NC4 substrate showed predominantly masses consistent with a tetrasaccharide having four sulfate groups, as expected for ι-NC4 (Supplementary Figure 14a and 14b, and Supplementary Table 7). Minor amounts of di- and tri-sulfated species were detected and are likely due to ionization induced neutral loss of sulfate groups.
After one and two hours of incubating ι-NC4 with S1_19A a new product was formed with a different retention time; this was the same for both samples so only the 2 h sample was used for further analysis and comparison (Supplementary Figure 14c). The product showed predominantly masses consistent with a tetrasaccharide having three sulfate groups, indicating the enzymatic removal of a single sulfate group (Supplementary Figure 14d and Supplementary Table 6). Minor amounts of di-sulfated species were detected, which are also likely due to ionization induced neutral loss of sulfate groups. Thus, combining the mass spectrometry and structural data, our interpretation is that S1_19A is capable of removing the 4-sulfate from the G4S residue adjacent to the non-reducing end, but is unlikely to desulfate all G4S residues in an oligosaccharide.
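The assignment of tetra-, tri-, and di-sulfated species rests on the mass difference associated with one sulfate ester, which corresponds to SO3 (about 79.957 Da monoisotopic). The following is a minimal sketch of how that difference translates into m/z shifts in negative mode; the function names are illustrative, and the neutral mass passed in the example is a placeholder rather than a value from the Supplementary Tables.

```python
# Monoisotopic masses in Da (standard atomic values).
SULFUR = 31.972071
OXYGEN = 15.994915
PROTON = 1.007276

SO3 = SULFUR + 3 * OXYGEN  # mass lost per sulfate ester removed (R-OSO3H -> R-OH)


def mz_negative(neutral_mass, charge):
    """m/z of the [M - zH]^z- ion for a neutral monoisotopic mass M."""
    return (neutral_mass - charge * PROTON) / charge


def desulfation_mz_shift(charge, sulfates_lost=1):
    """Decrease in m/z when sulfate esters are lost as SO3 at a fixed charge state."""
    return sulfates_lost * SO3 / charge


# Placeholder neutral mass, used only to show the calculation.
example_mass = 1000.0
for z in (1, 2, 3):
    before = mz_negative(example_mass, z)
    after = mz_negative(example_mass - SO3, z)
    print(f"z={z}: {before:.4f} -> {after:.4f} (shift {desulfation_mz_shift(z):.4f})")
```

Because enzymatic desulfation and in-source neutral loss produce the same mass difference, the two are distinguished in the text by retention time rather than by mass alone.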
7. Analysis of EU509_08830, EU509_08835, and EU509_08875 - hypothetical α-1,3-(3,6-anhydro)-D-galactosidases. As an initial step towards identifying the α-1,3-(3,6-anhydro)- [...] Figure 15). EU509_08830 and EU509_08875, which share 38% amino acid sequence identity, are predicted with 100% confidence by Phyre2 to have a 7-bladed β-propeller fold (Supplementary Figure 16). This fold is associated with a variety of different functions, but, notably, is found in carbohydrate specific lectins, polysaccharide lyases, and glycoside hydrolases, the latter two of which are classes of enzymes that cleave glycosidic bonds. Similarly, the 6-bladed β-propeller fold predicted to be adopted by EU509_08835 is also associated with a variety of functions, including glycoside hydrolysis (Supplementary Figure 16). On this basis, we favor EU509_08830, EU509_08835, and EU509_08875 as the most likely candidates for the α-1,3-(3,6-anhydro)-D-galactosidase(s) in the CarPUL.

Comparison of agarose and carrageenan degradation pathways.
There are two general models proposed for saccharification of agarose, which is an algal galactan that is related to carrageenan but is typically non-sulfated (or less sulfated) and has 3,6-anhydro-L-galactose in place of DA. Both models rely on initiation of depolymerization by the action of endo-acting β-agarases to generate a pool of neoagarooligosaccharides. One model, which we refer to as the "exo model," then relies on the sequential action of an exo-α-1,3-L-neoagarooligosaccharide hydrolase (GH117) and an exo-β-D-galactosidase (GH2) to reduce the oligosaccharides to monosaccharides 14. The other model uses a neoagarobiose releasing β-D-agarase (GH50) to reduce neoagarooligosaccharides to neoagarobiose, which is then followed by hydrolysis to monosaccharides by a GH117 15.
The Pseudoalteromonas CarPUL does not contain any genes encoding a candidate exo-β-D-galactosidase, such as the GH2 enzymes of Z. galactanivorans and Paraglaciecola hydrolytica S66ᵀ that have exo-β-galactosidase activity on the β-1,4-linkages in carrageenan 2,10, but instead employs a β-NC2 releasing β-carrageenanase. Therefore, the model of ι/κ-carrageenan metabolism by our pseudoalteromonad isolates most closely parallels the latter model of agarose metabolism. In contrast to the pseudoalteromonad model, the pathway proposed for ι/κ-carrageenan metabolism by Z. galactanivorans utilizes a parallel of the "exo model" 10.
In both models of carrageenan depolymerization a key step is the hydrolysis of the non-reducing terminal α-1,3-(3,6-anhydro)-D-galactose residue. While the α-1,3-(3,6-anhydro)-D-galactosidase has not yet been identified in CarPUL-containing Pseudoalteromonas species, their growth phenotype on κ- and ι-carrageenan indicates that this enzyme activity must be present. The failure to identify candidate enzymes belonging to GH families 127 and 129 indicates that the pseudoalteromonad solution to hydrolyzing the α-glycosidic linkage in carrageenan is different from that of Z. galactanivorans, where this activity is clearly attributed to GH127 and GH129 enzymes, thus highlighting another difference between the two pathways.
With respect to the sulfatases employed, which are a notable feature of carrageenan metabolism, the two characterized pathways use different complements of enzymes. The Pseudoalteromonas endo-acting sulfatase S1_19A and Z. galactanivorans ZGAL_3145 (family S1_19) display very similar biochemical properties: activity most consistent with endo-G4S ι-carrageenan sulfatase specificity and the capability of producing α-carrageenan on the non-reducing end of an ι-carrageenan oligosaccharide. The κ-carrageenan specific G4S-sulfatase from Z. galactanivorans (ZGAL_3146, family S1_7) displays properties most consistent with endo-activity whereas the κ-carrageenan specific G4S-sulfatase S1_19B from Pseudoalteromonas is exo-acting on the non-reducing end.
Finally, the DA2S-sulfatases from Z. galactanivorans (ZGAL_3151, family S1_17) and Pseudoalteromonas (S1_NC) are both exo-acting on non-reducing end DA2S residues; however, ZGAL_3151 is α-carrageenan specific whereas S1_NC has structural properties most consistent with action on both ι- and α-carrageenan. Thus, not only does the mode of oligosaccharide depolymerization differ between the two pathways but the order in which desulfation occurs must also differ 10.

Supplementary Methods.
Oligomeric state determination using size exclusion chromatography - Elution volumes [...] 2,000 kDa) was used to determine the void volume (Vo) and a standard curve was created plotting molecular mass vs. Ve/Vo for each respective protein standard. Samples of S1_19B and S1_NC at 10 mg mL⁻¹ were applied to the column and their Ve/Vo values plotted against the standard curve. All samples and standards were run at a flow rate of 0.5 mL min⁻¹ in 500 mM NaCl and 20 mM Tris (pH 8.0).
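A standard curve of this kind can be sketched as a least-squares fit followed by interpolation of the sample's Ve/Vo value. The example below assumes the common practice of fitting log10(molecular mass) against Ve/Vo; the text does not state which transformation was used, and the standards listed are hypothetical values chosen only to make the example runnable, not the calibration data from this study.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards):
    """Fit log10(mass in kDa) against Ve/Vo for (mass_kda, ve_over_vo) pairs."""
    ratios = [ratio for _, ratio in standards]
    log_masses = [math.log10(mass) for mass, _ in standards]
    return fit_line(ratios, log_masses)


def estimate_mass_kda(ve_over_vo, slope, intercept):
    """Read an apparent molecular mass off the fitted standard curve."""
    return 10 ** (slope * ve_over_vo + intercept)


# Hypothetical standards (mass in kDa, Ve/Vo); not data from this study.
HYPOTHETICAL_STANDARDS = [(670.0, 1.20), (158.0, 1.55), (44.0, 1.85), (17.0, 2.10)]

slope, intercept = calibrate(HYPOTHETICAL_STANDARDS)
print(f"apparent mass at Ve/Vo = 1.70: {estimate_mass_kda(1.70, slope, intercept):.1f} kDa")
```

Normalizing elution volumes to the void volume, as described above, makes the curve less sensitive to small differences in column packing between runs.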
Mass spectrometry - For liquid chromatography high resolution mass spectrometry/diode array detector/evaporative light scattering detector (LC-HRMS/DAD/ELSD), an Accela™ 1250 LC system was coupled to an Exactive™ mass spectrometer (Thermo Fisher Scientific) equipped with an electrospray ionization source (HESI-II) probe. Through a flow splitter, the LC eluent was simultaneously sent to a diode array detector (UltiMate 3000 DAD) to acquire UV signal and subsequently to an evaporative light scattering detector (Alltech 3300 ELSD). A makeup solution consisting of 0.1% formic acid in 80% methanol was delivered constantly at 100 µL to the MS. Separation was based on a modified method by Itoh et al. 11. It was carried out on a Hypercarb column (100 x 2.1 mm, 5 µm, Thermo Scientific) using a mobile phase consisting of (A) 5 mM ammonium acetate pH 9.6 with 2% acetonitrile and (B) 5 mM ammonium acetate pH 9.6 with 80% acetonitrile, with a linear gradient from 5% B to 30% B in 30 min, and then to 100% B in another 5 min, before returning to the initial conditions, at a flow rate of 400 µL min⁻¹.
HRMS was acquired in negative polarity at 50,000 resolution. The following MS conditions were used: sheath flow 15; auxiliary gas flow rate 4; spray voltage -2.3 kV; both capillary and heater temperature at 250 °C. Mass range was scanned from m/z 100-2,000. MS/MS was performed in high-energy collisional dissociation (HCD) scan at 10,000 resolution, using 60 eV. Maximum inject time for both MS and MS/MS channels was 50 ms.
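The LC gradient described above is piecewise linear, so the mobile phase composition at any time point can be reconstructed by interpolation. The following is a minimal sketch under stated assumptions: the breakpoints are those given in the text (5% B at 0 min, 30% B at 30 min, 100% B at 35 min), the function names are illustrative, and the return to initial conditions after 35 min is not modeled because its timing is not specified.

```python
# (time in min, %B) breakpoints taken from the gradient described in the text.
GRADIENT = [(0.0, 5.0), (30.0, 30.0), (35.0, 100.0)]

# Acetonitrile content (% v/v) of each mobile phase, as stated in the text.
ACN_IN_A = 2.0
ACN_IN_B = 80.0


def percent_b(t_min, program=GRADIENT):
    """Linearly interpolate %B at time t_min; values are held outside the programmed range."""
    if t_min <= program[0][0]:
        return program[0][1]
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return program[-1][1]


def percent_acetonitrile(t_min):
    """Overall acetonitrile content of the mixed mobile phase at time t_min."""
    b = percent_b(t_min) / 100.0
    return (1.0 - b) * ACN_IN_A + b * ACN_IN_B


for t in (0, 15, 30, 32.5, 35):
    print(f"t = {t:>4} min: {percent_b(t):5.1f}% B, {percent_acetonitrile(t):5.1f}% acetonitrile")
```

Expressing the program as a list of breakpoints keeps the sketch easy to adapt if the re-equilibration step or a modified gradient needs to be added.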