Introduction

Seaweeds are a major resource in the marine ecosystem, and they contain various polysaccharides, including alginate, mannan, cellulose and laminarin. Marine algal polysaccharides are distinct from terrestrial forms in both their diversity and their modifications1. For example, sulfated polysaccharides are found in brown marine seaweeds, which may be related to the high salinity of their environment2. Notably, seaweed polysaccharides and their corresponding degradation products exhibit various biomedical properties, including antitumor, antihypertensive, and antioxidant activities3,4,5,6. The biomedical activities of seaweed polysaccharides are a function of their monosaccharide composition, molecular weight, and sulfate content. Lower-molecular-weight oligosaccharides have discrete advantages over longer polysaccharides in pharmaceutical uses6,7. Consequently, the identification of novel enzymes capable of digesting seaweed polysaccharides is important.

For some herbivorous marine mollusks, seaweeds provide not only a habitable environment but also their daily diets. Therefore, these marine mollusks likely possess a breadth of polysaccharide-degrading enzymes with unique substrate specificities and mechanisms of action8,9. Indeed, a number of polysaccharide-degrading enzymes, including alginate lyase, mannanase, cellulase and laminarinase, have been derived from the digestive tracts of marine mollusks10,11,12,13. These enzymes exhibit high cleavage efficiency and unique substrate specificity, suggesting that marine mollusks could be a source of novel polysaccharide-degrading enzymes.

In the current study, we focused on the Zhikong scallop (Chlamys farreri), which is widely distributed in the coastal waters of northern China. Scallops are filter feeders and eat plankton, including macroalgae fragments, microalgae, bacteria and copepods. In previous studies, endo-β-1,3-glucanases from the scallops Chlamys albidus and Mizuhopecten yessoensis were extracted and characterized14,15, and a novel α-L-fucosidase was isolated from the scallop Pecten maximus16. However, although several polysaccharide-degrading enzymes have been identified from the scallop hepatopancreas, the diversity of these scallop enzymes remains unclear. To identify scallop hepatopancreas polysaccharidases more comprehensively, we used a proteomic approach. A variety of proteins were identified, including several distinct polysaccharide-degrading enzymes and enzymes associated with sulfate catabolism. Our study provides new information relevant to enzymatic adaptation to marine algal diets and a description of novel marine polysaccharide-degrading enzymes.

Results and Discussion

Mass spectroscopy analysis

Proteins extracted from the scallop digestive gland were subjected to SDS-PAGE (Figs 1 and 2), and the protein mixtures were analyzed using a shotgun proteomics approach as described in the Methods section. LC-MS/MS analysis identified 477 unique proteins, of which 435 could be annotated using BlastX against the Swiss-Prot database (Supplementary Table S1). These analyses represent a comprehensive protein profile for the scallop hepatopancreas.

Figure 1

Photograph of scallop hepatopancreas.

The scallop hepatopancreas is marked by a dashed circle.

Figure 2

SDS-PAGE analysis of scallop hepatopancreas extract.

Lane 1, Scallop hepatopancreas extract; lane M, protein marker.

Gene ontology (GO) term annotation and enrichment analysis of the scallop hepatopancreatic proteins demonstrated that the most enriched GO categories were metabolic and biological processes, including 109 proteins with hydrolase activity (GO: 0016787) (P value: 5.72E-22) and 164 proteins associated with organic substance metabolic processes (GO: 0071704) (P value: 9.15E-15) (Table 1). Both of these GO categories were dominated by glycoside hydrolases, sulfatases, proteinases and components of the proteasome, previously reported as lysosomal enzymes17 (Table 1).

Table 1 GO term enrichment of scallop hepatopancreas proteins.

Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis indicated these hepatopancreatic proteins were significantly enriched in pathways involved in glycan degradation and the biosynthesis of amino acids as well as lysosomal and proteasomal pathways. Many of the proteins associated with these pathways contribute to carbohydrate and protein metabolism (Table 2). The proteins identified correlate well with the biological function of the hepatopancreas, the main digestive organ in scallops.

Table 2 KEGG enrichment of scallop hepatopancreas proteins.

Sequence analysis of the enzymes relevant to marine polysaccharide metabolism

The composition of marine polysaccharides is complex and diverse18. For example, agar is mainly composed of repetitive units of β-D-galactose and 3,6-anhydro-α-L-galactose19, alginate derived from brown algae is mainly composed of heteropolysaccharides of α-L-guluronate and β-D-mannuronate20, and the carrageenans derived from red algae consist of galactopyranose units21. Marine organisms require a variety of enzymes with different substrate specificities to digest and obtain these nutrients. In the scallop hepatopancreas, we identified mannosidase, α-glucosidase, β-galactosidase, endoglucanase, β-glucuronidase, chitotriosidase, xylose isomerase and α-L-fucosidase (Table 3). With the exception of endoglucanase and α-L-fucosidase14,15,16, this is the first description of these enzymes being isolated from the scallop hepatopancreas. Notably, for some of these types of enzymes, more than one novel enzyme was found (Table 3), and further analysis suggests these enzymes are likely to help the scallop digest diverse food sources as described below.

Table 3 Polysaccharide metabolism related enzymes identified in scallop hepatopancreatic extract by mass spectroscopy.

Eight α-L-fucosidase genes were identified in the scallop genome, and we detected seven of these in our mass spectroscopy analysis of hepatopancreatic proteins (Table 3). These results suggest that most α-L-fucosidase genes in the genome are actually expressed and the corresponding enzymes are present in the hepatopancreas. The inability to identify the remaining α-L-fucosidase by mass spectroscopy could be due to a low expression level in the hepatopancreas. We carried out phylogenetic analyses of α-L-fucosidases from the Zhikong scallop (C. farreri) and other mollusk species (including the Pacific oyster Crassostrea gigas, the sea hare Aplysia californica and the owl limpet Lottia gigantea). The resulting MrBayes phylogenetic tree reveals two major molluscan clades of α-L-fucosidases (Fig. 3A). Both clades contain sequences from all of the selected mollusks. Differential duplication of α-L-fucosidases was observed for bivalves and gastropods, with the former predominantly occurring in clade I and the latter in clade II. For example, six out of eight scallop α-L-fucosidases were clustered into clade I, including Cf Fuca 3–Cf Fuca 8. Similarly, four of eleven oyster α-L-fucosidases were clustered in clade I, while only two were clustered in clade II. Similar preferential duplications were also detected for the gastropods (sea hare and owl limpet) in clade II (Fig. 3A). The high diversity and preferential gene duplication of α-L-fucosidases in mollusks may be related to their adaptations to different food sources.

Figure 3

Scallop hepatopancreas α-L-fucosidase analyses.

(A) Phylogenetic analysis of α-L-fucosidases. The C. farreri (Cf) sequences are highlighted by black triangles in both the genome data and proteomic analyses. Hollow triangles represent sequences present only in the genome data. Fuca: α-L-fucosidase; Cg: Crassostrea gigas; Lg: Lottia gigantea; Ac: Aplysia californica; At: Arabidopsis thaliana. The accession numbers of the protein sequences are provided in Supplementary Table S2. (B) Amino acid sequence alignment of the α-L-fucosidases from Chlamys farreri (Fuca1–7) and Thermotoga maritima (FucaT, PDB: 1HL8). Identical residues are shaded in gray. The two catalytic residues identified in FucaT22 are marked with closed circles and the three residues that form the substrate binding pocket are marked with triangles. The gap described in the text is located in the boxed region. (C) Structure of FucaT. The FucaT structure from Thermotoga maritima (PDB: 1HL8) is illustrated with yellow ribbons and the key residues are shown as sticks. The loop adjacent to the substrate binding pocket is highlighted by a dashed line. The image was generated by PyMOL32.

Additional sequence and structural analyses of the seven scallop hepatopancreatic α-L-fucosidases identified in the proteomic screening suggest these enzymes likely have distinct substrate binding specificities, based on their sequence conservation and substrate binding cleft analysis, as shown in Fig. 3B,C. The α-L-fucosidases are able to catalyze the removal of nonreducing terminal L-fucose residues linked by α-1,2, α-1,3, α-1,4 or α-1,6 bonds to oligosaccharides and their conjugates. They play crucial roles in the metabolic processing of fucosylated glycoconjugates, which are widely distributed from bacteria to humans22. A previous structural analysis identified the catalytic residues Asp224 and Glu266 in a Thermotoga maritima α-L-fucosidase22. Sequence alignment of the seven α-L-fucosidases identified in the scallop hepatopancreatic extract (Table 3) illustrates that these α-L-fucosidases share high sequence identity (more than 70%); the residue Asp224 is highly conserved, whereas the Glu266 residue is not as conserved (Fig. 3B). However, an Asp residue adjacent to that position may substitute for the key Glu266 residue found in the Thermotoga maritima α-L-fucosidase, which is assumed to play a role in the completion of the reaction (Fig. 3B). In the published structure of the α-L-fucosidase from Thermotoga maritima (PDB: 1HL8), three residues (Glu66, Tyr171, Arg254) contribute to forming the substrate binding pocket (Fig. 3C). Tyr171 and Arg254 are highly conserved in all the α-L-fucosidases from the scallop hepatopancreas, whereas Glu66 and its neighboring residues are not conserved (highlighted in Fig. 3B,C). In addition, compared to the previously determined structure (PDB: 1HL8), the region from Pro46 to Asp56 is absent in the seven α-L-fucosidases identified in the scallop hepatopancreas. As shown in the determined α-L-fucosidase structure (Fig. 3C), this loop is adjacent to the substrate binding pocket, suggesting this region may be associated with substrate binding. Taken together, these observed differences in the substrate binding cleft suggest that the α-L-fucosidases from the scallop hepatopancreas may have a distinct substrate binding specificity (Fig. 3B,C).

The presence of sulfated polysaccharides is a significant feature of seaweed as compared to terrestrial plants. Sulfatases catalyze the hydrolysis of sulfuric acid esters from a wide variety of substrates, including glycosaminoglycans, glycolipids and steroids23. Therefore, sulfatases play important roles in polysaccharide metabolism.

Seven arylsulfatase B genes were identified in the scallop genome, four of which were also detected by protein mass spectroscopy of the scallop hepatopancreas; the undetected arylsulfatases may reflect a low expression level, as mentioned above. Phylogenetic analysis of arylsulfatase B genes was carried out for the Zhikong scallop (C. farreri) and other mollusk species. As shown in Fig. 4A, all molluscan arylsulfatase B proteins were clustered into two large clades. Gene duplications were observed for bivalves and gastropods in both clades. For example, duplication of Cf ARSB1, Cf ARSB2 and Cf ARSB5 occurred in clade I and duplication of Cf ARSB3, Cf ARSB4, Cf ARSB6 and Cf ARSB7 occurred in clade II. The sea hare A. californica had the most gene copies among the examined mollusks, with 10 copies in clade I and 6 copies in clade II. We hypothesize that the high diversity of arylsulfatase B genes observed here may be a mechanism for mollusks to adapt to diverse algal resources.

Figure 4

Analysis of scallop hepatopancreas arylsulfatases.

(A) Phylogenetic analysis of arylsulfatases. The C. farreri (Cf) sequences are highlighted by black triangles in both the genome data and proteomic analyses. Hollow triangles represent sequences present only in the genome data. ARSB: arylsulfatase; Cg: Crassostrea gigas; Lg: Lottia gigantea; Ac: Aplysia californica; Ng: Nannochloropsis gaditana CCMP526. The accession numbers of the protein sequences are provided in Supplementary Table S2. (B) Amino acid sequence alignment of arylsulfatases from C. farreri (ARSB1–4) and human (ARSBH, PDB: 1FSU). Identical residues are shaded in gray. The ten primary active-site residues described in ARSBH are marked with triangles. (C) Structural analysis of ARSBH. The structure of human ARSBH (PDB: 1FSU) is illustrated with orange ribbons, with the C-terminal region in blue. The primary active-site residues are shown as sticks, and the catalytic center is highlighted by a dashed circle. The picture was generated by PyMOL32.

Sequence alignments and structure-based analyses were carried out to further explore the potential substrate binding specificities of the arylsulfatases. As shown in Fig. 4B, ten primary active-site residues (Asp53; Asp54; Cys91 (or Ser); Pro93; Arg95; Lys145; His147; His242; Asp300; and Lys318) (PDB: 1FSU) are highly conserved in the sulfatase family24. Structural analyses of arylsulfatases indicated that all of the members have positively charged active-site pockets suitable for recognizing a sulfated substrate (Fig. 4C), although the overall sizes, shapes, and electrostatics of the pockets vary extensively. In addition, the C-terminal regions of arylsulfatases play an important role in defining the active-site pocket23,24. Sequence alignment of arylsulfatases indicates that the C-terminal regions of the four arylsulfatases from C. farreri are not conserved and diverge significantly (Fig. 4B), which suggests they may have unique substrate specificities.

Taken together, our phylogenetic and structural analyses suggest that gene duplications of α-L-fucosidase and arylsulfatase occurred in the scallop genome along with sequence and structural variations that may confer different substrate specificities on the enzymes, consistent with the diversity of phytoplankton consumed by the scallop.

Conclusion

Proteomic approaches were applied to analyze the protein composition of the scallop hepatopancreas. A variety of enzymes associated with polysaccharide metabolism were identified, suggesting a complex enzyme system is required for the scallop to deal with diverse seaweed food sources. In support of this hypothesis, phylogenetic and structural analyses revealed gene duplications and potentially diversified substrate binding specificities. Overall, our study characterizes several novel polysaccharide-degrading enzymes in the scallop hepatopancreas, providing an enhanced view of these enzymes and improving our understanding of marine polysaccharide digestion.

Methods

Extraction of soluble fractions from scallop hepatopancreas

Scallops were purchased from a local seafood market in Qingdao. The hepatopancreas was dissected by hand from the scallop viscera (Fig. 1), immediately transferred to cold 1 × PBS buffer, and homogenized manually at 4 °C. The homogenate was centrifuged for 30 min at 8,000 × g, and the supernatant was subjected to ammonium sulfate fractionation. The precipitates formed at 80% ammonium sulfate saturation were collected by centrifugation at 8,000 × g for 30 min. The precipitates were then dissolved in 1 × PBS buffer and dialyzed against 1 × PBS buffer at 4 °C overnight, and the resulting solution was lyophilized to protein powder for subsequent analysis.

Mass spectroscopy

Sample preparation and SDS-PAGE

The extracted and lyophilized protein powder was dissolved in 0.4 ml SDT lysis buffer (4% SDS, 150 mM Tris-HCl, 100 mM DTT, pH 7.6)25, heated at 100 °C for 3 min, and then sonicated at 50 W for 5 min (2 s sonicate, 8 s rest) on ice. The sample was reheated to 100 °C for 3 min and then centrifuged at 14,000 × g for 40 min. The supernatant solution was collected and filtered, the protein concentration was determined using the Bradford method, and an aliquot of the treated sample was analyzed by SDS-PAGE (Fig. 2).

Filter-aided proteome digestion

Protein digestion was conducted using the FASP procedure25. For the treated sample, 200 μg of protein was loaded onto an ultrafiltration filter (30 kDa cutoff, Sartorius, Germany) containing 200 μl of UA buffer (8 M urea, 150 mM Tris-HCl, pH 8.0) followed by centrifugation at 14,000 × g for 30 min and an additional washing step with 200 μl of UA buffer. One hundred microliters of 50 mM iodoacetamide in UA buffer was subsequently added to the filter to block the reduced cysteine residues, the sample was incubated for 30 min at room temperature in the dark, and then the sample was centrifuged at 14,000 × g for 30 min. The filters were washed twice with 100 μl of UA buffer and centrifuged at 14,000 × g for 20 min after each washing step. Next, 100 μl of 25 mM ammonium bicarbonate was added to the filter, followed by centrifugation at 14,000 × g for 20 min, which was repeated twice. The protein suspensions were then digested with 40 μl of trypsin (Promega, Madison, WI, USA) buffer (4 μg trypsin in 100 μl ammonium bicarbonate) at 37 °C for 16–18 h. Finally, the filter unit was transferred to a new tube and centrifuged at 14,000 × g for 30 min. The resulting peptides were collected as a filtrate, and the peptide concentration was analyzed at OD280.

Capillary high-performance liquid chromatography

The samples were analyzed using an Easy-nLC nanoflow HPLC system connected to an Orbitrap Elite mass spectrometer (Thermo Fisher Scientific, San Jose, CA, USA). A total of 1 μg of each sample was loaded onto a Thermo Scientific EASY column (two columns) using an autosampler at a flow rate of 150 nl/min. The sequential separation of peptides on the Thermo Scientific EASY trap column (10 μm × 2 cm, 5 μm, 100 Å, C18) and analytical column (75 μm × 25 cm, 5 μm, 100 Å, C18) was accomplished using a segmented 1 hr gradient from Solvent A (0.1% formic acid in water) to 50% Solvent B (0.1% formic acid in 100% ACN) for 50 min, followed by 50–100% Solvent B for 4 min and then 100% Solvent B for 6 min. The column was re-equilibrated to its initial highly aqueous solvent composition before each analysis.
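The segmented gradient described above can be written down as a small lookup table. The following is a minimal sketch, not the instrument method file: it assumes the gradient starts at 0% Solvent B (the text only says "from Solvent A") and that each segment is linear. The names `GRADIENT` and `percent_b` are illustrative.

```python
# Gradient breakpoints as (time in min, % Solvent B).
# Assumption: the run starts at 0% B and every segment is linear.
GRADIENT = [(0.0, 0.0), (50.0, 50.0), (54.0, 100.0), (60.0, 100.0)]


def percent_b(t_min, gradient=GRADIENT):
    """Return the % Solvent B at time t_min by linear interpolation."""
    if t_min < gradient[0][0] or t_min > gradient[-1][0]:
        raise ValueError("time outside the gradient program")
    # Walk through consecutive breakpoint pairs and interpolate in the
    # first segment that contains t_min.
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("gradient table is not contiguous")


if __name__ == "__main__":
    for t in (0, 25, 50, 52, 57, 60):
        print(f"{t:>2} min: {percent_b(t):5.1f}% B")
```

The three segments (50 min, 4 min and 6 min) add up to the stated 1 hr run.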

Electrospray ionization mass spectrometry

The mass spectrometer was operated in positive ion mode, and MS spectra were acquired over a range of 300–1800 m/z. The resolving powers of the MS scan and MS/MS scan at 200 m/z for the Orbitrap Elite were set as 70,000 and 17,500, respectively. The top ten most intense signals in the acquired MS spectra were selected for further MS/MS analysis. The isolation window was 2 m/z, and ions were fragmented through higher energy collisional dissociation with normalized collision energies of 27 eV. The maximum ion injection times were set at 10 ms for the survey scan and 60 ms for the MS/MS scans, and the automatic gain control target values were set to 1.0 × 10^6 for full scan modes and 5 × 10^4 for MS/MS. The dynamic exclusion duration was 40 s.
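The data-dependent acquisition logic described above (top ten precursors per survey scan, 40 s dynamic exclusion) can be illustrated roughly as follows. This is a simplified sketch rather than the instrument's actual control logic: the function name `select_precursors` and the 0.01 m/z binning used to decide whether two peaks are "the same" precursor are assumptions made for illustration.

```python
def select_precursors(peaks, now_s, excluded_until, top_n=10, exclusion_s=40.0):
    """Pick up to top_n of the most intense peaks not under dynamic exclusion.

    peaks          -- list of (mz, intensity) tuples from one survey scan
    now_s          -- acquisition time of the survey scan, in seconds
    excluded_until -- dict mapping a binned m/z to the time (s) at which
                      its exclusion ends; updated in place
    """
    chosen = []
    # Consider the most intense peaks first.
    for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        key = round(mz, 2)  # assumption: 0.01 m/z bins identify a precursor
        if excluded_until.get(key, float("-inf")) > now_s:
            continue  # recently fragmented, still excluded
        chosen.append(mz)
        excluded_until[key] = now_s + exclusion_s
        if len(chosen) == top_n:
            break
    return chosen


if __name__ == "__main__":
    exclusion = {}
    scan = [(400.0, 10.0), (500.0, 30.0), (600.0, 20.0)]
    print(select_precursors(scan, 0.0, exclusion, top_n=2))
    print(select_precursors(scan, 10.0, exclusion, top_n=2))
```

Dynamic exclusion lets lower-abundance peptides be selected in later scans instead of repeatedly fragmenting the same dominant ions.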

ESI MS data analysis

The raw files were analyzed using the Proteome Discoverer 1.4 software (Thermo Fisher Scientific). A search for the fragmentation spectra was performed using the MASCOT search engine embedded in Proteome Discoverer against the whole-genome database (unpublished data). The following search parameters were used: monoisotopic mass; trypsin as the cleavage enzyme; up to two missed cleavages; carbamidomethylation of cysteine as a fixed modification; and peptide charges of 2+, 3+ and 4+. Oxidation of methionine, Phospho, Gln->pyro-Glu (N-terminal Gln) and Acetyl (protein N terminus) were specified as variable modifications. The mass tolerance was set to 10 ppm for precursor ions and to 0.05 Da for the fragment ions. The results were filtered based on a false discovery rate (FDR) of no more than 1%.
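Two of the search parameters above lend themselves to a short illustration: tryptic cleavage with up to two missed cleavages, and the 10 ppm precursor tolerance. The sketch below is not how MASCOT is implemented. It assumes the common rule that trypsin cleaves C-terminal to Lys or Arg except when the next residue is Pro, and the function names and the example sequence are hypothetical.

```python
def cleavage_fragments(protein):
    """Split a protein after K or R, except when the next residue is P."""
    fragments, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


def tryptic_peptides(protein, missed_cleavages=2):
    """List peptides formed by joining up to missed_cleavages + 1 fragments."""
    fragments = cleavage_fragments(protein)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + missed_cleavages, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the observed precursor lies within tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    print(tryptic_peptides("MKTAYRPLKGG"))  # hypothetical sequence
    print(within_tolerance(500.004, 500.0))
```

At m/z 500, a 10 ppm window corresponds to ±0.005 m/z, which is why the precursor tolerance is much tighter than the 0.05 Da fragment tolerance.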

Bioinformatics analysis

Functional annotation and enrichment

The related protein sequences and their functional annotations were retrieved from our ongoing whole-genome sequencing project for C. farreri (unpublished). Gene ontology (GO) enrichment analyses for proteins from the scallop hepatopancreas were carried out based on the algorithm implemented in GOstat26, using the whole annotated C. farreri gene set as the background. GO terms that were enriched within a given gene set were extracted with EnrichPipeline27. Similar techniques were used for the Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis.
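The idea behind the enrichment test can be sketched with a one-sided hypergeometric (Fisher-type) tail probability. This is an illustration of the general technique only: it does not reproduce GOstat or EnrichPipeline, omits multiple-testing correction, and the function name and the counts in the usage example are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N -- number of annotated genes in the background set
    K -- background genes carrying the GO term
    n -- proteins in the study set (e.g. detected in the hepatopancreas)
    k -- study-set proteins carrying the GO term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability of observing i annotated proteins for i = k..upper.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = enrichment_p_value(k=12, n=50, K=100, N=2000)
    print(f"P(X >= 12) = {p:.3e}")
```

A small p-value indicates that the term occurs in the study set more often than expected from its frequency in the background gene set.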

Multiple sequence alignment and phylogenetic analysis

The related protein sequences of C. farreri were obtained from the scallop genome as described above. Protein sequences from other mollusks and outgroup species were retrieved from the NCBI or Ensembl databases (accession numbers provided in Supplementary Table S2). Amino acid sequences were aligned using Clustal W version 2.028 and alignments were adjusted manually. Only unambiguously aligned positions (selected by the program GBlocks29,30) were considered in phylogenetic analyses. Phylogenetic trees were constructed using MrBayes v3.231. The MCMC chain was allowed to run for 10^6 generations, saving one tree every 100 generations. The first 25% of the resulting samples were discarded as burn-in, and the remaining trees were used to construct a majority rule consensus tree, with a posterior probability of 0.95 or greater considered significant.
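The sampling arithmetic and the meaning of a clade posterior probability can be sketched as follows. This is an illustration and not MrBayes itself: the sample at generation 0 is ignored for simplicity, each tree is represented as a set of clades (frozensets of taxon labels), and the function names and toy trees are hypothetical.

```python
from collections import Counter


def retained_samples(generations, sample_every, burnin_fraction):
    """Number of sampled trees left after discarding the burn-in."""
    n_samples = generations // sample_every
    n_burnin = int(n_samples * burnin_fraction)
    return n_samples - n_burnin


def clade_posteriors(trees):
    """Fraction of post-burn-in trees that contain each clade."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: count / len(trees) for clade, count in counts.items()}


def majority_rule(trees, threshold=0.5):
    """Clades found in more than `threshold` of the trees."""
    return {c for c, p in clade_posteriors(trees).items() if p > threshold}


if __name__ == "__main__":
    print(retained_samples(10**6, 100, 0.25))
    ab, cd = frozenset("AB"), frozenset("CD")
    toy_trees = [{ab, cd}, {ab, cd}, {ab}, {ab}]  # hypothetical samples
    print(clade_posteriors(toy_trees))
```

Under these assumptions the consensus tree is built from 7,500 retained samples, and a clade is reported as significant when it appears in at least 95% of them.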

Additional Information

How to cite this article: Qianqian, L. et al. Proteomic analysis of scallop hepatopancreatic extract provides insights into marine polysaccharide digestion. Sci. Rep. 6, 34866; doi: 10.1038/srep34866 (2016).
