Introduction

Pearl oysters, Pinctada fucata, are one of the most important economical pearl production species in China and Japan and are also one of the best studied biomineralization models1. Their shells are composed of calcite as the outer prismatic layer and aragonite as the inner nacreous layer. The biomineralized products possess superior mechanical2 and biological properties3 compared to common calcium carbonate. The shells typically consist of 95% CaCO3 and approximately 5% organic macromolecules including proteins, polysaccharides and lipids1. Specifically, shell matrix proteins (SMPs) play important roles in crystal nucleation, polymorphism, morphology and organization of calcium carbonate crystallites during shell formation4.

Since the cloning of the first SMP, Nacrein from P. fucata in 19965, MSI606, N167, Prismalin-148, Shematrin9, lysine(K)-rich matrix protein (KRMP)9, Aspein10, Tyrosinase11, N4012, Pif17713, Prisilkin-3914, PfN2315 and PfN4416 have been cloned and characterized. SMPs have been found to possess several functionalities: 1) they facilitate calcite (Aspein17) or aragonite crystallization (N4012), 2) they act as framework proteins (Shematrin18 and Prisilkin-3914) and 3) they guide calcium carbonate assembly (N1619).

A comprehensive characterization of SMPs will offer an opportunity to better understand biomineralization processes and refine the current “chitin-silk fibroin gel proteins-acidic macromolecules” model20. To achieve this characterization, proteomics has proven useful for identifying SMPs in a high-throughput manner and has previously been applied to Lottia gigantea21, Pinctada margaritifera22, Pinctada maxima22, Mytilus coruscus23, Acropora millepora24, Stylophora pistillata25 and Cepaea nemoralis26. In the present study, we identified 72 SMPs from the shells of P. fucata. Ethylenediaminetetraacetic acid (EDTA)-extracted proteins were subjected to SDS-PAGE followed by liquid chromatography–tandem mass spectrometry (LC-MS/MS) analysis. Raw data from the LC-MS/MS were directly interrogated against the proteome derived from the draft genome of P. fucata27. Proteins with Mascot scores above 5.0 and at least two matched peptide fragments were considered valid and were analyzed by BLAST, SMART and InterProScan. In addition to proteins that control CaCO3 crystallization, the proteomic analysis suggests that the diverse SMPs of P. fucata include extracellular matrix (ECM)-related proteins. Moreover, diverse domains were found, including carbonic anhydrase, Glyco_hydro_18, Cu2_monooxygen, chitin-binding, complement control protein, von Willebrand factor type A, epidermal growth factor-like, tissue inhibitor of metalloproteinase and Laminin_G_2/3. Immunohistological experiments showed localization of SMPs in the mantle cells, shells and synthetic calcite. Real-time PCR validated the expression of some representative genes in vivo. Together, our results expand the repertoire of shell matrix proteins in P. fucata and may guide the further study of SMPs.

Methods

All methods were carried out in accordance with the approved guidelines. All experimental protocols were approved by the Animal Experimental Ethics Committee of Tsinghua University, Beijing, China.

Sample preparation

The adult pearl oyster, Pinctada fucata (with shells 5.5–6.5 cm in length and 30–40 g of wet weight and approximately 2 years of age) was obtained from the Zhanjiang Pearl Farm (Guangxi Province, China). In the laboratory, the oysters were maintained at approximately 20 °C in an aquarium that contained aerated artificial seawater at 3% salinity.

Shell preparation and proteins extraction

Cleaned shells of P. fucata were immersed in 5% sodium hydroxide for 24 h and were subsequently rinsed in distilled water to remove soft tissues adhering to the inner surface of the nacre, which could otherwise contaminate the extracts. The two layers of the shell, the outer prismatic layer and the inner nacreous layer, were separated mechanically by abrasion before air-drying. Their fragments (30 g) were pulverized and then decalcified with 0.8 M ethylenediaminetetraacetic acid (EDTA, pH 8.0) for 60 h at 4 °C with continuous agitation. For extraction of the soluble matrix, the supernatant was collected by centrifugation at 13,000 rpm for 30 min at 4 °C and was then desalted by ultrafiltration (3 K). For extraction of the insoluble matrix, the above precipitate was thoroughly rinsed with water and treated with denaturing solution (30 mM Tris-HCl, pH 8.0, 1% sodium dodecyl sulfate (SDS), 10 mM dithiothreitol) at 100 °C for 30 min. After a short centrifugation, the denatured samples were ready to be loaded onto 12% SDS-polyacrylamide gels. Proteins were stained with Coomassie Brilliant Blue and were quantified with a BCA assay kit (Pierce).

Characterization

The morphologies of the cleaned shells were examined by scanning electron microscope (SEM) (FEI Quanta 200, 15 kV) after being sputter-coated with a thin layer of gold nanoparticles.

Immunolocalization of SMPs

Primary antibody production

Rabbit polyclonal antibodies were produced by injecting mixed shell matrix proteins in New Zealand rabbits16.

Western blotting

Proteins were electrophoretically transferred to PVDF membranes (Millipore) using a Mini Trans-Blot® (Bio-Rad). Then, the PVDF membranes were blocked with 5% skim milk and were incubated with primary antibody (1:4000) for 2 h. After washing with Tris-Buffered Saline with 0.05% Tween 20 (TBST) and incubating with HRP-conjugated goat anti-rabbit secondary antibody (1:10000, Huaxingbio Science, China), detection was performed using 3,3’-diaminobenzidine (DAB) solution (TIANGEN, China). A control experiment was performed without the first antibody step.

Immunolocalization on the shells

Immunogold-labeling assays were conducted as described by the literature15,28 with some modifications. The antibodies were used at dilutions of 1:200. Goat anti-rabbit antibodies coupled to 15 nm gold particles (1:100, Huaxingbio Science, China) were used as the secondary antibodies. A control experiment was performed without the first antibody step. For high-resolution imaging, samples were sputter-coated with carbon and were analyzed using a Hitachi-SU8010 SEM at backscatter mode.

Immunolocalization on the mantle cells

Deparaffinized 10 μm sections of the mantle tissues, previously fixed for 24 h in Davidson fixative, were permeabilized for 10 min in TBST. Tissues were then incubated for 1 h in saturating medium (1% BSA, TBS) at room temperature. Then, samples were incubated with the anti-matrix antibody (1:100) for 1 h in TBST–BSA 1% at room temperature (RT). After rinsing in saturating medium, samples were incubated for 2 h at RT with HRP-conjugated goat anti-rabbit secondary antibody (1:10000, Huaxingbio Science, China). Finally, samples were observed with a DM-4000B Leica microscope. A control experiment was performed without the first antibody step22.

Proteomic analysis

Protein bands selected from ESMs and EISMs of the prismatic and nacreous layers were excised and completely destained by washing with 50 μL of 50 mM NH4HCO3/CH3CN (50/50) mixture for 30 min at 37 °C. Then, reduction was conducted with 50 μL of 10 mM DTT in 50 mM NH4HCO3 for 1 h at 57 °C and alkylation was performed with 50 μL of 100 mM iodoacetamide (IAA) for 30 min at RT in the dark. The cut gels were dried in CH3CN and were treated with 0.4 μg trypsin (Proteomics grade, Sigma) in 10 μL of 50 mM NH4HCO3 for 12 h at 37 °C. The solution was treated with 50 μL of 1% formic acid at 30 °C for 30 min under agitation. The digests were then lyophilized and suspended in 30 μL of 0.1% trifluoroacetic acid (TFA) and 4% acetonitrile for LC–MS/MS analysis. Five μL of sample was injected into the LTQ Orbitrap Velos mass spectrometer with Dionex U-3000 Rapid Separation nano LC system (Thermo Scientific) for analysis. MS data were acquired automatically using Analyst QS 1.1 software (Applied Biosystems) following a MS survey scan over m/z 350–1500 at a resolution of 60,000 for full scan and 2,000 for MS/MS measurements.

The LC-MS/MS data were searched against the P. fucata predicted protein database (http://marinegenomics.oist.jp/genomes) using the Mascot 2.1 search engine with carbamidomethylated cysteine as a fixed modification and oxidized methionine and tryptophan as variable modifications. The peptide MS and MS/MS tolerances were set to 0.5 Da. Finally, sequences with Mascot scores of at least 5.0 and with at least two matched peptide fragments were considered valid.
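The acceptance rule stated above can be written as a simple filter. The following is a minimal sketch, assuming search results have already been exported into records holding a score and a matched-peptide count; the record type, field names and example identifiers are hypothetical and are not part of the Mascot software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinHit:
    """One protein-level search result (hypothetical record layout)."""
    protein_id: str
    mascot_score: float
    matched_peptides: int


def filter_valid_hits(hits, min_score=5.0, min_peptides=2):
    """Keep hits with score >= min_score and at least min_peptides peptides."""
    return [
        h for h in hits
        if h.mascot_score >= min_score and h.matched_peptides >= min_peptides
    ]


# Made-up example values, for illustration only.
example_hits = [
    ProteinHit("hypothetical_A", 42.3, 5),  # passes both criteria
    ProteinHit("hypothetical_B", 4.9, 3),   # score too low
    ProteinHit("hypothetical_C", 12.0, 1),  # only one matched peptide
    ProteinHit("hypothetical_D", 5.0, 2),   # exactly at both thresholds
]
valid_ids = [h.protein_id for h in filter_valid_hits(example_hits)]
print(valid_ids)
```

Both thresholds are treated as inclusive here, following the "at least" wording of the criteria.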

Nucleotide and amino acid sequences analysis

Identification of the proteins from above was attempted using Blastp and tBlastn searches against the NCBI database (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The protein sequences were deposited in NCBI (Table S1). The theoretical mass, isoelectric point and amino acid composition of the proteins were computed using ProtParam from the ExPASy online server. Conserved domains were predicted using SMART (http://smart.embl-heidelberg.de/) and InterProScan (http://www.ebi.ac.uk/interpro/search/sequence-search). IUPRED (http://iupred.enzim.hu/) was used to recognize disordered regions from the amino acid sequences of SMPs based on the estimated pairwise energy content. XSTREAM (http://jimcooperlab.mcdb.ucsb.edu/xstream/) was used to isolate proteins with tandem-arranged repeat units using the default settings.

Quantification of nacre and prism transcripts by real-time PCR

Total RNA from the mantle tissue and muscle was extracted using TRIzol reagent (Life Technologies) according to the manufacturer's instructions. RNA integrity was checked by agarose gel analysis and RNA concentrations were determined by NanoDrop 2000 (Thermo Scientific). RNA concentration was approximately 1000 ng·μL−1 and A260/A280 was above 1.90. Then, cDNA was prepared by reverse transcription-PCR of the total RNA with the GoScript™ Reverse Transcription System (Promega) following the manufacturer’s instructions. Real-time PCR was conducted to quantify gene expression levels, with β-actin as an internal reference due to its relatively stable expression in mantle cells15,16. A typical reaction mixture was: SYBR® Premix Ex Taq™ II (Takara) 12.5 μL, forward primer 0.5 μL (10 μM), reverse primer 0.5 μL (10 μM), cDNA template 0.5 μL and H2O 6 μL. PCR parameters were: 95 °C for 30 s (1 cycle); 95 °C for 5 s, 60 °C for 30 s (40 cycles), 72 °C for 5 s; 72 °C for 30 s (see Table S2 for primers). Dissociation curves were generated to determine product purity and amplification specificity. Relative gene expression levels were calculated using two reference genes by the delta-delta Ct method, as follows: fold = 2^(−[ΔCt sample − ΔCt calibrator]) = 2^(−ΔΔCt). Here, the “ΔCt calibrator” represents the mean ΔCt values of β-actin (AF378128.1) in the corresponding tissues.
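The fold-change calculation can be illustrated with a short worked example. This is a minimal sketch of the standard delta-delta Ct arithmetic, assuming roughly 100% amplification efficiency (one doubling per cycle); the function names and all Ct values are hypothetical and are not measurements from this study.

```python
import math


def delta_ct(ct_target, ct_reference):
    """Ct of the target gene normalized to the reference gene."""
    return ct_target - ct_reference


def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_calibrator, ct_ref_calibrator):
    """Relative expression as 2^(-ddCt), sample versus calibrator."""
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_calibrator, ct_ref_calibrator))
    return 2.0 ** (-ddct)


def log10_ratio(fold_a, fold_b):
    """log10 of the ratio of two fold changes, e.g. log10(ME/MP)."""
    return math.log10(fold_a / fold_b)


# Hypothetical Ct values: target gene in a sample tissue versus a
# calibrator tissue, each normalized to the same reference gene.
example_fold = fold_change(20.0, 15.0, 25.0, 15.0)
# dCt(sample) = 5, dCt(calibrator) = 10, ddCt = -5, fold = 2^5
print(example_fold)  # 32.0
```

A lower Ct means earlier amplification and therefore more template, which is why a negative ΔΔCt corresponds to a fold change greater than 1.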

Results and Discussion

Protein composition in the prismatic and nacre matrix

The shell of P. fucata is composed of two layers, the prismatic layer and the nacreous layer (Fig. 1a). The prismatic layer is composed of prisms with length of 10–40 μm embedded in the organic sheath (Fig. 1b) and the nacreous layer is formed by stacked hexagonal nanotablets with side lengths of 0.5–3 μm (Fig. 1c). To extract SMPs, shells were first cleaned with NaOH to avoid possible contamination from outside organic matter. Then, separated shells, prism and nacre were dissolved with EDTA, leaving soluble and insoluble extracts. EDTA can chelate Ca2+, dissolve the shell and release the organic matrices. In this study, we observed that after 60 h, the shell can be fully dissolved. The yields of organic matrices from the shell were approximately 1.5–3.5 mg/g (determined by the concentration of proteins obtained from certain amounts of shell powders).

Figure 1

(a) Optical image showing the prismatic and nacreous layers of a typical shell (the red box marks the area examined by SEM). (b,c) SEM images of the shell surfaces of Pinctada fucata: the prismatic layer (b) and the nacreous layer (c). (d) SDS-PAGE of the four groups of extracted proteins (ESM and EISM are EDTA-soluble and EDTA-insoluble extracts, respectively; P and N denote the prismatic and nacreous layers, respectively).

The soluble and insoluble extracts were subjected to SDS-PAGE (Fig. 1d). Protein bands in the gel were cut and digested with trypsin. Using LC-MS/MS, peptide fragments were searched against the proteome translated from the draft genome of P. fucata27. Consequently, through bioinformatics analysis such as BLAST, InterproScan and SMART, we identified 72 different SMPs out of 144 whole proteomes, in which 36 and 19 are solely found in the prismatic and the nacreous layers, respectively, while 17 are found in both layers. It should be noted that these identified proteins are related to the number of lysine and arginine residues available for trypsin cleavage in the protein sequences29. For example, Aspein and Prisilkin-39 lack trypsin cleavage sites, making them unsuitable for standard proteomic detection29. Jeana L. Drake et al. found thirty-six skeletal organic matrix proteins in the coral, Stylophora pistillata. Thirty-one were observed with tryptic digestion, while the remaining five were observed only after proteinase K digestion25. Therefore, the use of other digestion reagents only slightly increases the number of detected SMPs and proteins found using trypsin as a digestion reagent likely represent most proteins in the shell. A typical process for LC-MS/MS analysis and protein identification is shown in Figure S1.
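The dependence of detection on lysine and arginine content can be illustrated with an in silico digest. The sketch below applies the commonly used trypsin rule (cleave C-terminal to K or R, but not when the next residue is P); both test sequences are invented for illustration and are not the sequences of Aspein, Prisilkin-39 or any other real protein.

```python
import re


def tryptic_peptides(sequence, min_length=1):
    """Split a protein sequence at trypsin sites.

    Cleavage is placed after K or R unless the following residue is P.
    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_length]


def count_cleavage_sites(sequence):
    """Number of internal trypsin cleavage sites in the sequence."""
    return max(len(tryptic_peptides(sequence)) - 1, 0)


# Invented sequences: one with K/R residues, one acidic and K/R-free.
with_sites = "MKTAYIAKQRPLD"
without_sites = "GSGSDDDDDD"

print(tryptic_peptides(with_sites))      # ['MK', 'TAYIAK', 'QRPLD']
print(tryptic_peptides(without_sites))   # ['GSGSDDDDDD']
```

A sequence without usable K/R sites yields a single undigested fragment, which is typically too long for routine LC-MS/MS identification; a `min_length` filter can be used to discard peptides that would be too short to identify.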

Intrinsic disorder (ID) refers to segments or whole proteins that lack a fixed 3D structure, even in the native state. ID domains are key molecular features that contribute to the formation and function of mollusk nacre; Evans found that all 39 mollusk aragonite-associated protein sequences examined contain at least one region of intrinsic disorder or unfolding30 and proposed that intrinsically disordered domains are important for matrix assembly30. Hence, we used IUPRED to check the 35 unique SMPs found by proteomics. The results showed that 22 out of 35 sequences were predicted to have at least one region of intrinsic disorder (Table S3). Using XSTREAM, 7 out of 35 sequences were predicted to have tandem repeats (Table S3 and Figure S2). It is noteworthy that repetitive low complexity domains (RLCDs) are an important, but not the only, source of intrinsic disorder in proteins.

According to the blastp results in the National Center for Biotechnology Information (NCBI) database, SMPs were divided into two groups: proteins with homology (e-value ≤ 10^−5) (Table 1) and proteins without homology (e-value > 10^−5) (Table 2). Compared to the shell proteomics of Pinctada margaritifera22, a species closely related to P. fucata, the number of proteins found in both layers was considerably higher. Mpn88, Nacrein, Nacrein-like, Shematrin-1, Shematrin-2, Shematrin-7, PTyr, PTyr1 and PNU1-9 were found in both layers, suggesting their potential roles in the formation of both layers in P. fucata. By contrast, the proteins found in both layers of P. margaritifera were Nacrein, nacre uncharacterized shell protein (NUSP18) and Shematrin 8, implying that the molecular toolkits responsible for formation of the prismatic and nacreous layers were extremely different. Sequence alignments from diverse mollusks and metazoans (Figure S3) showed that copper amine oxidase, peroxiredoxin and chitinase were highly conserved across the metazoa. EGF domain-containing proteins and FN3 domain-containing proteins were highly conserved in the Pinctada family.
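The grouping by e-value can be sketched as follows. This is an illustrative example only, assuming the best blastp e-value for each protein has already been collected into a dictionary (with None where no hit was returned); the function name, protein names and e-values are hypothetical.

```python
def split_by_homology(best_evalues, threshold=1e-5):
    """Partition proteins by their best BLAST e-value.

    best_evalues maps a protein name to its best e-value, or to None
    when the search returned no hit. Proteins with e-value <= threshold
    are treated as having homology; all others as lacking it.
    """
    with_homology, without_homology = [], []
    for name, evalue in best_evalues.items():
        if evalue is not None and evalue <= threshold:
            with_homology.append(name)
        else:
            without_homology.append(name)
    return sorted(with_homology), sorted(without_homology)


# Made-up e-values for illustration.
example = {
    "protein_1": 3e-40,  # strong hit
    "protein_2": 0.02,   # weak hit, above threshold
    "protein_3": None,   # no hit at all
    "protein_4": 1e-5,   # exactly at the threshold (inclusive)
}
homologous, orphan = split_by_homology(example)
print(homologous, orphan)
```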

Table 1 Shell matrix proteins from the shells of P. fucata with blast homology.
Table 2 Shell matrix proteins from the shells of P. fucata without blast homology.

Immunolocalization of proteins

To further validate the SMPs in vivo and in vitro, immunolocalization experiments were performed. Western blotting using polyclonal antibodies raised against the mixed shell matrix proteins of P. fucata showed that the EDTA-soluble matrices (ESMs) and EDTA-insoluble matrices (EISMs) all reacted with the antibodies (Figure S4). Immunohistochemical results clearly indicated that the SMPs are located in the mantle pallial and the mantle edge (Fig. 2a1), whereas no signal was seen in the control (Fig. 2a2). Immunogold observations of the prismatic layer revealed antibody labeling in both the prismatic tablets (Fig. 2b1) and the chitin layer (Fig. 2b3). In the nacreous layer, the antibodies gave a specific signal that was localized in the interlamellar matrix separating the nacre tablets (Fig. 2b4) and in the nacre tablets themselves (Fig. 2b2). In contrast, the control showed no gold nanoparticle signal (Figure S5). Immunolabeling of synthetic calcite was conducted to verify the influence of the extracted proteins on the growth of CaCO3. In the control group without the addition of extracted proteins, no fluorescence signal was observed under the same microscopy settings (Figure S6). By contrast, all four groups with the addition of extracted proteins exhibited fluorescence signals, indicating that the proteins could be occluded in/on the CaCO3. Specifically, SMPs from prismatic layers at approximately 1 μg·mL−1 had no noticeable effect on the morphology of CaCO3 and were evenly distributed (Fig. 2c1,c2). EDTA-soluble matrix from nacreous layers was concentrated in the center of crystals (Fig. 2c3). EDTA-insoluble matrix from nacreous layers changed the rhombohedral crystals into 5–10 μm rounded particles. In addition, the fluorescence intensity seemed to be concentrated at the edge of the particles (Fig. 2c4). These results show that SMPs originate from the mantle cells and are finally embedded in the shells. In addition, SMPs can affect the CaCO3 crystallization process.
Extensive studies have shown that SMPs, a single protein or mixed proteins can affect the nucleation, polymorphism and morphology of CaCO319,31. The immune assay using antibodies against the SMPs suggests that SMPs from different parts of shells execute their distinct roles in CaCO3 crystallization, resulting in being occluded in/on CaCO3 with different patterns. The in vitro crystallization experiments were conducted under a basic condition (pH 8.0), which is close to the pH of seawater (8.2); therefore, the experiment may provide clues to the mechanism behind in vivo mineralization.

Figure 2

Immunolocalization of the shell matrix proteins (SMPs) of P. fucata.

A polyclonal antibody raised against mixed extracted proteins is used to identify EDTA-soluble matrices (ESMs) and EDTA-insoluble matrices (EISMs). (a) Immunohistochemical localization in the mantle epithelia: control without the first antibody (a1); mantle sections with the first antibody (a2) (MF: middle fold; OF: outer fold; IF: inner fold); (b) Immunogold labeling of SMPs on the EDTA-mounted prismatic and nacreous layers. Layers with the first antibody (b1-b4). b1, b3 are prismatic layers and b2, b4 are nacreous layers. The red arrowheads indicate gold nanoparticles. (Scale bars, 200 nm) (c) Confocal fluorescence laser scanning microscopy images of SMPs in vitro show synthetic calcite after immunolabeling. Calcite in the presence of 1 μg·mL−1 ESM-P (c1), EISM-P (c2), ESM-N (c3), EISM-N (c4). (Microscopy settings are identical. The control is shown in Figure S6. Scale bars, 10 μm.)

Figure 3

Real-time PCR of selected SMPs with domains or repeats shows relative gene expression in the mantle edge and mantle pallial of P. fucata.

(a) Relative gene expression of twenty-one selected SMPs in the mantle edge compared with mantle pallial. Copper = Copper amine oxidase (b) Relative gene expression of six well-studied SMPs. (The longitudinal coordinates are the values of log10(ME/MP), ME: mantle edge, MP: mantle pallial)

Verification and quantification of matrix genes by real-time PCR

To verify and quantify the proteins found by our proteomic analysis, real-time PCR was performed. As is known, the mantle edge is responsible for the formation of the periostracum and the prismatic layer, whereas the mantle pallial enabled the formation of nacreous layer22. Therefore, we examined the relative gene expression of 21 selected genes, which correspond to the proteins found in the proteomic analysis, in the mantle edge and mantle pallial of P. fucata (Fig. 3a and Table S4). According to previous studies, six developmental stages have been described across the entire P. fucata life cycle, including descriptions of the fertilized egg, trochophore stage, D-shaped stage, umbonal stage, juvenile and adult32. The calcium carbonate crystal polymorphisms, the shell layer structure and the expression of SMPs change during these stages32. Almost all SMPs showed a dramatic increase at the adult stage. For example, the expression level of Pif and Prisilkin-39 in the adult stage is 2116.9 and 119.48 times that of the juvenile stage, respectively32. Hence, in the present study, all RNA was extracted from the mantle of adult oysters. SMPs encoded by the twenty-one genes are one valine (V)-rich protein (Alveoline-like protein), one glycine and serine (GS)-rich protein (NU7), one aspartic acid (D)-rich protein (PNU6), one tissue inhibitor of metalloproteinase (PTIMP), one Peroxiredoxin, one Copper amine oxidase, two Complement control protein (CCP) proteins (PU8 and PU10), one Laminin G protein (NU10), two von Willebrand factor type A (vWA) proteins (PNU4 and PNU5), three chitin-binding proteins (PU8, PNU1 and NU5), four chitinase (Clp1, Clp3, PNU3 and PU12) and four fibronectin type III (FN3) proteins (PU3, PU5, PU6 and PU15). 
Among all 21 tested genes, 18 genes (Alveoline-like, PU8, PTIMP, PNU3, Clp3, Copper amine oxidase, Clp1, Peroxiredoxin, PU15, PNU1, PU3, PU12, PU6, PU10, PNU6, NU5, PU5 and NU10) were highly expressed in the mantle edge, the mantle pallial, or both relative to the muscle (Table S4), indicating that they are likely to be involved in the biomineralization process. The PNU4, PNU5 and NU7 genes were highly expressed in neither the mantle edge nor the mantle pallial, so they may originate from other cells such as hemocytes33 and ultimately be involved in biomineralization34. Log10(ME/MP) is the expression in the mantle edge relative to the mantle pallial and indicates the roles of genes in the formation of the prismatic or nacreous layers. The results showed that most proteins found in the prismatic layer were highly expressed in the mantle edge (Fig. 3a). Surprisingly, PU10 and PU5 were found in the prismatic layer, but the corresponding genes were more highly expressed in the mantle pallial (Fig. 3a), suggesting additional roles in nacreous layer formation. Similarly, previous studies showed that shematrin 5, a prism-related SMP, had much greater expression in the mantle pallial than in the mantle edge35. Additionally, six well-studied genes were also examined by real-time PCR (Fig. 3b). Nacrein and Pif177 genes are related to the formation of the nacreous layer and showed high expression in the mantle pallial (~105 and ~10-fold referenced to the muscle, respectively). Tyrosinase-1 (Tyr-1), KRMP, Prismalin-14 and Prisilkin-39 are genes related to the formation of the prismatic layer and exhibited high expression in the mantle edge (~104, ~105, ~104 and ~103-fold referenced to the muscle, respectively). Therefore, these data confirm that these genes are expressed in the calcifying tissues and that the corresponding proteins are embedded in the shells of P. fucata.

Putative functions of the SMPs based on domains

To gain insight into the functions of SMPs, analysis of their sequences and domains is required. SMPs involved in biomineralization have several distinct characteristics25: 1) they are enriched in specific amino acids such as aspartic acid, glutamic acid, glycine and serine, 2) they have flexible secondary structures and repeated low complexity domains (RLCDs) and 3) they are modular, carrying multiple domains. Among well-known SMPs, biomineralization-related proteins contain carbonic anhydrase, chitin-binding, aragonite-binding, vWA and D-rich domains. According to the BLAST results, some well-studied SMPs were found in the shells, verifying the effectiveness of our method. In general, the Shematrin18 and Tyrosinase families11 were among the most abundant proteins in the prismatic layer and others included Nacrein5, Chitinase-like protein 1 (Clp1)22, Clp322, KRMP9, Alveoline-like protein22, Amylase (GenBank, AGN55420.1), Prismalin-148, Glycine-rich protein 2-like (PGRP2)36, Tissue inhibitor of metalloproteinases (TIMPs)37, Liprin-α protein, PPP-1038, Mantle protein 10 (PFMG1)39, Mpn8829, cement-like protein (SGMP1)22, Actin (GenBank, ACD99707.1) and Copper amine oxidase22. Nacrein5, Pif17713, N167 and N1940 were enriched in the nacreous layer and others were methionine-rich nacre protein (MRNP)41, MSI80 (GenBank, BAL45933.1), MSI606, Mpn8829, Actin (GenBank, ACD99707.1), Peroxiredoxin22 and Polyubiquitin42 (Table 1). The SMPs that have previously been thoroughly characterized will not be discussed in this work.

Moreover, we identified some domains that are considered to play significant roles in the formation of biominerals. These domains include Glyco_hydro_18, Cu2_monooxygen, Chitin-binding domain 2 (ChtBD2), Complement control protein (CCP/SUSHI), Epidermal growth factor-like (EGF), TIMP and Laminin_G_2/3. Compared with domains found in P. margaritifera, Glyco_hydro_20, EF-hand and Kunitz-like domains were not found. Kunitz domains (InterPro accession number, IPR002223) are the active domains of proteins that inhibit the function of protein degrading enzymes. Notably, the proteins extracted from prismatic layers carried more diverse domains than those from nacreous layers. Based on the domains, the SMPs were classified into the following three groups:

(1) Proteins that potentially regulate the extracellular microenvironment: carbonic anhydrase, chitinase, chitin-binding proteins and TIMP

The microenvironment, the immediate small-scale environment of mantle cells and organs responsible for the formation of shells, includes pH, framework (mostly chitin) and proteinase in the seawater. Therefore, the extracellular microenvironment is critical for shell formation. Three types of proteins are related to the microenvironment: carbonic anhydrase, chitin relevant proteins and proteinase.

Carbonic anhydrase is responsible for controlling pH by converting CO2 to HCO3− and is found in Nacrein. Chitin is the major framework in which CaCO3 grows20. Four proteins (PClp1, PClp3, PNU3 and PU2) contain Glyco_hydro_18 domains (IPR001223), which belong to a family of glycoside hydrolases that hydrolyze the glycosidic bond between carbohydrates or between a carbohydrate and a non-carbohydrate moiety. Specifically, chitinase hydrolyzes chitin oligosaccharides. Real-time PCR showed that the mRNAs of PClp1, PClp3 and PNU3 had high expression levels in the mantle edge and mantle pallial, suggesting critical roles in shell formation (Fig. 3a and Table S4). A recent microarray study showed that chitinases were highly expressed at the D-shaped stage of P. fucata, when the shell was first formed32. In addition to chitinase, another chitin-related domain is chitin-binding domain_2 (IPR002557). This domain was found in four proteins (PNU1, PU8, Pif177 and NU5), suggesting that chitin-binding ability is required by both layers. PNU1 and PU8 had higher mRNA expression levels in the mantle edge and lower levels in the mantle pallial, whereas NU5 showed the opposite pattern. Pif177, an important gene in the formation of nacre, had higher mRNA expression levels in the mantle pallial and lower levels in the mantle edge (Fig. 3b). In complex seawater, all secreted SMPs face degradation by microbial proteinases, so a protective mechanism is needed. We found tissue inhibitors of metalloproteinase (PTIMP and PTIMP3), which may form complexes with extracellular matrix metalloproteinases such as collagenases and irreversibly inactivate them (IPR001820). Members of this family are commonly found in the extracellular regions of vertebrate species. It has been inferred that the function of TIMP in nacre formation in P. martensii is to inhibit matrix metalloproteinase (MMP) activity, because MMPs can degrade most components of the extracellular matrix37.
Thus, the existence of TIMP may ensure a “safe” place for other SMPs to execute their functions by creating suitable MMP-to-TIMP ratios. In humans, inhibition of MMP activity occurs in a 1:1 stoichiometric relationship and an imbalanced MMP-to-TIMP ratio may lead to various diseases43. In P. martensii, knockdown of TIMP by RNA interference results in abnormal nacre formation. However, a “suitable” MMP-to-TIMP ratio in nacre formation is as yet unknown. In fact, water-soluble extracts from the nacre of P. margaritifera possessed proteinase inhibitory activity against proteinase K44. Moreover, TIMP is also found in the shells of C. gigas45 and M. coruscus23. However, other protease inhibitors, such as Kunitz-like protease inhibitors or WAP (whey acidic protein) domain proteins46, were not found in the shells of P. fucata. It is also possible that the amount of proteinase inhibitors in the shell is too low to detect. This question may be addressed by RNA-seq of mantle cells in the future.

(2) Extracellular matrix- (ECM) related proteins: fibronectin-related proteins, laminin proteins, EGF proteins and liprin-α proteins

The shells are constantly growing and can be repaired after injury, indicating the possibility of communication between cells in different systems of P. fucata and the extracellular matrix (ECM). Proteomic data from C. gigas indicated that the oyster shell matrix was not formed simply by self-assembling silk-like proteins but by diverse proteins through complex assembly and modification processes that may involve hemocytes and exosomes45. Indeed, it was discovered that, apart from RNA transport, ECM-receptor proteins were the most abundant among the shell proteins. In agreement with these previous results, we identified four ECM-related protein types (fibronectin-related, laminin, EGF and liprin-α proteins) among the SMPs that may mediate communication between cells and the extracellular matrix.

Five uncharacterized proteins from the prismatic layer, PU3, PU5, PU6, PU15 and PU16, contain fibronectin type III (FN3) repeat regions (Fig. 3a and Table S4). Strikingly, PU3, PU5 and PU6 are among the most expressed proteins in the mantle pallial (Table S4), although they are from the prismatic layer. PU3, PU5, PU6, PU15 are highly expressed in both mantle edge and mantle pallial, implying their vital roles in the formation of both layers. FN3 is an approximately 100 amino acid domain that contains different tandem repeats with binding sites for DNA, heparin and the cell surface. The majorities of proteins containing FN3 are involved in cell surface binding or are receptor protein tyrosine kinases or cytokine receptors (IPR003961). The beta-sandwich structure of FN3 closely resembles that of immunoglobulin domains47. Notably, in C. gigas, a gene coding for a fibronectin-like protein was highly expressed at the early developmental stage when larval shells are formed in unison with chitin synthase45. Proteins containing fibronectin have been found in the shells of various biomineralization species, including P. margaritifera22, C. gigas45, M. coruscus23 and A. millepora24, suggesting important but yet unknown roles.

PU8 and NU10 possess a Laminin_G (LG) domain, which is thought to mediate attachment, migration and organization of cells into tissues during embryonic development by interacting with other extracellular matrix components (IPR001791). This domain has approximately 180–200 residues and is found in many extracellular and receptor proteins. LG modules have been implicated in interactions with cellular receptors such as α6β1 integrins, sulfated carbohydrates and other extracellular ligands48. In shells, proteins containing the Laminin_G domain are found in C. gigas45, M. coruscus23 and L. gigantea21.

PU12 contains two epidermal growth factor-like (EGF) domains, which are commonly found in extracellular proteins (IPR000742). The EGF domain has six conserved cysteine residues linked through three disulfide bonds. These domains are related to the immune system, apoptosis and Ca2+ binding. In shells, EGF proteins are found in P. margaritifera22, C. gigas45, M. coruscus23 and L. gigantea21.

Another protein belonging to this category is PLiprin-α, a member of the leukocyte common antigen-related (LAR) protein tyrosine phosphatase-interacting protein family. This protein binds to the tyrosine phosphatase LAR and appears to localize LAR to cell focal adhesions. This interaction may regulate the disassembly of focal adhesions and thus help orchestrate cell-matrix interactions (IPR029515)49. However, among the other species studied, this protein has been found only in the shells of C. gigas45.

Interestingly, FN3, Laminin_G, EGF and Liprin-α proteins are able to interact with integrins, which are transmembrane receptors for cell-cell and cell-ECM interactions50. Integrin was identified in the matrix proteins of S. pistillata, a coral biomineralization model25. Moreover, αvβ6 integrin has been shown to be expressed by ameloblasts and to play a crucial role in regulating amelogenin deposition and enamel biomineralization51. These studies suggest that integrins are closely linked to biomineralization. Indeed, real-time PCR shows that Integrin gene expression in the mantle of P. fucata is higher than in the gonad, foot, muscle and gill (Figure S7).

(3) Other proteins of interest

In addition to the above two groups of proteins, some genes are of interest because of their high expression in the mantle. An acidic protein, four VwA proteins, three complement control protein (CCP) proteins, a V-rich protein and two enzymes (Tyrosinase and Copper amine oxidase) are discussed.

PNU6 is an extremely acidic protein with a pI of 3.51 and possesses a poly(D) region52. This unique primary sequence suggests a role in CaCO3 formation, possibly by anchoring Ca2+ through the poly(D) region, increasing the local Ca2+ concentration and favoring calcite precipitation. The function of PNU6 may be similar to that of the poly(D)-containing protein Aspein, an unusually acidic matrix protein found in P. fucata17.
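As a rough illustration of how a poly(D) stretch and overall acidity can be screened for in a primary sequence, the following minimal sketch reports the longest uninterrupted Asp run and the fraction of acidic residues. The function names and the toy sequence are hypothetical and are not taken from this study; in particular, the sequence is not that of PNU6.

```python
import re


def longest_run(sequence: str, residue: str = "D") -> int:
    """Return the length of the longest uninterrupted run of `residue`."""
    runs = re.findall(f"{re.escape(residue)}+", sequence.upper())
    return max((len(run) for run in runs), default=0)


def acidic_fraction(sequence: str) -> float:
    """Return the fraction of residues that are Asp (D) or Glu (E)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return sum(seq.count(aa) for aa in "DE") / len(seq)


# Hypothetical toy sequence for illustration only; NOT the PNU6 sequence.
toy_sequence = "MKGSDDDDDDDDSGDDDDEAGK"

print("longest Asp run:", longest_run(toy_sequence))
print("acidic fraction:", round(acidic_fraction(toy_sequence), 2))
```

A screen of this kind only flags candidates; it says nothing about pI, which depends on all ionizable groups, or about whether the stretch actually binds Ca2+.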

Protein-protein interactions are important for shell formation because framework proteins, acidic proteins and others must cooperate to fulfill the complex requirements of biomineralization. VwA domains (IPR002035), which in extracellular eukaryotic proteins mediate adhesion via metal ion-dependent adhesion sites (MIDAS), are found in Pif177, PNU4, PNU5 and PU4. Three of these four proteins are found in both layers, implying the importance of protein-protein interaction in both layers. In shells, VwA proteins are found in P. margaritifera22, C. gigas45, M. coruscus23 and L. gigantea21.

The complement control protein (CCP) modules (also known as short consensus repeats, SCRs, or SUSHI repeats) contain approximately 60 amino acid residues. They exist in a wide variety of complement and adhesion proteins52 and are found in PU4, PU8 and PU10. Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, which are surface markers on the outside of the red blood cell membrane (IPR000436). The presence of CCP proteins indicates a putative relationship between shells and the immune system. CCP proteins are found in the shells of P. margaritifera22, C. gigas45, M. coruscus23 and S. pistillata25.

Alveoline-like protein (Alv), a V-rich protein, has previously been discovered only in two related species, P. margaritifera and P. maxima22. Real-time PCR shows that it is highly expressed (almost 10^3–10^4 times the level in the muscle) in both the mantle edge and the mantle pallial (Fig. 3a and Table S4), indicating critical roles in the formation of both calcite and aragonite. However, the function of Alv in CaCO3 crystallization is poorly understood.
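Relative expression values of this kind are often derived from threshold cycle (Ct) values. The sketch below assumes the widely used 2^-ΔΔCt calculation with perfect amplification efficiency, which is an assumption and not necessarily the quantification method used here. The function name and all Ct values are hypothetical.

```python
def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Relative expression by the 2^-ddCt method (assumes 100% PCR efficiency)."""
    # Normalize the target gene to the reference gene within each tissue.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the tissue of interest with the calibrator tissue.
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: mantle as the sample, muscle as the calibrator.
mantle_vs_muscle = fold_change(
    ct_target_sample=18.0, ct_ref_sample=16.0,
    ct_target_calibrator=28.0, ct_ref_calibrator=16.0,
)
print(mantle_vs_muscle)
```

Because each cycle corresponds to a doubling under this assumption, a difference of about ten cycles in normalized Ct already corresponds to a roughly thousand-fold difference in relative expression.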

Tyrosinase is an oxidase that controls the production of melanin by hydroxylating monophenols and oxidizing the resulting o-diphenols to o-quinones (IPR002227). Twenty-one tyrosinase genes have been reported in the genome of P. fucata53. As expected, four tyrosinase proteins were identified in the prismatic layer and two tyrosinase proteins were found in both layers. Tyrosinase is also found in the shells of P. margaritifera22, C. gigas45 and M. coruscus23. Although the specific roles of tyrosinase are unknown, it has been proposed that this protein plays distinctive roles in melanogenesis in pigmented shells11. Additionally, a tyrosinase gene is potentially involved in larval shell biogenesis in C. gigas54.

Copper amine oxidase catalyzes the oxidation of a wide range of biogenic amines including neurotransmitters, histamine and xenobiotic amines (IPR000269). In eukaryotes, these enzymes have a broad range of functions including cell differentiation and growth, wound healing, detoxification and cell signaling. In P. fucata, copper amine oxidase is primarily expressed in the mantle edge and is expressed at lower levels in the mantle pallial. However, its role has not previously been investigated in mollusks. A previous study in eastern oysters showed that the amine metabolic process was enriched in SMPs by Gene Ontology enrichment analysis34. This protein has not been reported in the shells of other mollusks.

Implications of proteomics for the shell mineralization mechanism

Our proteomic analysis uncovered some unexpected proteins, indicating the intricacy of biomineralization in the pearl oyster. The expanded set of SMPs offers a chance to refine the previously proposed “chitin-silk fibroin gel proteins-acidic macromolecules” model20. The biomineralization mechanism of nacre has been previously proposed to consist of the following four stages: (1) assembly of the matrix, (2) the first-formed mineral phase, (3) nucleation of individual aragonite tablets and (4) growth of the tablets to form the mature tissue20. At the first stage, the matrix is formed by layers of β-chitin, with a gel comprising silk-like protein filling the space between. Using proteomics, we identified four chitinases in both layers of shells. Chitinase is an enzyme that catalyzes the hydrolysis of β-1,4-N-acetyl-d-glucosamine linkages in chitin polymers and oligomers. Interestingly, we did not find any chitin synthase in our proteomic analysis, although chitin synthase genes can be found in the genome of P. fucata53. These results suggest that chitin synthases are located in the cell or on the cell membrane, while chitinases are secreted to the outside of the cell to remodel the chitin network. Chitin-binding proteins are able to interact with both chitin and minerals. For example, Pif proteins with chitin-binding domains have been shown to play an important role in the association of the inorganic phase with the polysaccharide template and in the controlled nucleation of the initial mineral phase31. In vivo, Pif proteins are suggested to work with other proteins such as N16, contributing to the formation of the lamellar sheet of nacre13. Silk-like proteins are rich in Gly and Ala or just in Gly. Such proteins include MSI60, MSI80, Shematrins, PGRP2, PAmylase, KRMP4, SGMP1, NU5 and NU7. Silk-like proteins are found in both the prismatic and nacreous layers, indicating their importance.
One function of silk-like proteins is to act as a mild inhibitor of mineralization20. An in vitro crystallization assay demonstrated that recombinant KRMP3 inhibits the precipitation of CaCO3, affects the crystal morphology of calcite and inhibits the growth of aragonite, and that these effects are almost entirely attributable to the lysine-rich region. The Gly/Tyr-rich region of KRMP3 has the capacity to bind chitin55. Then, at the second stage, the first-formed mineral phase, which is usually composed of amorphous calcium carbonate (ACC), is formed. ACC has been considered a precursor of biominerals, including nacre, in a wide range of living organisms56. Mollusks require high concentrations of Ca2+ and CO32- from the seawater to form ACC. CO32- is concentrated by carbonic anhydrases such as Nacrein and Ca2+ is concentrated by acidic proteins such as Pif13 and Aspein17. ACC is unstable compared to calcite and therefore needs to be stabilized by specialized macromolecules such as Pif31. Several acidic proteins are found in the shell, including PNU2 and PNU6. At the third stage, nucleation of individual tablets begins, which requires nucleators. Asp-rich proteins are thought to play a role in this stage. The final stage is growth of the tablets to form the mature tissue, and the shape of the final biominerals is thought to be thermodynamically driven57. At this stage, SMPs are incorporated into the final biominerals. It is noteworthy that some SMPs, such as Pif, play multiple roles and can function during several stages. Furthermore, copper amine oxidase, peroxiredoxin and tyrosinase could be related to modification of SMPs, contributing to the overall structure and mechanical properties of shells.

Although this model can largely explain biomineralization from a crystal growth perspective, many proteins found in our proteomic analysis, including FN3 proteins, CCP proteins and EGF proteins, have not so far been linked to crystal growth. Recent studies show that the eastern oyster, Crassostrea virginica, forms its shell through a series of coordinated events involving hemocytes and the ECM58. In fact, primary mantle cell cultures of P. fucata are able to precipitate amorphous calcium carbonate in vitro, suggesting that mantle cells can perform biomineralization and shell formation processes59. Therefore, it is reasonable to hypothesize that the shell formation process is related to ECM-related proteins secreted by the mantle cells. These ECM-related proteins are part of the SMPs and may play multiple roles. For example, Osteopontin, a highly expressed ECM-related protein in bone that is glycosylated and enriched in acidic residues, is involved in a number of cellular processes including immune response and apoptosis in addition to its main role in biomineralization60.

Although we are aware that domains do not necessarily represent the exact function of proteins, the results of the present study can guide further study of the diverse shell matrix proteins and improve our understanding of biomineralization.

Conclusion

Using a proteomic approach, we identified 72 unique shell matrix proteins (SMPs), of which thirty-six are associated with the prismatic layer, nineteen with the nacreous layer and seventeen with both layers. Immunohistological localization confirmed the presence of these proteins in the mantles, shells and synthetic calcites. In addition to controlling the CaCO3 crystallization process, the shell matrix proteins potentially regulate the extracellular microenvironment and communication between cells and the extracellular matrix (ECM). Our results increase the knowledge of shell matrix proteins in pearl oysters and offer an opportunity to refine the conventional “chitin-silk fibroin gel proteins-acidic macromolecules” model.
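The layer assignment summarized above amounts to partitioning two sets of identifications. The following minimal sketch uses hypothetical placeholder identifiers, sized only to match the reported counts, to show how prismatic-only, nacreous-only and shared proteins add up to the total; it is not the authors' analysis pipeline.

```python
def partition_by_layer(prismatic, nacreous):
    """Split protein IDs into prismatic-only, nacreous-only and shared groups."""
    prismatic, nacreous = set(prismatic), set(nacreous)
    return {
        "prismatic_only": prismatic - nacreous,
        "nacreous_only": nacreous - prismatic,
        "both": prismatic & nacreous,
    }


# Hypothetical placeholder IDs, sized to match the counts reported in the text.
shared = {f"shared_{i}" for i in range(17)}
prismatic_hits = {f"prism_{i}" for i in range(36)} | shared
nacreous_hits = {f"nacre_{i}" for i in range(19)} | shared

groups = partition_by_layer(prismatic_hits, nacreous_hits)
counts = {name: len(ids) for name, ids in groups.items()}
total = sum(counts.values())

print(counts)
print("total unique proteins:", total)
```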

Additional Information

How to cite this article: Liu, C. et al. In-depth proteomic analysis of shell matrix proteins of Pinctada fucata. Sci. Rep. 5, 17269; doi: 10.1038/srep17269 (2015).