Discovering potential serological biomarker for chronic Hepatitis B Virus-related hepatocellular carcinoma in Chinese population by MAL-associated serum glycoproteomics analysis

The accuracy of current biomarkers for the diagnosis of hepatocellular carcinoma (HCC), especially chronic Hepatitis B Virus (HBV)-related HCC, is limited. Recent progress in glycoproteomics has provided a novel platform for screening novel serological biomarkers of HCC. In this study, lectin affinity chromatography by Maackia amurensis lectin (MAL) and iTRAQ combined with mass spectrometric analysis were performed to enrich and identify the glycoprotein fractions in serum samples from HBV-related HCC patients and from healthy controls. Seventeen differential MAL-associated glycoproteins were identified. Among them, Galectin 3 binding protein (Gal-3BP) was selected for further evaluated by ELISA analysis and showed a high diagnostic potential of HBV-related HCC, with the AUC of 0.898 and a sensitivity, specificity and accuracy of 80.00%, 93.75% and 86.88%, respectively. Moreover, we constructed a predictive model through the combined use of serum Gal-3BP and Alpha Fetoprotein (AFP), which improved the sensitivity (from 87.5% to 95%), specificity (from 93.75% to 95%) and accuracy (from 90.63% to 95%) of diagnosing early HCC. These data suggested serum Gal-3BP level is a promising biomarker to identify HBV-related HCC and the combined use of serum Gal-3BP and AFP improves the diagnostic potential of HBV-HCC compared with AFP alone in current clinical practice.

have been associated with a poor prognosis [6][7][8] . Aberrant sialylation and enhanced activity of the sialyltransferases have been proven to be characteristic features of cancer cells 9 . Furthermore, Sialyl Lewis x , an important sialic acid-containing carbohydrate epitope, and increased α 2,3 SialylT activity are involved in the adhesion and metastasis of cancer cells 10,11 . Therefore, exploration and identification of specific alternations in glycosylation patterns, such as sialylation, should be an urgent and promising direction in the cancer biomarker research field, including work on HCC biomarkers.
It is worth noting that most of the currently used cancer biomarkers, such as carbohydrate antigen (CA)15-3, CA19-9, CA125, carcino-embryonic antigen (CEA), and Alpha Fetoprotein (AFP), are glycoproteins. More remarkably, as a heterogenetic glycoprotein of AFP, the lens culinaris agglutinin-reactive fraction of AFP (AFP-L3) has been approved by the US Food and Drug Administration (FDA) as a diagnostic index for HCC 12 and has been increasingly widely used over the past decade. Serum is the most available sample to be the primary clinical specimen in disease diagnosis and biomarker discovery because various glycoproteins from cell surfaces or tissue are released into serum 13 . However, most of the current studies are focused on glycosylation analysis of global serum glycoproteins; in contrast, there are fewer studies on the serum glycoproteins with specific carbohydrate structure. Lectins, which are widely used in glycan research, could recognize glycoprotein fractions by their strong affinity for specific carbohydrate epitopes of glycoproteins, and lectin affinity chromatography is one of the most widely used tools to purify specific glycoproteins from complex mixtures prior to their identification by mass spectrometry (MS) 14,15 . Additionally, the latest advances in MS technology are pushing proteomics toward the analysis of post-translational modifications and have made major strides in glycoprotein identification 16,17 .
In this study, we enriched glycoprotein fractions in serum samples from Chinese patients with chronic HBV infection and early HCC and in serum samples from healthy controls; the serum samples were enriched by lectin affinity chromatography with Maackia amurensis lectin (MAL), which could bind with the Siaα 2,3 Gal structure. The enriched fractions were labeled with mass-balanced isobaric tags (isobaric tag for relative and absolute quantitation, iTRAQ) and were identified by Nano-high-performance liquid chromatography with tandem mass spectrometric (Nano-HPLC-MS/MS) analysis to provide a novel variety of serological biomarker candidates for diagnosing HBV-related HCC. The differentially expressed MAL-associated serum glycoproteins were further validated by western blotting. Among these candidates, based on enzyme-linked immunosorbent assay (ELISA) analysis, Galectin-3-binding protein (Gal-3BP) was verified as one of the potential serological biomarkers for diagnosing HBV-related HCC; moreover, when combined with serum AFP, the diagnostic accuracy of Gal-3BP was enhanced.

Results
Identification and relative quantification of MAL-associated serum glycoproteomics from HBV-related HCC patients and from healthy controls. The same volume of pooled albumin-and IgG-depleted serum samples from 15 Chinese early-stage HCC patients with chronic HBV infection and from 15 healthy controls were enriched by MAL-agarose and labeled with iTRAQ tags. The two groups of peptides were mixed and analyzed by nano-HPLC-MS/MS for protein identification and relative quantification. The identified proteins were further searched using the Uniport database, and 17 MAL-associated serum glycoprotein groups were identified and relatively quantified as differentially expressed glycoproteins by iTRAQ analysis (Table 1). When the ratio of quantified glycoproteins of HBV-related HCC patients to healthy controls was defined as >1.2 or <0.8, eight of them, such as Gal-3BP, α -2-macroglobulin and α -1-antitrypsin, were up-regulated, and the other nine, such as Complement factor B and Vitamin D-binding protein, were down-regulated. Figure 1A,B show representative MS spectra of peptide IYTSPTWSAFVTDSSWSAR digested from Galectin-3-binding protein and TLNQPDSQLQLTTGNGLFLSEGLK digested from α -1-antitrypsin, respectively.
Validation of the differentially expressed MAL-associated serum glycoproteins. To confirm these differentially expressed MAL-associated serum glycoproteins, three up-regulated proteins (Gal-3BP, α -2-macroglobulin and α -1-antitrypsin) and one down-regulated protein (vitamin D-binding protein) were selected for further verification by western blotting in another pool of serum samples from 15 Chinese early HCC patients with chronic HBV infection and 15 healthy controls. Quantitative analysis of immunoblotted bands of these proteins was performed by Quantity One v4.62 (Bio-Rad, Hercules, CA, USA). Figure 2 showed that all four of these proteins were completely coincident with the iTRAQ quantification trend.

Bioinformatics analysis for the differentially expressed MAL-associated serum glycoproteins.
To further understand the roles of these differentially expressed MAL-associated serum glycoproteins, DAVID analysis and Ingenuity Pathway Analysis (IPA) analysis were performed. Figure 3A shows that these differential glycoproteins were mainly involved in the acute inflammatory response, complement activation, protein processing, maturation and proteolysis; furthermore, nearly 63% of the differentially expressed MAL-associated serum glycoproteins were involved in the defense response. The cellular component functional annotation of these glycoproteins was mainly enriched in the extracellular region and in the extracellular space (Fig. 3B). In addition, molecular function analysis of the differentially expressed MAL-associated serum glycoproteins indicated that the most common functional annotations of these proteins were inhibitor activity and complement binding (not shown here). One of the top corresponding IPA networks of the series of 17 differentially expressed MAL-associated serum glycoproteins is shown in Fig. 4. There were 11 network shapes in the network: peptidase, transmembrane receptor, phosphatase, kinase, cytokine, enzyme, complex/group, transporter, ion channel, translation regulator and other. Moreover, four significant nodes, including tumor necrosis factor (TNF), Nuclear factor-kappa B (NFkB), protein kinase B (PKB, also known as Akt) and epidermal growth factor receptor (EGFR), were enriched in the network.

Analysis of Gal-3BP and MAL-associated Gal-3BP levels in pooled serum samples.
Of the three up-regulated MAL-associated proteins, MAL-associated Gal-3BP was selected for further analysis because its expression level was highest in the serum samples. Immunoprecipitation (IP) using Gal-3BP-and MAL-associated Gal-3P-specific antibodies was performed to confirm the expression levels of Gal-3BP and MAL-associated  Gal-3BP in pooled serum samples from another group of 15 HBV-related HCC patients and from another group of 15 healthy controls, respectively. Figure 5A showed that more purified Gal-3BP was detected in pooled serum samples from HBV-related HCC patients compared with pooled serum from the healthy controls. Meanwhile, the result from the lectin blotting analysis showed that the proportions of MAL-associated Gal-3BP were similar in equivalent purified serum Gal-3BP samples from both HBV-related HCC patients and healthy controls. In other words, this finding suggests that the up-regulation of MAL-associated Gal-3BP in HBV-related HCC patients compared with the levels of MAL-associated Gal-3BP found in serum from healthy controls was mainly due to the increased expression level of Gal-3BP protein in HBV-related HCC patients but not to significant changes in sialylation in these patients.
Verification of the clinical diagnostic potential of serum Gal-3BP for HBV-related HCC. The level of Gal-3BP in serum samples from 80 early HCC patients with chronic HBV infection and 80 healthy controls was examined by quantitative ELISA analysis to further explore the clinical diagnostic potential of serum Gal-3BP. Compared with the levels of Gal-3BP in the serum of healthy controls, the levels of Gal-3BP in the serum samples from HBV-related HCC patients were statistically higher (p < 0.001, Fig. 5B). The ROC curve was established to evaluate the diagnostic efficacy of the serum Gal-3BP, and the area under the ROC curve (AUC), which can summarize the overall diagnostic accuracy of a potential biomarker, was 0.898 (Fig. 5C). The sensitivity, specificity and accuracy of Gal-3BP to predict HCC were 80.00%, 93.75% and 86.88%, respectively ( Table 2).
Considering AFP to be the classical serum biomarker of HCC, the serum level of AFP in these 80 early HCC patients with chronic HBV infection and 80 healthy controls was also examined, and its AUC was 0.901 (Fig. S1); the associated sensitivity, specificity and accuracy of AFP were 87.50%, 93.75% and 90.63%, respectively (Table 2). Then, we constructed a logistic regression model and identified a panel constructed jointly with serum AFP level and serum Gal-3BP level. The predictive model for distinguishing HCC patients from healthy controls was as follow: The value 0.3733 was defined as a cut off value (p < 0.001); the positive likelihood ratio was 19; and Nagelkerke R was 0.845. Figure 5D shows that the panel provided a higher diagnostic accuracy of HCC, and the AUC was up to 0.975, with a sensitivity of 95.00%, specificity of 95.00% and accuracy of 95.00% (Table 2), which improved 7.5%, 1.25% and 4.37% compared with AFP alone, respectively.

Discussion
HCC is the second most common malignant cancer in China, and there is a high prevalence of HBV infection among Chinese patients with HCC 18 . Currently, α -1-Fetoprotein (AFP) is mainly used in the clinic for diagnosis of primary HCC. However, because its sensitivity and specificity are not ideal 19 , novel diagnostic biomarkers for the diagnosis of early HBV-related HCC are greatly needed. In recent years, the use of 'omic' methods, especially quantitative proteomics analysis, to search for and identify novel HCC biomarkers has attracted extensive attention. For instance, Naboulsi et al. demonstrated that versican core protein (VCAN) is a potential biomarker for early HCC diagnosis by label-free discovery analysis: selected (multiple) reaction monitoring (SRM/MRM) 20 . It was also reported that protein S100A9, which was dramatically up-regulated in sera from HCC patients, is a candidate HCC diagnostic biomarker 21 . Lu used the mass spectroscopic approach to characterize the metabolic features of the liver in HCC patients and verified that serum acetylcarnitine was a meaningful biomarker reflecting HCC diagnosis and progression 22 . Our group has focused on potential biomarkers of HCC for several years and reported that heat-shock protein 27 (HSP27) could be a potential biomarker of HCC by 2-DE based on serum proteome analysis 23 . Furthermore, we reported a series of glycoprotein biomarkers of HCC, such as paraoxonase 1 (PON1) and haptoglobin (Hp), through quantitative ELSIA analysis 24,25 . In a retrospective study, we performed differential MAL-associated serum glycoproteomics analysis in pooled serum samples from early HCC patients with chronic HBV infection and in pooled samples from healthy controls by MAL affinity chromatography

Table 2. Diagnosis values of serum AFP level and serum Gal-3BP level in differentiating HCC patients with chronic HBV infection from healthy controls.
combined with iTRAQ followed by Nano-HPLC-MS/MS analysis. Of these MAL-associated serum glycoproteins, Gal-3BP was further validated as a promising candidate for diagnosing early HCC; this result further confirmed that the combined use of serum Gal-3BP and serum AFP might serve as a potential marker for distinguishing early HCC patients with chronic HBV infection from healthy controls in the Chinese population. In this study, it was further found that 17 differentially expressed MAL-associated serum glycoproteins were significantly enriched in four significant nodes (TNF, NFkB, Akt and EGFR) of the IPA network, which were all involved in the incidence of HCC. Among them, NFkB can regulate genes for cell adhesion, proliferation, differentiation and apoptosis, and an imbalance in NFkB activity has been associated with the development of HCC 26 . Further, TNF can induce cell death and excessive hepatocyte apoptosis, whereas NFkB can protect hepatocytes against TNF-induced cell death 27 . In addition, diverse lines of evidence have suggested that EGFR can bind transforming growth factor α (TGF-α ) and that the activation of the TGF-α /EGFR pathway contributes to HCC formation 28 . It was also proven that blocking the activity of EGFR has an antitumor effect on the development of HCC 29 , whereas the AKT signaling pathway has been firmly established as a major determinant of HCC occurrence and might be the target of specific inhibiting therapies 30,31 . Overall, these results suggest that the differentially expressed MAL-associated serum glycoproteins in our study might be involved in the development of HCC by interacting with the significant nodes in the network. However, the direct interactions of these glycoproteins with the significant nodes remain elusive; moreover, the specific mechanisms of how these glycoproteins are involved in HCC development are unknown and will require further research.
Gal-3BP was identified as one of the differentially expressed MAL-associated serum glycoproteins between HBV-related HCC patients and healthy controls in our study; meanwhile its expression quantity was the highest among the three up-regulated MAL-associated proteins. In contrast, although α -2-macroglobulin was more significantly different protein in this study, it had been reported already as a novel cytochemical not serological marker to identify rat hepatocellular preneoplastic and neoplastic lesions 32,33 . So we chose MAL-associated Gal-3BP for further analysis on serum samples from HCC patients and healthy controls. Gal-3BP is a secreted protein composed of 585 amino acid residues, and it is widely expressed in many tissues and epithelial cells 34,35 . It was previously reported that the serum Gal-3BP levels were elevated in patients with cancer and viral infections, such as hepatitis C virus (HCV) 36 . Difference gel electrophoresis (DIGE) analysis showed a statistically significant increase in the plasma Gal-3BP levels in HCV-infected cirrhotic patients 37 . Other studies demonstrated that there are complex type N-glycans with terminal Siaα 2-3 structure on Gal-3BP proteins from breast cancer cells and that as a large hyperglycosylated protein, Gal-3BP could act as a ligand for galectins by glycan-dependent interactions and induce galectin-mediated tumor cell aggregation 38 . Of note, it was found and confirmed that the serum Gal-3BP levels were significantly elevated in HCC patients in comparison to those of healthy individuals 39 , consistent with our report. Furthermore, in our study, the serum Gal-3BP level provided a relatively high diagnostic accuracy for distinguishing HBV-related HCC patients in the Chinese population, and based on the predictive model, the combined use of serum Gal-3BP and serum AFP could improve the sensitivity, specificity and accuracy in the diagnosis of HCC. Furthermore, the serum Gal-3BP level also showed a relative sensitivity in the diagnosis of HCC, even with low AFP levels (not shown here). These results suggested that the combined use of serum Gal-3BP and serum AFP might make up for the low sensitivity of serum AFP alone in the diagnosis of HCC.
However, there are some limitations to this study. Although it was also found that the sialylation of Gal-3BP in serum samples was relatively stable between patients with and without HCC, this study cannot rule out a possible role of the glycosylation of Gal-3BP in regulating the structure and function of the glycoprotein and how it takes part in the development of HCC; the role of glycosylation in the regulation of Gal-3BP in HCC is thus worthy of further investigation. Moreover, further validation in larger samples is needed to confirm that the model constructed by combining serum Gal-3BP with serum AFP could aid in the diagnosis of HCC superior to AFP alone.
In conclusion, we acquired the MAL-associated serum glycoproteomics profile of Chinese HCC patients with chronic HBV infection and healthy controls by iTRAQ combined with Nano-HPLC-MS/MS analysis, and we demonstrated that serum Gal-3BP is a promising diagnostic candidate of HBV-related HCC by quantitative ELISA analysis. Moreover, this is the first report that, based on a predictive model, the combined use of serum Gal-3BP and serum AFP might be a potential marker to identify HBV-related HCC patients.

Methods
Clinical specimens. In this study, serum samples from 110 Chinese early HCC patients with chronic HBV infection (The TNM Classification of Malignant Tumours (TNM) was I and II) and from 110 healthy controls were collected at the First Affiliated Hospital of Guangxi Medical University (Guangxi, China) and stored at − 80 °C for further analyses. The general information and clinical characteristics of these HCC patients and healthy controls are shown in Table 3. Pooled serum samples from 15 early HCC patients with chronic HBV infection and from 15 healthy controls were used for identifying MAL-associated serum glycoprotein profiles by MS, and other pooled serum samples from 15 HBV-related HCC patients and from 15 healthy controls were used to re-identify the results from MS by western blotting. Additionally, serum samples from 80 HBV-related HCC patients and from 80 healthy controls were used in further quantitative ELISA analyses. Informed consent was obtained from each patient; all of the projects in the study were approved by Ethics Committee of the

Nano-HPLC-MS/MS analysis.
The Nano-HPLC MS/MS analysis was carried out according to previous studies 41,42 by a Nano Aquity UPLC system (Waters Corporation, Milford, MA, USA) coupled with a quadrupole-Orbitrap mass spectrometer (Q-Exactive, Thermo Fisher Scientific, Bremen, Germany) equipped with an online nano-electrospray ion source. The protocol was described as follows: each sample was resuspended in 0.1% formic acid and loaded onto a trap column (Thermo Scientific Acclaim PepMap C18, 100 μ m × 2 cm) for 3 min at a flow rate of 10 μ l/min. The sample subsequently separated by the analytical column (Acclaim PepMap C18, 75 μ m × 25 cm) with a linear gradient, from 5-30% phase B in 95 min (phase A: water with 0.1% formic acid; phase B: ACN with 0.1% formic acid). The flow rate of the analytical column was maintained at 300 nL/min, and the column temperature was maintained at 45 °C. The Q-Exactive mass spectrometer was operated in data-dependent mode with full-scan MS spectra from 350-1600 m/z, resolution at 70 K, followed by fifteen sequential high energy collisional dissociation (HCD) MS/MS scans with a resolution of 17.5 K. In all cases, one microscan was recorded using dynamic exclusion of 30 seconds.
The raw mass spectrometry data files were extracted using the Proteome Discoverer software (Thermo Fisher Scientific, version 1.4.0.288), and the MS/MS results were analyzed using Mascot (Matrix Science, London, UK; version 2.3). The search parameters were as follows: Database was specified in Mascot as Uniprot-SwissProt (Taxonomy: human, 20201 entries), enzyme was specified as trypsin, a fragment ion mass tolerance was specified as 0.050 Da and a parent ion tolerance was specified as 10.0 PPM. Carbamidomethyl of cysteine and iTRAQ 8-plex of lysine and the N-terminus were specified in Mascot as fixed modifications, and Oxidation of methionine and iTRAQ 8-plex of tyrosine were specified as a variable modification. Peptide level false discovery rates (FDR) were controlled lower than 1% by the percolator algorithm. Protein quantifications were performed by unique peptides, and experimental bias was corrected by the method of normalization on the protein median; the minimum number of proteins that must be observed to allow normalization was set to 100.

Western blotting. Equivalent volumes of eluted MAL-associated fraction in pooled albumin-and
IgG-depleted serum samples from Chinese HBV-related HCC patients and from healthy controls were separated by SDS-PAGE and analyzed by western blotting. In brief, proteins in the gels were transferred onto PVDF membranes (Millipore, Billerica, MA, USA). After blocking for nonspecific binding, the membranes were incubated with antibodies against α -2-macroglobulin, α -1-antitrypsin, Gal-3BP and vitamin D-binding protein (R&D, Minneapolis, MN, USA) at 4 °C overnight, respectively. Then, the membranes were washed with 0.1% TBS-Tween20 (TBST, 50 mM Tris, 150 mM NaCl, 0.1% Tween 20, pH 7.6), and incubated with HRP-conjugated secondary antibodies for 1 h at room temperature (Kangcheng, Shanghai, China). After the membranes were washed three times with TBST, Amersham TM ECL TM prime western blotting detection reagents (GE Healthcare, Piscataway, NJ, USA) were used to detect the bands on the membranes. Bioinformatics analysis. The identified differentially expressed MAL-associated proteins were evaluated using the Uniprot database (http://www.uniport.org). Further functional categories analysis was performed by the Database for Annotation, Visualization and Integrated Discovery (DAVID, https://david.ncifcrf.gov/), and Ingenuity Pathway Analysis (IPA, QIAGEN, Redwood City, CA, USA) was used to investigate the biological interactions of them.
Immunoprecipitation and lectin blotting. Albumin-and IgG-depleted serum sample proteins were immunoprecipitated with Gal-3BP antibody using Pierce ® Co-Immunoprecipitation Kit according to the recommendation of the manufacturer. Equivalent volumes of eluted immune-complexes were run on 10% SDS-PAGE  Table 3. General information and clinical characteristics of Chinese HCC patients with chronic HBV infection and healthy controls a . a The values supplied in Table 3 were means with SD or range. b Abbreviations: AFP, alpha fetoprotein; ALT, alanine aminotransferase; AST, aspartate transaminase; HbsAg, hepatitis B surface antigen.
gels and incubated with Gal-3BP antibody to perform the western blotting detection as described above. Subsequently, the sample amounts were adjusted according to the quantitative analysis of immunoblotted bands, and the PVDF membranes were incubated with biotinylated MAL (Vector Laboratories, Burlingame, CA, USA) and Streptavidin Horseradish Peroxidase (HRP) Conjugate (Invitrogen, Waltham, MA, USA) in turn to perform lectin blotting detection.
ELISA analysis. Levels of Gal-3BP protein in serum samples from HBV-related HCC patients and healthy controls were assayed by Human Gal-3BP Quantikine ® ELISA kits (R&D, Minneapolis, MN, USA) according to the recommendation of the manufacturer. Serum samples were diluted (1:500) in sample diluents, respectively before the assay.
Statistical analysis. Statistical analysis was performed by SPSS 16.0 statistical packages (SPSS Inc., Chicago, IL, USA). Data were presented as the mean ± SD, unless otherwise indicated. Quantitative variables were evaluated using Student's t-test (two tailed) to compare two groups of parametric variants; p < 0.05 was considered statistically significant. Receiver operating characteristic (ROC) curves generated by sensitivity and 1-specificity were used to determine the diagnostic value of biomarkers; the cutoff was defined as the point in the ROC curve that maximizes the value of sensitivity plus specificity.