Prognostic significance and oncogene function of cathepsin A in hepatocellular carcinoma

Cathepsin A (CTSA) is a lysosomal protease that regulates galactoside metabolism. Previous studies have shown that CTSA is abnormally expressed in various types of cancer. However, few studies have addressed the role of CTSA in hepatocellular carcinoma (HCC) or its prognostic value. To study the clinical value and potential function of CTSA in HCC, datasets from The Cancer Genome Atlas (TCGA) database and a cohort of 136 HCC patients were analyzed. CTSA expression was significantly higher in HCC tissues than in normal liver tissues, which was supported by immunohistochemistry (IHC) validation. Both gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses showed that CTSA co-expressed genes were involved in ATP hydrolysis coupled proton transport, carbohydrate metabolic process, lysosome organization, oxidative phosphorylation, and other glycan degradation, among others. Survival analysis showed significantly reduced overall survival (OS) and recurrence-free survival (RFS) in patients with high CTSA expression, both in the TCGA HCC cohort and in the 136-patient HCC cohort. Furthermore, CTSA overexpression has diagnostic value in distinguishing HCC from normal liver tissue [area under the curve (AUC) = 0.864]. Moreover, gene set enrichment analysis (GSEA) showed that CTSA expression correlated with the oxidative phosphorylation, proteasome, and lysosome pathways, among others, in HCC tissues. These findings suggest that CTSA may serve as a potential diagnostic and prognostic biomarker in HCC.

www.nature.com/scientificreports/

In this study, the Gene Expression Profiling Interactive Analysis (GEPIA), The Cancer Genome Atlas (TCGA), and The Human Protein Atlas databases were used to investigate the expression of CTSA in HCC and normal liver tissues and to determine the relationship between HCC prognosis and CTSA expression [24][25][26]. This relationship was verified by immunohistochemistry (IHC). The potential function of CTSA in HCC was analyzed by screening CTSA co-expressed genes using cBioPortal and LinkedOmics, followed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses performed with the Database for Annotation, Visualization and Integrated Discovery (DAVID) 27. We then performed gene set enrichment analysis (GSEA) to determine the enriched genes and whether a series of previously defined gene sets related to 9 stages of HCC progression were enriched in different phenotypes 28.

Materials and methods
Profiling of CTSA expression data. The Genotype-Tissue Expression (GTEx) project and the TCGA database provide mRNA expression data for various types and stages of human cancers. We used GEPIA (http://gepia.cancer-pku.cn/), a web server for analyzing RNA-Seq expression data from the TCGA and GTEx projects, to explore the expression distribution and correlation of CTSA in a body map and in different cancer tissues. GEPIA and the UCSC Xena project (https://xenabrowser.net/datapages/), which has recomputed all raw expression data from the TCGA database, were used to detect CTSA expression in different stages and types of HCC 29. Then, the Kaplan-Meier Plotter (http://kmplot.com/analysis/) and the GEPIA website were used to analyze the relationship between CTSA expression and survival in HCC 30. The Human Protein Atlas database (https://www.proteinatlas.org/) was used to analyze CTSA protein expression in HCC tissues and normal liver tissues by IHC. We used the Cancer Cell Line Encyclopedia (CCLE) project (https://portals.broadinstitute.org/ccle/) to analyze the relationship between CTSA mRNA levels and DNA copy number in different liver cancer cell lines. We obtained information on alterations of the CTSA gene from the cBioPortal for Cancer Genomics (http://www.cbioportal.org/) 31.
Prognostic analysis using CTSA expression and clinicopathological data in HCC patients. We analyzed CTSA expression in HCC and adjacent peritumoral tissues from patients in the TCGA database (https://xenabrowser.net/datapages/). We performed survival analysis to assess the clinical outcomes of patients with HCC after examination and transformation of variables evaluated in a Cox proportional hazards regression model. Overall survival (OS) was defined as the interval between surgery and death or between surgery and the last observation point. For surviving patients, the data were censored at the last follow-up.
Recurrence-free survival (RFS) was defined as the interval between the date of surgery and the date of diagnosis of any type of relapse (intrahepatic recurrence or extrahepatic metastasis) 32. To evaluate the prognostic value of CTSA in HCC, 136 tumor specimens were collected from consecutive HCC patients between September 2010 and December 2012, and the last follow-up was conducted on December 31, 2017. We used the Edmondson grading system 33 and the Child-Pugh classification to assess tumor differentiation and liver function, respectively. The 2010 International Union Against Cancer Tumor-Node-Metastasis (TNM) classification system was used to assess tumor stage 34. Patients had to meet the following criteria: Child-Pugh class A liver function, a single tumor lesion without any metastasis, no radiotherapy or chemotherapy prior to surgery, and primary HCC confirmed by pathology following surgery. The relevant clinicopathological data of HCC patients were obtained from the hospital's medical record database, and pathological data were assessed by two pathologists. Survival information was obtained from medical records, telephone interviews, and the Social Security Death Index. This study was approved by the Human Research Ethics Committee of 900 Hospital of the Joint Logistics Team (Fuzhou, China). All experiments were performed in accordance with relevant guidelines and regulations. All participants were supplied with written information and gave written informed consent prior to collection of the specimens.
Immunohistochemistry (IHC) analyses. Sections (4 μm) of the 136 HCC tissue samples were mounted on Superfrost charged glass microscope slides. Then, the tissue sections were deparaffinized and rehydrated using graded concentrations of malondialdehyde and ethanol.
Antigen retrieval was performed by boiling the sections in Tris/ethylenediaminetetraacetic acid (EDTA) buffer (pH 9.0) for 20 min. Endogenous peroxidase was blocked by incubation in 3% H2O2 for 10 min, after which the sections were washed in phosphate-buffered saline (PBS) three times. The sections were incubated with 10% normal goat serum for 30 min at room temperature (25 °C) without washing. A monoclonal rabbit anti-CTSA antibody (1:300; 15020-1-AP, Proteintech, Wuhan, China) was added dropwise to the sections, which were incubated overnight at 4 °C and washed in PBS three times. The sections were then incubated with the secondary antibody (1:50,000; KIT-5010; anti-rabbit/mouse IgG; Maixin Biotechnology Development Co., Ltd., Fuzhou, China) at 37 °C for 30 min and washed in PBS three times. Next, the sections were stained with 3,3′-diaminobenzidine and a substrate-chromogen (Dako) for 2 min at room temperature and counterstained with hematoxylin for 40 s. Finally, the sections were dehydrated in 95% alcohol and sealed with neutral balsam.
Evaluation of IHC staining. Sections incubated with the secondary antibody alone, without the CTSA antibody, served as the negative control. The 136 stained tissue sections were viewed using a CX41 microscope (Olympus, Tokyo, Japan) and assessed by two separate pathologists with no prior knowledge of any patient information. CTSA expression was predominantly cytoplasmic or membranous, based on the Human Protein Atlas database and previous studies 20,22. A semi-quantitative IHC scoring system with a 5-point scale was used to evaluate the CTSA protein level, as follows: 0, no positive cells; 1, < 25% positive cells; 2, 26-50% positive cells; 3, 51-75% positive cells; 4, > 75% positive cells. HCC tissue samples with a score of 0, 1, or 2 were regarded as having low CTSA expression, whereas those with a score of 3 or 4 were regarded as having high CTSA expression.
GO and KEGG enrichment analysis and PPI network construction. The hepatocellular carcinoma datasets in the cBioPortal and LinkedOmics databases were selected to identify genes correlated with CTSA expression using the co-expression function of each database. Then, the overlapping genes with a Pearson's correlation greater than 0.35 in both databases were screened as CTSA co-expressed genes. Next, the Functional Annotation Tool in the DAVID database was used to perform GO and KEGG enrichment analysis on the CTSA co-expressed genes 35,36. In this process, the threshold for significant gene functions and pathways was set at P < 0.05. Then, the protein-protein interaction (PPI) network of CTSA co-expressed genes was constructed in the STRING database, with the minimum required interaction score set to 0.9 (highest confidence) 37. Finally, Cytoscape software was used to visualize the PPI network 38.
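
The semi-quantitative IHC scale described under "Evaluation of IHC staining" can be written down as a small lookup. The following is a minimal sketch, not the authors' scoring tool: the function names are hypothetical, and because the published scale leaves the exact boundary values (for example, exactly 25%) unassigned, the sketch assumes each boundary falls into the lower category.

```python
def ihc_score(percent_positive: float) -> int:
    """Map the percentage of CTSA-positive cells to the 0-4 score.

    Assumption (not stated in the text): boundary values such as
    exactly 25%, 50% or 75% are assigned to the lower category.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    if percent_positive == 0:
        return 0  # no positive cells
    if percent_positive <= 25:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4  # more than 75% positive cells


def ctsa_group(score: int) -> str:
    """Scores 0-2 are treated as low CTSA expression, 3-4 as high."""
    if score not in (0, 1, 2, 3, 4):
        raise ValueError("score must be an integer from 0 to 4")
    return "high" if score >= 3 else "low"


if __name__ == "__main__":
    # Hypothetical percentages, for illustration only.
    for pct in (0, 10, 40, 60, 90):
        score = ihc_score(pct)
        print(pct, score, ctsa_group(score))
```

In practice the two pathologists' readings would be reconciled before grouping; that step is outside this sketch.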
GSEA enrichment analysis. Normalized TCGA gene expression data were downloaded from the UCSC Xena database (https://xenabrowser.net/datapages/) 39. The dataset comprised 374 HCC specimens and 50 adjacent non-tumor tissue specimens. The 374 HCC specimens were divided into a high expression group and a low expression group, using the median CTSA expression in HCC specimens as the cutoff. We imported the gene expression data into GSEA 4.1.0 software for enrichment analysis. We selected the KEGG gene sets (c2.cp.kegg.v7.0.symbols.gmt) as the functional gene set, set the number of permutations to 1000, and left the other parameters at their default settings. Enriched pathways with a nominal p-value < 0.05 and an FDR q-value < 0.25 were considered significant.
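
The grouping and filtering steps of the GSEA workflow can be illustrated as follows. This is a sketch under stated assumptions rather than the GSEA software itself: the function names, sample identifiers, dictionary keys, and all numeric values are hypothetical, and samples exactly at the median are assumed to fall into the low expression group, which the text does not specify.

```python
from statistics import median


def split_by_median(expression):
    """Split samples into (high, low) groups at the median expression.

    `expression` maps a sample ID to its CTSA expression value.
    Assumption: samples equal to the median go to the low group.
    """
    cutoff = median(expression.values())
    high = [s for s, v in expression.items() if v > cutoff]
    low = [s for s, v in expression.items() if v <= cutoff]
    return high, low


def significant_gene_sets(results, p_cutoff=0.05, fdr_cutoff=0.25):
    """Keep gene sets with nominal p < 0.05 and FDR q < 0.25."""
    return [
        r["name"]
        for r in results
        if r["nom_p"] < p_cutoff and r["fdr_q"] < fdr_cutoff
    ]


if __name__ == "__main__":
    # Toy expression values, not TCGA data.
    expr = {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0}
    print(split_by_median(expr))

    # Toy enrichment results, not the study's output.
    toy = [
        {"name": "SET_A", "nom_p": 0.01, "fdr_q": 0.10},
        {"name": "SET_B", "nom_p": 0.04, "fdr_q": 0.30},
        {"name": "SET_C", "nom_p": 0.20, "fdr_q": 0.20},
    ]
    print(significant_gene_sets(toy))
```

Both thresholds must hold at once, so a gene set with a small nominal p-value but an FDR q-value of 0.25 or more is excluded.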
Genetic alteration of CTSA in HCC. We selected the Liver Hepatocellular Carcinoma (TCGA, Firehose Legacy) dataset in the cBioPortal database to query the genetic alterations and mutational hotspots of the CTSA gene in HCC. Then, the overall survival rate was compared between HCC patients with and without CTSA alterations.
Statistical analysis. Statistical analysis was performed using SPSS 21 (SPSS Inc., Chicago, IL, USA) and GraphPad Prism 6.0 (GraphPad Software, Inc., San Diego, CA, USA). Pearson's chi-square test was used to compare categorical variables. For normally distributed continuous variables, Student's t-test was used. Survival was estimated using Kaplan-Meier plots, and differences were assessed with the log-rank test. The diagnostic significance of CTSA for HCC was evaluated using the receiver operating characteristic (ROC) curve. P < 0.05 was considered statistically significant unless otherwise stated.
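
For readers unfamiliar with the Kaplan-Meier method mentioned above, the product-limit estimate can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the routine used by SPSS or GraphPad Prism: the function name and the follow-up data are hypothetical, and an event flag of 1 denotes death (or recurrence for RFS) while 0 denotes censoring at the last follow-up.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up durations (e.g. months since surgery)
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving at t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up data for five patients.
    months = [5, 8, 12, 12, 20]
    observed = [1, 0, 1, 1, 0]
    for t, s in kaplan_meier(months, observed):
        print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their last follow-up, which is why the data for surviving patients are censored rather than discarded.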

Results
The relationship between the expression of CTSA and survival percentages of HCC patients in the GEPIA and Kaplan-Meier Plotter databases. In the GEPIA database, CTSA mRNA expression was significantly higher in a variety of tumor tissues than in the corresponding normal tissues (Fig. 1A). In addition, the human body map shows that the mRNA expression level of CTSA is significantly up-regulated in liver cancer tissues compared with normal liver (Fig. 1B). Analysis of the CCLE database showed that CTSA mRNA expression and copy number differ significantly among various kinds of cancer cell lines (Fig. 1C). In the GEPIA database, CTSA is significantly overexpressed in HCC patients (Fig. 2A), and this overexpression is associated with a poor prognosis (Fig. 2B). We evaluated the relationship between CTSA mRNA levels and overall survival in HCC patients using the Kaplan-Meier Plotter database and found that the high expression cohort had a shorter median survival time (P = 0.0081) (Fig. 2C).
The relationship between the expression of CTSA and clinical outcomes in HCC patients in the TCGA database. We downloaded the HCC dataset from the TCGA database to analyze the relationship between CTSA mRNA expression and the clinical outcome of HCC patients. The mRNA expression of CTSA in the HCC group (n = 373) was significantly higher than in the non-HCC group (n = 50) (P < 0.001, Fig. 2D). The expression level of CTSA was significantly positively correlated with the TNM stage of HCC (Fig. 2E). In addition, the mRNA expression of CTSA increased progressively with increasing neoplasm histology grade (Fig. 2F). High expression of CTSA mRNA was related to vascular invasion (P = 0.001), tumor TNM stage (P = 0.004), serum alpha-fetoprotein (AFP) level (P = 0.001), neoplasm histology grade (P = 0.038), and adjacent hepatic inflammation (P = 0.009) in HCC patients. Age, gender, family cancer history, preoperative radiotherapy, preoperative pharmaceutical therapy, and body mass index (BMI) were not related to CTSA mRNA expression (Table 1). We found that vascular invasion (P = 0.014), tumor TNM staging (P < 0. …
… Human Protein Atlas database (https://www.proteinatlas.org/) and found that its expression in HCC tissues (> 75%, Fig. 2J) was higher than in normal liver tissues (< 25%, Fig. 2K).
The relationship between the CTSA protein expression and clinicopathologic characteristics in 136 HCC patients. CTSA was mainly expressed in the cytoplasm and on the cell membrane of HCC cells (Fig. 3A,B). According to the semi-quantitative IHC scoring system, 70 of 136 patients showed high CTSA expression and 66 showed low expression. High CTSA protein expression in HCC patients was related to TNM staging (P = 0.024), serum AFP level (P = 0.001), tumor location (P = 0.037), tumor differentiation (P = 0.031), tumor recurrence (P = 0.013), and survival (P = 0.036), but not to age, gender, tumor size, vascular invasion, or tumor encapsulation (Table 3).
Analysis of CTSA alteration using RNA-sequencing data set in cBioPortal database. We queried the genetic alterations of CTSA in a cohort of 359 HCC patients (TCGA, Firehose Legacy) in the cBioPortal database and found that the gene was altered in 25 (7%) of the queried HCC patients, including 1 case of truncating mutation, 1 case of amplification, and 23 cases of high mRNA expression. We also analyzed the mutational hotspots of CTSA in the 359 HCC patients and found one mutational hotspot, the frameshift deletion L218Wfs*40 (Fig. 4A). The percentage of samples with a somatic mutation in CTSA was 0.3%. In addition, the Kaplan-Meier curve shows that OS in LIHC patients with CTSA alterations (n = 25) was poorer than in those without CTSA alterations (n = 334) (P = 0.0174, Fig. 4B).
… genes. We selected the liver hepatocellular carcinoma datasets in the cBioPortal and LinkedOmics databases to identify genes correlated with CTSA expression using the co-expression function of each database. A total of 19,899 genes related to CTSA protein expression were investigated in a cohort of 371 HCC patients from the LinkedOmics database.
We present a gene heat map and volcano plot showing 8846 genes positively correlated and 11,053 genes negatively correlated with CTSA protein expression (Fig. 4C-E). A total of 117 overlapping genes with a Spearman's correlation greater than 0.35 in both the cBioPortal and LinkedOmics databases were screened as co-expressed genes of CTSA (Fig. 4F). Next, we performed GO and KEGG enrichment analysis for the 117 co-expressed genes of CTSA using DAVID 6.8 to predict the potential function and pathways of CTSA in HCC. The GO analysis showed that the 117 co-expressed genes of CTSA were mainly enriched in biological processes such as ATP hydrolysis coupled proton transport, carbohydrate metabolic process, ganglioside catabolic process, lysosome organization, and negative regulation of fibroblast proliferation. The KEGG enrichment analysis showed that the 117 co-expressed genes of CTSA were mainly enriched in pathways such as lysosome, oxidative phosphorylation, metabolic pathways, and amino sugar and nucleotide sugar metabolism (Table 5). The 117 co-expressed genes of CTSA were analyzed in the STRING database to identify significant interactions, which were visualized using Cytoscape software. A PPI network of 47 nodes and 70 edges was constructed with a confidence score of > 0.900 (highest confidence) (Fig. 4G). The PPI network showed that the FTL, GRN, NPC2, HEXB, and PTGES2 proteins can interact with CTSA (Fig. 4H). Correlation analysis in the LinkedOmics database confirmed that the protein expression of FTL, GRN, NPC2, HEXB, and PTGES2 was significantly positively correlated with that of CTSA (FTL: r = 0.4042, GRN: r = 0.4132, HEXB: r = 0.4641, NPC2: r = 0.4646, PTGES2: r = 0.3533, all P < 0.05) (Fig. 5A).
We also performed Kaplan-Meier curve and log-rank test analyses in the Kaplan-Meier Plotter database, and the results showed that the mRNA expression levels of FTL, GRN, NPC2, and HEXB were significantly related to OS in HCC patients (all P < 0.05) (Fig. 5B).
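
The screening of overlapping co-expressed genes described in this section amounts to thresholding two correlation tables and intersecting the results. The sketch below illustrates that logic only: the function name, gene labels, and correlation values are hypothetical placeholders, not exports from cBioPortal or LinkedOmics, and it assumes each database's output has already been reduced to a mapping from gene symbol to correlation coefficient with CTSA.

```python
def coexpressed_overlap(corr_a, corr_b, threshold=0.35):
    """Return genes whose correlation exceeds the threshold in both tables.

    corr_a, corr_b : dicts mapping gene symbol -> correlation with CTSA,
                     one per database (hypothetical pre-parsed input).
    """
    selected_a = {gene for gene, r in corr_a.items() if r > threshold}
    selected_b = {gene for gene, r in corr_b.items() if r > threshold}
    # Only genes passing the cutoff in both sources are kept.
    return sorted(selected_a & selected_b)


if __name__ == "__main__":
    # Placeholder values for illustration; not real database output.
    table_one = {"GENE_A": 0.46, "GENE_B": 0.36, "GENE_C": 0.20}
    table_two = {"GENE_A": 0.41, "GENE_B": 0.30, "GENE_D": 0.50}
    print(coexpressed_overlap(table_one, table_two))
```

Requiring the cutoff in both sources is stricter than using either alone, which is consistent with only 117 of the correlated genes being retained.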

GSEA enrichment analysis of CTSA expression in TCGA database.
We performed GSEA in GSEA 4.1.0 software to investigate the potential pathways through which CTSA may regulate the carcinogenesis and development of HCC, using normalized gene expression data from the TCGA HCC dataset downloaded from the UCSC Xena database. The GSEA revealed that KEGG pathways associated with carcinogenesis and development, including "oxidative phosphorylation", "proteasome", "lysosome", "glycerophospholipid metabolism", "other glycan degradation", "pentose phosphate pathway", and "amino sugar and nucleotide sugar metabolism", were significantly altered in the CTSA high expression group (Fig. 5C). Furthermore, the genes involved in these KEGG pathways were significantly altered in the high CTSA expression group (Table 6). In summary, CTSA may regulate the occurrence and development of HCC through these signaling pathways.

Discussion
HCC is a highly aggressive malignant tumor with high mortality that imposes a huge health burden globally. According to data released in 2019 by the American Cancer Society 40, the 5-year survival percentage of HCC patients for all stages was only 18%, and its cancer-related mortality ranks 5th among all cancers. Even though methods of diagnosis and treatment have developed rapidly, HCC still lacks effective diagnostic biomarkers.
In the past few years, the role of various classes of cathepsins in the proliferation and metastasis of various types of human cancers has been extensively studied. The overexpression of cathepsin B promotes invasion and metastasis of breast cancer, pancreatic cancer, HCC, and colorectal cancer [11][12][13][14]. In addition, cathepsin B has been found to be involved in tumor initiation, migration, and drug resistance of glioblastoma stem cells and prostate cancer stem cells 41,42. CTSA is a well-known serine protease member of the cathepsin lysosomal protease family and has been identified as a potential biomarker for early diagnosis, prognosis, and monitoring during cancer treatment 20,21,43. A previous study showed that knockdown of CTSA suppressed the metastasis of prostate cancer by reducing the phosphorylation of the P38 MAPK pathway 22. Another study found, using quantitative proteomics, that CTSA is highly expressed in hepatocellular carcinoma 44, but its clinical prognostic value and gene function have never been elucidated. This study is the first systematic investigation of the diagnostic value, clinical significance, and gene function of CTSA in HCC. We analyzed the mRNA expression level of CTSA using the GEPIA and TCGA databases and found that its expression in HCC tissues was significantly higher than in adjacent tissues. This result was confirmed by IHC data from the Human Protein Atlas database and from 136 clinical specimens. Furthermore, high expression of CTSA mRNA was significantly correlated with poor OS and PFS of HCC patients in the GEPIA database, Kaplan-Meier Plotter database, and TCGA database, as well as in the 136 HCC patients, indicating that CTSA may play an important role in the development of HCC.
To further investigate the clinical significance of CTSA in HCC, we analyzed the relationship between clinicopathological variables and CTSA expression in the TCGA database and in the 136 HCC patients and found that high mRNA expression of CTSA was significantly associated with vascular invasion, TNM staging, serum AFP level, neoplasm histology grade, adjacent hepatic inflammation, tumor recurrence, and survival. The multivariate regression analysis confirmed that high mRNA expression of CTSA was an independent risk factor for OS in HCC patients. High protein expression of CTSA was an independent risk factor for OS and RFS in HCC patients (Table 4). Previous studies have shown that CTSA can be used as a prognostic biomarker for HCC 44, and our findings are consistent with them. The ROC curve based on data from the TCGA database indicated that CTSA mRNA expression has significant diagnostic value in distinguishing HCC from normal liver tissues. IHC is a routine pathological examination after HCC resection. Our research showed that CTSA protein expression was significantly increased in HCC and was an independent risk factor for OS and RFS. Therefore, postoperative CTSA IHC examination can help predict the recurrence and prognosis of HCC patients. To explore the function of CTSA in the tumorigenesis and development of HCC, we identified the co-expressed genes using the cBioPortal and LinkedOmics databases and then performed GO and KEGG enrichment analysis on the co-expressed genes using DAVID software. The results show that CTSA is mainly involved in biological processes such as ATP hydrolysis coupled proton transport, carbohydrate metabolic process, ganglioside catabolic process, lysosome organization, and negative regulation of fibroblast proliferation.
KEGG analysis and GSEA both indicated that CTSA is involved in pathways such as lysosome, oxidative phosphorylation, and metabolic pathways, which should be investigated further in future work. We further constructed a PPI network of CTSA co-expressed genes and screened out several genes, including FTL, GRN, NPC2, HEXB, and PTGES2, that interact with CTSA. To the best of our knowledge, genes with similar expression patterns may be functionally related or even similar. The results from the Kaplan-Meier Plotter database revealed that high mRNA expression of these co-expressed genes was related to poor OS in HCC patients. Taken together, our results indicate that CTSA may act as an oncogene in HCC tumorigenesis.
To gain more insight into the role of CTSA in HCC, we further queried its genetic alterations in a cohort of 359 LIHC patients in the cBioPortal database. The results showed that CTSA was altered in approximately 7% of patients, and these alterations were significantly related to poor OS. We further explored the reasons for this genetic alteration. From the PPI network, we screened out 5 co-expressed genes that interact with CTSA, and the high expression of these co-expressed genes was associated with poor OS in HCC. It is reasonable to assume that these co-expressed genes have an impact on the CTSA alterations and may play an important role in the tumorigenesis and development of HCC.

Conclusion
The mRNA and protein expression levels of CTSA in HCC tissues were significantly higher than in adjacent normal liver tissues. High CTSA expression was associated with poor clinical outcomes in HCC patients. CTSA can be used as a prognostic biomarker for HCC. CTSA may act as an oncogene that regulates tumorigenesis and development by influencing pathways such as lysosome, oxidative phosphorylation, and metabolic pathways. Postoperative CTSA IHC examination can help predict the recurrence and prognosis of HCC patients.

Data availability
All datasets generated for this study are available within the article.