A novel tumor immunotherapy-related signature for risk stratification, prognosis prediction, and immune status in hepatocellular carcinoma

Immunotherapy as a strategy to deal with cancer is increasingly being used clinically, especially in hepatocellular carcinoma (HCC). We aim to create an immunotherapy-related signature that can play a role in predicting HCC patients’ survival and therapeutic outcomes. Immunotherapy-related genes were discovered first. Clinical information and gene expression data were extracted from GSE140901. By a series of bioinformatics methods to analyze, overlapping genes were used to build an immunotherapy-related signature that could contribute to predict both the prognosis of people with hepatocellular carcinoma and responder to immune checkpoint blockade therapy of them in TCGA database. Differences of the two groups in immune cell subpopulations were then compared. Furthermore, A nomogram was constructed, based on the immunotherapy-related signature and clinicopathological features, and proved to be highly predictive. Finally, immunohistochemistry assays were performed in HCC tissue and normal tissue adjacent tumors to verify the differences of the four genes expression. As a result of this study, a prognostic protein profile associated with immunotherapy had been created, which could be applied to predict patients' response to immunotherapy and may provide a new perspective as clinicians focus on non-apoptotic treatment for patients with HCC.


A novel tumor immunotherapy-related signature for risk stratification, prognosis prediction, and immune status in hepatocellular carcinoma
Jianping Sun , Lefeng Xi , Dechen Zhang , Feipei Gao , Liqin Wang & Guangying Yang * Immunotherapy as a strategy to deal with cancer is increasingly being used clinically, especially in hepatocellular carcinoma (HCC).We aim to create an immunotherapy-related signature that can play a role in predicting HCC patients' survival and therapeutic outcomes.Immunotherapy-related genes were discovered first.Clinical information and gene expression data were extracted from GSE140901.By a series of bioinformatics methods to analyze, overlapping genes were used to build an immunotherapy-related signature that could contribute to predict both the prognosis of people with hepatocellular carcinoma and responder to immune checkpoint blockade therapy of them in TCGA database.Differences of the two groups in immune cell subpopulations were then compared.Furthermore, A nomogram was constructed, based on the immunotherapy-related signature and clinicopathological features, and proved to be highly predictive.Finally, immunohistochemistry assays were performed in HCC tissue and normal tissue adjacent tumors to verify the differences of the four genes expression.As a result of this study, a prognostic protein profile associated with immunotherapy had been created, which could be applied to predict patients' response to immunotherapy and may provide a new perspective as clinicians focus on non-apoptotic treatment for patients with HCC.Among the various solid tumors in humans, liver cancer is one of the tumors that has attracted much attention, and one of the world's deadliest cancers.This disease is often at an advanced stage when diagnosed, which is the main cause for its high mortality 1 .Thus, further research is needed to identify key molecules in the progression of hepatocellular carcinoma and guide clinicians in making accurate diagnosis and treatment decisions.Immunotherapy was used in many disease areas with impressive results and revolutionized cancer treatment 2 .

Abbreviations
www.nature.com/scientificreports/Such as lung cancer 3 , pancreatic cancer 4 , medulloblastoma 5 , and etc. it had emerged as an invaluable treatment option for many patients with HCC 6,7 .However, they were not successful in all patients.Despite these advances in the understanding of treatment strategy, the key molecules that influence the efficacy of HCC immunotherapy need to be further clarified.Proteins are the basic functional units of the human body, and different proteins have their unique roles, for example, as enzymes, antibodies, messengers 8,9 , etc.For the treatment of tumors, they are often present as targets, such as PD-L1 is an important immune checkpoint 10 .In the present study, a new prognosis-related risk model about immunotherapy-related proteins was developed, it can be a potential target for individual immunotherapy as well as a prognostic indicator for HCC.

Collect public data and identify immunotherapy-related genes
We downloaded the microarray data of HCC patients after PD-1/PD-L1 immunotherapy from the GSE140901 database in NCBI (https:// www.ncbi.nlm.nih.gov/ geo/) 11 .Other RNA sequencing data (RNA-seq) and clinicopathology features data used in the paper were gained from three free and publicly available datasets: TCGA-LIHC (https:// xena.ucsc.edu), ICGC (LIRI-JP) (https:// dcc.icgc.org).We obtained some images in public databases, protein expression in HPA (http:// www.prote inatl as.org) and cell line expression in CCLE (http:// www.ccle.org) 12 .Table S1 shows the general clinical characteristics of HCC patients from the database and the author's hospital.

WGCNA and co-expression modules analysis
After all genes were included in the Weighted correlation network analysis (WGCNA) process, power values were filtered out as each module was built.In the gradient method, each module had a different power value (ranging from 1 to 20) was tested for independence and average connectivity.It was subsequently found that the power value reached most appropriate when the degree of independence was close to 0.85.To make the results highly reliable, we extracted the gene information corresponding to each module, and 50 was the minimum number of genes we set."Heatmap 3" further determined the interaction of co-expression modules.

Creation of an immunotherapy-related signature
Univariate Cox regression with p < 0.05 was our method used to filter out genes related to patient's prognosis after immunotherapy.The screened prognostic genes were then screened for the most valuable genes using least absolute shrinkage and selection operator (LASSO) Cox regression, random forest algorithm and stepwise COX regression models, and overlapping genes were further incorporated into a multivariate Cox regression model to establish an immunotherapy-associated signature.The multivariate Cox relapse coefficient (β) was applied to build a risk score found on directly mixing the equation below with genes expression levels.Risk score = ∑iCoefficient (genes)*Expression (genes).In the GSE140901 dataset, the ROC (receiver operating characteristic curve) curve was used to assess the diagnostic accuracy of this signature.Furthermore, in the Metascape database (https:// metas cape.org), gene set enrichment analysis (GSEA) was performed to separate the altogether cautious GO and KEGG items based on different scores 13 .At last, a nomogram model was established to study the predictive accuracy of the signature in TCGA database.

Immune infiltrate and genetic alterations analysis
To obtain the abundance ratio of infiltrating immune cells in the tumor immune microenvironment, we used TIMER 14 , CIBERSORT 15 and xCELL 16 databases.Mutation data were acquired from TCGA, and genetic variation in different subgroups was assessed using the R package "maftools".

IHC assays for protein levels of genes within the immunotherapy-associated signature
Four antibodies were bought back from Thermo Fisher Scientific.Immunohistochemistry was used to detect 40 HCC tumor tissues and 40 adjacent normal tissues from YIHE Hospital.After informing all patients of the purpose of our study, all patients signed a written informed consent form for the organization's donation.These tissues were first fixed with formalin, then embedded with paraffin, and finally made into 3 μm thick sections by a sectioning mechanism for later use.After antigen repair, the sections were incubated overnight at 4 °C with either ENG (PA5-79203; 1:200 dilution), FCER1G (PA5-28832; 1:500 dilution), PSEN1 (PA5-98093; 1:100 dilution), SLAMF6 (MA5-29572; 1:500 dilution), and binding was detected using the avidin-biotin-peroxidase method.Block slides after hematoxylin counterstaining.It was subsequently referred to two experienced pathologists for an independent, double-blind evaluation.The immunohistochemical positive intensity score criteria were graded as 0, 1, 2, and 3 for no, weak, moderate, and strong staining, respectively.Scores of 0 and 1 were considered low expressions, while scores of 2 and 3 were high expressions.

Statistical analysis
An independent-sample t-test was used to analyze quantitative variables.With R software (version 4.0.3),ROC curve analysis and Kaplan-Meier survival analysis were used to evaluate the accuracy of predicting survival outcomes.Cox proportional models were used to examine the relationship between prognostic classifiers and survival outcomes, as well as other clinical parameters.The results were considered statistically significant when the P-value was less than 0.05.

Create and verify immunotherapy-related signatures
A total of 28 candidate genes with a P < 0.05 were determined to be related to patient prognosis (Fig. 2A), further screening by LASSO-Cox regression model (Fig. 2B), Random forests model (Fig. 2C), and Stepwise COX regression model.Then, the overlapped genes were gained by the Veen model, and a four-genes signature was finally created (Fig. 2D).To assess the potential value for diagnosing of the four-genes, the ROC curve had been used, and had a good predictive power, and the AUCs were 0.859, 0.745, 0.694, and 0.852 in the datasets GSE140901, respectively (Fig. 2E).The results from database showed that only the PSEN1's expression level was different in tumor and normal tissues (Fig. 3A), although all the four genes were closely associated with patient's prognosis (Fig. 3B), according to the results of GEPIA database (http:// gepia.cancer-pku.cn) 17 .The expression profiles of ENG, FCER1G, PSEN1, SLAMF6 from HPA (http:// www.prote inatl as.org) shown that, ENG, FCER1G, SLAMF6 were not expressed or lowly expressed in tumor and normal tissues; PSEN1 was unexpressed or underexpressed in normal tissues and highly expressed in tumor tissues (Fig. 3C).We obtained the expression levels of 4 genes in HCC cell lines from the CCLE database (http:// www.ccle.org), shown in Fig. 3D.

Establishment of an immunotherapy-related genes signature in TCGA
These four genes were placed into multivariate Cox-regression model, then, a 4-gene signature related to HCC prognostic was identified.Risk score = PSEN1*0.7484834-ENG*0.2827324-SLAMF6*0.4829894+ FCER1G*0.6202788.After calculating the risk score of TCGA liver cancer patients using the above formula, patients were divided into two risk subgroups based on the optimal risk score threshold (Fig. 4A).Kaplan-Meier survival analysis showed that the prognosis was better when the score was lower, while those with higher scores had the opposite outcomes (Fig. 4B), and ROC analysis found that this signature got an ideal predictive function for patient prognosis with AUCs at 1-, 2-, 3-year of 0.770, 0.741, 0.767(Fig.4C).As shown as Fig. S2, differential gene enrichment in the high-and low-risk subgroups in immune-related biological processes.Our following job reveals that living patients had lower risk scores than dead patients.Even more, patients in the advanced clinical stage (Fig. 4D) always had a higher risk score.This information suggests that patients with low-risk score have a better outcome.

Verification of the signature in the ICGC
For validating the signature, ICGC datasets were token into use as validation cohort.The same formula was used to calculate a patient's risk score.In the ICGC, patients were divided into high-risk and low-risk subgroups (Fig. S3A) in the ICGC dataset, we found that surviving patients with hepatocellular carcinoma (Fig. S3B) or in the early stage of TNM were accompanied by lower scores (Fig. S3C).The results of the Kaplan-Meier survival analysis showed that in the ICGC cohort, the higher the risk score, the lower the OS rate, and the two were significantly negatively correlated (Fig. S3D).ROC analysis showed that the prognosis prediction of this signature was also very good in the ICGC cohort, with AUC of 0.659, 0.634, and 0.656 at 1, 2, and 3 years, respectively (Fig. S3E).

Immune infiltrate and genetic alterations analysis
Significant difference in ImmuneScore was observed between the two groups.The high-score group had lower scores, Stromalscore and ESTIMATEscore observed same results (Fig. 5A).After using TIMER, CIBERSORT, and xCELL databases to determine the abundance ratio of infiltrating immune cells in the immune microenvironment of HCC tissues, a heat map of all the different immune cells was made.In the TIMER database, we found that the infiltration level of CD8 + T cells was remarkably reduced in the high-risk group.The results we obtained from CIBERSORT revealed that the infiltration levels of CD8 + T cells, CD4 + T cells, Macrophage.M1 cells, Macrophage.M2 cells were significantly lower in the high-risk score group.The results we obtained from xCELL showed that the infiltration levels of Myeloid dendritic cell activated cells, CD8 + naïve T cells, Common lymphoid progenitor cells, CD8 + central memory T cells, Endothelial cells, Hematopoietic stem cells, Macrophage cells, Macrophage.M2 cells, Plasmacytoid dendritic cells, CD4 + Th1 T cells, CD4 + Th2 T cells were significantly lower in the highrisk score group (Fig. 5B).All these results revealed that a decrease infiltration levels of CD8 + T cells in tumors.Analysis of genetic variation showed that the mutation rates of the top 10 genes with the most significant mutations differed significantly between two subgroups (Fig. 6A).Subsequently, the results of the TMB assessment for each patient showed a positive correlation between risk score and TMB (Fig. 6B, C).Next, higher levels of PD-L1 expression were detected in tumor tissue from high-risk group patients, suggesting that patients would benefit more from immune checkpoint inhibitors (ICIs) treatment in this group, even with lower PD1 expression levels (Fig. 6D).These suggest that the higher the risk score in our model, the worse the response to immunotherapy is likely to be.Furthermore, we used the TIDE scoring system to demonstrate this.As shown in Fig. 7A-B, in our model, the higher-risk group has a higher TIDE score, tumor cells are more prone to immune escape, and the outcome of the response to immunotherapy may be worse.

Establish a nomogram model in TCGA
In TCGA, a nomogram model was developed to examine the coefficient prediction efficiency of this signature.Results indicated that the nomogram(C-index = 0.739) provided an accurate quantitative prediction method for predicting 1-to-3-year survival rate (Fig. 7C).From the overlap of predicted probability and actual probability of 1-to-3-year survival rate in the calibration curve, it showed great agreement (Fig. 7D).The AUC (area under the curve) values from these curves show that our model has good predictive potential (Fig. 7E).

IHC assay in clinical samples
As demonstrated in Fig. 8, there was no difference in the expression level of ENG in tumor and normal tissue adjacent tumors, and the expression of FCER1G was lower in tumor tissue, PSEN1 was higher in tumor tissue, and SLAMF6 was lower in tumor tissue, with Fisher's precision probability test results at (p = 0.084, p < 0.001, p < 0.001, and p < 0.001), respectively.

Comparison with the previous signature
We tried to compare the predictive potential of several genetic signatures to help researchers better understand the prognostic significance.As shown in Fig. S4A-B, our signature showed better predictive power when using fewer genes than other previously published signatures.

Discussion
Hepatocellular carcinoma was the second most common lethal tumor worldwide and had a high mortality rate.Compared with other types of liver cancer, HCC accounted for 90% of primary liver malignancies 1 .At present, in addition to liver transplantation, there are many options for treatment, mainly radical hepatectomy, local ablation, chemoembolization, transcatheter therapy, targeted therapy, etc. 18,19 In the beginning, immune checkpoint inhibitor to be clinically studied in HCC was Tremelimumab, which targets CTLA-4 20 .Subsequently, it was confirmed in multiple research reports that immunotherapy was effective for liver cancer, and the immunosuppressive combination therapy developed immediately after that further improved the prognosis of patients [21][22][23][24] .Clinicians wishing to learn more about immunology were also working to tailor treatments to individual patients www.nature.com/scientificreports/based on predictive biomarkers and etiology, potentially reshaping many previous treatment options and guidelines for the benefit of patients with advanced disease.Moreover, these strategies had been applied to the treatment of various cancers, such as melanoma, lung cancer, glioblastoma, etc.However, to date, few patients have received long-term remission, and only 20-40% of patients have responded to immunotherapy.How to select appropriate detection and prognostic biomarkers and screen the most beneficial patient population are both opportunities and challenges.www.nature.com/scientificreports/inhibitors in HCC.In the subsequent mutation gene analysis, we found that TP53, CTNNB1 and TTN gene mutations were predominant in patients in the high-risk group, while the mutation rates of these three genes were lower in the low-risk group.Mutations in TP53 and CTNNB1 genes are common oncogenic factors and are usually associated with poor tumor prognosis.A higher proportion of these genes' mutation in the high-risk group also predicts a worse prognosis for tumor patients.In addition, LRP1B, OBSCN and ABCA13 high-frequency mutated genes were also present in the high-risk group, while APOB, FLG and HMCN1 high-frequency mutated genes were alone in the low-risk group, and the two groups had significantly different gene mutation patterns.patients with high-risk scores were accompanied by high TMB and higher levels of PDL-1 expression, suggesting an immune microenvironment of HCC could be affected by four-gene, and having shown that immunotherapy would be more effective in the high-risk score subgroup.By using TIMER, CIBERSORT and xCELL, we obtained results that CD8 + expression levels were lower in the high-risk group, which may be associated with T cell exhaustion in advanced tumors 25 .T cells were in constant contact with tumor antigens and become blunted in their reactivity.When tumor immune escape occurs, it often indicates that the body's defense ability to the tumor is reduced, resulting in a worse prognosis.Our analysis showed that the high-risk  According to the statistical results of the TCGA transcriptome, FCER1G and PSEN1 were negatively correlated with patient prognosis, while ENG and SLAMF6 were positively, and we performed further experimental verification.Endoglin (ENG) encodes homodimeric transmembrane proteins, as the main glycoprotein in the vascular endothelium, which is a component of transforming growth factor receptor complexes.This protein has a high affinity for BETA1 and BETA3 peptides and is involved in the regulation of angiogenesis.It is not only essential for the integrity and normal structure of the adult vasculature, but also regulates the migration of vascular endothelial cells.Work in recent years has shown that Endoglin may be a reliable disease biomarker and therapeutic target, which has brought it a lot of attention [26][27][28] .Our findings showed no significant difference in ENG expression levels in normal tissues from tumors and adjacent tumors.This result was consistent with those obtained in the TCGA database, but inconsistent with the results of some studies.Chen et al. 29 showed that ENG is directly regulated by miR-370 and promote the occurrence and development of endometrioid carcinoma.HCC tumor tissue contains abundant microvessels, and tumor cells might express ENG protein to promote intertumoral angiogenesis, which will promote intrahepatic and extrahepatic metastasis of the tumor and presumably bring a worse prognosis, but further trials are needed.Function of the high-affinity IgE receptor (FCER1G) is associated with anaphylaxis.It is a tetramer consisting of 1 α, 1 β, and 2 γ chains.γ chain is also a subunit of other Fc receptors.It is a component of the high affinity immunoglobulin E (IgE) receptor involved in the transduction of allergic inflammatory signals in mast cells.It had been proved that it was expressed in monocytes/macrophages of tumor microenvironment 30,31 .Studies had shown that the subgroup with higher myeloma expression level had better prognosis.However, for gliomas of the central nervous system, the results were reversed 32 .The results 33,34 of two pan-cancer studies suggest that FCER1G is involved in the immune infiltration process of tumors and may be a potential immunotherapeutic target.Our experiment confirmed that it was highly expressed in the sinuses of normal tissue adjacent tumor, while the expression of tumors was reduced.Kupffer Cells (KC) were widely present in normal liver tissues and were a key mediator of the inflammatory response that occurs within liver tissue cells.Several studies had demonstrated that KC was lowly expressed in www.nature.com/scientificreports/hepatocellular carcinoma tissues 35 , while FCER1G protein was expressed in monocytes/macrophages, which was consistent with our study.Presenilin 1 (PSEN1) was an intramembrane protease, the active subunit of the γ-secretase complex.Presenilin-1 (PSEN1) and presenilin-2 (PSEN2) are two genes associated with several diseases, and early studies have focused on their association with familial Alzheimer's disease (FAD), which often leads to early onset of the disease [36][37][38] .Many recent studies had proved that PNEN1 was involved in the initiation and development of tumors and was always accompanied by bad effects 39,40 .KEGG showed that PNEN1, as an upstream protein, positively regulated the process of β-catenin protein entering the nucleus, which promoted the occurrence of tumors 41,42 .The histology of β-catenin activated hepatocellular adenoma might have moderate cytological and structural abnormalities, which is difficult to differentiate from highly differentiated HCC.This adenoma also has a higher malignancy rate, too.Many studies had proved that the continuous activation of Wnt/β-catenin signaling pathway make malignant tumor cells have the characteristics of continuous self-renewal and growth, which will also reduce the efficacy of HCC immunotherapy 43,44 .In studies focused on treatment, Ma et al. 45 showed that high expression of PSEN1 increased the radiation resistance and chemotherapy tolerance of hepatocellular carcinoma cells.SLAM family member 6(SLAM6) belongs to the Signaling lymphocyte Activating molecule (SLAM) family.Its receptors regulate innate and adaptive immune responses and trigger cytolytic activity in some natural killer cells (NK) 46,47 .Our experiments found that SLAM6 protein expression levels were decreased in HCC.Study has shown that SLAMF6 promotes the development of liver cancer by promoting macrophage M2 polarization 48 .Other studies have shown that CD8 + T cell responses during chronic viral infections are sustained by interleukin-21 (IL-21) from CD4 + T cells, which are composed of three transcriptionally and epigenetically distinct populations: Cxcr5 + Tfh cells, Slamf6 + Memory-like (Tml) subsets, Cxcr6 + Th1 cells 49 .
The positive correlation between Slamf6 and CD8 + T cell may explain the poor prognosis when the expression is decreased.In short, the four genes in the prognostic model are involved in the immune activity of the human body, further studying its mechanisms helps to provide new ideas for treatment.
Compared with other prognostic models, such as Fourteen-gene, Twelve-gene, Ten-gene, and Six-gene [50][51][52][53] .Our prediction model achieves more accurate predictions and higher C-index scores with fewer genes.Inevitably, individual studies had some limitations.The validity of this signature needs to be verified by more HCC samples.The mechanism of how each gene affects the immune efficacy needs more study.In addition, in vivo and in vitro assays will be added in future studies to further investigate the expression of these four genes at the protein level and prognostic relevance, including their roles in HCC progression.

Conclusions
In summary, our study constructed an immunotherapy-related risk model for predicting prognosis and individualized immunotherapy for HCC patients, which was effective in classifying those patients.

Figure 1 .
Figure 1.Identification of genes related to immunotherapy treatment.(A) As a result, two non-grey modules are filtered out.(B) The blue module was significantly associated with immunotherapy efficacy.(C) The scatter plot for genes in the blue module of GS score and MM.(D) Enrichment analysis of Immunotherapy-related genes from the blue module.

Figure 2 .
Figure 2. Identification of genes related to immunotherapy treatment.(A) Immunotherapy-related genes by univariate Cox regression.(B) Immunotherapy-related genes screened by the LASSO-Cox regression model.(C) Immunotherapy-related genes screened by the Random forests model.(D) Four overlapping genes were considered Immunotherapy-related genes.(E) Validation of the diagnostic value of immunotherapy Efficacy in the dataset GSE140901.

Figure 3 .
Figure 3. Expression and prognostic significance of the 4 genes.(A) Expression difference in the tumor and normal tissues.(B) Prognostic value analysis.(C) Protein's expression levels in the tumor and normal tissues.(D) Gene's expression levels in the cell lions.

Figure 4 .
Figure 4. Construction of Immunotherapy-related genes signature in TCGA.(A) Risk score distribution, OS status and expression profile of four genes.(B) In the TCGA cohort, survival outcomes were significantly decreased in patients with higher risk scores.(C) ROC analysis predicts the prognostic value of 1-, 2-, and 3-year OS rates.(D) Higher risk scores were linked to different survival status, such as grade, recurrence status, TNM stage, T stage, and vascular invasion.

Figure 5 .Figure 6 .
Figure 5.Immune infiltrate analysis.(A) Risk score was significantly correlated with immune score, interstitial score, and estimation score.(B) A heatmap of all substantially different immune cells between the high-and low-score subgroups.

Figure 8 .
Figure 8. Protein's expression in validation test cases.(A) Number of high and low expression cases in normal tissue adjacent tumor and tumor.(B) Protein's expression levels in validation cases.