Identification of ferroptosis-related genes for overall survival prediction in hepatocellular carcinoma

Ferroptosis is a novel type of cell death depending on iron and is strongly related to the development of tumors. Hepatocellular carcinoma (HCC) is a malignancy with high incidence. Despite some reports demonstrating the relation between ferroptosis-related genes and HCC, more details have not been excavated. In the present study, we collected and analyzed HCC patients' datasets from the TCGA-LIHC project and ICGC portal, respectively. Through the bioinformatic methods, we screened 126 differentially expressed genes. Then a prognostic model was established with four genes (GPX2, MT3, PRDX1, and SRXN1). PRDX1 is the hub gene of the prognosis model and has a high expression in hepatocellular carcinoma tumor tissue and cell lines. We further found that silencing PRDX1 increased the accumulation of ferrous ions and lipid peroxidation accumulation in HEPG2 cells and promoted ferroptosis in hepatocellular carcinoma. In conclusion, the study demonstrated the four-gene signature can be used to predict HCC prognosis. It also revealed the potential function of the ferroptosis-related gene PRDX1 in HCC, which can be a biomarker of the prediction for HCC outcome.

Establishment and assessment of prognostic model. To understand what role FRGs play in the HCC more deeply, we applied univariate Cox regression analysis and received 19 prognostic genes ( Table 1). The LASSO (Least Absolute Shrinkage and Selection Operator) regression analysis was used to get the genes that are obviously related to the overall survival of HCC (Fig. 3A), and cross-validation for the model of selected genes is necessary (Fig. 3B). For a stable prognosis-related gene signature, the 19 genes related to overall survival (OS) were processed by stepwise multivariate Cox regression analysis after the LASSO approach. Following that, we finally built a four-gene signature for overall survival in HCC, and the four genes are GPX2, MT3, PRDX1, and SRXN1 (Fig. 3C). According to the multivariate Cox regression coefficients, the risk score of each sample in the www.nature.com/scientificreports/ training cohort was calculated. To assure the accurate separating standard of the risk score group, we painted an ROC curve to search cut-off value based on the risk score, and the cut-off value we got is 1.275 (Fig. 4A). The 167 HCC patients in the training dataset were divided into two groups, high-(n = 25) and low-risk group (n = 142), based on the cut-off value. Comparing two groups with their survival status and the four genes expression, the high-risk group is obviously associated with a comparative high death rate and with higher gene expression ( Fig. 3D-F). Through the same process, the HCC patients in validation cohort were divided into high-(n = 92) and low-risk group (n = 43) by cut-off value (cut-off value = 0.802) (Fig. 4D). And the results that the correlation in risk score and survival status even genes expression are similar to the training set ( Fig. 3G-I).  www.nature.com/scientificreports/ Next, we applied Kaplan-Meier analysis to perform the difference of survival status between the high-risk group and low-risk group in the training cohort (Fig. 4B). The Kaplan-Meier curve presents the significant difference between the high-and low-risk group (P < 0.0001). For better convincing, we painted the ROC curve to validate the result of the Kaplan-Meier curve, and a higher AUC (Area Under Curve) means better performance. The AUCs of OS corresponding to 1, 2, and 3 years in the training set were 0.682, 0.694, and 0.539, respectively (Fig. 4C). The two plots were also validated in the testing cohort, and we got the same result that the Kaplan-Meier curve performed a differentially expressed OS between two groups (P = 0.025) (Fig. 4E), AUCs were 0.534, 0.681, and 0.635 respectively contacting to the OS of 1, 2 and 3 years in the validation set (Fig. 4F). The credible AUCs can prove the high-risk group with poor prognosis presented by the Kaplan-Meier curve in HCC patients. Besides, we also analyzed the four prognostic genes with the Kaplan-Meier approach in the training cohort and validation dataset, respectively ( Fig. S1A-H). And we noticed that the expression level of PRDX1 is obviously related to overall survival (P = 0.031; P = 0.023). However, the expression of the four genes in the normal group, tumor group, and risk groups did not present differences (Fig. S1I, J). And then we used the STRING website to determine the key of the four genes are PRDX1 and SRXN1 ( Fig. S2A-C). Through the exploration of the GEPIA website, we checked the expression level of the two genes and plotted their survival curves based on the TPM value of the expression level ( Fig. S2D-G).
For the prognostic model, univariate Cox regression analysis and multivariate Cox regression analysis were applied to assess the clinical characteristics whether are the independent predictively prognostic factors or not. As shown in Fig. 5, with four clinical parameters (Age, Gender, Stage, and risk score), it is suggested that risk score is the most significant clinical feature related to survival in training and testing cohorts (P < 0.05), but the p-value of risk score processed by multivariate Cox regression analysis in ICGC (International Cancer Genome Consortium, https:// dcc. icgc. org/ proje cts/ LIRI-JP) dataset (P = 0.053) is a few more than 0.05. Hence, we considered that risk score can be an independent prognostic predictor with high probability.
Association between FRGs and immune. To clearly evaluate the proportion of 19 tumor-infiltrating immune cells in each sample, CIBERSORT analysis was applied to HCC samples which P value < 0.05 in the   www.nature.com/scientificreports/ training dataset (Fig. 6A). Besides, we explored the correlation between risk score and immune status including immune cells and immune functions. Single-sample Gene Set Enrichment Analysis (ssGSEA) was applied to assess the abundance level of immune cells and immune functions in two cohorts ( Fig. 6B-E), most enrichment scores expressed highly in the high-risk group. Figure 6B and Fig. 6C presented the different enrichment scores of 5 immune cells and 4 immune functions between high-and low-risk groups in the training dataset. And the enrichment score of neutrophils in 5 immune cells was significantly different in the two risk groups (P < 0.05), the enrichment scores of check-point and Para inflammation in 4 immune functions indicated significant difference (P < 0.01). As shown in Fig. 6D and E, they presented different enrichment scores of 4 immune cells and 3 immune functions between high-and low-risk groups in the validation dataset. However, the results were different from those in the training cohort. The enrichment score of Treg cells in 4 immune cells was significantly different in two risk groups (P < 0.05), but there were not obviously different enrichment scores about immune functions between high-and low-risk groups in the testing cohort.

Gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses.
Subsequently, we applied functional enrichment analyses to the DEGs in high-and low-risk groups in the training cohort. As observed, the result of functional annotation encompassing biological processes (BP), cellular components (CC), and molecular functions (MF) by Gene Ontology (GO) analysis performed some ferroptosis-related function, such as cellular response to oxidative stress in biological processes, antioxidant activity, oxidoreductase activity, acting on NAD(P)H in molecular functions and so on (Fig. 7A). In addition, the ferroptosis-related DEGs were dealt with Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis (Fig. 7B). As expected, the pathway was significantly enriched on the ferroptosis pathway.
To guarantee the result more accurately, ICGC cohort, a validation dataset was accepted with GO and KEGG pathway enrichment analyses (Fig. 7C,D), too. Delightfully, the functions and pathways of enrichment in the validation dataset were as same as the training dataset, especially the enriched pathways found by KEGG analysis that the first enrichment pathway was ferroptosis.

The correlation between FPI and tissues and immunohistochemistry for validation.
To understand the influence on prognosis and therapy ferroptosis brought in HCC, ferroptosis potential index (FPI) quantizing the expression level of ferroptosis was established to reveal the ferroptotic functions and different expressions between normal and tumor tissues in HCC patients from the TCGA-LIHC cohort. As shown in Fig. 8A, the expression level of FPI has an obvious difference (P = 1.0e−04) in normal and tumor groups. Finally, we found out the figures of immunohistochemistry (IHC) about the expression of encoding proteins by four-gene signature from The Human Protein Atlas (HPA) database. Obviously, for the expression level, PRDX1 was strongly expressed in normal tissues, but GPX2 was moderately expressed in tumor tissues ( Fig. 8B-E). Regrettably, there were no IHC plots of MT3 and SRXN1 that can be completely compared with on the website.

Silencing PRDX1 induces ferroptosis in hepatocellular carcinoma cells.
To determine the clinical relevance of PRDX1 expression levels, we examined PRDX1 expression in a normal liver cell (LX2) and three hepatocellular carcinoma cell lines (MHCC-97H, HEPG2, and Huh-7). The results showed that PRDX1 protein www.nature.com/scientificreports/ levels were expressed in MHCC-97H, HEPG2, and Huh-7 cells (Fig. 9A), and notably, its expression was highest in HEPG2 cells. Therefore, HEPG2 cells were used for the follow-up analysis. To reveal the biological function of PRDX1 in HEPG2 cells, we transfected HEPG2 cells with siRNA to silence PRDX1 expression. Western blot results showed that PRDX1 protein expression was significantly down-regulated in HEPG2 cells after transfection with si -PRDX1, indicating that PRDX1 gene was successfully knocked down (Fig. 9B).  www.nature.com/scientificreports/ Similarly, to further investigate the role of PRDX1 expression in ferroptosis, we examined the expression of key indicators associated with ferroptosis (Fe2+, lipid peroxidation, and ferroptosis -associated proteins) in HEPG2 cells with knockdown of PRDX1. First, lipid peroxidation plays a key role in the development of ferroptosis, we next examined the effect of PRDX1 on intracellular lipid peroxidation levels in HEPG2 cells and showed that inhibition of PRDX1 expression increased intracellular lipid peroxidation levels in HEPG2 cells (Fig. 9C). In addition, we examined the effect of PRDX1 on the change of intracellular Fe2+ levels and found that PRDX1 knockdown increased intracellular Fe2+ levels in HEPG2 cells (Fig. 9D). Meanwhile, western blot results showed that the expression of SLC7A11 and GPX4 decreased and the expression of 4HNE and ACSL4 increased in HEPG2 cells after PRDX1 knockdown (Fig. 9E). Thus, our results suggest that silencing PRDX1 promotes ferroptosis in hepatocellular carcinoma cells.

Discussion
By the increasingly developed anticancer therapy of selective induction of cancer cell death, ferroptosis was received with great concern because of its special modality of cell death. Several reports have indicated that ferroptosis plays a crucial role in the process of tumorigenesis 17,18 and it provides a new perspective on cancer treatment and may develop new strategies for the treatment of liver cancer 19 . Hepatocellular carcinoma has a strong relation with metabolic disorders 20 . ROS has been proved to be chemically reactive metabolites containing oxygen that decrease in liver cancer 21 . Lipid ROS accumulating too much by exogenous drugs and ironmetabolism dysregulation both are the keys in ferroptosis origin and development 6,22 . There are several previous studies had explored the relationship between ferroptosis-related genes and HCC prognosis, and received a few genes related to HCC prognosis [23][24][25] . However, the mechanism that how ferroptosis influences tumor cell death www.nature.com/scientificreports/ in HCC patients and which kind of role the predictors for prognosis play is not demonstrated completely until now. Hence, our study constructed a prognostic model with four ferroptosis-related genes through bioinformatic methods, which was proved to be beneficial for early diagnosing HCC patients. Besides, we performed some experiments to validate the role PRDX1 played between HCC and ferroptosis.
In the present study, we constructed a risk model based on prognostic genes, none of the clinical parameters were proved to be an independent prognostic factor except risk score. Then we conducted a series of enrichment analyses on functions and pathways. Based on the close association between ferroptosis and oxidative stress 26 , we observed the functions of cellular response to oxidative stress, oxidoreductase activity, acting on NAD(P) H, and so on were related with ferroptosis in two datasets. Interestingly, the depletion of NADPH will promote lipid ROS accumulation by reducing the GSH which metabolism can directly influence ferroptosis sensitivity 26,27 . Moreover, the KEGG pathway analysis indicated that glutathione metabolism and ferroptosis pathways were enriched, which suggested that all DEGs were strongly linked with ferroptosis. Besides, we also used ssGSEA to construct FPI to characterize ferroptosis based on genes expression patterns and the FPI is higher in tumor tissues. And FPI was put forward to model ferroptosis level which can predict poor prognosis with a high score in many cancer types 28 .
The prognostic model contained four genes-GPX2, MT3, PRDX1, and SRXN1. Glutathione peroxidase 2 (GPX2), a member of the antioxidant enzyme GPX family, overexpression can induce poor prognosis of hepatocellular carcinoma patients 29 . In our study, the expression of GPX2 was high in the high-risk group which has a poor prognosis. Metallothionein III (MT3) overexpression contributes to carcinogenesis and poor prognosis of several other types of cancer patients but not clearly for HCC patients 30 . And with the downregulation of MT3 often accompanying the methylation, there was speculation that MT3 may suppress the tumor via promoting hypermethylation 31 . Up-regulated peroxiredoxin 1 (PRDX1) often acts as an oncogene in many types of cancer but remains controversial 32 . However, it was confirmed that overexpressed PRDX1 was closely related to the poor prognosis of HCC patients 33 . Sulfiredoxin-1 (SRXN1) was revealed as a pro-tumorigenic in HCC by regulating ROS signaling 34 . In addition, SRXN1 is involved in oxidoreductase activity 35 , its overexpression may play a crucial role in the tumorigenesis and progression of HCC 36 . Nevertheless, there are not so many reports about the relationship between these genes and ferroptosis. We can only know that the expression of GPX2, MT3, and SRXN1 was upregulated during ferroptosis induced by erastin or RSL3, and they may promote ferroptosis 7 . Besides, PRDX1 is necessary to ferroptosis-related lipid peroxidation, and it also may promote ferroptosis because ferroptosis-related suppressors can block enhanced lipid peroxidation and recover cell viability without PRDX1 37 . After the survival analysis for each prognostic gene based on its expression level, PRDX1 with high expression is significantly related to low overall survival.
As a member of the PRDXs protein family, PRDX1 is considered an important antioxidant protein, can regulate gene expression, and also enhances the killing activity of NK cells against cancer cells 38 , thereby preventing malignant transformation of cells. It has been shown that decreased PRDX1 levels lead to impaired antioxidant response and excessive accumulation of ROS, promoting hepatocellular carcinoma cell death 39 . In the present study, we found that PRDX1 expression levels were significantly upregulated in hepatocellular carcinoma cell lines, especially in HEPG2 cells. Furthermore, high PRDX1 expression was associated with poor prognosis of hepatocellular carcinoma, and silencing PRDX1 increased the accumulation of Fe2+ and led to lipid peroxidation accumulation in hepatocellular carcinoma cells, which promoted ferroptosis in hepatocellular carcinoma. Therefore, PRDX1 can be used as a potential biomarker for the prognostic value of hepatocellular carcinoma and can serve as a therapeutic target for hepatocellular carcinoma.
In summary, we established a prognostic model with four ferroptosis-related genes through statistical analyses and processes another time in the testing cohort to evaluate its accuracy. Meanwhile, we also checked the relevance between the overall survival rate and the expression level of each prognostic gene, respectively. And we selected PRDX1 as the key which silencing can promoted ferroptosis in hepatocellular carcinoma according to the experimental validation. In addition, we check the functions and pathways of genes to validate the connection between genes and ferroptosis. It can also demonstrate that the ferroptosis-related gene signature can be a predictor in HCC, linking ferroptosis and HCC more closely. However, the limitation in this study needs improvement. The datasets we acquired both are from public databases, the quantity of samples and the completeness of clinical information need to collect more to strengthen its reliability. And we also need a forward-looking study to further improve the availability of the prognostic model.
Differential expression analysis. To identify differentially expressed genes (DEGs) between tumor tissues and normal tissues, we used the "limma" package in R (version 4.0.2) to calculate the logFC (log fold change) and P values of 214 ferroptosis-related genes. The DEGs were filtered using adjusted p-value (adjusted by false discovery rate) < 0.05 and absolute logFC > 0.5. Following this, the DEGs were separated into two groups (high-expressed and low-expressed groups) for simple nucleotide variation analysis. The "maftools" R package was used to analyze the summary of DEGs mutation information in the TCGA dataset. Besides, we utilized the

Identification of prognostic FRGs and signature building. A univariate Cox regression analysis was
used to screen differentially expressed FRGs associated with overall survival (OS) in HCC patients, and we considered P value < 0.05 as statistical significance. To narrow the range of potential prognostic FRGs, we did a LASSO regression analysis by applying the "glmnet" package in R. Then, for performing the prognostic signature, we conducted the stepwise multivariate Cox regression analysis and constructed a ferroptosis-related four-gene signature. The prognostic risk score was calculated according to the expression levels of the genes and a linear combination of the regression coefficient (λ) in a multivariate Cox regression model. The formula was established as follows: score = sum (each gene's expression level × λ). The patients were divided into a high-risk group and a low-risk group based on the cut-off value of the risk scores.
Prognosis analysis. Using "survival", "survminer" and "timeROC" packages in R software to plot Kaplan-Meier survival curves and ROC curves, which can evaluate the potentially predictive performances of the prognostic signature. GEPIA (Gene Expression Profiling Interactive Analysis) website (http:// gepia. cancer-pku. cn/) also can analyze the expression level of genes in normal and tumor groups based on TCGA database and GTEx Portal (https:// www. gtexp ortal. org/ home/) and plot Kaplan-Meier survival curves of the selected gene. Finally, we conducted univariate and multivariate Cox regression analysis to confirm which traditional clinical characteristic was independent. Hazard ratios (HRs) and 95% confidence intervals (CIs) for each variable were calculated. P value < 0.05 was considered statistical significance.

Immune cells and functions. Cell-type Identification By Estimating Relative Subsets Of RNA Transcripts
(CIBERSORT) analysis was used to evaluate the proportion of 19 human immune cell subpopulations by calculating the absolute abundance of immune cells and stromal cells. P value < 0.05 indicates statistical significance. Making single-sample GSEA (ssGSEA) to estimate the immune scores of immune cells and abundance of immune functions in two risk groups by using the "GSVA" R package. Mann-Whitney test with P value validated its differential expression between the high-risk group and low-risk group.

Functional enrichment analysis.
To explore the functional annotation which was associated with the risk score, we conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) [41][42][43] analyses based on the DEGs (FDR < 0.05) by applying the "clusterProfiler" package 44 in R software. P-values were adjusted with the FDR method. Next, we constructed the ferroptosis potential index (FPI) by using ssGSEA to reveal the functional roles of ferroptosis and differential FPI expression in tumor and normal tissues. Finally, we obtained the immunohistochemistry images of prognostic signature from the HPA website (The Human Protein Atlas, https:// www. prote inatl as. org/).
Cell culture and transfection. Human hepatocellular carcinoma cell lines (MHCC-97H, HEPG2, Huh-7) and human normal hepatocytes (LX2) were purchased from the American Type Culture Collection (ATCC). All cells were maintained in DMEM medium (Gibco, Grand Island, USA) containing 10% fetal bovine serum (Gibco, Grand Island, USA) and 1% penicillin-streptomycin (Gibco, Grand Island, USA) and incubated at 37 °C and 5% CO2. PRDX1-siRNA was obtained from Sangon Biotech, Shanghai, China (Shanghai, China) for silencing the expression of PRDX1. In this study, the PRDX1-siRNA sequence was: sense:5′-GCA CCA UUG CUC AGG AUU ATT -3′; antisense:5′-UAA UCC UGA GCA AUG GUG CTT -3′. Cells were seeded at a density of 6 × 104 cells/well in 12-well plates and transfected after 24 h. PRDX1-siRNA/ NC-siRNA was transfected using siRNA transfection reagent (Polyplus, France) to a final concentration of 20 nM. Finally, the expression levels of target proteins in cells transfected for 72 h were analyzed. The successfully transfected cells will be used for subsequent experiments.
Western Blot. Proteins were extracted from cells and cell lysates were prepared using RIPA lysate with PMSF (Solarbio, Beijing, China), and then protein quantification was performed using a BCA protein assay kit (Sangon Biotech, Shanghai, China). Proteins were then separated by 12% SDS-PAGE and transferred to nitrocellulose membranes. The membranes were blocked with 5% bovine serum albumin (BSA) for 1 h and then incubated with primary antibody overnight at 4 °C. The next day, after washing the membrane three times with TBST, the membrane was incubated for 1 h at room temperature with horseradish peroxidase-labelled secondary antibody (1:4000), followed by three washes with TBST. Finally, the colors were developed using BeyoECL Moon (Beyotime Biotechnology, Shanghai, China).
Iron assay. Cells  www.nature.com/scientificreports/ Statistical analysis. Data are expressed as mean ± standard deviation (SD). Statistical analysis was performed by using GraphPad Prism analysis software. The t test was used to assess the difference between the two groups and a value of P < 0.05 indicates a statistically significant difference, * indicates P < 0.05; ** indicates P < 0.01; *** indicates P < 0.001.

Data availability
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding authors.