Identification of potentially functional circular RNAs hsa_circ_0070934 and hsa_circ_0004315 as prognostic factors of hepatocellular carcinoma by integrated bioinformatics analysis

Hepatocellular carcinoma (HCC) is one of the most prevalent cancers worldwide, which has a high mortality rate and poor treatment outcomes with yet unknown molecular basis. It seems that gene expression plays a pivotal role in the pathogenesis of the disease. Circular RNAs (circRNAs) can interact with microRNAs (miRNAs) to regulate gene expression in various malignancies by acting as competitive endogenous RNAs (ceRNAs). However, the potential pathogenesis roles of the ceRNA network among circRNA/miRNA/mRNA in HCC are unclear. In this study, first, the HCC circRNA expression data were obtained from three Gene Expression Omnibus microarray datasets (GSE164803, GSE94508, GSE97332), and the differentially expressed circRNAs (DECs) were identified using R limma package. Also, the liver hepatocellular carcinoma (LIHC) miRNA and mRNA sequence data were retrieved from TCGA and differentially expressed miRNAs (DEMIs) and mRNAs (DEGs) were determined using the R DESeq2 package. Second, CSCD website was used to uncover the binding sites of miRNAs on DECs. The DECs' potential target miRNAs were revealed by conducting an intersection between predicted miRNAs from CSCD and downregulated DEMIs. Third, candidate genes were uncovered by intersecting targeted genes predicted by miRWalk and targetscan online tools with upregulated DEGs. The ceRNA network was then built using the Cytoscape software. The functional enrichment and the overall survival time of these potential targeted genes were analyzed, and a PPI network was constructed in the STRING database. Network visualization was performed by Cytoscape, and ten hub genes were detected using the CytoHubba plugin tool. Four DECs (hsa_circ_0000520, hsa_circ_0008616, hsa_circ_0070934, hsa_circ_0004315) were obtained and six miRNAs (hsa-miR-542-5p, hsa-miR-326, hsa-miR-511-5p, hsa-miR-195-5p, hsa-miR-214-3p, and hsa-miR-424-5p) which are regulated by the above DECs were identified. Then 543 overlapped genes regulated by six miRNAs mentioned above were predicted. Functional enrichment analysis showed that these genes are mostly associated with regulatory pathways in cancer. Ten hub genes (TTK, AURKB, KIF20A, KIF23, CEP55, CDC6, DTL, NCAPG, CENPF, PLK4) have been screened from the PPI network of the 204 survival-related genes. KIF20A, NCAPG, TTK, PLK4, and CDC6 were selected for the highest significance p-values. At the end, a circRNA-miRNA-mRNA regulatory axis was established for five final selected hub genes. This study implies the potential pathogenesis of the obtained network and proposes that the two DECs (has_circ_0070934 and has_circ_0004315) may be important prognostic markers for HCC.


Microarray data collection and identification of differentially expressed circRNAs (DECs).
To identify differentially expressed circRNAs (DECs), three microarray datasets containing circRNA expression profiles in HCC patients were downloaded from GEO (https:// www. ncbi. nlm. nih. gov/ geo/) as a functional genomic database by using GEOquery package in R language software (V 4.1.0). The gene chips in all eligible microarray datasets have the same platform (Agilent-069978 Arraystar Human CircRNA microarray V1). The microarray data consist of three datasets: GSE164803 (six normal tissue samples and six HCC tissue samples), GSE94508 (five paracancerous tissue samples and five HCC tissue samples), and GSE97332 (seven hepar normal tissue samples and seven HCC tissue samples), Additionally, all raw expression data were normalized and log 2 transformed. The sva package (V 3.40.0) 20 in R software was also used to remove the hidden batch effects and combine three datasets. The R limma package (V 3.48.3) 21 based on the Bioconductor package was applied to screen DECs microarray datasets with the cutoff criteria of |log 2 FoldChange|≥ 1.5 and adj. p-value < 0.05. This criterion was considered statistically significant.  www.nature.com/scientificreports/ RNA-Seq and miRNA-Seq data were normalized by the R package GDCRNATools 22 utilizing TMM 23 and VOOM 24 methods. Expression data analysis for RNA and miRNA between primary tumor and normal solid tissue was performed by DESeq2 (V 1.32.0), a Bioconductor package based on R language to gain differentially expressed miRNAs (DEMIs) and differentially expressed mRNAs (DEGs). DESeq2 performs an internal normalization by computing the geometric mean for each gene across all samples. The number of genes in each sample is then divided by the mean value 25 . The threshold for selecting DEMIs and DEGs based on two-dimension, |log 2 FoldChange|> 1 and adj. p-value < 0.05, these two statistical thresholds were considered significant.

Identification of differentially expressed miRNAs (DEMIs) and differentially expressed mRNAs (DEGs). On
Prediction of miRNA response elements (MREs). The Cancer-Specific CircRNA Database (CSCD, http:// gb. whu. edu. cn/ CSCD/) is a web-based tool for understanding the functional effects of circRNAs that was used to predict miRNA Response Element (MRE) sites for each DECs. These target miRNAs, which are complementary to the DECs-related MREs, have been named CTmiRNAs (CSCD target miRNAs).
CTmiRNAs were further screened according to the DEMIs obtained from the TCGA database. To acquire potential target miRNAs on the DECs, an intersection was performed between CTmiRNAs and downregulated DEMIs (downregulated DEMIs were selected with the cutoff criteria of log 2 FoldChange < −1 and adj. p-value < 0.05). Finally, the intersection and overlapping of these two algorithms' miRNAs have been dubbed FImiRNAs (Final intersected miRNAs).
Prediction of miRNA-mRNA interaction. The Targetscan (http:// www. targe tscan. org/ vert_ 72) and Mirwalk (http:// mirwa lk. umm. uni-heide lberg. de) websites were used to predict target genes. These two mentioned websites are comprehensive publicly available online resources of miRNA-target interactions. Overlapped miRNA target genes between these two databases are considered final target genes, which we termed FmRNAs (Final mRNAs).
To identify potential miRNA-related genes, the intersection of upregulated DEGs from TCGA-RNA-seq analysis and FmRNAs was performed using the same method. (upregulated DEGs were selected with the cutoff criteria of log 2 FoldChange > 1 and adj. p-value < 0.05). At last, we got the final functional genes, which we refer to as FImRNAs (Final Intersected mRNAs).
Formation of circRNA/miRNA/mRNA regulatory network. Using previous analysis and predicted target interactions, the acquired DEC-FImiRNA pairs, and FImiRNA-FImRNA pairs were combined to form a circRNA/miRNA/mRNA regulatory network, according to the ceRNA hypothesis. This network was visualized using the Cytoscape software (V 3.8.2) 26 , a robust tool for analyzing and visualizing data networks.
Functional enrichment analysis for FImRNAs. Gene ontology (GO) 27 and Kyoto Encyclopedia of Genes and Genomes (KEGG) 28 analysis were applied to explore the main function and associated enriched pathways information of the FImRNAs in the preliminary ceRNA regulatory network.
Using the GO database, GO assessed for the biological process (BP), cellular component (CC), and molecular function (MF) of each element to find potential functional genes. ClusterProfiler (V 4.0.2) is an R software package based on ontology for statistical analysis and visualizing gene functional profiles 29 . In this analysis, the ClusterProfiler with the Benjamini-Hochberg method and p-value < 0.05 and q-value < 0.05 criteria were utilized for GO analysis of FImRNAs. Based on the KEGG database, KEGG determines the potential functions of these genes that participated in the pathways. ClueGO, a Cytoscape plugin, was used to perform and visualize the KEGG analysis 30 . Functional annotation with a p-value below 0.05 and a kappa score greater than 0.4 was considered statistically significant.
Survival analysis of FImRNAs. Kaplan-Meier method was used to evaluate the overall survival of DEGs in 347 HCC samples using the survival tool package. The hazard ratio (HR) and corresponding 95% confidence interval were calculated, and a p-value less than 0.05 was considered statistically significant. Lastly, the intersection of survival-related DEGs and FImRNAs was conducted.

Construction of protein-protein interaction (PPI) network and identification of hub genes.
The STRING (https:// string-db. org/), an online search tool for predicting protein-protein interactions (PPI) used to establish a PPI network 31 . The overlapped findings for the overall survival of DEGs and FImRNAs were imported into the STRING database. The extraction cutoff standard for the PPI pair was used with an interaction score greater than 0.4. The obtained network was visualized using Cytoscape, and hub genes were found using the Cytoscape plugin, CytoHubba, a network analysis tool with a simple user interface and 11 scoring systems was used to evaluate the significance of nodes in a biological network 32 .

Results
Identification of DECs, DEMIs, and DEGs. The differential expressed circRNAs (DECs) were verified in the initial step to establish an interaction network among the circRNA-miRNA-mRNA regulatory axis. Three microarray datasets from GEO on an identical platform from HCC patients were included in this study.
First, the raw expression data was log 2 transformed in R software and then normalized using the Normalize-Quantiles function ( Fig. 2A). Following the merging of these three datasets, the popular ComBat function in the R sva package was used to eliminate the hidden batch effects (Fig. 2B). We utilized the R limma package for differential analysis of all microarray data and got a string of statistically significant circRNAs comprising 2042 variables. Nine circRNAs with much more significant differential expression than other circRNAs were identified This set of DECs in HCC tissues were found compared to normal tissues. Eight were upregulated, and one was downregulated circRNA (Fig. 3A). After further analysis and filtering among these nine DECs, four upregulated DECs, including hsa_circ_0000520, hsa_circ_0008616, hsa_circ_0070934, hsa_circ_0004315, were chosen as the research objects in this study (Table 1). In the second step of building a circRNA/miRNA/mRNA network, verifying differentially expressed miRNAs (DEMIs) and differentially expressed mRNAs (DEGs) was essential. DESeq2 package analysis in R was performed on samples taken from the TCGA-LIHC database; as a result, the DEMIs and DEGs between primary tumor and normal solid tissue were revealed. DEMIs consisted of 108 upregulated and 19 downregulated genes (Fig. 3B), whereas DEGs consisted of 1229 upregulated genes and 797 downregulated genes (Fig. 3C). The DEGs and DEMIs were determined using the criteria log 2 FoldChange > 1 and adj. p-value < 0.05. published on the function of circRNAs that may serve as ceRNAs in different cancers. These data suggest that circRNAs trap miRNAs like a sponge. The four DECs identified contain miRNA response element (MRE) sites that sponge miRNAs. (Fig. 4A) We used the CSCD database to predict potential target miRNAs of these DECs; we called the predicted miRNAs derived from CSCD, CTmiRNAs. Because the DECs selected for this study were upregulated and, according to the ceRNA hypothesis, circRNAs lead to miRNA downregulation, we intersected the CTmiRNAs with 19 downregulated DEMIs from the TCGA-LIHC analysis and called them FImiRNAs (hsa-miR-542-5p, hsa-miR-326, hsa-miR-511-5p, hsa-miR-195-5p, hsa-miR-214-3p, and hsa-miR-424-5p). Table 2  Prediction and analysis of miRNA-mRNA pairs and construction of ceRNA network. MiRNAs may bind to complementary sequences inside mRNAs, leading to their degradation or suppression; thus, the Targetscan and Mirwalk databases have been used to predict and determine which RNAs have complementary binding sites to bind our selected FImiRNAs. After that, FmRNA was chosen as an appropriate term for the final target genes earned from the intersection of two databases.
Upregulation of DECs, as stated by the ceRNA hypothesis, causes downregulation of corresponding target miRNAs. Since fewer miRNAs are in the cytoplasm, target gene transcripts are less degraded and suppressed. Therefore, we carried an intersection between FmRNAs and upregulated DEGs; as a result, 543 genes were discovered. In this study, these 543 final intersected mRNAs were termed FImRNAs.
To better understand the roles of DECs on mRNAs through miRNA binding in HCC cancer, a ceRNA network was constructed and visualized by Cytoscape software for four selected DECs, their potential target miRNAs (DEC-FImiRNA pairs), and miRNA target genes (FImiRNA-FImRNA pairs). We utilized the downregulated miRNA as a junction in this network, along with upregulated circRNAs and mRNAs in the circRNA-miRNA and miRNA-mRNA interaction pairings. Finally, there were 300 interactions in the ceRNA network, including four DECs, six FImiRNAs, and 543 FImRNAs (Fig. 5).
GO and KEGG pathway enrichment analysis of FImRNAs. Using the ClusterProfiler R package, GO analysis including three BP, CC, and MF categories was used to understand better the biological roles of 543 FImRNAs (Fig. 6). 4199 BPs, 499 CCs, and 748 MFs were enriched for this dataset. P-value with a cutoff set less than 0.05 was considered statistically significant.
Based on the count of genes in the pathway and the degree of significance for BPs terms, these FImRNAs were mainly enriched in organelle fission, nuclear division, DNA replication, positive regulation of cell cycle process, and DNA conformation changes, respectively. Likewise, for MF terms, according to the number of genes in the pathway and the degree of significance, these FImRNAs are mainly enriched in ATPase activity, tubulin binding, microtubule-binding, extracellular matrix structural constituent, DNA replication origin binding, respectively. For CC terms, these FImRNAs are mainly enriched in the chromosomal region, spindle, microtubule, chromosome, centromeric region, and kinetochore, respectively, depending on the count of genes included in the pathway and the degree of significance. The chord plot indicates that 51 of 543 FImRNAs had high log 2 FoldChange values and were enriched in the top four BP terms, three CC terms, and three MF terms (Fig. 7).
KEGG analysis was done on 543 FImRNAs using clueGO, a Cytoscape plugin, to reveal their potential function (Fig. 8). The statistical analysis criteria were p-value ≤ 0.05 and a kappa score ≥ 0.4. According to the charts, the groups' network between terms, the KEGG signal pathway enrichment analysis suggests that these FImRNAs were enriched in pathways such as ECM-receptor interaction, cell cycle, and Rap1 signaling pathway.
Survival analysis and construction of PPI network. The Kaplan-Meier survival analysis was conducted to analyze the connection between DEGs and survival time. As an outcome, 441 DEGs with p-value survival < 0.05 were identified between primary tumor and normal solid tissue. In order to evaluate the survival analysis of the ceRNA network established in the previous results, an intersection was accomplished between 441 survival-related DEGs and 543 FImRNAs, resulting in the identification of 204 genes associated with the overall survival of the HCC patients.

Discussion
Despite significant improvements in diagnosing hepatocellular cancer through various methods, appropriate novel strategies, such as identifying specific molecular regulatory mechanisms, are still required to optimize the treatments and diagnosis of HCC 33 . With the advent of the bioinformatics era and increased research on cancerrelated genes, it has been demonstrated that there is a network of complex interactions between RNAs known as the ceRNA network (ceRNET) in the way that, focusing on this network and the factors involved, may promote the rapid diagnosis of various malignancies 34 . Thousands of highly-conserved sequence circRNAs have been identified thanks to advances in high-throughput sequencing technology and computational biology techniques. These circRNAs, as potential regulatory elements, may act as ceRNAs in cell physiology, trapping miRNAs and thereby influencing the regulation of other transcripts 35 . Various studies have shown that circRNAs are expressed differentially in tumor tissues, including circRASGRF2 36 , CircPUM1 37 , and circGFRA1 38 , which have been related to HCC progression.
In this study, we first gathered three circRNA microarray datasets from the GEO database, and then we screened nine DECs in the initial stages. (Table 3) Other researchers have investigated the role of hsa_ circ_0074854 and hsa_circ_0091570 in hepatocellular cancer 40,41 . However, the other seven DECs in liver cancer have yet to be explored. www.nature.com/scientificreports/ These seven DECs were abnormally upregulated in HCC cells based on our findings. According to the ceRNA hypothesis, increased expression of these seven DECs may decrease the expression of miRNAs inside the cell. We predicted DEC-related MREs using intersecting miRNAs from the CSCD database and downregulated DEMIs. The results revealed only four DECs (hsa_circ_0000520, hsa_circ_0004315, hsa_circ_0008616, hsa_circ_0070934) were ascertained as ceRNAs to downregulate miRNA expression inside the cell, and those DECs that did not have miRNAs matching with downregulated DEMI from TCGA were eliminated, such as hsa_circ_0083766, hsa_circ_0079958, and hsa_circ_0016867. The miRNAs affected by final DECs, including hsa-miR-542-5p, hsa-miR-326, hsa-miR-511-5p, hsa-miR-195-5p, hsa-miR-214-3p, and hsa-miR-424-5p, were referred to as FImiRNAs in this research.
Among these six FImiRNAs, miR-542-5p has a complementary sequence with MRE sites related to two DECs, hsa_circ_0000520 and hsa_circ_0008616. According to the previous research, miR-542-5p expression is downregulated in non-small cell lung cancer (NSCLC) tissues, which is implicated in NSCLC tumorigenesis 42 . This miRNA has also been investigated in breast cancer, endometrial carcinosarcoma, and osteosarcoma [43][44][45] .
For miR-326, some studies have found that this miRNA acts as a tumor suppressor in glioblastoma, human prostatic carcinoma, and breast cancer. The expression of miR-326 is downregulated in these cancers [46][47][48] . Besides that, this miRNA has been demonstrated to reverse chemoresistance in human lung cancer by targeting the specificity protein 1(SP1) 49 .
MiR-511-5p has been studied for its tumor-suppressive role in preventing colorectal cancer cell progression by targeting GPR116 50 and lung adenoma-carcinoma cells by targeting oncogene TRIB2 51 . Also, a study has proven that miRNA-511-5p prevents malignant behaviors of breast cancer with a direct effect on SOX9 and PI3/ AKT regulatory pathway 52 .
For miR-195-5p, overexpression of this miRNA has been demonstrated to inhibit cell migration and invasiveness in oral squamous cell cancer, cervical cancer, and breast cancer by targeting TRIM14, YAP1, and CCNE1 proteins, respectively 53-55 , or preventing thyroid carcinoma cell progression by acting on the p21/cyclin D1 axis 56 . Overall, miR-195-5p has the potential to be a tumor suppressor. MiR-214-3p and miR-424-5p, like the previously mentioned miRNAs, were investigated in some research for their involvement in reducing cell progression through their effect on transcripts and signaling pathways. MiR-214-3p, for example, serves as a tumor suppressor by targeting ABCB1 and XIAP proteins, preventing multi-drug resistance and stimulating apoptosis. MiR-424-5p as well suppresses intrahepatic cholangiocarcinoma metastasis by targeting ARK5 57-61 . In summary, To determine the effects of DECs on mRNAs through miRNAs, an intersection was performed between the overlapped target genes of six FImiRNAs from the Mirwalk and Targetscan databases and upregulated DEGs from the TCGA database; as a result, 543 genes were obtained and named as FImRNAs. Afterward, a circRNA/ miRNA/mRNA regulation network was established as a ceRNA network based on DEC-FImiRNA and FImiRNA-FImRNA interactions.
The GO and KEGG databases were employed to understand the biological roles and potential function of 543 FImRNAs. Based on GO analysis, these genes in the biological process are mainly enriched in the positive regulation of the cell cycle, DNA replication, and nuclear division. Therefore, they may play pivotal roles in tumorigenesis and tumor development. At the same time, KEGG analysis showed that these genes were mainly enriched in the ECM-receptor interaction, cell cycle, and Rap1 signaling pathway. Some studies have examined these pathways and shown many roles of these genes in cancer progression. In conclusion, the circRNAs explored in this study may perform similar or related functions via the circRNA-miRNA-mRNA axis.
Rap1 has a variety of functions in tumor initiation and development. Rap1 regulates integrins and cadherins by stimulating EGFR and Src/FAK, which plays essential roles in cell adhesion to ECM and cell-cell adhesion; both are crucial for tumor cell invasion and metastasis. Also, Rap1 induces tumors and epithelial-mesenchymal transition (EMT) via notch signaling. During tumorigenesis, the interaction between cancer cells and the tumor microenvironment (TME) causes ECM stiffness and alteration of ECM key receptors, leading to aberrant mechanotransduction and malignant transformation.
On the other hand, Src triggered by Rap1 signaling may activate the MAPK/ERK pathway, which has been found in some research to promote G0/G1 to S phase cell cycle progression and angiogenesis in cancer [62][63][64][65] . Thus, these FImRNAs' function and signaling pathways are associated with the occurrence and development of tumors. In summary, these FImRNAs, which are indirectly regulated by the four selected DECs we identified, may play a vital role in the HCC signaling pathway.
When 543 FImRNAs and 441 survival-related DEGs were intersected, only 204 FImRNAs were determined to be effective in survival. We built a PPI network for these 204 genes and selected ten hub genes from the network for further analysis. Figure 10A shows the overall survival time of ten hub genes in Kaplan-Meier plots. A circRNA-miRNA-mRNA regulatory axis was created for these hub genes to help the researchers understand them www.nature.com/scientificreports/ better. The top five hub genes based on the most significant p-value among these ten hub genes that considerably affect overall survival are KIF20A, NCAPG, TTK, PLK4, and CDC6. KIF20A mRNA, which encodes protein MKlp2, has been reported to be significantly expressed in large human hepatocellular cells. MKlp2 accumulation is linked to abnormal hepatocyte proliferation and tumor aggressiveness in human hepatocellular carcinoma 66 . Interestingly, several studies showed a correlation between KIF20A overexpression and tumor progression and proliferation in colorectal cancer, renal clear cell carcinoma, and bladder cancer cells [66][67][68] .
NCAPG expression is increased in various cancers, including hepatocellular liver cancer, as an oncogene that stimulates cell proliferation and apoptosis through the PI3K/AKT/FOXO4 pathway. Another study found that knocking down NCAPG as a mitotic gene may prevent HCC cell growth, progression, and migration 69,70 .
According to the results of the previous studies, TTK activates Akt/mTOR, and MDM2/p53 in a p53-dependent mechanism increases cell proliferation and migration, which stimulates malignancy 71 . TTK overexpression has also been shown to increase HCC cell drug resistance to sorafenib, suggesting it may be a viable therapeutic target for human hepatocellular carcinoma 72 .
Concerning PLK4 and CDC6, their association with HCC has been described in previous researches; for example, PLK4 overexpression promotes cell growth, while its knockdown suppresses progression, invasion, and migration. CDC6 dysregulation has a role in developing many cancers, like hepatocellular carcinoma. CDC6, which acts as a regulator in the early stages of DNA replication and as a checkpoint controller before mitosis, has been associated with the clinical progression of HCC and may be utilized as a biomarker in patients with this kind of cancer 73,74 .
Da-Wei Zhang and colleagues found that knocking down hsa_circ_0070934 expression inhibited CSCC cell proliferation and increased apoptosis compared to the negative control group. They also identified the importance of hsa_circ_0070934 in regulating HOXB7 expression levels by acting as a ceRNA and interacting with miR-1236-3p in the pathogenesis of CSCC. As a result, they discovered that hsa_circ_0070934 could be useful in the diagnosis and treatment of CSCC and other diseases 75 . For hsa_circ_0004315, some studies have found that this circRNA may act as a molecular sponge for miRNAs such as hsa-miR-214-3p and hsa-miR-195-5p to regulate the expression, respectively, of PPARGC1A and CCNE1 in hepatocellular carcinoma and breast cancer 76,77 .
Our results suggest that these two DECs identified by bioinformatic techniques may be utilized as effective diagnostic and valuable prognostic biomarkers.

Conclusions
Our study was mainly conducted through bioinformatics analysis using software tools and genetic databases. To summarize, the goal of this study was to provide a comprehensive picture of the molecular mechanisms involved in HCC. Based on the ceRNA hypothesis, a circRNA/miRNA/hub gene network was constructed, and its function was predicted and evaluated. DECs, DEMIs, and DEGs inside this network may significantly affect patients' prognosis and survival. In conclusion, these genes, especially two final DECs include hsa_circ_0070934 and hsa_circ_0004315, may serve as novel prognostic factors in the HCC patients. These genes' abnormal expression and function need to be verified in future laboratory studies (Supplementary Information).