Identification of RNF150 as the hub gene associated with microsatellite instability in gastric cancer

Gastric cancer (GC) is a common digestive tract malignancy that ranks sixth in global incidence and third in cancer-related deaths. Microsatellite instability (MSI), which defines one of the molecular subtypes of GC, plays an important role in GC and is shaped by a complex network of gene interactions. In this study, we aimed to explore the expression pattern and clinical performance of MSI-related genes in GC patients. Weighted gene co-expression network analysis (WGCNA) was used to identify the key module and core genes in the TCGA database. We applied protein–protein interaction (PPI) and survival analyses to propose and confirm RNF150 as the hub gene in GC. Finally, we used immunohistochemistry (IHC) and reverse transcription-polymerase chain reaction (RT-PCR) to examine the expression pattern of RNF150 in GC patients. With the highest weighted correlation and standard correlation, RNF150 was selected as the hub gene for subsequent validation. In validation, data from the test sets showed lower expression of RNF150 in MSI GC than in microsatellite-stable (MSS) GC. Moreover, survival analysis showed that MSI GC patients with lower RNF150 expression had longer overall survival (OS). Compared with normal gastric tissues, the protein level of RNF150 was down-regulated in ten cases of GC tissues. Furthermore, the RNF150 protein level was lower in MSI GC samples than in MSS GC samples. When we validated mRNA expression by RT-PCR in fresh GC tissues, we found a similar trend. RNF150 was identified as a novel MSI-related gene in GC. It is expected to be a promising prognostic biomarker for GC patients.

Co-expression network construction. We then used those DEGs to construct a co-expression network with the 'WGCNA' package 19. First, we computed the Pearson correlation matrix for all gene pairs. Second, we built a weighted adjacency matrix with the power function $a_{mn} = |c_{mn}|^{\beta}$, where $c_{mn}$ is the Pearson correlation between genes m and n. The parameter β is a soft threshold that emphasizes strong correlations and down-weights weak correlations between genes. In our analysis, β = 4 (scale-free R² = 0.88) was chosen to establish a scale-free network (Fig. 1A-D). The adjacency matrix was then converted into a topological overlap matrix (TOM), which estimates the network connectivity of every gene 20. Finally, based on the TOM-derived dissimilarity, we applied average linkage hierarchical clustering to build the gene dendrogram, with the minimum module size set to 30 genes 21.
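The adjacency and TOM steps can be illustrated with a small standard-library sketch. This is not the 'WGCNA' implementation used in the study: the function names and the toy expression values are hypothetical, and the overlap formula is the standard unsigned TOM definition, assumed here because the text does not state which variant was used.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def adjacency(expr, beta=4):
    """Unsigned weighted adjacency a_mn = |c_mn|^beta, with a zero diagonal."""
    g = len(expr)
    return [[0.0 if m == n else abs(pearson(expr[m], expr[n])) ** beta
             for n in range(g)] for m in range(g)]

def topological_overlap(adj):
    """TOM_mn = (sum_u a_mu * a_un + a_mn) / (min(k_m, k_n) + 1 - a_mn)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * g for _ in range(g)]
    for m in range(g):
        for n in range(g):
            if m != n:
                shared = sum(adj[m][u] * adj[u][n] for u in range(g))
                tom[m][n] = (shared + adj[m][n]) / (min(k[m], k[n]) + 1 - adj[m][n])
    return tom

# Toy data (hypothetical): three genes measured in four samples.
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
]
adj = adjacency(expr, beta=4)
tom = topological_overlap(adj)
dissimilarity = [[1 - v for v in row] for row in tom]  # input for clustering
```

Raising the absolute correlation to the power β keeps strong correlations near 1 while pushing weak ones towards 0, which is why a larger β gives a sparser, more scale-free-like network.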
Identification of significant module. We applied two methods to determine which modules were related to MSI. First, linear regression was performed between MSI status and the mRNA level of each gene, and the resulting P value was log10-transformed to give the gene significance (GS). We then averaged the GS of all genes in a module to obtain the module significance (MS). Second, we conducted principal component analysis and took the first principal component of each module as its module eigengene (ME). The key module was defined as the one with the highest module significance and the strongest correlation between its module eigengene and MSI.
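A minimal sketch of the gene significance and module significance calculation follows. The sign convention (negative log10, so that smaller P values give larger GS) is an assumption, and the module names and P values are invented for illustration only.

```python
import math

def gene_significance(p_value):
    """GS as the negative log10 of the regression P value (sign is an assumption)."""
    return -math.log10(p_value)

def module_significance(p_values_by_module):
    """MS: the average GS over all genes assigned to each module."""
    return {
        module: sum(gene_significance(p) for p in p_values) / len(p_values)
        for module, p_values in p_values_by_module.items()
    }

# Hypothetical per-gene P values grouped by module colour.
toy_p_values = {
    "turquoise": [1e-8, 1e-6, 1e-4],
    "blue": [0.05, 0.2, 0.5],
}
ms = module_significance(toy_p_values)
key_module = max(ms, key=ms.get)  # module with the highest MS
```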

GO and KEGG analysis of DEGs.
To examine the underlying mechanism of the DEGs in the core module, we used the 'clusterProfiler' R package to conduct gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses 22. The 'ggplot2' package was then used to visualize the top ten pathways from the GO and KEGG analyses. Finally, we exported the genes in the key module and used Cytoscape to build the protein-protein interaction network.
Gene set enrichment analysis (GSEA). To explore the underlying mechanism, we divided 121 MSI samples from the TCGA database into a low-RNF150 group and a high-RNF150 group based on RNF150 mRNA level. We then used GSEA for functional pathway analysis, with cutoffs of P value < 0.05, gene set size greater than 30, and |enrichment score (ES)| greater than 0.6.
Hub gene validation. We performed survival analysis using the 'survival' package. We used the test set GSE62254 to validate the expression pattern of RNF150 in MSI and MSS patients. GraphPad Prism 8 was used to analyze these data and visualize the results. Statistical significance was assessed with two-tailed Student's t-tests. A P value less than 0.05 indicated statistical significance.
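The three GSEA cutoffs stated above can be expressed as a simple filter. The sketch below assumes the GSEA results have already been exported into records; the `GseaResult` type, its field names and the example gene sets other than "heparin binding" are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class GseaResult:
    name: str       # gene set name
    p_value: float  # nominal P value
    size: int       # number of genes in the set
    es: float       # enrichment score

def passes_cutoff(result, p_max=0.05, min_size=30, min_abs_es=0.6):
    """Keep gene sets with P < 0.05, size > 30 and |ES| > 0.6."""
    return (result.p_value < p_max
            and result.size > min_size
            and abs(result.es) > min_abs_es)

# Invented values for illustration only.
results = [
    GseaResult("heparin binding", 0.01, 120, 0.72),
    GseaResult("small set", 0.01, 12, 0.80),
    GseaResult("weak enrichment", 0.03, 200, 0.35),
    GseaResult("not significant", 0.20, 150, -0.70),
]
selected = [r.name for r in results if passes_cutoff(r)]
```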
Sample collection for validation. Ten GC patients treated at Renmin Hospital of Wuhan University between January and March 2022 were enrolled in this study. Paraffin-embedded tumor sections and adjacent normal gastric mucosa were collected for the IHC assay. In addition, another 16 GC patients treated at Renmin Hospital of Wuhan University between March and May 2023 were recruited, and fresh tissues were collected prospectively for the RT-PCR assay. Our research was approved by the Ethics Committee of Renmin Hospital of Wuhan University (No. WDRY2021-K002). All procedures were performed in accordance with the Declaration of Helsinki. All participants signed informed consent allowing their tissues to be used in this study.
Statistical analysis. All statistical analyses were completed with SPSS software (version 18.0). RNF150 expression was divided into low and high expression based on the cutoff value. The prognostic significance of RNF150 mRNA in GC patients was presented with Kaplan-Meier curves and assessed with the log-rank test between the high and low RNF150 groups. ROC curves generated with R software (version 3.3) were used to determine the diagnostic power of RNF150 mRNA for differentiating MSI from MSS in GC samples. Pearson correlation was used to determine the correlation between modules and MSI. A P value less than 0.05 was regarded as statistically significant.
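To make the survival comparison concrete, here is a minimal Kaplan-Meier estimator with a cutoff-based split into low and high expression groups. It is an illustrative sketch, not the SPSS or 'survival' procedure used in the study; the function names, the cutoff and the patient values are hypothetical, and the log-rank test is not included.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 if the patient died at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def split_by_cutoff(expression, times, events, cutoff):
    """Split patients into low (<= cutoff) and high (> cutoff) expression groups."""
    low = [(t, e) for x, t, e in zip(expression, times, events) if x <= cutoff]
    high = [(t, e) for x, t, e in zip(expression, times, events) if x > cutoff]
    return low, high

# Invented follow-up data: months of follow-up and death indicator.
times = [5, 8, 8, 12, 15]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)

expression = [2.1, 0.4, 3.3, 0.9, 0.2]  # hypothetical RNF150 levels
low, high = split_by_cutoff(expression, times, events, cutoff=1.0)
low_curve = kaplan_meier([t for t, _ in low], [e for _, e in low])
```

Each step of the curve multiplies the running survival probability by the fraction of at-risk patients who survive that event time; censored patients leave the risk set without lowering the curve.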

Results
DEGs extracting.
Construction of co-expression network and key module identification. We used the 'WGCNA' package to cluster 4679 DEGs into modules, and 8 modules were visualized in different colors (Fig. 3A). Two approaches were used to select the core module related to MSI. First, the turquoise module had a higher MS value than the other modules (Fig. 3B). Second, the ME analysis showed that the turquoise module was the most significantly correlated with MSI (r = −0.39, p = 8.07e−16) (Fig. 3C). Thus, the turquoise module was selected as the core module linked to MSI. We then extracted the gene list of the turquoise module for subsequent analysis.
To investigate the mechanism of the genes obtained from the turquoise module in GC, we conducted pathway analysis. The top ten pathways in the GO analysis are shown in Supplementary Fig. S1. "Regulation of cell morphogenesis" was the most significantly enriched term among biological processes, "collagen-containing extracellular matrix" among cellular components, and "cell adhesion molecule binding" among molecular functions (Supplementary Fig. S1A-C). Furthermore, proteoglycans in GC was the most significantly enriched pathway in the KEGG analysis (Supplementary Fig. S1D).
Hub gene identification. Hub genes are defined as those most closely connected with other genes in the same module. Within the turquoise module, 43 genes had a module membership (MM) higher than 0.875. These 43 genes were chosen as hub gene candidates and exported to build a PPI network using Cytoscape. The results show that 7 hub gene candidates were closely connected to other genes in the network (Fig. 4). With both the highest weighted correlation and the highest standard correlation, RNF150 was finally defined as the core gene related to MSI GC.
individuals both in the TCGA-STAD and GSE62254 (Fig. 5A, B). According to the degree of MSI, GC was classified as MSI-High (MSI-H), MSI-Low (MSI-L) or MSS 23. In the TCGA database, RNF150 mRNA level clearly discriminated both MSI-H and MSI-L from MSS, but did not differ obviously between MSI-H and MSI-L (Fig. 5C). In GSE62254, the RNF150 mRNA level was significantly higher in paired adjacent normal tissues than in tumor tissues in 98 patients (Fig. 5D).
As for the Lauren classification of GC, the RNF150 mRNA level in diffuse-type GC was significantly lower than in intestinal-type GC (Fig. 5E). Furthermore, we categorized GC patients into stage I, stage II, stage III and stage IV according to their TNM stage. The RNF150 mRNA level increased with advancing tumor stage (Fig. 5F). Finally, we used ROC curves to explore the diagnostic value of RNF150 in distinguishing MSI from MSS GC patients. The results show that RNF150 had a certain diagnostic value in both TCGA and GSE62254 (Fig. 5H,I).
www.nature.com/scientificreports/
Furthermore, we analyzed the relationship between RNF150 mRNA level and the OS time of GC patients in both the TCGA database and GSE62254. First, we separated all GC patients into two groups using a cutoff value in TCGA and in GSE62254, respectively. Patients with lower RNF150 expression showed longer OS (Fig. 6A, B). Second, we separated MSI and MSS patients into two groups based on their RNF150 expression levels as described above, and the results showed the same trend (Fig. 6C-F).
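The ROC analysis of RNF150 for separating MSI from MSS can be summarized by the area under the curve (AUC). The sketch below computes the AUC directly as a rank statistic instead of reproducing the R workflow used in the study. The expression values are invented, and because RNF150 is lower in MSI, the negated expression is used as the score for the MSI class, which is a choice made for this illustration.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical RNF150 mRNA levels.
msi_expression = [1.0, 2.0, 3.0]
mss_expression = [2.5, 4.0, 5.0]

# Lower expression should point towards MSI, so negate it to form the score.
auc = roc_auc([-x for x in msi_expression], [-x for x in mss_expression])
```

An AUC of 0.5 would mean the marker does no better than chance, and 1.0 would mean perfect separation of MSI from MSS samples.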

Gene set enrichment analysis.
To determine the biological mechanism underlying RNF150 in MSI GC patients, we conducted gene set enrichment analysis and found that "heparin binding" was enriched (Supplementary Fig. S2A). In addition, the PPI network revealed that RNF150 may play an important role in the heparin binding pathway through the SLIT2 and RSPO3 genes (Supplementary Fig. S2B). Correlation analysis also showed that, in MSI GC, the RSPO3 mRNA level was the most strongly correlated with the RNF150 mRNA level (r = 0.8239, Supplementary Fig. S2C). We also analyzed the correlation between RNF150 and four MSI-related genes (MLH1, MSH2, MSH6, PMS2), and the results show that RNF150 was correlated with them (Supplementary Fig. S3).
Validation in clinical GC samples. To further explore the RNF150 protein level in GC samples, we queried the Human Protein Atlas. The staining intensity of RNF150 in gastric tumors was significantly lower than in normal tissue (Fig. 7A-D). To improve the reliability of our data, we acquired paraffin-embedded sections from 10 GC patients, of whom five were MSS and five were MSI. We obtained tumor tissues (N = 10) and paired adjacent normal tissues (N = 10) at the same time, and conducted IHC for RNF150 on the tumor sections and paired adjacent normal sections. The representative images showed that the staining intensity of RNF150 in gastric tumors was significantly lower than in normal tissues (Fig. 7E-G); all IHC images are shown in Supplementary Fig. S4. Furthermore, RT-PCR was used to verify the expression pattern of RNF150 mRNA in GC tissues and normal gastric tissues from the Renmin cohort. As shown in Fig. 5G, expression of RNF150 mRNA in GC tissues (N = 16) was markedly lower than in normal gastric tissues (N = 10). The RNF150 mRNA level was lower in MSI patients (N = 6) than in MSS patients (N = 10), in accordance with the results obtained in the TCGA-STAD and GSE62254 datasets.

Discussion
In the present study, we aimed to identify hub genes involved in MSI in GC. Using the TCGA database for WGCNA, we identified the turquoise module as the key module. Furthermore, we performed GO, KEGG and PPI analyses and identified RNF150 as the hub gene. Moreover, by cross-validation with GEO datasets, we confirmed that mRNA levels of RNF150 were significantly lower in GC than in normal gastric mucosa, and that a low level of RNF150 mRNA predicted better survival in GC patients.
RNF150 is a member of the RING finger protein (RNF) family. Eukaryotic cells contain a large number of RNF proteins, most of which function as E3 ubiquitin ligases 24. E3 ubiquitin ligases, which bind E2 to the substrate and transfer ubiquitin molecules from E2 to the substrate, are one of the three enzymes required for protein ubiquitination. The other two are the E1 ubiquitin-activating enzyme, which hydrolyses ATP, and the E2 ubiquitin-conjugating enzyme, which receives ubiquitin from E1 25. E3 ubiquitin ligases play a vital role in maintaining protein homeostasis by ubiquitinating misfolded proteins and targeting them for degradation 26. E3 ubiquitin ligases also participate in other biological processes, such as cell proliferation, apoptosis, DNA damage repair and intracellular vesicle trafficking 27.
Although there are few reports on RNF150 in the literature, the roles of many RING finger E3 ligases in malignant tumors are well known, including the oncogene MDM2 28 and the tumor suppressor gene BRCA1 29. For instance, MDM2 can cause degradation of p53 through its E3 ligase activity 30. The MDM2/p53 pathway has been reported to be associated with the development of GC 31. In addition, BRCA1 can promote DNA repair and simultaneously stabilize p53 through ubiquitination 32. One study showed that the BRCA1 mRNA level is decreased in gastric cancer tissue and is associated with the response to platinum-based chemotherapy 33.
MSI GCs display biological features distinct from those of MSS GCs. Several studies have shown that MSI is associated with good overall survival in gastric cancer patients 34-38. Karol et al. found that stage I-III GC patients with MSI-H showed longer OS despite positive margin status 38, while Stefania Beghelli and colleagues reported that MSI in GC is linked to superior outcomes only in stage II patients 39. In our study, we found that RNF150 is down-regulated in MSI GCs, and that down-regulation of RNF150 predicts longer survival of GC patients. Our results are in accordance with previous studies in which MSI GC showed superior outcomes. However, it is too soon to draw a conclusion about the exact role of MSI in GC patients. Two studies demonstrated that the detection of microsatellite instability has limited prognostic value in GC. Therefore, the underlying mechanism of MSI GC is still unclear and needs further research.
Gene Set Enrichment Analysis (GSEA) does not require a fixed threshold for differentially expressed genes. The algorithm analyzes the overall trend of the data from the perspective of gene set enrichment, which makes it easier to capture the impact of subtle but coordinated changes on biological pathways. We used GSEA in our analysis and identified heparin binding as the most significant pathway in GC. Heparin binding plays an important role in acute stress, inflammation and tumor progression 40. Heparin binding contributes to the up-regulation of heparanase, which is associated with tumor vascularity and less favorable postoperative survival in cancer patients 41. Another study revealed that heparin binding promotes the angiogenic activity of tumor cells 42. Hence, the GSEA result suggests that RNF150 might be involved in the progression of GC partly via heparin binding, which correlates well with angiogenesis and hematogenous metastasis.
This study has several limitations. First, studies on RNF150 are very limited, so it is difficult to hypothesize the mechanism by which RNF150 affects GC cells. Second, the prognostic role of RNF150 needs to be further explored in clinical samples with additional experimental methods such as western blotting and immunofluorescence. Finally, further cellular and animal experiments are needed to elucidate the underlying mechanism of RNF150 in GC.
In brief, we identified RNF150 as the key gene linked to MSI in GC. A lower mRNA level of RNF150 predicts longer survival of GC patients. Therefore, we propose that RNF150 is a novel biomarker, and that this study has clinical implications for the development of new therapies for GC patients.

Conclusion
Our study identified RNF150 as a novel biomarker in MSI GC. It is expected to be a promising prognostic biomarker for GC patients.

Data availability
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.