Identification of hub genes distinguishing subtypes in endometrial stromal sarcoma through comprehensive bioinformatics analysis

Diagnosing low-grade and high-grade endometrial stromal sarcoma (LG-ESS and HG-ESS) is a challenge. This study aimed to identify biomarkers. 22 ESS cases were analyzed using Illumina microarrays. Differentially expressed genes (DEGs) were identified via Limma. DEGs were analyzed with String and Cytoscape. Core genes were enriched with GO and KEGG, their pan-cancer implications and immune aspects were studied. 413 DEGs were found by exome sequencing, 2174 by GSE85383 microarray. 36 common genes were identified by Venn analysis, and 10 core genes including RBFOX1, PCDH7, FAT1 were selected. Core gene GO enrichment included cell adhesion, T cell proliferation, and KEGG focused on related pathways. Expression was evaluated across 34 cancers, identifying immune DEGs IGF1 and AVPR1A. Identifying the DEGs not only helps improve our understanding of LG-ESS, HG-ESS but also promises to be potential biomarkers for differential diagnosis between LG-ESS and HG-ESS and new therapeutic targets.


Differential gene core module construction and hub gene screening
The STRING database (Version 12.0) (https:// cn.string-db.org/) was used for protein interaction analysis, generating a PPI network imported into Cytoscape software for visualization.In this study, the DEGs obtained from the screening between LG-ESS and HG-ESS have used the STRING database to get a PPI network, then have visualized by Cytoscape software.The top 10 ranked Hub genes were selected using cytoHubba and Maximal clique centrality (MCC) algorithm 12,13 .

Functional enrichment analyses
DAVID platform (http:// david-d.ncifc rf.gov/ summa ry.jsp/) facilitated Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of DEGs, revealing enriched pathways and biological processes annotated by Hub genes.

Hub genes' pan-cancer analysis
TIMER database (https:// cistr ome.shiny apps.io/ timer/) was used to explore pan-cancer correlations of the ten core genes and their expression differences in 34 common human cancers.

Immune-related differential gene analysis
The immunology database and analysis portal (ImmPort, https:// www.immpo rt.org/) database, encompassing almost all immune-related genes, was intersected with the differential genes obtained from what was accessed by Venn analysis to identify immune differential genes.

Differential genes from SNP array
SNP microarray technology was utilized to explore gene-level disparities between HG-ESS and LG-ESS.After DNA extraction, qualified quality control, amplification, and hybridization, 413 differentially expressed genes (adjusted p value < 0.05) (Supplementary Table 1) were identified between LG-ESS and HG-ESS groups in this analysis 10,11 .

Differential genes from GEO
The expression profile data in GSE85383 obtained from the GEO database, whose original expression matrix has some differences (Fig. 1a), were standardized by R code to achieve comparability between each sample (Fig. 1b).2174 differential genes were extracted (Supplementary Table 2), including 1225 up-regulated and 949 down-regulated genes in the HG-ESS group compared to the LG-ESS group with the visualization results shown in Fig. 2. DEGs were acquired by the measures of logFC (fold change) in upregulated ≥ 1.0 and downregulated genes ≤ minus 1.0, adjusted p value < 0.05 10,11 .

Venn analysis
Venn analysis revealed thirty-six shared genes from differential gene datasets obtained through exome chip sequencing and the GSE85383 database (Fig. 3a, Table 1).

Differential gene core module construction and hub gene screening
PPI network analysis via the STRING database generated protein interaction insights for the 36 shared differential genes (Fig. 3b).Then the PPI-related results were imported into the cytoHobba plugin in Cytoscape.The top 10 ranked core genes in the core module were screened as RBFOX1, PCDH7, FAT1, DSCAM, CCND1, CDH18,  www.nature.com/scientificreports/PXDN, ANO1, AVPR1A, COL4A4, where genes PCDH7, FAT1, DSCAM, CCND1, CDH18, PXDN, COL4A4 were up-regulated in the high-level group compared to the low-level group; genes RBFOX1, ANO1, AVPR1A were decreased in the high-level group compared to the low-level group.The above results were plotted in the protein interaction network Fig. 4a,b.

Functional enrichment analyses of hub genes
GO analysis results shown in Table 2 were obtained by enrichment of the above 10 core genes, which mainly included biological processes related to cell adhesion, T cell proliferation, protein secretion, calcium ion binding, and transcriptional corepressor activity.KEGG pathway enrichment focused on adhesion-related PI3K-Akt and p53 signaling pathways(Table 3).www.nature.com/scientificreports/

Hub genes' pan-cancer analysis
Pan-cancer analysis across 34 common human cancers unveiled distinct expression patterns for the ten core genes, as shown in Fig. 5.In this study, the expression of the RBFOX1 gene (or A2BP1), which had decreased expression in the high-grade group compared with the low-grade group, differed in 13 cancer types (Fig. 5a).In four cancers, including breast cancer (BRCA), the expression levels of tumor tissues were higher than normal tissues; While in other eight cancers, such as colon cancer (COAD), head and neck squamous cell carcinoma (HNSC), the expression levels of tumor tissues were lower than normal tissues.The expression of the FAT1 gene increased in the high-grade group compared with the low-grade group and was differently expressed in 16 cancer species (Fig. 5b).In 12 kinds of cancers, such as Bladder urothelial carcinoma (BLCA), the expression level of the tumor tissue was higher than that of the normal tissue; In BRCA and Kidney Chromophobe (KICH), the expression level in tumor tissues was lower than that in normal tissues.The expression of the PCDH7 gene, which was higher in HG-ESS than LG-ESS, was different in 14 cancer species (Fig. 5c).In four carcinomas within cholangiocarcinoma (CHOL), their expression levels were higher in tumor tissues than in normal tissues.In BLCA and BRCA, etc., their expression levels in tumor tissues were lower.The DSCAM gene with increased expression occurred in the high-grade group in this study.It differed in 13 cancer types (Fig. 5d): in 11 cancers such as HNSC and Uterine Corpus Endometrial Carcinoma (UCEC), their expression levels in tumor tissues were higher; only in BRCA cancer, their expression levels in tumor tissues were lower than normal tissues.In this study, the expression of the CDH18 gene, which was increased in the HG-ESS, differed in 13 cancer types (Fig. 5e).In Lung squamous cell carcinoma (LUSC), the tumor tissues with higher expression of it.In 5 cancers, including COAD and Kidney clear cell carcinoma (KIRC), they were lower.The CCND1 gene with increased expression in the HG-ESS differed in 13 cancer types (Fig. 5f).In 9 cancers as Rectal Cancer (READ), the tumor's expression was higher than the normal's; the lower face could be seen in the tumor part rather than the standard part in Kidney Papillary Cell Carcinoma (KIRP) and LUSC.Mentioning the PXDN gene's expression (Fig. 5g), which with an increased level in the HG-ESS, was unevenly expressed in 13 cancers.The expression levels in tumors higher than normal's could be found in CHOL and HNSC carcinomas, while the expression levels in tumors lower than normal's could be found in BLCA and UCEC.The expression of the ANO1 gene, which was lower in HG-ESS than LG-ESS, was different in 16 cancer species (Fig. 5h).In six carcinomas within Esophageal Cancer (ESCA), their expression levels were higher in tumor tissues than in normal tissues, while in CHOL, KICH, etc., their expression levels in tumor tissues were lower.The AVPR1A gene with decreased expression in the HG-ESS differed in 15 cancer types (Fig. 5i).In KICH and HNSC, their tumor's expression was higher than the normal's, while the lower expression could be seen in the tumor part rather than the normal part in BLCA and Prostate Cancer (PRAD), etc.The term of the COL4A4 gene, which was increased in the high-grade group compared with the low-grade group, differed in 15 cancer types (Fig. 5j).In CHOL cancer, it overexpressed in tumor tissues more than the normal part, while in 13 cancers, including BRCA and KICH, its expression level in tumor tissues was lower than normal.

Immune-related differential gene analysis
All immune-related genes 1793 were downloaded from the ImmPort official website.Through comparison with ImmPort's immune-related genes, two genes, IGF1 and AVPR1A, were identified with down-regulated expressions in the high-grade group compared to the low-grade group (Fig. 6).

Discussion
ESS is a rare malignant neoplasm of the female genital tract, categorized into low-grade (LG-ESS) and highgrade (HG-ESS), presents distinctive clinical and prognostic variations 14,15 .However, distinguishing between these subtypes using conventional methods remains challenging.Advances in sequencing technology and data sharing platforms provide an opportunity to comprehend disease genotypes as a foundation for molecular  17,18 ; The unexplored effects of RBFOX1, PCDH7, FAT1, DSCAM, CDH18, PXDN, AVPR1A, and COL4A4 in ESS warrant further investigation.RBFOX1, the RNA binding fox-1 homolog 1, previously associated with developmental coordination disorder and spinal cerebellar ataxia, has also been linked to tumors like gastric and colon cancers now.For example, in the study of colorectal cancer in British Bangladeshis, RBFOX1 deletion was associated with high prevalence, early onset, and frequent mucous tissue types of colorectal cancer 19 .In the study of Malignant Mesothelioma (MMt), a pure deletion mutation in RBFOX1/A2BP1 was also identified for the first time, and the possibility was raised that this gene could be deemed as a new suppressor in MMt 20 .It has also been documented that RBFOXI may be involved in the pathogenesis of acute kidney injury by inhibiting the inflammatory response and oxidative stress and reducing the apoptosis of HK-2 cells induced by the hypoxic environment 21 .In this paper, the relationship between the gene RBFOXI and ESS disease was discussed for the first time.Considering the sequencing of the RBFOXI gene in this study (the RBFOXI gene showed a decreased expression in the HG-ESS group compared with the LG-ESS group) and the results of the pan-cancer analysis in the TIMER database, it can be proposed that RBFOXI can be a differential gene with tumor suppressive effect between the HG-ESS and LG-ESS groups.
FAT atypical cadherin 1 (FAT1), a tumor suppressor through WNT/β-catenin, Hippo, and MAPK/ERK pathways, influences tumor progression and affects therapy response in various cancers.The FAT1 gene deficiency in breast cancer affects resistance to CDk4/6 inhibitor therapy 22 ; FAT1 mutations are associated with poor survival in head and neck squamous cell carcinoma (HNSCC) 23,24 ; Some studies suggest that FAT1 may be an immune response regulator involved in different inflammatory processes, as has been argued in gliomas and T-cell lymphomas 25,26 .Some researchers have also obtained different results: Kim et al. suggested that human papillomavirus (HPV)-negative HNSCC patients with FAT1 mutation showed a better prognosis 27 .In melanoma and NSCLC, FAT1 can inhibit the tumor initiation ability of NSCLC cells by activating the Hippo signaling pathway, resulting in a significantly better survival outcome in patients with FAT1 mutations than the wild type and correlating with better immunogenicity and ICI efficacy.Thus, FAT1 can be recommended as a biomarker for immunotherapy in patients with melanoma and NSCLC 24 .A recent study showed that immunotherapy is increasingly used in sarcomas, not limited to patients with solid tumors 28 .Based on these results, the present study suggests that the differential expression of FAT1 between HG-ESS and LG-ESS may have important biological significance.The differential expression of FAT1 between HG-ESS and LG-ESS could impact clinical outcomes and immunotherapy.
PCDH7, a cell-cell adhesion regulator, emerges as a prognostic factor across cancers.Its roles vary.PCDH7 was defined as a risk gene, and its high expression suppressed survival in lung cancer patients in a study by Chen Y et al. 29 .PCDH7 provides a potential therapeutic strategy in colon cancer: it can inhibit the MEK1/2/ERK/c-Fos axis by knocking down PCDH7, induce neurogenesis and autophagy, and enhance the effect of colon cancer cells on chemotherapy 30 .Since androgen receptors can target PCDH7, discussing the relationship between the degree of PCDH7 methylation and the growth, invasion, and apoptosis of AIPC cells in non-androgen-dependent prostate cancer (AIPC) provides a new idea for the treatment of AIPC 31 .However, in gastric cancer PRMT6-KO-GC cells, knockdown of tumor suppressor gene PCDH7 promoted cell migration and invasion 32 .Meanwhile, in cervical cancer, up-regulation of PCDH7 can significantly inhibit the proliferation, migration, and invasion of cancer cells, and PCDH7 is positively correlated with the survival rate of patients 33 .In summary, PCDH7 has been considered an independent prognostic factor in various cancers.Further experimental validation is needed to determine whether it can provide a new target and therapeutic idea for the treatment of HG-ESS through the hints in this paper.
Enrichment analyses unveiled pathways like cell adhesion, PI3K-Akt, and p53 signaling associated with tumor progression.Studies have shown that changes in cell surface receptor expression often occur in malignant tumors; The activation of adhesion signaling pathways plays an essential role in cell differentiation, development, proliferation, and apoptosis and influences tumor progression by participating in tumor invasion, motility, and metastasis processes.As one of the classical signal transduction pathways, the abnormal activation of the PI3K-Akt signaling pathway and a variety of downstream effector molecules is closely related to the biological characteristics of the malignant proliferation of tumor cells 34 , which is consistent with the information we obtained in the clinic that tumors in the high-grade group are larger, more mitotically active, more commonly necrotic, more extensively infiltrated, and infiltration can involve the myometrium, lymph and blood vessels 2 , and the above enrichment analysis further explains the high malignancy and poor prognostic outcome in the high-grade group from the genetic level.Moreover, immune-related genes were implicated, indicating potential for immunotherapy.They are not only involved in the biological progression of positive regulation of activated T cell proliferation, but we also obtained three immune-related genes, IGF1, AVPR1A, and FAT1, by bioinformatics analysis.With the above hints, the idea that patients may benefit from immunotherapy is more testable.
According to the latest guidelines, the treatment of ESS patients is still very limited by the lack of specific targeted therapies or immunotherapy.However, some pan-tumor targets such as neurotrophic receptor tyrosine kinase (NTRK) gene fusions, microsatellite instability (MSI), and tumor mutation burden (TMB) have been tried in clinical work as adjuvant therapy.For example, some studies have suggested that immunotherapy, such as pembrolizumab, may be an option for primary or relapsed patients with TMB ≥ 10 who are surgically unresectable or have multiple metastases throughout the body when more good treatment options are not available 21 .The discussion of FAT1, PCDH7, and other essential genes in this paper may prompt questions about these genes as therapeutic targets and independent prognostic indicators of ESS.Could these key genes be therapeutic targets and independent prognostic factors that distinguish HG-ESS from LG-ESS, and could they guide immunotherapy?Also, considering the activated PI3K-AKT pathway, could inhibitors of this pathway offer new therapeutic prospects?In the clinical context, genetic testing could guide targeted and immunotherapeutic approaches for patients with limited treatment options.In conclusion, this study initiates the exploration of candidate genes, offering new insights into ESS treatment.While limitations exist due to sample size and methods, further validation and experimental expansion are planned to advance these findings.

Conclusion
In summary, our study, utilizing SNP sequencing and the GEO database, employed bioinformatics methods to analyze core genes, including RBFOX1, revealing insights into the molecular underpinnings of HG-ESS development.These core genes could potentially serve as biomarkers and therapeutic targets for HG-ESS.

Figure 1 .Figure 2 .
Figure 1.Data obtained from the GEO database were standardized using R language normalization.The raw data are shown in (a), while the normalized data are shown in (b).

Figure 3 .
Figure 3. (a) Venn diagram depicting the overlap of differential genes identified through exon microarray sequencing and those from the GEO database.(b) Protein-Protein Interaction (PPI) network constructed using 36 co-expressed differential genes.The colored "balls" (or Nodes) represent proteins with direct effects; The lines between the "Nodes" represent the interaction between the two proteins, and the lines with different colors represent different interaction types.

Figure 4 .
Figure 4. (a) Module of hub genes constructed using the 36 co-expressed differential genes.The value of the gene degree is represented by the size and color of the node.(b) Top ten core genes displayed in a protein interaction network connectivity diagram.The value of the gene rank is represented by the color of the node in the Maximal clique centrality (MCC) algorithm.

Figure 6 .
Figure 6.Venn diagram illustrating the overlap between immune-related genes and differentially expressed genes.

Table 1 .
36 differential genes shared by GSE85383 and whole genome exon microarray.

Table 2 .
Differential gene GO enrichment results of high and low grade endometrial stromal sarcomas.