Introduction

Endometrial stromal sarcoma (ESS), an infrequent mesometrial-origin tumor, exhibits marked malignancy levels1,2. The latest World Health Organization classification divides ESS into low-grade (LG-ESS) and high-grade (HG-ESS) categories based on severity1. Clinical studies underscore distinct treatment approaches and prognoses for these types, yet biomarkers for their differential diagnosis remain scarce. This study endeavors to identify potential biomarkers for distinguishing between HG-ESS and LG-ESS.

While the past decade's research methods for ESS have centered on molecular fusion gene markers crucial to diagnosis3,4,5,6, current literature predominantly comprises retrospective case reports. Studies, confined to immunohistochemistry and fusion genes, offer limited insights into genetic commonalities and disparities between LG-ESS and HG-ESS subtypes. Recent genomic innovations present opportunities to grasp comprehensive tumor tissue biology. Genome-wide single nucleotide polymorphism (SNP) analysis, assessing genomes and transcripts in small sample sets, can identify target genes effectively7. SNPs offer extensive human genome loci coverage, ease of high-throughput typing, and cost-efficient sequencing, catering to the rarity of ESS.

In this study, we employed exon microarray technology to sequence 22 wax block samples (15 LG-ESS and 7 HG-ESS cases). Combining these with GEO platform data facilitated initial differential expression gene (DEG) screening between LG-ESS and HG-ESS groups. STRING platform (Search Tool for Interacting Genes Retrieval) (Version 12.0) and Cytoscape software (Software platform for visualizing molecular interaction networks) (Version 3.7.2) analyses of core genes followed. Pan-cancer analysis, immune-related discussions, and functional enrichment shed light on LG-ESS and HG-ESS differential diagnosis, molecular pathways, and clinical treatment insights.

Materials and methods

Patient and tissue specimens

We included 22 tissue wax block samples (15 LG-ESS and 7 HG-ESS cases) from patients admitted at the Department of Pathology, First Affiliated Hospital, Shihezi University, School of Medicine, between 2012 and 2021.

DNA extraction and SNP array

Tissue sections were obtained through laser capture microdissection (LCM) and processed for DNA extraction using the QIAamp DNA Micro Kit. DNA concentration was quantified using agarose gel electrophoresis and Nanodrop ND-2000. Genomic DNA (gDNA) concentrations up to 50 ng/L underwent whole genome amplification, followed by fragmentation, precipitation, and resuspension for hybridization onto the Illumina iScan Reader chip. After hybridization, the non-specifically bounding DNA was removed by washing, and the remaining specifically bound sites were extended by single bases, stained and scanned using Illumina iScan Reader. Campus Biotechnology Co., LTD supported the above sequencing process.

Microarray data acquisition and standardization

From the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) databases, 307 human endometrial stromal sarcoma datasets were retrieved using "Endometrial stromal sarcomas" as the keyword and "Homo sapiens" as the study subject, and the GSE85383 dataset (Last update date: August 31, 2017) was selected for analysis8,9. This dataset contained 9 LG-ESS and 4 HG-ESS samples, using Agilent GPL22303 SurePrint G3 Human GE 8 × 60 K Microarray. Differentially expressed genes (DEGs) were identified based on logFC values and adjusted p-values10,11 ( logFC (fold change) in upregulated ≥ 1.0 and downregulated genes ≤ minus 1.0, adjusted p-value < 0.05).

Differential gene analysis

Raw data from iScan were analyzed using the GenomeStudio Genotyping module for normalization, clustering, and genotyping. Volcano and heat maps were utilized for visualization of DEGs.

Venn analysis

Venn's analysis (http://bioinformatics.psb.ugent.be/webtools/Venn/), an online tool, was employed for meaningful DEGs intersection between exon microarray and GEO database results, yielding target genes.

Differential gene core module construction and hub gene screening

The STRING database (Version 12.0) (https://cn.string-db.org/) was used for protein interaction analysis, generating a PPI network imported into Cytoscape software for visualization. In this study, the DEGs obtained from the screening between LG-ESS and HG-ESS have used the STRING database to get a PPI network, then have visualized by Cytoscape software. The top 10 ranked Hub genes were selected using cytoHubba and Maximal clique centrality (MCC) algorithm12,13.

Functional enrichment analyses

DAVID platform (http://david-d.ncifcrf.gov/summary.jsp/) facilitated Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of DEGs, revealing enriched pathways and biological processes annotated by Hub genes.

Hub genes' pan-cancer analysis

TIMER database (https://cistrome.shinyapps.io/timer/) was used to explore pan-cancer correlations of the ten core genes and their expression differences in 34 common human cancers.

Immune-related differential gene analysis

The immunology database and analysis portal (ImmPort, https://www.immport.org/) database, encompassing almost all immune-related genes, was intersected with the differential genes obtained from what was accessed by Venn analysis to identify immune differential genes.

Ethical considerations

Patient and family consent was obtained, adhering to ethical guidelines.

Ethical approval and consent to participate

The present study was conducted in accordance with the Declaration of Helsinki for participant’s well-being and safety. We obtained 15 cases of LG-ESS and 7 cases of HG-ESS from patients at the First Affiliated Hospital of the School of Medicine, Shihezi University. Research involving human material, have been performed in accordance with the Declaration of Helsinki. Informed consent was obtained from each patient. The study protocol was approved by the Ethics Committee at the First Affiliated Hospital of the School of Medicine, Shihezi University.

Results

Patients and tissue specimens

The patient cohort, with an average age of 49.8 years (age range: 27–73), exhibited clinical symptoms such as irregular vaginal bleeding, abdominal pain, postmenopausal vaginal bleeding, and uterine fibroids. Diagnostic verification involved immunohistochemistry and consensus among at least three senior pathologists.

Differential genes from SNP array

SNP microarray technology was utilized to explore gene-level disparities between HG-ESS and LG-ESS. After DNA extraction, qualified quality control, amplification, and hybridization, 413 differentially expressed genes (adjusted p value < 0.05) (Supplementary Table 1) were identified between LG-ESS and HG-ESS groups in this analysis10,11.

These results are available on the Human Genetic Resources Information Management Backup platform (https://ngdc.cncb.ac.cn/hgrip) and the China National Center for Bioinformation (CNCB, https://ngdc.cncb.ac.cn/omix/preview/ZXO0dVNm).

Differential genes from GEO

The expression profile data in GSE85383 obtained from the GEO database, whose original expression matrix has some differences (Fig. 1a), were standardized by R code to achieve comparability between each sample (Fig. 1b). 2174 differential genes were extracted (Supplementary Table 2), including 1225 up-regulated and 949 down-regulated genes in the HG-ESS group compared to the LG-ESS group with the visualization results shown in Fig. 2. DEGs were acquired by the measures of logFC (fold change) in upregulated ≥ 1.0 and downregulated genes ≤ minus 1.0, adjusted p value < 0.0510,11.

Figure 1
figure 1

Data obtained from the GEO database were standardized using R language normalization. The raw data are shown in (a), while the normalized data are shown in (b).

Figure 2
figure 2

Analysis of differential gene expression in high and low-grade endometrial mesenchymal sarcomas within the GEO database. Red indicates high expression, while blue indicates low. Differentially expressed genes (DEGs) were identified based on logFC values and adjusted p-values (logFC (fold change) in upregulated ≥ 1.0 and downregulated genes ≤ minus 1.0, adjusted p-value < 0.05).

Venn analysis

Venn analysis revealed thirty-six shared genes from differential gene datasets obtained through exome chip sequencing and the GSE85383 database (Fig. 3a, Table 1).

Figure 3
figure 3

(a) Venn diagram depicting the overlap of differential genes identified through exon microarray sequencing and those from the GEO database. (b) Protein–Protein Interaction (PPI) network constructed using 36 co-expressed differential genes. The colored "balls" (or Nodes) represent proteins with direct effects; The lines between the "Nodes" represent the interaction between the two proteins, and the lines with different colors represent different interaction types.

Table 1 36 differential genes shared by GSE85383 and whole genome exon microarray.

Differential gene core module construction and hub gene screening

PPI network analysis via the STRING database generated protein interaction insights for the 36 shared differential genes (Fig. 3b). Then the PPI-related results were imported into the cytoHobba plugin in Cytoscape. The top 10 ranked core genes in the core module were screened as RBFOX1, PCDH7, FAT1, DSCAM, CCND1, CDH18, PXDN, ANO1, AVPR1A, COL4A4, where genes PCDH7, FAT1, DSCAM, CCND1, CDH18, PXDN, COL4A4 were up-regulated in the high-level group compared to the low-level group; genes RBFOX1, ANO1, AVPR1A were decreased in the high-level group compared to the low-level group. The above results were plotted in the protein interaction network Fig. 4a,b.

Figure 4
figure 4

(a) Module of hub genes constructed using the 36 co-expressed differential genes. The value of the gene degree is represented by the size and color of the node. (b) Top ten core genes displayed in a protein interaction network connectivity diagram. The value of the gene rank is represented by the color of the node in the Maximal clique centrality (MCC) algorithm.

Functional enrichment analyses of hub genes

GO analysis results shown in Table 2 were obtained by enrichment of the above 10 core genes, which mainly included biological processes related to cell adhesion, T cell proliferation, protein secretion, calcium ion binding, and transcriptional corepressor activity. KEGG pathway enrichment focused on adhesion-related PI3K-Akt and p53 signaling pathways(Table 3).

Table 2 Differential gene GO enrichment results of high and low grade endometrial stromal sarcomas.
Table 3 KEGG enrichment analysis of differential genes between high and low grade endometrial stromal sarcomas.

Hub genes' pan-cancer analysis

Pan-cancer analysis across 34 common human cancers unveiled distinct expression patterns for the ten core genes, as shown in Fig. 5. In this study, the expression of the RBFOX1 gene (or A2BP1), which had decreased expression in the high-grade group compared with the low-grade group, differed in 13 cancer types (Fig. 5a). In four cancers, including breast cancer (BRCA), the expression levels of tumor tissues were higher than normal tissues; While in other eight cancers, such as colon cancer (COAD), head and neck squamous cell carcinoma (HNSC), the expression levels of tumor tissues were lower than normal tissues. The expression of the FAT1 gene increased in the high-grade group compared with the low-grade group and was differently expressed in 16 cancer species (Fig. 5b). In 12 kinds of cancers, such as Bladder urothelial carcinoma (BLCA), the expression level of the tumor tissue was higher than that of the normal tissue; In BRCA and Kidney Chromophobe (KICH), the expression level in tumor tissues was lower than that in normal tissues. The expression of the PCDH7 gene, which was higher in HG-ESS than LG-ESS, was different in 14 cancer species (Fig. 5c). In four carcinomas within cholangiocarcinoma (CHOL), their expression levels were higher in tumor tissues than in normal tissues. In BLCA and BRCA, etc., their expression levels in tumor tissues were lower. The DSCAM gene with increased expression occurred in the high-grade group in this study. It differed in 13 cancer types (Fig. 5d): in 11 cancers such as HNSC and Uterine Corpus Endometrial Carcinoma (UCEC), their expression levels in tumor tissues were higher; only in BRCA cancer, their expression levels in tumor tissues were lower than normal tissues. In this study, the expression of the CDH18 gene, which was increased in the HG-ESS, differed in 13 cancer types (Fig. 5e). In Lung squamous cell carcinoma (LUSC), the tumor tissues with higher expression of it. In 5 cancers, including COAD and Kidney clear cell carcinoma (KIRC), they were lower. The CCND1 gene with increased expression in the HG-ESS differed in 13 cancer types (Fig. 5f). In 9 cancers as Rectal Cancer (READ), the tumor’s expression was higher than the normal’s; the lower face could be seen in the tumor part rather than the standard part in Kidney Papillary Cell Carcinoma (KIRP) and LUSC. Mentioning the PXDN gene’s expression (Fig. 5g), which with an increased level in the HG-ESS, was unevenly expressed in 13 cancers. The expression levels in tumors higher than normal’s could be found in CHOL and HNSC carcinomas, while the expression levels in tumors lower than normal’s could be found in BLCA and UCEC. The expression of the ANO1 gene, which was lower in HG-ESS than LG-ESS, was different in 16 cancer species (Fig. 5h). In six carcinomas within Esophageal Cancer (ESCA), their expression levels were higher in tumor tissues than in normal tissues, while in CHOL, KICH, etc., their expression levels in tumor tissues were lower. The AVPR1A gene with decreased expression in the HG-ESS differed in 15 cancer types (Fig. 5i). In KICH and HNSC, their tumor’s expression was higher than the normal’s, while the lower expression could be seen in the tumor part rather than the normal part in BLCA and Prostate Cancer (PRAD), etc. The term of the COL4A4 gene, which was increased in the high-grade group compared with the low-grade group, differed in 15 cancer types (Fig. 5j). In CHOL cancer, it overexpressed in tumor tissues more than the normal part, while in 13 cancers, including BRCA and KICH, its expression level in tumor tissues was lower than normal.

Figure 5
figure 5

Expression profiles of the 10 hub genes across 34 common human cancers analyzed in a pan-cancer context. Green represents tumor samples, purle represents normal control samples and yellow represents SKCM Metastasis samples. (*p value < 0.05, **p value < 0.01, ***p value < 0.001).

Immune-related differential gene analysis

All immune-related genes 1793 were downloaded from the ImmPort official website. Through comparison with ImmPort's immune-related genes, two genes, IGF1 and AVPR1A, were identified with down-regulated expressions in the high-grade group compared to the low-grade group (Fig. 6).

Figure 6
figure 6

Venn diagram illustrating the overlap between immune-related genes and differentially expressed genes.

Discussion

ESS is a rare malignant neoplasm of the female genital tract, categorized into low-grade (LG-ESS) and high-grade (HG-ESS), presents distinctive clinical and prognostic variations14,15. However, distinguishing between these subtypes using conventional methods remains challenging. Advances in sequencing technology and data sharing platforms provide an opportunity to comprehend disease genotypes as a foundation for molecular targeted therapy7. This study identified genes with differential expression between HG-ESS and LG-ESS through sequencing and bioinformatics analysis, focusing on ten meaningful candidates (RBFOX1, PCDH7, FAT1, DSCAM, CCND1, CDH18, PXDN, ANO1, AVPR1A, COL4A4). Several of these genes, including CCND1 and ANO1, have been associated with ESS previously, reinforcing the reliability of this analysis. Rekhi B et al. suggested that positive expression of Cyclin D1 was positively associated with more aggressive ESSs by reporting a particular ESS16; A study by Lee CH and Subbaraya S et al. suggested that positive expression of Cyclin D1 and ANO1 could be used as diagnostic immunomarkers for HG-ESS associated with YWHAE gene rearrangement17,18; The unexplored effects of RBFOX1, PCDH7, FAT1, DSCAM, CDH18, PXDN, AVPR1A, and COL4A4 in ESS warrant further investigation.

RBFOX1, the RNA binding fox-1 homolog 1, previously associated with developmental coordination disorder and spinal cerebellar ataxia, has also been linked to tumors like gastric and colon cancers now. For example, in the study of colorectal cancer in British Bangladeshis, RBFOX1 deletion was associated with high prevalence, early onset, and frequent mucous tissue types of colorectal cancer19. In the study of Malignant Mesothelioma (MMt), a pure deletion mutation in RBFOX1/A2BP1 was also identified for the first time, and the possibility was raised that this gene could be deemed as a new suppressor in MMt20. It has also been documented that RBFOXI may be involved in the pathogenesis of acute kidney injury by inhibiting the inflammatory response and oxidative stress and reducing the apoptosis of HK-2 cells induced by the hypoxic environment21. In this paper, the relationship between the gene RBFOXI and ESS disease was discussed for the first time. Considering the sequencing of the RBFOXI gene in this study (the RBFOXI gene showed a decreased expression in the HG-ESS group compared with the LG-ESS group) and the results of the pan-cancer analysis in the TIMER database, it can be proposed that RBFOXI can be a differential gene with tumor suppressive effect between the HG-ESS and LG-ESS groups.

FAT atypical cadherin 1 (FAT1), a tumor suppressor through WNT/β-catenin, Hippo, and MAPK/ERK pathways, influences tumor progression and affects therapy response in various cancers. The FAT1 gene deficiency in breast cancer affects resistance to CDk4/6 inhibitor therapy22; FAT1 mutations are associated with poor survival in head and neck squamous cell carcinoma (HNSCC)23,24; Some studies suggest that FAT1 may be an immune response regulator involved in different inflammatory processes, as has been argued in gliomas and T-cell lymphomas25,26. Some researchers have also obtained different results: Kim et al. suggested that human papillomavirus (HPV)-negative HNSCC patients with FAT1 mutation showed a better prognosis27. In melanoma and NSCLC, FAT1 can inhibit the tumor initiation ability of NSCLC cells by activating the Hippo signaling pathway, resulting in a significantly better survival outcome in patients with FAT1 mutations than the wild type and correlating with better immunogenicity and ICI efficacy. Thus, FAT1 can be recommended as a biomarker for immunotherapy in patients with melanoma and NSCLC24. A recent study showed that immunotherapy is increasingly used in sarcomas, not limited to patients with solid tumors28. Based on these results, the present study suggests that the differential expression of FAT1 between HG-ESS and LG-ESS may have important biological significance. The differential expression of FAT1 between HG-ESS and LG-ESS could impact clinical outcomes and immunotherapy.

PCDH7, a cell–cell adhesion regulator, emerges as a prognostic factor across cancers. Its roles vary. PCDH7 was defined as a risk gene, and its high expression suppressed survival in lung cancer patients in a study by Chen Y et al.29. PCDH7 provides a potential therapeutic strategy in colon cancer: it can inhibit the MEK1/2/ERK/c-Fos axis by knocking down PCDH7, induce neurogenesis and autophagy, and enhance the effect of colon cancer cells on chemotherapy30. Since androgen receptors can target PCDH7, discussing the relationship between the degree of PCDH7 methylation and the growth, invasion, and apoptosis of AIPC cells in non-androgen-dependent prostate cancer (AIPC) provides a new idea for the treatment of AIPC31. However, in gastric cancer PRMT6-KO-GC cells, knockdown of tumor suppressor gene PCDH7 promoted cell migration and invasion32. Meanwhile, in cervical cancer, up-regulation of PCDH7 can significantly inhibit the proliferation, migration, and invasion of cancer cells, and PCDH7 is positively correlated with the survival rate of patients33. In summary, PCDH7 has been considered an independent prognostic factor in various cancers. Further experimental validation is needed to determine whether it can provide a new target and therapeutic idea for the treatment of HG-ESS through the hints in this paper.

Enrichment analyses unveiled pathways like cell adhesion, PI3K-Akt, and p53 signaling associated with tumor progression. Studies have shown that changes in cell surface receptor expression often occur in malignant tumors; The activation of adhesion signaling pathways plays an essential role in cell differentiation, development, proliferation, and apoptosis and influences tumor progression by participating in tumor invasion, motility, and metastasis processes. As one of the classical signal transduction pathways, the abnormal activation of the PI3K-Akt signaling pathway and a variety of downstream effector molecules is closely related to the biological characteristics of the malignant proliferation of tumor cells34, which is consistent with the information we obtained in the clinic that tumors in the high-grade group are larger, more mitotically active, more commonly necrotic, more extensively infiltrated, and infiltration can involve the myometrium, lymph and blood vessels2, and the above enrichment analysis further explains the high malignancy and poor prognostic outcome in the high-grade group from the genetic level. Moreover, immune-related genes were implicated, indicating potential for immunotherapy. They are not only involved in the biological progression of positive regulation of activated T cell proliferation, but we also obtained three immune-related genes, IGF1, AVPR1A, and FAT1, by bioinformatics analysis. With the above hints, the idea that patients may benefit from immunotherapy is more testable.

According to the latest guidelines, the treatment of ESS patients is still very limited by the lack of specific targeted therapies or immunotherapy. However, some pan-tumor targets such as neurotrophic receptor tyrosine kinase (NTRK) gene fusions, microsatellite instability (MSI), and tumor mutation burden (TMB) have been tried in clinical work as adjuvant therapy. For example, some studies have suggested that immunotherapy, such as pembrolizumab, may be an option for primary or relapsed patients with TMB ≥ 10 who are surgically unresectable or have multiple metastases throughout the body when more good treatment options are not available21. The discussion of FAT1, PCDH7, and other essential genes in this paper may prompt questions about these genes as therapeutic targets and independent prognostic indicators of ESS. Could these key genes be therapeutic targets and independent prognostic factors that distinguish HG-ESS from LG-ESS, and could they guide immunotherapy? Also, considering the activated PI3K-AKT pathway, could inhibitors of this pathway offer new therapeutic prospects? In the clinical context, genetic testing could guide targeted and immunotherapeutic approaches for patients with limited treatment options. In conclusion, this study initiates the exploration of candidate genes, offering new insights into ESS treatment. While limitations exist due to sample size and methods, further validation and experimental expansion are planned to advance these findings.

Conclusion

In summary, our study, utilizing SNP sequencing and the GEO database, employed bioinformatics methods to analyze core genes, including RBFOX1, revealing insights into the molecular underpinnings of HG-ESS development. These core genes could potentially serve as biomarkers and therapeutic targets for HG-ESS.