Introduction

Glioblastoma multiforme (GBM) is a commonly diagnosed primary brain malignancy. In the United States, GBM is the most frequently diagnosed central nervous system tumor in adults. However, unremarkable improvement has been made in increasing the 5-year GBM patients' survival1. Understanding GBM biology can pave the way for introducing new treatments for affected patients.

Growing evidence has highlighted the significance of microRNA (miR) in developing various malignancies; miRs are non-protein-coding RNAs that can post-transcriptionally regulate various mRNAs2,3,4. It is well-established that miRs have pivotal roles in different biological processes5. In cancer biology, specific miRs can be divided into tumor-suppressive miRs and oncomiRs, which their ectopic expression causes tumor suppression or tumor growth, respectively6. Therefore, understanding the biology of miRs in cancer development might provide valuable insights into cancer treatment.

Most human herpesviruses can encode miRs, and viral miRs have considerable roles in various pathogenesis and virus-related diseases7. For instance, it is well-known that EBV is associated with solid and hematological malignancies. Lei et al. have shown that ectopic expression of ebv-miR-BART3* downregulates the expression of DICE1, a tumor-suppressive gene, leading to increased cell proliferation in nasopharyngeal carcinoma8. ebv-miR-BART18-5p is substantially upregulated in nasopharyngeal carcinoma tissues, and its expression level is negatively correlated with PTEN, as a critical tumor-suppressive gene of the AKT/PI3K signaling pathway9. Consistent with these, HSV1 antibody positivity combined with other risk factors is associated with an increased risk of oropharyngeal squamous cell carcinoma development10. Based on a report, 15% of sequenced oropharyngeal squamous cell carcinoma tissues are positive for HSV11. Growing reports have shown the presence of HSV1 infection in malignant tissues, like vulvar and buccal squamous cell carcinoma12,13. Although HSV1 can infect the central nervous system, there is no comprehensive study to investigate the potential role of HSV1 in the development of GBM.

In the current study, we aimed to investigate whether HSV1 miRs are involved in GBM pathogenesis and, if they are involved, how they exert their potential oncogenic or tumor-suppressive effect on GBM development. For this aim, we conducted a deep in silico investigation, did single-cell RNA sequencing analyses, and performed a thorough literature review to identify the potential trace of HSV1 miRs in GBM development. We also used the TCGA-GBM dataset to externally validate the identified differentially expressed genes. The results of this study provide novel and valuable insights into the potential roles of HSV1 in GBM development.

Results

Sample normalization and clustering

First, we normalized the expression value of both the GSE158284 and GSE13276 datasets. After excluding 17 samples (nine samples were from pediatric GBM patients, six samples were not arranged based on unsupervised clustering, and two samples had technical problems in sequencing), we included 24 samples, i.e., 15 adult GBM tissues and 9 non-tumoral samples, from the GSE158284. The box plot of normalized values and clustering analysis of the 24 samples from the GSE158284 dataset are shown in Fig. 1A,B. Regarding the GSE13276 dataset, we included 8 samples from adult patients with GBM. We excluded 7 samples because they were disarranged according to unsupervised clustering or the non-tumoral tissues were not from adjacent GBM. The box plot of normalized values and clustering analysis of the 8 samples from the GSE13276 dataset are shown in Fig. 1C,D.

Figure 1
figure 1

Data normalization and clustering. (A) The boxplot of normalized data of the included samples from the GSE158284 dataset. (B) Clustering of the included samples from the GSE158284 dataset. (C) The boxplot of normalized data of the included samples from the GSE13276 dataset. (D) Clustering of the included samples from the GSE13276 dataset.

Identifying differentially expressed genes and HSV1 miRs and their targets

Afterward, we used the limma package to identify differentially expressed HSV miRs and differentially expressed genes in GBM tissues. By defining the adjusted P-value ˂ 0.05 and |log2FC| ≥ 1.3, we depicted the differentially expressed miRs and differentially expressed genes in GBM tissues (Fig. 2A,B, respectively). Regarding the HSV miRs, our results have indicated that hsv1-miR-H6-3p is significantly upregulated in adult GBM tissues, and hsv1-miR-H1-5p is significantly downregulated in adult GBM tissues. Afterward, we used the miRDB database to predict the potential targets of hsv1-miR-H6-3p and hsv1-miR-H1-5p. It is found that hsv1-miR-H6-3p can target 371 human mRNAs, and hsv1-miR-H1-5p can target 571 human mRNAs. Our results have indicated that four targets of hsv1-miR-H6-3p, as the upregulated HSV1 miR in GBM tissues, are among the significantly downregulated genes in GBM tissues; these genes are SH3TC2, SASH1, EPB41L1, and SH3PXD2A (Fig. 2C and Table 1). Also, eight targets of hsv1-miR-H1-5p, as the downregulated HSV1 miR in GBM tissues, are among the significantly upregulated genes in GBM tissues; these genes are MELK, FZD2, IL1RN, NOVA1, MYH10, TMEM97, PTPRZ1, and PDGFC (Fig. 2D, and Table 1).

Figure 2
figure 2

The volcano plots of the included samples from the GSE158284 and GSE13276 datasets and the common genes between DEGs and targets of hsv1-miR-H6-3p and hsv1-miR-H1-5p. (A) The differentially expressed genes (DEGs) of included samples from the GSE158284 dataset with the cut-off of adjusted P-value ˂ 0.05 and |log2FC| ≥ 1.3. (B) The DEGs of the included samples from the GSE158284 dataset adjusted P-value ˂ 0.05 and |log2FC| ≥ 1.3. (C) The common genes between DEGs and targets of hsv1-miR-H6-3p and hsv1-miR-H1-5p. (C) The common genes between targets of hsv1-miR-H6-3p and the downregulated genes in GBM tissues. (D) The common genes between targets of hsv1-miR-H1-5p and the upregulated genes in GBM tissues.

Table 1 The differentially expressed targets of hsv1-miR-H6-3p and hsv1-miR-H1-5p in GBM tissues.

Characterizing hsv1-miR-H6-3p targets

We accessed the TCGA-GBM and GTEX datasets to study the expression levels of the identified gene. We used GBM tissues (n = 163) from the TCGA-GBM dataset and normal brain tissues (n = 207) from the GTEX dataset. Our results have confirmed that the expression levels of EPB41L1 and SH3PXD2A are significantly downregulated in GBM tissues (Fig. 3A,B). However, there are no statistically significant differences in the expression levels of SH3TC2 and SASH1 between GBM and non-tumoral tissues (Fig. 3C,D). Based on this external validation, hsv1-miR-H6-3p upregulation can be responsible for downregulating EPB41L1 and SH3PXD2A expression in GBM tissues.

Figure 3
figure 3

Characterizing hsv1-miR-H6-3p targets. (A) EPB41L1 expression, (B) SH3PXD2A expression, (C) SH3TC2 expression, (D) SASH1 expression, (E) the prognostic values of EPB41L1 and SH3PXD2A in GBM patients. (F) Drug sensitivity analysis of EPB41L1 and SH3PXD2A based on GDSC. (G) Drug sensitivity analysis of EPB41L1 and SH3PXD2A based on CTRP. (H,I) EPB41L1 and SH3PXD2A expression in temozolomide-resistant GBM cell lines and normal astrocyte cells. (J) Immune cell infiltration and EPB41L1 and SH3PXD2A in GBM tissues.

After externally validating EPB41L1 and SH3PXD2A downregulation in GBM, we studied their prognostic values. It has been found that EPB41L1 and SH3PXD2A have no statistically significant prognostic values in determining the disease-specific survival, overall survival, and progression-free survival of GBM patients (Fig. 3E). Figure 3F,G depict the drug sensitivity analyses. Of note, there is a significant positive correlation between EPB41L1 with teniposide, as anti-neoplastic chemotherapy. However, no significant correlations were identified between EPB41L1 and SH3PXD2A with temozolomide. Consistent with this, it has been found that there are no significant differences between the expression levels of EPB41L1 and SH3PXD2A in temozolomide-resistant cells and normal astrocyte cells (Fig. 3H,I). Figure 3J shows the correlations between EPB41L1 and SH3PXD2A and infiltrated immune cells in GBM tissues; of note, there is a significant positive correlation between CD4+ T-cells and EPB41L1 in GBM tissues. The enrichment analysis has indicated that SH3PXD2A is significantly involved in the superoxide-generating NADPH oxidase activator activity (data are not shown).

Characterizing hsv1-miR-H1-5p targets

For the external validation, we investigated the expression levels of identified hsv1-miR-H1-5p targets based on TCGA-GTEX data. The expression levels of MELK, FZD2, IL1RN, NOVA1, PTPRZ1, TMEM97, and PDGFC are significantly upregulated in GBM tissues (Fig. 4A–G). However, MYH10 expression is not altered in GBM tissues (Fig. 4H). Based on this external validation, hsv1-miR-H1-5p downregulation can be responsible for upregulating MELK, FZD2, IL1RN, NOVA1, PTPRZ1, TMEM97, and PDGFC in GBM tissues.

Figure 4
figure 4figure 4

Characterizing hsv1-miR-H1-5p targets. (A) MELK expression, (B) FZD2 expression, (C) IL1RN expression, (D) NOVA1 expression, (E) PTPRZ1 expression, (F) TMEM97 expression, (G) PDGFC expression, (H) MYH10 expression, (I) the prognostic values of MELK, FZD2, IL1RN, NOVA1, PTPRZ1, TMEM97, and PDGFC in GBM patients. (J) Drug sensitivity analysis of MELK, FZD2, IL1RN, NOVA1, PTPRZ1, TMEM97, and PDGFC based on GDSC. (K) Drug sensitivity analysis of MELK, FZD2, IL1RN, NOVA1, PTPRZ1, TMEM97, and PDGFC based on CTRP. (LO) MELK, PTPRZ1, NOVA1, and TMEM97 expression in temozolomide-resistant GBM cell lines and normal astrocyte cells. (P) PDGFC staining in GBM cells, (Q) PTPRZ1 staining in GBM cells, (R) NOVA1 staining in GBM cells. (S) Immune cell infiltration and hsv1-miR-H1-5p targets in GBM tissues. (T) Enrichment analyses of hsv1-miR-H1-5p targets.

Following the external validation of MELK, FZD2, IL1RN, NOVA1, PTPRZ1, TMEM97, and PDGFC upregulation in GBM, we studied their prognostic significance. It has been found that NOVA1 downregulation is significantly associated with poor disease-specific survival and progression-free survival of GBM patients (Fig. 4I). Figure 4J,K depict the drug sensitivity analyses. Notably, there are significant negative correlations between the studied upregulated genes with bleomycin, midostaurin, afatinib, and docetaxel, as anti-neoplastic chemotherapeutic agents. Only the expression of TMEM97 has been positively associated with temozolomide treatment based on GDSC. However, the expression levels of MELK, PTPRZ1, and NOVA1, along with TMEM97, have been significantly upregulated in temozolomide-resistant GBM cell lines compared to normal astrocyte cells (Fig. 4I–O). The HPA009134, HPA015103, and HPA004155 staining have localized the PDGFC in the plasma membrane and cytosol, PTPRZ1 in the cytosol, and NOVA1 in the nucleoplasm, nucleoli, and vesicles, respectively (Fig. 4P–R, respectively). Figure 5S shows the correlations between MELK, FZD2, IL1RN, NOVA1, PTPRZ1, TMEM97, and PDGFC and infiltrated immune cells in GBM tissues; of note, there is a significant positive correlation between macrophage and IL1RN and negative correlation between CD8+ naïve T-cells and IL1RN in GBM tissues. The enrichment analysis has indicated that FZD2 and PTPRZ1 are significantly involved in Wnt-activated receptor activity and protein tyrosine phosphatase activity, respectively (Fig. 4T).

Figure 5
figure 5

Tumor bulk analyses and gene interactions. (A) The heatmap of the correlation between the identified common genes based on the TCGA-GBM dataset. (B) The heatmap of methylation patterns of hsv1-miR-H6-3p and hsv1-miR-H1-5p targets. (C) The heatmap of mRNA expression patterns of hsv1-miR-H6-3p and hsv1-miR-H1-5p targets. (D) The gene interaction of hsv1-miR-H6-3p and hsv1-miR-H1-5p targets based on GeneMANIA.

Correlational studies and gene interaction of hsv1-miR-H6-3p and hsv1-miR-H1-5p targets

The data from the TCGA-GBM were analyzed to demonstrate a trend in the correlation of the identified genes (Fig. 5A). The TCGA-GBM data has releveled an overall hypomethylation in MELK, FZD2, IL1RN, NOVA1, PTPRZ1, TMEM97, and PDGFC and hypermethylation in EPB41L1 in GBM tissues (Fig. 5B). Figure 5C depicts the mRNA expression patterns of hsv1-miR-H6-3p and hsv1-miR-H1-5p in GBM and non-tumoral tissues. Except for FZD2, hsv1-miR-H6-3p and hsv1-miR-H1-5p targets have considerable interactions with each other based on the GeneMANIA (Fig. 5D).

The expression pattern of identified common genes in cell populations of GBM tissues

After external validating the upregulation of MELK, FZD2, IL1RN, NOVA1, TMEM97, PTPRZ1, and PDGFC in GBM tissues compared to non-tumoral samples, we applied single-cell RNA sequencing analyses to identify which cell types are responsible for the increased expression of these genes. After quality control, 35,133 out of 42,966 cells were included in our analyses. Our results have identified 11 distinct cell populations in the included primary GBM tissues (Fig. 6A). Except for the IL1RN, MELK, FZD2, NOVA1, TMEM97, PTPRZ1, and PDGFC have been highly expressed in astrocyte populations in terms of magnitude and/or intensity (Fig. 6B–H, respectively).

Figure 6
figure 6

Identification of cells in the included primary GBM microenvironment and the dot plots of the identified genes for demonstrating their expression patterns in the cell populations of the included primary GBM microenvironment. (A) Cell populations of included GBM tissues, (B) IL1RN, (C) MELK, (D) FZD2, (E) NOVA1, (F) TMEM97, (G) PTPRZ1, and (H) PDGFC.

Discussion

Despite considerable advancements in our understanding of GBM development, the underlying virus-mediated post-transcriptional regulation in GBM development needs further studies. In the current in silico study, we thoroughly investigated the role of HSV1 miRs in GBM development. We identified that hsv1-miR-H6-3p, as an upregulated HSV1 miR in GBM tissues, can target the expression of EPB41L1 and SH3PXD2A in GBM tissues. Also, hsv1-miR-H1-5p, as a downregulated HSV1 miR in GBM tissues, can target the expression of MELK, FZD2, NOVA1, TMEM97, PTPRZ1, and PDGFC in non-tumoral brain tissues and these genes are expressed in astrocytes. The following discusses the significance of the above-mentioned genes in GBM development.

As a cell-cycle-dependent protein kinase overexpressed in cancer cells, MELK silencing is associated with an increased level of p21 in p53-deficient cells and arrests cancer cells at the G1 phase14. MELK inhibitor has been associated with enhanced radiosensitivity and suppressed tumor growth in animal models15. MELK inhibitor suppresses the proliferation, clonogenicity, neurosphere formation, invasion, and migration of GBM cells. MELK inhibition can also upregulate p21 and arrest the cell cycle. In animal models, MELK inhibition has increased the survival of affected mice and inhibited tumor growth. Of interest, the inhibitory effect of MELK inhibition has been more effective on cancer stem cells than on GBM cells16. FZD2 can stimulate the canonical and non-canonical Wnt pathways in malignancies. FZD2 suppression inhibits the β-catenin-dependent signaling pathway, and its knockdown suppresses tumor growth in neuroblastoma animal models17. FZD2 meditates the EMT process and cell migration, and administering antibodies against FZD2 suppresses tumor growth and migration in animal models18. In line with these, we have shown that FZD2 is significantly involved in the Wnt-activated receptor activity. NOVA1 can increase telomerase activity and enhances its stability, leading to tumor development19,20. NOVA1 is substantially upregulated in GBM tissues and GBM cells compared to non-tumoral counterparts. NOVA1 expression is upregulated in patients with high-infiltration depth and tumors greater than 2 cm compared with patients with low infiltration depth and smaller tumor sizes, respectively21. TMEM97 expression is substantially increased in glioma tissues compared to non-tumoral brain tissues, and its expression is increased with higher glioma grades. Increased expression of TMEM97 has also been associated with the inferior survival of glioma patients. TMEM97 silencing suppresses tumor proliferation, arrests the cell cycle, inhibits cell invasion, and decreases the migration of GBM cells. Besides, TMEM97 knockdown suppresses the EMT process, decreases the protein expression of β-catenin and Twist, and downregulates E-cadherin protein expression in GBM cells22TMEM97 knockdown can suppress astrocytoma migration and decrease tumor growth22. PTPRZ1 expression has been higher in glioma stem cells compared to matched non-tumoral stem cells, and PTPRZ1 knockdown decreases the cell viability and tumorsphere formation of glioma stem cells. In animal models, PTPRZ1 disruption decreases cell proliferation and downregulates SOX2 expression. Besides, PTPRZ1+ tumor cells display a high rate of tumorigenesis compared with matched PTPRZ1- in animal models23. Fujikawa et al. have shown that PTPRZ1 knockdown decreases stemness features of GBM cells24. Activating the platelet-derived growth factor receptor (PDGFR) pathway via PDGFs can stimulate oncogenic effects and lead to GBM development. PDGFC expression is substantially higher in GBM tissues than in normal brain tissues. Also, PDGFC expression is upregulated in GBM cells and GBM stem cells compared to normal astrocyte cells25. Besides, PDGFC knockdown remarkably decreases non-small cell lung cancer cell proliferation26. The increased expression of PDGFC is associated with the inferior disease-free survival of patients with triple-negative breast cancers27. Besides, upregulated PDGFC expression is associated with the histological grade and stage of breast cancer patients, and its knockdown has decreased invasion and proliferation in breast cancer cells39.

The mRNA and protein expression of EPB41L1 is considerably downregulated in GBM tissues compared to adjacent non-tumoral tissues. EPB41L1 overexpression arrests the cell cycle in the G0/G1 phase, increases the apoptosis rate, and suppresses the invasion and migration of GBM cells28. Our results have demonstrated that EPB41L1 is considerably downregulated in GBM tissues, and hsv1-miR-H6-3p might target their expression. We have demonstrated that SH3PXD2A is significantly involved in the superoxide-generating NADPH oxidase activator activity. It is found that the expression level of SH3PXD2A is substantially decreased in GBM tissues, and hsv1-miR-H6-3p might target their expression.

The current study has some strengths. First, we highlighted the trace of HSV1 miRs in GBM development. Second, we applied exhaustive in silico investigations, external validation, and literature review to elucidate the identified HSV1 miRs targets in GBM pathogenesis. Third, we applied single-cell RNA analyses to display the expression patterns of the upregulated genes that can be targeted by hsv1-miR-H1-5p in various cells residing in the GBM microenvironment. Overall, the current study provides novel insights into the potential role of HSV1 miRs in GBM development.

Methods

Dataset selection

The Gene Expression Omnibus (GEO) database of NCBI was deeply searched to select datasets that investigated miR expression in GBM and non-tumoral tissues. For this purpose, the GSE158284 database was selected; this dataset utilized Agilent-021827 Human miRNA Microarray (V3) (Probe Name version) to investigate miR expression in GBM and non-tumoral tissues. Besides, we aimed to identify datasets that studied mRNA expression in GBM tissues, adjacent non-tumoral tissues, GBM cell lines, and normal astrocyte cells. For this purpose, the GSE13276 dataset was chosen; this dataset used [HG-U133A] Affymetrix Human Genome U133A Array to study mRNA expression in GBM and non-tumoral samples29. Also, we selected the GSE153679 that sequenced the gene expression of normal astrocyte cells and patient-derived temozolomide-resistant GBM cell lines. This dataset applied [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version] for sequencing genes30. For single-cell RNA sequencing, the GSE182109 dataset was selected to access the raw single-cell RNA sequencing data of patients with primary GBM. The GSE182109 dataset applied Illumina HiSeq 4000 (Homo sapiens) for single-cell RNA sequencing31.

Identifying differentially expressed genes (DEGs) and HSV1 miRs targets

After data normalization and clustering, R software (version 4.1.3) was leveraged to find the differentially expressed miRs and mRNAs in GSE158284 and GSE13276. Like our previous work, the Benjamini and Hochberg method was used for adjusting the obtained P-values2. The limma package was used to identify DEGs. The adjusted P-value ˂ 0.05 and |log2FC| ≥ 1.3 were the criteria for these analyses considering a miR or an mRNA as significantly altered miR or mRNA expression. Volcano plots were drawn to display the differentially expressed miRs and mRNAs. After identifying the differentially expressed HSV1 miRs, the miRDB database was used to identify the potential targets of HSV1 miRs32. A score ≥ 50 was the criteria for selecting the targets of HSV1 miRs.

Tumor bulk analysis

The TCGA-GBM and GTEX datasets were accessed using the GEPIA2 (http://gepia2.cancer-pku.cn/#index) to validate the expression of identified DEGs externally33. The TCGA-GBM dataset was also analyzed to find the correlations between DEGs, their mRNA expression, and methylation patterns in the non-tumoral and GBM tissues. The survival, immunogenetics, and drug sensitivity analyses were conducted using gene set cancer analysis (GSCA). This database contains data from genomics of drug sensitivity in cancer (GDSC), cancer therapeutics response portal (CTRP), and TCGA cohorts. The immunogenetics analyses are based on the immuCellAI algorithms34. The Human Protein Atlas (https://www.proteinatlas.org/) was used to identify the protein localization of DEGs in human GBM cell lines35. The molecular function of DEGs was investigated using enrichment analyses. Adjusted P-value < 0.05 was considered a criterion for determining the significance36.

Single-cell RNA sequencing analyses

The GSE182109 dataset was downloaded from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). The Seurat (version 4.1.0)37 R package was used to analyze raw single-cell RNA sequencing data. Cells with mitochondrial gene expression ˂ 10% and expressed genes of above 500 were selected for analysis. Following data normalization, finding variable genes, and integrating data from multiple samples, we scaled the data and ran the principal component analysis. Afterward, the cells were clustered into a two-dimensional figure using the uniform manifold approximation and projection (UMAP) clustering algorithms with a resolution of 0.1. The PanglaoDB38 and CellMarker39 databases were used to annotate the identified clusters. The gene markers used in this study are shown in Table 2.

Table 2 The gene markers used for annotating the identified clusters.