RUNX1 and REXO2 are associated with the heterogeneity and prognosis of IDH wild type lower grade glioma

Based on isocitrate dehydrogenase (IDH) alterations, lower grade glioma (LGG) is divided into IDH mutant and wild type subgroups. However, the further classification of IDH wild type LGG was unclear. Here, IDH wild type LGG patients in The Cancer Genome Atlas and Chinese Glioma Genome Atlas were divided into two sub-clusters using non-negative matrix factorization. IDH wild type LGG patients in sub-cluster2 had prolonged overall survival and low frequency of CDKN2A alterations and low immune infiltrations. Differentially expressed genes in sub-cluster1 were positively correlated with RUNX1 transcription factor. Moreover, IDH wild type LGG patients with higher stromal score or immune score were positively correlated with RUNX1 transcription factor. RUNX1 and its target gene REXO2 were up-regulated in sub-cluster1 and associated with the worse prognosis of IDH wild type LGG. RUNX1 and REXO2 were associated with the higher immune infiltrations. Furthermore, RUNX1 and REXO2 were correlated with the worse prognosis of LGG or glioma. IDH wild type LGG in sub-cluster2 was hyper-methylated. REXO2 hyper-methylation was associated with the favorable prognosis of LGG or glioma. At last, we showed that, age, tumor grade and REXO2 expression were independent prognostic factors in IDH wild type LGG.

Transcriptional characteristics of the different sub-clusters of IDH wild type LGG patients. Based on the criterion of absolute fold change > 2 and P values < 0.001, the differentially expressed genes between sub-cluster1 and sub-cluster2 of IDH wild type LGG patients were identified. These resulted 3390 differentially expressed genes in TCGA dataset and 1601 differentially expressed genes in CGGA dataset, respectively. 723 genes were commonly changed between sub-cluster1 and sub-cluster2 IDH wild type LGG patients in TCGA and CGGA datasets (Fig. 2a). Un-supervised clustering heatmaps showed that those genes divided the IDH wild type LGG patients into two distinct subgroups and most genes were up-regulated in sub-cluster2 IDH wild type LGG patients (Fig. 2b).
Using gene set enrichment analysis (GSEA), we further determined the Kyoto Encyclopedia of Genes and Genomes (KEGG) signaling pathways associated with the transcriptional signatures in sub-cluster1 and sub-cluster2 IDH wild type LGG patients in TCGA and CGGA datasets. Based on the criterion of P values < 0.05, we found that IDH wild type LGG patients in sub-cluster1 were positively correlated with TP53 signaling pathway Figure 1. Two molecular sub-clusters of IDH wild type LGG patients with different clinical outcomes, genetic alterations and immune alterations. (a) Consensus-map showed that primary LGG patients without IDH mutations from TCGA dataset were divided into two sub-clusters using NMF method based on the all expressed genes. Kaplan-Meier survival plot showed the different overall survival of IDH wild type LGG patients in sub-cluster1 and sub-cluster2. P values were determined by log-rank test. (b) LGG patients without IDH mutations from CGGA dataset were divided into two sub-clusters using NMF method. The clinical outcomes of sub-cluster1 and sub-cluster2 were determined. (c) The difference of age, gender and tumor grade in the two subclusters of IDH wild type LGG patients from TCGA dataset. (d) The difference of age, gender and tumor grade in the two sub-clusters of IDH wild type LGG patients from CGGA dataset. (e) The number of IDH wild type LGG patients with or without TP53, EGFR, CDKN2A, PTEN, NF1 alterations in each sub-cluster. P values were calculated by Chi-square test. (f) Box plots showed the stromal score, immune score and tumor purity in sub-cluster1 and sub-cluster2 IDH wild type LGG patients in TCGA dataset. (g) Box plots showed the stromal score, immune score and tumor purity in sub-cluster1 and sub-cluster2 IDH wild type LGG patients in CGGA dataset.  www.nature.com/scientificreports/ and cytosolic DNA sensing signaling pathway, while, negatively correlated with long term potentiation and calcium signaling pathway in TCGA and CGGA datasets (Fig. 2c). Also, transcription factors associated with the transcriptional signatures of sub-cluster1 and sub-cluster2 IDH wild type LGG patients in TCGA and CGGA datasets were identified. IDH wild type LGG patients in sub-cluster1 were positively correlated with IRF and AML1 (RUNX1) transcription factors, while, negatively correlated with MYOD and RORA1 transcription factors in CGGA datasets (Fig. 2d). However, the enrichment of IRF and RUNX1 transcription factors in sub-cluster1 IDH wild type LGG patients was not significant in TCGA dataset (P = 0.07 and P = 0.23, respectively) (Fig. 2d).
Transcriptional characteristics of of IDH wild type LGG patients with different immune infiltrations. Genes associated with the immune alterations in IDH wild type LGG patients were also determined.
The normalized stromal score or immune score > 0 were defined as high stromal score or immune score. Based on the criterion of absolute fold change > 2 and P values < 0.001, 865 genes were associated with high stromal score, while, 1008 genes were associated with high immune score in TCGA dataset (Fig. 3a). In CGGA dataset, IDH wild type LGG patients with high stromal score were also with high immune score. We identified 1317 genes were associated with the immune alterations in CGGA dataset (Fig. 3a). 387 genes were differentially expressed in IDH wild type LGG patients in with different immune infiltrations in TCGA and CGGA datasets (Fig. 3a). Un-supervised clustering heatmaps showed that those genes divided the IDH wild type LGG patients into two distinct subgroups and most genes were up-regulated in IDH wild type LGG patients with higher immune infiltrations (Fig. 3b).
Furthermore, the KEGG signaling pathways and transcription factors associated with the immune alterations of IDH wild type LGG patients were determined. B cell receptor signaling pathway and T cell receptor signaling pathway were significantly positively enriched in IDH wild type LGG patients with high stromal score or immune score in TCGA and CGGA datasets (Fig. 3c). Moreover, RUNX1 and NF-κB transcription factors were significantly associated with the stromal score and immune score in IDH wild type LGG patients. IDH wild type LGG patients with higher stromal score or immune score were positively correlated with RUNX1 and NF-κB transcription factors in TCGA and CGGA datasets (Fig. 3c).
Transcription factor RUNX1 is up-regulated in sub-cluster1 and associated with the worse clinical outcomes of IDH wild type or IDH mutant LGG patients. RUNX1 was reported to be a prognostic factor in high grade glioma (GBM) 31 . However, the prognosis of RUNX1 in lower grade glioma (LGG), particularly in IDH wild type LGG was unknown. So, the expression levels and prognostic significance of RUNX1 in IDH wild type LGG were further tested. First, we found that, compared with sub-cluster2 IDH wild type LGG patients, RUNX1 was up-regulated in sub-cluster1 IDH wild type LGG patients in TCGA (P = 5E−07) and CGGA (P = 0.0004) datasets (Fig. 4a). Moreover, RUNX1 was up-regulated in IDH wild type LGG patients with higher stromal score, compared with IDH wild type LGG patients with lower stromal score in TCGA (P = 3.6E−05) and CGGA (P = 2.6E−06) datasets (Fig. 4b). Also, compared with IDH wild type LGG patients with lower immune score, the expression levels of RUNX1 were higher in IDH wild type LGG patients with higher immune score in TCGA (P = 0.0001) datasets (Fig. 4b).
Based on the mean expression value of RUNX1, IDH wild type LGG patients were divided into higher RUNX1 expression or lower RUNX1 expression subgroups. Corresponding to the higher expression levels of RUNX1 in sub-cluster1 IDH wild type LGG patients, higher RUNX1 expression was associated with the worse prognosis of IDH wild type LGG patients in TCGA dataset (P = 0.02) (Fig. 4c). However, in CGGA dataset, there was no significantly different clinical overall survival between RUNX1 highly or lowly expressed IDH wild type LGG patients (P = 0.42) (Fig. 4c). Interestingly, RUNX1 expression was also a prognostic factor in IDH mutant LGG patients. Compared with RUNX1 highly expressed IDH mutant LGG patients, RUNX1 lowly expressed IDH mutant LGG patients had more favorable clinical overall survival in TCGA (P = 0.046) and CGGA (P = 0.017) datasets (Fig. 4d).

RUNX1 expression is associated with IDH mutation and the worse clinical outcomes of LGG or glioma patients.
Moreover, we showed that RUNX1 expression was associated with IDH mutation. Compared with IDH wild type LGG patients, RUXN1 was down-regulated in IDH mutant LGG patients in TCGA (P = 2.8E−08) and CGGA (P = 2.4E−10) datasets (Fig. 4e). Moreover, RUNX1 expression levels were correlated with the LGG overall survival in both TCGA (P < 0.0001) and CGGA (P < 0.0001) datasets (Fig. 4f). Overall survival was increased in RUNX1 lowly expressed LGG patients, compared with RUNX1 highly expressed LGG patients (Fig. 4f).

RUNX1 target gene REXO2 is up-regulated in sub-cluster1 and associated with the worse clinical outcomes of IDH wild type or IDH mutant LGG patients. As a transcription factor, RUNX1
regulates the expression levels of multiple target genes 32,33 . The regulatory impact of RUNX1 was determined using single sample gene set enrichment analysis (ssGSEA) in R "GSVA" package. We found that IDH wild type LGG patients with lower regulatory impact of RUNX1 had increased overall survival than IDH wild type LGG www.nature.com/scientificreports/ patients with higher regulatory impact of RUNX1 in TCGA dataset (P = 0.048) (Fig. 5a). However, the regulatory impact of RUNX1 was not associated with the overall survival of IDH wild type LGG patients in CGGA dataset (P = 0.09) (Fig. 5a). Furthermore, the regulatory impact of RUNX1 was correlated with the LGG overall survival in both TCGA (P = 8E−04) and CGGA (P = 0.0068) datasets (Fig. 5b).
Using the GSEA gene sets, we identified 42 RUNX1 target genes. Un-supervised clustering heatmaps demonstrated that most of the RUNX1 target genes were up-regulated in sub-cluster1 IDH wild type LGG patients in TCGA dataset (Fig. 5c). Furthermore, the prognostic effects of RUNX1 target genes were determined using univariate cox regression analysis. Six genes REXO2, FCER1G, ID3, LTBP1, SASH3 and TNFRSF12A were associated with the prognosis of IDH wild type LGG patients in TCGA and CGGA datasets (Fig. 5d). REXO2 was most correlated with the overall survival of IDH wild type LGG patients in TCGA dataset (P = 0.0001) and was the seventh gene most correlated with the overall survival of IDH wild type LGG patients in CGGA dataset (P = 0.017) (Fig. 5d). So, we focused on REXO2 for our next studies.
REXO2 is a RNA binding protein, and the prognostic effects of REXO2 in LGG were unknown. Here, we showed that, like RUNX1, REXO2 was up-regulated in sub-cluster1 IDH wild type LGG patients in TCGA (P = 1E−05) and CGGA (P = 6.7E−06) datasets (Fig. 6a). Also, REXO2 was up-regulated in IDH wild type LGG patients with higher stromal score or immune score, compared with IDH wild type LGG patients with lower stromal score or immune score in TCGA and CGGA datasets (Fig. 6b). Moreover, higher REXO2 expression was associated with the worse prognosis of IDH wild type LGG patients in TCGA (P = 0.0017) and CGGA (P = 0.0296) datasets (Fig. 6c). However, unlike RUNX1, REXO2 had no prognostic effects in IDH mutant LGG patients in TCGA dataset (P = 0.09) (Fig. 6d). On the contrary, in CGGA dataset, higher REXO2 expression was associated with the worse prognosis of IDH mutant LGG patients (P = 4E−04) (Fig. 6d).
REXO2 expression is associated with IDH mutation and the worse clinical outcomes of LGG or glioma patients. REXO2 expression was also associated with IDH mutation. Compared with IDH wild type LGG patients, REXO2 was down-regulated in IDH mutant LGG patients in TCGA (P = 3.3E−05) and CGGA (P = 2.1E−11) datasets (Fig. 6e). Moreover, REXO2 expression levels were correlated with the overall survival of LGG in both TCGA (P < 0.0001) and CGGA (P < 0.0001) datasets (Fig. 6f). Overall survival was increased in REXO2 lowly expressed LGG patients, compared with REXO2 highly expressed LGG patients (Fig. 6f).
RUNX1 and REXO2 hyper-methylation are associated with the favorable clinical outcomes of IDH wild type LGG patients. We had shown the different genetic alterations, clinical outcomes and transcriptional features between sub-cluster1 and sub-cluster2 IDH wild type LGG patients. Moreover, the DNA methylation profiling in IDH wild type LGG patients in sub-cluster1 or sub-cluster2 was also significantly different. Based on the criterion of the changes of beta values > 0.1 and P values < 0.001, 214 genes were hypermethylated in sub-cluster2 IDH wild type LGG patients, while, only 55 genes were hyper-methylated in sub-cluster1 IDH wild type LGG patients (Fig. 7a). However, the methylation levels of RUNX1 and REXO2 were not significantly different in sub-cluster1 and sub-cluster2 IDH wild type LGG patients (P = 0.42 and P = 0.67 respectively) (Fig. 7b).
Next, we determined the prognostic effects of RUNX1 and REXO2 methylation levels. Based on the mean methylation values of RUNX1, IDH wild type LGG patients were divided into hypo-methylation or hypermethylation subgroups. Kaplan-Meier Plotters demonstrated the clinical outcomes of IDH wild type LGG with RUNX1 hypo-methylation or with RUNX1 hyper-methylation. RUNX1 hyper-methylated IDH wild type LGG patients had lower overall survival than RUNX1 hypo-methylated IDH wild type LGG patient in CGGA dataset (P = 0.024), but not in TCGA dataset (P = 0.26) (Fig. 7c). And IDH wild type LGG patients with REXO2 hypomethylation had better clinical outcomes than LGG patients with REXO2 hyper-methylation in CGGA dataset (P-0.02) (Fig. 7d). However, in TCGA dataset, the overall survival of IDH wild type LGG patients with REXO2 hypo-methylation or with REXO2 hyper-methylation was not significantly different (P = 0.963) (Fig. 7d).

RUNX1 and REXO2 methylation are associated with IDH mutation and the better clinical outcomes of LGG or glioma patients. The 2HG accumulation in IDH mutant
LGG could increase extensive epigenetic DNA methylation 6 and maintain the CpG island methylator phenotype (CIMP) 34 . We showed that RUNX1 and REXO2 methylation levels were associated with IDH mutation. Compared with IDH wild type LGG patients, RUNX1 and REXO2 were hyper-methylated in IDH mutant LGG patients in TCGA dataset (P = 1.1E−46 and P = 1.5E−13, respectively) (Fig. 7e). Moreover, LGG patients with RUNX1 or REXO2 hypomethylation had better clinical outcomes than LGG patients with RUNX1 or REXO2 hyper-methylation in TCGA dataset (P < 0.0001) (Fig. 7f).
Furthermore, compared with grade II LGG patients, RUNX1 and REXO2 were hypo-methylated in grade III LGG patients in TCGA dataset (Fig. 7g). RUNX1 and REXO2 were further hypo-methylated in grade IV glioma (GBM) patients in TCGA dataset (Fig. 7g). Glioma patients with RUNX1 or REXO2 hypo-methylation had worse clinical outcomes than LGG patients with RUNX1 or REXO2 hyper-methylation in TCGA dataset (Fig. 7h). We then used multivariate cox regression survival analysis to detect the associations of the prognostic factors. The forest plots showed that age (P = 0.002), grade (P = 0.02) and REXO2 expression (P = 0.004) were independent prognostic markers in IDH wild type LGG patients in TCGA dataset (Fig. 8a). In CGGA dataset, tumor grade (P = 0.014) was an independent prognostic marker in IDH wild type LGG patients, while, age and REXO2 expression were not independent prognostic markers (P = 0.213 and P = 0.595, respectively) (Fig. 8b). Furthermore, in TCGA and CGGA dataset, IDH alteration was an independent prognostic marker in LGG patients (P < 0.001 and P = 0.021, respectively) (Fig. 8c,d). Interestingly, although REXO2 expression was associated with IDH alterations, REXO2 expression was a prognostic marker independent IDH alterations in TCGA and CGGA datasets (P = 0.015 and P = 0.005, respectively) (Fig. 8c,d). Those results suggested that REXO2 was a good prognostic maker to predict the clinical outcomes of LGG.

IDH wild type
LGG is a small subgroup of LGG. Only 20% LGG patients are without IDH mutations 2 . Because of the lack of enough samples, the inner heterogeneity within IDH wild type LGG subgroup is unclear. Using 96 IDH wild type LGG patients in TCGA and 35 IDH wild type LGG patients in CGGA dataset, we showed that IDH wild type LGG patients could divide into two sub-clusters by NMF assay. IDH wild type LGG patients in sub-cluster2 had prolonged overall survival and low frequency of CDKN2A alterations and low immune infiltrations. The prognostic relevance of CDKN2A loss was previously demonstrated in IDH mutant LGG 35,36 . IDH wild type LGG patients with higher immune alterations were associated with the worse prognosis. Those results were consistent with the classification of IDH wild type LGG patients by ConsensusClusterPlus method 28 , suggested that IDH wild type LGG was indeed a heterogeneous cohort, and could be further divided into two sub-clusters.
We further found that RUNX1 was correlated with the differentially expressed genes in sub-cluster1 IDH wild type LGG patients. Mutations of RUNX1 were occurred in nearly 10% acute myeloid leukemia (AML) patients [37][38][39] . Mutation of RUNX1 or high expression of RUNX1 in AML conferred the poor prognosis 40,41 . The expression of RUNX1 was also correlated with the clinical outcomes of lung adenocarcinomas 42 and triple negative breast cancer 43 . Here, we showed that, RUNX1 was associated with the high immune infiltrations of IDH wild type LGG, and RUNX1 was associated with the worse clinical outcomes of IDH wild type or IDH mutant LGG patients. Moreover, RUNX1 expression and methylation were correlated with IDH mutation and the clinical outcomes of LGG or glioma patients. All those results highlighted the prognostic significance of RUNX1 in LGG. However, mutation of RUNX1 was barely detected in LGG 2 . The functions of RUNX1 in IDH wild type LGG should be studied.
REXO2 is a RNA binding protein and is required for maintaining the integrity of mitochondria 44,45 . Previous reports showed that higher expression levels of REXO2 were associated with the poor prognosis of prostate cancer 46 . However, the expression and prognosis of REXO2 in other malignant disease, particular in glioma were unknown. Here we showed that REXO2 was up-regulated in IDH wild type LGG patients with higher stromal score or immune score. And REXO2 expression and methylation were associated with the clinical outcomes of IDH wild type LGG patients. Moreover, REXO2 expression and methylation were associated with IDH mutation and the clinical outcomes of LGG or glioma patients. Furthermore, REXO2 expression was a prognostic marker independent IDH alteration, age and tumor grade, suggested that REXO2 was a good prognostic maker to predict the clinical outcomes of LGG.
Overall, our analysis highlighted the heterogeneity within IDH wild type LGG subgroup and revealed new prognostic factor of RUNX1 and REXO2 in IDH wild type LGG. However, the results were derived from published datasets and lack of further validations. Therefore, the expression and prognosis of RUNX1 and REXO2 in large cohorts of IDH wild type LGG should be studied. Also further studies of the functions of immune alterations in IDH wild type LGG are needed. . Transcription factor RUNX1 is up-regulated in sub-cluster1 and associated with the worse clinical outcomes of IDH wild type or IDH mutant LGG or glioma patients. (a) Box plots showed the expression levels of RUNX1 in sub-cluster1 and sub-cluster2 IDH wild type LGG patients in TCGA and CGGA datasets. P values were determined by two tails paired Student's t test. (b) Box plots showed the RUNX1 expression levels in TCGA and CGGA IDH wild type LGG patients with higher stromal or immune score (red) or with lower stromal or immune score (blue). (c) The Kaplan-Meier Plotters showed the associations between RUNX1 expression and overall survival in IDH wild type LGG patients in TCGA and CGGA datasets. P values showed the different overall survival between RUNX1 highly expressed IDH wild type LGG patients (red) and RUNX1 lowly expressed IDH wild type LGG patients (black). (d) The Kaplan-Meier Plotters showed the associations between RUNX1 expression and overall survival in IDH mutant LGG patients in TCGA and CGGA datasets.