Long-term epilepsy-associated tumors: transcriptional signatures reflect clinical course

Long-term epilepsy-associated tumors (LEATs) represent mostly benign brain tumors associated with drug-resistant epilepsy. The aim of the study was to investigate the specific transcriptional signatures of those tumors and characterize their underlying oncogenic drivers. A cluster analysis of 65 transcriptome profiles from three independent datasets resulted in four distinct transcriptional subgroups. The first subgroup revealed transcriptional activation of STAT3 and TGF-signaling pathways and contained predominantly dysembryoplastic neuroepithelial tumors (DNTs). The second subgroup was characterized by alterations in the MAPK-pathway and up-stream cascades including FGFR and EGFR-mediated signaling. This tumor cluster exclusively contained neoplasms with somatic BRAFV600E mutations and abundance of gangliogliomas (GGs) with a significantly higher recurrence rate (42%). This finding was validated by examining recurrent tumors from the local database exhibiting BRAFV600E in 90% of the cases. The third cluster included younger patients with neuropathologically diagnosed GGs and abundance of the NOTCH- and mTOR-signaling pathways. The transcript signature of the fourth cluster (including both DNTs and GGs) was related to impaired neural function. Our analysis suggests distinct oncological pathomechanisms in long-term epilepsy-associated tumors. Transcriptional activation of MAPK-pathway and BRAFV600E mutation are associated with an increased risk for tumor recurrence and malignant progression, therefore the treatment of these tumors should integrate both epileptological and oncological aspects.


Results
clustering revealed distinct transcriptional architecture of long-term epilepsy associated tumors. First, we explored the transcriptional landscape of long-term epilepsy-associated tumors by merging three independent datasets (Fig. 1a). Expression profiles were normalized and corrected from batch effects. The optimal number of clusters was determined by using a partitioning around medoids (PAM) unsupervised clustering and achieved the highest average silhouette width at four clusters (C1-C4) (Fig. 1b). Samples with negative silhouette width were removed to optimize cluster accuracy and subgroup discrimination ( Supplementary Fig. 1S). The distribution of all datasets was well balanced across all clusters (Fig. 1c). Validation was performed by tSNE and consensus cluster, confirming the optimal number of 4 clusters ( Fig. 1d and Supplementary Fig. 2S). All identified subgroups showed distinct signature genes and characteristic transcriptional architecture with activation of oncogenic pathways (Fig. 1e,f,h and Supplementary Fig. 2S), while their histopathological classification remained only partially preserved within the clusters (Fig. 1g). In order to exclude a potential co-clustering by low tumor content and being able to compare the specific expressional differences of the clusters to wild type, we have performed an additional expression analysis of normal brain samples (Atlas of Human Brian, http://human.brain-map.org) ( Supplementary Fig. 4S).
The first cluster ("DNT-like" cluster, C1) contained only samples (n = 11), which were histologically diagnosed as DNTs (Fig. 1g). The cluster showed exclusive expression of IRF1 (Interferon Regulatory Factor 1) and IL10RA (Interleukin 10 Receptor Subunit Alpha), which participate in immune regulating pathways and were independently up-regulated in this subgroup (Fig. 1e,f). Gene Set Enrichment Analysis (GSEA) confirmed these findings by identifying strong enrichment of immune response pathways, which included STAT3 activation and TGF-beta signaling (Fig. 1h). A BRAF mutation was not observed in this cluster and the patients from this cluster showed low recurrence rate (n = 1; p < 0.05).
The second cluster contained 18 samples with histologically diagnosed GGs or PXAs, distributed within different age groups (Fig. 1g). In contrast to all other clusters, the patients from this cluster showed a significantly higher recurrence rate of 42% (Fig. 1g). BRAF mutation or gain of BRAF was observed in 71%, which was a genetic hallmark of this subgroup (p < 0.001) (Fig. 1g). Therefore this group was designated as "BRAF-GG" (C2). CDH18 (cadherin 18) was one of the hallmark genes of this cluster. CDH18 is part of the cadherin family, which are known to play a crucial role in the oncogenesis by activation of EGFR and MAPK-pathways 22 . In line with these findings and the detected BRAF mutational status, we found an increased pathway signaling of the MAPK and up-stream pathways such as FGFR and EGFR ( Fig. 1h and Supplementary Fig. 2S).
The third cluster ("Juvenile-GG" cluster, C3) included predominantly younger patients (n = 13, of whom 10 patients were younger than 18 years) with mainly histologically diagnosed GGs (Fig. 1g). RAP1B (Ras-related protein Rap-1b) and PSMA2 (Proteosome subunit, alpha type 2) were identified as exclusively up-regulated in "Juvenile-GG", underlying the dominant oncogenic driver of this subgroup (Fig. 1e,f). The RAP 1 gene has been showed to play a crucial role in Notch activation and cell adhesion 23 . In line with these findings gene set enrichment analysis revealed an up-regulation of the NOTCH-and focal adhesion pathways ( Fig. 1h and Supplementary  Fig. 2S). One patient (7.5%) with identified BRAF mutation showed tumor recurrence during the course of the disease (p > 0.05) (Fig. 1g).
The fourth cluster contained a mixed population of histologically diagnosed DNTs and GGs samples without BRAF mutations or cases of tumor recurrence (no other specified, NOS cluster, C4). We identified up-regulated PLXNB1 (Plexin B1) and MBP (myelin basic protein) (Fig. 1e,f). Here, gene set enrichment analysis showed an activation of the GABA-signaling and the nicotine-pathway (Fig. 1h) without association with any strong oncogenic driver.
Patients' characteristics, histopathological features and clinical implication. The clinical and molecular characteristics of all patients from the local database (n = 18) included into the transcriptional analysis ("Dataset Freiburg") are presented on Table 1. The neuropathological examination revealed 6 GGs, 10 DNTs and 2 PXAs. All tumors were negative for IDH-mutation and classified as WHO grade I (except for both PXAs, which were classified as WHO grade II and III, respectively). The majority of the tumors (n = 15) were localized in the www.nature.com/scientificreports www.nature.com/scientificreports/ temporal lobe. The mean age at surgery was 28 years (+/−14.6) and 16 patients (88%) were seizure free after resection. Tumor recurrence was seen in four patients (one patient from "DNT-like" cluster 1 and three patients from "BRAF-GG" cluster 3) ( Table 1). Red indicates up-regulated gene expression, blue downregulated genes, respectively. (f) Violin plot of distinct signature genes characteristic for each cluster. (g) Clinical information including histology, age, BRAF-mutation status and oncological course (progressive disease PD, or stable disease SD) (h) Gene Set Enrichment Analysis (GSEA) of each cluster group was performed and illustrated by bar-plots. P-values are determined GSEA and adjusted by False-Discovery Rate for multiple testing. Data is given as mean ± standard deviation; *p < 0.05, **p < 0.01, ***p < 0.001.
Since the transcriptional analysis revealed that the second tumor cluster (BRAF-GG) carries an increased risk for tumor recurrence (42%) we aimed at validating these results by investigating the BRAF status in patients with recurrent gangliogliomas (and without transcriptional data) from the local database. We were able to identify 14 patients with recurrent gangliogliomas (Fig. 2a). All patients have shown radiological tumor mass increase or recurrence of already resected tumors. The MRI visualization of these patients revealed local or diffuse contrast enhancement, mass effects and vast perifocal edema, which is commonly associated with high-grade glioma (Fig. 2b). Histologically proven malignant progression (from WHO grade I to WHO grade II or III) was observed in 6 patients. Three patients had to be excluded because of insufficient tissue quality. Clinical data of the remaining 11 patients (GG-Group2) were used to build matched pairs with patients without tumor recurrence (GG-Group1) (Fig. 2a). Both groups were evaluated for BRAF V600E mutation, p-S6K, p-MAPK, PTEN and CD34 alterations by immunohistochemistry (IHC, Fig. 2c,d). Protein targets were chosen based on our primary analysis, suggesting that activation of MAPK pathway was associated with poor oncological prognosis and early tumor progress. Since not only BRAF mutations but also other oncogenic drivers lead to activation of MAPK and AKT-mTOR pathways, our goal was to examine to what extent MAPK or AKT-mTOR pathway activation was associated with poor disease course. In GG-Group1, 9 of 10 patients (one patient was excluded because of invalid staining) showed wildtype BRAF status (Fig. 2e, Supplementary Fig. 3S). In contrast, 10 of 11 patients from the GG-Group2 (recurrent tumors) revealed BRAF-V600E mutations (Supplementary Fig. 3S) with consequent activation of the MAPK signaling and expression of S6K, implying activation of both MAPK and AKT-mTOR pathways (Fig. 2d,e). It is noteworthy that all four deceased patients from the local database died because of tumor recurrence and carried BRAF mutations. BRAF mutation showed an impact on the progression free survival (PFS) as well, sharply separating the Kaplan-Meier-curves of wildtype and mutated patients (Fig. 2f), although a statistical significance was not reached due to the low sample size.

Discussion
Although LEATs were initially described more than 15 years ago 2 , their molecular genetic background and underlying biology still remain unknown. Differences in the clinical and oncological disease course between pathologically similar tumors suggest that the histological appearance does not reflect their biological heterogeneity 6,13 and hint on a potential value of a further molecular-genetic characterization 24 . Recently, growing evidence has supported the hypothesis that various common oncogenic drivers are also contributing to the evolution of LEATs. Our results are consistent with these findings showing a pivotal role of EGFR/FGFR tyrosine kinase receptor and its downstream pathways RAS/RAF/MAPK and PI3K/AKT/mTOR in the pathogenesis of long-term epilepsy-associated tumors 9,25,26 . The importance of the oncological mechanisms participating in tumor pathogenesis is further emphasized by the fact that some glioneuronal tumors relapse and eventually become malignant 17 .
In this study, we demonstrate that LEATs can express distinct oncogenic drivers and are characterized by a transcriptional profile, which only partially corresponds to their histopathological appearance. The first subgroup ("DNT-like", C1) included tumors, which were histologically diagnosed as DNTs. The oncogenic pathways of STAT3 and TGF-signaling were characteristic for this group. In line with the results of Stone and colleagues 13  www.nature.com/scientificreports www.nature.com/scientificreports/ showed that BRAF mutation in DNTs was only of minor importance compared to GGs. Both the neuronal expression profile and the low recurrence rate implied the benign character of these tumors. In contrast, the second cluster ("BRAF-GG", C2) included tumors histologically diagnosed as GGs or PXAs. BRAF alternations frequently appeared in this subgroup and were associated with poor oncological outcome and tumor recurrence in almost half of the patients. Our data suggest, that MAP/ERK, EGFR and FGFR signaling contributed to the higher recurrence rate and malignant transformation. Similar findings are observed in other malignant brain tumors as well 27 reflecting the aggressive and proliferative nature of this subgroup. Of note, due to their similar histopathological appearance some of the PXAs could have been misdiagnosed as GGs implying the necessity for integration of DNA methylation-based classification as proposed by Capper et al. 24 In order to validate these findings, we identified 10 patients with recurrent GGs from the local LEAT database. 90% of these patients revealed BRAF mutation along with increased MAPK signaling, confirming the transcriptional-classification findings. Additionally, we observed that PTEN expression was lost in the majority of these patients (n = 7/10, 70%) causing additional activation of the AKT-mTOR pathway marked by the increased frequency of p-S6K positive cells. The activation of both MAPK and AKT-mTOR pathways in BRAF mutated tumors was reported by other groups as well 26 . Kaplan-Meier analysis, stratifying for the BRAF mutation, revealed shorter PFS in the BRAF mutated tumors without reaching statistical significance due to the low sample number. Certainly, the reason for this is a low power resulting in a classic ß-error, which represents a limitation of the study. Interestingly, Stone et al. pointed out that BRAF mutation is common for GGs, while FGFR1 mutations mainly occur in DNTs. Similar findings were investigated by other authors as well 10,28 . While we also showed that BRAF mutations are predominantly observed in histologically diagnosed ganglioglioma, we could not show an activation of the FGFR pathway in the "DNT-like" cluster. Possible explanations for this discrepancy could be differences in the examined population (adult vs. pediatric population) or in the histological classification, which may not be reliable in terms of glioneuronal pathologies. Of note, the third cluster ("Juvenile-GG") revealed activation of the mTOR pathway without the presence of BRAF mutations. This implies an alternative activation of the AKT-mTOR pathway either through activating mutations or loss of function in PTEN.
The last group of LEATs included almost equally GGs and DNTs without BRAF mutation or tumors recurrences. The GSEA revealed unspecific pathway signaling including activation of neurotransmitter pathways (GABA release) and oligodengroglial gene expression, which may correspond to the FGFR1 subgroup reported by Stone et al. 13 . The activation of GABA related pathways has been shown to be associated with increased epileptogenic activity by changes in chloride homeostasis, switching the GABAergic signaling from hyperpolarizing towards depolarizing 29 .
Our study has some limitations as well. Both GSEA/GSVA analysis are based on in-silico calculations, which are less robust in terms of pathway enrichments. Therefore, we have performed a further validation of the oncologically important group ("BRAF-GG") based on immunohistochemistry (IHC). However, IHC can show some flaws particularly in cases when mutations are presented with low allele frequency highlighting by this the necessity of direct sequencing in such cases.

conclusion
Our transcriptional analysis map various molecular alterations involved in the development and oncological characterization of long-term epilepsy-associated tumors. Finally, we would like to discuss our findings in regard of their clinical relevance. The oncological aspects in the identified subgroups with neuronal and oligodendroglial differentiation including STAT3/TGF or neurotransmitter activation (Fig. 3) appear to be of minor importance. Consequently, surgical interventions should focus on treatment of drug-resistant epilepsy. In contrast, patients with LEATs demonstrating astrocytic differentiation, BRAF mutation as well as activation of MAPK/FGFR/EGFR oncogene pathways are at higher risk of tumor recurrence and malignant progression (Fig. 3). The treatment of these tumors should be performed from oncological point of view, in which gross total resection of the tumor followed by narrow clinical observation plays a pivotal role. In relapsed tumors involving functionally eloquent brain areas that restrict surgical resection, targeted therapy with BRAF inhibitors may be a feasible and promising treatment option.

Methods and Materials
Contact for reagent and resource sharing.  Semi-Supervised cluster analysis. First, to assess clustering robustness, we used two different algorithms to validate the optimal number of clusters. All patients were analyzed by the ConsensusClusterPlus 32 . The optimal number of clusters was defined by the highest area under the CDF (Consensus Cumulative Distribution Function) curve. In a second analysis, a 'partitioning around medoids (PAM) cluster analysis was performed on the 1000 most variable genes to identify robust metabolic clusters ('PAM' function of the 'cluster' package in R 33 ).
To determine the optimal number of clusters, mean silhouette widths were computed (clustering repeated from k = 2 to k = 10) and confirmed an optimal number of four clusters, which was chosen for downstream analysis. In order to identify the "core" patients of each cluster, we further excluded samples with negative silhouette widths, retaining 53 patients.

Differential gene expression of long-term epilepsy-associated tumors transcriptional subgroups.
We aimed to identify the "core" genes of each cluster by a large-scale regression model (Supplementary Table 1S). The model aligns all genes based on their affiliation-likelihood within all four subgroups. The resulted score and its negative logarithm of the FDR was visualized in a scatterplot, in which different axis presented an individual subgroup. Based on the regression model, subclass scores were used to identify subgroup specific pathway activation.
Gene set enrichment analysis. A permutation-based pre-ranked GSEA was applied to each module to verify its biological functions and pathways 34 . We compute genes that survive the thresholding, from the nearest shrunken centroid classifier. The average rank of the gene in the cross-validation folds for genes surviving at the given threshold was used for GSEA as described in the AutoPipe package. The predefined gene sets of the Molecular Signature Database v5.1were taken 35 . Enrichment score was calculated by the rank order of gene computed by above described classifier score of each subgroup. For significant enrichment, p-values were adjusted by FDR. Gene Set Variation Analysis (GSVA) was performed with the GSVA package implemented in R-software 36 . The analysis based on a non-parametric unsupervised approach, which transformed a classic gene matrix (gene-by-sample) into a gene set by sample matrix resulted in an enrichment score for each sample and pathway.
Immunostaining. Tissue samples were fixed using 4% phosphate buffered formaldehyde and paraffin-embedded according to standard procedures. H&E staining was performed on 4 µm paraffin sections using standard protocols. Immunohistochemistry was performed on 3 µm paraffin-embedded tissue sections after deparaffinization and heat-induced epitope retrieval in citrate buffer by using the SignalStain Kit by Cell Signaling according to the manufacturer's instructions. As primary antibody, anti-BRAF-V600 antibody (1:500, SAB5600047 SIGMA) ( Supplementary Fig. 3S), anti-S6K (1:250 Anti-P70 S6 Kinase beta antibody, ab70963), anti-CD34 (1:600, Anti-CD34 antibody [EP373Y], ab81289), anti-p38 MAPK (1:1000, Anti-p38 antibody, ab197348) and anti-PTEN (1:1000, Anti-PTEN antibody, ab31392) was applied to the tissue and incubated overnight at 4 °C. The next day, after application of SignalStain ® Boost IHC Reagent followed by SignalStain ® DAB, counterstaining with Meyer's haemalaun solution was performed. The samples were then mounted and analyzed with an Olympus microscope. Positive cells were counted (by ImageJ) in 6 high-fields (40x magnification) per slide and compared to the total number of cells in each field. From this data, the mean percentage of positive cells was calculated.