GlioPredictor: a deep learning model for identification of high-risk adult IDH-mutant glioma towards adjuvant treatment planning

Identification of isocitrate dehydrogenase (IDH)-mutant glioma patients at high risk of early progression is critical for radiotherapy treatment planning. Currently, tools to stratify risk of early progression are lacking. We sought to use machine learning technology to identify a combination of molecular markers that could identify patients who may have a greater need for adjuvant radiation therapy. 507 WHO Grade 2 and 3 glioma cases from The Cancer Genome Atlas, and 1309 cases from AACR GENIE v13.0 datasets were studied for genetic disparities between IDH1-wildtype and IDH1-mutant cohorts, and between different age groups. Genetic features such as mutations and copy number variations (CNVs) correlated with IDH1 mutation status were selected as potential inputs to train artificial neural networks (ANNs) to predict IDH1 mutation status. Grade 2 and 3 glioma cases from the Memorial Sloan Kettering dataset (n = 404) and Grade 3 glioma cases with subtotal resection (STR) from Northwestern University (NU) (n = 21) were used as independent datasets to further evaluate the best performing ANN model. IDH1 mutation is associated with decreased CNVs of EGFR (21% vs. 3%), CDKN2A (20% vs. 6%), PTEN (14% vs. 1.7%), and increased percentage of mutations for TP53 (15% vs. 63%), and ATRX (10% vs. 54%), which were all statistically significant (p < 0.001). Age > 40 was unable to identify high-risk IDH1-mutant patients with early progression. A glioma early progression risk prediction (GlioPredictor) score generated from the best performing ANN model (6/6/6/6/2/1) with 6 inputs, including CNVs of EGFR, PTEN and CDKN2A, mutation status of TP53 and ATRX, and patient's age, can predict IDH1 mutation status with over 90% accuracy. The GlioPredictor score identified a subgroup of high-risk IDH1-mutant patients in TCGA and NU datasets with early disease progression (p = 0.0019 and 0.0238, respectively).
The GlioPredictor, which integrates age at diagnosis, CNVs of EGFR, CDKN2A, and PTEN, and mutation status of TP53 and ATRX, can identify a small cohort of IDH-mutant patients at high risk of early progression. The current version of GlioPredictor mainly incorporates genetic biomarkers that are often tested clinically. Considering the complexity of clinical and genetic features that correlate with glioma progression, future derivatives of GlioPredictor incorporating more inputs can be a potential supplement for adjuvant radiotherapy patient selection of IDH-mutant glioma patients.

Identification of IDH-mutant glioma patients at high risk of early progression is critical for personalized radiotherapy treatment planning.
According to the 2021 WHO central nervous system (CNS) classification system, the ATRX (alpha-thalassemia/mental retardation, X-linked) retained and 1p/19q-codeleted group defines a WHO Grade 2 or Grade 3 1p/19q-codeleted oligodendroglioma; ATRX loss with homozygous deletion of CDKN2A/B is sufficient to classify IDH-mutant glioma as WHO Grade 4, and those without CDKN2A/B deletion are WHO Grade 2 or 3 astrocytoma 6,7. Therefore, multiple possible WHO Grades can be designated within a biomarker-defined diagnostic entity, representing a major departure from prior histology-based CNS tumor classifications and highlighting the importance of molecular biomarkers in guiding glioma treatment 6. Molecular biomarkers currently used for IDH-mutant glioma classification have complex interrelationships, and multiple other molecular biomarkers are emerging as potential new candidates. Therefore, a systematic selection and integration of candidate biomarkers for risk assessment of IDH-mutant glioma is warranted. To this end, the objective of this study is to train and validate a supervised machine-learning (ML) based algorithm to identify IDH-mutant glioma patients at high risk of early progression.
Supervised ML is now widely used in the medical field to produce models and classifiers from training data for automation of tasks. An artificial neural network (ANN) is a subtype of ML technology that can analyze large datasets as inputs and make predictions with the probability of accuracy as outputs 8,9. An ANN with two or more hidden layers is often called a deep neural network (DNN), which is particularly robust in making predictions for complex situations 10,11. Basic requirements for supervised DNN training include identification of relevant inputs with reduced dimensionality and redundancy, as well as a set of accurately labeled training data as output values 12. Due to the longevity of IDH-mutant patients and the lack of accurate long-term follow-up data, disease progression information in large public datasets is often censored. We therefore selected genetic and clinical features that have no direct causal relation with IDH status and trained a model on them to identify IDH-mutant glioma patients whose genetic background is similar to that of IDH-wildtype patients.

Patient selection
Training and validation of artificial neural networks (ANNs) were carried out using WHO Grade 2 and Grade 3 cases from The Cancer Genome Atlas (TCGA) dataset. Copy number variations (CNVs) of genes such as PTEN, EGFR, and CDKN2A, mutation status of genes such as IDH1, TP53, and ATRX, clinical data including age, gender, progression-free interval (PFI), and overall survival (OS) days, as well as histological classifications of the TCGA cases were derived from the UCSC Xena platform (https://xenabrowser.net/) (Fig. 1). CNVs and RNA-Seq raw data were processed using GISTIC2.0 and Log2(norm_count + 1) algorithms, respectively. 1309 and 404 Grade 2 and Grade 3 glioma cases from AACR GENIE v13.0 and Memorial Sloan Kettering (MSK) datasets, respectively, were derived from the publicly accessible cBioPortal (https://www.cbioportal.org/). This retrospective study followed the STROBE reporting guideline for publicly available datasets, including the TCGA and MSK datasets included in cBioPortal. Data downloaded from the publicly available cBioPortal database do not require ethical approval. All patients whose samples were used in this analysis signed informed consent (https://docs.cbioportal.org/userguide/faq/). IDH1-mutant WHO Grade 3 cases with subtotal resection (STR) who received adjuvant concurrent chemoRT were derived from Northwestern University (NU) (n = 21). Patient data were accessed with the approval of the Institutional Review Board (Study number STU00213078, August 2020), and the study was performed in accordance with the 45 Code of Federal Regulations Part 46 (45 CFR 46), Protection of Human Subjects (https://irb.northwestern.edu/about/, irb@northwestern.edu). The workflow of datasets used for construction and validation of the model is illustrated in Fig. 2.

Genomic alterations and genetic mutations
Comparison and alignment of the most frequently altered chromosome cytobands in WHO Grade 2 and Grade 3 cases from the TCGA dataset (n = 516) were conducted in Firebrowse (http://firebrowse.org/). Cases were first aligned by patients' age at glioma diagnosis. Corresponding cases with mutated genes were indicated, and types of mutations, including frameshift, splice site, missense, inframe, and synonymous mutations, were color-coded. The most frequently mutated genes were listed. Copy number gain and loss were also listed based on the frequency of alterations. The prevalence of genetic mutations and CNVs in WHO Grade 2 and Grade 3 patients was studied in different age groups (18-40, 40-60, > 60) for the MSK dataset (n = 279). Patients younger than 18 at diagnosis were included in the 18-40 subgroup. Genetic markers from 1309 glioma patients of the AACR GENIE v13.0 dataset were subgrouped into IDH1-mutant (IDH1_MT) and IDH1-wildtype (IDH1_WT) and aligned based on CNV status of EGFR, CDKN2A, and PTEN, as well as mutation status of TP53 and ATRX.

Data preprocessing
Input data preprocessing was carried out in Jupyter Notebook using the Python programming language. The TCGA cases that had missing data on any input parameter were dropped. In the binary output, '0' stands for IDH1 mutated and '1' stands for IDH1_WT. Genetic inputs with missense mutations or truncating mutations, including nonsense, frameshift, nonstart, nonstop, and splice mutations, were considered positive and were assigned '0'; wildtype inputs were assigned '1'. Cases were randomly assigned to the training set, and 30% were assigned to the validation set. Inputs were selected and tested based on their variation prevalence in glioma (> 20%), and features with a correlation coefficient of > 0.2 or < −0.2 were considered to be positively or negatively correlated.
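The preprocessing steps above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' notebook: the field names (EGFR_cnv, TP53_mut, and so on), the random seed, and the helper functions are hypothetical and chosen only for illustration. The sketch drops incomplete cases, applies the 0/1 encoding, holds out 30% of cases for validation, and keeps features whose Pearson correlation with the target exceeds 0.2 in magnitude.

```python
import math
import random

# Hypothetical field names; the source does not list its exact column names.
FEATURES = ["EGFR_cnv", "PTEN_cnv", "CDKN2A_cnv", "TP53_mut", "ATRX_mut", "age"]
TARGET = "IDH1"


def drop_incomplete(cases):
    """Keep only cases that have a value for every input and for the target."""
    return [c for c in cases if all(c.get(k) is not None for k in FEATURES + [TARGET])]


def encode_status(is_altered):
    """Encoding described in the text: altered/mutated -> 0, wildtype -> 1."""
    return 0 if is_altered else 1


def split_cases(cases, validation_fraction=0.3, seed=0):
    """Randomly hold out a fraction of cases (30% in the text) for validation."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def pearson(xs, ys):
    """Pairwise Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlated_features(cases, threshold=0.2):
    """Return {feature: r} for features with |r| above the cutoff against the target."""
    target = [c[TARGET] for c in cases]
    kept = {}
    for name in FEATURES:
        r = pearson([c[name] for c in cases], target)
        if abs(r) > threshold:
            kept[name] = r
    return kept
```

The fixed seed only makes the illustrative split reproducible; the source does not state whether or how the random assignment was seeded.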

ANN model construction and performance assessment
The Sequential model was imported from the Keras Python library. Briefly, a Dense layer was used for each layer, with the relu activation function for all the hidden layers. Since this is a binary classification task, sigmoid was chosen as the activation function for the output layer and adam as the optimizer. The loss function was set to 'binary_crossentropy'. The 'early_stop' and accuracy functions were deployed to prevent overfitting and to evaluate the models' performance, respectively. Accuracy and loss for both the training set and the validation set were plotted for each epoch. Figure 3C is a schematic overview of the architecture of the ANN (6/6/6/6/1). The best performing ANN was named GlioPredictor, for prediction of glioma early progression, with weights and biases derived from Python and reconstructed in Microsoft Excel ®. A GlioPredictor score was calculated as 100 minus the integer part of the sigmoid activation value multiplied by one hundred: GlioPredictor Score = 100 − INT(100 × 1/(1 + e^(−x))).
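To make the score formula concrete, the following standard-library Python sketch mirrors the kind of spreadsheet reconstruction described above: a forward pass through fully connected layers with ReLU hidden activations, followed by the GlioPredictor score computed from the output unit's pre-activation value x. The function names and the layout of the weights are assumptions for illustration, and the trained weights and biases themselves are not given in the text, so none are reproduced here.

```python
import math


def relu(values):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, v) for v in values]


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + e^(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dense(inputs, weights, biases):
    """One fully connected layer; weights[j][i] links input i to unit j."""
    return [sum(w * a for w, a in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def output_logit(inputs, layers):
    """Pre-activation x of the single output unit.

    `layers` is a list of (weights, biases) pairs (hypothetical layout);
    every layer except the last is followed by ReLU.
    """
    activations = list(inputs)
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    out_weights, out_biases = layers[-1]
    return dense(activations, out_weights, out_biases)[0]


def gliopredictor_score(x):
    """100 - INT(100 * sigmoid(x)); INT rounds down, as in a spreadsheet."""
    return 100 - math.floor(100 * sigmoid(x))
```

Because the output label '1' denotes IDH1_WT, a sigmoid value close to 1 yields a score close to 0, which is consistent with a low score indicating a wildtype-like genetic background.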

Statistical analysis
Progression-free survival (PFS) analyses were carried out using GraphPad Prism version 8.0. Patients at risk at major time points were listed. Log-rank analysis was used to compare survival curves. Violin plot and one-way ANOVA analyses of GlioPredictor score for TCGA datasets were also carried out using GraphPad Prism. Python 3.9.0 was used for data analysis and model construction. Correlation analysis was performed using the 'corr()' command, which corresponds to pairwise Pearson analysis. ROC curve and AUC score analyses were conducted using the 'roc_curve' and 'roc_auc_score' functions from the sklearn package. Univariate and multivariate analyses were carried out using IBM ® SPSS ®. All statistical tests were 2-sided, and p-values smaller than 0.05 were considered statistically significant.
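For readers unfamiliar with the AUC metric, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher predicted value than a randomly chosen negative case, with ties counted as one half. This is a plain-Python illustration of what the metric measures, not a substitute for the sklearn functions used in the study; the helper names and the 0.5 classification threshold in `accuracy` are assumptions.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, probabilities, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(1 for y, yhat in zip(labels, predictions) if y == yhat)
    return correct / len(labels)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients but is slower than a rank-based implementation on large datasets.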

Different age at IDH-mutant glioma diagnosis reflects unique genetic features
Age > 40 years at diagnosis was the criterion adopted by multiple guidelines in risk stratification for glioma patients 13,14. We first tested potential genetic discrepancies of WHO Grade 2 and 3 diffuse glioma cases diagnosed at different ages. Progression-free survival (PFS) data were derived from the Memorial Sloan Kettering (MSK, n = 250) dataset. Patients were subgrouped based on age at disease diagnosis (18-40, 41-60, > 60) (Fig. 3). We found that 81% of patients aged 20-40 have an IDH1 mutation, compared to 31% of those aged > 60 (Fig. 3A). Younger glioma patients have significantly better PFS rates (Fig. 3C, p < 0.001). However, for IDH1-mutant (IDH1_MT) glioma cases (n = 182), age at disease diagnosis had no significant impact on PFS rates (Fig. 3D, p = 0.89). This finding also holds in the independent TCGA dataset (Supplementary Fig. 1). Genes with the most prevalent mutations in different age groups are presented for both IDH1_WT and IDH1_MT glioma (Fig. 3A,B). We found that, regardless of IDH1 mutation status, younger glioma patients have a significantly higher prevalence of TP53 or ATRX mutations and a lower prevalence of PIK3CA mutations (p < 0.05, Fig. 3A,B). These data indicate that age at glioma diagnosis reflects a unique genetic background and that age alone cannot predict progression of IDH_MT glioma.

GlioPredictor in prognosticating glioma treatment response
We further evaluated the potential of GlioPredictor in prognosticating adjuvant treatment response. In TCGA_IDH1_MT WHO Grade 2 and 3 glioma patients treated with adjuvant radiotherapy (w/RT, n = 211), no significant difference in PFS was observed between those with GlioPredictor < 50 (n = 11) vs. GlioPredictor ≥ 50 (n = 200, p = 0.1, Fig. 6C). For those without adjuvant RT (w/o RT, n = 147), we found statistically significantly worse PFS for the cohort with GlioPredictor < 50 (n = 14) vs. GlioPredictor ≥ 50 (n = 133, p = 0.029, Fig. 6C), indicating adjuvant RT is warranted in this molecularly high-risk glioma cohort. We then studied the potential of GlioPredictor in prognostication of glioma patients who histologically would warrant adjuvant treatment. IDH1_MT WHO Grade 3 cases with subtotal resection (STR) who had received adjuvant concurrent temozolomide and RT were derived from the Robert H. Lurie Comprehensive Cancer Center of Northwestern University (NU_STR). We found that 87% of NU_STR patients with a GlioPredictor < 50 had enhancing lesions on their initial post-surgical radiological imaging, vs. 52% of patients with a GlioPredictor ≥ 50, as shown in Fig. 6E. For PFS analysis, NU_STR patients with GlioPredictor < 50 (n = 7) have earlier disease progression than those with GlioPredictor ≥ 50 (n = 14, p = 0.0238) (Fig. 6D).
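The stratified PFS comparison above can be sketched in Python as follows. The study itself used GraphPad Prism for survival curves and log-rank testing; this block is only an illustrative equivalent of the first two steps: splitting patients at the GlioPredictor cutoff of 50 and computing a Kaplan-Meier estimate for one group. The `score` key and both function names are hypothetical, and no log-rank test is included.

```python
def split_by_score(patients, cutoff=50):
    """Split patients into (score < cutoff, score >= cutoff) groups.

    Each patient is a dict with a hypothetical "score" key.
    """
    low = [p for p in patients if p["score"] < cutoff]
    high = [p for p in patients if p["score"] >= cutoff]
    return low, high


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival_probability)].

    events[i] is 1 if progression was observed at times[i] and 0 if the
    patient was censored then. One point is returned per distinct event time.
    """
    ordered = sorted(zip(times, events))
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        progressed = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(ordered) and ordered[i][0] == t:
            progressed += ordered[i][1]
            leaving += 1
            i += 1
        if progressed:
            survival *= 1.0 - progressed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Patients censored at a given time are still counted as at risk for events at that same time, which is the usual convention for this estimator.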

Discussion
This study provided a functional deep neural network (DNN) that can identify high-risk IDH-mutant glioma patients and assist with prognostication for post-operative management. The model we built first predicted IDH1 mutation status in the TCGA dataset with 90% accuracy and an AUC score of 0.91 using 6 readily available genetic and clinical characteristics: TP53 and ATRX mutation status, CNVs for PTEN, EGFR, and CDKN2A, and age at diagnosis. We then used the trained model to generate the GlioPredictor score, with a lower score reflecting a genetic background similar to IDH1 wildtype. We then demonstrated that a low GlioPredictor score can identify a group of IDH1-mutant patients at higher risk of early progression. Therefore, GlioPredictor assessment is capable of integrating important molecular features and clinical information into a simplified risk stratification score. Clinical trial results of Radiation Therapy Oncology Group (RTOG) 9802 and the European Organization for Research and Treatment of Cancer (EORTC) 22033-26033 suggest adjuvant radiotherapy (RT), either alone or in combination with chemotherapy, for high-risk WHO Grade 2 glioma patients 13,14,19,20. Prior to the molecular biomarker-based WHO 2021 classification, high-risk WHO Grade 2 glioma patients were often defined as patients with age > 40 years or less than gross total resection, criteria adopted from the RTOG 9802 trial and recommended in the most recent NCCN guidelines 13,14. It is now known that risk assessment and corresponding treatment planning should incorporate a tumor's genetic features as critical decision-making factors. However, molecular biomarkers have complex biological implications and are often interrelated; as such, a method to systematically assess IDH-mutant glioma patients can add prognostic value.
The role of immediate adjuvant radiotherapy (RT) in IDH-mutant management is debatable, and concerns regarding RT-induced long-term neuropsychological side effects are not negligible 19,21,22. All the molecular markers evaluated in this study, i.e., TP53, ATRX, PTEN, EGFR, and CDKN2A, have been proposed as radiosensitivity biomarkers of glioma. The GlioPredictor model integrates these markers with clinicopathologic information to provide a tool to evaluate the role of radiation therapy in glioma patients.
While we believe our model has good efficacy and applicability, several drawbacks remain that await further study. First and foremost, the sample size and tumor characteristics are limited to the features reported in public datasets. If more samples were available to train the neural network, we believe the performance would be further improved. Secondly, treatment-related details were not available for datasets involved in model training, validation, and cross validation. Thirdly, prospective studies are required to demonstrate the clinical applicability of the model, especially since the definition of glioma progression was not clearly specified in those public datasets. Also, although we tested the trained model in several independent datasets, GlioPredictor was trained using the TCGA dataset alone and therefore, sample bias and tumor heterogeneity may compromise the clinical applicability of the model. Last but not least, multiple clinical parameters such as size of the tumor, extent and anatomical location of the tumor involvement, extent of resection, neurological deficits, histology subtypes, gender, history of seizures, treatment received, patient baseline performance, as well as other biomarkers were not incorporated in the current version of GlioPredictor. The parameters not included are critical for disease status evaluation and treatment recommendation, and can be potential confounding factors of the GlioPredictor score. Furthermore, the GlioPredictor model was not validated in paired recurrent tissues or paired brain MRI at progression. Utilization of GlioPredictor is not a replacement for known risk assessment criteria. Instead, it is intended to facilitate comprehensive molecular assessment of glioma when clinical decision-making is increasingly dependent upon a panel of seemingly unrelated biomarkers ranging from copy number variation to mutations.

Figure 1. Schematic overview of model training. Left panel, illustration of the GlioPredictor structure. The neural network construction starts with identification of proper features, followed by trial and error in refining the inputs and model hyperparameters. New features can be added if they can further improve the performance of the model.

Figure 4. Identification of molecular markers as potential inputs for neural network construction. (A) WHO Grade 2 and 3 glioma cases from the TCGA dataset were aligned based on age at diagnosis. Cases with copy number gain (top panel) or loss (lower panel) on cytobands that have the most frequent copy number variations (CNVs) were color-coded. Genes selected as inputs in our final neural network model are indicated. (B, C) 1309 WHO Grade 2 and 3 glioma patients of the AACR GENIE v13.0 dataset were grouped into IDH1 mutated (IDH1_MT) and IDH1 wildtype (IDH1_WT) cohorts, and aligned based on CNV status of EGFR, CDKN2A, and PTEN, as well as mutation status of TP53 and ATRX. NA, data not available; SCNAs, number of somatic copy number alterations.

Figure 5. Artificial neural network (ANN) feature selection, target identification and ANN construction. (A) Correlation study of features (CNVs of EGFR, CDKN2A, PTEN, mutation status of TP53, IDH1, ATRX, and age at diagnosis). (B) Evaluation of prediction accuracy for both the test dataset and train dataset. (C) Evaluation of loss function for both the test dataset and train dataset. (D) ROC curve analysis of the built neural network model. (E) Schematic overview of the ANN model (6/6/6/6/2/1). Features selected as inputs are indicated.

Table 1. Univariate and multivariate Cox regression analyses for progression-free survival (PFS) for LGG patients with age and genetic alterations as covariables.