Combining multi-site magnetic resonance imaging with machine learning predicts survival in pediatric brain tumors

Brain tumors represent the highest cause of mortality in the pediatric oncological population. Diagnosis is commonly performed with magnetic resonance imaging. Survival biomarkers are challenging to identify due to the relatively low numbers of individual tumor types. Sixty-nine children with biopsy-confirmed brain tumors were recruited into this study. All participants had perfusion and diffusion weighted imaging performed at diagnosis. Imaging data were processed using conventional methods, and a Bayesian survival analysis was performed. Unsupervised and supervised machine learning were performed with the survival features to determine novel sub-groups related to survival. Sub-group analysis was undertaken to understand differences in imaging features. Survival analysis showed that a combination of diffusion and perfusion imaging was able to determine two novel sub-groups of brain tumors with different survival characteristics (p < 0.01), which were subsequently classified with high accuracy (98%) by a neural network. Analysis of high-grade tumors showed a marked difference in survival (p = 0.029) between the two clusters with high risk and low risk imaging features. This study has developed a novel model of survival for pediatric brain tumors. Tumor perfusion plays a key role in determining survival and should be considered as a high priority for future imaging protocols.


Introduction
Brain tumours represent one of the most common causes of paediatric and adult oncological mortality. Particular challenges are faced in clinical paediatric oncology research due to the highly heterogeneous nature of paediatric tumours, combined with the relative rarity of the disease in the general population [1]. Despite this, multi-centre studies have allowed impressive advances to be made in the understanding of the major types of children's brain tumours and these are starting to change clinical practice [2,3]. The majority of studies have relied on analysis of tumour tissue; however, medical imaging is becoming increasingly able to probe tissue properties and has the advantage that measurements are made directly in vivo. This is particularly important for probing the tissue microenvironment since quantities such as perfusion cannot be readily determined in tissue samples. Imaging therefore has the potential to provide new biomarkers of prognosis which can be obtained early and throughout the patient journey.
Recently, an increased understanding of paediatric brain tumour biology has enabled more accurate prognostication for individual patients. The findings have largely been based on molecular genetic markers identified in tissue. For example, in medulloblastoma, biological subgrouping has shown that WNT subgroup tumours have an excellent prognosis whereas group 3 tumours and subsets of SHH tumours have an inferior outcome [4]. However, in even rarer tumours, such as atypical teratoid rhabdoid tumours (ATRT) or midline gliomas, where biopsy-derived tissue is challenging to acquire, studies are harder still to perform [5]. Therefore, biological studies have been more difficult to perform in meaningful numbers for many tumour types, and the small biopsies taken may not provide a representative view of the tumour, particularly its microenvironment.
Medical imaging is an important diagnostic aid for brain tumours, since it is non-invasive and can include the whole tumour and surrounding tissue. It is also capable of probing the tumour microenvironment in vivo, improving our understanding of the neovascularisation and cellularity of the tumour, as well as of the surrounding cerebral tissue, through perfusion and diffusion imaging, respectively [6,7]. However, as mentioned above for biological studies, recruiting large numbers of patients for imaging studies is challenging, and often requires large multi-centre trials to glean meaningful results. In spite of this, these non-invasive modalities represent highly attractive methods to derive crucial information surrounding the diagnosis and progression of tumours.
Diffusion imaging is available on every major commercial MRI scanner and is routinely used to assess brain tumours [8]. Apparent diffusion coefficient (ADC) maps represent the speed of water motion in the tissue, which correlates with cellularity. Perfusion imaging is often acquired with either dynamic susceptibility contrast (DSC) or arterial spin labelling (ASL) techniques [9,10]. DSC imaging is undertaken through the introduction of an exogenous contrast agent containing gadolinium; the passage of this bolus through the cerebral vasculature is rapidly imaged and post-processed to form quantitative cerebral blood volume and flow maps [11].
Studies have shown that diffusion and perfusion imaging are able to discriminate between paediatric tumour types in vivo, with high cellularity and perfusion in high-grade tumours, and vice versa for low-grade tumours [12,13]. This data, in turn, has informed survival analysis models using traditional methods such as Cox regression to derive significant covariates from imaging data [14]. In particular, ADC mean, elevated cerebral blood flow, and image-derived texture parameters have been found to be significant factors in long-term paediatric brain tumour survival [7,15,16].
In this study we have taken a novel approach to the understanding of risk and survival in paediatric brain tumours. We have combined a cohort of patients with multiple tumour types, including both common and rare tumours, and grades from multiple clinical centres.
All participants had both perfusion and diffusion imaging, and we have employed both supervised and unsupervised machine learning to determine key imaging-derived risk factors to further our understanding and prediction of survival in the paediatric neuro-oncological population.
The imaging protocol for all participants was performed at either 3 T or 1.5 T and included standard anatomical imaging (T1-weighted pre- and post-contrast and T2-weighted) as well as diffusion and dynamic susceptibility contrast imaging covering the tumour volume (imaging sequence details are given in supplementary Table 1). Additional clinical data (age at diagnosis and gender) were also collected for analysis.

Image post-processing and analysis
ADC maps were calculated from diffusion weighted imaging using a linear fit between the two b-value images in Matlab (The Mathworks, MA, 2018a). DSC time-course data were processed using conventional methods to provide uncorrected cerebral blood volume (uCBV) maps, with a leakage correction undertaken to produce corrected cerebral blood volume (cCBV) and K2 maps [17]. T2-weighted imaging and ADC maps were registered to the first DSC volume with SPM12 (UCL). Regions of interest segmenting the tumour volume were drawn on the T2-weighted imaging [18].
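The two-point ADC fit described above can be illustrated with a short sketch. This is not the authors' Matlab code: it is a minimal Python/NumPy version that assumes a mono-exponential signal decay, and the function name `adc_two_point`, the b-values (0 and 1000 s/mm^2) and the synthetic signal values are all illustrative assumptions, since the acquisition parameters are only given in supplementary Table 1.

```python
import numpy as np

def adc_two_point(s_low, s_high, b_low, b_high, eps=1e-6):
    """Voxel-wise ADC from two diffusion-weighted images (hypothetical helper).

    For a mono-exponential decay S(b) = S0 * exp(-b * ADC), the straight line
    through the two points (b, ln S) has slope -ADC.
    """
    s_low = np.asarray(s_low, dtype=float)
    s_high = np.asarray(s_high, dtype=float)
    # Clip to a small positive value so the logarithm is defined in background voxels.
    log_ratio = np.log(np.clip(s_low, eps, None)) - np.log(np.clip(s_high, eps, None))
    return log_ratio / (b_high - b_low)

# Synthetic check: build signals from a known ADC map and recover it.
# The b-values below are assumed for illustration only.
true_adc = np.array([[0.7e-3, 1.0e-3], [1.5e-3, 3.0e-3]])  # mm^2/s
s0 = 1000.0
b0_img = s0 * np.exp(-0.0 * true_adc)
b1000_img = s0 * np.exp(-1000.0 * true_adc)
adc_map = adc_two_point(b0_img, b1000_img, b_low=0.0, b_high=1000.0)
```

With only two b-values the "linear fit" reduces to the slope between two points, which is why no least-squares solver is needed in this sketch.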
Image analysis was performed in Matlab (2018b, The Mathworks, MA). The image mean, standard deviation, skewness, and kurtosis were calculated on a volume-by-volume basis for the ADC and uCBV/cCBV/K2 maps, for regions of interest and the whole brain, as previously described [18]. Tumour volume (cm³) was calculated from the T2 ROI masks drawn by S.W. [18]. Regions of interest were also drawn by J.G. in normal-appearing deep grey and white matter for each participant to calculate average diffusion and perfusion measures in normal-appearing tissue. Medulloblastoma Chang stage was derived from radiological reports.
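As an illustration of the first-order features listed above, the following Python/NumPy sketch computes the mean, standard deviation, skewness and kurtosis of the voxels inside a mask, together with the mask volume in cm³. It is an assumption-laden sketch rather than the authors' Matlab implementation: the function names, the toy image and the voxel size are hypothetical, and kurtosis is reported in its non-excess form (a normal distribution gives 3), which is assumed here to match the convention used in the study.

```python
import numpy as np

def first_order_stats(image, mask):
    """Mean, SD, skewness and kurtosis of voxels inside a binary mask."""
    values = np.asarray(image, dtype=float)[np.asarray(mask, dtype=bool)]
    mean = values.mean()
    sd = values.std(ddof=1)              # sample standard deviation
    centred = values - mean
    m2 = np.mean(centred ** 2)           # population central moments
    m3 = np.mean(centred ** 3)
    m4 = np.mean(centred ** 4)
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2              # non-excess kurtosis (assumed convention)
    return {"mean": mean, "sd": sd, "skewness": skewness, "kurtosis": kurtosis}

def roi_volume_cm3(mask, voxel_size_mm):
    """ROI volume in cm^3 from a binary mask and voxel dimensions in mm."""
    voxel_mm3 = float(np.prod(voxel_size_mm))
    return np.count_nonzero(mask) * voxel_mm3 / 1000.0

# Toy example: an 8-voxel ROI holding the values 1..8 (hypothetical data).
image = np.zeros((4, 4, 4))
mask = np.zeros((4, 4, 4), dtype=bool)
mask[1:3, 1:3, 1:3] = True
image[mask] = np.arange(1, 9)
roi_stats = first_order_stats(image, mask)
volume = roi_volume_cm3(mask, voxel_size_mm=(1.0, 1.0, 5.0))
```

The same function can be applied to a whole-brain mask to obtain the whole-brain counterparts of each ROI feature.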
Medulloblastoma histological type, subgroup, and MYC and MYCN amplification status were determined using protocols established at Newcastle University [19][20][21].
Medulloblastoma histology was centrally reviewed at the Royal Victoria Infirmary. Data are summarised in supplementary document 1.

Statistical analysis
All statistical analyses were performed in R (3.6.1) with significance defined at p<0.05, and Bonferroni correction for multiple comparisons used where appropriate.
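For readers unfamiliar with the Bonferroni correction mentioned above, a minimal sketch follows. It is written in Python rather than R purely for illustration, the function name and the example p-values are hypothetical, and it shows only the standard adjustment (each p-value multiplied by the number of comparisons and capped at 1).

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni-adjusted p-values and significance flags at level alpha."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p_adj < alpha for p_adj in adjusted]
    return adjusted, significant

# Hypothetical p-values from four comparisons.
adjusted, significant = bonferroni([0.001, 0.02, 0.04, 0.5])
```

In this toy case only the first comparison remains significant after correction.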
The data processing pipeline used in this study is summarised in Figure 1.

Univariate statistical analysis
Data normality was assessed using a Shapiro-Wilk test. Subsequently, differences in clinical and imaging features between high- and low-grade tumours were assessed using unpaired t-tests or Mann-Whitney U tests, where appropriate. Area under the ROC curve (AUC) values were calculated for each imaging feature for high/low grade discrimination.
Differences in high-/low-risk (defined below) participants were assessed using unpaired two-tailed t-tests or Mann-Whitney U tests, depending on data normality. After unsupervised clustering (described below), further Mann-Whitney U tests were performed to assess for differences in imaging features between low-grade tumours in low- and high-risk categories, and between alive high-grade tumours in low- and high-risk categories.
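The single-feature AUC used in this subsection is closely related to the Mann-Whitney U statistic: the AUC equals U divided by the product of the two group sizes. The sketch below illustrates that relationship in Python/NumPy. It is not the authors' R code, the function name is hypothetical, and the feature values are invented for illustration.

```python
import numpy as np

def auc_from_pairs(group_a, group_b):
    """AUC for 'group_a scores higher than group_b' via pairwise comparison.

    Counting the pairs in which a value from group_a exceeds a value from
    group_b (ties count one half) gives the Mann-Whitney U statistic;
    dividing by the number of pairs gives the AUC.
    """
    a = np.asarray(group_a, dtype=float)[:, None]
    b = np.asarray(group_b, dtype=float)[None, :]
    u_statistic = float(np.sum(a > b) + 0.5 * np.sum(a == b))
    return u_statistic / (a.size * b.size), u_statistic

# Hypothetical ADC means (x10^-3 mm^2/s) for high- and low-grade tumours.
high_grade = [0.7, 0.8, 0.9, 1.2]
low_grade = [1.0, 1.3, 1.5, 1.6]
auc, u_stat = auc_from_pairs(high_grade, low_grade)
# A feature that is lower in high-grade tumours gives AUC < 0.5,
# so the direction-agnostic discrimination is max(auc, 1 - auc).
discrimination = max(auc, 1.0 - auc)
```

Reporting `max(auc, 1 - auc)` is a design choice that makes features comparable regardless of whether they increase or decrease with grade.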

Survival and correlation analysis
Univariate Cox regression was performed for each individual imaging feature, clinical variable, and tumour grade to estimate survival hazard coefficients. Tumour grade and type were not used in the analysis detailed below.
Iterative Bayesian survival analysis was undertaken using the iterative BMAsurv package in R with 5-fold stratified cross-validation to determine the posterior probabilities and coefficients of the top 5 imaging features that best describe the survival data [22]. The iterative analysis included up to 15 data features in combination at any one time.

Unsupervised and supervised machine learning
K-means clustering was performed with the imaging features from Bayesian survival analysis, with the optimal number of clusters determined from the largest average silhouette width. Participants were clustered into high- and low-risk groups, which were subsequently used for further Kaplan-Meier analysis to assess for differences in survival between clusters. Supervised machine learning with the aforementioned Bayesian features was used to predict high/low risk groupings in the Orange toolbox (Orange) in Python (3.6), using Random Forest, a single-layer Neural Network, and a support vector machine.
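Choosing the number of clusters by the largest average silhouette width can be sketched as follows. This is a self-contained NumPy illustration, not the authors' pipeline: the k-means routine is a plain Lloyd's algorithm with random restarts, the function names are hypothetical, and the data are synthetic (two well-separated groups of 39 and 30 points in five dimensions, chosen only to mimic a 69-patient, 5-feature setting).

```python
import numpy as np

def kmeans(x, k, n_init=10, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns the labels of the best restart."""
    rng = np.random.default_rng(seed)
    best_labels, best_inertia = None, np.inf
    for _restart in range(n_init):
        centres = x[rng.choice(len(x), size=k, replace=False)]
        for _step in range(n_iter):
            dist = np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2)
            labels = dist.argmin(axis=1)
            # Recompute centres; keep the old centre if a cluster is empty.
            new_centres = np.array([
                x[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
                for j in range(k)
            ])
            if np.allclose(new_centres, centres):
                break
            centres = new_centres
        inertia = np.sum((x - centres[labels]) ** 2)
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels

def mean_silhouette(x, labels):
    """Average silhouette width over all samples (needs >= 2 clusters)."""
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    clusters = np.unique(labels)
    scores = np.zeros(len(x))
    for i in range(len(x)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # silhouette of a singleton cluster is taken as 0
        a = dist[i, same].sum() / (n_same - 1)  # self-distance is 0
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()

def best_k(x, candidates=(2, 3, 4, 5)):
    """Pick the candidate k with the largest average silhouette width."""
    widths = {k: mean_silhouette(x, kmeans(x, k)) for k in candidates}
    return max(widths, key=widths.get), widths

# Synthetic data only: two separated groups in a 5-feature space.
rng = np.random.default_rng(1)
features = np.vstack([rng.normal(0.0, 1.0, (39, 5)), rng.normal(8.0, 1.0, (30, 5))])
chosen_k, widths = best_k(features)
```

In practice features measured on different scales would usually be standardised before clustering; whether this was done in the study is not stated, so it is omitted here.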
Validation of classifiers was performed using 10-fold stratified cross-validation.
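Stratified cross-validation keeps the proportion of high- and low-risk cases roughly equal in every fold. The sketch below shows one simple way to assign stratified folds in NumPy. The study used the Orange toolbox, so this is only an illustration of the idea; the function name and the class sizes (39 and 30) are hypothetical.

```python
import numpy as np

def stratified_folds(labels, n_folds=10, seed=0):
    """Assign each sample to a fold so that classes are spread evenly."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(labels), dtype=int)
    offset = 0  # continue dealing where the previous class stopped
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        fold_of[idx] = (np.arange(len(idx)) + offset) % n_folds
        offset += len(idx)
    return fold_of

# Hypothetical risk labels: 39 low-risk (0) and 30 high-risk (1) cases.
risk = np.array([0] * 39 + [1] * 30)
fold_of = stratified_folds(risk, n_folds=10)
fold_sizes = np.bincount(fold_of, minlength=10)
high_risk_per_fold = np.bincount(fold_of[risk == 1], minlength=10)

# Train/test indices for the first fold.
test_idx = np.flatnonzero(fold_of == 0)
train_idx = np.flatnonzero(fold_of != 0)
```

Carrying the offset across classes keeps the total fold sizes within one sample of each other, which matters with a cohort of only 69 cases.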
Clinical and imaging data were divided into Whole Brain (WB) features, Region of Interest (ROI) features, and tumour volume, and used for supervised learning. Principal component analysis was used to reduce data dimensionality, with the components accounting for 95% of data variance, or N-1 components (where N is the size of the smallest group), retained. The top 5 Bayesian features were also used as input into the classifiers, with no further principal component analysis performed. Classifier performance was determined from the classifier accuracy (% correctly classified cases) and F-statistic.
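The component-selection rule described above can be sketched with a singular value decomposition. This NumPy example is an interpretation, not the authors' code: it assumes that the N-1 limit acts as an upper bound on the number of components kept, and the function name, the synthetic data and the smallest group size of 30 are hypothetical.

```python
import numpy as np

def n_components_for_variance(x, target=0.95, cap=None):
    """Number of principal components needed to reach a variance target.

    `cap` is an optional upper limit (assumed here to be N - 1, where N is
    the size of the smallest group).
    """
    x = np.asarray(x, dtype=float)
    centred = x - x.mean(axis=0)
    singular_values = np.linalg.svd(centred, compute_uv=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    # First index at which the cumulative explained variance reaches the target.
    n = int(np.searchsorted(np.cumsum(explained), target)) + 1
    n = min(n, len(explained))
    if cap is not None:
        n = min(n, cap)
    return n, explained

# Synthetic data: 69 cases, 10 features, variance concentrated in two of them.
rng = np.random.default_rng(0)
data = 0.01 * rng.normal(size=(69, 10))
data[:, :2] += rng.normal(size=(69, 2))
smallest_group = 30  # hypothetical size of the smaller class
n_keep, explained = n_components_for_variance(data, target=0.95, cap=smallest_group - 1)
```

Features on different scales would normally be standardised before PCA; that step is left out of this sketch because the source does not describe it.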

Results
A total of 69 patients were analysed in this study with 33 imaging features, including tumour volume, derived per patient. Example tumour anatomical, diffusion, and perfusion imaging can be seen in Figure 2. The survival curve for the whole cohort is seen in Figure 3A, showing 75% overall survival.

Diffusion and perfusion imaging can detect differences between tumour grade
Univariate statistical analysis showed significant differences in both whole brain and ROI imaging features between all high- and low-grade tumours (feature with highest AUC = ADC mean (0.82); range: 0.63-0.82); full results are detailed in supplementary Table 2. Grey and white matter imaging results are detailed in supplementary Table 3.

Unsupervised clustering detects distinct groups with significantly different survival and imaging characteristics
Using the Bayesian imaging features, k-means clustering revealed two distinct clusters, shown in Figure 3B, which when combined with Kaplan-Meier analysis revealed a significant difference between a high- and low-risk population (see Figure 3C, p = 0.0015; overall survival for high and low risk = 55% and 90%, respectively). Cox regression revealed an elevated hazard ratio (HR = 5.6, confidence interval = 1.6-20.1, p < 0.001) for the high-risk cluster, relative to the low-risk cluster.
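The Kaplan-Meier curves referred to above are built from the product-limit estimator, which multiplies the conditional survival probabilities at each observed event time. A minimal NumPy sketch is given here for illustration; the function name and the follow-up times are hypothetical and do not come from the study cohort.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit survival estimate at each distinct event time.

    times  : follow-up time for each patient
    events : 1/True if the patient died at that time, 0/False if censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)              # still under observation at t
        deaths = np.sum((times == t) & events)    # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Hypothetical follow-up (years) and event indicators for six patients.
km_times, km_survival = kaplan_meier(
    times=[1, 2, 2, 3, 4, 5],
    events=[1, 0, 1, 0, 1, 0],
)
```

Running the estimator separately on each cluster gives the two curves that a log-rank test would then compare.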
Further univariate analysis of each cluster showed significant differences in a number of imaging features, for example elevated ADC kurtosis in high- vs low-risk clusters (10.1 ± 5.3 vs 4.3 ± 1.8, respectively, p < 0.001). A combination of both high- and low-grade tumours was found in both clusters; all other results are detailed in Table 2.

Supervised machine learning can be used to distinguish between high/low risk clusters
Supervised machine learning using imaging features showed that the Bayesian features combined with a single-layer neural network, after stratified 10-fold cross-validation, provided the most accurate classification of high- and low-risk patients (accuracy = 98%, F-statistic = 0.98).
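The two performance measures quoted above can be computed from predicted and true labels as in the sketch below. It assumes that the reported "F-statistic" refers to the F1 score (the harmonic mean of precision and recall); the text does not define it, so this is an assumption, and the function name and labels are hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1 score for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1

# Hypothetical labels: 1 = high risk, 0 = low risk.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

Toolboxes may report an F1 score averaged over both classes rather than for a single positive class, so values can differ slightly from this single-class version.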

There is a distinct difference in survival between high-and low-risk high-grade tumours
Further Kaplan-Meier analysis of clustered high-grade tumours revealed a significant difference in survival (p < 0.05) with a hazard ratio of 7 (lower and upper bounds 0.9 and 53, respectively). The Kaplan-Meier curves for high-grade tumours in both clusters can be seen in supplementary figure 1. Further to this, it is noted that there are a number of children alive at study end with high-risk tumours and currently limited follow-up, for example a Choroid Plexus Carcinoma with a current follow-up of 1 year and a national average 5-year survival rate of 26% [23], and a medulloblastoma with less than 3 years of follow-up and M3 Chang stage. There was no detectable difference in survival between the high- and low-risk groups within the low-grade tumours (p > 0.05). Imaging of example cases by risk and grade is given in Figure 4.

Discussion
This study has shown the power of combining diffusion and perfusion imaging with machine learning to predict survival risk in a mixed cohort of paediatric brain tumours. A small handful of studies have previously looked at assessing survival with one of the aforementioned imaging techniques [24][25][26]; however, here we have shown the utility of combined diffusion-perfusion measures to provide advanced modelling of survival. The univariate results assessing low/high grade suggested a number of key diffusion and perfusion features for the discrimination between groups; however, most had a poor AUC.
Therefore, this represented an ideal situation for the use of machine learning to combine these features to provide highly accurate classifiers to solve this challenge.
Interestingly, the majority of parameters predicting survival were from the perfusion imaging, which is not currently part of routine clinical practice in many centres. DWI has become a standard method for investigating childhood brain tumours, and low ADC is seen as being a marker for higher cellularity and grade, which would be associated with poorer survival. The current study substantiates this but shows that DSC-MRI may be an even better modality for predicting survival. The importance of the vessel leakiness parameter K2 in survival prediction also implies that DSC-MRI may have advantages in survival prediction beyond those available from methods which do not include the injection of contrast agent, such as ASL. Furthermore, clustering demonstrated a reasonable separation of high (M1 to M3) from low (M0) Chang stage tumours, suggesting that these imaging features identify some properties in the primary tumour which are associated with metastatic potential.
The unsupervised machine learning identified two groups of tumours which did not correspond to any obvious non-imaging tumour characteristics. The credibility of these groups as being distinct entities was substantiated by the high accuracy (98%) with which the tumours could be assigned to the correct group by a supervised learner. A number of patients in the high-risk cluster were still alive at the study end, although some of these, including those from known poor prognostic groups, had short follow-up times. Further analysis showed that a number of surviving high-risk low-grade tumours had imaging features similar to high-grade tumours (such as elevated ADC kurtosis and CBV) and were significantly different to low-risk low-grade tumours. It will be interesting to ascertain the clinical course of these tumours over longer periods of follow-up.
A particular strength of this work is that imaging features with clinical data provide a non-invasive tool that can assess risk early in the patient journey. Indeed, the use of a supervised classifier to predict risk category allows for the prospective integration of this model into a clinical decision support system, whereby radiological analysis of a small number of imaging features can rapidly identify patients that should be considered for inclusion into clinical trials for prospective evaluation and subsequent stratification. The use of in vivo imaging also has the advantage that it provides information that cannot be found from analysis of resected tissue; perfusion in particular is inherently an in vivo property.
A further strength of this study is the use of multi-site, multi-scanner data, providing reassurance that the results are robust to the natural variability that occurs in protocols and scanners within clinical practice. Using multiple centres also provided a more statistically powerful study from which clinically relevant results could be obtained.
The imaging modalities used in this study are widely available and so data acquisition should be readily achieved in routine clinical practice. The image processing and classification should be made available by integration into clinical decision support tools, which are increasingly being developed [27]. Indeed, the results above show that it is possible to stratify patients into high- and low-risk groups with a trained supervised neural network, therefore enabling further real-time decisions to be made with regard to appropriate clinical management and inclusion into research trials for novel therapies to aid those with the current worst prognosis.
With the current uncertainty surrounding the use of gadolinium in clinical practice, and because it cannot be used in patients with impaired renal function, future work will include the addition of arterial spin labelling (ASL), a technique to estimate perfusion without the introduction of exogenous contrast agents, as data from this technique have been shown to correlate well with DSC cerebral blood volume [28,29]. However, information on vessel leakiness will not be available from ASL.
In conclusion, this work has demonstrated a highly novel clinical application of advanced survival modelling and machine learning to non-invasively stratify patients according to risk. Both diffusion and perfusion were found to be important in determining risk, with perfusion contributing to a greater extent, emphasising the importance of acquiring perfusion imaging. This work represents an important step forward in the use of machine learning to predict survival and paves the way for further clinical studies focusing on the successful identification and treatment of high-risk children with brain tumours.

Supplementary figures and tables
Table S1 -Imaging parameters used in this study.

Patient recruitment and imaging
Sixty-nine participants with suspected brain tumours (medulloblastoma (N = 17), pilocytic astrocytoma (N = 22), ependymoma (considered high grade, N = 10), and other tumours (N = 20), which are listed in supplementary document 1) were recruited from four clinical sites in the United Kingdom (Ethics reference: 04/MRE04/41; Birmingham Children's Hospital; Newcastle Royal Victoria Infirmary; Queen's Medical Centre; Alder Hey Children's Hospital, Liverpool). Participants underwent MRI, using the protocol discussed below, before invasive biopsy to confirm diagnosis. The median follow-up time for the cohort was 4.4 years. Tumours were assigned to high-grade (3 and 4) and low-grade (1 and 2) groups, with full cohort details found in supplementary document 1.

Figure 4. Qualitative sub-group analysis of histology and genetics between low- and high-risk medulloblastomas revealed no significant differences between MYC amplification or groupings. The high-risk cluster exhibited a trend toward having a larger number of high Chang stage medulloblastomas (M3 = 6, M2 = 4, M1 = 3) in comparison to the low-risk cluster (M2 = 1, M1 = 1, M0 = 2); data shown in supplementary document 1.

Figure 1 - Data processing pipeline used in this study.

Figure 3 - (A) Overall survival curve for the cohort, (B) K-means clustering survival results showing two distinct clusters, (C) Kaplan-Meier curve for the two clusters showing a significant difference in survival. 1 = High risk, 2 = Low risk.

Figure S1 - Kaplan-Meier curves for high-grade low-risk (red) and high-risk (green) patients showing a significant difference in survival from imaging at diagnosis.