Radiomics for residual tumour detection and prognosis in newly diagnosed glioblastoma based on postoperative [11C] methionine PET and T1c-w MRI

Personalized treatment strategies based on non-invasive biomarkers have the potential to improve the management of patients with newly diagnosed glioblastoma (GBM). The residual tumour burden after surgery in GBM patients is a prognostic imaging biomarker. However, in clinical practice, its assessment is a manual and time-consuming process that is subject to inter-rater variability. Furthermore, predicting patient outcome prior to radiotherapy may identify patient subgroups that could benefit from escalated radiotherapy doses. Therefore, in this study, we investigate the capabilities of conventional radiomics and 3D convolutional neural networks to automatically detect the residual tumour status and to prognosticate time-to-recurrence (TTR) and overall survival (OS) in GBM using postoperative [11C] methionine positron emission tomography (MET-PET) and gadolinium-enhanced T1-weighted (T1c-w) magnetic resonance imaging (MRI). On the independent test data, the 3D-DenseNet model based on MET-PET achieved the best performance for residual tumour detection, while the logistic regression model with conventional radiomics features performed best for T1c-w MRI (AUC: MET-PET 0.95, T1c-w MRI 0.78). For the prognosis of TTR and OS, the 3D-DenseNet model based on MET-PET integrated with age and MGMT status achieved the best performance (concordance index: TTR 0.68, OS 0.65). In conclusion, we showed that both deep learning and conventional radiomics have potential value for supporting image-based assessment and prognosis in GBM. After prospective validation, these models may be considered for treatment personalization.

Section 1: Feature selection criteria for final signature
Here we present an example of feature selection for residual tumour status prediction on MET-PET imaging. The same technique applies to residual tumour status prediction on T1c-w MRI. The 39 MET-PET features with the highest mutual information (measured by the AUC) with residual tumour status on MET-PET were selected after hierarchical clustering. These features were then used to build a diagnostic model.
Feature selection and model building with internal validation were first performed within 5 repetitions of 5-fold cross-validation (CV) nested in the training data to identify an optimal signature, with model performance evaluated in terms of the median AUC across all CV folds. For each of the four feature selection methods (listed in Table S4), the occurrence of every feature in the 25 modelling steps (5 repetitions of 5-fold CV) was counted, and features were ranked according to their occurrences across the CV folds. Table S4 shows the features with the top 5 ranks for each feature selection method, which were considered further. Finally, features that occurred repeatedly across at least 75% of the feature selection methods were selected.
Two features, log_ih_kurt_fbn_n16 and log_stat_skew, occurred in all 4 feature selection methods, thus meeting the 75% occurrence criterion for candidate features. The two features showed a Spearman correlation of >0.5 on the entire training data, as shown in Figure S1(a). Therefore, log_ih_kurt_fbn_n16 was selected as a one-feature signature due to its stronger association with the endpoint (p = 2.58 × 10⁻⁵) compared to log_stat_skew (p = 8.37 × 10⁻⁵). The finally selected signature and the average AUC (average of the AUC across all feature selection methods) in internal training and external test are reported in the results section. The same technique is applicable to the detection of residual tumour status on T1c-w MRI and to the prognosis of TTR (example shown in Table S5) and OS.
Remarks: In the TTR example (Table S5), dzm_sdhge_3d_fbn_n16, log_stat_min, log_ivh_i90 and log_ih_skew_fbn_n16 occurred in at least 3 out of 4 (75%) feature selection methods. All these features showed a correlation >0.5 (Figure S1(b)). Finally, log_stat_min was selected due to its stronger association with the endpoint compared to the other features, and was used to build the final models using Cox regression (Cox).
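The occurrence-based selection described in Section 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the toy feature names (feat_a, feat_b, ...) and the toy fold selections are hypothetical, and only the counting and the 75% consensus rule are shown (the subsequent correlation check and the choice by association with the endpoint are omitted).

```python
from collections import Counter


def rank_by_occurrence(selected_per_fold, top_k=5):
    """Return the top_k features selected most often across CV folds."""
    # Count each feature at most once per fold.
    counts = Counter(f for fold in selected_per_fold for f in set(fold))
    # Sort by descending count, then by name so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_k]]


def consensus_features(selections_by_method, top_k=5, min_fraction=0.75):
    """Keep features that are top-ranked in at least min_fraction of methods."""
    top = {m: set(rank_by_occurrence(folds, top_k))
           for m, folds in selections_by_method.items()}
    votes = Counter(f for feats in top.values() for f in feats)
    n_methods = len(top)
    return sorted(f for f, v in votes.items() if v / n_methods >= min_fraction)


# Hypothetical toy input: 4 selection methods, 3 folds each (the study used 25).
toy = {
    "EN":   [["feat_a", "feat_b"], ["feat_a", "feat_c"], ["feat_a", "feat_b"]],
    "MRMR": [["feat_a", "feat_d"], ["feat_a", "feat_d"], ["feat_b", "feat_d"]],
    "MIM":  [["feat_a", "feat_b"], ["feat_b", "feat_e"], ["feat_a", "feat_b"]],
    "UR":   [["feat_b", "feat_c"], ["feat_b", "feat_a"], ["feat_c", "feat_a"]],
}
selected = consensus_features(toy, top_k=2)
print(selected)
```

In this toy input, feat_a is top-ranked by all four methods and feat_b by three of four, so both pass the 75% criterion.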

Table S12 :
Final model coefficients for the prognosis of TTR and OS using the clinical only, the clinical + MET-PET and the clinical + MRI radiomics models. Training was performed on the entire training data using multivariable logistic regression. In addition, transformation parameters from the Yeo-Johnson transformation and z-normalization, and optimal cutoff values for Kaplan-Meier plots are presented.

Table S5 :
Median C-index for prognosis of TTR based on MET-PET data using cross-validation of the training data with Cox regression. Top 5 features ranked according to their occurrence are shown here. Features with a repeated occurrence across at least 75% (3 out of 4) of the feature selection methods are marked in bold. CI: concordance index, CV: cross-validation, EN: elastic net, MRMR: minimum redundancy maximum relevance, MIM: mutual information maximization, TTR: time-to-recurrence, UR: univariate regression.

Figure S1: Correlation plot of features with a repeated occurrence across at least 75% (3 out of 4) of the feature selection methods (a) for prediction of residual tumour status on MET-PET, and (b) for prognosis of TTR. Features showed a high correlation (> 0.5).

Figure S2 :

Figure S3: An illustration of DenseNet and its building block, i.e. the dense block. In each dense block, the output of each convolution layer is concatenated with the inputs to all subsequent layers within the dense block. BN: batch normalization, Conv: convolution layer, DB: dense block, FC: fully connected layer, GAP: global average pooling.

Section 2: 3D data augmentation
In this work, we used random flipping to create mirror reflections of the input image volume along the in-plane x and y axes only. Mirroring in batchgenerators is evenly distributed, with a probability of 0.5 of mirroring along each axis. The remaining augmentations used in this work are pixel-level transformations.
We used additive Gaussian noise with variance uniformly sampled from the range (0, 0.05). We also used Gaussian blur with standard deviation (σ) selected randomly from the range (1, 1.75). Further, we used gamma correction to adjust the luminance of the input volumes, with gamma values selected randomly from the range (0.5, 2). Finally, a brightness multiplicative transform, where the multiplier is randomly sampled from the range (0.7, 1.5), and a random contrast transform, where contrast values were randomly sampled from the interval (1, 1.75), were used for augmenting T1c-w MRI data only. We did not use the brightness and contrast transforms for MET-PET data, as these transformations were found to be less effective for improving model performance.
The hyperparameters of the pixel-level transforms were selected manually by visually inspecting the images, so that each transformation creates an image that is representative of real perturbations, and by avoiding extreme transformations with very high or low values of the transformation parameters. All transformations were applied to 3D patches extracted around the clinical target volume (CTV). Each augmentation was applied with a probability of 0.15, which limits the number of original images shown to the network. The percentage of original images used during training was 40% and 10%, combining 4 and 6 different augmentation techniques for MET-PET and T1c-w MRI, respectively.
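The augmentation scheme of Section 2 could be approximated in plain NumPy as sketched below. The study itself used the batchgenerators package; this sketch does not reproduce its API or its exact implementation details. All function names are hypothetical, and several choices are assumptions: the (z, y, x) axis layout, converting the sampled variance to a standard deviation, a 3σ kernel truncation with edge padding for the blur, gamma correction on intensities rescaled to [0, 1], and applying the 0.15 probability only to the pixel-level transforms.

```python
import numpy as np


def gaussian_blur(vol, sigma):
    """Separable Gaussian blur (assumption: 3-sigma kernel, edge padding)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()

    def blur_1d(v):
        return np.convolve(np.pad(v, radius, mode="edge"), kernel, mode="valid")

    for axis in range(vol.ndim):
        vol = np.apply_along_axis(blur_1d, axis, vol)
    return vol


def gamma_correct(vol, gamma):
    """Gamma correction on intensities rescaled to [0, 1] (assumption)."""
    lo, hi = vol.min(), vol.max()
    scaled = (vol - lo) / (hi - lo + 1e-8)
    return scaled ** gamma * (hi - lo) + lo


def augment(vol, rng, modality="PET", p=0.15):
    """Randomly augment one 3D patch; volume layout assumed to be (z, y, x)."""
    vol = np.asarray(vol, dtype=np.float64)
    # Mirroring along the in-plane y and x axes, each with probability 0.5.
    for axis in (1, 2):
        if rng.random() < 0.5:
            vol = np.flip(vol, axis=axis)
    # Pixel-level transforms, each applied with probability p.
    if rng.random() < p:  # additive Gaussian noise, variance in (0, 0.05)
        variance = rng.uniform(0.0, 0.05)
        vol = vol + rng.normal(0.0, np.sqrt(variance), size=vol.shape)
    if rng.random() < p:  # Gaussian blur, sigma in (1, 1.75)
        vol = gaussian_blur(vol, rng.uniform(1.0, 1.75))
    if rng.random() < p:  # gamma correction, gamma in (0.5, 2)
        vol = gamma_correct(vol, rng.uniform(0.5, 2.0))
    if modality == "MRI":  # brightness and contrast for T1c-w MRI only
        if rng.random() < p:
            vol = vol * rng.uniform(0.7, 1.5)
        if rng.random() < p:
            mean = vol.mean()
            vol = (vol - mean) * rng.uniform(1.0, 1.75) + mean
    return vol


rng = np.random.default_rng(0)
patch = rng.random((16, 16, 16))  # hypothetical 3D patch around the CTV
augmented = augment(patch, rng, modality="MRI")
print(augmented.shape)
```

Passing modality="PET" skips the brightness and contrast transforms, which mirrors the 4 versus 6 augmentation techniques reported for MET-PET and T1c-w MRI.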

Figure S4: Confusion matrices for residual status prediction in training and test data based on MET-PET and T1c-w MRI data (a) using the final radiomics-based logistic regression model and (b) using the final 3D-DenseNet model.

Figure S5: Box plot of Yeo-Johnson transformed and z-score normalized features selected in the best performing MET-PET signature for prediction of residual tumour status on MET-PET in training data. PET_log_ih_kurt_fbn_16 showed relatively higher values in MET-PET positive patients than in MET-PET negative patients.

Figure S7: Calibration plots on training and test data for the prognosis of TTR (after 5 years) using the Cox regression model based on (a) clinical only signature, (b) clinical + MET-PET signature and (c) clinical + T1c-w MRI signature. For calibration, data (thick lines) and 95% confidence intervals (shaded regions) are shown together with linear regression lines (solid lines) and optimal expectation (dashed lines). Density of expected probabilities is shown above the calibration plot.

Figure S8: Representative MET-PET images with the corresponding Laplacian of Gaussian (LoG) transformed images and the value of the selected signature, i.e. log_stat_min, for two patients (P) from the two risk groups of the training data (P1 from the low-risk group and P2 from the high-risk group). The red contours mark the clinical target volume (CTV). P1 (TTR status = 0, TTR = 15.47 months) showed an overall homogeneous appearance on the baseline MET-PET with a higher log_stat_min value (a). In contrast, P2 (TTR status = 1, TTR = 0.07 months) showed a more heterogeneous CTV with a low log_stat_min, which corresponds to high pixel intensities (stat_max) on the baseline PET (b).

Figure S9: Calibration plots for (a) time-to-recurrence (TTR) and (b) overall survival (OS) in training, internal validation and external test data, based on the respective joint clinical + ensemble predictions (3D-DenseNet model) on MET-PET data. For calibration, data (thick lines) and 95% confidence intervals (shaded regions) are shown together with linear regression lines (solid lines) and optimal expectation (dashed lines). Density of expected probabilities is shown above the calibration plot.

Figure S10: (a) Example PET image of a false-positive case. The PET image appears patchy compared to (b) a true-negative PET image with an overall homogeneous appearance. (c) Example T1c-w MR image of a false-positive case with confounding effects of surgically induced contrast enhancement in the second baseline MR image. The clinical decision of MR-negative status was made on the early postsurgical MRI shown in (d), which does not show contrast enhancement at the surgical boundaries. Red contours represent the CTV.

Table S1 :
Image acquisition parameters of diagnostic magnetic resonance imaging (MRI) and positron emission tomography (PET) data for both training and test data.

Table S2 :
Feature classes extracted from MET-PET and T1c-w MRI. LoG transformations were used for intensity-based features.

Table S3 :
Image preprocessing parameters for both PET and MRI data, as used in MIRP.

Table S4 :
Median AUC for PET-status prediction based on MET-PET data using cross-validation of the training data with logistic regression. Top 5 features ranked according to their occurrence are shown here. Features with a repeated occurrence across at least 75% (3 out of 4) of the feature selection methods are marked in bold. AUC: area under the curve, CV: cross-validation, EN: elastic net, MRMR: minimum redundancy maximum relevance, MIM: mutual information maximization, UR: univariate regression.

Table S6 :
Data augmentation parameters used for deep learning analysis. Augmentations were carried out using the batchgenerators package, which is an open-source Python package for data augmentation.

Table S7 :
Univariable analysis of time-to-recurrence (TTR) and overall survival (OS) using Cox regression in the training data. Significant p-values of patients' clinical characteristics are marked in bold. ci, confidence interval; ECOG, Eastern Co-operative Oncology Group; IDH, isocitrate dehydrogenase; MGMT, O6-methylguanine DNA methyltransferase; MRI, magnetic resonance imaging; OS, overall survival; PET, positron emission tomography; TTR, time-to-recurrence.

Table S8 :
Summary of the selected T1c-w MRI-based radiomics signature for residual tumour status detection on T1c-w MRI. GLDZM: grey level distance zone matrix, IH: intensity histogram.

Table S9 :
Final models for the residual tumour status on MET-PET and T1c-w MRI using conventional radiomics. Training was performed on the entire training cohort using multivariable logistic regression. In addition, transformation parameters from the Yeo-Johnson transformation and z-normalization, and optimal cutoff values from Youden's index are given.
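To illustrate how the parameters reported in such a table could be applied to a new feature value, the sketch below chains the Yeo-Johnson transformation, z-normalization and a one-feature logistic model, and derives a cutoff from Youden's index. It is a minimal sketch, not the study's implementation: the function names and all numeric values in the toy example are hypothetical placeholders and are not taken from Table S9.

```python
import numpy as np


def yeo_johnson(x, lmbda):
    """Yeo-Johnson power transformation with a fixed parameter lmbda."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    if abs(lmbda) > 1e-12:
        out[pos] = ((x[pos] + 1.0) ** lmbda - 1.0) / lmbda
    else:
        out[pos] = np.log1p(x[pos])
    if abs(lmbda - 2.0) > 1e-12:
        out[~pos] = -((1.0 - x[~pos]) ** (2.0 - lmbda) - 1.0) / (2.0 - lmbda)
    else:
        out[~pos] = -np.log1p(-x[~pos])
    return out


def predict_probability(x, lmbda, mean, std, intercept, coef):
    """Transform, z-normalize, then apply a one-feature logistic model."""
    z = (yeo_johnson(x, lmbda) - mean) / std
    return 1.0 / (1.0 + np.exp(-(intercept + coef * z)))


def youden_cutoff(y_true, scores):
    """Threshold maximizing J = sensitivity + specificity - 1."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    best_j, best_t = -np.inf, None
    for t in np.unique(scores):
        pred = scores >= t
        sensitivity = (pred & y_true).sum() / y_true.sum()
        specificity = (~pred & ~y_true).sum() / (~y_true).sum()
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_j, best_t = j, t
    return best_t, best_j


# Hypothetical feature values, labels and model parameters (placeholders only).
feature = np.array([-1.2, -0.3, 0.8, 2.5])
labels = np.array([0, 0, 1, 1])
probs = predict_probability(feature, lmbda=0.5, mean=0.0, std=1.0,
                            intercept=0.0, coef=2.0)
cutoff, j_stat = youden_cutoff(labels, probs)
print(cutoff, j_stat)
```

Because the transformation and the logistic function are both monotonic, the probability cutoff found here corresponds to a single cutoff on the raw feature value.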

Table S10 :
Ensemble AUC values for training and internal validation CV folds for PET status prediction based on MET-PET imaging and MRI status prediction based on T1c-w MRI data using deep learning with and without data augmentation. Models trained with data augmentation showed relatively higher performance in internal validation compared to models trained without data augmentation.
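One way to obtain an ensemble AUC of this kind is sketched below. How the ensemble was formed is not specified here, so averaging the predicted probabilities of the models trained in the individual CV folds is an assumption, as are the function names and the toy predictions. The AUC is computed from pairwise comparisons (the Mann-Whitney formulation).

```python
import numpy as np


def auc_score(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]
    # Tied pairs count as half a correctly ranked pair.
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)


def ensemble_prediction(fold_predictions):
    """Average the per-model predictions (assumed ensembling rule)."""
    return np.mean(np.asarray(fold_predictions, dtype=float), axis=0)


# Hypothetical predictions of three fold models for four patients.
y = [0, 0, 1, 1]
fold_preds = [
    [0.2, 0.6, 0.5, 0.9],
    [0.1, 0.3, 0.7, 0.8],
    [0.3, 0.4, 0.6, 0.7],
]
single_auc = auc_score(y, fold_preds[0])
ensemble_auc = auc_score(y, ensemble_prediction(fold_preds))
print(single_auc, ensemble_auc)
```

In this toy case the first fold model misranks one pair, while the averaged prediction separates the two classes.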

Table S13 :
Previous studies for prognostic modelling in glioblastoma. C-index: concordance index, OS: overall survival, PFS: progression-free survival.
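For reference, the concordance index used for TTR and OS can be computed as in the following sketch of Harrell's C-index for right-censored data. It is a simplified illustration with a hypothetical function name and toy data: pairs with tied event times are ignored, and a higher risk score is assumed to mean a shorter expected time to the event.

```python
import numpy as np


def concordance_index(time, event, risk):
    """Harrell's C-index: fraction of comparable pairs ordered correctly."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(len(time)):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical follow-up times (months), event indicators and risk scores.
time = [2.0, 4.0, 6.0, 8.0]
event = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(time, event, risk)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported values of 0.68 (TTR) and 0.65 (OS) in context.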