A fully automated artificial intelligence method for non-invasive, imaging-based identification of genetic alterations in glioblastomas

Glioblastoma is the most common malignant brain parenchymal tumor yet remains challenging to treat. The current standard of care—resection and chemoradiation—is limited in part due to the genetic heterogeneity of glioblastoma. Previous studies have identified several tumor genetic biomarkers that are frequently present in glioblastoma and can alter clinical management. Currently, genetic biomarker status is confirmed with tissue sampling, which is costly and only available after tumor resection or biopsy. The purpose of this study was to evaluate a fully automated artificial intelligence approach for predicting the status of several common glioblastoma genetic biomarkers on preoperative MRI. We retrospectively analyzed multisequence preoperative brain MRI from 199 adult patients with glioblastoma who subsequently underwent tumor resection and genetic testing. Radiomics features extracted from fully automated deep learning-based tumor segmentations were used to predict nine common glioblastoma genetic biomarkers with random forest regression. The proposed fully automated method was useful for predicting IDH mutations (sensitivity = 0.93, specificity = 0.88), ATRX mutations (sensitivity = 0.94, specificity = 0.92), chromosome 7/10 aneuploidies (sensitivity = 0.90, specificity = 0.88), and CDKN2 family mutations (sensitivity = 0.76, specificity = 0.86).

Many prior studies have demonstrated that certain quantitative image features (e.g. tumor subcompartment ratios, diffusivity values, and image texture features) can be used to predict both IDH mutations and MGMT hypermethylation on preoperative imaging of gliomas [15][16][17][18] . Several other studies have reported similar results for other common glioblastoma genetic biomarkers including ATRX, TP53, and EGFR [19][20][21][22] . However, most prior studies have either focused primarily on lower grade gliomas, utilized non-standardized quantitative imaging feature definitions, or relied on manual segmentation of brain tumor subcompartments for feature extraction, which is a tedious and time-consuming process.
Recently, artificial intelligence and deep learning have emerged as new methods for automating complex medical imaging tasks. In particular, deep convolutional neural networks (dCNNs) have demonstrated the ability to generate rapid and accurate 3-dimensional segmentations of glioblastoma subcompartments from MR images 23,24 . Automated tumor segmentation provides an unbiased and reproducible method for extracting quantitative image features, particularly when combined with standardized and freely available radiomics image feature extraction tools. Radiomics features derived from deep learning segmentations have proven useful for several neuro-oncologic inference tasks, including genetic biomarker prediction in gliomas 25,26 . The purpose of this study was to evaluate a fully automated deep-learning segmentation and radiomics-based approach for predicting the status of several common and clinically relevant genetic biomarkers in glioblastomas using only preoperative imaging.

Materials and methods
Patient population. All studies were performed in accordance with relevant guidelines and regulations and were approved by the University of California San Francisco institutional review board with a waiver for consent. The study population consisted of 199 adult patients with histopathologically confirmed grade IV malignant glioma (i.e. glioblastoma) who underwent preoperative MRI, initial tumor resection, and tumor genetic testing at a single center between 2015 and 2019. Patients with any history of prior brain tumor diagnosis or treatment were excluded. Genetic biomarker testing. Nine different glioblastoma molecular biomarkers were analyzed for this study: mutations or deletions of IDH, TP53, PTEN, ATRX, TERT, and the CDKN2 family, MGMT promoter methylation, EGFR copy number amplification (including the EGFRVIII rearrangement), and aneuploidy of chromosomes 7 and 10. Gold standard assessment of molecular biomarkers was determined by genetic sequencing and/or immunohistochemical staining at the time of biopsy or tumor resection. All IDH mutations were confirmed by genetic sequencing. MGMT status was determined using a methylation sensitive PCR assay. Not all genes were evaluated in every patient. Gene test frequency and prevalence in the study cohort are presented in Table 1.
Image acquisition. All preoperative MRI was performed using a 3.0 T scanner (Discovery 750, GE Healthcare, Waukesha, Wisconsin, USA) and a dedicated 8-channel head coil (Invivo, Gainesville, Florida, USA). The imaging protocol included 3D T2-weighted, T2/FLAIR-weighted, susceptibility-weighted (SWI), diffusionweighted (DWI), pre and post-contrast T1-weighted images, 3D arterial spin labeling (ASL) perfusion images, and 2D 55-direction high angular resolution diffusion imaging (HARDI). Acquisition parameters were as follows: T2: sagittal 3D fast spin echo (FSE) ( Image pre-processing. HARDI data were eddy current corrected and processed using the Eddy and DTI-FIT modules from FSL yielding isotropic diffusion weighted images (DWI) and several quantitative diffusivity maps: mean diffusivity (MD), axial diffusivity (AD), radial diffusivity (RD), and fractional anisotropy (FA). Each image contrast was registered and resampled to the 3D space defined by the T1 postcontrast image (1 mm isotropic resolution) using automated non-linear registration (Advanced Normalization Tools). Resampled coregistered data were then skull stripped using the automated Brain Extraction Tool (BET) from FSL 27,28 . All subsequent image processing steps were performed on resampled co-registered data.
Deep learning-based automated tumor subcompartment segmentation. A previously described and validated deep learning algorithm was used to generate automated 3D segmentation of three key components of glioblastoma that are seen on MRI: enhancing and non-enhancing tumor (together comprising the tumor core) and surrounding tumor related edema. A complete methodologic description and formal evaluation of the segmentation algorithm is available elsewhere 24 . This algorithm was adapted for the study data; however, the underlying network architecture was not changed. Briefly, the segmentation network consisted of 3 cascaded instances of a 2-dimensional deep convolutional neural network implemented with Python 2.7 and Tensorflow 1.7 (Fig. 1). The first network instance was used to segment the entire tumor volume from whole brain images, while the second and third networks were used to segment tumor core and enhancing tumor, respectively, from the tumor volume. Segmentation was performed in all 3 cardinal planes and then combined to create smooth 3-dimensional labels. Input data consisted of preprocessed T2-, T2/FLAIR-, and pre-and postcontrast T1-weighted images. The network was trained using the publicly available BraTS 2017 dataset consisting of manually segmented multi-modal MRI of 243 gliomas 23 . Both high-and low-grade glioma training cases were used given the observation that some IDH-mutant glioblastomas more closely resemble lower grade tumors. Training details included the Adam optimizer, binary softmax cross-entropy loss, a starting learning rate of 1 × 10 -3 with an exponential decay constant of 1 × 10 -7 , a training patch size of 96 × 96 × 4 voxels, a batch size of 5, and 20 total training epochs using the entire training dataset. Training took approximately 50 h on a Nvidia Titan Xp graphics processing unit. Study data was automatically segmented using the trained model.   Predictive modeling of molecular biomarkers. Radiomics features were fed into random forest regression models to predict the likelihood of each genetic biomarker being present. Random forest regression was implemented in Python 3.7 using the scikit-learn 0.23.1 (https ://sciki t-learn .org/) RandomForestRegressor class 30 . Each genetic biomarker was treated as a separate binary regression task. A tenfold stratified shuffle split cross validation strategy was implemented using the StratifiedShuffleSplit class with a train/test split of 60%/40% to account for the class imbalance of certain genetic biomarkers. Automated feature reduction was performed using cross validated recursive feature elimination (RFECV class) with 1% of features eliminated at each step 31 . Automated random forest model hyperparameter tuning was accomplished using a cross validated randomized search approach (RandomizedSearchCV class) with 100 steps. Hyperparameter ranges included in the random search were: number of trees (100 to 10,000), maximum number of levels per decision tree (10 to 100), minimum data samples required to split a node (2 to 10), minimum data samples required at each leaf node (1 to 5), maximum number of features considered at each split (total number of features or its square root), and whether or not to use bootstrap samples for building trees. Final model hyperparameters including the total number of features used for each genetic biomarker are provided as supplementary data. Feature importance was determined using the permutation feature importance method 32 . Model performance was evaluated using receiver operating characteristic analysis in addition to precision, recall, the F1 statistic, and Matthew's correlation coefficient.
External validation of predictive models. A complete external validation was not possible as there is currently no publicly available preoperative glioblastoma MRI dataset with comparable preoperative MRI and genetic results. However, a limited external validation was performed using the TCGA-GBM dataset (https :// wiki.cance rimag ingar chive .net/displ ay/Publi c/TCGA-GBM), which includes T1 pre-and postcontrast, T2, and T2/FLAIR MR images 33 . 57 cases were identified with preoperative MR images and genetic test results, of which only 4 had IDH mutations and ATRX mutations, respectively. Predictive models were retrained on study data using only the 4 image contrasts included in the TCGA-GBM dataset but otherwise identical methods. Trained models were then evaluated on the 57 TCGA-GBM cases.

Results
Image pre-processing. Automated non-linear co-registration, resampling to 1 × 1 × 1 mm, and skull stripping was successfully performed on all study data. Example axial slices from a pre-processed dataset are presented in Fig. 2. Note that all study data, with the exception of the T1 pre-and postcontrast images, underwent a linear interpolation step during resampling due to differences in acquisition resolution between different image contrasts.

Inference of glioblastoma molecular biomarkers using random forest regression. Receiver
operating characteristic (ROC) curves for the four best predicted genetic biomarkers are presented in Fig. 4. ROC curves for the remaining 5 genetic biomarkers are included as supplementary data. Genetic biomarker prediction was most accurate for ATRX mutations. Test characteristics for ATRX mutation prediction included a sensitivity of 0.94 ± 0.07, a specificity of 0.92 ± 0.04, an MCC of 0.71 ± 0.08, and an AUC of 0.97 ± 0.02. Performance was slightly worse for predicting IDH mutations with a sensitivity of 0.93 ± 0.08, a specificity of 0.88 ± 0.07, an MCC of 0.62 ± 0.16, and an AUC of 0.95 ± 0.03. The proposed method was also reasonable for predicting chromosome 7/10 aneuploidies (sensitivity = 0.90 ± 0.09, specificity = 0.88 ± 0.08, MCC = 0.75 ± 0.18, AUC = 0.93 ± 0.05) and CDKN2 family mutations (sensitivity = 0.76 ± 0.06, specificity = 0.86 ± 0.09, MCC = 0.62 ± 0.08, AUC = 0.85 ± 0.04). Prediction of other molecular biomarkers was similar to random chance with AUCs closer to 0.5. Test characteristics for all 9 genetic biomarkers evaluated in this study are presented in Table 1.   Table 2. First order features (i.e. voxel intensity distributions) and gray level size zone features (i.e. connected regions of similar intensity pixels) comprised a majority of the most important features for all 4 of the best predicted genetic biomarkers. For ATRX mutation prediction, the single most predictive feature was the average T1 postcontrast intensity within the tumor core followed by the MD kurtosis within the non-enhancing tumor. IDH mutation prediction similarly relied heavily on diffusion characteristics of the non-enhancing tumor (DWI high gray level emphasis) and T1 postcontrast intensity (variance throughout the whole tumor). Unlike other well predicted genetic biomarkers, chromosome 7/10 aneuploidy prediction showed significant dependence on a shape feature (elongation of the enhancing tumor). Prediction of CDKN2 was optimal with only 5 features and was the only model to show a heavy dependence on ASL (contrast within the tumor core).
Qualitative imaging correlates of preditive radiomics features. Representative MR images of glioblastomas with IDH mutations and chromosome 7/10 aneuploidies are presented in Fig. 5. IDH mutant glioblastomas exhibited overall larger tumor cores with a dominant infiltrative non-enhancing component and a relatively small enhancing component. These qualitative differences are reflected in the importance of T1 postcontrast intensity variance for predicting IDH status. In contrast, glioblastomas with chromosome 7/10 aneuploidies tended to exhibit a more pronounced rounded morphology of the tumor core, which is reflected in the importance of enhancing tumor elongation for predicting chromosome 7/10 aneuploidy status.

Discussion
This study details an automated pipeline for inferring glioblastoma genetic biomarkers using automated segmentation and radiomics feature extraction. Specifically, we examined nine molecular biomarkers, including some that are known to affect prognosis and clinical management. We found that radiomics features extracted using automated deep learning segmentation were useful for accurately identifying IDH-mutations on preoperative imaging of patients with glioblastoma. Our results show that a sensitivity of > 95% for detecting IDH mutations could be achieved with a specificity of over 80%, which is a reasonable characteristic for a screening test. In our study, IDH-mutant glioblastomas demonstrated larger tumor cores with relatively small enhancing components. This qualitative observation is supported by the importance of postcontrast image intensity features within the tumor core for our predictive model. Prior studies have similarly demonstrated quantitative volumetric differences between IDH mutant and wildtype gliomas, albeit with variable tumor grades and different image feature extraction methods such as 2-dimensional and/or manual tumor segmentation 17,18 . Our results for IDH prediction are improved compared to prior studies focused solely on glioblastoma 34 and more comparable to prior work focused on lower grade gliomas 35,36 . Radiomics features were also highly accurate for inferring ATRX mutations. ATRX mutations are extremely common in IDH-mutant glioblastomas but are typically not present in IDH-wildtype glioblastomas. In our cohort of 199 patients with glioblastoma, only 5 tumors demonstrated ATRX mutations without concomitant IDH mutations, and all IDH mutant tumors had associated ATRX mutations. This high degree of correlation between IDH and ATRX mutations suggests that imaging features of these molecular biomarkers will also be highly correlated. While important radiomics features for predicting ATRX and IDH mutations did not overlap completely, both predictive models relied heavily on diffusivity and T1 postcontrast image intensity within the tumor. Our results for predicting ATRX mutations are comparable to prior studies, though it should be noted that prior work has focused almost entirely on lower grade gliomas 21,37 .
We also found that automatically extracted radiomics features were highly sensitive for detecting aneuploidies of chromosomes 7 and 10. These aneuploidies, particularly trisomy 7, monosomy 10, are among the most frequent genetic alterations in glioblastoma (70% in this study) and have been associated with malignant cell proliferation, tumor progression, and lower overall survival 38,39 . Importantly, there was no overlap between chromosome 7/10 aneuploidies and either IDH or ATRX mutations in our study cohort. There has been relatively little prior work on predicting chromosome 7/10 aneuploidies in glioblastoma, however, our results are similar or better compared to prior studies aimed at predicting the 1p/19q co-deletion-another common chromosomal abnormality found in gliomas 40,41 .
CDKN2 family alterations were also relatively well predicted using the proposed methods. Mutations or deletions of the CDKN2 family tumor suppressor genes are present in 30-80% of gliomas and result in unchecked activity of downstream cell cycle kinases including CDK4 5 . These mutations can potentially be targeted by existing small molecule CDK4 inhibitors and are the subject of ongoing clinical trials in patients with gliomas [42][43][44][45] . Prior studies have reported statistically significant but relatively weak correlations between CDKN2 gene deletions and radiomics features 46 .
Several other glioblastoma genetic biomarkers examined in this study, including MGMT promoter methylation, were not found to be highly correlated with any radiomics features. This result contrasts with prior studies that have demonstrated more accurate prediction of MGMT promoter methylation status based on MRI features [47][48][49][50] . There are many potential explanations for this difference, including differences in tumor segmentation and radiomics feature extraction methods, the inclusion of lower grade tumors in other study cohorts, and different testing methods for laboratory determination of MGMT methylation status.
There are several potential approaches to improve the results presented here. For example, manual correction of automated tumor segmentations might improve the discriminative value of certain radiomics features, albeit at the cost of compromising the fully automated nature of the proposed method. An alternative approach would be to use a more advanced automated tumor segmentation schemes such as the 4-compartment model (i.e. separating non-enhancing tumor and cystic tumor necrosis) proposed in the more recent BraTS challenges 51 . Similarly, the inclusion of additional quantitative MR image contrasts may be beneficial as a majority of prior studies have shown that many different imaging features are necessary for accurate glioma genetic biomarker classification 19,47,[52][53][54][55] .
This study has several shortcomings that may limit its generalizability to other data. First, the use of 3 T MR scanners, 3D imaging, and 55-direction HARDI are not widely used in routine brain tumor imaging. This is one likely explanation for the relatively poor external validation performance of our model on the TCGA-GBM dataset. Second, in our cohort of 199 patients, certain molecular biomarkers were only positive in a small subset of tumors due to their relative rarity and/or testing frequency. For example, our cohort included 18 cases with IDH mutations (~ 9%), which is in line with the 5-13% prevalence reported in the literature [56][57][58] . This low number of positive examples can be problematic for machine learning models, which require separate training and testing sets. We used a stratified cross-validation approach to address imbalance in our dataset, however a more balanced dataset with a larger number of cases would be a more reliable approach. Finally, although our medical Scientific RepoRtS | (2020) 10:11852 | https://doi.org/10.1038/s41598-020-68857-8 www.nature.com/scientificreports/ center is a national referral center for brain tumors, it is unclear if the results presented here are generalizable to other patient demographic groups. This work represents an important step towards a fully automated method for non-invasive, imaging-based identification glioblastomas with IDH mutations and certain other molecular biomarkers relevant for guiding therapy and determining prognosis. Although this was a relatively small retrospective study, the rapid and automated nature of the proposed method would allow straightforward application to larger datasets and prospective studies. With further work, our overarching goal is to obviate the need for tissue-based detection of glioblastoma molecular biomarkers using non-invasive MRI-based methods, to help guide maximal safe resection, and to assess response to genetic biomarker specific treatments that have been shown to improve survival in patients with glioblastoma.

Data availability
Code and radiomics feature data are available by request to the corresponding author. Image data is the property of UC Regents.