Background & Summary

Glioblastoma (GBM) is a highly infiltrative brain tumor. To this day, no curative treatment for GBM patients is available. The current standard-of-care involves maximum safe surgical resection, radiotherapy, chemotherapy, and palliative treatment1,2. The fast growth of GBM makes close disease monitoring paramount. The treatment response is routinely evaluated on Magnetic Resonance Imaging (MRI), according to the response assessment in neuro-oncology (RANO) criteria3. The quantitative part of RANO is concerned with measuring the active tumor appearing on contrast-enhanced T1-weighted MRI, whereas the qualitative evaluation is based on abnormalities on T2-weighted and fluid-attenuated inversion recovery (FLAIR) MRI.

The success of Machine Learning (including Deep Learning, Artificial Intelligence) has fueled research on automated tumor segmentation tools. Such tools show promising results on pre-operative imaging data4,5, and encouraging results have been presented for post-operative data and for assisting clinicians. Several studies have investigated automated measurement tools that build on automated segmentation methods (e.g.,6,7,8).

The availability of annotated data, whether produced manually or by an automated system, has empowered researchers to investigate sub-visual cues and complex image-based biomarkers in the field of radiomics9. For GBM, radiomics has been applied to outcome-related tasks such as overall survival prediction5, radiogenomics10,11,12, detection of pseudo-progression13, and MRI-based detection of progression14.

Publicly accessible data is the cornerstone of research on Deep Learning for applications in Radiology. Data availability is critical to establishing new techniques and to testing and benchmarking systems outside the centers where an algorithm was developed. Scientific journals increasingly emphasize the importance of external validation data and best practices in the validation of new developments15,16,17. We aim to provide researchers with a first publicly available expert RANO rating on a longitudinal data set with MRI data, automated tumor segmentations, and a rich set of complementary information.

We briefly describe publicly available datasets of MRI data from GBM patients below, and Fig. 1 shows the longitudinal distribution of acquisitions.

Fig. 1

Comparison of similar GBM datasets regarding scan distribution on a time scale. Every circle corresponds to one study date. Only time points with known time since the first acquisition were considered for this figure.

The TCGA-GBM dataset offers computed tomography (CT) and MRI data of 262 GBM patients. For 259 patients, MRI data with a total of 575 acquisition dates are available, stemming from eight different centers; 51% of the available images were acquired pre-operatively. Due to its link to The Cancer Genome Atlas (TCGA), clinical, genetic, and pathological information is available. Bakas et al. added segmentation labels and radiomics features to the pre-operative MRI for this dataset and made them publicly available18,19.

The CPTAC-GBM collection contains MRI data from 62 patients, with 140 acquisition dates. Imaging data with a follow-up period beyond three months is available for eight of these patients. The data is complemented with tissue slice images.

The IvyGAP dataset offers data of 39 patients and is complemented by in situ hybridization (ISH) and RNA sequencing information. For 37 of these patients, the collection also includes post-operative MRI20.

The QIN GBM Treatment Response dataset21,22 contains MRI data from 54 patients with a total of 105 acquisition time points. The post-operative MRI scans were performed at most eight days after the pre-operative study.

The recently published UPenn-GBM dataset23 contains MRI data from 630 patients and includes 611 pre- and 60 post-operative scans. The data further includes diffusion tensor imaging (DTI), dynamic susceptibility contrast (DSC) imaging, pathology data, and information regarding the age, sex, extent of resection and survival time.

In the 2021 edition, the Brain Tumor Segmentation (BraTS) challenge offered in its training set pre-operative MRI data of 1251 brain tumor patients with tumor segmentations. Four MRI sequences are provided: pre- and post-contrast T1-weighted (T1, CT1), T2-weighted (T2), and fluid-attenuated inversion recovery (FLAIR). The tumor segmentation includes the contrast-enhancement, necrotic region, and edema with potentially non-enhancing tumor. It contains pre-operative data from the TCGA-GBM, TCGA-LGG, IvyGAP, and CPTAC-GBM collections5,24.

Prior work based on this dataset, or on subsets of it, has focused on automated tumor burden quantification. Meier et al.25 investigated the capability of automated segmentation tools for automated longitudinal tumor size measurements. Keller-Weldon et al.26 and Porz et al.27 compared automated and manual bi-dimensional tumor measurements. Meier et al.28 investigated the automated quantification of the extent-of-resection. Suter et al.29,30 used the pre-operative data to test the robustness of radiomic features and machine learning classifiers in the context of multi-center studies.

In this dataset, we release the longitudinal MRI data of 91 GBM patients who underwent pre-operative MRI between August 2008 and December 2013, with a follow-up period until 2017. The anonymized imaging data is complemented by expert ratings according to the RANO guidelines, patient age at time of diagnosis, sex, and overall survival time. The expert ratings, which we consider a key contribution of this dataset, include the disease state (progressive disease, stable disease, partial response, or complete response), the rating rationale, and bi-dimensional Macdonald31 measurements for lesions above the measurability threshold. The outputs of two state-of-the-art automated segmentation tools, DeepBraTumIA (https://www.nitrc.org/projects/deepbratumia) and HD-GLIO-AUTO6,32,33, are included; both are based on the highly successful U-Net deep learning architecture34. Furthermore, we include all radiomic features from the PyRadiomics package35 for each segmentation label, as well as Co-occurrence of Local Anisotropic Gradient Orientations (CoLlAGe) features36. The imaging information is accompanied by the MRI acquisition parameters for each image to enable researchers to correct for confounding factors and investigate the impact of imaging differences, e.g., on radiomic features and machine learning algorithms. The provided pathology information includes the O6-methylguanine-DNA methyltransferase (MGMT) promoter methylation status and the isocitrate dehydrogenase 1 (IDH1) status.

Methods

The records of 91 GBM patients who underwent pre-operative MRI between 2008 and 2013 were reviewed retrospectively, building on the study population of Schucht et al.37 based on the WHO 2016 classification scheme38. All patients were treated with surgical resection and temozolomide-based chemo-radiation at the Inselspital, the University Hospital of Bern, Switzerland. Two patients received Avastin as a second-line treatment, which is noted in the clinical information provided in the dataset. The cantonal ethics committee of Bern approved the studies and waived written informed consent.

Table 1 gives an overview of the patient demographics. The mean survival was 589 days (19 months) with a standard deviation of 334 days. The patients’ mean age at the time of the first resection was 62.4 years, with a standard deviation of 10.3 years. From the expert RANO ratings, we derived the time to progression as the time span from the first resection to the first follow-up rated as progressive disease.
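The derivation of the time to progression can be illustrated with a minimal sketch. The function name, the (week, rating) input layout, and the "PD" label are assumptions made for this example and do not reflect the column names or rating codes of the released CSV files.

```python
from typing import Optional, Sequence, Tuple


def time_to_progression(
    ratings: Sequence[Tuple[int, str]], resection_week: int = 0
) -> Optional[int]:
    """Weeks from the first resection to the first follow-up rated progressive.

    `ratings` holds (week, rating) pairs; "PD" is a hypothetical code for
    progressive disease. Returns None if no progression was rated, which
    corresponds to a censored observation.
    """
    for week, rating in sorted(ratings, key=lambda item: item[0]):
        if rating == "PD" and week >= resection_week:
            return week - resection_week
    return None


# Hypothetical rating history of one patient (weeks since the first scan).
example = [(0, "Pre-Op"), (1, "Post-Op"), (13, "SD"), (25, "PD"), (37, "PD")]
print(time_to_progression(example))
```

Patients without any follow-up rated as progressive disease yield None here and would need to be treated as censored in a downstream survival analysis.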

Table 1 Patient demographics of the single-center dataset.

The study population included 44 women and 47 men. The MGMT promoter was methylated in 37 patients and unmethylated in 43 patients, and the MGMT status was unavailable for 11 patients. IDH1 was determined as wild-type for 57 patients, 1 patient was IDH-mutant, and IDH information was unavailable for 23 patients. IDH immunohistochemistry was negative for 10 patients, for whom sequencing would have been necessary for a definite result.

Expert RANO rating

The whole dataset was retrospectively reviewed and evaluated according to the RANO guidelines by an expert neuroradiologist with 14 years of experience specializing in neuro-oncological diagnostics (U.K.). We note that in some cases, the rating is present even though not all four MRI sequences are available since a rating is possible with a subset of the images. The dataset contains 616 expert-rated time points.

Anonymization

All imaging data were skull-stripped to prevent identification by a patient’s skull shape. The skull-stripping was performed with HD-BET32. Both segmentation tools were run on the non-skull-stripped images since both include a brain extraction step. From both tools, we retained only the outputs produced after skull-stripping and renamed the resulting files so that they contain no identifying information. We exclusively provide images in the NIfTI file format to avoid leakage of sensitive metadata from DICOM headers. All dates and time spans were set to be relative to the pre-operative image acquisition. The relative time spans were subsequently converted to week counts to blur the precise follow-up intervals while preserving enough precision to retain the value of the temporal information. If two study dates fall into the same week, the relative ordering is indicated with a further number (e.g., week-000-1 and week-000-2); this most often occurs for pre- and post-operative imaging around surgical resections. A follow-up marked at week four may have been acquired between 28 and 35 days after the pre-operative acquisition. The patient numbering was created randomly.
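The week-based labeling can be sketched as follows. This is an illustration only: the function name is hypothetical, flooring the day offset to whole weeks is an assumption, and the exact rule used to create the released labels may differ.

```python
from collections import defaultdict
from typing import Dict, List


def week_labels(day_offsets: List[int]) -> Dict[int, str]:
    """Map day offsets (days since the pre-operative scan) to week labels.

    Assumes distinct offsets. Offsets are floored to whole weeks (assumption);
    studies sharing a week get an ordering suffix, e.g. week-000-1.
    """
    by_week = defaultdict(list)
    for day in sorted(day_offsets):
        by_week[day // 7].append(day)

    labels = {}
    for week, days in by_week.items():
        for rank, day in enumerate(days, start=1):
            # The suffix is only added when several studies share one week.
            suffix = f"-{rank}" if len(days) > 1 else ""
            labels[day] = f"week-{week:03d}{suffix}"
    return labels


# Pre-operative scan (day 0), post-operative scan (day 2), follow-up (day 30).
print(week_labels([0, 2, 30]))
```

Under this flooring assumption, a follow-up acquired on day 30 is labeled week-004, and the pre- and post-operative scans in the first week are distinguished only by their ordering suffix.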

MRI data

A summary of the MRI acquisition parameters for the provided MRI sequences is listed in Table 2, and an example series of longitudinal images is shown in Fig. 2. For each MRI image in the dataset, we provide details of the acquisition parameters in a CSV file, such as the vendor, model, timing, voxel size, slice spacing, field strength, flip angle, and the specific absorption rate. Of the 2487 provided MRI images, 95% were acquired on Siemens scanners, 3% on Philips scanners (Philips Medical Systems/Philips Healthcare), and 2% on scanners from GE Medical Systems.

Table 2 Acquisition parameters retrieved from the DICOM information.
Fig. 2

Example of longitudinal GBM data. CT1/T1: post-/pre-contrast T1-weighted; T2: T2-weighted; FLAIR: fluid-attenuated inversion recovery. For this patient, the contrast-enhancing tumor was completely resected. After a phase of complete response, progression occurred with a very fast growth rate.

The skull-stripped, unregistered images are provided to enable further studies not confounded by potential resampling artifacts.

Automated segmentation

Automated tumor segmentations were created with DeepBraTumIA and HD-GLIO-AUTO. Both methods include a co-registration of the four MRI sequences. HD-GLIO-AUTO registers all sequences to a reference image chosen automatically during processing, and DeepBraTumIA registers to an atlas such that all studies may be analyzed in the same space. HD-GLIO-AUTO was slightly modified to retain the transformation used for co-registration of the MRI sequences to enable researchers to perform analysis in the original image space. DeepBraTumIA also provides the possibility of studying the segmentation in the original patient space. Hence, for both segmentation methods, we provide the segmentations back-transformed to the space of each original image sequence.

HD-GLIO-AUTO provides segmentations for the contrast-enhancing tumor and the T2-signal abnormality. DeepBraTumIA outputs labels for necrosis, contrast enhancement, and edema.

We release automated segmentation of both tools to provide a richer algorithm benchmark and enable researchers to assess the impact of segmentation variability on different parameters of interest (e.g., radiomic biomarkers). The automated segmentation tools require all four MRI sequences as input, which is the case for 599 study dates.

Radiomic features

Radiomic features from the PyRadiomics package (version 3.0.1) are extracted on the co-registered and resampled images for each segmentation label. Feature types include first-order statistics, 3D shape, gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), neighboring gray-tone difference matrix (NGTDM), and gray level dependence matrix (GLDM) features. For extraction, z-score normalization, scaling by a factor of 100, and intensity shifting by 300 were applied to harmonize the value range and ensure that the majority of voxels have a positive intensity value. The bin width was set to 5.
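The arithmetic behind these settings can be sketched in plain Python. The snippet below is not the PyRadiomics implementation. It only illustrates z-score normalization, scaling, shifting, and fixed-width binning on a flat list of intensities. The dictionary lists what we understand to be the corresponding PyRadiomics setting names and should be checked against the package documentation; the function names are hypothetical.

```python
import math
import statistics
from typing import List, Sequence

# Assumed mapping of the described settings to PyRadiomics setting names.
ASSUMED_SETTINGS = {
    "normalize": True,
    "normalizeScale": 100,
    "voxelArrayShift": 300,
    "binWidth": 5,
}


def normalize_and_shift(
    values: Sequence[float], scale: float = 100.0, shift: float = 300.0
) -> List[float]:
    """Z-score the intensities, multiply by `scale`, and add `shift`."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [scale * (v - mean) / std + shift for v in values]


def discretize(values: Sequence[float], bin_width: float = 5.0) -> List[int]:
    """Assign fixed-width bins, numbered from 1 at the lowest occupied bin."""
    lowest = math.floor(min(values) / bin_width)
    return [math.floor(v / bin_width) - lowest + 1 for v in values]


intensities = [0.0, 500.0, 1000.0, 1500.0, 2000.0]
harmonized = normalize_and_shift(intensities)
print(harmonized)
print(discretize(harmonized))
```

After this transformation the intensities have a mean of 300 and a standard deviation of 100, so only voxels more than three standard deviations below the mean become negative.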

The CoLlAGe features were calculated with a singular value decomposition radius of 5 and 64 unique angles. Since CoLlAGe features are calculated voxel-wise, we provide feature maps for the whole segmentation foreground area. For convenient display, the feature vector images are split into their primary and secondary components. The CoLlAGe feature extractor requires a minimum of 50 image slices. In some cases, especially for HD-GLIO-AUTO, the reference image had a lower slice count. To provide a more complete feature set, we extracted this feature type for HD-GLIO-AUTO after resampling the images to 1 mm isotropic voxels. The resampling was applied to all cases (even those with enough original slices) to ensure consistent processing across time points and patients. We provide the code for the required auxiliary steps and files in the linked GitHub repository. The two feature vector images are provided for all MRI sequences and both automated segmentation tools.

While these feature extraction settings may be used as a feasible baseline, we encourage researchers to customize the parameters for a given target task.

Data Records

All records for this data collection are available through Figshare39. MRI images and tumor segmentations are stored in the Neuroimaging Informatics Technology Initiative (NIfTI) format, maintaining raw medical image coordinates. Clinical and pathology data, radiomics features, MRI acquisition information, and the expert ratings are stored in comma-separated values (CSV) files. Furthermore, we provide CSV files detailing the data completeness for each study date, as well as an extensive readme document describing the origin of each image or how it was derived.

Technical Validation

Each study was visually assessed to verify that the skull-stripping was of sufficient quality to ensure patient anonymization. Segmentation outliers from both automated segmentation tools were intentionally left in the dataset to enable studies focusing on common failure modes of automated segmentation tools.

The pathology data was used to ensure our findings were in line with expected and prior clinical findings. Figure 3 shows the Kaplan-Meier curves for our patient cohort stratified by the MGMT promoter status. Consistent with the literature, we observe a much higher median survival time for patients with methylated MGMT status40,41.

Fig. 3

Kaplan-Meier (KM) curves for MGMT methylated and unmethylated groups. As expected, patients with MGMT methylation have a much higher median survival.
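For readers who wish to reproduce such a stratified comparison, the Kaplan-Meier product-limit estimator can be sketched in a few lines. The function names are hypothetical and the survival times are invented for illustration; they are not taken from the dataset.

```python
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(
    durations: Sequence[float], observed: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) steps of the product-limit estimate.

    `observed[i]` is False when patient i was censored at `durations[i]`.
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, observed) if e and d == t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve: Sequence[Tuple[float, float]]) -> Optional[float]:
    """First time at which the estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Invented survival times in days; the second patient is censored.
durations = [5, 10, 15, 20, 25]
observed = [True, False, True, True, True]
curve = kaplan_meier(durations, observed)
print(curve, median_survival(curve))
```

Applying the estimator separately to the methylated and unmethylated groups gives the two curves to compare; a dedicated survival analysis library would additionally provide confidence intervals and significance tests.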

The expert rating was checked for infeasible disease state transitions and revised where needed. We furthermore investigated the transition probability between disease states for patients with methylated and unmethylated MGMT promoter status. Each probability was calculated as the number of occurrences of a given state change divided by the total number of transitions from the originating state. As shown in Fig. 4, we confirm the poorer prognosis of patients with unmethylated MGMT status on a longitudinal level, with a lower probability of complete treatment response and higher probabilities of fast disease progression.
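The occurrence-count estimate of the transition probabilities can be sketched as follows. The function name and the state labels are placeholders chosen for this example, and transitions into the same state are counted as well.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Sequence


def transition_probabilities(
    sequences: Iterable[Sequence[str]],
) -> Dict[str, Dict[str, float]]:
    """Estimate P(next state | current state) from per-patient rating sequences."""
    counts = defaultdict(Counter)
    for states in sequences:
        # Count every pair of consecutive ratings, including unchanged states.
        for current, following in zip(states, states[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


# Invented rating sequences for three patients (placeholder state labels).
patients = [
    ["Post-Op", "SD", "PD"],
    ["Post-Op", "PD"],
    ["Post-Op", "SD", "SD", "PD"],
]
print(transition_probabilities(patients))
```

Running the function once per MGMT group yields one transition matrix per methylation status, and entries below a chosen display threshold can then be hidden for plotting.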

Fig. 4

State transitions stratified by the MGMT promoter methylation status. Transitions with a probability of less than 0.1 are not shown for clarity. For MGMT promoter methylation, we observed about the same probability of going directly to progression after resection but more stable states than for the unmethylated status. For the unmethylated MGMT promoter status, the transition probabilities into the progressive disease state were much higher. We note that the sample size for staying in the complete response state was very small in both groups, so the large difference between the two methylation statuses should be interpreted with caution.

Usage Notes

To visually assess the imaging data, we recommend software such as 3D Slicer42 or ITK-Snap43. We provide example Python scripts that give researchers a fast overview of the imaging data and allow them to extract radiomic features with customized settings. This dataset may only be used for non-commercial purposes.