Introduction

The scientific community has become interested not only in harnessing the predictive performance of machine learning models, but also in dissecting such models to distill useful knowledge that can potentially advance scientific understanding1. When a model achieves high prediction performance in a particular task, it is expected to have acquired an expressive internal representation that approximates the explanatory patterns underlying the phenomena of interest. Therefore, the internal representations of trained models can be interpreted to obtain meaningful insights and scientific knowledge without directly observing the phenomena. Based on this concept of acquiring medical knowledge in a data-driven manner, the objective of this study is to discover common features in medical imaging associated with specific clinical information across a patient population.

In particular, this study focuses on the imaging phenotypes of gliomas, which are the most common central nervous system tumors2,3. According to the grading system of the World Health Organization (WHO), gliomas are classified into grades I to IV based on histopathological findings obtained from surgical biopsies or specimens4. Because the degree of aggressiveness and infiltration strongly affects prognosis, the differential diagnosis between lower-grade gliomas (LGG, WHO grades II and III) and high-grade gliomas (HGG, WHO grade IV) is an important issue for treatment planning and prognostication5.

Currently, the standard procedure for classifying tumors according to the WHO grades is based on pathological examination. However, this approach has several limitations, including the requirement for invasive procedures such as surgical resections or biopsies, inherent sampling errors caused by the heterogeneity of tumors, and the time-consuming process of histopathological analysis. There are also cases wherein it may be dangerous to perform surgical procedures on tumors located at critical sites in the brain. To address these issues, the computational analysis of magnetic resonance imaging (MRI) for tumor grading has attracted significant attention6,7. Because MRI can non-invasively capture an entire tumor in vivo, it is free from sampling errors. Therefore, the management of gliomas based on multi-parametric MRI analysis can play a complementary role in pathology-based diagnosis.

Radiomics and deep learning are two mainstays for computational analysis of tumor images. Many intensive studies have attempted to analyze the imaging phenotypes of glioma, and each of these approaches has certain advantages and disadvantages in gaining meaningful insights from trained models.

Radiomics is a research field focusing on decoding tumor phenotypes based on quantitative imaging features8. Typically, suitable sets of handcrafted imaging features are extracted from the region of interest (ROI) for analysis. Subsequently, a prediction model based on a machine learning algorithm is trained for a particular prediction task relevant to clinical decision-making. For glioma grading, many previous studies have demonstrated that the tumor characteristics can be quantified using radiomics, and have reported satisfactory discriminative performance9,10,11,12. Because the radiomics approach uses pre-defined handcrafted imaging features, it has the advantage of high interpretability for the features contributing to the prediction. However, to implement problem-specific handcrafted features, domain knowledge is often required. Because the optimal representative features for a given task are not always obvious, a data-driven approach should be considered to represent the data distribution.

Deep learning has emerged as an innovative technology that enables end-to-end learning between the input data and ground-truth labels13. Using backpropagation to tune the parameters of multilayered nonlinear operations during the training process, deep neural networks can automatically abstract useful representations from data. In other words, deep neural networks are capable of data-driven feature extraction. Therefore, a deep learning model can learn internal representations that are meaningful for distinguishing the attributes of samples without relying on feature engineering based on domain knowledge. For example, deep-learning-based algorithms have achieved remarkable prediction performance in glioma grade classification14,15,16. However, in such complex models, a tradeoff between accuracy and explainability has traditionally existed17. Hence, complex models, such as deep learning models, are occasionally referred to as black-box models18, implying that it is difficult to interpret how the models arrive at a particular outcome.

At the core of our challenge is the internal variability of convolutional neural networks (CNNs). When a CNN is trained to predict the imaging characteristics of gliomas, internal representations are acquired as low-dimensional feature vectors, which collectively constitute the feature maps. One may argue that these feature vectors can then be used as imaging markers in downstream tasks because they are expected to adequately represent the appearance of tumors. Nevertheless, only a few studies have deeply investigated the types of imaging characteristics exploited by deep learning models for prediction in clinical tasks of glioma imaging. Among existing studies, Banerjee et al.15 investigated the properties of convolutional kernels in different layers through visualization. However, the internal variability of typical CNNs, whereby each feature map changes dynamically depending on the individual input, still hinders model interpretability, especially when the goal is to determine which types of internal representations are critical for a specific task. Because the objective of the majority of medical studies is to find specific factors that are significantly common in a diseased population, it is crucial to fix the variability of the feature vectors representing targeted imaging phenotypes.

To combine the advantages of radiomics and deep learning by resolving the internal variability of CNNs, we propose a straightforward approach that incorporates vector quantization into the feature extraction process of deep learning models. Specifically, we apply vector quantization to the latent representation inside a segmentation model based on an encoder–decoder structure for tumor regions in images. Through vector quantization, the individually varying features extracted by the encoder are replaced with a fixed set of feature vectors, the configuration of which is also optimized during model training. As a result, each imaging phenotype can be indicated by a shareable set of feature vectors, allowing them to be used as imaging markers for downstream tasks. Subsequently, we attempt to identify specific types of internal representations associated with particular clinical information by training a classification model based on the set of feature vectors. Thus, our approach combines the flexible representative capacities of deep learning and the highly interpretable aspects of radiomics to acquire meaningful knowledge in a data-driven manner, which we call deep radiomics. Additionally, we devise a feature ablation study to visualize which types of imaging characteristics are utilized by the classification model, providing physicians with interpretable feedback on the task-specific radiological findings. We also discuss whether the obtained results are consistent with the findings reported in the literature.

Methods

In this section, we describe a method to extract a shareable set of feature vectors inside a segmentation network by incorporating vector quantization and to utilize them for the classification of glioma grades using logistic regression. The latter task was formulated as a binary classification whereby an input magnetic resonance (MR) volume is diagnosed either as LGG or HGG. Additionally, the types of imaging characteristics that enable the prediction were investigated by conducting a feature ablation study.

Dataset

We prepared a dataset of brain MRIs with gliomas from the 2019 BraTS Challenge19,20,21,22. This dataset contains T1, Gd-enhanced T1, T2, and FLAIR sequences for patients diagnosed with LGG or HGG. Note that LGG stands for “lower-grade” glioma herein, the definition of which includes both low-grade glioma (WHO grade II) and intermediate-grade glioma (WHO grade III)5,23. Bakas et al.24 provide a detailed description of the scanning and annotation protocols. Briefly, all clinically acquired multi-parametric MRI scans were co-registered to a common anatomical template, resampled to \(1 \;{\mathrm {mm}}^3\), and skull-stripped.

In this study, all four sequences were used, and three datasets were obtained: a training dataset (MICCAI_BraTS_Training) containing 335 patients, a validation dataset (MICCAI_BraTS_Validation) containing 125 patients, and a test dataset (MICCAI_BraTS_Testing) containing 167 patients. Only MICCAI_BraTS_Training contains a pathologically confirmed patient-level diagnosis of LGG (76 patients) or HGG (259 patients)24. MICCAI_BraTS_Training originally contained three ground-truth segmentation labels for abnormalities: Gd-enhanced tumor (ET), peritumoral edema (ED), and necrotic and non-enhancing tumor core (NET). Under the supervision of expert radiologists, we segmented the images in MICCAI_BraTS_Validation and MICCAI_BraTS_Testing into the same three abnormal categories (ET, ED, and NET). Note that the roles the datasets play in this study differ from the names given in the 2019 BraTS Challenge. To train the segmentation network, a dataset obtained by concatenating MICCAI_BraTS_Validation and MICCAI_BraTS_Testing was used as the training dataset. After training the segmentation network, a classification model was constructed and validated on MICCAI_BraTS_Training, which is the only dataset containing information on the glioma grades.

Proposed algorithm for deep radiomics

Here, we describe the algorithm for extracting and exploiting deep radiomics for the classification of glioma grades.

Overview of the algorithm

We first train an encoder–decoder network to predict the segmentation of glioma imaging characteristics from a two-dimensional (2D) axial slice of multi-parametric MRI (Fig. 1a). The core of our proposal is to perform vector quantization at the bottom of the segmentation network, where a codebook consisting of a fixed number of feature vectors as codewords is trained to capture the imaging characteristics meaningful for tumor segmentation (Fig. 1b). After training, for each input image, the varying feature representations produced by the encoder are substituted with codewords located at fixed positions through vector quantization. The codewords in the learned codebook can thus be regarded as shareable across the dataset. Subsequently, the imaging features of each MRI volume are represented as a histogram, which summarizes how many times each codeword in the codebook appears across the slices of the MRI volume (Fig. 1c). Thereafter, by applying simple logistic regression to classify glioma grades based on the histogram representation, a set of feature vectors that are significantly associated with the prediction is identified. We further conduct a feature ablation study to visualize which types of imaging characteristics are associated with glioma grades in the image space (Fig. 2).

Figure 1

Obtaining a shareable set of feature vectors from a segmentation network. (a) A segmentation network consists of an encoder–decoder pair and stores a shareable set of feature vectors in a codebook. At the training stage of a tumor segmentation pre-task, an input image \(\varvec{x}\) is mapped onto a latent representation \(\varvec{z}_e\) through the encoder. Vector quantization is performed based on the codebook \(\varvec{e}\) by replacing each feature vector in \(\varvec{z}_e\) with the nearest codeword to produce a quantized latent representation \(\varvec{z}_q\). Then, the decoder produces a segmentation output by taking \(\varvec{z}_q\) as the input. The error between the segmentation output and a ground-truth label is evaluated to train the network. (b) During the training, the codebook loss \(\nabla L_{\mathrm {codebook}}\) moves the codebook variables toward the encoder’s output, while the commitment loss \(\nabla L_{\mathrm {commit}}\) exerts the opposite effect. To alter the configuration of the codebook, the encoder’s output is updated for the next forward pass according to the learning objective \(\nabla _z L_{\mathrm {total}}\). (c) When using the shareable set of feature vectors in a downstream task, the encoder is employed as a feature extractor. The latent representation of an input image is mapped onto the quantized latent representation \(\varvec{z}_q\), and then a histogram representation is constructed. This histogram representation contains information on the frequency with which each feature vector appears in the input image.

Figure 2

Overview of the feature ablation study conducted to visualize the image region encoded by each feature vector. (a) The input image is initially mapped onto the quantized latent representation \(\varvec{z}_q\) through the encoder, which functions as a feature extractor. This initial latent representation is subsequently fed into the decoder to generate the segmentation output \(\hat{\varvec{y}}\), and the logit map \(\tilde{\varvec{y}}\) obtained before the final argmax operation is retained for the subsequent procedure. Then, the feature vector of interest in \(\varvec{z}_q\) is replaced with a background vector to generate the replaced latent representation \(\varvec{z}_q^\prime \). The background vector is identified as the most common feature vector in the background of the images (that is, the region outside the body). Next, the decoder outputs the logit map \(\tilde{\varvec{y}}^\prime \) again by taking \(\varvec{z}_q^\prime \) as the input. Because the difference between \(\tilde{\varvec{y}}\) and \(\tilde{\varvec{y}}^\prime \) reflects the image region affected by the replacement, the difference map is referred to as the responsible region of the feature vector of interest. (b) The two responsible regions corresponding to the HGG responsible vectors are shown along with examples of an input image, ground-truth label, and segmentation output. By collecting the responsible regions from all responsible vectors for a particular glioma grade, we can observe the relation between the type of imaging characteristics and the glioma grade.

Notation

Let us consider a multi-parametric three-dimensional (3D) MRI volume \(\varvec{X} \in {\mathbb {R}}^{C \times W \times H \times I}\), where C is the number of channels, W and H represent the width and height of the axial slices, respectively, and I is the number of axial slices. We define \(\varvec{x} \in {\mathbb {R}}^{C \times W \times H}\) as a slice in the axial view. The segmentation network encodes a slice-wise input \(\varvec{x}\) into the low-dimensional latent representation \(\varvec{z} \in {\mathbb {R}}^{C^{\prime } \times W^{\prime } \times H^{\prime }}\) and decodes the segmentation output \(\hat{\varvec{y}} \in {\mathbb {R}}^{S \times W \times H}\), where S is the number of segmentation labels. The ground-truth segmentation label \(\varvec{y} \in {\mathbb {R}}^{S \times W \times H}\) is used to train the segmentation network. The series of latent representations \(\varvec{z}\) for each slice of the MRI volume can be concatenated into a summarized representation \(\varvec{Z} \in {\mathbb {R}}^{C^{\prime } \times W^{\prime } \times H^{\prime } \times I}\), which is considered a volume-wise representation. The glioma grades are classified on a volume basis because grading is carried out for each patient based on pathological examinations24.

Segmentation networks with a shareable set of feature vectors

A segmentation network was trained to extract a shareable set of feature vectors. As shown in Fig. 1a, the network consists of an encoder–decoder pair connected via a discrete latent space containing a set of feature vectors as codewords in a codebook. Through the encoder, a 2D MRI slice \(\varvec{x}\) is mapped to a latent representation \(\varvec{z}_e\), which varies according to the individual input. In the latent space, vector quantization is performed based on a codebook \(\varvec{e} = \{e_k |k = 1, \ldots , K\}\in {\mathbb {R}}^{K \times D}\), which stores a shareable set of K feature vectors as codewords \(e_k \in {\mathbb {R}}^D\), by replacing each feature vector in \(\varvec{z}_e\) with the nearest codeword to produce the quantized latent representation \(\varvec{z}_q\). This vector quantization process is analogous to that of a vector-quantized variational autoencoder (VQ-VAE)25,26. As illustrated in Fig. 1b, the feature vectors corresponding to each voxel of \(\varvec{z}_e\) are quantized by executing a nearest-neighbor lookup on the codebook, as follows:

$$\begin{aligned} z_i = {\mathop {\mathrm{arg\, min}}\limits _{k \in [K]}} \Vert \varvec{z}_{e_i} - e_k\Vert _2. \end{aligned}$$
(1)

Thereafter, the codewords in the codebook are collected as a quantized latent representation \(\varvec{z}_q\), as follows:

$$\begin{aligned} \varvec{z}_{q_i} = e_{z_i}. \end{aligned}$$
(2)

To optimize this process, the codebook and encoder are trained to minimize the objective, which is referred to as latent loss, as follows:

$$\begin{aligned} L_{{\mathrm {latent}}} = \Vert {\mathrm {sg}}[\varvec{z}_e(x)] - \varvec{e}\Vert ^2_2 + \beta \Vert \varvec{z}_e(x) - {\mathrm {sg}}[\varvec{e}]\Vert ^2_2, \end{aligned}$$
(3)

where \({\mathrm {sg}}\) represents the stop-gradient operator, which serves as an identity function at forward computation time and has zero partial derivatives. During training, the codebook loss, which is the first term in the aforementioned equation, updates the codebook variables by moving the codewords toward the encoder’s output (see the arrow indicated by \(\nabla L_{\mathrm {codebook}}\) in Fig. 1b). Simultaneously, the commitment loss, which is the second term, encourages the output of the encoder to move closer to the target codewords (see the arrow indicated by \(\nabla L_{\mathrm {commit}}\) in Fig. 1b). The hyperparameter \(\beta \) controls the reluctance of the encoder output to change toward the corresponding codewords. Backpropagation or an exponential moving average can be used to train the codebook27. Notably, the size of the codebook can be arbitrarily tuned, which controls how much information is preserved and compressed within the latent space26.
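As a minimal sketch of this quantization layer in PyTorch, Eqs. (1)–(3) could be realized as follows; the class name, the codebook initialization, the straight-through gradient pass-through, and \(\beta = 0.25\) are illustrative choices in the spirit of VQ-VAE25,26, not details taken from our implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Minimal VQ layer: K codewords of dimension D (hypothetical sketch)."""
    def __init__(self, K=512, D=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(K, D)                    # e = {e_1, ..., e_K}
        self.codebook.weight.data.uniform_(-1.0 / K, 1.0 / K)
        self.beta = beta

    def forward(self, z_e):                                   # z_e: (B, D, H', W')
        B, D, H, W = z_e.shape
        flat = z_e.permute(0, 2, 3, 1).reshape(-1, D)         # one feature vector per voxel
        idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)    # Eq. (1)
        z_q = self.codebook(idx).view(B, H, W, D).permute(0, 3, 1, 2)  # Eq. (2)
        # Eq. (3): codebook loss + beta * commitment loss (detach acts as sg)
        latent_loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        z_q = z_e + (z_q - z_e).detach()                      # straight-through estimator
        return z_q, idx.view(B, H, W), latent_loss
```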

Then, the decoder takes \(\varvec{z}_q\) as input and generates the segmentation map \(\hat{\varvec{y}}\), which is encouraged to be similar to the ground-truth labels \(\varvec{y}\). The segmentation loss function consists of the soft Dice28 and focal losses29. In summary, the overall training objectives for the segmentation network are as follows:

$$\begin{aligned} L_{\mathrm {total}} = L_{\mathrm {latent}} + L_{\mathrm {segmentation}}. \end{aligned}$$
(4)

At each iteration to minimize Eq. (4), the encoder output \(\varvec{z}_e\) is updated to alter the configuration in the next forward pass (see the arrow indicated by \(\nabla _z L_{\mathrm {total}}\) in Fig. 1b). Consequently, after the training of the tumor segmentation, we can consider the codewords as a shareable set of feature vectors that contain the representations describing imaging phenotypes of gliomas. Hereinafter, the image analysis method exploiting this shareable set of feature vectors obtained in a data-driven manner is called deep radiomics.
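For concreteness, a sketch of the segmentation term under common formulations of the soft Dice28 and focal29 losses follows; the \(\gamma \) and \(\epsilon \) values are conventional defaults, not values reported here:

```python
import torch

def soft_dice_loss(logits, target_onehot, eps=1e-6):
    # logits: (B, S, H, W); target_onehot: (B, S, H, W)
    probs = logits.softmax(dim=1)
    inter = (probs * target_onehot).sum(dim=(2, 3))
    denom = probs.sum(dim=(2, 3)) + target_onehot.sum(dim=(2, 3))
    return 1.0 - (2.0 * inter / (denom + eps)).mean()

def focal_loss(logits, target_idx, gamma=2.0):
    # target_idx: (B, H, W) integer class labels
    logp_t = logits.log_softmax(dim=1).gather(1, target_idx.unsqueeze(1)).squeeze(1)
    return -(((1.0 - logp_t.exp()) ** gamma) * logp_t).mean()

# Eq. (4): L_total = latent_loss + soft_dice_loss(...) + focal_loss(...)
```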

Histogram representation of brain MRI based on deep radiomics

We hypothesize that these feature vectors can be useful for distinguishing between LGG and HGG. To demonstrate this, we start with a volume-wise representation of brain MRI, as the pathologically confirmed glioma grade is associated with the entire volume. We use the encoder followed by vector quantization as a feature extractor f to produce the slice-wise quantized latent representation \(\varvec{z}_q\) (Fig. 1c). All I quantized latent representations \(\{\varvec{z}_1, \ldots , \varvec{z}_I\}\) extracted from the slices \(\{\varvec{x}_1, \ldots ,\varvec{x}_I\}\) in the MRI volume \(\varvec{X} \in {\mathbb {R}}^{C \times W \times H \times I}\) are concatenated into a volume-wise representation \(\varvec{Z}_q\). Subsequently, we convert this representation into a histogram that approximates the imaging appearance as a count of each feature vector on a volume basis, as follows:

$$\begin{aligned} \varvec{Z}_q = \sum _{i \in I} f(\varvec{x}_i) = \sum _{i \in I} \varvec{z}_{q_i} \approx \sum _{i \in I} H_{k \in K} (c_{k_i}, e_k) = H_{k \in K} (c_k, e_k), \end{aligned}$$
(5)

where H is an operator to rearrange a histogram according to the number of feature vectors, K is the number of discrete feature vectors in the codebook, \(c_{k_i}\) is the number of occurrences of the \(k\hbox {th}\) feature vector in the \(i\hbox {th}\) axial slice, and \(c_k\) is the summed occurrence of the \(k\hbox {th}\) feature vector in the MRI volume.
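As an illustration of Eq. (5), and assuming the VectorQuantizer sketch above together with a hypothetical encoder, the volume-wise histogram can be accumulated slice by slice:

```python
import torch

def volume_histogram(encoder, quantizer, slices, K=512):
    """Count how often each codeword appears across all I axial slices (Eq. 5).

    slices: iterable of (C, H, W) tensors; returns the (K,)-dim histogram c_k.
    """
    hist = torch.zeros(K, dtype=torch.long)
    with torch.no_grad():
        for x in slices:                            # slice-wise feature extraction f
            z_e = encoder(x.unsqueeze(0))           # (1, D, H', W')
            _, idx, _ = quantizer(z_e)              # codeword indices z_i per voxel
            hist += torch.bincount(idx.flatten(), minlength=K)
    return hist
```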

Classification models for glioma grades

A key benefit of vector quantization is that a specific set of feature vectors stored in the codebook can be shared across a population, fixing the variability of the internal representations of CNNs. This allows these feature vectors to be used as imaging markers for downstream tasks. Therefore, to establish a binary classification model discriminating the glioma grade, we used logistic regression based on the histogram representation. By considering the number of occurrences \(c_k\) of each feature vector as an explanatory variable, the logistic regression model can be formulated as follows:

$$\begin{aligned} {\mathrm {logit}} (p) = \beta _0 + \sum _{k \in K^{*}} \beta _k c_k, \end{aligned}$$
(6)

where p indicates the probability of a particular class, \(\beta \) is a regression coefficient, and \(K^{*}\) denotes the set of feature vectors whose coefficients are significant according to the effect likelihood ratio test. The classification performance was evaluated based on accuracy, precision, recall (sensitivity), specificity, and negative predictive value, where HGG and LGG were considered positive and negative, respectively.
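A sketch of this classification step on synthetic data follows; note that statsmodels reports Wald p-values, which are used here only as a stand-in for the effect likelihood ratio test, and the feature subset, counts, and labels are all hypothetical:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(335, 10)).astype(float)   # counts c_k for 10 candidate vectors
y = rng.integers(0, 2, size=335)                     # 1 = HGG, 0 = LGG (synthetic labels)

result = sm.Logit(y, sm.add_constant(X)).fit(disp=0)  # Eq. (6)
responsible = np.where(result.pvalues[1:] < 0.05)[0]  # candidate set K*
print(responsible)
```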

Robustness assessment of the deep radiomics

The robustness of features under varying scanning and segmentation conditions is a significant challenge in conventional radiomics30. Several researchers have studied the reproducibility of radiomics and reported that radiomics features vary depending on image preprocessing factors such as voxel size, slice thickness, and normalization methods31,32,33. Therefore, a robustness assessment of the deep radiomics is necessary to demonstrate its usefulness in medical image analysis.

We evaluated the robustness of the deep radiomics from two perspectives. First, we investigated the reproducibility of the volume-wise histogram representation given in Eq. (5). As the standardization of pixel/voxel intensities in brain MRIs significantly affects radiomics markers31,34, we imposed perturbations by scaling and shifting the pixel values of the input images. Then, we quantified the extent to which the extracted feature vectors deviated from the original histogram, which was acquired without any perturbation. This is formulated as an index called the difference ratio:

$$\begin{aligned} \text {difference ratio} = \frac{\text {number of feature vectors different from the original histogram}}{\text {number of feature vectors in the original histogram}}, \end{aligned}$$
(7)

where the numerator is calculated as the sum of the absolute values of the differences in the number of occurrences of each feature vector. Second, we assessed the performance degradation of the classification model in Eq. (6) under the same perturbations. The performance indices, namely accuracy, precision, recall (sensitivity), specificity, and negative predictive value, were calculated according to the magnitude of the perturbations.
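A direct transcription of Eq. (7), assuming the histogram sketch above, could read:

```python
def difference_ratio(hist_orig, hist_pert):
    """Eq. (7): summed absolute count changes over the total original count."""
    numer = (hist_pert - hist_orig).abs().sum().item()
    return numer / hist_orig.sum().item()

# hypothetical perturbation sweep, as in Fig. 4a: scale pixel intensities by (1 + s)
# for s in [i / 10 for i in range(11)]: compare volume_histogram(...) before and after
```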

Identification of responsible vectors

For interpretability, linear models such as logistic regression are considered transparent, whereas complex models involving deep learning are sometimes regarded as black-box35. Transparent models are so called because their decision processes are inherently interpretable. For example, for a logistic regression model that fits a target observation well, statistical tests of the individual predictors can identify the variables that contribute significantly to the prediction. Therefore, we sought to identify feature vectors whose coefficients exhibited statistical significance using the effect likelihood ratio test, which is indicated by \(K^*\) in Eq. (6). We refer to these significant feature vectors as responsible vectors. Then, to elucidate the preference of each responsible vector for either LGG or HGG, we analyzed the frequency of each responsible vector according to the glioma grade using the Wilcoxon signed-rank test, because the null hypothesis for the normality of the variable distribution was rejected by the Shapiro–Wilk test. If a responsible vector is significantly more frequent in LGG patients, it is called an LGG responsible vector. Similarly, HGG responsible vectors are feature vectors that are significantly more frequent in HGG patients. The level of statistical significance was set to \(p < 0.05\).
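A sketch of this selection on synthetic counts follows; the text specifies the Wilcoxon signed-rank test, but because the two grade groups in this illustration are independent and of unequal size, SciPy's Mann–Whitney U test is substituted here purely for demonstration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
c_lgg = rng.poisson(3.0, size=76)     # hypothetical occurrence counts in LGG patients
c_hgg = rng.poisson(5.0, size=259)    # hypothetical occurrence counts in HGG patients

# normality rejected by Shapiro-Wilk -> fall back to a rank-based comparison
if stats.shapiro(np.concatenate([c_lgg, c_hgg])).pvalue < 0.05:
    _, p = stats.mannwhitneyu(c_lgg, c_hgg)
    if p < 0.05:
        grade = "HGG" if c_hgg.mean() > c_lgg.mean() else "LGG"
        print(f"{grade} responsible vector (p = {p:.4f})")
```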

Feature ablation study to visualize responsible regions

To enhance the interpretability of deep radiomics, we further devise a feature ablation study to visualize the imaging characteristics that are encoded by a specific feature vector (Fig. 2). First, an input image is projected onto a corresponding latent representation by the encoder and the vector quantization (Fig. 2a). The quantized latent representation \(\varvec{z}_q\) is then fed into the decoder to generate the logit map \(\tilde{\varvec{y}}\), which is subsequently converted into the segmentation output \(\hat{\varvec{y}}\) through the argmax function. Here, the logit map \(\tilde{\varvec{y}}\) is retained for further processing. Next, the feature vector of interest in the initial latent representation \(\varvec{z}_q\) is replaced with a background vector, which is defined as the most common vector in the background of the images (that is, the black region outside the body in MRI). The replaced latent representation \(\varvec{z}_q^\prime \) is subsequently input into the decoder and the corresponding logit map \(\tilde{\varvec{y}}^\prime \) is retained. Finally, the per-pixel L1 difference between the two logit maps, \(\tilde{\varvec{y}}\) and \(\tilde{\varvec{y}}^\prime \), is evaluated. Because the difference map reflects the changed segmentation output through the removal of the feature vector of interest, we can assess the imaging characteristics encoded by each feature vector by observing the corresponding region in the input image. Therefore, we call this difference map the responsible region (Fig. 2b). The responsible regions from all LGG responsible and HGG responsible vectors are collectively denoted as the LGG responsible region and HGG responsible region, respectively.
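A sketch of this ablation, assuming the tensor shapes of the earlier sketches and a hypothetical decoder, could be written as:

```python
import torch

def responsible_region(decoder, z_q, codeword_idx, target_id, bg_vec):
    """Replace one codeword with the background vector and diff the logit maps.

    z_q: (1, D, H', W') quantized latent; codeword_idx: (1, H', W') indices;
    target_id: index of the feature vector of interest; bg_vec: (D,) background codeword.
    """
    with torch.no_grad():
        logits = decoder(z_q)                                # original logit map
        mask = (codeword_idx == target_id).unsqueeze(1)      # voxels holding the target vector
        z_q_prime = torch.where(mask, bg_vec.view(1, -1, 1, 1), z_q)
        logits_prime = decoder(z_q_prime)                    # ablated logit map
    return (logits - logits_prime).abs().sum(dim=1)          # per-pixel L1 difference map
```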

For a quantitative assessment, the values of the responsible region (the per-pixel L1 difference between \(\tilde{\varvec{y}}\) and \(\tilde{\varvec{y}}^\prime \)) were calculated according to each segmentation label (ET, ED, and NET). The null hypotheses for the normality of these values in the LGG and HGG responsible regions were rejected by the Shapiro–Wilk test (\(p < 0.05\)). Thus, we performed the Kruskal–Wallis test and non-parametric comparisons for all pairs (NET-ED, ED-ET, and NET-ET) using the Dunn method for joint ranking to reveal the responsible regions that are significantly associated with a particular tumor region.

Implementation details

The segmentation network was implemented and trained according to the following descriptions.

Preprocessing

All four sequences, T1, Gd-enhanced T1, T2, and FLAIR, were concatenated into a four-channel MR volume \(\varvec{X} \in {\mathbb {R}}^{4 \times 240 \times 240 \times 155}\). The preprocessing pipeline, including axial image resizing to \(256 \times 256\) and Z-score normalization, was then applied. Each 3D MR volume was subsequently decomposed into a collection of 2D axial slices \(\{\varvec{x}_1, \varvec{x}_2, \dots , \varvec{x}_{155} \in {\mathbb {R}}^{4 \times 256 \times 256} \}\). Both the training and validation datasets were preprocessed in this manner.
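A minimal sketch of this pipeline, assuming PyTorch tensors and per-channel Z-score normalization over the non-zero (brain) voxels (the normalization region is our assumption):

```python
import torch
import torch.nn.functional as F

def preprocess(volume):
    """volume: (4, 240, 240, 155) multi-parametric MR volume -> list of 2D slices."""
    v = volume.clone().float()
    for c in range(v.shape[0]):
        brain = v[c][v[c] > 0]                      # skull-stripped background is zero
        v[c] = (v[c] - brain.mean()) / (brain.std() + 1e-8)
    v = v.permute(3, 0, 1, 2)                       # (155, 4, 240, 240): slices first
    v = F.interpolate(v, size=(256, 256), mode="bilinear", align_corners=False)
    return [v[i] for i in range(v.shape[0])]        # {x_1, ..., x_155}
```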

Encoder network

The encoder consists of residual blocks36, wherein two [convolution + group normalization37 + LeakyReLU] sequences are processed with residual connection. The kernel size, stride, and padding size of the convolution function in the residual blocks are set to 3, 1, and 1, respectively. From the first to the last residual blocks, the encoder uses \(32 - 64 - 128 - 128 - 128 - 128\) filter kernels. Each residual block is followed by a downsampling block to halve the feature map size, except for the bottom of the network. The downsampling block consists of a sequence of [convolution + group normalization + LeakyReLU], whose kernel size, stride, and padding size are set to 3, 2, and 1, respectively. The input image is required to have a size of \(4 \times 256 \times 256\) \((= {\mathrm {channel}} \times {\mathrm {height}} \times {\mathrm {width}})\). The encoder output, which is denoted as \(\varvec{z}_e\), has a size of \(64 \times 8 \times 8\).
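A sketch of the described building blocks follows; the number of normalization groups and the placement of channel changes inside the downsampling blocks are our assumptions:

```python
import torch.nn as nn

def conv_gn_lrelu(cin, cout, stride=1):
    return nn.Sequential(
        nn.Conv2d(cin, cout, kernel_size=3, stride=stride, padding=1),
        nn.GroupNorm(8, cout),                      # group count is an assumption
        nn.LeakyReLU(inplace=True),
    )

class ResBlock(nn.Module):
    """Two [conv + GN + LeakyReLU] sequences with a residual connection."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(conv_gn_lrelu(ch, ch), conv_gn_lrelu(ch, ch))

    def forward(self, x):
        return x + self.body(x)

def down_block(cin, cout):
    """Stride-2 [conv + GN + LeakyReLU] halving the feature-map size."""
    return conv_gn_lrelu(cin, cout, stride=2)
```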

Decoder network

The decoder architecture is approximately symmetrical to that of the encoder. From the first to the last residual block, the decoder uses \(128 - 128 - 128 - 128 - 64 - 32\) filter kernels. The residual blocks consist of two [convolution + group normalization + LeakyReLU] sequences that follow an upsampling layer, in which an interpolation function is coupled with a convolutional function to double the size of the feature map. The quantized latent representation \(\varvec{z}_q\) with a size of \(64 \times 8 \times 8\) passes through the decoder to yield a 2D segmentation output with a size of \(4 \times 256 \times 256\).

Training setups

All neural networks were implemented using Python 3.7 with the PyTorch library 1.6.038 on an NVIDIA Tesla V100 GPU with CUDA 10.0. The initialization method proposed by He et al.39 was applied to all the networks. Adam optimization40 with a learning rate of \(1 \times 10^{-4}\) was used for the segmentation network. The other hyperparameters were empirically determined as follows: batch size = 72, maximum number of epochs = 600. The size of the latent codebook was \(512 \times 64\) (\(= K \times D\)). During training, the data augmentation included horizontal flipping, random rotation, and random-intensity shifting and scaling.
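Wiring the earlier sketches together, one optimization step under the reported hyperparameters might look as follows (the module wiring itself is hypothetical, and soft_dice_loss and focal_loss refer to the loss sketches given above):

```python
import torch

def train_step(encoder, quantizer, decoder, optimizer, x, y_onehot, y_idx):
    """One step minimizing Eq. (4), assuming the earlier loss sketches."""
    z_e = encoder(x)                                  # x: (B, 4, 256, 256)
    z_q, _, latent_loss = quantizer(z_e)
    logits = decoder(z_q)
    loss = latent_loss + soft_dice_loss(logits, y_onehot) + focal_loss(logits, y_idx)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# optimizer = torch.optim.Adam(params, lr=1e-4); batch size 72; up to 600 epochs
```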

Results

Segmentation performance of segmentation network

A comparison of voxel volumes according to each tumor region (ET, ED, and NET) for the two glioma grades is shown in Table 1. The segmentation performance of the segmentation network based on the Dice score (mean ± standard deviation) was as follows: \(0.56 \pm 0.28\) for NET, \(0.68 \pm 0.16\) for ED, \(0.69 \pm 0.23\) for ET, \(0.80 \pm 0.19\) for the tumor core (NET \(+\) ET), and \(0.76 \pm 0.12\) for the whole tumor (NET \(+\) ED \(+\) ET). These intermediate Dice scores were expected, because the segmentation network has a bottleneck where the imaging features are compressed according to the limited size of the codebook. Notably, the primary objective of the segmentation network is not segmentation, but to provide a shareable set of feature vectors that sufficiently cover the imaging phenotypes of gliomas and are discriminative in downstream tasks.

Table 1 Comparison of voxel volumes (mean ± standard deviation) [\({\mathrm {cm}}^3\)] according to each tumor region between LGG and HGG in the validation dataset (MICCAI_BraTS_Training).

Histogram representation

Figure 3 shows the average histogram representations of HGG and LGG patients. These histograms indicate the average number of times each feature vector appears per MRI volume according to the glioma grade. A slight difference can be observed between the two histograms, particularly for low-frequency feature vectors. Figure 4a,b shows the difference ratio (Eq. 7), which indicates the reproducibility of the histogram representation under perturbations. For pixel intensities standardized through Z-score normalization, we applied scaling (Fig. 4a) and shifting (Fig. 4b) of pixel values with magnitudes ranging from 0.0 to 1.0 in increments of 0.1 as perturbations. As can be seen, the difference ratio increased as the degree of perturbation increased. Further, shifting tended to have a larger impact than scaling.

Figure 3

Average histogram representation for patients with (a) HGG and (b) LGG.

Figure 4

Assessment of the robustness of deep radiomics. Perturbations of pixel intensity scale and shift were applied to the input images with magnitudes in the range between 0.0 and 1.0 in increments of 0.1. (a) Difference ratio according to pixel intensity scale. (b) Difference ratio according to pixel intensity shift. (c) Classification performances (accuracy: blue, precision: orange, recall (sensitivity): gray, specificity: yellow, and negative predictive value: light blue) according to pixel intensity scale. (d) The same classification performances according to pixel intensity shift. Note that the performance degradation owing to the pixel intensity shift worsens when the magnitude exceeds 0.6. For all the data points, mean ± standard deviation is indicated.

Classification accuracy

According to fivefold cross-validation in the validation dataset, the classification results (mean ± standard deviation) of the glioma-grading model were as follows: an accuracy of \(0.90 \pm 0.03\), a precision of \(0.82 \pm 0.13\), a recall (sensitivity) of \(0.73 \pm 0.08\), a specificity of \(0.95 \pm 0.04\), and a negative predictive value of \(0.93 \pm 0.01\). Regarding the robustness of the classification model under the same perturbations, shifting (Fig. 4d) tended to entail a larger decline in performance than scaling (Fig. 4c). The performance degradation was consistent with the difference ratio caused by each level of perturbation.

Identification of responsible vectors

After evaluating the classification performance based on fivefold cross-validation, we retrained the classification model on all samples for further analysis. The classification model identified two HGG responsible vectors and three LGG responsible vectors, which were significant covariates in the logistic regression models (effect likelihood ratio test: \(p < 0.05\)) and were unevenly distributed according to the glioma grade (Wilcoxon signed-rank test: \(p < 0.05\)).

Qualitative evaluation of responsible regions

As demonstrated by the classification performance, the feature vectors in the codebook appear to represent the imaging characteristics of gliomas and may convey meaningful information for identifying the glioma grade. Therefore, we investigated the types of imaging characteristics encoded by each feature vector through the feature ablation study (Fig. 2). We visualized both the HGG and LGG responsible regions to evaluate their overlap with the segmented tumor regions provided as ground-truth labels (ET, ED, and NET).

Figure 5 shows the distribution of the HGG and LGG responsible regions in patients with HGG. Notably, the HGG responsible regions were strongly correlated with the tumor regions of the HGG patients. The large difference values (indicated in red) were concentrated in the central region of the tumor corresponding to the ET label. By contrast, although a small overlap with the LGG responsible regions was observed in the peripheral regions of the tumor, the values were relatively low, as indicated by the color map.

Figure 5

Example results for responsible regions in HGG patients. For patients with HGG, the Gd-enhanced T1 (T1CE) and FLAIR sequences, ground-truth labels, segmentation outputs, HGG responsible regions, and LGG responsible regions are shown. The tumor regions are adequately correlated with the HGG responsible regions, but overlap with the LGG responsible regions is scarce. The color map indicates the high-difference values in red and the lower-difference values in blue; the values are standardized for each patient.

Figure 6 presents the distribution of HGG and LGG responsible regions in patients with LGG. In contrast to the aforementioned results, the LGG responsible regions significantly overlapped with the central region of the tumor, and particularly the region labeled as NET. The signals of the HGG responsible regions were not remarkable, as indicated by their low values.

Figure 6

Example results for responsible regions in LGG patients. For patients with LGG, the Gd-enhanced T1 (T1CE) and FLAIR sequences, ground-truth labels, segmentation outputs, HGG responsible regions, and LGG responsible regions are shown. The tumor regions are strongly correlated with the LGG responsible regions, particularly in the central area of the tumor. The overlap with the HGG responsible regions is relatively insignificant and peripherally distributed at best. The color map indicates the high-difference values in red and the low-difference values in blue; the values are standardized for each patient.

Quantitative evaluation of responsible regions

Finally, we quantitatively evaluated the preferences of each responsible region according to the ET, ED, and NET segmentation labels. The difference values in each segmented area were summed and statistically compared, as shown in Fig. 7. For the HGG responsible regions, the mean ± standard deviation values for the NET, ED, and ET labels were \(5.48 \pm 4.69\), \(3.78 \pm 2.79\), and \(7.66 \pm 5.37\), respectively. The Kruskal–Wallis test and the non-parametric comparisons carried out for all pairs using the Dunn method for joint ranking revealed that the highest values appeared in the ET region (\(p < 0.0001\)). For the LGG responsible regions, the values for the NET, ED, and ET labels were \(1.22 \pm 1.26\), \(1.02 \pm 1.10\), and \(0.92 \pm 1.02\), respectively. The same statistical tests revealed that the highest values appeared in the NET region (\(p < 0.0001\)). As these quantitative observations were consistent with the qualitative results (Figs. 5, 6), it can be concluded that the imaging characteristics associated with the prediction of HGG and LGG are indicated by their localization in the ET and NET regions, respectively. In other words, the classification model mainly depends on the number of feature vectors associated with the presence (ET) or absence (NET) of contrast enhancement in the tumor.

Figure 7

Quantitative evaluation of overlap between responsible regions and segmentation labels. (a) Difference values of HGG responsible regions in each segmentation label: Gd-enhanced tumor (ET), peritumoral edema (ED), and necrotic and non-enhancing tumor core (NET). The values in the ET region are the highest among the three segmentation categories. (b) Difference values of LGG responsible regions for the same segmentation labels. The NET regions have the highest values; * indicates statistical significance (\(p < 0.0001\)).

Discussion

Multi-parametric MRI can reveal the morphological heterogeneity of gliomas, which contain various sub-regions (edematous regions, enhancing and non-enhancing tumor cores) with varying histological and genomic phenotypes. This intrinsic heterogeneity can also be observed in imaging phenotypes because their sub-regions exhibit different intensity patterns across different MR sequences. In this study, three different regions were considered. The ET is defined by areas exhibiting hyper-intensity in the Gd-enhanced T1 sequences compared with T1 signals20. Such regions generally correspond to areas of contrast enhancement, where contrast leakage caused by blood–brain barrier damage may exist41,42. The ED is defined by areas with high T2/FLAIR signal intensity20, which represent either low cellularity or edema43. The NET indicates non-enhancing tumor regions and pre-necrotic and/or necrotic regions located in the non-enhancing part of the tumor core20. The imaging appearance of NET typically exhibits hypo-intensity in the Gd-enhanced T1 sequences compared with T1 signals20.

The imaging differences between LGG and HGG have attracted a substantial amount of attention regarding early differential diagnosis. Nevertheless, these differences are still debated. Typically, LGG appears as an area of focal signal abnormality with minimal or no contrast enhancement44, and does not cause significant blood–brain barrier disruption, which results in less contrast leakage around the lesions. In contrast, most HGG in Gd-enhanced T1 sequences exhibit moderate to strong contrast enhancement, which reflects the degree of microvascularity and the presence of a disrupted blood–brain barrier45. Occasionally, necrosis can be observed inside a tumor, and is an important diagnostic feature for HGG46. Furthermore, HGG commonly causes significant damage to the blood–brain barrier, which appears as a large ED area surrounding the tumor core. Therefore, based on the segmentation categories adopted in this study, the presence of NET in the central region of a tumor surrounded by a small ED region can be considered a typical LGG characteristic. For HGG, a tumorous lesion represented by ET with or without NET and extensively surrounded by ED areas can be considered a typical representation.

Based on these considerations, our results are consistent with the known imaging characteristics of LGG and HGG. Particularly, the feature ablation study revealed that NET is the most discriminative component of LGG, whereas ET is the most discriminative component of HGG (Fig. 7). The presence of contrast enhancement (ET) is often considered as a sign of HGG47. Therefore, the observation that the classification model captured the presence (ET) or absence (NET) of contrast enhancement in the tumor core is compelling.

Several studies have investigated the classification of glioma grades using deep learning. For example, Yang et al. demonstrated that ImageNet-pretrained deep learning models, such as AlexNet48 and GoogLeNet49, can outperform a comparative model trained from scratch, achieving a maximum test accuracy above 90%14. However, their method requires manual segmentation of the ROIs before classification. Recently, Banerjee et al. proposed a deep-learning-based algorithm that incorporates volumetric tumor information and achieves a maximum accuracy of 97%15. Similarly, Zhuge et al. proposed a two-step approach that automatically segments brain tumor regions and performs classification on the bounded image regions containing tumors16. They also achieved a maximum classification accuracy of 97%. For superior performance, an important aspect of deep-learning-based models is the size and extent of the input images. Banerjee et al.15 compared several neural networks using patch-wise, slice-wise, and volume-wise inputs, and achieved glioma grading accuracies of 82%, 86%, and 95%, respectively. Particularly, when considering the input as a 3D volume, these deep-learning-based approaches can outperform machine-learning-based approaches that use logistic regression based on brain tumor radiomics features (accuracy of 88%)50.

Compared with previous studies, the classification accuracy of the proposed model ranks between the accuracy achieved with slice-wise inputs and that achieved with volume-wise inputs15. Even though the proposed feature extraction process operates on slice-wise inputs, the classification model is as simple as logistic regression. Therefore, the proposed classification model’s performance is remarkable compared with that of end-to-end deep learning models that take slice-wise inputs. Notably, Rudin51 argued that the belief that more complex models are more accurate is not always true, particularly when a good representation in terms of meaningful features is constructed for a target task. She also argued that there is often no significant difference between the prediction accuracy achieved by more complex models, such as deep neural networks, and that of much simpler models, such as logistic regression, when representative data features are given. Accordingly, we confirmed that the feature vectors obtained from the pre-task of tumor segmentation are sufficiently informative for the discrimination of glioma grades.

To the best of our knowledge, this is the first study to use vector quantization to obtain a shareable set of feature vectors across a population for the purpose of identifying specific factors associated with clinical information. The reason for acquiring quantized rather than continuous latent representations for deep radiomics is that quantization explicitly fixes the variability of the internal representations of CNNs. As the original radiomics is an approach to extract a large number of quantitative image features for the objective comparison of medical images8, we believe it is important to yield a comparable set of latent representations across a dataset even when deep learning is used as the feature extraction method. Based on these considerations, our methodology has shown considerable success in extracting deep radiomics from the segmentation model, exploiting them in glioma grade classification, and visualizing the image regions encoded by the feature vectors that contribute significantly to the classification. The observations are consistent with those reported in the literature and can equip physicians with an enhanced understanding of the inner reasoning process of classification models.

Limitations

This study has several limitations. First, detailed information on the public dataset, including scanner vendors, time of scan, field of view, and patient demographics, was unavailable. Second, we have not tested the generalizability of the results using external datasets. To compensate for these shortcomings, the robustness of the proposed method was investigated, and it was shown that the deep radiomics has a certain level of invariance to shifts and scaling of the pixel values (Fig. 4). This could be attributed to the feature normalization operating in each layer of the segmentation network. Furthermore, the variation of the encoder output caused by the perturbations can also be suppressed by the vector quantization. Third, no direct comparison was conducted with conventional or advanced radiomics techniques8,52 or with other deep-learning-based feature extraction methods53. Moreover, it should be noted that the distinction between LGG and HGG in the BraTS dataset differs from the WHO classification of gliomas2, as clarified by Dequidt et al.23. Our source code is publicly available for further research to resolve the aforementioned limitations. Future technical challenges include extending this work to end-to-end learning including classifiers, and to pre-tasks that require no label information, such as self-supervised learning.

Conclusion

Our deep radiomics approach is a data-driven technique that utilizes the internal representations acquired inside deep neural networks as imaging markers for downstream tasks. Vector quantization is the core of our proposal to resolve the internal variability of typical CNNs and extract a shareable set of feature vectors across a population. Based on the dataset containing brain MRIs with gliomas, we demonstrated that the method provides good classification accuracy for the glioma grades as well as interpretability regarding the task-specific radiological findings on which the classification model depends. The proposed method is versatile and easily applicable to other research fields.