Radiogenomic classification for MGMT promoter methylation status using multi-omics fused feature space for least invasive diagnosis through mpMRI scans

Qureshi, Shahzad Ahmad; Hussain, Lal; Ibrar, Usama; Alabdulkreem, Eatedal; Nour, Mohamed K.; Alqahtani, Mohammed S.; Nafie, Faisal Mohammed; Mohamed, Abdullah; Mohammed, Gouse Pasha; Duong, Tim Q.

doi:10.1038/s41598-023-30309-4

Article
Open access
Published: 25 February 2023

Shahzad Ahmad Qureshi¹,
Lal Hussain^2,3,4,
Usama Ibrar⁵,
Eatedal Alabdulkreem⁶,
Mohamed K. Nour⁷,
Mohammed S. Alqahtani⁸,
Faisal Mohammed Nafie⁹,
Abdullah Mohamed¹⁰,
Gouse Pasha Mohammed¹¹ &
…
Tim Q. Duong⁴

Scientific Reports volume 13, Article number: 3291 (2023)

An Author Correction to this article was published on 30 March 2023

This article has been updated

Abstract

Accurate radiogenomic classification of brain tumors is important for improving the standard of diagnosis, prognosis, and treatment planning for patients with glioblastoma. In this study, we propose a novel two-stage MGMT Promoter Methylation Prediction (MGMT-PMP) system that extracts latent features and fuses them with radiomic features to predict the genetic subtype of glioblastoma. A novel fine-tuned deep learning architecture, the Deep Learning Radiomic Feature Extraction (DLRFE) module, is proposed for latent feature extraction; it fuses quantitative knowledge of the spatial distribution and size of the tumorous structure through radiomic features (GLCM, HOG, and LBP). A novel rejection algorithm was found to be significantly effective in selecting and isolating the negative training instances from the original dataset. The fused feature vectors are then used for training and testing k-NN and SVM classifiers. The 2021 RSNA Brain Tumor challenge dataset (BraTS-2021) consists of four structural mpMRIs, viz. fluid-attenuated inversion-recovery, T1-weighted, T1-weighted contrast enhancement, and T2-weighted. We evaluated the classification performance, for the very first time in published form, in terms of measures such as accuracy, F₁-score, and Matthews correlation coefficient. Jackknife tenfold cross-validation was used for training and testing on the BraTS-2021 dataset. The highest classification performance is (96.84 ± 0.09)%, (96.08 ± 0.10)%, and (97.44 ± 0.14)% in accuracy, sensitivity, and specificity, respectively, for detecting the MGMT methylation status of patients suffering from glioblastoma. Deep learning feature extraction combined with radiogenomic features, fusing imaging phenotypes and molecular structure and using the rejection algorithm, was found to perform outstandingly in detecting the MGMT methylation status of glioblastoma patients. The approach relates genomic variation to radiomic features, forming a bridge between two areas of research that may prove useful for clinical treatment planning leading to better outcomes.

Introduction

Brain cells divide in their billions; when this division becomes uncontrollable and forms abnormal regions, it is declared tumorous, a condition responsible for some of the highest cancer mortality rates worldwide in adults as well as children¹. The location and growth rate of a brain tumor are highly unpredictable, and brain tumors are broadly classified as primary or secondary. Primary tumors, which originate inside the brain, are the deadliest and are malignant most of the time. Gliomas account for 80% of primary brain tumors (of Grades I to IV, only Grade I grows slowly and is benign)^2,3. Grade II and III gliomas grow quickly and frequently require prompt treatment. Grade IV gliomas, also known as glioblastoma (GB), are the most aggressive type and the most challenging in terms of prognosis and clinical outcome. In adults, GB and diffuse astrocytic glioma are the most dangerous tumors of the central nervous system because of their extreme intrinsic heterogeneity in shape and microscopic anatomy. Despite numerous treatment options, the prognosis of GB patients has not improved substantially during the last 20 years^4,5,6,7,8,9,10. The World Health Organization (WHO) classification of central nervous system tumors emphasizes the utility of integrated diagnostics, in which clinical tumor diagnosis incorporates molecular-cytogenetic features¹¹. O⁶-methylguanine-DNA methyltransferase (MGMT) is a DNA repair enzyme that plays a vital role in chemoresistance to alkylating agents, and the methylation of its promoter is a promising prognostic factor for predicting chemotherapy response in early-diagnosed GB^12,13,14. In this respect, the most common courses of treatment for GB include surgery, radiotherapy, and adjuvant chemotherapy¹⁵. The effectiveness of chemotherapy is often tied to the promoter methylation status of MGMT, an important biomarker. The MGMT enzyme functions as a repair mechanism for guanine nucleotides and prevents cell death caused by alkylating agents^12,16. Under normal circumstances, the presence of the MGMT enzyme is beneficial because it protects against DNA damage. In GB patients, however, the presence of MGMT decreases the effectiveness of chemotherapy by rendering alkylating chemotherapeutic agents ineffective. This work concerns the prediction of MGMT promoter methylation status (see "Preprocessing"), which causes MGMT gene silencing; methylation leads to favorable results in GB patients treated with alkylating-agent chemotherapy, which stops cell division by cross-linking DNA strands. Therefore, predicting the methylation status of the MGMT promoter in GB can support decision-making and treatment planning, and it is the sole objective of this research.

At present, minor structural details that are difficult to discriminate with computed tomography (CT) can be detected by another non-invasive technique, magnetic resonance imaging (MRI). For complicated GB MRI scans, manual analysis by expert radiologists and physicians is tedious and time-consuming¹⁷. Complex cases require comparing the tumorous region with neighboring regions, which improves the perceptual information drawn from the image and hence the classification. Such manual comparison is impracticable when a large number of images must be handled. Early GB detection with reliable prediction results is important for the health of a subject¹¹. Consequently, novel approaches for prompt and reliable tumor detection continue to attract research groups. Machine learning (ML) and one of its variants, deep learning (DL), are key enablers of artificial intelligence (AI) for discriminative feature extraction, turning images into useful information.

The RSNA ASNR MICCAI Brain Tumor Segmentation BraTS 2021 challenge (BraTS-2021 dataset) is based on multi-institutional multi-parametric magnetic resonance imaging (mpMRI) scans¹⁸. This article concerns the prediction of MGMT promoter methylation status, a genetic characteristic of glioblastoma, using baseline MRI scans acquired before and while preparing for a surgical operation. We propose a novel classification framework that distinguishes MGMT-methylated from MGMT-unmethylated tumors using the challenging BraTS-2021 dataset. The former class is designated as (1, or MGMT+) while the latter is categorized as (0, or MGMT−).

Current practice for the genetic analysis of cancer relies on surgery to obtain tissue samples. Further, genetic characterization of the tumor takes weeks before a conclusion is reached¹⁸, and the results may lead to subsequent surgery. The idea here is to predict cancer genetics from magnetic resonance imaging, an approach known as radiogenomics, which might improve therapy results while reducing the number of potential surgical treatments. The success of radiogenomics would help alleviate the miseries of brain cancer through the least invasive measures for diagnosis and treatment. This new strategy, applied before any surgery, appears to have the potential to improve the management and survival prospects of brain cancer patients.

The MGMT gene is regulated by an epigenetic mechanism: the methylation of the CpG island of the MGMT promoter. MGMT promoter methylation suppresses the expression of the MGMT gene, thereby decreasing MGMT enzyme function in the cell. Because MGMT methylation increases the effectiveness of chemotherapy, the methylation status is often taken into consideration when determining the course of treatment^18,19,20. The traditional method of determining MGMT methylation status involves extracting tumor tissue through surgery. After extraction, determining the methylation status of the tumor is a time-consuming process that can take up to weeks. Further, after determination, additional invasive procedures may be necessary to discern the optimal treatment method^18,21.

Radiogenomic diagnosis of MGMT methylation status helps decrease the invasiveness of current testing procedures. Radiogenomics aims to predict cellular genomics from a tissue's phenotypic image characteristics^22,23. In gliomas, radiomics is commonly used to predict survival and to evaluate the potential of chemotherapeutic treatments²⁴. Several radiogenomic models have been developed to predict MGMT methylation status in GB patients^25,26,27. However, radiogenomic models are susceptible to a lack of standardization because of variation in methodology, software, and radiologists' readings²³. Recently, Zhang et al.²⁸ introduced a blockchain-based data-sharing scheme with fine-grained access control. They separated the public and private parts of electronic medical records, which are then encrypted separately by symmetric searchable encryption (SSE). The symmetric keys used in SSE were in turn encrypted by attribute-based encryption, which helped patients to share data without any risk. For GB, the radiogenomic approach to MGMT methylation testing consists of evaluating magnetic resonance images (MRIs) of the brain²⁹. To expand upon current radiogenomic analysis methods, artificial intelligence can be employed to construct complex predictive deep learning radiogenomic models³⁰.

Deep convolutional neural networks are employed for a vast array of tasks, including medical image analysis^31,32 and image classification^14,33. A deep learning architecture has encoding blocks arranged in multiple layers; the feature maps of lower layers are forwarded to subsequent layers in order of increasing complexity. Owing to sparse interaction, a convolutional neural network (CNN)³⁴ massively reduces the number of connections between neurons in comparison to fully connected shallow neural networks. Transfer learning based on CNNs has been well proven for quite some time^35,36,37,38,39 and has been used extensively in the analysis of different imaging databases^37,38,40,41, neuroimaging⁴², MRI, CT³⁶, and ultrasound images⁴³. Transfer learning using CNNs based on AlexNet and GoogleNet for the ImageNet dataset is a well-known deep learning approach⁴⁴. CNNs are used extensively in vision-related applications, including object detection⁴⁵, classification⁴⁶, and segmentation⁴⁷. Combining data pre-processing and augmentation with transfer learning can improve classification results. In our case, since the dataset is enormous, pre-processing alone, together with a fine-tuned CNN architecture, appears to be of great value.
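The saving that sparse interaction brings can be illustrated by counting trainable parameters. The following is a minimal sketch only: the 64 x 64 input, the 8 output feature maps, the 3 x 3 kernel, and both function names are hypothetical choices made for illustration and are not taken from the architecture used in this work.

```python
def dense_weights(n_inputs: int, n_outputs: int) -> int:
    """Trainable parameters of a fully connected layer (weights plus biases)."""
    return n_inputs * n_outputs + n_outputs


def conv_weights(kernel: int, in_channels: int, out_channels: int) -> int:
    """Trainable parameters of a 2-D convolution with square kernels (weights plus biases)."""
    return kernel * kernel * in_channels * out_channels + out_channels


# Hypothetical sizes: a 64 x 64 single-channel slice mapped to 8 feature maps of the same size.
height, width, maps = 64, 64, 8
dense = dense_weights(height * width, height * width * maps)
conv = conv_weights(kernel=3, in_channels=1, out_channels=maps)
print(f"fully connected: {dense:,} parameters; 3 x 3 convolution: {conv:,} parameters")
```

Because each convolutional kernel is shared across all spatial positions, its parameter count does not depend on the image size, whereas the fully connected count grows with the product of input and output sizes.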

The features used for training should be discriminative, so that a classifier can place each sample on the correct side of the separating hyperplane and assign the target label with maximum confidence. In this context, there is a need for efficient and robust automated systems for brain tumor detection using MRI. Hsieh et al.⁴⁸ defined brain tumor categories based on region-of-interest (ROI), feature selection, and feature extraction. They used local textural features, including global histogram moments, on 107 images of gliomas (73 low-grade; 34 high-grade). The work, however, was reported on a limited dataset and lacked features from other static feature extraction methods. Cheng et al.⁴⁹ used a T1-weighted contrast-enhanced brain MRI dataset⁵⁰ containing three types of tumors (glioma, pituitary, and meningioma) and experimented with three feature extraction methods, namely intensity histogram, bag-of-words (BoW), and gray level co-occurrence matrix (GLCM), finding that BoW performs relatively better at a higher computational cost. The accuracy was limited by the absence of preprocessing that could have yielded more discriminative features, and the hybrid of the solution spaces of the three feature sets was not explored. Similarly, Sachdeva et al.⁵¹ extracted color and textural features from segmented ROIs, used a genetic algorithm to select the features with optimum fitness, and reported an accuracy of 94.90% using a genetic algorithm-based artificial neural network (GA-ANN). However, for large datasets, colored images have different color tints, which makes staining procedures an essential step. Further, dynamic features need to be explored with deep learning algorithms to obtain more discriminative features.

Claro et al.⁵² used a hybrid feature space formed by textural features such as Tamura (coarseness, contrast, directionality, line-likeness, regularity, and roughness), gray level run length matrix (GLRLM), histogram of oriented gradients (HOG), morphology, and local binary patterns (LBP), and merged the extracted features using seven CNN architectures for glaucoma classification. Their feature space had 30,862 dimensions, which was reduced using the gain ratio to rank the features by performance, arriving at an optimum setting for glaucoma detection. They found the GLCM descriptor combined with transfer learning-based features to be the most effective for their specific problem. The work remains to be explored on larger and more diverse multi-parametric, multi-institutional datasets, with dynamic features using residual feature maps concatenated in successive layers. Garcia et al.⁵³ addressed the problem of imbalanced datasets by using ensemble classifiers with feature space partitioning. The partitioning parameters were optimized with a hybrid metaheuristic method called GACE, which combines a genetic algorithm (GA) with a cross-entropy (CE) method. More elaborate work using generative adversarial networks (GAN) could tackle the underlying problem of class imbalance. Shaban et al.⁵⁴ introduced a hybrid feature selection methodology that extracts the features with optimum characteristics from COVID-19 CT images. The feature selection consists of fast and accurate selection stages. They used an enhanced version of k-NN that relies on solid heuristics in choosing the neighbors of the tested subject and thus avoids being trapped. The work can be extended with other ML classifiers, along with DL classification techniques, to include features from high-level abstraction layers.

In this research, we propose a novel two-stage MGMT Promoter Methylation Prediction (MGMT-PMP) system that precisely quantifies the image structure of GB in patients from the evaluation of FLAIR, T1w, T2, and T1Gd mpMRIs. We selected the popular feature types GLCM, HOG, and LBP and fused them with novel deep learning features to form a hybrid feature set (HFS), which is distinctive in three respects. First, it employs a novel Deep Learning Radiomic Feature Extraction (DLRFE) module that extracts dynamic features based on the problem structure for the classification process, leading to promising results. Second, it provides different categories of second-order statistics and local textural features, exploiting the positive aspects of each category of feature extraction module. Third, the system includes filtering based on the rejection algorithm, which removes redundant and irrelevant features from the RSNA dataset, thereby improving the discrimination (variance) and leading to quick convergence. A comparison with recent techniques is also presented for performance analysis.
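As an illustration of how a hybrid feature set can be formed and classified, the following minimal sketch concatenates one deep feature vector with GLCM, HOG, and LBP vectors and applies a plain k-NN majority vote. Plain concatenation as the fusion rule, the Euclidean distance, k = 3, the function names, and all feature values are assumptions made for illustration; they are not the settings reported in this work.

```python
import math
from collections import Counter


def fuse_features(deep, glcm, hog, lbp):
    """Concatenate the four per-scan feature vectors into one hybrid vector."""
    return list(deep) + list(glcm) + list(hog) + list(lbp)


def knn_predict(train_vectors, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to `query` (Euclidean distance)."""
    ranked = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up feature values; label 1 = MGMT+ (methylated), 0 = MGMT- (unmethylated).
train_vectors = [
    fuse_features([0.9, 0.8], [0.7], [0.8], [0.9]),
    fuse_features([0.8, 0.9], [0.9], [0.7], [0.8]),
    fuse_features([0.1, 0.2], [0.1], [0.3], [0.2]),
    fuse_features([0.2, 0.1], [0.2], [0.1], [0.1]),
]
train_labels = [1, 1, 0, 0]

query = fuse_features([0.85, 0.75], [0.8], [0.7], [0.9])
predicted = knn_predict(train_vectors, train_labels, query, k=3)
print("predicted label:", predicted)
```

In practice the individual feature blocks usually differ in scale and length, so some normalization before concatenation would likely be needed; that step is omitted here for brevity.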

The key contributions of this research work are summarized as follows:

This work concerns the MGMT promoter methylation status, which affects the efficiency of chemotherapy in GB patients, where ‘MGMT+’ status increases the effectiveness of chemotherapy.
A novel two-stage prediction system for MGMT promoter methylation status, that precisely quantifies the image structure of glioblastoma in patients using FLAIR, T1w, T2, and T1Gd mpMRIs.
A novel deep learning-based feature extraction module that extracts dynamic features based on the problem structure.
The formation of a hybrid feature set by fusing deep features with static features originating from GLCM, HOG, and LBP.
RSNA ASNR MICCAI Brain Tumor Segmentation BraTS 2021 challenge was used for radiogenomic classification with 348,642 mpMRI scans for the very first time in published form (using performance measures, viz. accuracy, F₁-score, and Matthews correlation coefficient).
The rejection algorithm is introduced for removing redundant and irrelevant features from the dataset.
A detailed complexity analysis of the individual and combinatorial hybrids formed by the fusion of dynamic and static feature sets.
A comparison of the proposed work with other state-of-the-art techniques.
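The performance measures named in the contributions above can all be derived from the four counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not results from this work, and the code assumes that both classes are present in the evaluated data.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification measures from confusion-matrix counts.

    Assumes at least one positive and one negative sample, so that
    sensitivity and specificity are well defined.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, with MGMT+ (label 1) as the positive class.
example = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four counts symmetrically, which makes it a more informative summary when the two classes are imbalanced.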

The paper is organized as follows: "Importance in the clinical management and the survival of GB patients" briefly details the impact of the study on the clinical management and survival of GB patients, "Material and methods" describes the materials and methods, "Results and discussion" presents the results and discussion, and "Conclusions" concludes the paper. The abbreviations used throughout this article are listed in Table 1.

Table 1 Abbreviations with acronyms used in the text.
