
  • Brief Communication

Artificial-intelligence-based molecular classification of diffuse gliomas using rapid, label-free optical imaging

Abstract

Molecular classification has transformed the management of brain tumors by enabling more accurate prognostication and personalized treatment. However, timely molecular diagnostic testing for patients with brain tumors is limited, complicating surgical and adjuvant treatment and obstructing clinical trial enrollment. In this study, we developed DeepGlioma, a rapid (<90 seconds), artificial-intelligence-based diagnostic screening system to streamline the molecular diagnosis of diffuse gliomas. DeepGlioma is trained using a multimodal dataset that includes stimulated Raman histology (SRH), a rapid, label-free, non-consumptive optical imaging method, and large-scale public genomic data. In a prospective, multicenter, international testing cohort of patients with diffuse glioma (n = 153) who underwent real-time SRH imaging, we demonstrate that DeepGlioma can predict the molecular alterations used by the World Health Organization to define the adult-type diffuse glioma taxonomy (IDH mutation, 1p19q co-deletion and ATRX mutation), achieving a mean molecular classification accuracy of 93.3 ± 1.6%. Our results show how artificial intelligence and optical histology can be used to provide a rapid and scalable adjunct to wet lab methods for the molecular screening of patients with diffuse glioma.


Fig. 1: Bedside SRH and DeepGlioma workflow.
Fig. 2: DeepGlioma molecular classification performance.

Data availability

The genomic data for training the genetic embedding model are publicly available through the above-mentioned data repositories, and all genomic data are provided in Supplementary Data Table 2. Institutional review board (IRB) approval was obtained from all participating institutions for SRH imaging and data collection. Restrictions apply to the availability of raw patient imaging or genetic data, which were used with institutional permission through IRB approval for the current study and are, thus, not publicly available. Contact the corresponding author (T.H.) for any requests for data sharing. All requests will be evaluated based on institutional and departmental policies to determine whether the data requested are subject to intellectual property or patient privacy obligations. Data can be shared only for non-commercial academic purposes and will require a formal material transfer agreement. Generally, all such requests for access to SRH data will be responded to within 1 week.

Code availability

All code was implemented in Python (version 3.8) using PyTorch (1.9.0) as the primary machine learning framework. The following packages were used for complete data analysis: pydicom (2.0.0), tifffile (2021.1.14), torchvision (0.10.10), scikit-learn (1.0.1), pandas (1.3.4), numpy (1.20.3), matplotlib (3.5.0), scikit-image (0.18.3) and opencv-python (4.6.0.66). For data visualization and scientific plotting, the following R (3.5.2) packages were used: ggplot2 (3.3.5), dplyr (2.1.1) and tidyverse (1.3.1). All code and scripts to reproduce the experiments of this paper are available on GitHub at https://github.com/MLNeurosurg/deepglioma.
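
For readers reconstructing the environment, a minimal, untested requirements sketch based on the versions listed above is shown here. Exact package builds vary by platform and CUDA version, and the torchvision pin is our assumption (0.10.0 is the release paired with PyTorch 1.9.0):

# requirements.txt sketch; untested, based on the versions listed above
torch==1.9.0
torchvision==0.10.0  # the article lists 0.10.10; 0.10.0 pairs with torch 1.9.0
pydicom==2.0.0
tifffile==2021.1.14
scikit-learn==1.0.1
pandas==1.3.4
numpy==1.20.3
matplotlib==3.5.0
scikit-image==0.18.3
opencv-python==4.6.0.66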

References

  1. Louis, D. N. et al. The 2021 WHO Classification of Tumors of the Central Nervous System: a summary. Neuro Oncol. 23, 1231–1251 (2021).

  2. Metter, D. M., Colgan, T. J., Leung, S. T., Timmons, C. F. & Park, J. Y. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw. Open 2, e194337 (2019).

  3. Damodaran, S., Berger, M. F. & Roychowdhury, S. Clinical tumor sequencing: opportunities and challenges for precision cancer medicine. Am. Soc. Clin. Oncol. Educ. Book 2015, e175–e182 (2015).

  4. Brat, D. J. et al. Molecular biomarker testing for the diagnosis of diffuse gliomas. Arch. Pathol. Lab. Med. 146, 547–574 (2022).

  5. Orringer, D. A. et al. Rapid intraoperative histology of unprocessed surgical specimens via fibre-laser-based stimulated Raman scattering microscopy. Nat. Biomed. Eng. 1, 0027 (2017).

  6. Hollon, T. C. et al. Near real-time intraoperative brain tumor diagnosis using stimulated Raman histology and deep neural networks. Nat. Med. 26, 52–58 (2020).

  7. Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. in Proceedings of the 37th International Conference on Machine Learning (eds Daumé III, H. & Singh, A.) 1597–1607 (PMLR, 2020).

  8. Eckel-Passow, J. E. et al. Glioma groups based on 1p/19q, IDH, and TERT promoter mutations in tumors. N. Engl. J. Med. 372, 2499–2508 (2015).

  9. Cancer Genome Atlas Research Network et al. Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas. N. Engl. J. Med. 372, 2481–2498 (2015).

  10. Pennington, J., Socher, R. & Manning, C. GloVe: global vectors for word representation. in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (Association for Computational Linguistics, 2014).

  11. Vaswani, A. et al. Attention is all you need. in Advances in Neural Information Processing Systems (eds Guyon, I. et al.) 30 (Curran Associates, 2017).

  12. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) 4171–4186 (Association for Computational Linguistics, 2019).

  13. DeWitt, J. C. et al. Cost-effectiveness of IDH testing in diffuse gliomas according to the 2016 WHO classification of tumors of the central nervous system recommendations. Neuro Oncol. 19, 1640–1650 (2017).

  14. Louis, D. N. et al. cIMPACT-NOW (the consortium to inform molecular and practical approaches to CNS tumor taxonomy): a new initiative in advancing nervous system tumor classification. Brain Pathol. 27, 851–852 (2017).

  15. Capper, D. et al. DNA methylation-based classification of central nervous system tumours. Nature 555, 469–474 (2018).

  16. Drexler, R. et al. DNA methylation subclasses predict the benefit from gross total tumor resection in IDH-wildtype glioblastoma patients. Neuro Oncol. 25, 315–325 (2022).

  17. Hervey-Jumper, S. L. et al. Interactive effects of molecular, therapeutic, and patient factors on outcome of diffuse low-grade glioma. J. Clin. Oncol. https://doi.org/10.1200/JCO.21.02929 (2023).

  18. Vanderbeek, A. M. et al. The clinical trials landscape for glioblastoma: is it adequate to develop new treatments? Neuro Oncol. 20, 1034–1043 (2018).

  19. Frome, A. et al. DeViSE: a deep visual–semantic embedding model. in Advances in Neural Information Processing Systems (eds Burges, C. J. C., Bottou, L., Welling, M., Ghahramani, Z. & Weinberger, K. Q.) 26 (Curran Associates, 2013).

  20. Ramesh, A. et al. Zero-shot text-to-image generation. in Proceedings of the 38th International Conference on Machine Learning (eds Meila, M. & Zhang, T.) 8821–8831 (PMLR, 2021).

  21. Saharia, C. et al. Photorealistic text-to-image diffusion models with deep language understanding. Preprint at arXiv https://doi.org/10.48550/arXiv.2205.11487 (2022).

  22. Radford, A. et al. Learning transferable visual models from natural language supervision. in Proceedings of the 38th International Conference on Machine Learning Vol. 139 (eds. Meila, M. & Zhang, T.) 8748–8763 (PMLR, 2021).

  23. Freudiger, C. W. et al. Label-free biomedical imaging with high sensitivity by stimulated Raman scattering microscopy. Science 322, 1857–1861 (2008).

  24. Freudiger, C. W. et al. Stimulated Raman scattering microscopy with a robust fibre laser source. Nat. Photonics 8, 153–159 (2014).

  25. Hollon, T. C. et al. Rapid intraoperative diagnosis of pediatric brain tumors using stimulated Raman histology. Cancer Res. 78, 278–289 (2018).

  26. Hollon, T. C. et al. Rapid, label-free detection of diffuse glioma recurrence using intraoperative stimulated Raman histology and deep neural networks. Neuro Oncol. 23, 144–155 (2021).

  27. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) https://doi.org/10.1109/CVPR.2016.90 (IEEE, 2016).

  28. Jiang, C. et al. Rapid automated analysis of skull base tumor specimens using intraoperative optical imaging and artificial intelligence. Neurosurgery 90, 758–767 (2022).

  29. Zhao, Z. et al. Chinese Glioma Genome Atlas (CGGA): a comprehensive resource with functional genomic data from Chinese glioma patients. Genomics Proteomics Bioinformatics 19, 1–12 (2021).

  30. Zhang, J. et al. The International Cancer Genome Consortium Data Portal. Nat. Biotechnol. 37, 367–369 (2019).

  31. Gusev, Y. et al. The REMBRANDT study, a large collection of genomic data from brain cancer patients. Sci. Data 5, 180158 (2018).

  32. Jonsson, P. et al. Genomic correlates of disease progression and treatment response in prospectively characterized gliomas. Clin. Cancer Res. 25, 5537–5547 (2019).

  33. Du, J. et al. Gene2vec: distributed representation of genes based on co-expression. BMC Genomics 20, 82 (2019).

  34. Lanchantin, J., Wang, T., Ordonez, V. & Qi, Y. General multi-label image classification with transformers. in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) https://doi.org/10.1109/cvpr46437.2021.01621 (IEEE, 2021).

  35. Dosovitskiy, A. et al. An image is worth 16×16 words: transformers for image recognition at scale. Preprint at arXiv https://doi.org/10.48550/arXiv.2010.11929 (2020).

  36. Wiens, J. et al. Do no harm: a roadmap for responsible machine learning for health care. Nat. Med. 25, 1337–1340 (2019).

Acknowledgements

The results presented here are, in whole or in part, based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga. We would like to thank K. Eddy, L. Wang and A. Marshall for providing technical support and T. Cichonski for editorial assistance.

This work was supported by the following grants: NIH R01CA226527 (D.A.O.), NIH/NIGMS T32GM141746 (C.J.) and NIH K12 NS080223 (T.C.H.). It was also supported by the Cook Family Brain Tumor Research Fund (T.C.H.), the Mark Trauner Brain Research Fund, the Zenkel Family Foundation (T.C.H.), Ian’s Friends Foundation (T.C.H.) and the UM Precision Health Investigators Awards grant program (T.C.H.).

This research was also supported, in part, through computational resources and services provided by Advanced Research Computing, a division of Information and Technology Services at the University of Michigan.

Author information

Authors and Affiliations

Authors

Contributions

T.H., C.J., A.C., C.F. and D.O. contributed to the conceptualization, study design and analysis of results. T.H., C.J., A.C., A.K. and M.N.-M. contributed to the experimentation, acquisition, analysis and interpretation of data. T.H., C.J., A.C., M.N.-M., A.K., A. Aabedi and A. Adapa contributed to generating the figures and tables for the manuscript. T.H., W.A.-H., J.H., O.S., L.I.W., G.W., V.W., D.R., N.V., M.B., S.H.-J., J.G. and D.O. contributed to obtaining tissue for SRH imaging. M.S. and S.C.-P. provided pathologic evaluation of tissue. All authors were involved in the editing, analysis and review of all data and manuscript versions.

Corresponding author

Correspondence to Todd Hollon.

Ethics declarations

Competing interests

T.H., C.F. and D.O. are shareholders in Invenio Imaging, Inc. C.J., A.C., M.N.-M., A.K., A. Aabedi, A. Adapa, W.A.-H., J.H., O.S., P.L., M.C., L.I.W., G.W., V.N., D.R., N.V.S., M.B., S.H.-J., J.G., M.S., S.C.-P. and H.L. do not have any competing interests.

Peer review

Peer review information

Nature Medicine thanks Anant Madabhushi, Stephen Yip and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary handling editor: Ulrike Harjes, in collaboration with the Nature Medicine team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Overall workflow of intraoperative SRH and DeepGlioma.

a, DeepGlioma for molecular prediction is intended for patients with clinical and radiographic evidence of a diffuse glioma who are undergoing surgery for tissue diagnosis and/or tumor resection. The surgical specimen is sampled from the patient’s tumor and directly loaded into a premade, disposable microscope slide with an attached coverslip. The slide is then loaded into the NIO Imaging System (Invenio Imaging, Inc., Santa Clara, CA) for rapid optical imaging. b, SRH images are acquired sequentially as strips at two Raman shifts, 2845 cm−1 and 2930 cm−1. The size and number of strips to be acquired are set by the operator, who defines the desired image size. Standard image sizes range from 1 to 5 mm², and image acquisition time ranges from 30 seconds to 3 minutes. The strips are edge-clipped, field-flattened and registered to generate whole-slide SRH images, which are then used for both DeepGlioma training and inference. Additionally, whole-slide images can be colored using a custom virtual H&E color scheme for review by the surgeon or pathologist [5]. c, For AI-based molecular diagnosis, the whole-slide image is split into non-overlapping 300×300-pixel patches, and each patch undergoes a feedforward pass through a previously trained network that segments the patches into tumor regions, normal brain and nondiagnostic regions [25]. The tumor patches are then used by DeepGlioma at both training and inference to predict the molecular status of the patient’s tumor.
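
The patch-based inference step in c can be sketched as follows. This is a minimal illustration, not the authors' released code: the class indices, image dimensions and the torchvision ResNet50 standing in for the previously trained segmentation network [25] are all assumptions.

import torch
import torchvision

PATCH = 300  # non-overlapping 300 x 300-pixel patches

def extract_patches(slide):
    """Split a (C, H, W) whole-slide image into non-overlapping patches."""
    c, h, w = slide.shape
    slide = slide[:, : h - h % PATCH, : w - w % PATCH]  # drop edge remainder
    patches = slide.unfold(1, PATCH, PATCH).unfold(2, PATCH, PATCH)
    return patches.permute(1, 2, 0, 3, 4).reshape(-1, c, PATCH, PATCH)

# Stand-in for the previously trained three-class patch classifier.
seg_model = torchvision.models.resnet50(num_classes=3).eval()

@torch.no_grad()
def tumor_patches(slide):
    """Keep only patches predicted as tumor (class indices are assumed)."""
    patches = extract_patches(slide)
    pred = seg_model(patches).argmax(dim=1)  # 0 = tumor, 1 = normal, 2 = nondiagnostic
    return patches[pred == 0]

slide = torch.rand(3, 1200, 1500)  # placeholder whole-slide SRH image
print(tumor_patches(slide).shape)  # (n_tumor_patches, 3, 300, 300)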

Extended Data Fig. 2 Training dataset.

The UM adult-type diffuse glioma dataset used for model training. The UM training set consisted of a total of 373 patients who underwent a biopsy or brain tumor resection. Dataset generation occurred over a 6-year period, from November 2015 through November 2021. a, The distribution of patients by molecular subgroup. IDH-wildtype gliomas accounted for 61.9% (231/373) of the total dataset, IDH-mutant/1p19q-codeleted tumors for 17.2% (64/373) and IDH-mutant/1p19q-intact tumors for 21% (78/373). Our dataset distribution of molecular subgroups is consistent with reported distributions in large-scale population studies. ATRX mutations were found in the majority of IDH-mutant/1p19q-intact patients (78%), also concordant with previous studies [9]. b, The age distribution for each of the molecular subgroups is shown. The average age of IDH-wildtype patients was 62.6 ± 15.4 years, and that of IDH-mutant patients was 44.6 ± 13.8 years. The average age of the IDH-mutant/1p19q-codeleted group was 47.0 ± 12.9 years, and that of the IDH-mutant/1p19q-intact group was 42.5 ± 14.1 years. c, Individualized patient characteristics and mutational status are shown by molecular subgroup. We report the WHO grade based on pathologic interpretation at the time of diagnosis. Because many of the patients were treated before molecular status alone was routinely used to determine WHO grade, several patients have IDH-wildtype lower-grade gliomas (grade II or III) or IDH-mutant glioblastomas (grade IV). The discordance between histologic and molecular features has been well documented [9] and is a major motivation for the present study.

Extended Data Fig. 3 Multi-label contrastive learning for visual representations.

Contrastive learning for visual representation is an active area of research in computer vision [7]. While the majority of research has focused on self-supervised learning, supervised contrastive loss functions remain underexplored and provide several advantages over supervised cross-entropy losses. Unfortunately, no straightforward extension of existing contrastive loss functions, such as InfoNCE and NT-Xent [7], can accommodate multi-label supervision. Here, we propose a simple and general extension of supervised contrastive learning for multi-label tasks and present the method in the context of patch-based image classification. a, Our multi-label contrastive learning framework starts with a randomly sampled anchor image with an associated set of labels. Within each minibatch, a set of positive examples is defined for each label of the anchor image as the examples that share the same label status. All images in the minibatch undergo a feedforward pass through the SRH encoder (red dotted lines indicate weight sharing). Each image representation vector (2048-D) is then passed through multiple label projectors (128-D) in order to compute a contrastive loss for each label (yellow dashed line). The scalar label-level contrastive losses are then summed and backpropagated through the projectors and image encoder. The multi-label contrastive loss is computed for all examples in each minibatch. b, PyTorch-style pseudocode for implementation of our proposed multi-label contrastive learning framework is shown. Note that this framework is general and can be applied to any multi-label classification task. We call our implementation patchcon because individual image patches are sampled from whole-slide SRH images to compute the contrastive loss. Because we use a single projection layer for each label and the same image encoder is used for all images, the computational complexity is linear in the number of labels.
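
A minimal sketch of this per-label supervised contrastive objective is shown below. It follows the description in a rather than the authors' pseudocode in b; the particular supervised contrastive term, the tiny stand-in encoder and all names and sizes are our assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

def supcon_loss(z, y, tau=0.07):
    """Supervised contrastive loss for one binary label.
    z: (N, D) L2-normalized projections; y: (N,) 0/1 label status.
    Positives for anchor i are the other examples with the same status."""
    sim = z @ z.T / tau
    self_mask = torch.eye(len(z), dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float('-inf'))       # exclude self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    pos = (y[:, None] == y[None, :]) & ~self_mask
    mean_log_prob_pos = log_prob.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1)
    return -mean_log_prob_pos.mean()

NUM_LABELS, FEAT_DIM, PROJ_DIM = 3, 2048, 128
encoder = nn.Sequential(  # tiny stand-in for the ResNet50 SRH encoder
    nn.Conv2d(3, 16, 7, stride=4), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, FEAT_DIM))
projectors = nn.ModuleList(nn.Linear(FEAT_DIM, PROJ_DIM) for _ in range(NUM_LABELS))

def multilabel_contrastive_loss(images, labels):
    h = encoder(images)                    # shared 2048-D image representation
    loss = 0.0
    for l, proj in enumerate(projectors):  # one contrastive term per label
        z = F.normalize(proj(h), dim=1)    # 128-D label-specific projection
        loss = loss + supcon_loss(z, labels[:, l])
    return loss

imgs = torch.rand(8, 3, 300, 300)
labs = torch.randint(0, 2, (8, NUM_LABELS))
print(multilabel_contrastive_loss(imgs, labs))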

Extended Data Fig. 4 SRH visual representation learning comparison.

a, SRH patch representations of a held-out validation set are plotted. Patch representations from a ResNet50 encoder that was randomly initialized (top row), trained with cross-entropy (middle row) or trained with patchcon (bottom row) are shown. Each column shows binary labels for the listed molecular diagnostic mutation or subgroup. A randomly initialized encoder shows evidence of clustering because patches sampled from the same patient are correlated and can have similar image features. Training with a cross-entropy loss does enforce separability between some of the labels; however, there is no discernible low-dimensional manifold that disentangles the label information. Our proposed multi-label contrastive loss produced embeddings that are more uniformly distributed in representation space than cross-entropy. Uniformity of the learned embedding distribution is known to be a desirable feature of contrastive representation learning. b, Qualitative analysis of the SRH patch embeddings indicates that the data are distributed along two major axes that correspond to IDH mutational status and 1p19q-codeletion status. This distribution produces a simplex with the three major molecular subgroups at its vertices. These qualitative results are reproduced in the prospective testing cohort shown in Fig. 2e. c, The contour density plots for each of the major molecular subgroups are shown to summarize the overall embedding structure. IDH-wildtype images cluster at the apex and IDH-mutant tumors cluster at the base. 1p19q-intact tumors lie closer to the origin and 1p19q-codeleted tumors farther from it.
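
The style of visualization in c can be reproduced with the packages listed under Code availability; the sketch below is illustrative only, with simulated embeddings standing in for the trained encoder's outputs and PCA as an assumed 2-D projection method.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
emb = rng.normal(size=(600, 128))        # placeholder patch embeddings
subgroup = rng.integers(0, 3, size=600)  # 0 = glioblastoma, 1 = astro, 2 = oligo

xy = PCA(n_components=2).fit_transform(emb)  # project to 2-D
gx, gy = np.mgrid[xy[:, 0].min():xy[:, 0].max():100j,
                  xy[:, 1].min():xy[:, 1].max():100j]
for g, color in zip(range(3), ['red', 'blue', 'green']):
    kde = gaussian_kde(xy[subgroup == g].T)  # per-subgroup density estimate
    dens = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(gx.shape)
    plt.contour(gx, gy, dens, levels=5, colors=color)
plt.xlabel('dim 1'); plt.ylabel('dim 2')
plt.savefig('embedding_contours.png')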

Extended Data Fig. 5 Diffuse glioma genetic embedding using global vectors.

Embedding models transform discrete variables, such as words or gene mutational status, into continuous representations that populate a vector space such that location, direction, and distance are semantically meaningful. Our genetic embedding model was trained using data sourced from multiple public repositories of sequenced diffuse gliomas (Extended Data Table 2). We used a global vector embedding objective for training [10]. a, A subset of the most common mutations in diffuse gliomas is shown in the co-occurrence matrix. b, The learned genetic embedding vector space with the 11 most commonly mutated genes shown. Both the mutant and wildtype mutational statuses (N = 22) are included during training to encode the presence or absence of a mutation. Genes that co-occur in specific molecular subgroups cluster together within the vector space, such as mutations that occur in (c) IDH-mutant, 1p19q-codel oligodendrogliomas (green), (d) IDH-mutant, ATRX-mutant diffuse astrocytomas (blue), and (e) IDH-wildtype glioblastoma subtypes (red). Radial traversal of the embedding space around the wildtype genes defines clinically meaningful linear substructures [10] corresponding to molecular subgroups. f, Corresponding to the known clinical and prognostic significance of IDH mutations in diffuse gliomas, IDH mutational status determines the axis along which increasing malignancy is defined in our genetic embedding space. g, PyTorch-style pseudocode for transformer-based masked multi-label classification. Inputs to our masked multi-label classification algorithm are listed in lines 1-5. The vision encoder and genetic encoder are pretrained in our implementation but can be randomly initialized and trained end-to-end. The label mask is an L-dimensional binary mask with a variable percentage of the labels removed and subsequently predicted in each feedforward pass. An image x is augmented and undergoes a feedforward pass through the vision encoder f. The image representation is then ℓ2 normalized. The labels are embedded using our pretrained genetic embedding model and the label mask is applied. The label embeddings are then concatenated with the image embedding and passed into the transformer encoder as input tokens. Unlike previous transformer-based methods for multi-label classification [34], we enforce that the transformer encoder outputs into the same vector space as the pretrained genetic embedding model. We perform a batch matrix multiplication with the transformer outputs and the embedding layer weights. The main diagonal elements are the inner product between the transformer encoder output and the corresponding embedding weight values. We then compute the masked binary cross-entropy loss.
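
A runnable distillation of the steps described in g might look like the following. This is a sketch under stated assumptions: the even/odd wildtype-mutant embedding indexing, the learned mask token and all sizes are ours, and the pretrained vision and genetic encoders are replaced by placeholders.

import torch
import torch.nn as nn
import torch.nn.functional as F

L, D = 3, 128                        # number of labels, embedding dimension
label_emb = nn.Embedding(2 * L, D)   # stand-in pretrained genetic embedding
                                     # (assumed layout: even = wildtype, odd = mutant)
mask_token = nn.Parameter(torch.zeros(1, 1, D))
transformer = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=4, batch_first=True), num_layers=2)

def masked_multilabel_step(img_feat, labels, label_mask):
    """img_feat: (N, D) vision-encoder output; labels, label_mask: (N, L) 0/1."""
    img = F.normalize(img_feat, dim=1).unsqueeze(1)     # l2-normalized image token
    tokens = label_emb(2 * torch.arange(L) + labels)    # per-label status embeddings
    tokens = torch.where(label_mask.unsqueeze(-1).bool(), mask_token, tokens)
    out = transformer(torch.cat([img, tokens], dim=1))[:, 1:]  # (N, L, D)
    w = label_emb.weight[1::2]                          # mutant-status vectors, (L, D)
    logits = torch.einsum('nld,ld->nl', out, w)         # diagonal inner products
    loss = F.binary_cross_entropy_with_logits(logits, labels.float(), reduction='none')
    return (loss * label_mask).sum() / label_mask.sum() # BCE on masked labels only

feat = torch.rand(4, D)                                 # placeholder vision features
labs = torch.randint(0, 2, (4, L))
mask = torch.tensor([[1., 1., 0.], [1., 0., 1.], [0., 1., 1.], [1., 1., 0.]])  # 66%
print(masked_multilabel_step(feat, labs, mask))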

Extended Data Fig. 6 Ablation studies and cross-validation results.

We conducted three main ablation studies to evaluate the following model architectural design choices and major training strategies: (1) cross-entropy versus contrastive loss for visual representation learning, (2) linear versus transformer-based multi-label classification and (3) randomly initialized versus pretrained genetic embedding. a, The first two ablation studies are shown in the panel, and the details of the cross-validation experiments are explained in the Methods section (see ‘Ablation Studies’). First, a ResNet50 model was trained using either cross-entropy or patchcon. The patchcon-trained image encoder was then fixed. A linear classifier and a transformer classifier were then trained using the same patchcon image encoder to evaluate the performance boost from using a transformer encoder. This ablation study design allows us to evaluate (1) and (2). The columns of the panel correspond to the three levels of prediction for SRH image classification: patch, slide and patient level. Each model was trained three times on randomly sampled validation sets, and the average (± standard deviation) ROC curves are shown for each model. Each row corresponds to one of the three molecular diagnostic mutations we aimed to predict using our DeepGlioma model. The results show that patchcon outperforms cross-entropy for visual representation learning and that the transformer classifier outperforms the linear classifier for multi-label classification. Note that the boost in performance of the transformer classifier over the linear model is due to the deep multi-headed attention mechanism learning conditional dependencies between labels in the context of specific SRH image features (i.e., not improved image feature learning, because the SRH encoder weights are fixed). b, We then aimed to evaluate (3). A single ResNet50 model was trained using patchcon, and the encoder weights were fixed for the following ablation study to isolate the contribution of random initialization versus pretraining of the genetic embedding layer. Three label-masking training regimes were tested and are presented in the tables: all input labels masked (100%), two labels randomly masked (66%) and one label randomly masked (33%); a sketch of this sampling follows below. The first row in the first table (100% masked) is non-multimodal training, where no genetic information is provided at any point during training or inference. We found that 66% input label masking, or randomly masking two of three diagnostic mutations, showed the best overall classification performance. We hypothesize that this results from allowing a single mutation to weakly define the genetic context while allowing supervision from the two masked labels to backpropagate through the transformer encoder. mAcc, mean label accuracy; mAP, mean average precision; mAUC, mean area under ROC curve; SubAcc, subset accuracy; ebF1, example-based F1 score; micF1, micro-F1 score.
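
To make the masking regimes concrete, here is a small hypothetical sampler (not the authors' code) that hides exactly k of the L = 3 diagnostic labels per example, corresponding to the 33%, 66% and 100% regimes.

import torch

def sample_label_mask(batch_size, num_labels=3, k=2):
    """Return a (batch_size, num_labels) 0/1 mask with exactly k labels
    masked (i.e., hidden and predicted) per example;
    k=1 -> 33%, k=2 -> 66%, k=3 -> 100% masking."""
    idx = torch.rand(batch_size, num_labels).topk(k, dim=1).indices
    return torch.zeros(batch_size, num_labels).scatter(1, idx, 1.0)

print(sample_label_mask(4, k=2))  # the best-performing 66% regime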

Extended Data Fig. 7 Patient demographic subgroup analysis of DeepGlioma IDH classification performance.

a, b, DeepGlioma performance for classifying IDH mutations stratified by patient age. Bar charts show patient classification accuracy (± standard deviation). Classification performance remains high in patients younger than 55 years (n = 89) and older than 55 years (n = 64). IDH mutations are less common in patients older than 55 years, causing class imbalance and resulting in a greater proportional drop in classification performance with false-negative predictions. c, d, Classification performance stratified by sex (male = 74, female = 74) and (e, f) racial groups (non-white = 35, white = 118) as defined by the National Institutes of Health (NIH). Bar charts show patient classification accuracy (± standard deviation). Classification performance remains high across all subgroup analyses. The no-information rate is the accuracy achieved by classifying all examples into the majority class. g, Subset of patients from the prospective cohort with non-canonical IDH mutations and a diffuse midline glioma, H3 K27M-mutant. DeepGlioma correctly classified all non-canonical IDH mutations, including an IDH2 mutation. Moreover, DeepGlioma generalized to pediatric-type diffuse high-grade gliomas, including diffuse midline glioma, H3 K27-altered, in a zero-shot fashion, as these tumors were not included in the UM training set. This patient was included in our prospective cohort because the patient was a 34-year-old adult at presentation.

Extended Data Fig. 8 DeepGlioma molecular subgroup analysis.

Multiclass classification performance for molecular subgroup prediction by DeepGlioma, stratified by patient demographic information and prospective testing site, is shown. Results stratified by (a) age, (b) race and (c) sex are shown. Multiclass classification performance remained high in each patient demographic compared to the entire cohort. DeepGlioma was trained to generalize to all adult patients and to be agnostic to patient demographic information. d, Confusion matrix of our benchmark multiclass model trained using categorical cross-entropy. DeepGlioma outperformed the multiclass model by +4.6% in overall patient-level diagnostic accuracy, with a substantial improvement in differentiating molecular astrocytomas and oligodendrogliomas. e, Direct comparison of subgrouping performance for our benchmark multiclass model, IDH1-R132H IHC and DeepGlioma. Performance metric values are displayed; means and standard deviations are plotted for both IDH subgrouping and molecular subgrouping. These results provide evidence that multimodal training and multi-label prediction provide a performance boost over multiclass modeling. f, DeepGlioma molecular subgroup classification performance for each of the prospective testing medical centers is shown. Accuracy values with 95% confidence intervals (in parentheses) are shown above the confusion matrices. Overall performance was stable across the three largest contributors of prospective patients. Performance on the MUV dataset was comparatively lower; however, some improvement was observed during the LIOCV experiments. Red indicates the best performance.
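
The patient-level summary statistics reported here (confusion matrix, accuracy with a 95% CI) can be computed as sketched below; the bootstrap CI and the simulated labels are our assumptions, not the paper's stated procedure.

import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(0)
y_true = rng.integers(0, 3, size=153)  # 0 = glioblastoma, 1 = astro, 2 = oligo
y_pred = np.where(rng.random(153) < 0.9, y_true, rng.integers(0, 3, size=153))

print(confusion_matrix(y_true, y_pred))  # rows = true class, columns = predicted
acc = accuracy_score(y_true, y_pred)

boot = []                                # nonparametric bootstrap over patients
for _ in range(2000):
    i = rng.integers(0, len(y_true), size=len(y_true))
    boot.append(accuracy_score(y_true[i], y_pred[i]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f'accuracy {acc:.3f} (95% CI {lo:.3f}-{hi:.3f})')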

Extended Data Fig. 9 Molecular genetic and molecular subgroup heatmaps.

DeepGlioma predictions are presented as heatmaps from representative patients included in our prospective clinical testing dataset for each diffuse glioma molecular subgroup. a, SRH images from a patient with a molecular oligodendroglioma, IDH-mutant, 1p19q-codel. DeepGlioma shows uniformly high prediction probability for both IDH mutation and 1p19q codeletion and a correspondingly low ATRX mutation prediction. SRH images show classic oligodendroglioma features, including small, branching ‘chicken-wire’ capillaries and perineuronal satellitosis. The oligodendroglioma molecular subgroup heatmap shows the expected high prediction probability throughout the dense tumor regions. b, A molecular astrocytoma, IDH-mutant, 1p19q-intact and ATRX-mutant, is shown. The astrocytoma molecular subgroup heatmap shows some regions of lower probability that may be related to the presence of image features found in glioblastoma, such as microvascular proliferation. However, regions of dense hypercellularity and anaplasia are correctly classified as IDH-mutant. These findings indicate that DeepGlioma’s IDH mutational status predictions are not determined solely by conventional cytologic or histomorphologic features that correlate with lower-grade versus high-grade diffuse gliomas. c, A glioblastoma, IDH-wildtype tumor is shown. The glioblastoma molecular subgroup heatmap shows high confidence throughout the tumor specimen. Additionally, this tumor was also ATRX-mutated, which is known to occur in IDH-wildtype tumors [9]. Despite the high co-occurrence of ATRX mutations with IDH mutations, DeepGlioma was able to identify image features predictive of ATRX mutation in a molecular glioblastoma. Because ATRX mutations are not diagnostic of molecular glioblastomas, the ATRX prediction does not affect the molecular subgroup heatmap (see ‘Molecular heatmap generation’ section in Methods). Additional SRH images and DeepGlioma prediction heatmaps can be found at our interactive web-based viewer, deepglioma.mlins.org. Scale bar, 1 mm.
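
Heatmaps of this kind are straightforward to render once per-patch probabilities exist; the sketch below tiles random patch-level probabilities back onto the slide grid and is a simplified stand-in for the ‘Molecular heatmap generation’ procedure in Methods.

import numpy as np
import matplotlib.pyplot as plt

PATCH = 300
rows, cols = 4, 5                                  # slide covered by a 4 x 5 patch grid
probs = np.random.rand(rows, cols)                 # placeholder per-patch P(IDH-mutant)
heatmap = np.kron(probs, np.ones((PATCH, PATCH)))  # upsample patch grid to pixel grid
plt.imshow(heatmap, cmap='viridis', vmin=0, vmax=1)
plt.colorbar(label='P(IDH-mutant)')
plt.savefig('idh_heatmap.png')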

Extended Data Fig. 10 Evaluation of DeepGlioma on non-canonical diffuse gliomas.

A major advantage of DeepGlioma over conventional immunohistochemical laboratory techniques is that it is not reliant on specific antigens for effective molecular screening. a, A molecular oligodendroglioma with an IDH2 mutation is shown. DeepGlioma correctly predicted the presence of both an IDH mutation and 1p19q codeletion. IDH1-R132H IHC performed on the imaged specimen was negative. The patient was younger than 55 and, therefore, required genetic sequencing to complete full molecular diagnostic testing using our current laboratory methods. b, A molecular astrocytoma with IDH1-R132S and ATRX mutations. DeepGlioma correctly identified both mutations. c, A patient with a suspected adult-type diffuse glioma met inclusion criteria for the prospective clinical testing set. The patient was later diagnosed with a diffuse midline glioma, H3 K27-altered. DeepGlioma correctly predicted the patient to be IDH-wildtype without previous training on diffuse midline gliomas or other pediatric-type diffuse gliomas. We hypothesize that DeepGlioma can perform well on other glial neoplasms in a similar zero-shot fashion. Scale bar, 1 mm.

Supplementary information

Supplementary Information

Supplementary Data Tables 1–4 and Supplementary Figs. 1 and 2

Reporting Summary

Supplementary Table 1

Patient dataset from testing cohort at UM.

Supplementary Table 2

Aggregated public diffuse glioma genomic dataset.

Supplementary Table 3

Prospective multicenter testing dataset with DeepGlioma multi-label predictions.

Supplementary Table 4

Prospective multicenter testing dataset with multiclass model predictions.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hollon, T., Jiang, C., Chowdury, A. et al. Artificial-intelligence-based molecular classification of diffuse gliomas using rapid, label-free optical imaging. Nat Med 29, 828–832 (2023). https://doi.org/10.1038/s41591-023-02252-4
