Radiomics refers to the high-throughput extraction of quantitative features from radiological scans and is widely used to search for imaging biomarkers for the prediction of clinical outcomes. Current radiomic signatures suffer from limited reproducibility and generalizability, because most features are dependent on imaging modality and tumour histology, making them sensitive to variations in scan protocol. Here, we propose novel radiological features that are specially designed to ensure compatibility across diverse tissues and imaging contrast. These features provide systematic characterization of tumour morphology and spatial heterogeneity. In an international multi-institution study of 1,682 patients, we discover and validate four unifying imaging subtypes across three malignancies and two major imaging modalities. These tumour subtypes demonstrate distinct molecular characteristics and prognoses after conventional therapies. In advanced lung cancer treated with immunotherapy, one subtype is associated with improved survival and increased tumour-infiltrating lymphocytes compared with the others. Deep learning enables automatic tumour segmentation and reproducible subtype identification, which can facilitate practical implementation. The unifying radiological tumour classification may inform prognosis and treatment response for precision medicine.
The data are available within the Article or the Supplementary Information. The imaging data for 9 out of a total of 13 cohorts used in this study are publicly available through the TCIA website (https://www.cancerimagingarchive.net/), as described in the Supplementary Information. The imaging data for the breast cancer cohort from Hokkaido University, Japan are publicly available at https://drive.google.com/drive/folders/1AsI-bvUWwdmwMd7SHXzJttUsKqmImAGz?usp=sharing. The imaging data for the Stanford Lung Cancer, Lung Cancer Immunotherapy and Cambridge GBM cohorts are not publicly available because they contain sensitive information that may compromise patient privacy as well as the ethical restrictions or regulation policy of local institutions. These data will be made available to individuals who contact the corresponding authors with a reasonable request, for example, for non-commercial, research purposes. The gene expression data and mutational data of TCGA samples are publicly available in the Genomic Data Commons (https://gdc.cancer.gov/). The gene expression data for the other cohorts are available from the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/, accession nos. GSE22226, GSE103584 and GSE58661).
For the spherical harmonic decomposition, we used the SPHARM-MAT software (http://www.iu.edu/~spharm/). For the autoencoder, XGBoost and consensus clustering, we used R software (version 3.5.3, R Foundation for Statistical Computing, Vienna, Austria) with the packages autoencoder (version 1.1), xgboost (version 184.108.40.206) and ConsensusClusterPlus (version 1.52.0). The U-Net architecture is available at https://github.com/lyakaap/Kaggle-Carvana-3rd-place-solution. Custom code (ref. 46) is available at https://github.com/WuLabMDA/PanCancer.
Lambin, P. et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur. J. Cancer 48, 441–446 (2012).
Gillies, R. J., Kinahan, P. E. & Hricak, H. Radiomics: images are more than pictures, they are data. Radiology 278, 563–577 (2015).
Itakura, H. et al. Magnetic resonance image features identify glioblastoma phenotypic subtypes with distinct molecular pathway activities. Sci. Transl. Med. 7, 303ra138 (2015).
Sun, R. et al. A radiomics approach to assess tumour-infiltrating CD8 cells and response to anti-PD-1 or anti-PD-L1 immunotherapy: an imaging biomarker, retrospective multicohort study. Lancet Oncol. 19, 1180–1191 (2018).
Jiang, Y. et al. Noninvasive imaging evaluation of tumor immune microenvironment to predict outcomes in gastric cancer. Ann. Oncol. 31, 760–768 (2020).
Vaidya, P. et al. CT derived radiomic score for predicting the added benefit of adjuvant chemotherapy following surgery in stage I, II resectable non-small cell lung cancer: a retrospective multi-cohort study for outcome prediction. Lancet Digit. Health 2, e116–e128 (2020).
Fan, M., Xia, P., Clarke, R., Wang, Y. & Li, L. Radiogenomic signatures reveal multiscale intratumour heterogeneity associated with biological functions and survival in breast cancer. Nat. Commun. 11, 4861 (2020).
Wu, J. et al. Magnetic resonance imaging and molecular features associated with tumor-infiltrating lymphocytes in breast cancer. Breast Cancer Res. 20, 101 (2018).
Berenguer, R. et al. Radiomics of CT features may be nonreproducible and redundant: influence of CT acquisition parameters. Radiology 288, 407–415 (2018).
Mackin, D. et al. Measuring computed tomography scanner variability of radiomics features. Invest. Radiol. 50, 757–765 (2015).
Traverso, A., Wee, L., Dekker, A. & Gillies, R. Repeatability and reproducibility of radiomic features: a systematic review. Int. J. Radiat. Oncol. Biol. Phys. 102, 1143–1158 (2018).
Limkin, E. et al. Promises and challenges for the implementation of computational medical imaging (radiomics) in oncology. Ann. Oncol. 28, 1191–1206 (2017).
Lambin, P. et al. Radiomics: the bridge between medical imaging and personalized medicine. Nat. Rev. Clin. Oncol. 14, 749–762 (2017).
Hoadley, K. A. et al. Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell 173, 291–304 (2018).
Pestana, R. C., Sen, S., Hobbs, B. P. & Hong, D. S. Histology-agnostic drug development-considering issues beyond the tissue. Nat. Rev. Clin. Oncol. 17, 555–568 (2020).
O’Connor, J. P. B. et al. Imaging biomarker roadmap for cancer studies. Nat. Rev. Clin. Oncol. 14, 169–186 (2017).
Wu, J., Mayer, A. T. & Li, R. Seminars in Cancer Biology (Elsevier, 2020).
Chalkidou, A., O’Doherty, M. J. & Marsden, P. K. False discovery rates in PET and CT studies with texture features: a systematic review. PLoS ONE 10, e0124165 (2015).
Zhang, Y. J. Geometric Modeling and Mesh Generation from Scanned Images (CRC Press, 2018).
Wu, J. et al. Intratumoral spatial heterogeneity by perfusion MR imaging predicts recurrence-free survival in locally advanced breast cancer treated with neoadjuvant chemotherapy. Radiology 288, 26–35 (2018).
Braman, N. M. et al. Intratumoral and peritumoral radiomics for the pretreatment prediction of pathological complete response to neoadjuvant chemotherapy based on breast DCE-MRI. Breast Cancer Res. 19, 57 (2017).
Wu, J. et al. Robust intra-tumor partitioning to identify high-risk subregions in lung cancer: a pilot study. Int. J. Radiat. Oncol. Biol. Phys. 95, 1504–1512 (2016).
Yankeelov, T. E. et al. Clinically relevant modeling of tumor growth and treatment response. Sci. Transl. Med. 5, 187ps19 (2013).
Wu, J. et al. Tumor subregion evolution-based imaging features to assess early response and predict prognosis in oropharyngeal cancer. J. Nucl. Med. 61, 327–336 (2020).
Syed, A. K., Whisenant, J. G., Barnes, S. L., Sorace, A. G. & Yankeelov, T. E. Multiparametric analysis of longitudinal quantitative MRI data to identify distinct tumor habitats in preclinical models of breast cancer. Cancers 12, 1682 (2020).
Welch, M. L. et al. Vulnerabilities of radiomic signature development: the need for safeguards. Radiother. Oncol. 130, 2–9 (2019).
Cristescu, R. et al. Pan-tumor genomic biomarkers for PD-1 checkpoint blockade-based immunotherapy. Science 362, eaar3593 (2018).
Zhang, Y. J., Jing, Y. M., Liang, X. H., Xu, G. L. & Dong, L. in Computational Modelling of Objects Represented in Images: Fundamentals, Methods and Applications III (eds Di Giamberardino, P. et al.) 215–220 (2012).
Shukla-Dave, A. et al. Quantitative imaging biomarkers alliance (QIBA) recommendations for improved precision of DWI and DCE-MRI derived biomarkers in multicenter oncology trials. J. Magn. Reson. Imaging 49, e101–e121 (2019).
Lawson, D. A., Kessenbrock, K., Davis, R. T., Pervolarakis, N. & Werb, Z. Tumour heterogeneity and metastasis at single-cell resolution. Nat. Cell Biol. 20, 1349–1360 (2018).
Lou, B. et al. An image-based deep learning framework for individualising radiotherapy dose: a retrospective analysis of outcome prediction. Lancet Digit. Health 1, e136–e147 (2019).
Coudray, N. et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018).
Jiang, Y. et al. Radiographic assessment of tumor stroma and treatment outcomes using deep learning: a retrospective multicohort study. Lancet Digit. Health 3, e371–e382 (2021).
Li, A., Chen, R., Farimani, A. B. & Zhang, Y. J. Reaction diffusion system prediction based on convolutional neural network. Sci. Rep. 10, 3894 (2020).
Li, A., Farimani, A. B. & Zhang, Y. J. Deep learning of material transport in complex neurite networks. Sci. Rep. 11, 11280 (2021).
Tajdari, M. et al. Image-based modelling for adolescent idiopathic scoliosis: mechanistic machine learning analysis and prediction. Comput. Methods Appl. Mech. Eng. 374, 113590 (2021).
Kickingereder, P. et al. Automated quantitative tumour response assessment of MRI in neuro-oncology with artificial neural networks: a multicentre, retrospective study. Lancet Oncol. 20, 728–740 (2019).
Barajas, R. F. et al. Regional variation in histopathologic features of tumor specimens from treatment-naive glioblastoma correlates with anatomic and physiologic MR Imaging. Neuro Oncol. 14, 942–954 (2012).
Nasha, Z. R. et al. Early response evaluation using primary tumor and nodal imaging features to predict progression-free survival of locally advanced non-small cell lung cancer. Theranostics 10, 11707–11718 (2020).
Reynolds, A. P., Richards, G., de la Iglesia, B. & Rayward-Smith, V. J. Clustering rules: a comparison of partitioning and hierarchical clustering algorithms. J. Math. Model. Algorithms 5, 475–504 (2006).
Kapp, A. V. & Tibshirani, R. Are clusters found in one dataset present in another dataset? Biostatistics 8, 9–31 (2007).
Thorsson, V. et al. The immune landscape of cancer. Immunity 48, 812–830 (2018).
Barbie, D. A. et al. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature 462, 108–112 (2009).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (eds Navab, N. et al.) 234–241 (Springer, 2015).
WuLabMDA/PanCancer: first release (Zenodo); https://doi.org/10.5281/zenodo.4906510
This research was partially supported by the National Institutes of Health (NIH) grants R01 CA233578, R01 CA222512 and R01 CA193730 (R.L.). J.W. acknowledges the NIH K99/R00 CA218667 and University of Texas MD Anderson Cancer Center Lung Moon Shot Program. S.J.P. is funded by a National Institute for Health Research (NIHR), Career Development Fellowship (CDF-18-11-ST2-003) and NIHR Brain Injury MedTech Co-operative based at Cambridge University Hospitals NHS Foundation Trust and University of Cambridge. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. C.-B.S. acknowledges support from EPSRC Centre grant no. EP/N014588/1. C.L. acknowledges Cancer Research UK grant no. CRUK/A19732 and EPSRC Centre grant no. EP/N014588/1. We thank TCGA and TCIA for sharing the imaging and genomics data for a subset of patients used in this study.
J.H. reports fees for advisory committees from AstraZeneca, Boehringer Ingelheim, Bristol Myers Squibb, Catalyst, EMD Serono, Foundation Medicine, Hengrui Therapeutics, Genentech, GSK, Guardant Health, Eli Lilly, Merck, Novartis, Pfizer, Roche, Sanofi, Seattle Genetics, Spectrum and Takeda, research support from AstraZeneca, GlaxoSmithKline, Spectrum, and royalties and licensing fees from Spectrum. M.D. reports research funding from AstraZeneca, Varian Medical Systems and Illumina, ownership interest in CiberMed and Foresight Diagnostics, patent filings related to cancer biomarkers, and paid consultancy from Roche, Genentech, AstraZeneca, Novartis, Boehringer Ingelheim, Gritstone Oncology, RefleXion and BioNTech.
Peer review information Nature Machine Intelligence thanks Anum Kazerouni, Yue Wang and Yongjie Zhang for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended Data Fig. 1 Morphological characterization of tumours by spherical harmonic decomposition. a) Overall design of morphological analysis; b) Illustration of 3D spherical harmonic basis functions at different degrees and orders; c) Illustration of 3D tumours reconstructed by coefficients obtained from spherical harmonic decomposition. Each row represents a selected 3D tumour, which is reconstructed using decomposition results at 5 different degree levels. Here, lower degrees capture more global patterns and higher degrees correspond to more detailed morphological patterns.
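As a point of reference for panels b and c, the reconstruction at a given degree level can be written as a truncated series. The form below is a sketch that assumes the standard SPHARM surface parameterization, in which each surface coordinate is expanded separately. The symbols $\mathbf{v}$, $\mathbf{c}_l^m$ and $L$ are notation introduced here for illustration and are not taken from the article.

```latex
\mathbf{v}(\theta,\phi) \approx \sum_{l=0}^{L} \sum_{m=-l}^{l} \mathbf{c}_l^m \, Y_l^m(\theta,\phi),
\qquad
\mathbf{c}_l^m = \left(c_{l,x}^m,\; c_{l,y}^m,\; c_{l,z}^m\right)^{\top}
```

Under this assumption, truncating at a small maximum degree $L$ keeps only the global shape, and raising $L$ adds progressively finer surface detail.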
Extended Data Fig. 2 The ridgeline plots present the distribution of 20 regional variation features in three different cancer types.
Here, we investigate 2 tumour regions, tumour core (TC) and tumour invasive margin (TIM), plus 2 peritumour regions, parenchymal margin at 5 mm or 10 mm (PM5 or PM10). In total, 5 pair-wise regions are considered, namely, TC-TIM, TC-PM5, TC-PM10, TIM-PM5 and TIM-PM10. Variation for each pair-wise region was quantified with four measures (chi-square, Bhattacharyya distance, correlation, intersection), yielding 5 × 4 = 20 regional variation features. TC-PM5 and TC-PM10 related features are colored in green, while TIM-PM5 and TIM-PM10 related features are colored in blue.
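The 5 × 4 construction described in this caption can be illustrated with a minimal sketch. It assumes each region is summarised by an intensity histogram over shared bins, and it uses common textbook forms of the four measures; the exact definitions used in the study are not stated in this caption, so the formulas and all function names below are assumptions.

```python
import math

# The five region pairs named in the caption.
REGION_PAIRS = [("TC", "TIM"), ("TC", "PM5"), ("TC", "PM10"),
                ("TIM", "PM5"), ("TIM", "PM10")]


def normalise(hist):
    """Scale bin counts so that they sum to 1."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("histogram must have a positive total count")
    return [h / total for h in hist]


def chi_square(p, q):
    """Symmetric chi-square distance (assumed form); 0 means identical."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def bhattacharyya(p, q):
    """Bounded distance sqrt(1 - BC), BC = Bhattacharyya coefficient (assumed form)."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))


def correlation(p, q):
    """Pearson correlation between the two sets of bin heights."""
    mp, mq = sum(p) / len(p), sum(q) / len(q)
    num = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mq) ** 2 for b in q))
    return num / den if den > 0 else 0.0


def intersection(p, q):
    """Histogram intersection; 1 means identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(p, q))


MEASURES = {"chi_square": chi_square, "bhattacharyya": bhattacharyya,
            "correlation": correlation, "intersection": intersection}


def regional_variation_features(histograms):
    """Return the 5 pairs x 4 measures = 20 features as a name -> value dict."""
    features = {}
    for r1, r2 in REGION_PAIRS:
        p, q = normalise(histograms[r1]), normalise(histograms[r2])
        for name, measure in MEASURES.items():
            features[f"{r1}-{r2}_{name}"] = measure(p, q)
    return features


# Toy histograms (made-up counts, four shared intensity bins per region).
toy = {"TC": [4, 3, 2, 1], "TIM": [4, 3, 2, 1],
       "PM5": [1, 2, 3, 4], "PM10": [0, 0, 5, 5]}
toy_features = regional_variation_features(toy)
```

With identical TC and TIM histograms, as in the toy data, the two distance measures are zero and the two similarity measures are one for the TC-TIM pair.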
Extended Data Fig. 3 Details of imaging feature dimension reduction via an autoencoder model. a) The structure of the autoencoder used to learn a low-dimensional mapping of the original feature signals, with detailed tuning hyperparameters; b) The optimal autoencoder loss curves in training and validation; c) Heatmap of pairwise correlations between 10 autoencoded features.
Extended Data Fig. 4 Distribution of imaging clusters (subtypes) in different clinical groups. a) The distribution of all patients in four clusters (subtypes) across three cancer types; b) The distribution of lung cancer patients in four clusters (subtypes) across different clinical stages. The molecular subtype distribution in the four imaging subtypes for c) breast cancer with luminal A/B, HER2+ and triple negative; d) GBM with different MGMT methylation status.
Extended Data Fig. 5 Volcano plot of enrichment scores through single-sample Gene Set Enrichment Analysis (ssGSEA) of 313 proposed imaging features in all three cancer types.
a) Imaging subtype 1 versus rest; b) subtype 2 versus rest; c) subtype 3 versus rest; d) subtype 4 versus rest. The data for all enrichment scores are plotted as log2 fold change versus the −log10 of the adjusted p-value. Thresholds are shown as dashed lines. Pathways deemed as significantly different (false discovery rate or FDR < 0.05) are highlighted with different color schemes.
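The volcano-plot coordinates described in this caption can be sketched as follows. The example assumes the adjusted p-values come from the Benjamini-Hochberg procedure, which the caption does not name, and it takes the log2 fold changes as precomputed inputs. The function names and the toy numbers are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def volcano_points(names, log2_fold_changes, p_values, fdr_threshold=0.05):
    """Build (x, y) volcano coordinates and flag pathways below the FDR threshold."""
    adjusted = benjamini_hochberg(p_values)
    points = []
    for name, lfc, q in zip(names, log2_fold_changes, adjusted):
        points.append({
            "name": name,
            "x": lfc,                             # log2 fold change
            "y": -math.log10(max(q, 1e-300)),     # -log10 adjusted p, guarded against 0
            "significant": q < fdr_threshold,
        })
    return points


# Toy input: two made-up pathways, subtype versus rest.
toy_points = volcano_points(["pathway_A", "pathway_B"], [1.2, -0.1], [0.001, 0.5])
```

In the toy input only `pathway_A` falls below the 0.05 threshold after adjustment, so it would be the one highlighted above the horizontal dashed line.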
Extended Data Fig. 6 Evaluation of prognostic value of the four imaging subtypes in lung cancer subgroups.
Kaplan-Meier curves for a) stage I+II; b) stage III; c) patients treated with surgery; d) patients treated with radiation.
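For readers unfamiliar with how such curves are built, the following is a minimal sketch of the Kaplan-Meier product-limit estimator, stratified by a group label such as imaging subtype. It is a generic illustration, not the study's analysis code; the function names and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate; returns [(event_time, survival_probability), ...].

    times  : follow-up time for each patient
    events : 1 if the event (for example death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


def kaplan_meier_by_group(times, events, groups):
    """Estimate one curve per group label (for example per imaging subtype)."""
    curves = {}
    for label in sorted(set(groups)):
        idx = [k for k, g in enumerate(groups) if g == label]
        curves[label] = kaplan_meier([times[k] for k in idx],
                                     [events[k] for k in idx])
    return curves


# Toy data: four patients, the second one censored at time 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
# toy_curve == [(1, 0.75), (3, 0.375), (4, 0.0)]
```

Censored patients do not create a step in the curve, but they shrink the risk set for later event times, which is why the second step in the toy data drops by one half rather than one third.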
Extended Data Fig. 7 Evaluation of prognostic value of the four imaging subtypes in subgroups within three cancer types.
Kaplan-Meier curves for lung cancer subgroups: a) EGFR wild-type; b) EGFR mutant; c) ALK wild-type; for breast cancer subgroups: d) ER+ group; e) HER2+ group; f) triple negative (TN) group; for GBM subgroups: g) MGMT methylated group; h) MGMT unmethylated group; i) IDH1 wild-type group.
Extended Data Fig. 8 Comparison between the proposed imaging subtypes and conventional radiomics analysis for survival prediction in lung cancer cohorts.
a) Details of the final radiomic model; b) Distribution of the radiomic risk score in training and validation cohorts; c) Scatterplot showing the correlation between radiomic risk score and tumour size measured in 2D; d) Distribution and comparison of c-index for the radiomic signature and the proposed imaging subtypes in the validation cohort.
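The c-index compared in panel d can be illustrated with a short sketch. It assumes Harrell's concordance index with ties in the risk score counted as one half, which is a common definition but is not confirmed by this caption. The function name and the toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was still being followed at a strictly later time. The pair is
    concordant when the patient with the earlier event has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data: higher risk score goes with shorter survival, so ranking is perfect.
toy_c = concordance_index([1, 2, 3, 4], [1, 1, 1, 1], [4.0, 3.0, 2.0, 1.0])
# toy_c == 1.0
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a categorical predictor such as a subtype label can be scored by mapping each category to an ordinal risk value before calling the function.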
Extended Data Fig. 9 Oncogenic processes associated with the imaging subtypes in three cancer types.
Limma-modeled enrichment analysis by single-sample Gene Set Enrichment Analysis (ssGSEA) of 50 cancer hallmark pathways is applied. Volcano plot of enrichment scores in lung cancer: a) subtype 1 versus rest, and b) subtype 4 versus rest; in breast cancer: c) subtype 1 versus rest, and d) subtype 4 versus rest; in GBM: e) subtype 1 versus rest, and f) subtype 4 versus rest. The enrichment scores of 50 cancer hallmark pathways are plotted as log2 fold change versus the −log10 of the adjusted p-value. Thresholds are shown as dashed lines. Pathways deemed as significantly different (false discovery rate [FDR] < 0.05) are highlighted with different color schemes.
Extended Data Fig. 10 Evaluation of imaging subtypes in the advanced lung cancer treated with immunotherapy.
Kaplan-Meier curves of overall survival stratified by imaging subtypes 1 and 2 versus 4.
Wu, J., Li, C., Gensheimer, M. et al. Radiological tumour classification across imaging modality and histology. Nat Mach Intell 3, 787–798 (2021). https://doi.org/10.1038/s42256-021-00377-0
Nature Machine Intelligence (2021)