Abstract
Molecular alterations in cancer can cause phenotypic changes in tumor cells and their microenvironment. Routine histopathology tissue slides, which are ubiquitously available, can reflect such morphological changes. Here, we show that deep learning can consistently infer a wide range of genetic mutations, molecular tumor subtypes, gene expression signatures and standard pathology biomarkers directly from routine histology. We developed, optimized, validated and publicly released a one-stop-shop workflow and applied it to tissue slides of more than 5,000 patients across multiple solid tumors. Our findings show that a single deep learning algorithm can be trained to predict a wide range of molecular alterations from routine, paraffin-embedded histology slides stained with hematoxylin and eosin. These predictions generalize to other populations and are spatially resolved. Our method can be implemented on mobile hardware, potentially enabling point-of-care diagnostics for personalized cancer treatment. More generally, this approach could elucidate and quantify genotype–phenotype links in cancer.
Data availability
All data, including histological images and information about the age and sex of the participants from the TCGA database, are available at https://portal.gdc.cancer.gov/. Genetic data for patients in the TCGA cohorts are available at https://portal.gdc.cancer.gov/ and https://cbioportal.org. Raw data for the DACHS cohort are stored and administered by the DACHS consortium (more information is available from http://dachs.dkfz.org/dachs/). The corresponding authors of this study are not involved in data sharing decisions of the DACHS consortium. All other data supporting the findings of this study are available from the corresponding author upon reasonable request. Source data are provided with this paper.
Code availability
All source codes are available under an open-source license at https://github.com/jnkather/DeepHistology/releases/tag/v0.2.
Change history
17 February 2021
A Correction to this paper has been published: https://doi.org/10.1038/s43018-020-00149-6
Acknowledgements
These results are in part based on data generated by the TCGA Research Network (http://cancergenome.nih.gov/). J.N.K. received funding from RWTH University Aachen (START 2018–691906). V.S. was funded by Breast Cancer Now. P. Boor received DFG grants SFB/TRR57, SFB/TRR219, BO3755/3-1 and BO3755/6-1, as well as support from the German Ministry of Education and Research (BMBF) (STOP-FSGS-01GM1901A) and the German Ministry for Economic Affairs and Energy (BMWi) (EMPAIA project). A.T.P. was funded by the NIH and NIDCR (K08-DE026500), an Institutional Research Grant (IRG-16-222-56) from the American Cancer Society, a Cancer Research Foundation Research Grant, and a University of Chicago Medicine Comprehensive Cancer Center Support Grant (P30-CA14599). T.L. was funded by Horizon 2020 (through the European Research Council (ERC) Consolidator Grant PhaseControl (771083)), a Mildred-Scheel Endowed Professorship from the German Cancer Aid (Deutsche Krebshilfe), the German Research Foundation (DFG) (SFB CRC1382/P01, SFB-TRR57/P06 and LU 1360/3-1), the Ernst Jung Foundation Hamburg and the Interdisciplinary Center for Clinical Research (IZKF) at RWTH Aachen.
Author information
Contributions
J.N.K., A.T.P. and T.L. designed the study. L.R.H., H.I.G., N.A.C., J.J.S., P.A.v.D.B., L.F.S.K., P. Boor and A.P. oversaw the tumor annotation. C.L., A.E., J.K., H.S.M., J.M.N., R.D.B. and K.A.J.S. manually annotated all of the tumors. J.N.K., J.K., J.M.N. and P. Bankhead designed and implemented the algorithm. J.N.K., C.L., A.S., S.K., R.D.B. and N.O.-B. curated the list of molecular alterations. H.B., M.H., A.T.P., A.M.H. and V.S. provided external validation samples and gave statistical advice. C.T., D.J., A.T.P., P. Boor, V.S. and T.L. provided infrastructure and supervised the study. All authors contributed to the data analysis and writing of the manuscript.
Ethics declarations
Competing interests
J.N.K. has an informal, unpaid advisory role at Pathomix (Heidelberg, Germany) that does not relate to this research. J.N.K. declares no other relationships or competing interests. All other authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Distribution of predictability scores for feature classes in all cancer types.
a–o, Target features were assigned to one of four categories as shown in Supplementary Table 1: genetic variants, oncogenic drivers, high-level signatures and standard-of-care features. For each of these classes, predictability by deep learning was assessed and the distribution of false-detection-rate (FDR)-corrected p-values is shown, with low p-values capped at 10⁻⁵. High-level signatures were highly predictable in most tumor types. p, Color legend for all panels.
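As a rough illustration of how FDR-corrected p-values and the display cap at 10⁻⁵ could be computed, the sketch below applies the Benjamini-Hochberg procedure. The caption does not name the correction procedure, so Benjamini-Hochberg is an assumption here, and the function names `benjamini_hochberg` and `cap_for_display` are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the source only says "FDR-corrected"; BH is one common choice.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def cap_for_display(p_values, floor=1e-5):
    """Clip very small p-values to a floor, as done for plotting."""
    return [max(p, floor) for p in p_values]


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # made-up example values
    print(cap_for_display(benjamini_hochberg(raw)))
```

The cap only affects how values are plotted; significance would still be judged on the uncapped adjusted p-values.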
Extended Data Fig. 2 Additional statistics for lung adenocarcinoma, colorectal cancer and breast cancer.
a–c, Detailed prediction statistics for lung adenocarcinoma (LUAD): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for colorectal cancer (COAD, READ): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for breast cancer (BRCA): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10⁻⁵. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; "n" refers to the number of patients.
Extended Data Fig. 3 Additional statistics for gastric cancer and melanoma.
a–c, Detailed prediction statistics for gastric cancer (STAD): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for melanoma primary tumor samples (SKCM): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for melanoma metastatic samples (SKCM): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10⁻⁵. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; "n" refers to the number of patients.
Extended Data Fig. 4 Additional statistics for prostate, pancreatic and squamous lung cancer.
a–c, Detailed prediction statistics for prostate cancer (PRAD): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for pancreatic cancer samples (PAAD): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for squamous lung cancer (LUSC): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10⁻⁵. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; "n" refers to the number of patients.
Extended Data Fig. 5 Additional statistics for hepatocellular, renal papillary and renal clear cell carcinoma.
a–c, Detailed prediction statistics for hepatocellular carcinoma (LIHC): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for renal papillary carcinoma (KIRP): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for renal clear cell carcinoma (KIRC): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10⁻⁵. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; "n" refers to the number of patients.
Extended Data Fig. 6 Additional statistics for renal chromophobe cancer, head and neck cancer and cervical cancer.
a–c, Detailed prediction statistics for renal chromophobe cancer (KICH): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for head and neck cancer (HNSC): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for cervical cancer (CESC): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10⁻⁵. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; "n" refers to the number of patients.
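The error bars in Extended Data Figs. 2–6 are described as patient-level AUROC with bootstrapped confidence intervals. A minimal sketch of one way to obtain such an interval is given below. It assumes a percentile bootstrap over patients and a rank-based AUROC; neither detail is stated in the captions, and the names `auroc` and `bootstrap_auroc_ci` as well as the defaults for `n_boot` and `alpha` are hypothetical.

```python
import random


def auroc(scores, is_positive):
    """Rank-based AUROC: probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(scores, is_positive, n_boot=1000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) of the AUROC over patient-level bootstrap resamples.

    Assumption: percentile bootstrap; resamples containing a single class are redrawn.
    """
    if all(is_positive) or not any(is_positive):
        raise ValueError("both classes are required to compute an AUROC")
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        labels = [is_positive[i] for i in idx]
        if all(labels) or not any(labels):
            continue  # AUROC is undefined without both classes
        estimates.append(auroc([scores[i] for i in idx], labels))
    estimates.sort()
    lower = estimates[round((alpha / 2) * (n_boot - 1))]
    upper = estimates[round((1 - alpha / 2) * (n_boot - 1))]
    return sum(estimates) / n_boot, lower, upper


if __name__ == "__main__":
    # Made-up patient scores (fraction of tiles predicted positive) and labels.
    scores = [0.9, 0.7, 0.6, 0.4, 0.55, 0.3, 0.2, 0.1]
    labels = [True, True, True, True, False, False, False, False]
    print(bootstrap_auroc_ci(scores, labels, n_boot=200))
```

Resampling is done at the patient level so that each resample keeps all information for a patient together, matching the patient-level evaluation described in the captions.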
Extended Data Fig. 7 Results of additional technical optimization experiments: Normalization.
a, Comparison of cross-validated absolute differences in AUROC to the baseline model (no normalization), genetic variants. b, Comparison of AUROC differences for genetic driver mutations. c, Comparison of AUROC differences for expression signatures and subtypes.
Extended Data Fig. 8 Results of additional technical optimization experiments: Weakly supervised.
a, Comparison of cross-validated absolute differences in AUROC to the baseline model (no normalization), genetic variants. b, Comparison of AUROC differences for genetic driver mutations. c, Comparison of AUROC differences for expression signatures and subtypes.
Extended Data Fig. 9 Results of additional technical optimization experiments: Frozen tissue.
a, Comparison of cross-validated absolute differences in AUROC to the baseline model (no normalization), genetic variants. b, Comparison of AUROC differences for genetic driver mutations. c, Comparison of AUROC differences for expression signatures and subtypes.
Extended Data Fig. 10 Additional details on the statistical procedures.
a, For patient-level three-fold cross-validation, the patient cohort was split into three random partitions. Each partition had approximately the same proportion of patients within each class. Three classifiers were trained and their patient-level predictions on the respective test set were concatenated. Thus, a prediction was obtained for each patient in the cohort, but no patient was ever part of both the training set and the test set of the same classifier. b, The percentage of predicted tiles for each class was used for a receiver operating characteristic (ROC) analysis with 10× bootstrapped pointwise confidence bounds. c, In addition to the ROC analysis, the prediction scores (percent predicted tiles) for patients in each class were compared to prediction scores for patients in all other classes. The resulting false-detection-rate (FDR)-corrected p-value in a two-sided t-test for this comparison was reported for each feature of interest. Icons are from Twitter Twemoji (CC-BY 4.0 license). d, Distribution of tumor content across slides in all tumor types: central mark = median, bottom and top edge of the box = 25th and 75th percentile, line extends to the most extreme data points, circles = outliers. Outliers larger than 2,000 mm² are not plotted. Median tumor content is 139 mm² of tumor tissue per slide for colorectal cancer (CRC). The numbers of tissue slides (plotted here) are available in Supplementary Table 2. e, Design of additional technical optimization experiments: the baseline approach in this study was to perform image analysis of tiles based on manual tumor annotations in every single tissue slide, without performing any color normalization. The baseline approach was compared to three alternative approaches as sketched here.
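The procedure in panels a and b can be sketched as follows: patients (not tiles) are split into three class-balanced folds, and each patient's score is the fraction of that patient's tiles predicted as the class of interest. This is only an illustrative sketch, not the released implementation; the helper names `stratified_patient_folds` and `patient_score`, the round-robin fold assignment and the fixed seed are all assumptions.

```python
import random
from collections import defaultdict


def stratified_patient_folds(labels, k=3, seed=0):
    """Split patient IDs into k folds with roughly equal class proportions.

    labels: dict mapping patient ID -> class label.
    Assumption: shuffling within each class and dealing patients round-robin.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for patient_id, cls in sorted(labels.items()):
        by_class[cls].append(patient_id)
    folds = [[] for _ in range(k)]
    for cls in sorted(by_class):
        patients = by_class[cls]
        rng.shuffle(patients)
        for i, patient_id in enumerate(patients):
            folds[i % k].append(patient_id)
    return folds


def patient_score(tile_predictions, target_class):
    """Fraction of a patient's tiles that were predicted as target_class."""
    hits = sum(1 for t in tile_predictions if t == target_class)
    return hits / len(tile_predictions)


if __name__ == "__main__":
    # Made-up cohort: 6 mutated and 9 wild-type patients.
    cohort = {f"p{i:02d}": ("mut" if i < 6 else "wt") for i in range(15)}
    folds = stratified_patient_folds(cohort)
    for test_fold in folds:
        # Train on the other two folds; a patient is never in both sets.
        train = [p for fold in folds if fold is not test_fold for p in fold]
        assert not set(train) & set(test_fold)
    print(patient_score(["mut", "wt", "mut", "mut"], "mut"))
```

Splitting by patient rather than by tile is what guarantees that tiles from one patient never appear in both the training and test set of the same classifier.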
Source data
Source Data Fig. 1
Individual hyperparameter optimization experiments (from c) and prevalence of mutations (from d).
Source Data Fig. 2
Pan-cancer data for any mutation (variants).
Source Data Fig. 3
Pan-cancer data for drivers.
Source Data Fig. 4
Data for any mutation (variants), drivers, signatures and standard features in colorectal cancer, breast cancer, lung adenocarcinoma and gastric cancer (a–d), pan-cancer data for signatures (e–h) and pan-cancer data for standard of care (i and j).
Source Data Extended Data Fig. 1
Data for any mutation (variants), drivers, signatures and standard features in any cancer type (pan-cancer).
Source Data Extended Data Fig. 2
Data for any mutation (variants), drivers, signatures and standard features in lung adenocarcinoma, colorectal cancer and breast cancer.
Source Data Extended Data Fig. 3
Data for any mutation (variants), drivers, signatures and standard features in gastric cancer, primary melanoma (SKCM-01) and melanoma metastases (SKCM-06).
Source Data Extended Data Fig. 4
Data for any mutation (variants), drivers, signatures and standard features in prostate, pancreatic and lung squamous carcinoma.
Source Data Extended Data Fig. 5
Data for any mutation (variants), drivers, signatures and standard features in hepatocellular, renal papillary and renal clear cell carcinoma.
Source Data Extended Data Fig. 6
Data for any mutation (variants), drivers, signatures and standard features in renal chromophobe, head and neck and cervical cancer.
Source Data Extended Data Fig. 7
Alternative methods (normalization).
Source Data Extended Data Fig. 8
Alternative methods (weakly supervised).
Source Data Extended Data Fig. 9
Alternative methods (frozen samples).
Source Data Extended Data Fig. 10
Number of image tiles per WSI, pan-cancer (from d).
About this article
Cite this article
Kather, J.N., Heij, L.R., Grabsch, H.I. et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat Cancer 1, 789–799 (2020). https://doi.org/10.1038/s43018-020-0087-6
DOI: https://doi.org/10.1038/s43018-020-0087-6