Pan-cancer image-based detection of clinically actionable genetic alterations

Kather, Jakob Nikolas; Heij, Lara R.; Grabsch, Heike I.; Loeffler, Chiara; Echle, Amelie; Muti, Hannah Sophie; Krause, Jeremias; Niehues, Jan M.; Sommer, Kai A. J.; Bankhead, Peter; Kooreman, Loes F. S.; Schulte, Jefree J.; Cipriani, Nicole A.; Buelow, Roman D.; Boor, Peter; Ortiz-Brüchle, Nadina; Hanby, Andrew M.; Speirs, Valerie; Kochanny, Sara; Patnaik, Akash; Srisuwananukorn, Andrew; Brenner, Hermann; Hoffmeister, Michael; van den Brandt, Piet A.; Jäger, Dirk; Trautwein, Christian; Pearson, Alexander T.; Luedde, Tom

doi:10.1038/s43018-020-0087-6

Article
Published: 27 July 2020

Nature Cancer volume 1, pages 789–799 (2020)

An Author Correction to this article was published on 03 November 2020

This article has been updated

Abstract

Molecular alterations in cancer can cause phenotypic changes in tumor cells and their microenvironment. Routine histopathology tissue slides, which are ubiquitously available, can reflect such morphological changes. Here, we show that deep learning can consistently infer a wide range of genetic mutations, molecular tumor subtypes, gene expression signatures and standard pathology biomarkers directly from routine histology. We developed, optimized, validated and publicly released a one-stop-shop workflow and applied it to tissue slides of more than 5,000 patients across multiple solid tumors. Our findings show that a single deep learning algorithm can be trained to predict a wide range of molecular alterations from routine, paraffin-embedded histology slides stained with hematoxylin and eosin. These predictions generalize to other populations and are spatially resolved. Our method can be implemented on mobile hardware, potentially enabling point-of-care diagnostics for personalized cancer treatment. More generally, this approach could elucidate and quantify genotype–phenotype links in cancer.

**Fig. 1: Deep learning workflow for the prediction of molecular features from histology images.**

**Fig. 2: Inference of genetic mutations from histological images.**

**Fig. 3: Inference of putative oncogenic drivers from histological images.**

**Fig. 4: Inference of molecular subtypes, gene expression signatures and standard biomarkers directly from histology.**

**Fig. 5: Explainability of deep learning-based analysis of histological images.**

**Fig. 6: Highest-scoring image tiles for molecular features in gastric cancer.**

Data availability

All data, including histological images and information about the age and sex of the participants from the TCGA database are available at https://portal.gdc.cancer.gov/. Genetic data for patients in the TCGA cohorts are available at https://portal.gdc.cancer.gov/ and https://cbioportal.org. Raw data for the DACHS cohort are stored and administered by the DACHS consortium (more information is available from http://dachs.dkfz.org/dachs/). The corresponding authors of this study are not involved in data sharing decisions of the DACHS consortium. All other data supporting the findings of this study are available from the corresponding author upon reasonable request. Source data are provided with this paper.

Code availability

All source codes are available under an open-source license at https://github.com/jnkather/DeepHistology/releases/tag/v0.2.

Change history

17 February 2021
A Correction to this paper has been published: https://doi.org/10.1038/s43018-020-00149-6

Acknowledgements

These results are in part based on data generated by the TCGA Research Network (http://cancergenome.nih.gov/). J.N.K. received funding from RWTH University Aachen (START 2018–691906). V.S. was funded by Breast Cancer Now. P. Boor received DFG grants SFB/TRR57, SFB/TRR219, BO3755/3-1 and BO3755/6-1, as well as support from the German Ministry of Education and Research (BMBF) (STOP-FSGS-01GM1901A) and the German Ministry for Economic Affairs and Energy (BMWi) (EMPAIA project). A.T.P. was funded by the NIH and NIDCR (K08-DE026500), an Institutional Research Grant (IRG-16-222-56) from the American Cancer Society, a Cancer Research Foundation Research Grant, and a University of Chicago Medicine Comprehensive Cancer Center Support Grant (P30-CA14599). T.L. was funded by Horizon 2020 (through the European Research Council (ERC) Consolidator Grant PhaseControl (771083)), a Mildred-Scheel Endowed Professorship from the German Cancer Aid (Deutsche Krebshilfe), the German Research Foundation (DFG) (SFB CRC1382/P01, SFB-TRR57/P06 and LU 1360/3-1), the Ernst Jung Foundation Hamburg and the Interdisciplinary Center for Clinical Research (IZKF) at RWTH Aachen.

Author information

These authors contributed equally: Alexander T. Pearson, Tom Luedde.

Authors and Affiliations

Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
Jakob Nikolas Kather, Chiara Loeffler, Amelie Echle, Hannah Sophie Muti, Jeremias Krause, Jan M. Niehues, Kai A. J. Sommer & Christian Trautwein
German Cancer Consortium (DKTK), Heidelberg, Germany
Jakob Nikolas Kather, Hermann Brenner & Dirk Jäger
Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
Jakob Nikolas Kather & Dirk Jäger
Department of Surgery and Transplantation, University Hospital RWTH Aachen, Aachen, Germany
Lara R. Heij
Department of Surgery, School of Nutrition and Translational Research in Metabolism (NUTRIM), Maastricht University, Maastricht, the Netherlands
Lara R. Heij
Institute of Pathology, University Hospital RWTH Aachen, Aachen, Germany
Lara R. Heij, Roman D. Buelow, Peter Boor & Nadina Ortiz-Brüchle
Department of Pathology, GROW School for Oncology and Developmental Biology, Maastricht University Medical Center+, Maastricht, the Netherlands
Heike I. Grabsch & Loes F. S. Kooreman
Pathology and Data Analytics, Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, UK
Heike I. Grabsch & Andrew M. Hanby
MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
Peter Bankhead
Department of Pathology, University of Chicago Medicine, Chicago, IL, USA
Jefree J. Schulte & Nicole A. Cipriani
Institute of Medical Sciences, School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, UK
Valerie Speirs
Department of Medicine, University of Chicago Medicine, Chicago, IL, USA
Sara Kochanny, Akash Patnaik & Alexander T. Pearson
Department of Medicine, University of Illinois at Chicago, Chicago, IL, USA
Andrew Srisuwananukorn
Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
Hermann Brenner & Michael Hoffmeister
Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
Hermann Brenner
Department of Epidemiology, GROW School for Oncology and Developmental Biology, Maastricht University Medical Center+, Maastricht, the Netherlands
Piet A. van den Brandt
Division of Gastroenterology, Hepatology and Gastrointestinal Oncology, University Hospital RWTH Aachen, Aachen, Germany
Tom Luedde
Clinic for Gastroenterology, Hepatology and Infectious Diseases, University Hospital Duesseldorf, Duesseldorf, Germany
Tom Luedde

Contributions

J.N.K., A.T.P. and T.L. designed the study. L.R.H., H.I.G., N.A.C., J.J.S., P.A.v.D.B., L.F.S.K., P. Boor and A.P. oversaw the tumor annotation. C.L., A.E., J.K., H.S.M., J.M.N., R.D.B. and K.A.J.S. manually annotated all of the tumors. J.N.K., J.K., J.M.N. and P. Bankhead designed and implemented the algorithm. J.N.K., C.L., A.S., S.K., R.D.B. and N.O.-B. curated the list of molecular alterations. H.B., M.H., A.T.P., A.M.H. and V.S. provided external validation samples and gave statistical advice. C.T., D.J., A.T.P., P. Boor, V.S. and T.L. provided infrastructure and supervised the study. All authors contributed to the data analysis and writing of the manuscript.

Corresponding authors

Correspondence to Jakob Nikolas Kather, Alexander T. Pearson or Tom Luedde.

Ethics declarations

Competing interests

J.N.K. has an informal, unpaid advisory role at Pathomix (Heidelberg, Germany) that does not relate to this research. J.N.K. declares no other relationships or competing interests. All other authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Distribution of predictability scores for feature classes in all cancer types.

a–o, Target features were assigned to one of four categories as shown in Supplementary Table 1: Genetic variants, oncogenic drivers, high-level signatures and standard-of-care features. For each of these classes, predictability by deep learning was assessed and the distribution of false-detection-rate (FDR)-corrected p-values is shown, with low p-values capped at 10^-5. High-level signatures were highly predictable in most tumor types. p, Color legend for all panels.
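The false-detection-rate correction and the capping of low p-values mentioned in this caption can be illustrated with a minimal sketch. The caption does not name the correction procedure, so the Benjamini-Hochberg step-up method used here is an assumption, and the function names `benjamini_hochberg` and `cap_for_display` are hypothetical helpers, not part of the released code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def cap_for_display(p_adjusted, floor=1e-5):
    """Cap very low p-values at a floor, as done for plotting in the caption."""
    return [max(p, floor) for p in p_adjusted]
```

For example, `cap_for_display(benjamini_hochberg(raw_p))` would give the values to plot for a hypothetical list `raw_p` of per-feature p-values.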

Extended Data Fig. 2 Additional statistics for lung adenocarcinoma, colorectal cancer and breast cancer.

a–c, Detailed prediction statistics for lung adenocarcinoma (LUAD): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for colorectal cancer (COAD, READ): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for breast cancer (BRCA): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10^-5. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; “n” refers to the number of patients.
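A patient-level AUROC with a bootstrapped confidence interval, as shown by the error bars in these panels, could be computed along the following lines. This is a sketch under stated assumptions: the percentile method, the default number of resamples and the function names `auroc` and `bootstrap_auroc_ci` are illustrative choices and are not taken from the paper or its released code.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus percentile bootstrap bounds (assumed CI method)."""
    rng = random.Random(seed)
    n = len(labels)
    point = auroc(labels, scores)  # raises if one class is missing
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if all(ys) or not any(ys):
            continue  # skip resamples that contain only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int((1 - alpha / 2) * (n_boot - 1))]
    return point, lower, upper
```

Here `labels` holds one binary ground-truth value per patient and `scores` holds the matching patient-level prediction score.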

Extended Data Fig. 3 Additional statistics for gastric cancer and melanoma.

a–c, Detailed prediction statistics for gastric cancer (STAD): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for melanoma primary tumor samples (SKCM): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for melanoma metastatic samples (SKCM): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10^-5. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; “n” refers to the number of patients.

Extended Data Fig. 4 Additional statistics for prostate, pancreatic and squamous lung cancer.

a–c, Detailed prediction statistics for prostate cancer (PRAD): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for pancreatic cancer samples (PAAD): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for squamous lung cancer (LUSC): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10^-5. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; “n” refers to the number of patients.

Extended Data Fig. 5 Additional statistics for hepatocellular, renal papillary and renal clear cell carcinoma.

a–c, Detailed prediction statistics for hepatocellular carcinoma (LIHC): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for renal papillary carcinoma (KIRP): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for renal clear cell carcinoma (KIRC): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10^-5. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; “n” refers to the number of patients.

Extended Data Fig. 6 Additional statistics for renal chromophobe cancer, head and neck cancer and cervical cancer.

a–c, Detailed prediction statistics for renal chromophobe cancer (KICH): area under the receiver operating characteristic curve (AUROC) with corresponding p-values for each feature. e–h, Detailed view of the features with the highest AUROC values. i–l, Detailed prediction statistics for head and neck cancer (HNSC): AUROC with corresponding p-values for each feature. m–p, Correspondingly, a detailed view of the features with the highest AUROC values. q–t, Detailed prediction statistics for cervical cancer (CESC): AUROC with corresponding p-values for each feature. u–x, Correspondingly, a detailed view of the features with the highest AUROC values. Low p-values are capped at 10^-5. Blank panels do not contain any data points but were added to keep a consistent format for all plots. Error bars show patient-level AUROC with bootstrapped confidence intervals; the marker denotes the mean; * denotes a two-sided t-test FDR-corrected p-value < 0.05; “n” refers to the number of patients.

Extended Data Fig. 7 Results of additional technical optimization experiments: Normalization.

a, Comparison of cross-validated absolute differences in AUROC to the baseline model (no normalization), genetic variants. b, Comparison of AUROC differences for genetic driver mutations. c, Comparison of AUROC differences for expression signatures and subtypes.

Extended Data Fig. 8 Results of additional technical optimization experiments: Weakly supervised.

a, Comparison of cross-validated absolute differences in AUROC to the baseline model (no normalization), genetic variants. b, Comparison of AUROC differences for genetic driver mutations. c, Comparison of AUROC differences for expression signatures and subtypes.

Extended Data Fig. 9 Results of additional technical optimization experiments: Frozen tissue.

a, Comparison of cross-validated absolute differences in AUROC to the baseline model (no normalization), genetic variants. b, Comparison of AUROC differences for genetic driver mutations. c, Comparison of AUROC differences for expression signatures and subtypes.

Extended Data Fig. 10 Additional details on the statistical procedures.

a, For patient-level three-fold cross-validation, the patient cohort was split into three random partitions. Each partition had approximately the same proportion of patients within each class. Three classifiers were trained and their patient-level predictions on the respective test set were concatenated. Thus, a prediction was obtained for each patient in the cohort, but no patient was ever part of both the training set and the test set of the same classifier. b, The percentage of predicted tiles for each class was used for a receiver operating characteristic (ROC) analysis with 10× bootstrapped pointwise confidence bounds. c, In addition to the ROC analysis, the prediction scores (percent predicted tiles) for patients in each class were compared to the prediction scores for patients in all other classes. The resulting false-detection-rate (FDR)-corrected p-value in a two-sided t-test for this comparison was reported for each feature of interest. Icons are from Twitter Twemoji (CC-BY 4.0 license). d, Distribution of tumor content across slides in all tumor types: central mark = median; bottom and top edges of the box = 25th and 75th percentiles; line extends to the most extreme data points; circles = outliers. Outliers larger than 2,000 mm² are not plotted. The median tumor content is 139 mm² of tumor tissue per slide for colorectal cancer (CRC). The numbers of tissue slides (plotted here) are available in Supplementary Table 2. e, Design of additional technical optimization experiments: the baseline approach in this study was to perform image analysis of tiles based on manual tumor annotations in every single tissue slide, without performing any color normalization. The baseline approach was compared to three alternative approaches as sketched here.
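The patient-level split in panel a and the tile-based patient score in panel b can be sketched as follows. This is only an illustration of the described procedure, not the released implementation: the function names, the round-robin assignment used to balance classes across folds and the input data structures are assumptions, and the score is returned as a fraction between 0 and 1 rather than a percentage.

```python
import random
from collections import defaultdict


def stratified_patient_folds(patient_labels, n_folds=3, seed=0):
    """Split patients into folds with roughly equal class proportions.

    patient_labels: dict mapping patient ID -> class label (hypothetical format).
    Returns a list of disjoint sets of patient IDs, one set per fold.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for pid, label in patient_labels.items():
        by_class[label].append(pid)
    folds = [set() for _ in range(n_folds)]
    for label in sorted(by_class):
        pids = sorted(by_class[label])
        rng.shuffle(pids)
        # Deal the patients of each class round-robin across the folds.
        for k, pid in enumerate(pids):
            folds[k % n_folds].add(pid)
    return folds


def patient_scores(tile_predictions, target_class):
    """Fraction of each patient's tiles that were predicted as target_class.

    tile_predictions: iterable of (patient ID, predicted class), one entry per tile.
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for pid, predicted in tile_predictions:
        total[pid] += 1
        hits[pid] += predicted == target_class
    return {pid: hits[pid] / total[pid] for pid in total}
```

Each fold serves once as the test set while the other two folds form the training set, so every patient receives exactly one held-out prediction. Because the split is made on patient IDs rather than on tiles, tiles from one patient never appear in both the training and the test set of the same classifier.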

Supplementary information

Supplementary Information: Supplementary Tables 1 and 2.

Reporting Summary

Source data

Source Data Fig. 1

Individual hyperparameter optimization experiments (from c) and prevalence of mutations (from d).

Source Data Fig. 2

Pan-cancer data for any mutation (variants).

Source Data Fig. 3

Pan-cancer data for drivers.

Source Data Fig. 4

Data for any mutation (variants), drivers, signatures and standard features in colorectal cancer, breast cancer, lung adenocarcinoma and gastric cancer (a–d), pan-cancer data for signatures (e–h) and pan-cancer data for standard of care (i and j).

Source Data Extended Data Fig. 1

Data for any mutation (variants), drivers, signatures and standard features in any cancer type (pan-cancer).

Source Data Extended Data Fig. 2

Data for any mutation (variants), drivers, signatures and standard features in lung adenocarcinoma, colorectal cancer and breast cancer.

Source Data Extended Data Fig. 3

Data for any mutation (variants), drivers, signatures and standard features in gastric cancer, primary melanoma (SKCM-01) and melanoma metastases (SKCM-06).

Source Data Extended Data Fig. 4

Data for any mutation (variants), drivers, signatures and standard features in prostate, pancreatic and lung squamous carcinoma.

Source Data Extended Data Fig. 5

Data for any mutation (variants), drivers, signatures and standard features in hepatocellular, renal papillary and renal clear cell carcinoma.

Source Data Extended Data Fig. 6

Data for any mutation (variants), drivers, signatures and standard features in renal chromophobe, head and neck and cervical cancer.

Source Data Extended Data Fig. 7

Alternative methods (normalization).

Source Data Extended Data Fig. 8

Alternative methods (weakly supervised).

Source Data Extended Data Fig. 9

Alternative methods (frozen samples).

Source Data Extended Data Fig. 10

Number of image tiles per WSI, pan-cancer (from d).

About this article

Cite this article

Kather, J.N., Heij, L.R., Grabsch, H.I. et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat Cancer 1, 789–799 (2020). https://doi.org/10.1038/s43018-020-0087-6

Received: 19 November 2019
Accepted: 26 May 2020
Published: 27 July 2020
Issue Date: August 2020
DOI: https://doi.org/10.1038/s43018-020-0087-6
