
  • Brief Communication

Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer

Abstract

Microsatellite instability determines whether patients with gastrointestinal cancer respond exceptionally well to immunotherapy. However, in clinical practice, not every patient is tested for MSI, because this requires additional genetic or immunohistochemical tests. Here we show that deep residual learning can predict MSI directly from H&E histology, which is ubiquitously available. This approach has the potential to provide immunotherapy to a much broader subset of patients with gastrointestinal cancer.


Fig. 1: Tumor detection and MSI prediction in H&E histology.
Fig. 2: Classification performance in an external validation set.


Data availability

All whole-slide images for the datasets are available at https://portal.gdc.cancer.gov/. Training images for tumor detection are available at https://doi.org/10.5281/zenodo.2530789; training images for MSI detection are available at https://doi.org/10.5281/zenodo.2530835 and https://doi.org/10.5281/zenodo.2532612. These repositories also contain the source data for Fig. 1. Source Data for Figs. 1 and 2 and Extended Data Figs. 1 and 2, containing the raw data for these figures, are available in the online version of the paper.

Code availability

The source code is available at https://github.com/jnkather/MSIfromHE.

References

  1. Darvin, P., Toor, S. M., Sasidharan Nair, V. & Elkord, E. Exp. Mol. Med. 50, 165 (2018).

  2. Le, D. T. et al. N. Engl. J. Med. 372, 2509–2520 (2015).

  3. Bonneville, R. et al. JCO Precis. Oncol. 2017, 1–15 (2017).

  4. Le, D. T. et al. Science 357, 409–413 (2017).

  5. Kather, J. N., Halama, N. & Jaeger, D. Semin. Cancer Biol. 52, 189–197 (2018).

  6. Franke, A. J. et al. J. Clin. Oncol. 36, 796 (2018).

  7. Norgeot, B., Glicksberg, B. S. & Butte, A. J. Nat. Med. 25, 14–15 (2019).

  8. Coudray, N. et al. Nat. Med. 24, 1559–1567 (2018).

  9. Schaumberg, A. J., Rubin, M. A. & Fuchs, T. J. Preprint at https://www.biorxiv.org/content/10.1101/064279v9 (2018).

  10. Chang, P. et al. AJNR Am. J. Neuroradiol. 39, 1201–1207 (2018).

  11. Mobadersany, P. et al. Proc. Natl Acad. Sci. USA 115, E2970–E2979 (2018).

  12. He, K., Zhang, X., Ren, S. & Sun, J. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).

  13. Kather, J. N. et al. PLoS Med. 16, e1002730 (2019).

  14. Kather, J. N. et al. Sci. Rep. 6, 27988 (2016).

  15. The Cancer Genome Atlas Network. Nature 513, 202–209 (2014).

  16. The Cancer Genome Atlas Network. Nature 487, 330–337 (2012).

  17. Hoffmeister, M. et al. J. Natl Cancer Inst. 107, djv045 (2015).

  18. Brenner, H., Chang-Claude, J., Seiler, C. M. & Hoffmeister, M. J. Clin. Oncol. 29, 3761–3767 (2011).

  19. Aoyama, T. et al. Cancer Med. 7, 4914–4923 (2018).

  20. Rahman, R., Asombang, A. W. & Ibdah, J. A. World J. Gastroenterol. 20, 4483–4490 (2014).

  21. Levine, D. A. & The Cancer Genome Atlas Research Network. Nature 497, 67–73 (2013).

  22. Kawakami, H., Zaanan, A. & Sinicrope, F. A. Curr. Treat. Options Oncol. 16, 30 (2015).

  23. Zhu, L. et al. Mol. Clin. Oncol. 3, 699–705 (2015).

  24. Macenko, M. et al. In Proc. IEEE International Symposium on Biomedical Imaging 1107–1110 (2009).

  25. Liu, Y. et al. Cancer Cell 33, 721–735 (2018).

  26. Bailey, M. H. et al. Cell 173, 371–385 (2018).

  27. Krizhevsky, A., Sutskever, I. & Hinton, G. E. In Proc. Advances in Neural Information Processing Systems 1097–1105 (2012).

  28. Simonyan, K. & Zisserman, A. Preprint at https://arxiv.org/abs/1409.1556 (2014).

  29. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2818–2826 (2016).

  30. Iandola, F. N. et al. Preprint at https://arxiv.org/abs/1602.07360 (2016).

  31. DiCiccio, T. J. & Efron, B. Stat. Sci. 11, 189–228 (1996).


Acknowledgements

The results are in part based on data generated by the TCGA Research Network (http://cancergenome.nih.gov/). J.N.K. was funded by RWTH University Aachen (START 2018-691906). A.T.P. was funded by NIH/NIDCR (K08-DE026500). A.M. was funded by the German Federal Ministry of Education and Research (BMBF) (M2oBITE/13GW0091E). The DACHS study was funded by the Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany and German Research Council DFG (BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, BR 1704/17-1, CH 117/1-1 and HO 5117/2-1), BMBF (01ER0814, 01ER0815, 01ER1505A and 01ER1505B). P.B. was funded by the DFG (BO 3755/6-1, SFB-TRR57 and SFB-TRR219). T.L. was funded by Horizon 2020 through the European Research Council (ERC) Consolidator Grant PhaseControl (771083), Mildred-Scheel-Endowed Professorship from the German Cancer Aid, DFG (SFB-TRR57/P06 and LU 1360/3-1), Ernst-Jung-Foundation Hamburg and IZKF (Interdisciplinary Center of Clinical Research) at RWTH Aachen.

Author information


Contributions

J.N.K., A.T.P. and T.L. designed the study; J.N.K. and J.K. performed the analysis; J.N.K., S.H.L. and T.L. performed the statistical analyses; N.H., D.J., A.M., H.I.G., T.Y., H.B., J.C.-C. and M.H. provided human tissue material; D.J., C.T., F.T., U.P.N. and T.L. supervised the study; A.M., P.B. and H.I.G. contributed histopathology expertise; all authors contributed to the interpretation of data and to the writing and revision of the manuscript.

Corresponding authors

Correspondence to Jakob Nikolas Kather or Tom Luedde.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Comparison of five deep neural network architectures.

We compared the accuracy and training time of five neural network architectures on the tumor detection dataset with three balanced classes. Alexnet27, VGG19 (ref. 28) and resnet18 (ref. 12) achieved >95% accuracy on withheld images, whereas inceptionv3 (ref. 29) and squeezenet30 performed poorly on this benchmark task. Among the well-performing models, resnet18 had the lowest number of parameters, making it potentially more portable and less prone to overfitting. In this comparison, we split the dataset into 70% training, 15% validation and 15% test images. Each network is shown twice in this graph: once with a learning rate of 1 × 10−6 and once with 1 × 10−5 (outlined). Training was run for 25 epochs. Resnet18 was subsequently retrained on the dataset, attaining a median fivefold cross-validated out-of-sample AUC > 0.99 for tumor detection. The dataset was derived from n = 94 whole-slide images from n = 81 patients and is available at https://doi.org/10.5281/zenodo.2530789.
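The cross-validated AUC values reported above can be reproduced without any plotting machinery. As a minimal illustration (not the authors' published evaluation code), the area under the ROC curve equals the Mann–Whitney probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one:

```python
def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive example is scored above a randomly chosen
    negative one, with ties counting one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example with made-up scores; a perfect classifier gives AUC = 1.0.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This pairwise formulation is mathematically equivalent to integrating the ROC curve, which is why AUC is insensitive to the choice of a single decision threshold.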


Extended Data Fig. 2 Additional data for classifier performance.

a, Flowchart of all experiments. The area under the receiver operating characteristic curve gives an overall measure of patient-level classifier accuracy as measured in held-out test sets. Flag symbols are from https://twemoji.twitter.com/ (licensed under a CC-BY 4.0 license). b, Classification performance in virtual biopsies. We predicted MSI status in all patients in the DACHS cohort, varying the number of blocks (tiles) from 3 to 2,054, the median number of blocks per whole-slide image. This experiment was repeated five times, each time with a different random selection of blocks. As one block has an edge length of 256 µm, a 1-cm tissue cylinder consisting of 100% tumor tissue from a standard 18G biopsy needle corresponds to 117 blocks and a 16G needle corresponds to 156 blocks. In clinical routine, usually only part of each biopsy core contains tumor, but multiple biopsy cores are collected. With increasing tissue size, performance stabilizes at AUC = 0.84. This shows that a typical biopsy would be sufficient for MSI prediction. CI, confidence interval. c, Distribution of the number of blocks for all patients in DACHS (n = 378 patients). d, Overall survival of patients with genetically MSS tumors stratified by high or low predicted MSIness. In this group, patients with high MSIness had shorter survival than patients with low MSIness. The table shows the number of patients at risk. The P value was calculated by a two-sided log-rank test (n = 350 patients).
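The block counts quoted for virtual biopsies follow from simple geometry: roughly 39 tiles of 256 µm fit along a 1-cm core, multiplied by the number of tile rows spanning the core diameter. A sketch of that arithmetic, in which the ~0.77 mm and ~1.02 mm core diameters are back-calculated assumptions rather than values stated in the legend:

```python
TILE_EDGE_UM = 256        # edge length of one block (tile), in micrometers
CORE_LENGTH_UM = 10_000   # 1-cm tissue cylinder

def blocks_in_core(core_diameter_um: float) -> int:
    """Approximate number of 256-um tiles tiling a needle-biopsy core,
    modeled as a rectangular strip of tiles."""
    cols = CORE_LENGTH_UM // TILE_EDGE_UM           # tiles along the core: 39
    rows = round(core_diameter_um / TILE_EDGE_UM)   # tile rows across the core
    return int(rows * cols)

# Assumed core diameters (hypothetical round figures, chosen to match
# the 117- and 156-block counts in the legend):
print(blocks_in_core(770))    # 18G needle -> 3 rows * 39 = 117 blocks
print(blocks_in_core(1020))   # 16G needle -> 4 rows * 39 = 156 blocks
```

Under these assumptions, even the narrower 18G core yields over 100 tiles, well into the regime where the legend reports that performance has stabilized.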


Extended Data Fig. 3 Morphological correlates of intratumor heterogeneity of MSI.

a, Histological image of a test set patient who was genetically determined to be MSI. b, Corresponding predicted MSI map for the image shown in a. Three regions are highlighted. Region 1 is a glandular region with necrosis and extracellular mucus; this region was predominantly predicted to be MSS. Region 2 is a solid, dedifferentiated region, which was predicted to be MSI. Region 3 contains mostly budding tumor cells mixed with immune cells; this region was strongly predicted to be MSI. Together, these representative examples show that different morphologies elicit different predictions and that these predictions can be traced back to patterns that are understandable to humans. Scale bar, 2.5 mm. This figure is representative of n = 378 patients in the DACHS cohort.

Extended Data Fig. 4 Estimated cost for MSI screening with deep learning.

a, Workflow for MSI screening with deep learning versus immunohistochemistry in tertiary care centers with existing digital pathology core facilities, such as the University of Chicago Medical Center. Costs differ by country and are usually lower in Europe than in the United States; here, we list the costs that apply in the United States. b, Set-up cost (fixed cost) for a digital pathology and deep learning infrastructure. H&E, hematoxylin and eosin; MMRd, mismatch repair deficiency; NGS, next-generation sequencing; QC, quality control. Sources and assumptions were as follows. (1) Prices were obtained from https://htrc.uchicago.edu/fees.php?fee=2&fee=2, retrieved on 11 March 2019. We assume ×20 magnification on a high-volume whole-slide scanner. (2) Prices were obtained from https://techcrunch.com/2019/03/07/scaleway-releases-cloud-gpu-instances-for-e1-per-hour/ and https://www.scaleway.com/, retrieved on 11 March 2019. We assume that 1 h of GPU computing on an Nvidia Tesla P100 GPU is required to process the whole-slide images for one patient to prediction. (3) US Current Procedural Terminology (CPT) code 88342, four-antibody panel at US$852.00 per staining. (4) Personal communication by the Pathology Department, University of Chicago Medicine, March 2019. (5) Personal communication, Medical Oncology, National Center for Tumor Diseases, Germany. (6) Personal experience of the cost for a high-throughput slide scanner plus limited storage capacity, based on offers by multiple digital pathology vendors. (7) Assuming a tower server with one Nvidia Tesla V100 GPU or similar GPU, based on multiple offers by providers of professional hardware, March 2019. Staff cost and infrastructure cost are not accounted for in this schematic.
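Under the per-patient figures quoted above, the marginal cost comparison reduces to a few lines of arithmetic. In this sketch, the scanning fee is a hypothetical placeholder (the legend cites a fee schedule rather than a single number), and the US$852.00 figure is read as the total for the four-antibody panel, which is an assumption:

```python
# Per-patient marginal cost of MSI screening: deep learning vs. IHC.
SCAN_COST_USD = 20.0    # hypothetical whole-slide scanning fee (placeholder)
GPU_COST_USD = 1.0      # ~1 GPU-hour at ~US$1/h, per the legend above
IHC_PANEL_USD = 852.0   # US$852.00 read as the four-antibody panel total

dl_cost = SCAN_COST_USD + GPU_COST_USD
print(f"deep learning: ${dl_cost:.2f} per patient")
print(f"IHC panel:     ${IHC_PANEL_USD:.2f} per patient")
print(f"difference:    ${IHC_PANEL_USD - dl_cost:.2f} per patient")
```

Even with generous placeholder values for scanning, the marginal cost of inference is dominated by the one-time infrastructure outlay in panel b rather than by per-patient compute.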

Supplementary information

Source data

Source Data Fig. 2

Source data for the ROC curve in Fig. 2d and the correlation analysis in Fig. 2e

Source Data Extended Data Fig. 1

Source data for scatter plot in Extended Data Fig. 1

Source Data Extended Data Fig. 2

Source data for virtual biopsies in Extended Data Fig. 2b, source data for histogram in Extended Data Fig. 2c and source data for the Kaplan–Meier plot in Extended Data Fig. 2d

Rights and permissions

Reprints and permissions

About this article


Cite this article

Kather, J.N., Pearson, A.T., Halama, N. et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat Med 25, 1054–1056 (2019). https://doi.org/10.1038/s41591-019-0462-y

