Brief Communication

Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer


Microsatellite instability (MSI) determines whether patients with gastrointestinal cancer respond exceptionally well to immunotherapy. However, in clinical practice, not every patient is tested for MSI, because this requires additional genetic or immunohistochemical tests. Here we show that deep residual learning can predict MSI directly from H&E histology, which is ubiquitously available. This approach has the potential to provide immunotherapy to a much broader subset of patients with gastrointestinal cancer.
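
As a rough illustration of the workflow (a sketch, not the authors' implementation: the tile size in pixels, the stub classifier and the mean aggregation are simplifying assumptions; the trained ResNet-18 and preprocessing are described in the Methods), a slide is cut into tiles, each tile receives an MSI probability, and the tile scores are aggregated into one patient-level "MSIness" score:

```python
import random
from statistics import mean

def tile_coordinates(width, height, tile=512):
    """Top-left corners of non-overlapping square tiles covering a
    whole-slide image (tile size in pixels is an assumption here; the
    paper's blocks have a 256-um edge, so pixels depend on scanner)."""
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]

def patient_msi_score(tile_probs):
    """Aggregate per-tile MSI probabilities into one patient-level
    score; averaging over tiles is one plausible aggregation."""
    return mean(tile_probs)

# Usage with a stand-in classifier: random scores replace a trained
# ResNet-18 (the real network is trained on labeled tiles).
coords = tile_coordinates(2048, 2048)      # 4 x 4 = 16 tiles
random.seed(0)
probs = [random.random() for _ in coords]  # stub per-tile predictions
score = patient_msi_score(probs)           # patient-level score in [0, 1]
```

The point of the sketch is only the two-stage structure: tile-level classification followed by patient-level aggregation, so that a single slide yields a single screening score.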


Data availability

All whole-slide images, the training images for tumor detection and the training images for MSI detection are available in public repositories. Source Data for Figs. 1 and 2 and Extended Data Figs. 1 and 2, containing the raw data for these figures, are available in the online version of the paper.

Code availability

Source code is available online.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


References

  1. Darvin, P., Toor, S. M., Sasidharan Nair, V. & Elkord, E. Exp. Mol. Med. 50, 165 (2018).

  2. Le, D. T. et al. N. Engl. J. Med. 372, 2509–2520 (2015).

  3. Bonneville, R. et al. JCO Precis. Oncol. 2017, 1–15 (2017).

  4. Le, D. T. et al. Science 357, 409–413 (2017).

  5. Kather, J. N., Halama, N. & Jaeger, D. Semin. Cancer Biol. 52, 189–197 (2018).

  6. Franke, A. J. et al. J. Clin. Oncol. 36, 796 (2018).

  7. Norgeot, B., Glicksberg, B. S. & Butte, A. J. Nat. Med. 25, 14–15 (2019).

  8. Coudray, N. et al. Nat. Med. 24, 1559–1567 (2018).

  9. Schaumberg, A. J., Rubin, M. A. & Fuchs, T. J. Preprint (2018).

  10. Chang, P. et al. AJNR Am. J. Neuroradiol. 39, 1201–1207 (2018).

  11. Mobadersany, P. et al. Proc. Natl Acad. Sci. USA 115, E2970–E2979 (2018).

  12. He, K., Zhang, X., Ren, S. & Sun, J. in Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).

  13. Kather, J. N. et al. PLoS Med. 16, e1002730 (2019).

  14. Kather, J. N. et al. Sci. Rep. 6, 27988 (2016).

  15. The Cancer Genome Atlas Network. Nature 513, 202–209 (2014).

  16. The Cancer Genome Atlas Network. Nature 487, 330–337 (2012).

  17. Hoffmeister, M. et al. J. Natl Cancer Inst. 107, djv045 (2015).

  18. Brenner, H., Chang-Claude, J., Seiler, C. M. & Hoffmeister, M. J. Clin. Oncol. 29, 3761–3767 (2011).

  19. Aoyama, T. et al. Cancer Med. 7, 4914–4923 (2018).

  20. Rahman, R., Asombang, A. W. & Ibdah, J. A. World J. Gastroenterol. 20, 4483–4490 (2014).

  21. Levine, D. A. & The Cancer Genome Atlas Research Network. Nature 497, 67–73 (2013).

  22. Kawakami, H., Zaanan, A. & Sinicrope, F. A. Curr. Treat. Options Oncol. 16, 30 (2015).

  23. Zhu, L. et al. Mol. Clin. Oncol. 3, 699–705 (2015).

  24. Macenko, M. et al. in Proc. IEEE International Symposium on Biomedical Imaging 1107–1110 (2009).

  25. Liu, Y. et al. Cancer Cell 33, 721–735 (2018).

  26. Bailey, M. H. et al. Cell 173, 371–385 (2018).

  27. Krizhevsky, A., Sutskever, I. & Hinton, G. E. in Proc. Advances in Neural Information Processing Systems 1097–1105 (2012).

  28. Simonyan, K. & Zisserman, A. Preprint (2014).

  29. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. in Proc. IEEE Conference on Computer Vision and Pattern Recognition 2818–2826 (2016).

  30. Iandola, F. N. et al. Preprint (2016).

  31. DiCiccio, T. J. & Efron, B. Stat. Sci. 11, 189–228 (1996).



Acknowledgements

The results are in part based on data generated by the TCGA Research Network. J.N.K. was funded by RWTH University Aachen (START 2018-691906). A.T.P. was funded by the NIH/NIDCR (K08-DE026500). A.M. was funded by the German Federal Ministry of Education and Research (BMBF) (M2oBITE/13GW0091E). The DACHS study was funded by the Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany, the German Research Council (DFG) (BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, BR 1704/17-1, CH 117/1-1 and HO 5117/2-1) and the BMBF (01ER0814, 01ER0815, 01ER1505A and 01ER1505B). P.B. was funded by the DFG (BO 3755/6-1, SFB-TRR57 and SFB-TRR219). T.L. was funded by Horizon 2020 through the European Research Council (ERC) Consolidator Grant PhaseControl (771083), a Mildred Scheel Endowed Professorship from the German Cancer Aid, the DFG (SFB-TRR57/P06 and LU 1360/3-1), the Ernst Jung Foundation Hamburg and the Interdisciplinary Center of Clinical Research (IZKF) at RWTH Aachen.

Author information

J.N.K., A.T.P. and T.L. designed the study; J.N.K. and J.K. performed the analysis; J.N.K., S.H.L. and T.L. performed the statistical analyses; N.H., D.J., A.M., H.I.G., T.Y., H.B., J.C.-C. and M.H. provided human tissue material; D.J., C.T., F.T., U.P.N. and T.L. supervised the study; A.M., P.B. and H.I.G. contributed histopathology expertise; all authors contributed to the interpretation of data and to the writing and revision of the manuscript.

Competing interests

The authors declare no competing interests.

Correspondence to Jakob Nikolas Kather or Tom Luedde.

Extended data

  1. Extended Data Fig. 1 Comparison of five deep neural network architectures.

    We compared the accuracy and training time of five neural network architectures on the tumor detection dataset with three balanced classes. Alexnet (ref. 27), VGG19 (ref. 28) and resnet18 (ref. 12) achieved >95% accuracy on withheld images, whereas inceptionv3 (ref. 29) and squeezenet (ref. 30) performed poorly on this benchmark task. Among the well-performing models, resnet18 had the lowest number of parameters, making it potentially more portable and less prone to overfitting. In this comparison, we split the dataset into 70% training, 15% validation and 15% test images. Each network is shown twice in this graph: with a learning rate of 1 × 10−6 and of 1 × 10−5 (outlined). Training was run for 25 epochs. Resnet18 was subsequently retrained on the dataset, attaining a median fivefold cross-validated out-of-sample AUC > 0.99 for tumor detection. The dataset was derived from n = 94 whole-slide images from n = 81 patients and is publicly available.

  2. Extended Data Fig. 2 Additional data for classifier performance.

    a, Flowchart of all experiments. The area under the receiver operating characteristic curve gives an overall measure of patient-level classifier accuracy as measured in held-out test sets. Flag symbols are used under a CC-BY 4.0 license. b, Classification performance in virtual biopsies. We predicted MSI status in all patients in the DACHS cohort, varying the number of blocks (tiles) from 3 to 2,054, the median number of blocks per whole-slide image. This experiment was repeated five times with different randomly picked blocks. As one block has an edge length of 256 µm, a 1-cm tissue cylinder consisting of 100% tumor tissue from a standard 18G biopsy needle corresponds to 117 blocks, and a 16G needle corresponds to 156 blocks. In clinical routine, usually only part of each biopsy core contains tumor, but multiple biopsy cores are collected. With increasing tissue size, performance stabilizes at AUC = 0.84, showing that a typical biopsy would be sufficient for MSI prediction. CI, confidence interval. c, Distribution of the number of blocks for all patients in DACHS (n = 378 patients). d, Overall survival of patients with genetically MSS tumors stratified by high or low predicted MSIness. In this group, patients with high MSIness had shorter survival than patients with low MSIness. The table shows the number of patients at risk. The P value was calculated by a two-sided log-rank test (n = 350 patients).

  3. Extended Data Fig. 3 Morphological correlates of intratumor heterogeneity of MSI.

    a, Histological image of a test set patient who was genetically determined to be MSI. b, Corresponding predicted MSI map for the image shown in a. Three regions are highlighted. Region 1 is a glandular region with necrosis and extracellular mucus; this region was predominantly predicted to be MSS. Region 2 is a solid, dedifferentiated region, which was predicted to be MSI. Region 3 contains mostly budding tumor cells mixed with immune cells; this region was strongly predicted to be MSI. Together, these representative examples show that different morphologies elicit different predictions and that these predictions can be traced back to patterns that are understandable to humans. Scale bar, 2.5 mm. This figure is representative of n = 378 patients in the DACHS cohort.

  4. Extended Data Fig. 4 Estimated cost for MSI screening with deep learning.

    a, Workflow for MSI screening with deep learning versus immunohistochemistry in tertiary care centers with existing digital pathology core facilities, such as the University of Chicago Medical Center. Costs differ by country and are usually lower in Europe than in the United States; here, we list the costs that apply in the United States. b, Set-up cost (fixed cost) for a digital pathology and deep learning infrastructure. H&E, hematoxylin and eosin; MMRd, mismatch repair deficiency; NGS, next-generation sequencing; QC, quality control. Sources and assumptions were as follows. (1) Vendor prices retrieved on 11 March 2019; we assume ×20 magnification on a high-volume whole-slide scanner. (2) Vendor prices retrieved on 11 March 2019; we assume that 1 h of GPU computing on a Nvidia Tesla P100 GPU is required to process the whole-slide images for one patient to prediction. (3) US Current Procedural Terminology (CPT) code 88342, four-antibody panel at US$852.00 per staining. (4) Personal communication, Pathology Department, University of Chicago Medicine, March 2019. (5) Personal communication, Medical Oncology, National Center for Tumor Diseases, Germany. (6) Personal experience of the cost of a high-throughput slide scanner plus limited storage capacity, based on offers by multiple digital pathology vendors. (7) Assuming a tower server with one NVidia Tesla V100 or similar GPU, based on multiple offers by providers of professional hardware, March 2019. Staff and infrastructure costs are not accounted for in this schematic.
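
The virtual-biopsy arithmetic in Extended Data Fig. 2b can be sketched as follows. The helper `biopsy_blocks` is hypothetical (not the authors' code), and the needle inner diameters of ~0.8 mm (18G) and ~1.1 mm (16G) are assumptions chosen only to show how the quoted 117- and 156-block figures can arise from a 256-µm block edge:

```python
from math import floor

BLOCK_EDGE_UM = 256  # edge length of one block (tile), from the legend

def biopsy_blocks(core_length_um, needle_inner_diameter_um):
    """Approximate how many 256-um blocks fit in a fully tumor-filled
    biopsy core: whole blocks along the core times blocks across it."""
    along = floor(core_length_um / BLOCK_EDGE_UM)        # e.g. 1 cm -> 39
    across = round(needle_inner_diameter_um / BLOCK_EDGE_UM)
    return along * across

# A 1-cm core; the inner diameters below are assumed values.
print(biopsy_blocks(10_000, 800))    # 39 x 3 = 117 blocks (18G)
print(biopsy_blocks(10_000, 1_100))  # 39 x 4 = 156 blocks (16G)
```

Since performance in the legend stabilizes well below these block counts, a single core in this back-of-the-envelope model already supplies more tiles than the classifier needs.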
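
The marginal-cost comparison in Extended Data Fig. 4a reduces to simple arithmetic. The sketch below uses the US$852.00 IHC panel price stated in the legend; the per-slide scanning price and GPU hourly rate are hypothetical placeholders (the vendor figures are not reproduced here), and `marginal_cost_per_patient` is an illustrative helper, not the authors' calculation:

```python
IHC_PANEL_USD = 852.00  # CPT code 88342, four-antibody panel (from the legend)

def marginal_cost_per_patient(scan_usd, gpu_hours, gpu_usd_per_hour):
    """Per-patient (marginal) cost of deep-learning MSI screening:
    slide scanning plus GPU inference (the legend assumes ~1 h on a
    Tesla P100 per patient); set-up costs are excluded, as in Fig. 4b."""
    return scan_usd + gpu_hours * gpu_usd_per_hour

# Placeholder prices, for illustration only:
dl_cost = marginal_cost_per_patient(scan_usd=5.00, gpu_hours=1.0,
                                    gpu_usd_per_hour=1.50)
print(dl_cost < IHC_PANEL_USD)  # True for these placeholder prices
```

Whatever the exact vendor prices, the comparison hinges on scanning-plus-inference staying far below the fixed per-panel staining fee once the scanner and GPU are in place.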

Supplementary information

  1. Supplementary Information

    Supplementary Tables 1–5

  2. Reporting Summary

Source data

  1. Source Data Fig. 2

    Source data for the ROC curve in Fig. 2d and the correlation analysis in Fig. 2e

  2. Source Data Extended Data Fig. 1

    Source data for scatter plot in Extended Data Fig. 1

  3. Source Data Extended Data Fig. 2

    Source data for virtual biopsies in Extended Data Fig. 2b, source data for histogram in Extended Data Fig. 2c and source data for the Kaplan–Meier plot in Extended Data Fig. 2d


Fig. 1: Tumor detection and MSI prediction in H&E histology.
Fig. 2: Classification performance in an external validation set.