
  • Brief Communication

Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer

Abstract

Microsatellite instability determines whether patients with gastrointestinal cancer respond exceptionally well to immunotherapy. However, in clinical practice, not every patient is tested for MSI, because this requires additional genetic or immunohistochemical tests. Here we show that deep residual learning can predict MSI directly from H&E histology, which is ubiquitously available. This approach has the potential to provide immunotherapy to a much broader subset of patients with gastrointestinal cancer.


Fig. 1: Tumor detection and MSI prediction in H&E histology.
Fig. 2: Classification performance in an external validation set.


Data availability

All whole-slide images for the datasets are available at https://portal.gdc.cancer.gov/. Training images for tumor detection are available at https://doi.org/10.5281/zenodo.2530789; training images for MSI detection are available at https://doi.org/10.5281/zenodo.2530835 and https://doi.org/10.5281/zenodo.2532612. These repositories also contain the source data for Fig. 1. Source Data for Figs. 1 and 2 and Extended Data Figs. 1 and 2, containing the raw data for these figures, are available in the online version of the paper.

Code availability

The source code is available at https://github.com/jnkather/MSIfromHE.

References

  1. Darvin, P., Toor, S. M., Sasidharan Nair, V. & Elkord, E. Exp. Mol. Med. 50, 165 (2018).

  2. Le, D. T. et al. N. Engl. J. Med. 372, 2509–2520 (2015).

  3. Bonneville, R. et al. JCO Precis. Oncol. 2017, 1–15 (2017).

  4. Le, D. T. et al. Science 357, 409–413 (2017).

  5. Kather, J. N., Halama, N. & Jaeger, D. Semin. Cancer Biol. 52, 189–197 (2018).

  6. Franke, A. J. et al. J. Clin. Oncol. 36, 796 (2018).

  7. Norgeot, B., Glicksberg, B. S. & Butte, A. J. Nat. Med. 25, 14–15 (2019).

  8. Coudray, N. et al. Nat. Med. 24, 1559–1567 (2018).

  9. Schaumberg, A. J., Rubin, M. A. & Fuchs, T. J. Preprint at https://www.biorxiv.org/content/10.1101/064279v9 (2018).

  10. Chang, P. et al. AJNR Am. J. Neuroradiol. 39, 1201–1207 (2018).

  11. Mobadersany, P. et al. Proc. Natl Acad. Sci. USA 115, E2970–E2979 (2018).

  12. He, K., Zhang, X., Ren, S. & Sun, J. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).

  13. Kather, J. N. et al. PLoS Med. 16, e1002730 (2019).

  14. Kather, J. N. et al. Sci. Rep. 6, 27988 (2016).

  15. The Cancer Genome Atlas Network. Nature 513, 202–209 (2014).

  16. The Cancer Genome Atlas Network. Nature 487, 330–337 (2012).

  17. Hoffmeister, M. et al. J. Natl Cancer Inst. 107, djv045 (2015).

  18. Brenner, H., Chang-Claude, J., Seiler, C. M. & Hoffmeister, M. J. Clin. Oncol. 29, 3761–3767 (2011).

  19. Aoyama, T. et al. Cancer Med. 7, 4914–4923 (2018).

  20. Rahman, R., Asombang, A. W. & Ibdah, J. A. World J. Gastroenterol. 20, 4483–4490 (2014).

  21. Levine, D. A. & The Cancer Genome Atlas Research Network. Nature 497, 67–73 (2013).

  22. Kawakami, H., Zaanan, A. & Sinicrope, F. A. Curr. Treat. Options Oncol. 16, 30 (2015).

  23. Zhu, L. et al. Mol. Clin. Oncol. 3, 699–705 (2015).

  24. Macenko, M. et al. In Proc. IEEE International Symposium on Biomedical Imaging 1107–1110 (2009).

  25. Liu, Y. et al. Cancer Cell 33, 721–735 (2018).

  26. Bailey, M. H. et al. Cell 173, 371–385 (2018).

  27. Krizhevsky, A., Sutskever, I. & Hinton, G. E. In Proc. Advances in Neural Information Processing Systems 1097–1105 (2012).

  28. Simonyan, K. & Zisserman, A. Preprint at https://arxiv.org/abs/1409.1556 (2014).

  29. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2818–2826 (2016).

  30. Iandola, F. N. et al. Preprint at https://arxiv.org/abs/1602.07360 (2016).

  31. DiCiccio, T. J. & Efron, B. Stat. Sci. 11, 189–228 (1996).


Acknowledgements

The results are in part based on data generated by the TCGA Research Network (http://cancergenome.nih.gov/). J.N.K. was funded by RWTH University Aachen (START 2018-691906). A.T.P. was funded by NIH/NIDCR (K08-DE026500). A.M. was funded by the German Federal Ministry of Education and Research (BMBF) (M2oBITE/13GW0091E). The DACHS study was funded by the Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany and German Research Council DFG (BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, BR 1704/17-1, CH 117/1-1 and HO 5117/2-1), BMBF (01ER0814, 01ER0815, 01ER1505A and 01ER1505B). P.B. was funded by the DFG (BO 3755/6-1, SFB-TRR57 and SFB-TRR219). T.L. was funded by Horizon 2020 through the European Research Council (ERC) Consolidator Grant PhaseControl (771083), Mildred-Scheel-Endowed Professorship from the German Cancer Aid, DFG (SFB-TRR57/P06 and LU 1360/3-1), Ernst-Jung-Foundation Hamburg and IZKF (Interdisciplinary Center of Clinical Research) at RWTH Aachen.

Author information


Contributions

J.N.K., A.T.P. and T.L. designed the study; J.N.K. and J.K. performed the analysis; J.N.K., S.H.L. and T.L. performed the statistical analyses; N.H., D.J., A.M., H.I.G., T.Y., H.B., J.C.-C. and M.H. provided human tissue material; D.J., C.T., F.T., U.P.N. and T.L. supervised the study; A.M., P.B. and H.I.G. contributed histopathology expertise; all authors contributed to the interpretation of data and to the writing and revision of the manuscript.

Corresponding authors

Correspondence to Jakob Nikolas Kather or Tom Luedde.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Comparison of five deep neural network architectures.

We compared the accuracy and training time of five neural network architectures on the tumor detection dataset with three balanced classes. Alexnet27, VGG19 (ref. 28) and resnet18 (ref. 12) achieved >95% accuracy on withheld images, whereas inceptionv3 (ref. 29) and squeezenet30 performed poorly on this benchmark task. Among the well-performing models, resnet18 had the lowest number of parameters, making it potentially more portable and less prone to overfitting. In this comparison, we split the dataset into 70% training, 15% validation and 15% test images. Each network is shown twice in this graph: once with a learning rate of 1 × 10−6 and once with 1 × 10−5 (outlined). Training was run for 25 epochs. Resnet18 was subsequently retrained on the dataset, attaining a median fivefold cross-validated out-of-sample AUC > 0.99 for tumor detection. The dataset was derived from n = 94 whole-slide images from n = 81 patients and is available at https://doi.org/10.5281/zenodo.2530789.
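The cross-validated AUC values reported above can be reproduced without any plotting machinery. As a minimal illustration (not the authors' published evaluation code), the area under the ROC curve equals the Mann–Whitney probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one:

```python
def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive example is scored above a randomly chosen
    negative one, with ties counting one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example with made-up scores; a perfect classifier gives AUC = 1.0.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This pairwise formulation is mathematically equivalent to integrating the ROC curve, which is why AUC is insensitive to the choice of a single decision threshold.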


Extended Data Fig. 2 Additional data for classifier performance.

a, Flowchart of all experiments. The area under the receiver operating characteristic curve gives an overall measure of patient-level classifier accuracy as measured in held-out test sets. Flag symbols are from https://twemoji.twitter.com/ (licensed under a CC-BY 4.0 license). b, Classification performance in virtual biopsies. We predicted MSI status in all patients in the DACHS cohort, varying the number of blocks (tiles) from 3 to 2,054, the median number of blocks per whole-slide image. This experiment was repeated five times, each time with a different random selection of blocks. As one block has an edge length of 256 µm, a 1-cm tissue cylinder consisting of 100% tumor tissue from a standard 18G biopsy needle corresponds to 117 blocks and a 16G needle corresponds to 156 blocks. In clinical routine, usually only part of each biopsy core contains tumor, but multiple biopsy cores are collected. With increasing tissue size, performance stabilizes at AUC = 0.84. This shows that a typical biopsy would be sufficient for MSI prediction. CI, confidence interval. c, Distribution of the number of blocks for all patients in DACHS (n = 378 patients). d, Overall survival of patients with genetically MSS tumors stratified by high or low predicted MSIness. In this group, patients with high MSIness had shorter survival than patients with low MSIness. The table shows the number of patients at risk. The P value was calculated by a two-sided log-rank test (n = 350 patients).
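The block counts quoted for virtual biopsies follow from simple geometry: roughly 39 tiles of 256 µm fit along a 1-cm core, multiplied by the number of tile rows spanning the core diameter. A sketch of that arithmetic, in which the ~0.77 mm and ~1.02 mm core diameters are back-calculated assumptions rather than values stated in the legend:

```python
TILE_EDGE_UM = 256        # edge length of one block (tile), in micrometers
CORE_LENGTH_UM = 10_000   # 1-cm tissue cylinder

def blocks_in_core(core_diameter_um: float) -> int:
    """Approximate number of 256-um tiles tiling a needle-biopsy core,
    modeled as a rectangular strip of tiles."""
    cols = CORE_LENGTH_UM // TILE_EDGE_UM           # tiles along the core: 39
    rows = round(core_diameter_um / TILE_EDGE_UM)   # tile rows across the core
    return int(rows * cols)

# Assumed core diameters (hypothetical round figures, chosen to match
# the 117- and 156-block counts in the legend):
print(blocks_in_core(770))    # 18G needle -> 3 rows * 39 = 117 blocks
print(blocks_in_core(1020))   # 16G needle -> 4 rows * 39 = 156 blocks
```

Under these assumptions, even the narrower 18G core yields over 100 tiles, well into the regime where the legend reports that performance has stabilized.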


Extended Data Fig. 3 Morphological correlates of intratumor heterogeneity of MSI.

a, Histological image of a test set patient who was genetically determined to be MSI. b, Corresponding predicted MSI map for the image shown in a. Three regions are highlighted. Region 1 is a glandular region with necrosis and extracellular mucus; this region was predominantly predicted to be MSS. Region 2 is a solid, dedifferentiated region, which was predicted to be MSI. Region 3 contains mostly budding tumor cells mixed with immune cells; this region was strongly predicted to be MSI. Together, these representative examples show that different morphologies elicit different predictions and that these predictions can be traced back to patterns that are understandable to humans. Scale bar, 2.5 mm. This figure is representative of n = 378 patients in the DACHS cohort.

Extended Data Fig. 4 Estimated cost for MSI screening with deep learning.

a, Workflow for MSI screening with deep learning versus immunohistochemistry in tertiary care centers with existing digital pathology core facilities, such as the University of Chicago Medical Center. Costs differ by country and are usually lower in Europe than in the United States; here, we list the costs that apply in the United States. b, Set-up cost (fixed cost) for a digital pathology and deep learning infrastructure. H&E, hematoxylin and eosin; MMRd, mismatch repair deficiency; NGS, next-generation sequencing; QC, quality control. Sources and assumptions were as follows. (1) Prices were obtained from https://htrc.uchicago.edu/fees.php?fee=2&fee=2, retrieved on 11 March 2019. We assume ×20 magnification on a high-volume whole-slide scanner. (2) Prices were obtained from https://techcrunch.com/2019/03/07/scaleway-releases-cloud-gpu-instances-for-e1-per-hour/ and https://www.scaleway.com/, retrieved on 11 March 2019. We assume that 1 h of GPU computing on an Nvidia Tesla P100 GPU is required to process the whole-slide images for one patient to prediction. (3) US Current Procedural Terminology (CPT) code 88342, four-antibody panel at US$852.00 per staining. (4) Personal communication by the Pathology Department, University of Chicago Medicine, March 2019. (5) Personal communication, Medical Oncology, National Center for Tumor Diseases, Germany. (6) Personal experience of the cost for a high-throughput slide scanner plus limited storage capacity, based on offers by multiple digital pathology vendors. (7) Assuming a tower server with one Nvidia Tesla V100 GPU or similar GPU, based on multiple offers by providers of professional hardware, March 2019. Staff cost and infrastructure cost are not accounted for in this schematic.
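Under the per-patient figures quoted above, the marginal cost comparison reduces to a few lines of arithmetic. In this sketch, the scanning fee is a hypothetical placeholder (the legend cites a fee schedule rather than a single number), and the US$852.00 figure is read as the total for the four-antibody panel, which is an assumption:

```python
# Per-patient marginal cost of MSI screening: deep learning vs. IHC.
SCAN_COST_USD = 20.0    # hypothetical whole-slide scanning fee (placeholder)
GPU_COST_USD = 1.0      # ~1 GPU-hour at ~US$1/h, per the legend above
IHC_PANEL_USD = 852.0   # US$852.00 read as the four-antibody panel total

dl_cost = SCAN_COST_USD + GPU_COST_USD
print(f"deep learning: ${dl_cost:.2f} per patient")
print(f"IHC panel:     ${IHC_PANEL_USD:.2f} per patient")
print(f"difference:    ${IHC_PANEL_USD - dl_cost:.2f} per patient")
```

Even with generous placeholder values for scanning, the marginal cost of inference is dominated by the one-time infrastructure outlay in panel b rather than by per-patient compute.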

Supplementary information

Source data

Source Data Fig. 2

Source data for the ROC curve in Fig. 2d and the correlation analysis in Fig. 2e

Source Data Extended Data Fig. 1

Source data for scatter plot in Extended Data Fig. 1

Source Data Extended Data Fig. 2

Source data for virtual biopsies in Extended Data Fig. 2b, source data for histogram in Extended Data Fig. 2c and source data for the Kaplan–Meier plot in Extended Data Fig. 2d

Rights and permissions

Reprints and permissions

About this article


Cite this article

Kather, J.N., Pearson, A.T., Halama, N. et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat Med 25, 1054–1056 (2019). https://doi.org/10.1038/s41591-019-0462-y

