Microsatellite instability determines whether patients with gastrointestinal cancer respond exceptionally well to immunotherapy. However, in clinical practice, not every patient is tested for MSI, because this requires additional genetic or immunohistochemical tests. Here we show that deep residual learning can predict MSI directly from H&E histology, which is ubiquitously available. This approach has the potential to provide immunotherapy to a much broader subset of patients with gastrointestinal cancer.
All whole-slide images for datasets are available at https://portal.gdc.cancer.gov/. Training images for tumor detection are available at https://doi.org/10.5281/zenodo.2530789. Training images for MSI detection are available at https://doi.org/10.5281/zenodo.2530835 and https://doi.org/10.5281/zenodo.2532612. Source data for Fig. 1 are available in public repositories at https://doi.org/10.5281/zenodo.2530789, https://doi.org/10.5281/zenodo.2530835 and https://doi.org/10.5281/zenodo.2532612. Source Data for Figs. 1, 2 and Extended Data Figs. 1, 2 containing the raw data for these figures are available in the online version of the paper.
Source code is available at https://github.com/jnkather/MSIfromHE.
Darvin, P., Toor, S. M., Sasidharan Nair, V. & Elkord, E. Exp. Mol. Med. 50, 165 (2018).
Le, D. T. et al. N. Engl. J. Med. 372, 2509–2520 (2015).
Bonneville, R. et al. JCO Precis. Oncol. 2017, 1–15 (2017).
Le, D. T. et al. Science 357, 409–413 (2017).
Kather, J. N., Halama, N. & Jaeger, D. Semin. Cancer Biol. 52, 189–197 (2018).
Franke, A. J. et al. J. Clin. Oncol. 36, 796 (2018).
Norgeot, B., Glicksberg, B. S. & Butte, A. J. Nat. Med. 25, 14–15 (2019).
Coudray, N. et al. Nat. Med. 24, 1559–1567 (2018).
Schaumberg, A. J., Rubin, M. A. & Fuchs, T. J. Preprint at https://www.biorxiv.org/content/10.1101/064279v9 (2018).
Chang, P. et al. AJNR Am. J. Neuroradiol. 39, 1201–1207 (2018).
Mobadersany, P. et al. Proc. Natl Acad. Sci. USA 115, E2970–E2979 (2018).
He, K., Zhang, X., Ren, S. & Sun, J. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (2016).
Kather, J. N. et al. PLoS Med. 16, e1002730 (2019).
Kather, J. N. et al. Sci. Rep. 6, 27988 (2016).
The Cancer Genome Atlas Network. Nature 513, 202–209 (2014).
The Cancer Genome Atlas Network. Nature 487, 330–337 (2012).
Hoffmeister, M. et al. J. Natl Cancer Inst. 107, djv045 (2015).
Brenner, H., Chang-Claude, J., Seiler, C. M. & Hoffmeister, M. J. Clin. Oncol. 29, 3761–3767 (2011).
Aoyama, T. et al. Cancer Med. 7, 4914–4923 (2018).
Rahman, R., Asombang, A. W. & Ibdah, J. A. World J. Gastroenterol. 20, 4483–4490 (2014).
Levine, D. A. & The Cancer Genome Atlas Research Network. Nature 497, 67–73 (2013).
Kawakami, H., Zaanan, A. & Sinicrope, F. A. Curr. Treat. Options Oncol. 16, 30 (2015).
Zhu, L. et al. Mol. Clin. Oncol. 3, 699–705 (2015).
Macenko, M. et al. In Proc. IEEE International Symposium on Biomedical Imaging 1107–1110 (2009).
Liu, Y. et al. Cancer Cell 33, 721–735 (2018).
Bailey, M. H. et al. Cell 173, 371–385 (2018).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. In Proc. Advances in Neural Information Processing Systems 1097–1105 (2012).
Simonyan, K. & Zisserman, A. Preprint at https://arxiv.org/abs/1409.1556 (2014).
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J. & Wojna, Z. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2818–2826 (2016).
Iandola, F. N. et al. Preprint at https://arxiv.org/abs/1602.07360 (2016).
DiCiccio, T. J. & Efron, B. Stat. Sci. 11, 189–228 (1996).
The results are in part based on data generated by the TCGA Research Network (http://cancergenome.nih.gov/). J.N.K. was funded by RWTH University Aachen (START 2018-691906). A.T.P. was funded by NIH/NIDCR (K08-DE026500). A.M. was funded by the German Federal Ministry of Education and Research (BMBF) (M2oBITE/13GW0091E). The DACHS study was funded by the Interdisciplinary Research Program of the National Center for Tumor Diseases (NCT), Germany and German Research Council DFG (BR 1704/6-1, BR 1704/6-3, BR 1704/6-4, BR 1704/17-1, CH 117/1-1 and HO 5117/2-1), BMBF (01ER0814, 01ER0815, 01ER1505A and 01ER1505B). P.B. was funded by the DFG (BO 3755/6-1, SFB-TRR57 and SFB-TRR219). T.L. was funded by Horizon 2020 through the European Research Council (ERC) Consolidator Grant PhaseControl (771083), Mildred-Scheel-Endowed Professorship from the German Cancer Aid, DFG (SFB-TRR57/P06 and LU 1360/3-1), Ernst-Jung-Foundation Hamburg and IZKF (Interdisciplinary Center of Clinical Research) at RWTH Aachen.
The authors declare no competing interests.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We compared the accuracy and training time of five neural network architectures on the tumor detection dataset with three balanced classes. Alexnet (ref. 27), VGG19 (ref. 28) and resnet18 (ref. 12) achieved >95% accuracy in withheld images, whereas inceptionv3 (ref. 29) and squeezenet (ref. 30) performed poorly on this benchmark task. Among the well-performing models, resnet18 had the lowest number of parameters, making it potentially more portable and less prone to overfitting. In this comparison, we split the dataset into 70% training, 15% validation and 15% test images. Each network is shown twice in this graph: once with a learning rate of 1 × 10⁻⁶ and once with a learning rate of 1 × 10⁻⁵ (outlined). Training was run for 25 epochs. Resnet18 was subsequently retrained on the dataset, attaining a median fivefold cross-validated out-of-sample AUC > 0.99 for tumor detection. The dataset was derived from n = 94 whole-slide images from n = 81 patients and is available at https://doi.org/10.5281/zenodo.2530789.
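The 70% / 15% / 15% split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the class names, file names and the per-class (stratified) shuffling are assumptions made here so that the three classes stay balanced in every subset.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Split (item, label) pairs into train/validation/test lists, class by class."""
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = round(len(items) * train_frac)
        n_val = round(len(items) * val_frac)
        train += [(x, label) for x in items[:n_train]]
        val += [(x, label) for x in items[n_train:n_train + n_val]]
        # Whatever remains (about 15%) becomes the test set.
        test += [(x, label) for x in items[n_train + n_val:]]
    return train, val, test


# Hypothetical dataset: three balanced classes with 200 tiles each.
samples = [
    (f"{label}_{i:03d}.png", label)
    for label in ("class_a", "class_b", "class_c")
    for i in range(200)
]
train, val, test = stratified_split(samples)
print(len(train), len(val), len(test))  # 420 90 90
```

Splitting within each class keeps the class proportions identical across the three subsets, which makes the accuracy figures of different architectures easier to compare.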
a, Flowchart of all experiments. The area under the receiver operating characteristic curve gives an overall measure of patient-level classifier accuracy as measured in held-out test sets. Flag symbols are from https://twemoji.twitter.com/ (licensed under a CC-BY 4.0 license). b, Classification performance in virtual biopsies. We predicted MSI status in all patients in the DACHS cohort, varying the number of blocks (tiles) from 3 to 2,054, which was the median number of blocks per whole-slide image. This experiment was repeated five times, each time with different randomly picked blocks. As one block has an edge length of 256 µm, a 1-cm tissue cylinder with 100% tumor tissue from a standard 18G biopsy needle corresponds to 117 blocks and one from a 16G needle corresponds to 156 blocks. In clinical routine, usually only a part of each biopsy core contains tumor, but multiple biopsy cores are collected. With increasing tissue size, performance stabilizes at AUC = 0.84. This shows that a typical biopsy would be sufficient for MSI prediction. CI, confidence interval. c, Distribution of the numbers of blocks for all patients in DACHS (n = 378 patients). d, Overall survival of patients with genetic MSS tumors stratified by high or low predicted MSIness. In this group, patients with high MSIness had a shorter survival than patients with low MSIness. The table shows the number of patients at risk. The P value was calculated by two-sided log-rank test (n = 350 patients).
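The block counts quoted for panel b can be checked with a short calculation, and the random sampling behind a virtual biopsy can be sketched alongside it. Both parts rest on assumptions that are not stated in the caption: the core is treated as a flat rectangle of 1 cm length, the nominal inner needle diameters (about 838 µm for 18G and 1,194 µm for 16G) are used as the core width, and the patient-level score is taken to be the mean of the sampled tile scores. The tile scores themselves are random placeholders.

```python
import random
from statistics import mean

BLOCK_EDGE_UM = 256  # edge length of one block, as given in the caption


def blocks_in_core(length_um, width_um, edge_um=BLOCK_EDGE_UM):
    """Count whole blocks that fit into a rectangular length x width core section."""
    return (length_um // edge_um) * (width_um // edge_um)


# Assumed nominal inner diameters: 18G about 838 um, 16G about 1194 um.
print(blocks_in_core(10_000, 838))   # 117
print(blocks_in_core(10_000, 1194))  # 156


def virtual_biopsy_score(tile_scores, n_blocks, rng):
    """Average the scores of n_blocks tiles drawn at random without replacement."""
    n = min(n_blocks, len(tile_scores))
    return mean(rng.sample(tile_scores, n))


rng = random.Random(0)
# Placeholder tile-level scores for one hypothetical patient with 2,054 blocks.
tile_scores = [rng.random() for _ in range(2054)]
# Five repetitions with different randomly picked blocks, as in the caption.
repeats = [virtual_biopsy_score(tile_scores, 117, rng) for _ in range(5)]
print(len(repeats))  # 5
```

Under these assumptions, 39 blocks fit along the 1-cm length and 3 or 4 fit across the width, which reproduces the 117 and 156 blocks given in the caption.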
a, Histological image of a test set patient who was genetically determined as MSI. b, Corresponding predicted MSI map for the image shown in a. Three regions are highlighted. Region 1 is a glandular region with necrosis and extracellular mucus; this region was predominantly predicted to be MSS. Region 2 is a solid, dedifferentiated region, which was predicted to be MSI. Region 3 contained mostly budding tumor cells mixed with immune cells; this region was strongly predicted to be MSI. Together, these representative examples show that different morphologies elicit different predictions and that these predictions can be traced back to patterns that are understandable for humans. Scale bar, 2.5 mm. This figure is representative of n = 378 patients in the DACHS cohort.
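A prediction map of this kind can be thought of as per-tile scores placed back on the slide grid and then summarized per region. The sketch below illustrates that idea only; the grid coordinates, the scores, the rectangular regions and the 0.5 threshold are all hypothetical and do not come from the paper.

```python
def build_score_map(tiles, n_rows, n_cols):
    """Place per-tile scores on a grid; cells without tissue stay None."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for row, col, score in tiles:
        grid[row][col] = score
    return grid


def region_msi_fraction(grid, rows, cols, threshold=0.5):
    """Fraction of tissue tiles in a rectangular region scored at or above threshold."""
    scores = [grid[r][c] for r in rows for c in cols if grid[r][c] is not None]
    if not scores:
        return None  # region contains no tissue tiles
    return sum(s >= threshold for s in scores) / len(scores)


# Hypothetical (row, column, MSI score) triples for five tiles on a 3 x 3 grid.
tiles = [(0, 0, 0.1), (0, 1, 0.2), (1, 0, 0.9), (1, 1, 0.8), (2, 2, 0.95)]
score_map = build_score_map(tiles, 3, 3)

print(region_msi_fraction(score_map, range(0, 1), range(0, 2)))  # 0.0
print(region_msi_fraction(score_map, range(1, 3), range(0, 3)))  # 1.0
```

Keeping empty cells as None rather than 0 avoids counting background as an MSS prediction when a region is summarized.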
a, Workflow for MSI screening with deep learning versus immunohistochemistry in tertiary care centers with existing digital pathology core facilities such as the University of Chicago Medical Center. Costs differ by country and are usually lower in Europe than in the United States. Here, we list the costs that apply in the United States. b, Set-up cost (fixed cost) for a digital pathology and deep learning infrastructure. H&E, hematoxylin and eosin; MMRd, mismatch repair deficiency; NGS, next-generation sequencing; QC, quality control. Sources and assumptions were as follows. (1) Prices were obtained from https://htrc.uchicago.edu/fees.php?fee=2&fee=2, retrieved on 11 March 2019. We assume ×20 magnification on a high-volume whole-slide scanner. (2) Prices were obtained from https://techcrunch.com/2019/03/07/scaleway-releases-cloud-gpu-instances-for-e1-per-hour/ and https://www.scaleway.com/, retrieved on 11 March 2019. We assume that 1 h of GPU computing on an Nvidia Tesla P100 GPU is required to process the whole-slide images for one patient through to prediction. (3) US Current Procedural Terminology (CPT) code 88342, four-antibody panel at US$852.00 per staining. (4) Personal communication from the Pathology Department, University of Chicago Medicine, March 2019. (5) Personal communication, Medical Oncology, National Center for Tumor Diseases, Germany. (6) Personal experience of the cost of a high-throughput slide scanner plus limited storage capacity, based on offers from multiple digital pathology vendors. (7) Assuming a tower server with one Nvidia Tesla V100 GPU or a similar GPU, based on multiple offers from providers of professional hardware, March 2019. Staff cost and infrastructure cost are not accounted for in this schematic.
Source data for the ROC curve in Fig. 2d and the correlation analysis in Fig. 2e
Source data for scatter plot in Extended Data Fig. 1
Source data for virtual biopsies in Extended Data Fig. 2b, source data for histogram in Extended Data Fig. 2c and source data for the Kaplan–Meier plot in Extended Data Fig. 2d
Cite this article
Kather, J.N., Pearson, A.T., Halama, N. et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat Med 25, 1054–1056 (2019). https://doi.org/10.1038/s41591-019-0462-y