Recent advances in cancer research and diagnostics largely rely on new developments in microscopic or molecular profiling techniques, offering high levels of detail with respect to either spatial or molecular features, but usually not both. Here, we present an explainable machine-learning approach for the integrated profiling of morphological, molecular and clinical features from breast cancer histology. First, our approach allows for the robust detection of cancer cells and tumour-infiltrating lymphocytes in histological images, providing precise heatmap visualizations explaining the classifier decisions. Second, molecular features, including DNA methylation, gene expression, copy number variations, somatic mutations and proteins are predicted from histology. Molecular predictions reach balanced accuracies up to 78%, whereas accuracies of over 95% can be achieved for subgroups of patients. Finally, our explainable AI approach allows assessment of the link between morphological and molecular cancer properties. The resulting computational multiplex-histology analysis can help promote basic cancer research and precision medicine through an integrated diagnostic scoring of histological, clinical and molecular features.
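The balanced accuracies quoted above can be read with a small sketch in mind. The abstract does not define the metric, so the code below assumes the standard definition (the mean of per-class recall); the function name and the labels are hypothetical, chosen only for illustration, and are not taken from the study.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of size."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # all samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    if not total:
        raise ValueError("y_true is empty")
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical, imbalanced labels: 8 "low" cases and 2 "high" cases.
y_true = ["low"] * 8 + ["high"] * 2
# A trivial classifier that always predicts the majority class.
y_pred = ["low"] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal = balanced_accuracy(y_true, y_pred)
print(f"plain accuracy: {plain:.2f}, balanced accuracy: {bal:.2f}")
```

Under this definition a classifier that ignores the minority class scores no better than chance (0.5 for two classes), which is why the metric is a common choice when the molecular classes are unevenly sized.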
This work was funded by the Charité Institute of Pathology, Berlin, the Technical University of Berlin, the Human Frontier Science Program (HFSP) Young Investigator Grant (M.I. and F.K.) and the Einstein Foundation Berlin (F.K.) and partly by the German Research Foundation to A.H. (DFG SFB-TR84, B6, Z1a) and the German Consortium for Translational Cancer Research (DKTK). F.K. was also supported by the German Ministry for Education and Research (BMBF) within the Berlin Institute for the Foundations of Learning and Data (BIFOLD; grant no. 01IS18025D and 01IS18037E), the clinical mass spectrometry centre MSTARS (grant no. 031L0220A) and CompLS Patho234 (grant no. 031L0207B) and the European Research Council under Horizon 2020 of the EU Framework Programme for Research and Innovation (647257). A.B. acknowledges support by the Ministry of Education AcRF Tier 2 grant MOE2016-T2-2-154, and expresses gratitude to SUTD for the SGPAIRS1811 grant. M.B. was supported in part by the University Medical Center Hamburg-Eppendorf. K.R.M. was supported in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korean government (no. 2017-0-00451, Development of BCI based Brain and Cognitive Computing Technology for Recognizing User’s Intentions using Deep Learning and no. 2019-0-00079, Artificial Intelligence Graduate School Program, Korea University), and by the German Ministry for Education and Research (BMBF) under grants 01IS14013A-E, 01GQ1115, 01GQ0850, 01IS18025A, 031L0207D and 01IS18037A; the German Research Foundation (DFG) under grant Math+, EXC 2046/1, project ID 390685689.
The authors declare no competing interests.
Peer review information: Nature Machine Intelligence thanks Carsten Marr and the other, anonymous, reviewers for their contribution to the peer review of this work.
Supplementary Methods, Figs. 1–17, Tables 1–13.
Balanced accuracies and significance of molecular predictions.
Area under the curve of molecular prediction.
Number of predictable cases after tail probability analysis for different tails.
Cite this article
Binder, A., Bockmayr, M., Hägele, M. et al. Morphological and molecular breast cancer profiling through explainable machine learning. Nat Mach Intell 3, 355–366 (2021). https://doi.org/10.1038/s42256-021-00303-4