The beginning: learning machines in medicine

The development of powerful algorithmic approaches, termed ‘machine learning’, closely follows the development of modern computer technology. The promise of machine learning in medicine revolves around the notion of faster and more reliable classification of images or datasets. Especially in oncology, the possible applications are boundless: from the classification of imaging studies (‘tumour’ versus ‘no tumour’) to the classification of cells within a tissue section. The roots of machine learning lie in the conceptual beginnings of circuit design: from the outset, algorithmic investigations have held a tight grip on how data were perceived and understood. One area that is now attracting increasing research interest is the analysis of tissue specimens: harvesting more information from ‘pure’ tissue sections, i.e. tissue material processed in standardised routine procedures and available from large numbers of patients. Tissue diagnostics and processing is the field of work of the pathologist, and it takes no visionary to predict that image analysis and machine learning will further shape the way pathologists work in the future.

The new ‘microscope’ for tissue: computational tools develop with computational power

Tissue specimens, especially those processed and subjected to haematoxylin-eosin staining, are available in large quantities from a large number of oncological patients. Images generated from these large sections with routine counterstains offer rich information: fundamental aspects of cellular composition, localisation and quantity can be gained from them. Without specific staining procedures, however, it is difficult for the human eye to identify subsets of cells and to quantify these subsets precisely and robustly. There are clear examples where one can expect advantages from a computerised approach: lymphocyte infiltration, for instance, is a good prognostic factor in many tumour entities, yet counting lymphocytes by eye across a whole section is laborious and poorly reproducible. Dataset size is an important factor in machine learning: datasets beyond 1000 data points of uniform type are usually needed to create robust predictors. With the advent of whole-slide image scanners for histology, the availability of large patient datasets has increased even further. The type of machine learning algorithm applied to these medical images has developed over time, and the complexity of these algorithms ranges from single-layer neural nets to complex deep learning architectures such as deep Boltzmann machines. The history of machine learning is winding, with key figures in the 1950s and 1960s being Marvin Minsky, Frank Rosenblatt and Charles Wightman.1 In this evolution, convolutional neural networks (CNNs)2 have provided a significant, new and technically efficient approach.
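To make the CNN approach concrete, the sketch below (assuming PyTorch; the tile size, layer widths and class labels are illustrative choices, not values from the text) shows a minimal convolutional network that classifies histology tiles as ‘tumour’ versus ‘no tumour’.

```python
# Minimal sketch of a small convolutional network for binary
# classification of histology tiles ('tumour' vs 'no tumour').
# Assumes PyTorch; tile size (96 x 96 RGB) and layer widths are
# illustrative, not taken from the text.
import torch
import torch.nn as nn

class TileCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # local stain/texture patterns
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 96 -> 48
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 48 -> 24
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 24 * 24, 64),
            nn.ReLU(),
            nn.Linear(64, 2),                            # logits: tumour / no tumour
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = TileCNN()
tiles = torch.randn(8, 3, 96, 96)  # a batch of 8 dummy RGB tiles
logits = model(tiles)              # shape: (8, 2)
```

The stacked convolution and pooling layers let the network learn local patterns before a small classifier head produces the two-class decision; this layered structure is what distinguishes CNNs from the single-layer nets of earlier decades.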

With this technical advancement, more and more far-reaching classifications and stratifications have been attempted with machine learning. Aligning the treasure chests of ‘big data’ with clinical outcomes has also been a focus of attention, chemotherapy response prediction in colorectal cancer patients being just one example.3 With the focus on tissue, predictive features within the tissue section have been identified,4,5 including lymphocytes and vasculature.6,7,8 A good example is the identification of immunohistochemistry-based signatures to predict the metastatic sites of triple-negative breast cancers.8 Finding ‘unseen’ aspects in tissue sections to align genetic alterations with phenotypic features is also a key aspect of new developments.9,10 However, with this advancement, especially for medical applications on tissue, new classes of problems have appeared.

Brave new world: a (computational) stratification tool is still a stratification tool

The prerequisite for successful machine learning approaches is still a sufficient dataset size. This clearly limits the use of the technology, because the low frequency of certain cancer entities restricts the available material. It also invites the misinterpretation of exploratory analyses and points to a need for extensive validation. This need must not be disregarded simply because a computational approach might be easy to transfer from one institution to another: validating a prospective diagnostic machine learning approach requires the same tight controls and quality assurance management as any other medical validation approach based on wet-lab technology.
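To illustrate the validation principle, the following sketch (hypothetical dummy data; scikit-learn is assumed purely for brevity) shows external validation in its simplest form: the model is developed on one institution’s cohort, and the reported performance figure comes only from a second, untouched cohort.

```python
# Illustrative sketch of cross-institution validation: develop on one
# site, report performance only on a second, fully held-out site.
# Features and labels are random placeholders, not real data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_site_a = rng.normal(size=(1200, 20))       # development cohort (institution A)
y_site_a = rng.integers(0, 2, 1200)
X_site_b = rng.normal(size=(400, 20))        # external cohort (institution B)
y_site_b = rng.integers(0, 2, 400)

model = LogisticRegression(max_iter=1000).fit(X_site_a, y_site_a)

# The only reported validation figure comes from the untouched external site.
auc_external = roc_auc_score(y_site_b, model.predict_proba(X_site_b)[:, 1])
print(f"External-validation AUC (site B): {auc_external:.2f}")
```

With random data the external AUC hovers around 0.5; the point of the sketch is the discipline of the split, not the number itself.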

Another important point here is to understand the predictive features within the tissue (or the image; see Fig. 1). One way is to systematically obscure features within the image in order to identify the elements that inform the predictive algorithm. This also opens the door to understanding what precisely the machine learning algorithm sees in the tissue, e.g. lymphocytes (i.e. round cells without significant cytoplasm). Possible confounders or biases can be identified as well, e.g. the counterstain. Here, the definition of ‘interpretability’ is important: translating the algorithmic findings into human-understandable language or symbols (see e.g. https://fatconference.org/2019/). The absence of evidence-based expectations for clinically acceptable performance is another specific danger in machine learning; in other words, the expected performance must be aligned with realistic clinical expectations and validated against them. Exemplum crudelitatis: a model that predicted outcome from the written clinical annotation on an image rather than from the medical image itself (see https://medium.com/@jrzech/) is a flamboyant example, but one that emphasises quality control and understanding of the algorithm as important parameters of success.
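A minimal sketch of this occlusion idea is given below (again assuming PyTorch; the patch size, stride and fill value are illustrative): a grey patch is slid across the image, and the drop in the model’s class score marks the regions that inform the prediction.

```python
# Sketch of occlusion-based interpretability: slide a grey patch across
# the image and record how much the model's score for the target class
# drops at each position. Works with any image classifier, e.g. the
# TileCNN sketch above; patch size, stride and fill are illustrative.
import torch

def occlusion_map(model, image, target_class=0, patch=16, stride=8, fill=0.5):
    model.eval()
    with torch.no_grad():
        base = torch.softmax(model(image.unsqueeze(0)), dim=1)[0, target_class]
        _, h, w = image.shape
        heat = torch.zeros((h - patch) // stride + 1, (w - patch) // stride + 1)
        for i, y in enumerate(range(0, h - patch + 1, stride)):
            for j, x in enumerate(range(0, w - patch + 1, stride)):
                occluded = image.clone()
                occluded[:, y:y + patch, x:x + patch] = fill  # obscure one region
                score = torch.softmax(model(occluded.unsqueeze(0)), dim=1)[0, target_class]
                heat[i, j] = base - score  # large drop: region informed the prediction
    return heat

# Usage with the TileCNN sketch above:
# heat = occlusion_map(model, tiles[0])
```

Regions with the largest score drop are those the network relied on, e.g. clusters of lymphocyte-like round cells, or, if quality control failed, a confounder such as the counterstain.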

Fig. 1

Biology-inspired design of circuits (a) led to the development of digital neural networks (b), which couple different layers to form a structure for feature recognition, separating an input layer and inner layers from the output (c); variations in network design have led to an evolution of differing applicability and technical parameters (d)

The outlook: algorithmic tools for pathology

An ideal scenario for the development and validation of prediction models should take the abovementioned points into account. Machine learning is best suited to multisite studies with model testing on held-out subsets,11 but its application in study design and reporting in medical research requires the development of clearer standards. Algorithmic tools are indeed becoming part of the armamentarium of tissue diagnostics and pathology, regardless of whether deep learning, multi-agent simulations or other computing approaches are used.12 The advent of another tool in the medical toolbox is always exciting,13 but it also requires a careful analysis of the tool's boundaries and limitations. Artificial intelligence critic Kate Crawford sums it up: ‘Machine learning does not produce inscrutable and unquestionable objects of mathematics that produce rational, unbiased outcomes. It is human design behind it’. There is no doubt that machine learning will enrich the diagnostic capabilities of pathologists and other medical specialties, but only if it is mastered properly by trained computer specialists and physicians alike. Medicine needs to shape its tools and not the other way around.