Designing deep learning studies in cancer diagnostics

Abstract

The number of publications on deep learning for cancer diagnostics is rapidly increasing, and systems are frequently claimed to perform comparably to or better than clinicians. However, few systems have yet demonstrated real-world medical utility. In this Perspective, we discuss reasons for the moderate progress and describe remedies designed to facilitate the transition to the clinic. To reveal the status of the field, we evaluate recent, presumably influential, deep learning studies in cancer diagnostics, the vast majority of which used images as input to the system. By manipulating real data, we then illustrate that large and varied training data facilitate the generalizability of neural networks and thus the ability to use them clinically. To reduce the risk of biased performance estimation of deep learning systems, we advocate evaluation in external cohorts and strongly advise that the planned analyses, including a predefined primary analysis, be described in a protocol, preferably stored in an online repository. Recommended protocol items should be established for the field, and we present our suggestions.

Fig. 1: Characteristics of recent, presumably influential, deep learning studies in cancer diagnostics.
Fig. 2: Effect of data variation when training deep learning systems.
Fig. 3: Development and evaluation of deep learning systems.
Fig. 4: Reliability of performance estimations in recent, presumably influential, deep learning studies in cancer diagnostics.


    Article  Google Scholar 

  157. van den Hout, W. B. The area under an ROC curve with limited information. Med. Decis. Mak. 23, 160–166 (2003).

    Google Scholar 

  158. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 27, 861–874 (2006).

    Google Scholar 

  159. Harrell, F. E. Jr, Califf, R. M., Pryor, D. B., Lee, K. L. & Rosati, R. A. Evaluating the yield of medical tests. J. Am. Med. Assoc. 247, 2543–2546 (1982).

    Google Scholar 

  160. Lobo, J. M., Jiménez-Valverde, A. & Real, R. AUC: a misleading measure of the performance of predictive distribution models. Glob. Ecol. Biogeogr. 17, 145–151 (2008).

    Google Scholar 

  161. Voosen, P. How AI detectives are cracking open the black box of deep learning. Science https://www.sciencemag.org/news/2017/07/how-ai-detectives-are-cracking-open-black-box-deep-learning (2017).

  162. Adadi, A. & Berrada, M. Peeking inside the black-box: a survey on explainable artificial intelligence (XAI). IEEE Access. 6, 52138–52160 (2018).

    Google Scholar 

  163. Barredo Arrieta, A. et al. Explainable artificial intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inf. Fusion. 58, 82–115 (2020).

    Google Scholar 

  164. Montavon, G., Samek, W. & Müller, K.-R. Methods for interpreting and understanding deep neural networks. Digit. Signal. Process. 73, 1–15 (2018).

    Google Scholar 

  165. Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. Proc. Int. Conf. Learn. Represent. https://arxiv.org/abs/1312.6034 (2014).

  166. Bach, S. et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10, e0130140 (2015).

    PubMed  PubMed Central  Google Scholar 

  167. Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. Proc. 34th Int. Conf. Mach. Learn. 70, 3319–3328 (2017).

    Google Scholar 

  168. Friedman, L. M., Furberg, C. D., DeMets, D. L., Reboussin, D. M. & Granger, C. B. Fundamentals of Clinical Trials 5th edn (Springer, 2015).

  169. van Luijn, H. E. M., Musschenga, A. W., Keus, R. B., Robinson, W. M. & Aaronson, N. K. Assessment of the risk/benefit ratio of phase II cancer clinical trials by Institutional Review Board (IRB) members. Ann. Oncol. 13, 1307–1313 (2002).

    PubMed  Google Scholar 

  170. Martin, L., Hutchens, M., Hawkins, C. & Radnov, A. How much do clinical trials cost? Nat. Rev. Drug Discov. 16, 381–382 (2017).

    CAS  PubMed  Google Scholar 

  171. Teutsch, S. M. et al. The evaluation of genomic applications in practice and prevention (EGAPP) initiative: methods of the EGAPP Working Group. Genet. Med. 11, 3–14 (2009).

    PubMed  PubMed Central  Google Scholar 

  172. Vollmer, S. et al. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness. BMJ 368, l6927 (2020).

    PubMed  Google Scholar 

  173. Chan, A.-W. et al. SPIRIT 2013 explanation and elaboration: guidance for protocols of clinical trials. BMJ 346, e7586 (2013).

    PubMed  PubMed Central  Google Scholar 

  174. Cruz Rivera, S. et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat. Med. 26, 1351–1363 (2020).

    CAS  PubMed  PubMed Central  Google Scholar 

  175. Moher, D. et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ 340, c869 (2010).

    PubMed  PubMed Central  Google Scholar 

  176. Collins, G. S. & Moons, K. G. M. Reporting of artificial intelligence prediction models. Lancet 393, 1577–1579 (2019).

    PubMed  Google Scholar 

  177. Liu, X. et al. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat. Med. 26, 1364–1374 (2020).

    CAS  PubMed  PubMed Central  Google Scholar 

  178. [No authors listed] Should protocols for observational research be registered? Lancet 375, 348 (2010).

    Google Scholar 

  179. Loder, E., Groves, T. & MacAuley, D. Registration of observational studies. BMJ 340, c950 (2010).

    PubMed  Google Scholar 

  180. Chambers, C. & Munafo, M. Trust in science would be improved by study pre-registration. The Guardian https://www.theguardian.com/science/blog/2013/jun/05/trust-in-science-study-pre-registration (2013).

  181. Williams, R. J., Tse, T., Harlan, W. R. & Zarin, D. A. Registration of observational studies: is it time? Can. Med. Assoc. J. 182, 1638–1642 (2010).

    Google Scholar 

  182. Gill, J. & Prasad, V. Improving observational studies in the era of big data. Lancet 392, 716–717 (2018).

    PubMed  Google Scholar 

  183. Sørensen, H. T. & Rothman, K. J. The prognosis for research. BMJ 340, c703 (2010).

    PubMed  Google Scholar 

  184. Vandenbroucke, J. P. Registering observational research: second thoughts. Lancet 375, 982–983 (2010).

    PubMed  Google Scholar 

  185. [No authors listed] The registration of observational studies — when metaphors go bad. Epidemiology 21, 607–609 (2010).

    Google Scholar 

  186. Andre, F. et al. Biomarker studies: a call for a comprehensive biomarker study registry. Nat. Rev. Clin. Oncol. 8, 171–176 (2011).

    PubMed  Google Scholar 

  187. Hooft, L. & Bossuyt, P. M. Prospective registration of marker evaluation studies: time to act. Clin. Chem. 57, 1684–1686 (2011).

    CAS  PubMed  Google Scholar 

  188. Altman, D. G. The time has come to register diagnostic and prognostic research. Clin. Chem. 60, 580–582 (2014).

    CAS  PubMed  Google Scholar 

  189. Rifai, N. et al. Registering diagnostic and prognostic trials of tests: is it the right thing to do? Clin. Chem. 60, 1146–1152 (2014).

    CAS  PubMed  Google Scholar 

  190. Rajkomar, A., Dean, J. & Kohane, I. Machine learning in medicine. N. Engl. J. Med. 380, 1347–1358 (2019).

    PubMed  Google Scholar 

  191. Zou, J. & Schiebinger, L. AI can be sexist and racist — it’s time to make it fair. Nature 559, 324–326 (2018).

    CAS  PubMed  Google Scholar 

  192. Adamson, A. S. & Smith, A. Machine learning and health care disparities in dermatology. JAMA Dermatol. 154, 1247–1248 (2018).

    PubMed  Google Scholar 

  193. Vyas, D. A., Eisenstein, L. G. & Jones, D. S. Hidden in plain sight — reconsidering the use of race correction in clinical algorithms. N. Engl. J. Med. 383, 874–882 (2020).

    PubMed  Google Scholar 

  194. Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453 (2019).

    CAS  PubMed  Google Scholar 

  195. Rajkomar, A., Hardt, M., Howell, M. D., Corrado, G. & Chin, M. H. Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 169, 866–872 (2018).

    PubMed  PubMed Central  Google Scholar 

  196. Owens, K. & Walker, A. Those designing healthcare algorithms must become actively anti-racist. Nat. Med. 26, 1327–1328 (2020).

    CAS  PubMed  PubMed Central  Google Scholar 

  197. Moons, K. G. M. et al. Risk prediction models: II. External validation, model updating, and impact assessment. Heart 98, 691–698 (2012).

    PubMed  Google Scholar 

Download references

Acknowledgements

The authors thank M. Seiergren for assembling all figures, T. S. Hveem for discussions, T. Ystanes, H. A. Inderhaug and B. M. Sannes for setting up and maintaining our computer network and computational infrastructure, and the authors of Inception-v3 for making their code freely available under an open source licence (Apache License, version 2.0). The authors of this Perspective acknowledge funding from the Research Council of Norway through its IKTPLUSS Lighthouse programme (project number 259204).

Author information

Contributions

H.E.D. and D.J.K. initiated the project. All authors researched data for the article. A.K., O.-J.S. and K.L. evaluated the recent, presumably influential, deep learning studies in cancer diagnostics. S.D.R. executed the training, tuning and evaluation of Inception-v3 systems. A.K. drafted the manuscript, and all authors contributed to reviewing and editing the manuscript.

Corresponding author

Correspondence to Håvard E. Danielsen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information

Nature Reviews Cancer thanks J. Kather and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

ClinicalTrials.gov registry: https://www.clinicaltrials.gov

International Standard Randomized Controlled Trial Number (ISRCTN) registry: https://www.isrctn.com

Journal of Clinical Oncology journal policies: https://ascopubs.org/jco/authors/journal-policies

Glossary

Area under the receiver operating characteristic curve

(AUC). A performance metric measuring the concordance between a dichotomous outcome and the ranking of subjects provided by a continuous or categorical marker. An AUC of 50% indicates random guessing and 100% indicates perfect prediction. For dichotomous markers, the AUC and balanced accuracy are equivalent.
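As a minimal sketch of this definition, the AUC can be computed as the proportion of (positive, negative) subject pairs in which the positive subject has the higher marker value, counting ties as one half. The function name `auc`, the toy data and the 0.5 cut-off below are illustrative assumptions, not taken from the article. The last lines check the stated equivalence with balanced accuracy for a dichotomous marker.

```python
from itertools import product


def auc(outcomes, marker):
    """Proportion of (positive, negative) pairs ranked concordantly by the
    marker; pairs with tied marker values count as one half."""
    positives = [m for y, m in zip(outcomes, marker) if y == 1]
    negatives = [m for y, m in zip(outcomes, marker) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    score = 0.0
    for p, n in product(positives, negatives):
        if p > n:
            score += 1.0
        elif p == n:
            score += 0.5
    return score / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = event, 0 = no event.
outcomes = [1, 1, 1, 0, 0, 0, 0]
continuous_marker = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

# Dichotomize the marker at an arbitrary, illustrative cut-off of 0.5.
dichotomous_marker = [1 if m >= 0.5 else 0 for m in continuous_marker]

sensitivity = sum(
    d for y, d in zip(outcomes, dichotomous_marker) if y == 1
) / outcomes.count(1)
specificity = sum(
    1 - d for y, d in zip(outcomes, dichotomous_marker) if y == 0
) / outcomes.count(0)
balanced_accuracy = (sensitivity + specificity) / 2

print(auc(outcomes, continuous_marker))   # 11 of 12 pairs are concordant
print(auc(outcomes, dichotomous_marker))  # equals balanced_accuracy up to rounding
print(balanced_accuracy)
```

The pairwise loop is quadratic in the number of subjects, which is acceptable for illustration; rank-based formulations are usually preferred for large cohorts.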

Artificial neural networks

Mathematical functions mapping input data to output representations, structured as a directed graph of nodes and edges.

Balanced accuracy

A classification performance metric calculated by averaging, across all possible outcomes, the proportion of subjects with that outcome who are correctly predicted. For dichotomous outcomes, this reduces to the average of the sensitivity and the specificity.
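The following sketch illustrates why balanced accuracy can be more informative than plain accuracy in an imbalanced cohort. The function name and the toy cohort are assumptions made for illustration, and the set of possible outcomes is taken to be the outcomes observed in the data.

```python
def balanced_accuracy(true_outcomes, predicted_outcomes):
    """Average, over the outcome classes, of the proportion of subjects in
    each class that were predicted correctly."""
    classes = sorted(set(true_outcomes))
    proportions = []
    for c in classes:
        predictions_for_class = [
            p for t, p in zip(true_outcomes, predicted_outcomes) if t == c
        ]
        correct = sum(1 for p in predictions_for_class if p == c)
        proportions.append(correct / len(predictions_for_class))
    return sum(proportions) / len(proportions)


# Hypothetical imbalanced cohort: 8 subjects without and 2 with the outcome.
true_outcomes = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A trivial classifier that never predicts the outcome.
always_negative = [0] * len(true_outcomes)

accuracy = sum(
    t == p for t, p in zip(true_outcomes, always_negative)
) / len(true_outcomes)

print(accuracy)                                            # 0.8
print(balanced_accuracy(true_outcomes, always_negative))   # 0.5
```

Here the trivial classifier reaches 80% accuracy purely because of the class imbalance, whereas its balanced accuracy is 50%, the value expected from guessing.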

Capacity

The ability of a model class, for example a particular network architecture, to express complicated correlations between input data and target output. Model classes with high capacity have the potential to produce models that are able to map training data to target outputs with a high degree of accuracy, but are also more prone to overfitting.

Concordance index

(c-index). A performance metric measuring the concordance between a target outcome, usually defined by time to event data, and the ranking of subjects provided by a continuous or categorical marker. A c-index of 50% indicates random guessing and 100% indicates perfect prediction. For dichotomous outcomes, the c-index and the area under the receiver operating characteristic curve are equivalent.
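A simplified sketch of a c-index for right-censored time-to-event data is given below. It assumes that a higher marker value indicates higher risk, treats a pair as comparable only when the subject with the shorter follow-up had an event, and ignores pairs with tied times. The function name and the toy data are illustrative assumptions.

```python
def concordance_index(times, events, marker):
    """Simplified c-index for right-censored time-to-event data.

    A pair of subjects is comparable when the subject with the strictly
    shorter follow-up time had an event (events[i] == 1). The pair is
    concordant when that subject also has the higher marker value;
    tied marker values count as one half."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if marker[i] > marker[j]:
                    concordant += 1.0
                elif marker[i] == marker[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up in months; events: 1 = event observed, 0 = censored.
times = [5, 8, 12, 20, 24]
events = [1, 1, 0, 1, 0]
marker = [0.9, 0.4, 0.6, 0.5, 0.1]

print(concordance_index(times, events, marker))  # 6 of 8 comparable pairs: 0.75
```

Censored subjects still contribute as the longer-surviving member of a pair, which is how the metric makes use of incomplete follow-up.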

Deep learning

A class of machine learning methods that make use of successively more abstract representations of the input data to perform a specific task, typically implemented using artificial neural networks. They also consist of an objective function that compares the final output with a target output as well as an optimization method that is used to optimize the objective function.

Deep learning models

Computational models obtained by training deep neural networks. Note that a single training of a neural network produces a sequence of models as each new optimization iteration produces a model slightly different from the previous one. A tuning data set may be used to select among these models.

Deep learning systems

Systems utilizing one or more deep learning models to make predictions. A system’s output may be a function of the outputs of the models, for example by averaging and thresholding the model outputs.
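The averaging and thresholding mentioned above could look like the following sketch. The function name, the model outputs and the default threshold of 0.5 are hypothetical; in practice the threshold would be a tuned hyperparameter.

```python
from statistics import mean


def system_prediction(model_outputs, threshold=0.5):
    """Average the continuous outputs of several models for one subject and
    threshold the average into a dichotomous prediction."""
    if not model_outputs:
        raise ValueError("at least one model output is required")
    average = mean(model_outputs)
    return average, int(average >= threshold)


# Hypothetical outputs of five models for the same subject.
outputs_for_subject = [0.62, 0.48, 0.71, 0.55, 0.44]
average, prediction = system_prediction(outputs_for_subject)

print(prediction)  # 1, although two of the five models scored below 0.5
```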

Development cohort

A cohort used for training and, sometimes, tuning and internal validation of a system.

External cohorts

Also known as independent cohorts, these differ non-randomly from the development cohort. In cancer diagnostics, the external cohorts will often contain patients suspected of having the same disease or disease attribute, at risk of developing the same event or suspected to respond to the same treatment as patients in the development cohort. However, external cohorts may be intentionally more different from the development cohort.

External validation

An evaluation of a system’s performance on an external cohort that did not influence the development of the system.

Generalizability

The ability of a system to perform similarly on subjects not included in the training as on those included in the training. Poor generalizability can be caused by overfitting to the training data or by the lack of generally relevant features in the training data.

Overfitting

Utilizing noise or features in the training data that are not generally relevant for the prediction task but cause the system to perform better on the training sample.
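A small, fully deterministic illustration of overfitting is sketched below. All data, names and the underlying rule (label 1 when x is at least 5) are invented for the example. A nearest-neighbour predictor memorizes the training sample, including two mislabelled subjects, whereas a simple threshold rule ignores that noise.

```python
def nearest_neighbour(train_x, train_y, x):
    """Predict the label of the closest training subject; this memorizes the
    training sample, including any label noise."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[closest]


def threshold_rule(x):
    """A low-capacity rule matching the assumed underlying relationship."""
    return int(x >= 5)


def accuracy(labels, predictions):
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)


# Hypothetical training sample; the labels at x=3 and x=7 are noise.
train_x = [1, 2, 3, 4, 6, 7, 8, 9]
train_y = [0, 0, 1, 0, 1, 0, 1, 1]

# Hypothetical new subjects with noise-free labels.
new_x = [1.4, 2.6, 3.4, 4.4, 5.6, 6.6, 7.4, 8.6]
new_y = [0, 0, 0, 0, 1, 1, 1, 1]

nn_train = accuracy(train_y, [nearest_neighbour(train_x, train_y, x) for x in train_x])
nn_new = accuracy(new_y, [nearest_neighbour(train_x, train_y, x) for x in new_x])
rule_train = accuracy(train_y, [threshold_rule(x) for x in train_x])
rule_new = accuracy(new_y, [threshold_rule(x) for x in new_x])

print(nn_train, nn_new)      # 1.0 0.5
print(rule_train, rule_new)  # 0.75 1.0
```

The memorizing predictor looks better on the training sample but worse on new subjects, which is the pattern the definition describes.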

Supervised machine learning

A methodology in which learning occurs by mimicking the mapping of input data to target output labels. By contrast, the input data are not associated with any output labels in unsupervised learning.

Test

Although frequently used by the machine learning community to refer to an evaluation of a system’s performance, we use ‘test’ to refer to evaluations other than external validations, for example internal validations.

Training

Optimization of model parameters based on data.

Tuning

Informed selection of hyperparameter values (parameters not optimized during training) based on data. Examples include the network architecture, optimization method and threshold for a model’s continuous output. The nomenclature in machine learning is to use ‘validation’ instead of ‘tuning’.
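As one hedged example of tuning, the threshold for a model's continuous output can be selected on a tuning set, here by maximizing balanced accuracy. The function names, the choice of balanced accuracy as the criterion and the toy data are assumptions for illustration; the selected threshold would then be fixed before any evaluation on other data.

```python
def balanced_accuracy_at(threshold, outcomes, scores):
    """Balanced accuracy obtained when scores >= threshold are predicted as 1."""
    predictions = [int(s >= threshold) for s in scores]
    sensitivity = sum(
        p for y, p in zip(outcomes, predictions) if y == 1
    ) / outcomes.count(1)
    specificity = sum(
        1 - p for y, p in zip(outcomes, predictions) if y == 0
    ) / outcomes.count(0)
    return (sensitivity + specificity) / 2


def tune_threshold(outcomes, scores):
    """Return the candidate threshold with the highest balanced accuracy on
    the tuning data; the candidates are the observed scores."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: balanced_accuracy_at(t, outcomes, scores))


# Hypothetical tuning set: dichotomous outcomes and continuous model outputs.
tuning_outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
tuning_scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.70, 0.90]

threshold = tune_threshold(tuning_outcomes, tuning_scores)
print(threshold)  # 0.4
```

Because the threshold is chosen to fit the tuning data, the balanced accuracy observed on that same data is optimistic and should not be reported as the system's performance.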

About this article

Cite this article

Kleppe, A., Skrede, OJ., De Raedt, S. et al. Designing deep learning studies in cancer diagnostics. Nat Rev Cancer 21, 199–211 (2021). https://doi.org/10.1038/s41568-020-00327-9
