Abstract
Artificial intelligence (AI) has been commoditized. It has evolved from a specialty resource to a readily accessible tool for cancer researchers. AI-based tools can boost research productivity in daily workflows, but can also extract hidden information from existing data, thereby enabling new scientific discoveries. Building a basic literacy in these tools is useful for every cancer researcher. Researchers with a traditional biological science focus can use AI-based tools through off-the-shelf software, whereas those who are more computationally inclined can develop their own AI-based software pipelines. In this article, we provide a practical guide for non-computational cancer researchers to understand how AI-based tools can benefit them. We convey general principles of AI for applications in image analysis, natural language processing and drug discovery. In addition, we give examples of how non-computational researchers can get started on the journey to productively use AI in their own work.
References
Jiang, T., Gradus, J. L. & Rosellini, A. J. Supervised machine learning: a brief primer. Behav. Ther. 51, 675–687 (2020).
Alloghani, M., Al-Jumeily, D., Mustafina, J., Hussain, A. & Aljaaf, A. J. in Supervised and Unsupervised Learning for Data Science (eds Berry, M. W. et al.) 3–21 (Springer International, 2020).
Yala, A. et al. Optimizing risk-based breast cancer screening policies with reinforcement learning. Nat. Med. 28, 136–143 (2022).
Kaufmann, E. et al. Champion-level drone racing using deep reinforcement learning. Nature 620, 982–987 (2023).
Nasteski, V. An overview of the supervised machine learning methods. Horizons 4, 51–62 (2017).
Dike, H. U., Zhou, Y., Deveerasetty, K. K. & Wu, Q. Unsupervised learning based on artificial neural network: a review. In 2018 IEEE International Conference on Cyborg and Bionic Systems (CBS) 322–327 (2018).
Shurrab, S. & Duwairi, R. Self-supervised learning methods and applications in medical imaging analysis: a survey. PeerJ Comput. Sci. 8, e1045 (2022).
Wang, X. et al. Transformer-based unsupervised contrastive learning for histopathological image classification. Med. Image Anal. 81, 102559 (2022).
Wang, X. et al. RetCCL: clustering-guided contrastive learning for whole-slide image retrieval. Med. Image Anal. 83, 102645 (2023).
Vinyals, O. et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature 575, 350–354 (2019).
Zhao, Y., Kosorok, M. R. & Zeng, D. Reinforcement learning design for cancer clinical trials. Stat. Med. 28, 3294–3315 (2009).
Sapsford, R. & Jupp, V. Data Collection and Analysis (SAGE, 2006).
Yamashita, R., Nishio, M., Do, R. K. G. & Togashi, K. Convolutional neural networks: an overview and application in radiology. Insights Imaging 9, 611–629 (2018).
Chowdhary, K. R. in Fundamentals of Artificial Intelligence (ed. Chowdhary, K. R.) 603–649 (Springer India, 2020).
Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
Vaswani, A. et al. Attention is all you need. Preprint at https://doi.org/10.48550/arXiv.1706.03762 (2017).
Shmatko, A., Ghaffari Laleh, N., Gerstung, M. & Kather, J. N. Artificial intelligence in histopathology: enhancing cancer research and clinical oncology. Nat. Cancer 3, 1026–1038 (2022).
Wagner, S. J. et al. Transformer-based biomarker prediction from colorectal cancer histology: a large-scale multicentric study. Cancer Cell 41, 1650–1661.e4 (2023).
Khan, A. et al. A survey of the vision transformers and their CNN-transformer based variants. Artif. Intell. Rev. 56, 2917–2970 (2023).
Hamm, C. A. et al. Deep learning for liver tumor diagnosis part I: development of a convolutional neural network classifier for multi-phasic MRI. Eur. Radiol. 29, 3338–3347 (2019).
Ren, J., Eriksen, J. G., Nijkamp, J. & Korreman, S. S. Comparing different CT, PET and MRI multi-modality image combinations for deep learning-based head and neck tumor segmentation. Acta Oncol. 60, 1399–1406 (2021).
Unger, M. & Kather, J. N. A systematic analysis of deep learning in genomics and histopathology for precision oncology. BMC Med. Genomics 17, 48 (2024).
Gawehn, E., Hiss, J. A. & Schneider, G. Deep learning in drug discovery. Mol. Inform. 35, 3–14 (2016).
Bayramoglu, N., Kannala, J. & Heikkilä, J. Deep learning for magnification independent breast cancer histopathology image classification. In 2016 23rd International Conference on Pattern Recognition (ICPR) 2440–2445 (IEEE, 2016).
Galon, J. et al. Type, density, and location of immune cells within human colorectal tumors predict clinical outcome. Science 313, 1960–1964 (2006).
Schmidt, U., Weigert, M., Broaddus, C. & Myers, G. Cell detection with star-convex polygons. In Medical Image Computing and Computer Assisted Intervention — MICCAI 2018. Lecture Notes in Computer Science Vol. 11071 (eds Frangi, A. et al.) https://doi.org/10.1007/978-3-030-00934-2_30 (Springer, 2018).
Edlund, C. et al. LIVECell—a large-scale dataset for label-free live cell segmentation. Nat. Methods 18, 1038–1045 (2021).
Bankhead, P. et al. QuPath: open source software for digital pathology image analysis. Sci. Rep. 7, 16878 (2017).
Schneider, C. A., Rasband, W. S. & Eliceiri, K. W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9, 671–675 (2012).
Rueden, C. T. et al. ImageJ2: ImageJ for the next generation of scientific image data. BMC Bioinformatics 18, 529 (2017).
Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
Linkert, M. et al. Metadata matters: access to image data in the real world. J. Cell Biol. 189, 777–782 (2010).
Gómez-de-Mariscal, E. et al. DeepImageJ: a user-friendly environment to run deep learning models in ImageJ. Nat. Methods 18, 1192–1195 (2021).
Betge, J. et al. The drug-induced phenotypic landscape of colorectal cancer organoids. Nat. Commun. 13, 3135 (2022).
Park, T. et al. Development of a deep learning based image processing tool for enhanced organoid analysis. Sci. Rep. 13, 19841 (2023).
Belthangady, C. & Royer, L. A. Applications, promises, and pitfalls of deep learning for fluorescence image reconstruction. Nat. Methods 16, 1215–1225 (2019).
Echle, A. et al. Deep learning in cancer pathology: a new generation of clinical biomarkers. Br. J. Cancer 124, 686–696 (2021).
Cifci, D., Foersch, S. & Kather, J. N. Artificial intelligence to identify genetic alterations in conventional histopathology. J. Pathol. 257, 430–444 (2022).
Greenson, J. K. et al. Pathologic predictors of microsatellite instability in colorectal cancer. Am. J. Surg. Pathol. 33, 126–133 (2009).
Kather, J. N. et al. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat. Med. 25, 1054–1056 (2019).
Echle, A. et al. Clinical-grade detection of microsatellite instability in colorectal tumors by deep learning. Gastroenterology 159, 1406–1416.e11 (2020).
Kather, J. N. et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat. Cancer 1, 789–799 (2020).
Fu, Y. et al. Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis. Nat. Cancer 1, 800–810 (2020).
Coudray, N. et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018).
Schmauch, B. et al. A deep learning model to predict RNA-seq expression of tumours from whole slide images. Nat. Commun. 11, 3877 (2020).
Binder, A. et al. Morphological and molecular breast cancer profiling through explainable machine learning. Nat. Mach. Intell. 3, 355–366 (2021).
Loeffler, C. M. L. et al. Predicting mutational status of driver and suppressor genes directly from histopathology with deep learning: a systematic study across 23 solid tumor types. Front. Genet. 12, 806386 (2022).
Chen, R. J. et al. Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 40, 865–878.e6 (2022).
Bilal, M. et al. Development and validation of a weakly supervised deep learning framework to predict the status of molecular pathways and key mutations in colorectal cancer from routine histology images: a retrospective study. Lancet Digit. Health 3, e763–e772 (2021).
Yamashita, R. et al. Deep learning model for the prediction of microsatellite instability in colorectal cancer: a diagnostic study. Lancet Oncol. 22, 132–141 (2021).
Echle, A. et al. Artificial intelligence for detection of microsatellite instability in colorectal cancer—a multicentric analysis of a pre-screening tool for clinical application. ESMO Open 7, 100400 (2022).
Schirris, Y., Gavves, E., Nederlof, I., Horlings, H. M. & Teuwen, J. DeepSMILE: contrastive self-supervised pre-training benefits MSI and HRD classification directly from H&E whole-slide images in colorectal and breast cancer. Med. Image Anal. 79, 102464 (2022).
Jain, M. S. & Massoud, T. F. Predicting tumour mutational burden from histopathological images using multiscale deep learning. Nat. Mach. Intell. 2, 356–362 (2020).
Xu, H. et al. Spatial heterogeneity and organization of tumor mutation burden with immune infiltrates within tumors based on whole slide images correlated with patient survival in bladder cancer. J. Pathol. Inform. 13, 100105 (2022).
Chen, S. et al. Deep learning-based approach to reveal tumor mutational burden status from whole slide images across multiple cancer types. Preprint at https://doi.org/10.48550/arXiv.2204.03257 (2023).
Shamai, G. et al. Artificial intelligence algorithms to assess hormonal status from tissue microarrays in patients with breast cancer. JAMA Netw. Open 2, e197700 (2019).
Beck, A. H. et al. Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci. Transl. Med. 3, 108ra113 (2011).
Arslan, S. et al. A systematic pan-cancer study on deep learning-based prediction of multi-omic biomarkers from routine pathology images. Commun. Med. 4, 48 (2024).
Campanella, G. et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 (2019).
Lu, M. Y. et al. AI-based pathology predicts origins for cancers of unknown primary. Nature 594, 106–110 (2021).
Kleppe, A. et al. A clinical decision support system optimising adjuvant chemotherapy for colorectal cancers by integrating deep learning and pathological staging markers: a development and validation study. Lancet Oncol. 23, 1221–1232 (2022).
Jiang, X. et al. End-to-end prognostication in colorectal cancer by deep learning: a retrospective, multicentre study. Lancet Digit. Health 6, e33–e43 (2024).
Zeng, Q. et al. Artificial intelligence-based pathology as a biomarker of sensitivity to atezolizumab–bevacizumab in patients with hepatocellular carcinoma: a multicentre retrospective study. Lancet Oncol. 24, 1411–1422 (2023).
Ghaffari Laleh, N., Ligero, M., Perez-Lopez, R. & Kather, J. N. Facts and hopes on the use of artificial intelligence for predictive immunotherapy biomarkers in cancer. Clin. Cancer Res. 29, 316–323 (2022).
Pedersen, A. et al. FastPathology: an open-source platform for deep learning-based research and decision support in digital pathology. IEEE Access 9, 58216–58229 (2021).
Pocock, J. et al. TIAToolbox as an end-to-end library for advanced tissue image analytics. Commun. Med. 2, 120 (2022).
Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
El Nahhas, O. S. M. et al. From whole-slide image to biomarker prediction: a protocol for end-to-end deep learning in computational pathology. Preprint at https://doi.org/10.48550/arXiv.2312.10944 (2023).
Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. Preprint at https://doi.org/10.48550/arXiv.1912.01703 (2019).
Jorge Cardoso, M. et al. MONAI: an open-source framework for deep learning in healthcare. Preprint at https://doi.org/10.48550/arXiv.2211.02701 (2022).
Goode, A., Gilbert, B., Harkes, J., Jukic, D. & Satyanarayanan, M. OpenSlide: a vendor-neutral software foundation for digital pathology. J. Pathol. Inform. 4, 27 (2013).
Martinez, K. & Cupitt, J. VIPS—a highly tuned image processing software architecture. In IEEE Int. Conf. Image Processing 2005; https://doi.org/10.1109/icip.2005.1530120 (2005).
Dolezal, J. M. et al. Deep learning generates synthetic cancer histology for explainability and education. NPJ Precis. Oncol. 7, 49 (2023).
Plass, M. et al. Explainability and causability in digital pathology. Hip Int. 9, 251–260 (2023).
Reis-Filho, J. S. & Kather, J. N. Overcoming the challenges to implementation of artificial intelligence in pathology. J. Natl Cancer Inst. 115, 608–612 (2023).
Aggarwal, R. et al. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit. Med. 4, 65 (2021).
Rajput, D., Wang, W.-J. & Chen, C.-C. Evaluation of a decided sample size in machine learning applications. BMC Bioinformatics 24, 48 (2023).
Ligero, M. et al. Minimizing acquisition-related radiomics variability by image resampling and batch effect correction to allow for large-scale data analysis. Eur. Radiol. 31, 1460–1470 (2021).
Zwanenburg, A. et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology 295, 328–338 (2020).
van Griethuysen, J. J. M. et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. 77, e104–e107 (2017).
Fedorov, A. et al. 3D Slicer as an image computing platform for the quantitative imaging network. Magn. Reson. Imaging 30, 1323–1341 (2012).
Yushkevich, P. A. et al. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage 31, 1116–1128 (2006).
Khader, F. et al. Multimodal deep learning for integrating chest radiographs and clinical parameters: a case for transformers. Radiology 309, e230806 (2023).
Yu, A. C., Mohajer, B. & Eng, J. External validation of deep learning algorithms for radiologic diagnosis: a systematic review. Radiol. Artif. Intell. 4, e210064 (2022).
US FDA. Artificial intelligence and machine learning (AI/ML)-enabled medical devices; https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices (2023).
Bruker Corporation. Artificial intelligence in NMR; https://www.bruker.com/en/landingpages/bbio/artificial-intelligence-in-nmr.html (2024).
Wasserthal, J. TotalSegmentator: tool for robust segmentation of 104 important anatomical structures in CT images. GitHub https://doi.org/10.5281/zenodo.6802613 (2023).
Garcia-Ruiz, A. et al. An accessible deep learning tool for voxel-wise classification of brain malignancies from perfusion MRI. Cell Rep. Med. 5, 101464 (2024).
Lång, K. et al. Artificial intelligence-supported screen reading versus standard double reading in the Mammography Screening with Artificial Intelligence trial (MASAI): a clinical safety analysis of a randomised, controlled, non-inferiority, single-blinded, screening accuracy study. Lancet Oncol. 24, 936–944 (2023).
Bera, K., Braman, N., Gupta, A., Velcheti, V. & Madabhushi, A. Predicting cancer outcomes with radiomics and artificial intelligence in radiology. Nat. Rev. Clin. Oncol. 19, 132–146 (2022).
Núñez, L. M. et al. Unraveling response to temozolomide in preclinical GL261 glioblastoma with MRI/MRSI using radiomics and signal source extraction. Sci. Rep. 10, 19699 (2020).
Müller, J. et al. Radiomics-based tumor phenotype determination based on medical imaging and tumor microenvironment in a preclinical setting. Radiother. Oncol. 169, 96–104 (2022).
Amirrashedi, M. et al. Leveraging deep neural networks to improve numerical and perceptual image quality in low-dose preclinical PET imaging. Comput. Med. Imaging Graph. 94, 102010 (2021).
Zinn, P. O. et al. A coclinical radiogenomic validation study: conserved magnetic resonance radiomic appearance of periostin-expressing glioblastoma in patients and xenograft models. Clin. Cancer Res. 24, 6288–6299 (2018).
Lin, Y.-C. et al. Diffusion radiomics analysis of intratumoral heterogeneity in a murine prostate cancer model following radiotherapy: pixelwise correlation with histology. J. Magn. Reson. Imaging 46, 483–489 (2017).
Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265 (2023).
Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).
Unger, M. & Kather, J. N. Deep learning in cancer genomics and histopathology. Genome Med. 16, 44 (2024).
Zhou, Y. et al. A foundation model for generalizable disease detection from retinal images. Nature 622, 156–163 (2023).
Filiot, A. et al. Scaling self-supervised learning for histopathology with masked image modeling. Preprint at bioRxiv https://doi.org/10.1101/2023.07.21.23292757 (2023).
Campanella, G. et al. Computational pathology at health system scale—self-supervised foundation models from three billion images. Preprint at https://doi.org/10.48550/arXiv.2310.07033 (2023).
Vorontsov, E. et al. Virchow: a million-slide digital pathology foundation model. Preprint at https://doi.org/10.48550/arXiv.2309.07778 (2023).
Clusmann, J. et al. The future landscape of large language models in medicine. Commun. Med. 3, 141 (2023).
Bubeck, S. et al. Sparks of artificial general intelligence: early experiments with GPT-4. Preprint at https://doi.org/10.48550/arXiv.2303.12712 (2023).
Truhn, D., Reis-Filho, J. S. & Kather, J. N. Large language models should be used as scientific reasoning engines, not knowledge databases. Nat. Med. 29, 2983–2984 (2023).
Adams, L. C. et al. Leveraging GPT-4 for post hoc transformation of free-text radiology reports into structured reporting: a multilingual feasibility study. Radiology 307, e230725 (2023).
Truhn, D. et al. Extracting structured information from unstructured histopathology reports using generative pre-trained transformer 4 (GPT-4). J. Pathol. 262, 310–319 (2023).
Wiest, I. C. et al. From text to tables: a local privacy preserving large language model for structured information retrieval from medical documents. Preprint at bioRxiv https://doi.org/10.1101/2023.12.07.23299648 (2023).
Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).
Truhn, D. et al. A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports. Sci. Rep. 13, 20159 (2023).
Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).
Derraz, B. et al. New regulatory thinking is needed for AI-based personalised drug and cell therapies in precision oncology. NPJ Precis. Oncol. https://doi.org/10.1038/s41698-024-00517-w (2024).
Extance, A. ChatGPT has entered the classroom: how LLMs could transform education. Nature 623, 474–477 (2023).
Thirunavukarasu, A. J. et al. Large language models in medicine. Nat. Med. 29, 1930–1940 (2023).
Webster, P. Six ways large language models are changing healthcare. Nat. Med. 29, 2969–2971 (2023).
Krishnan, R., Rajpurkar, P. & Topol, E. J. Self-supervised learning in medicine and healthcare. Nat. Biomed. Eng. 6, 1346–1352 (2022).
Meskó, B. Prompt engineering as an important emerging skill for medical professionals: tutorial. J. Med. Internet Res. 25, e50638 (2023).
Sushil, M. et al. CORAL: expert-curated oncology reports to advance language model inference. NEJM AI 1, 4 (2024).
Brown, T. B. et al. Language models are few-shot learners. Preprint at https://doi.org/10.48550/arXiv.2005.01416 (2020).
Ferber, D. & Kather, J. N. Large language models in uro-oncology. Eur. Urol. Oncol. 7, 157–159 (2023).
Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature 619, 357–362 (2023).
Nori, H. et al. Can generalist foundation models outcompete special-purpose tuning? Case study in medicine. Preprint at https://doi.org/10.48550/arXiv.2311.16452 (2023).
Balaguer, A. et al. RAG vs fine-tuning: pipelines, tradeoffs, and a case study on agriculture. Preprint at https://doi.org/10.48550/arXiv.2401.08406 (2024).
Gemini Team et al. Gemini: a family of highly capable multimodal models. Preprint at https://doi.org/10.48550/arXiv.2312.11805 (2023).
Tisman, G. & Seetharam, R. OpenAI’s ChatGPT-4, BARD and YOU.Com (AI) and the cancer patient, for now, caveat emptor, but stay tuned. Digit. Med. Healthc. Technol. https://doi.org/10.5772/dmht.19 (2023).
Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at https://doi.org/10.48550/arXiv.2302.13971 (2023).
Lipkova, J. et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 40, 1095–1110 (2022).
Niehues, J. M. et al. Generalizable biomarker prediction from cancer pathology slides with self-supervised deep learning: a retrospective multi-centric study. Cell Rep. Med. 4, 100980 (2023).
Foersch, S. et al. Multistain deep learning for prediction of prognosis and therapy response in colorectal cancer. Nat. Med. 29, 430–439 (2023).
Boehm, K. M. et al. Multimodal data integration using machine learning improves risk stratification of high-grade serous ovarian cancer. Nat. Cancer 3, 723–733 (2022).
Vanguri, R. et al. Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. Nat. Cancer 3, 1151–1164 (2022).
Shifai, N., van Doorn, R., Malvehy, J. & Sangers, T. E. Can ChatGPT vision diagnose melanoma? An exploratory diagnostic accuracy study. J. Am. Acad. Dermatol. 90, 1057–1059 (2024).
Liu, H., Li, C., Wu, Q. & Lee, Y. J. Visual instruction tuning. Preprint at https://doi.org/10.48550/arXiv.2304.08485 (2023).
Li, C. et al. LLaVA-med: training a large language-and-vision assistant for biomedicine in one day. Preprint at https://doi.org/10.48550/arXiv.2306.00890 (2023).
Lu, M. Y. et al. A foundational multimodal vision language AI assistant for human pathology. Preprint at https://doi.org/10.48550/arXiv.2312.07814 (2023).
Adalsteinsson, V. A. et al. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors. Nat. Commun. 8, 1324 (2017).
Zhang, Z. et al. Uniform genomic data analysis in the NCI Genomic Data Commons. Nat. Commun. 12, 1226 (2021).
Vega, D. M. et al. Aligning tumor mutational burden (TMB) quantification across diagnostic platforms: phase II of the Friends of Cancer Research TMB Harmonization Project. Ann. Oncol. 32, 1626–1636 (2021).
Anaya, J., Sidhom, J.-W., Mahmood, F. & Baras, A. S. Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status. Nat. Biomed. Eng. 8, 57–67 (2023).
Chen, B. et al. Predicting HLA class II antigen presentation through integrated deep learning. Nat. Biotechnol. 37, 1332–1343 (2019).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Callaway, E. What’s next for AlphaFold and the AI protein-folding revolution. Nature 604, 234–238 (2022).
Cheng, J. et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381, eadg7492 (2023).
Barrio-Hernandez, I. et al. Clustering predicted structures at the scale of the known protein universe. Nature 622, 637–645 (2023).
Yang, X., Wang, Y., Byrne, R., Schneider, G. & Yang, S. Concepts of artificial intelligence for computer-assisted drug discovery. Chem. Rev. 119, 10520–10594 (2019).
Mullowney, M. W. et al. Artificial intelligence for natural product drug discovery. Nat. Rev. Drug Discov. 22, 895–916 (2023).
Jayatunga, M. K. P., Xie, W., Ruder, L., Schulze, U. & Meier, C. AI in small-molecule drug discovery: a coming wave? Nat. Rev. Drug Discov. 21, 175–176 (2022).
Vert, J.-P. How will generative AI disrupt data science in drug discovery? Nat. Biotechnol. 41, 750–751 (2023).
Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626, 177–185 (2023).
Swanson, K. et al. Generative AI for designing and validating easily synthesizable and structurally novel antibiotics. Nat. Mach. Intell. 6, 338–353 (2024).
Janizek, J. D. et al. Uncovering expression signatures of synergistic drug responses via ensembles of explainable machine-learning models. Nat. Biomed. Eng. 7, 811–829 (2023).
Savage, N. Drug discovery companies are customizing ChatGPT: here’s how. Nat. Biotechnol. 41, 585–586 (2023).
Boiko, D. A., MacKnight, R., Kline, B. & Gomes, G. Autonomous chemical research with large language models. Nature 624, 570–578 (2023).
Arnold, C. AlphaFold touted as next big thing for drug discovery—but is it? Nature 622, 15–17 (2023).
Mock, M., Edavettal, S., Langmead, C. & Russell, A. AI can help to speed up drug discovery—but only if we give it the right data. Nature 621, 467–470 (2023).
AI’s potential to accelerate drug discovery needs a reality check. Nature 622, 217 (2023).
Upswing in AI drug-discovery deals. Nat. Biotechnol. 41, 1361 (2023).
Hutson, M. AI for drug discovery is booming, but who owns the patents? Nat. Biotechnol. 41, 1494–1496 (2023).
Wong, C. H., Siah, K. W. & Lo, A. W. Estimation of clinical trial success rates and related parameters. Biostatistics 20, 273–286 (2019).
Subbiah, V. The next generation of evidence-based medicine. Nat. Med. 29, 49–58 (2023).
Yuan, C. et al. Criteria2Query: a natural language interface to clinical databases for cohort definition. J. Am. Med. Inform. Assoc. 26, 294–305 (2019).
Lu, L., Dercle, L., Zhao, B. & Schwartz, L. H. Deep learning for the prediction of early on-treatment response in metastatic colorectal cancer from serial medical imaging. Nat. Commun. 12, 6654 (2021).
Trebeschi, S. et al. Prognostic value of deep learning-mediated treatment monitoring in lung cancer patients receiving immunotherapy. Front. Oncol. 11, 609054 (2021).
Castelo-Branco, L. et al. ESMO guidance for reporting oncology real-world evidence (GROW). Ann. Oncol. 34, 1097–1112 (2023).
Morin, O. et al. An artificial intelligence framework integrating longitudinal electronic health records with real-world data enables continuous pan-cancer prognostication. Nat. Cancer 2, 709–722 (2021).
Yang, X. et al. A large language model for electronic health records. NPJ Digit. Med. 5, 194 (2022).
Huang, X., Rymbekova, A., Dolgova, O., Lao, O. & Kuhlwilm, M. Harnessing deep learning for population genetic inference. Nat. Rev. Genet. 25, 61–78 (2024).
Pawlicki, Lee, D.-S., Hull & Srihari. Neural network models and their application to handwritten digit recognition. In IEEE 1988 Int. Conf. Neural Networks (eds Pawlicki, T. F. et al.) 63–70 (1988).
Chui, M. et al. The economic potential of generative AI: the next productivity frontier. McKinsey https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier (2023).
Dell’Acqua, F. et al. Navigating the jagged technological frontier: field experimental evidence of the effects of AI on knowledge worker productivity and quality. Harvard Business School https://www.hbs.edu/ris/Publication%20Files/24-013_d9b45b68-9e74-42d6-a1c6-c72fb70c7282.pdf (2023).
Boehm, K. M., Khosravi, P., Vanguri, R., Gao, J. & Shah, S. P. Harnessing multimodal data integration to advance precision oncology. Nat. Rev. Cancer 22, 114–126 (2022).
Gilbert, S., Harvey, H., Melvin, T., Vollebregt, E. & Wicks, P. Large language model AI chatbots require approval as medical devices. Nat. Med. 29, 2396–2398 (2023).
Mobadersany, P. et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc. Natl Acad. Sci. USA 115, E2970–E2979 (2018).
Chang, Y. et al. A survey on evaluation of large language models. ACM Trans. Intell. Syst. Technol. 15, 1–45 (2024).
Lin, T., Wang, Y., Liu, X. & Qiu, X. A survey of transformers. AI Open 3, 111–132 (2022).
Acknowledgements
R.P.-L. is supported by LaCaixa Foundation, a CRIS Foundation Talent Award (TALENT19-05), the FERO Foundation, the Instituto de Salud Carlos III-Investigacion en Salud (PI18/01395 and PI21/01019), the Prostate Cancer Foundation (18YOUN19) and the Asociación Española Contra el Cancer (AECC) (PRYCO211023SERR). J.N.K. is supported by the German Cancer Aid (DECADE, 70115166), the German Federal Ministry of Education and Research (PEARL, 01KD2104C; CAMINO, 01EO2101; SWAG, 01KD2215A; TRANSFORM LIVER, 031L0312A; and TANGERINE, 01KT2302 through ERA-NET Transcan), the German Academic Exchange Service (SECAI, 57616814), the German Federal Joint Committee (TransplantKI, 01VSF21048), the European Union’s Horizon Europe and innovation programme (ODELIA, 101057091; and GENIAL, 101096312), the European Research Council (ERC; NADIR, 101114631) and the National Institute for Health and Care Research (NIHR; NIHR203331) Leeds Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. This work was funded by the European Union. Views and opinions expressed are, however, those of the authors only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.
Author information
Contributions
All authors contributed substantially to discussion of the content and reviewed and/or edited the manuscript before submission. R.P.-L., N.G.L. and J.N.K. researched data for the article and wrote the article.
Ethics declarations
Competing interests
J.N.K. declares consulting services for Owkin, DoMore Diagnostics, Panakeia, Scailyte, Mindpeak and MultiplexDx; holds shares in StratifAI GmbH; has received a research grant from GSK; and has received honoraria from AstraZeneca, Bayer, Eisai, Janssen, MSD, BMS, Roche, Pfizer and Fresenius. R.P.-L. declares research funding by AstraZeneca and Roche, and participates in the steering committee of a clinical trial sponsored by Roche, not related to this work. All other authors declare no competing interests.
Peer review
Peer review information
Nature Reviews Cancer thanks the anonymous reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
Hugging Face: https://huggingface.co/
You.com: https://you.com
Glossary
- Application programming interface (API). A set of tools and protocols for building software and applications, enabling software to communicate with AI models.
- Artificial neural networks (ANNs). Computational models loosely inspired by the structure and function of the human brain, consisting of interconnected layers of nodes, called neurons, that process input data and learn to recognize patterns and make decisions.
- Computational pathology. The use of algorithms, machine learning and image analysis techniques to extract information from digital pathology images.
- Computer vision. A field of AI that focuses on enabling computers to analyse and interpret visual data, such as images and videos.
- Convolutional neural networks (CNNs). A type of deep neural network that is especially effective for analysing visual imagery and is widely used in image analysis.
- Deep learning. A subfield of machine learning that uses artificial neural networks with multiple layers, called deep neural networks, to learn and extract highly complex features and patterns from raw input data.
- Digital images. Visual representations captured and stored in a digital format, consisting of a grid of pixels, with each pixel representing a colour intensity value.
- Digital pathology. The practice of converting glass slides into digital slides that can be viewed, managed and analysed on a computer.
- Explainability methods. Techniques in AI that provide insights into and explanations of how an AI model arrived at its conclusions, thus making its decision-making process more transparent.
- Generative AI. AI systems that can generate new content (text, images or music) that is similar to the content on which they were trained, often creating novel and coherent outputs.
- Gigapixel images. Extremely high-resolution digital images consisting of at least 1 billion pixels, obtained by scanning tissue slides with a slide scanner.
- Graphics processing units (GPUs). Specialized hardware that rapidly processes large blocks of data in parallel, used in computer gaming and AI.
- Large language models (LLMs). Advanced AI models trained on vast amounts of text data, capable of analysing, generating and manipulating human language, often at the human level (ref. 174).
- Long short-term memory (LSTM) networks. A type of neural network particularly good at processing sequences of data (such as time series or language), with a capability to remember information for a certain time.
- Machine learning. A subset of AI focusing on the development of algorithms and models that enable computers to learn and improve their performance on a specific task without being explicitly instructed how to achieve this.
- Natural language processing (NLP). A branch of AI that helps computers to analyse, interpret and respond to human language in a useful way.
- Prompt engineering. Crafting inputs or questions in a way that guides AI models, particularly LLMs, to provide the most effective and accurate responses.
- Transformers. Neural network models that excel at processing sequences of data, such as sentences in text, by focusing on different parts of the sequence to make predictions (ref. 175).
- Voxel. The three-dimensional equivalent of a pixel in images, representing a value on a regular grid in three-dimensional space, commonly used in medical imaging such as MRI and CT scans.
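Several of the glossary terms above (digital images, convolutional neural networks and voxels) can be made concrete with a few lines of code. The following is a minimal sketch in plain Python, not taken from the article: the 4 × 4 image, the hand-written kernel and the function name `convolve2d` are all assumptions made up for illustration. It shows an image as a grid of pixel intensities, one convolution step of the kind a CNN layer applies, and a voxel lookup in a small three-dimensional stack.

```python
# Illustrative sketch only: the image values, the kernel and the helper name
# are hypothetical and chosen for this example.

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the grid
    of weighted sums. As in most deep learning libraries, the kernel is not
    flipped, so strictly speaking this is a cross-correlation."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# A greyscale digital image: dark left half (0), bright right half (9).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-written vertical-edge detector. A CNN would learn such weights from
# training data instead of having them specified by hand.
kernel = [
    [-1, 1],
    [-1, 1],
]

# The response is large only where the window straddles the dark-to-bright edge.
feature_map = convolve2d(image, kernel)

# A voxel grid adds a third axis: volume[slice][row][column].
volume = [image, image]        # two identical slices, 2 x 4 x 4 voxels
voxel_value = volume[1][2][3]  # slice 1, row 2, column 3
```

In practice, researchers would rely on an established library rather than hand-written loops; the sketch is only meant to show that a convolution is a sliding weighted sum over neighbouring pixels.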
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Perez-Lopez, R., Ghaffari Laleh, N., Mahmood, F. et al. A guide to artificial intelligence for cancer researchers. Nat Rev Cancer 24, 427–441 (2024). https://doi.org/10.1038/s41568-024-00694-7
DOI: https://doi.org/10.1038/s41568-024-00694-7