Oncologists face a conundrum with immunotherapies. These drugs are designed to smash through the immunity-suppressing fog created by tumours. When they work, they can marshal a potent antitumour immune response and deliver years of remission. Unfortunately, most people with cancer do not experience these benefits, and it isn’t obvious who will respond to treatment.

Clinicians are clamouring for useful biomarkers that can quickly sort likely responders from those for whom such treatments will not work. Artificial intelligence (AI) could be a valuable ally in this setting, and researchers are developing algorithms that are proving adept at spotting patterns in clinical data that could guide better treatment (see ‘Pattern recognition’). For example, a 2022 study led by Anant Madabhushi, a biomedical engineer at Emory University School of Medicine in Atlanta, Georgia, demonstrated an AI platform that doubled the success rate for predicting whether people with lung cancer would benefit from immunotherapy1.

Pattern recognition: histology sample of an ovarian tumour showing cancer tissue, the stromal microenvironment and immune cells

Credit: Madabhushi Lab, Emory Univ. School of Medicine

This is just one example of how researchers are using algorithms to make the most of the clinical data they have at hand, whether it is sophisticated molecular insights from genomics or the tried-and-tested histopathology slide. “It’s really unethical not to use the data that’s available, because the data is there,” says Jakob Kather, a clinician and computer scientist at the Technical University of Dresden in Germany. “I think we really have an obligation to squeeze every bit of knowledge out of this fruit.” The resulting algorithms range from focused assessment tools for individual drug categories to more futuristic endeavours, such as ‘digital twins’ — computer models of tumours that can be used to test various simulated treatments. The goal is to help physicians to quickly match people to the safest and most effective care.

About a decade ago, there was more hype than substance on this front. The technology giant IBM commercially launched Watson for Oncology, an AI-guided treatment selection platform, to considerable fanfare in 2016. But within a few years, it became clear that the system was an unreliable and expensive black box that often generated incorrect advice. IBM eventually stepped back from the effort, selling off its Watson assets in 2022. But the concept has remained tantalizing, and in the past several years, researchers have taken a more systematic approach to the problem, powered by the rapid evolution of AI — even as researchers and clinicians have grown acutely aware of the challenges of teaching computers to provide unbiased, trustworthy medical advice.

Subtle signatures

The modern cancer-therapy landscape is exceedingly complex, encompassing diverse drug categories alongside the established tools of surgery, chemotherapy and radiation. Identifying the best mix of approaches for each person with cancer is challenging. “Two in five people are going to be diagnosed with some form of cancer in their lifetime, and we still know so little about the right management and treatment strategies for them,” says Madabhushi.

The problem is particularly acute for immunotherapies such as checkpoint-inhibitor drugs, which selectively inactivate the immune checkpoint proteins that cancers exploit to dodge destruction. Arsela Prelaj, a thoracic oncologist at the National Cancer Institute of Milan, Italy, says that some of her patients with advanced lung cancer remain progression-free five to ten years after treatment.

Unfortunately, the main biomarker used to select people for immunotherapy — the extent to which the checkpoint protein PD-L1 is expressed in a biopsy — is unreliable for identifying those who would benefit. “It’s just a terrible biomarker — but that’s all we have,” says Madabhushi. One analysis from 2019 found that fewer than 30% of people recommended a checkpoint-inhibitor treatment on the basis of their tumour’s PD-L1 expression levels went on to respond2. Madabhushi’s team has been working to improve that track record through AI-powered analysis of conventional clinical imaging data, with the goal of uncovering structural and physiological features that reveal insights into the tumour’s immune environment.

The team’s 2022 study1 entailed machine vision analysis of computed tomography images from 507 people with non-small-cell lung cancer (NSCLC). The researchers were able to identify vascular features that correctly predicted checkpoint-inhibitor response in more than 60% of cases. “The more twisted the vasculature, the more likely these patients were to not respond to immunotherapy,” he says. This is consistent with research indicating that in a tumour, abnormal blood-vessel growth is associated with low immunogenicity. The predictive power of this model is now being tested in the INSIGNA lung cancer clinical trial.
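The published model relies on a large set of engineered vascular features, but the core idea, quantifying how twisted a segmented vessel is, can be illustrated simply. Below is a minimal Python sketch using one common tortuosity definition (arc length divided by chord length); the function and toy data are illustrative assumptions, not the study’s actual pipeline.

```python
# A minimal sketch of one common vessel-tortuosity metric of the kind that
# machine-vision pipelines can extract from segmented CT vasculature.
# This is illustrative only, not the features used in the published model.
import numpy as np

def tortuosity(centreline: np.ndarray) -> float:
    """Total path length divided by straight-line distance.

    centreline: (N, 3) array of ordered points along a segmented vessel.
    A value of 1.0 is a perfectly straight vessel; higher is more twisted.
    """
    steps = np.diff(centreline, axis=0)
    arc_length = np.linalg.norm(steps, axis=1).sum()
    chord_length = np.linalg.norm(centreline[-1] - centreline[0])
    return arc_length / max(chord_length, 1e-9)

# Toy usage: a helical (twisted) vessel scores far above a straight one.
t = np.linspace(0, 4 * np.pi, 200)
helix = np.stack([np.cos(t), np.sin(t), 0.1 * t], axis=1)
straight = np.stack([t, np.zeros_like(t), np.zeros_like(t)], axis=1)
print(tortuosity(helix), tortuosity(straight))  # high vs ~1.0
```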

Prelaj’s group has also evaluated AI-guided prediction of immunotherapy response in people with NSCLC, and has generally been impressed with its ability to deliver useful predictions. “These tools are trustworthy, and are working,” she says. In 2022, her group spearheaded the I3LUNG Project, a five-year initiative that has recruited 2,200 people with NSCLC in Europe, the United States and Israel. I3LUNG aims to develop a deep-learning model for predicting the response to checkpoint inhibitors — either alone, or in combination with other therapies — on the basis of imaging, histology and data from clinical records. The researchers will then validate the model’s ability to identify effective treatment strategies in a prospective cohort of people with cancer3.

Diving deeper

Many of a tumour’s vulnerabilities are most easily discerned by using strategies such as genomic sequencing and RNA analysis to spot aberrant gene expression. By coupling these ‘multi-omic’ data with histological and radiological insights, clinical researchers can more accurately identify what caused a cancer to arise and how best to treat it.

This could further boost the odds of success in immunotherapy, and the I3LUNG cohort includes 200 people with cancer who will undergo extensive multi-omic analysis to see how molecular features enhance predictive performance for this class of agents. But these multi-omic insights become especially valuable in selecting targeted therapies, which modulate the activity of specific proteins involved in the survival or progression of tumour cells. Roughly one-fifth of all breast cancers, for example, abnormally express a protein called HER2, which can be targeted with various therapies.

In a 2021 study, researchers led by Carlos Caldas, a medical oncologist at the University of Cambridge, UK, used machine-learning algorithms to generate models based on pathological, DNA and RNA data from tumours and their surrounding tissue and immune cells4. The goal was to predict breast cancer response to standard chemotherapy, either alone or with therapies targeted to HER2. The model delivered correct predictions 87% of the time, and is scheduled for testing in a prospective clinical trial starting later this year.

Access to molecular data of this kind, however, is far from guaranteed even in wealthy countries. In resource-limited settings, DNA- or RNA-sequencing technologies are hard to come by. To make up for this shortfall, some researchers are trying to uncover molecular-scale abnormalities by using only standard histological preparations. “What we are doing mainly is to look at whatever data is routinely available in the clinic at scale,” says Kather. In 2019, he and his colleagues developed a deep-learning model that detects microsatellite instability — a defect in DNA repair that causes abnormally high levels of mutations — entirely on the basis of morphological features in conventionally prepared and stained histopathology slides. This characteristic is strongly associated with immunotherapy response, and the technique achieved a success rate of more than 80% in identifying such defects5.
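The paper’s exact architecture and training recipe aside, the general pattern it describes, classifying small tiles cut from a slide and pooling the tile scores into a slide-level call, can be sketched in a few lines. The network choice, tile size and mean-pooling below are assumptions for illustration only.

```python
# A minimal sketch of tile-based slide classification: a deep network scores
# tiles cut from a stained slide as microsatellite-instable (MSI) or stable,
# and tile scores are averaged into a slide-level prediction. Architecture,
# tile size and aggregation are illustrative, not the published configuration.
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights=None)                 # in practice, pretrained
model.fc = nn.Linear(model.fc.in_features, 2)  # two classes: MSI vs stable

def slide_msi_score(tiles: torch.Tensor) -> float:
    """tiles: (n_tiles, 3, 224, 224) batch cut from one whole-slide image."""
    model.eval()
    with torch.no_grad():
        probs = torch.softmax(model(tiles), dim=1)[:, 1]  # P(MSI) per tile
    return probs.mean().item()                 # mean-pooled slide score

# Shape check with dummy tiles; real input would come from a slide tiler.
print(slide_msi_score(torch.rand(8, 3, 224, 224)))
```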

Researchers led by Eytan Ruppin, head of computational precision oncology at the National Cancer Institute (NCI) in Bethesda, Maryland, have taken this concept even further. They analysed standard histopathology slides from 5,528 participants in a US genomics effort called The Cancer Genome Atlas, in combination with the accompanying transcriptomic data. They then used deep learning to identify histological features correlated with changes in gene expression6. “Out of 20,000 genes, there were only a few thousand genes we could predict reliably,” says Ruppin. “That was sufficient for us.” This allowed them to generate trained models for predicting gene expression in 16 tumour types on the basis of histological appearance. The inferred gene-expression data then fed into ENLIGHT, a treatment-selection algorithm developed by Pangea Biomed, a company in Tel Aviv, Israel, that Ruppin co-founded and now advises in an unpaid capacity. Using those data, ENLIGHT built a model that successfully identified regimens of targeted agents and immunotherapies that would prove effective in an independent cohort of people with cancer.
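The filtering step Ruppin describes, keeping only genes whose expression can be predicted reliably from image features, is easy to picture in code. The sketch below uses synthetic data and a simple ridge regression as stand-ins; the real pipeline’s features, models and reliability criterion are more sophisticated.

```python
# A minimal sketch of the reliability filter: regress each gene's expression
# from slide-level image features, then keep only genes whose held-out
# predictions correlate well with the truth. Features, model and cutoff are
# illustrative assumptions, not the published pipeline.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 64))        # stand-in slide embeddings
Y = rng.normal(size=(300, 100))       # stand-in expression for 100 genes
Y[:, :10] += X[:, :10] * 2            # make the first 10 genes learnable

reliable = []
for g in range(Y.shape[1]):
    pred = cross_val_predict(Ridge(alpha=1.0), X, Y[:, g], cv=5)
    r = np.corrcoef(pred, Y[:, g])[0, 1]
    if r > 0.4:                       # arbitrary reliability threshold
        reliable.append(g)

print(f"{len(reliable)} of {Y.shape[1]} genes predicted reliably")
# Downstream, only these inferred expression values would feed a
# treatment-matching algorithm such as ENLIGHT.
```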

Tumour ex machina

One big limitation of these approaches is that they provide only a static snapshot of a cancer. If the tumour mutates during treatment, a new analysis or another biopsy will probably be required.

Simulations known as digital twins could help. These virtual constructs are deployed in the engineering world for analysing the behaviour of cars, aeroplanes or spacecraft in complex real-world environments. “What characterizes a digital twin as opposed to just a model is that there is a stream of measurement data that flows from the actual physical system back into the model all the time,” explained Ilya Shmulevich, who was an engineer specializing in complex biological systems at the Institute for Systems Biology in Seattle, Washington. (He died shortly before this article went to press.) A digital twin of a person’s tumour would be both individualized to the person and dynamic, making digital twins a potentially powerful tool for planning cancer therapy and monitoring its effects.
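Shmulevich’s definition suggests a simple mental model: a simulator whose parameters are re-fitted every time a new measurement arrives. The toy Python sketch below does exactly that for an exponential tumour-growth model; the model form and numbers are illustrative assumptions, not any group’s actual twin.

```python
# A toy digital twin: a model whose parameters are continually re-estimated
# as new measurements stream in from the physical system (here, follow-up
# scans of tumour volume). Purely illustrative assumptions throughout.
import numpy as np

class TumourTwin:
    def __init__(self, volume0: float):
        self.volume = volume0
        self.rate = 0.0                   # per-week log-growth rate
        self._log_history = [np.log(volume0)]

    def assimilate(self, measured_volume: float) -> None:
        """Fold a new clinical measurement back into the model."""
        self._log_history.append(np.log(measured_volume))
        # Re-estimate the rate as the mean log-change across observations.
        self.rate = float(np.mean(np.diff(self._log_history)))
        self.volume = measured_volume

    def predict(self, weeks: int) -> float:
        """Project volume forward under the current parameter estimate."""
        return self.volume * np.exp(self.rate * weeks)

twin = TumourTwin(volume0=10.0)
for v in [10.8, 11.9, 12.7]:              # a stream of follow-up scans
    twin.assimilate(v)
print(twin.predict(weeks=4))              # forecast used to plan therapy
```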

The NCI and the US Department of Energy have funded multiple initiatives related to cancer digital twins. Olivier Gevaert, a biomedical informatician at Stanford University in California, is leading one such effort, with the goal of generating a digital twin model for lung cancer based on streams of data collected from roughly 200 people with the disease. Gevaert says his team has a full range of clinical information about these people, including imaging, pathology data and molecular read-outs such as RNA sequencing and mutations, and that their first digital twin prototype will help to “develop a model that can dynamically predict tumour growth over time”. He adds that their imaging data capture the response of tumours that have undergone a variety of treatments, such as chemotherapy, radiation, immunotherapy and targeted drugs. This should allow his team to assess the impact of a wide range of perturbations on tumour proliferation, treatment response and survival.

The Shmulevich group’s digital twin project, focused on acute myeloid leukaemia, is taking a different approach. The team’s models eschew the sophisticated molecular-scale analyses that are often found in research laboratories but that are not necessarily accessible in clinical care settings. Instead, they focus on clinical data that are routinely collected over time, including standard panels of genes known to be often mutated in leukaemia, and profiles of the cellular composition and morphology in the blood of people with leukaemia. They have collected vast data sets of real-world drug responses from studies in North America and Europe, including more than 1,400 people from the Leukemia & Lymphoma Society’s Beat AML study. The aim is to generate models that not only predict a positive drug response using pathological data, but that can also anticipate toxicity and help physicians to tune treatment to avoid anaemia and other side effects.

These two digital twin efforts differ in another way. Gevaert is exploring what kind of insight can emerge from AI alone. “We want to see the limit of what we can get out of cutting-edge deep-learning methods,” he says. By contrast, Shmulevich’s team will be establishing guardrails for their first-generation model by setting up a manually coded framework based on accepted medical and scientific knowledge. The goal is to provide a sanity check for the AI model, using real-world data to generate estimates of how confident the algorithm is about its predictions. All of these digital twin projects are in the early stages, however, and their scope is likely to evolve during development.
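The guardrail concept can be pictured as a hand-coded rule base that screens the AI’s suggestions. In the hypothetical sketch below, a rule set vetoes recommendations that conflict with accepted medical knowledge and lowers the model’s reported confidence accordingly; the rules, drug names and scores are invented placeholders.

```python
# A hypothetical guardrail: a manually coded rule base down-weights an AI
# model's confidence in any suggestion that conflicts with accepted medical
# knowledge. Rules, drugs and scores are invented for illustration.
RULES = {
    # hypothetical contraindications: drug -> disqualifying conditions
    "drug_A": {"severe_anaemia"},
    "drug_B": {"renal_impairment"},
}

def guarded_recommendation(ai_scores: dict, patient_conditions: set):
    """ai_scores: the model's confidence per candidate drug."""
    adjusted = {}
    for drug, score in ai_scores.items():
        if RULES.get(drug, set()) & patient_conditions:
            adjusted[drug] = score * 0.1   # flagged by the rule base
        else:
            adjusted[drug] = score
    best = max(adjusted, key=adjusted.get)
    return best, adjusted

best, scores = guarded_recommendation(
    {"drug_A": 0.9, "drug_B": 0.6}, {"severe_anaemia"})
print(best, scores)   # drug_B wins once the rules veto drug_A
```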

Driven by data

Even when an AI model is designed to make use of clinical information that is routinely recorded, supplying those data to the model can still be difficult. In Germany, Kather says, most pathology records exist as physical slides and printed reports. “The biggest challenge is the availability of any digital data,” he says.

Electronic health records are a valuable asset. This year, researchers at Vanderbilt University Medical Center in Nashville, Tennessee, and GE Healthcare in Chicago, Illinois, showed that a machine-learning model trained purely on data from electronic health records could predict both a positive response to immunotherapy and the risk of adverse events with greater than 70% accuracy7. But these records tell only part of the clinical tale.
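As a rough illustration of this kind of study, a standard gradient-boosted classifier can be trained on structured record fields alone. Everything in the sketch below, the features, the synthetic labels and the model choice, is an assumption for demonstration rather than the published method.

```python
# A minimal sketch of a response classifier trained purely on structured
# electronic-health-record fields. Features, synthetic labels and model are
# illustrative assumptions, not the Vanderbilt/GE study's method.
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 1000
# Stand-ins for routinely recorded fields: age, serum albumin,
# neutrophil-to-lymphocyte ratio, number of prior therapy lines.
X = np.column_stack([
    rng.normal(65, 10, n), rng.normal(3.8, 0.5, n),
    rng.lognormal(1.2, 0.4, n), rng.integers(0, 4, n),
])
# Noisy synthetic label loosely tied to one feature, for demonstration only.
y = (X[:, 2] < 3.5).astype(int) ^ (rng.random(n) < 0.2)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = HistGradientBoostingClassifier().fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```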

Danielle Bitterman, a radiation oncologist at Harvard Medical School in Boston, Massachusetts, points out that many details of a person’s treatment plan — including justifications for departing from standard care guidelines — are locked away in clinicians’ notes. “You want to make sure you’re training your model on data that reflect the full scope of clinical practice,” says Bitterman, noting that even the most advanced electronic health records are ill-equipped to capture such details. She and others are exploring the use of natural-language processing to digitize and transform free-form notes into structured data that can be used for AI training purposes.
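In practice, this often means running a named-entity-recognition model over each note and storing the extracted entities as structured fields. The sketch below shows the shape of such a step with the Hugging Face transformers pipeline; the model identifier is a deliberate placeholder for whichever clinical NER model an institution has validated, so the snippet is a template rather than a working example.

```python
# A template for turning a free-text oncology note into structured fields
# with a named-entity-recognition (NER) model. The model ID below is a
# placeholder, not a real checkpoint; substitute a validated clinical model.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="YOUR-CLINICAL-NER-MODEL",  # placeholder: swap in a real model ID
    aggregation_strategy="simple",
)

note = ("Patient declined pembrolizumab owing to pre-existing colitis; "
        "started carboplatin and pemetrexed instead.")

# Each detected entity becomes a row of structured data for AI training.
structured = [
    {"text": ent["word"], "label": ent["entity_group"],
     "score": float(ent["score"])}
    for ent in ner(note)
]
print(structured)
```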

Useful public data sets exist, but they are often hard to track down and access. To improve their analytical algorithm, Ruppin and his close collaborator Kenneth Aldape, chief of pathology at the NCI, have combed the literature and prevailed on their network of collaborators to gather any histopathology resources they could find.

Many other bits of valuable clinical data are sequestered at individual institutions — an important safeguard for the privacy of people with cancer, but an impediment to attempts to educate algorithms with data from a broad population. De-identification of patient data sets can be a difficult task, and Bitterman recommends federated learning systems as an alternative. In such systems, the model is trained on each institution’s data locally, and only the resulting model updates, not the underlying records, are shared with the platform developers. “You’re training a central model, but you’re never actually having the data leave the individual institutions,” says Bitterman.
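The training pattern Bitterman describes is widely known as federated averaging: each site fits the shared model on its own records, and only the parameter updates travel to a central server, which averages them. A minimal sketch with a logistic-regression model and synthetic site data follows; real deployments add secure aggregation and further privacy protections.

```python
# A minimal sketch of federated averaging: each hospital trains the shared
# model on its private records, and only parameter updates, never the
# records, reach the central server. Data and schedule are illustrative.
import numpy as np

rng = np.random.default_rng(2)

def local_sgd(w, X, y, lr=0.1, epochs=5):
    """One site's private training pass on a logistic-regression model."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

# Three institutions with private data that never leaves the site.
w_true = rng.normal(size=5)
sites = []
for _ in range(3):
    X = rng.normal(size=(200, 5))
    y = (X @ w_true + rng.normal(scale=0.5, size=200) > 0).astype(float)
    sites.append((X, y))

w_global = np.zeros(5)
for _ in range(10):                          # communication rounds
    local_ws = [local_sgd(w_global.copy(), X, y) for X, y in sites]
    w_global = np.mean(local_ws, axis=0)     # server averages the updates

print(w_global)   # central model trained without pooling raw records
```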

These data also need to be broadly representative of the human population, and curated with an eye towards averting opportunities for algorithmic bias. People from under-represented groups have consistently received lower-quality treatment, which Bitterman thinks could lead algorithms trained on historical records to learn the wrong lessons about how to treat people with cancer in the future. More generally, Bitterman says that much of the data used for AI studies come from “large academic medical centres that tend to serve white, wealthy patients”, and thus could miss risk factors or other biomarkers present in other populations.

Above all, any algorithm intended to guide treatment planning will need to prove its mettle in prospective clinical trials, and win the trust of regulators, practitioners and people with cancer. Aldape says that the road ahead for AI-guided cancer care is not well charted, and will start slowly — focused on specific treatments or tumour types, for example. “It’s going to be step-by-step,” he says. “But I think it’s going to happen.”