  • ADVERTISEMENT FEATURE Advertiser retains sole responsibility for the content of this article

Brains not brawn to solve pharma’s pipeline problem

With the help of Logica, scientists can screen trillions of virtual compounds in a matter of days. Credit: Cravetiger/Getty Images

Small-molecule drug development is an increasingly risky venture. Current data point to costs approaching US$4 billion per new drug, and only one out of every 10,000 drugs in development will receive marketing authorization. As new precision diagnostic techniques reveal that many conditions comprise multiple disease types — each requiring a different therapy — brute-force, high-throughput screening strategies are becoming less economically viable.

“Throwing more money at the problem is unlikely to change the outcome,” says Chris Hurley, director of chemistry at Charles River Laboratories in Harlow, UK. “From the standpoint of all the experimental tools we have now, including computing power and years of datasets, how are we going to view drug discovery going forward?”

Excited by the success of techniques such as DeepMind’s AlphaFold, an AI system that predicts a protein’s three-dimensional structure from its amino-acid sequence, many biotech firms are betting that the route to more efficient drug discovery lies in mining massive amounts of data about prior successes and failures. But while virtual simulations excel at tasks such as chemical stability optimization, they lack the power to simulate real clinical trials. As a result, AI is often applied to isolated problems within the conventional discovery process, but this is not where its full potential lies.

“Instead of trying to solve a particular issue, AI’s real power is avoiding those problems in the first place,” notes Hurley. “That’s why it’s quite difficult for companies to successfully adopt it into their existing programmes.”

Clean-slate strategy

Valo Health, a three-year-old technology company in Boston, aims to shake up drug discovery by infusing the power of AI computing and patient data into decision-making steps, ranging from virtual screening to predictions about a molecule’s therapeutic development path. It accomplishes this through two strategies: a software architecture that uses early data gathering and feedback loops to build and refine models that give objective predictions about potential drug candidates, and a reliance on what the company terms ‘human-centric’ data that reflect active disease properties better than isolated cell lines or extracted organs.

Valo Health has teamed up with the data-generating powerhouse Charles River Laboratories to offer the industry broader access to its suite of AI tools through a new platform called Logica™. Guido Lanza, vice president of integrated research at Valo Health, explains that the machine-learning capabilities Valo contributes to Logica help narrow the search from the near-infinite expanse of chemical space to more manageable regions, before screening trillions of virtual compounds in a matter of days.

“With Logica, the process is built around the AI, not the other way around,” says Lanza. “By being in that bespoke, problem-specific space you discover interesting and advanceable new leads and avoid pitfalls.”

Valo’s AI-centric approach rethinks traditional processes so that they integrate information from data researchers with that from chemists and biologists in the early stages — even using AI to come up with critical steps in the workflow. “Ask yourself ‘Why am I screening? Do I need starting points for drug discovery, or do I screen because I need training data for my AI?’,” he advises. “Data intentionality is critical. The challenges with designing a drug are diverse and require equally diverse, bespoke algorithmic approaches that can interpret the output.”

Rise from the ranks

One of the fundamental challenges of AI in drug discovery is training a machine-learning algorithm to find repeatable patterns in complex biological activity, such as antibody binding or cell-membrane penetration. Statistical bias in published literature and patents, which make up the majority of training data, can further skew what the algorithm learns.

“Without an approach like Logica, there is potential for publication bias,” explains Lanza. “People publish active compounds, not inactive ones. The nature of biological assays is also highly noisy: I might repeat the same experiment and receive different results. You need to handle that noise and bias to make useful predictions.”

As part of their approach to reduce bias, the Logica team at Valo examined published historical medicinal chemistry data. While absolute potency numbers fluctuated from paper to paper, the relative rankings between pairs of compounds remained consistent. “This allows the AI to focus on the differences between compounds,” says Lanza. In such cases, the goal for the machine-learning software is to rank compounds rather than predict an absolute experimental value. The team can encode how a biologist understands and interrogates experimental data — comparability, bias, and noise among and between labs, conditions, and more — and expressing that knowledge as pairwise comparisons trains the system to make the same judgements. “We can extract every ounce of information possible in a data set that is noisy and biased.”
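The article does not disclose how Logica implements this pairwise training, but the general idea — learn from which compound of a pair is more potent, rather than from noisy absolute potencies — can be illustrated with a toy sketch. Everything below (the linear model, the lab-offset noise, the RankNet-style logistic loss) is an assumed, minimal stand-in, not Valo’s method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: each "compound" is a 5-dimensional feature vector; its true
# potency is a linear function of the features plus a per-measurement
# offset that mimics the between-lab noise described in the article.
n, d = 200, 5
X = rng.normal(size=(n, d))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])
lab_offset = rng.normal(scale=2.0, size=n)       # lab-dependent bias
potency = X @ true_w + lab_offset                # noisy absolute readouts

# Training pairs (i, j), labelled 1 when compound i looks more potent.
# Absolute values differ across "labs", but most pairwise orderings survive.
pairs = np.array([(i, j) for i in range(n) for j in range(i + 1, n)])
i_idx, j_idx = pairs[:, 0], pairs[:, 1]
labels = (potency[i_idx] > potency[j_idx]).astype(float)
X_diff = X[i_idx] - X[j_idx]

# Pairwise logistic regression on score differences: the model is only
# ever asked "which of these two ranks higher?", never for a raw value.
w = np.zeros(d)
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X_diff @ w)))      # P(i outranks j)
    w -= 0.1 * ((p - labels)[:, None] * X_diff).mean(axis=0)

# The learned scoring direction should align with the true potency weights
# despite the per-measurement noise in the training labels.
cosine = w @ true_w / (np.linalg.norm(w) * np.linalg.norm(true_w))
print(f"cosine similarity with true weights: {cosine:.2f}")
```

Because only score *differences* enter the loss, any per-measurement additive offset cancels out of the model's comparisons — which is exactly why ranking objectives are more robust to inter-lab calibration drift than regression on absolute potencies.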

One successful application of the algorithms available in Logica was the design of ryanodine-receptor stabilizers — compounds that can restore the frequency and amplitude of cardiac muscle contractions. Beginning with a few compounds of known activity, the team used machine learning to mine patents and literature to diversify its starting candidates, gradually building a new model with noise-reduction techniques.

Through screening of commercial libraries, the team identified multiple chemical design concepts and subjected them to iterative rounds of optimization. By tightly coupling the virtual libraries with real chemistry, the algorithms identified several compounds that met all success criteria for modulating calcium release in heart muscle cells. The project was also completed several years faster than a conventional approach would have allowed.

Democracy now

Data-driven techniques are already changing the face of drug discovery by enabling fundamental questions to be answered early. “Is the best approach a small molecule? Can you take an alternative approach such as cell and gene therapy?” asks Hurley. “You can make those decisions without bias — including de-risking early if it seems the project isn’t going to be feasible.”

Lanza agrees. “In a sense, using machine learning in Logica can democratize the design of small molecules,” he adds. “It used to be that the companies who had the most money and time to invest into designing the best drug molecules were most successful.” Charles River wants to scale and to be a true engine of democratization for drug discovery: to shift the value generation and investment towards understanding the core biology of human disease. “The people who can best define disease and measure disease efficacy — that’s where the winners are going to be in the future.”

To learn more about how AI platforms such as Logica can benefit drug discovery, watch the webcast.
