For decades, drug development was mostly a game of trial and error, with brute-force candidate screens throwing up millions more duds than winners. Researchers are now using computers to get a head start. By analysing the chemical structure of a drug, they can see if it is likely to bind to, or ‘dock’ with, a biological target such as a protein. Such algorithms are particularly useful for finding potentially toxic side effects that may come from unintended dockings to structurally similar, but untargeted, proteins.


Last week, researchers presented a computational effort that assesses billions of potential dockings on the basis of drug and protein information held in public databases. “It’s the largest computational docking ever done by mankind,” says Timothy Cardozo, a pharmacologist at New York University’s Langone Medical Center, who presented the project on 19 November at the US National Institutes of Health’s High Risk–High Reward Symposium in Bethesda, Maryland. The result, a website called Drugable (drugable.com) that is backed by the US National Library of Medicine (NLM), is still in testing, but it will eventually be available for free, allowing researchers to predict how and where a compound might work in the body, purely on the basis of chemical structure (see ‘Mining for drugs’).

Cardozo acknowledges that the computations are just an initial step in drug discovery. After predicting whether a protein can bind to a compound, drug developers must test the drug’s action on the same protein in a cell to see what actually happens to the protein’s function, as well as how much of the drug is needed and under what conditions. Then come animal trials and, if researchers are lucky, human trials. But these extra data are often proprietary and held by pharmaceutical companies, says Brian Shoichet, a computational biologist at the University of California, San Francisco. Some public databases such as PubChem, maintained by the NLM, hold the results of automated tests of drugs on proteins in yeast cells, but they contain inaccuracies and false positives, he says.

Still, scientists have already shown that the computational approach can provide some short cuts. In 2012, Shoichet and researchers at the Novartis Institutes for BioMedical Research in Cambridge, Massachusetts, developed an algorithm that predicts side effects on the basis of similarities between drugs’ chemical structures. When the researchers tested the program on 656 approved drugs and 73 biological targets, they found that it predicted hundreds of previously unknown interactions — and that these side effects turned out to be real about half of the time (E. Lounkine et al. Nature 486, 361–367; 2012). For known drugs, Shoichet says, this type of computation provides a quick way to identify interactions that should be investigated further.
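To make the idea of predicting interactions from structural similarity concrete, here is a minimal sketch. It is not the published algorithm: it assumes each drug is reduced to a set of structural feature IDs (a "fingerprint") and compares sets with the Tanimoto coefficient, a common similarity measure in cheminformatics. The drug names, targets, fingerprints and the 0.5 threshold are all hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity: shared features / all features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical reference set: drug -> (fingerprint, known targets).
known_drugs = {
    "drug_A": (frozenset({1, 2, 3, 4}), {"target_X"}),
    "drug_B": (frozenset({7, 8, 9}), {"target_Y"}),
}


def predict_targets(query_fp, reference, threshold=0.5):
    """Suggest targets of reference drugs that resemble the query.

    Returns a dict mapping each suggested target to the highest
    similarity score that supports it. The threshold is an assumption.
    """
    predictions = {}
    for fp, targets in reference.values():
        score = tanimoto(query_fp, fp)
        if score >= threshold:
            for target in targets:
                predictions[target] = max(score, predictions.get(target, 0.0))
    return predictions


# A hypothetical query compound sharing three of five features with drug_A.
query = frozenset({1, 2, 3, 5})
result = predict_targets(query, known_drugs)
print(result)
```

A hit from a sketch like this is only a hypothesis to be tested in the laboratory, which matches the roughly 50% confirmation rate reported for the real program.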

Predicting how untested compounds will interact with proteins in the body, as Drugable attempts to do, is more challenging. In setting up the website, Cardozo’s group selected about 600,000 molecules from PubChem and the European Bioinformatics Institute’s ChEMBL, which together catalogue millions of publicly available compounds. The group evaluated how strongly these molecules would bind to 7,000 structural ‘pockets’ on human proteins also described in the databases. Computing giant Google awarded the researchers the equivalent of more than 100 million hours of processor time on its supercomputers for the mammoth effort.
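A quick back-of-the-envelope check shows the scale these figures imply. The sketch below assumes every selected molecule was scored against every pocket and treats the article's rounded numbers as exact, so the outputs are rough estimates only.

```python
# Rounded figures quoted in the article.
molecules = 600_000        # "about 600,000 molecules"
pockets = 7_000            # "7,000 structural pockets"
cpu_hours = 100_000_000    # lower bound: "more than 100 million hours"

# Assumption: every molecule is docked against every pocket.
pairs = molecules * pockets

# Average processor time per molecule-pocket pair, in seconds.
seconds_per_pair = cpu_hours * 3600 / pairs

print(f"{pairs:,} molecule-pocket pairs")
print(f"at least {seconds_per_pair:.0f} processor-seconds per pair")
```

Under these assumptions the workload is about 4.2 billion pairs, in line with the "some 4 billion" interactions the team reports, at an average of somewhat more than a minute of processor time each.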

The team came up with ranked docking scores describing some 4 billion potential drug–protein interactions. Then the group cross-referenced the target proteins with those in the NLM’s Gene Expression Omnibus database, which shows where in the body different genes that code for proteins are expressed. This allowed them to predict where the drug might act, says Cardozo: if Drugable finds an interaction for a protein that is highly expressed in a certain tissue, chances are good that the effect would manifest itself in that tissue.
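The cross-referencing step can be sketched as a simple join between two tables. This is an illustration under stated assumptions, not Drugable's implementation: the compound, protein and tissue names, the score and expression scales, and both cut-offs are hypothetical, and higher scores are assumed to mean stronger predicted binding.

```python
from collections import defaultdict

# Hypothetical docking scores: (compound, protein) -> score.
docking_scores = {
    ("compound_1", "protein_A"): 0.9,
    ("compound_1", "protein_B"): 0.4,
    ("compound_1", "protein_C"): 0.8,
}

# Hypothetical expression levels: protein -> {tissue: relative level}.
expression = {
    "protein_A": {"brain": 0.95, "liver": 0.05},
    "protein_B": {"liver": 0.90},
    "protein_C": {"brain": 0.10, "kidney": 0.70},
}


def likely_tissues(compound, scores, expr, min_score=0.7, min_expr=0.5):
    """Map tissues to the strongly docked proteins highly expressed there."""
    tissues = defaultdict(list)
    for (cmpd, protein), score in scores.items():
        if cmpd != compound or score < min_score:
            continue  # wrong compound, or predicted binding too weak
        for tissue, level in expr.get(protein, {}).items():
            if level >= min_expr:
                tissues[tissue].append(protein)
    return dict(tissues)


hits = likely_tissues("compound_1", docking_scores, expression)
print(hits)
```

Here the hypothetical compound is flagged for the brain and the kidney, because those are the tissues in which its two strongly docked proteins are highly expressed.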

Pharmaceutical companies have been doing similar computational predictions for years, says Jeremy Jenkins, a researcher at the Novartis Institutes. But he says that Novartis, which has a library of 1.5 million public and proprietary compounds, has never attempted to analyse as many proteins and drugs at once as Drugable has done.

Cardozo hopes that Drugable will be particularly helpful in evaluating psychiatric drugs, which often act in ways that are difficult to measure. As a demonstration, Cardozo’s group applied Drugable’s algorithm to clozapine and chlorpromazine, two drugs often prescribed to treat schizophrenia.

As expected, Drugable showed that the two drugs bind most strongly to receptors for the neurotransmitters serotonin and dopamine, which are expressed in the parts of the brain involved in higher information processing. But it found that clozapine, which also stabilizes mood disorders such as depression, binds strongly to a particular dopamine receptor called DRD4, which is expressed in the brain’s pineal gland — a known mood regulator.

The group also found that clozapine binds to a receptor in the part of the brain that regulates saliva production; excessive salivation is a known side effect of clozapine. Although the biochemical explanations for mood regulation and salivation have been proposed before, Cardozo says that Drugable can be used to reveal the most plausible mechanisms.