  • ADVERTISEMENT FEATURE Advertiser retains sole responsibility for the content of this article

Predicting protein structure unveils the shape of drugs to come

Advances in computational modeling let researchers screen out thousands of compounds, leaving only those that fit their target protein well. Credit: Schrödinger

Finding new therapeutics often demands more than ten years of research and hundreds of millions of dollars. To develop a small-molecule drug, researchers spend years combing through thousands of organic compounds, searching for those few that have the potential to alter a disease process. They then work to optimize the potency and selectivity of those candidate molecules. These early steps can take up to seven years — and clinical trials cannot start until they’re complete.

Now recent breakthroughs in computational modeling could cut years off the drug discovery process. Advanced physics-based computational methods and machine learning offer clear pictures of proteins and how they interact at the atomic level with small molecules. This helps researchers home in early on the most promising compounds. When coupled with dramatic advances in computing power, these methods can slash the number of compounds that need to be synthesized and tested in the lab, and can ultimately lead to better quality clinical candidates.

These developments are turning computational modeling into a central driving force in early drug discovery, says Enrico Malito, executive director of structural biology at Schrödinger, a developer of software for computationally driven drug discovery. “It is truly revolutionary what can happen next.”

Proteins in 3-D

To be effective as a drug, an organic compound must typically fit into a cleft or pocket of a protein and alter its biological activity. Historically, medicinal chemists involved in drug development synthesized large libraries of these compounds for laboratory and animal testing. As advances in x-ray crystallography, the traditional method of determining protein structures, made available the first large wave of three-dimensional protein structures in the 1980s and 1990s, medicinal chemists began using computational modeling to predict which organic compounds might bind to them and block or boost their activity.

But not all target proteins were receptive to this method, called structure-based drug design. The 3-D structure of many proteins remained unavailable, and many important proteins could not be crystallized and analysed using x-ray crystallography. What’s more, even when a protein structure was available, for many years computational models that predict how proteins bind small molecules, or ligands, were not particularly accurate. This caused skepticism and frustration, Malito says.

Over the last decade, technological advances in a method called cryogenic electron microscopy (cryo-EM) led to a proliferation of 3-D protein structures: about 15,000 of the more than 17,000 cryo-EM protein structures known today were determined in that period. In cryo-EM, proteins are flash-frozen, then probed with electrons to produce images, molecule by molecule. This allows scientists to see both the stable parts of proteins and the parts that tend to shapeshift and wiggle around, which can affect how a protein functions biologically.

Another breakthrough came in July 2021, when DeepMind Technologies in London released an artificial intelligence algorithm called AlphaFold that predicts a protein’s structure from its sequence of amino acids. The initial release contained 3-D structures of 350,000 proteins, 250,000 of which were previously unknown, including every human protein and many made by mice, fruit flies, E. coli bacteria and other organisms.

This new trove of protein structures generated enormous scientific excitement. Because a protein’s structure helps determine its biological function, the new structures, along with those available from cryo-EM, could revolutionize biology and drug discovery.

The amino acid sequence of a protein determines its characteristic 3-D shape. Artificial intelligence and physics-based computational modeling can now help predict it. Credit: Christoph Burgstedt/SPL/Getty Images

Coming into focus

To be useful for drug discovery, a 3-D protein structure must be in sharp focus, meaning that the position of each atom in the protein is known. To confidently design molecules that target a protein, chemists need a resolution of about 2.5 angstroms, says Bruce Rogers, chief scientific officer at Morphic Therapeutics, a Waltham, Massachusetts, biotechnology company.

Today, however, most cryo-EM studies resolve protein structures only to about three or four angstroms, leaving a picture that’s too blurry for structure-based drug design. And despite AlphaFold’s breakthroughs, portions of its protein structure predictions also remain slightly out of focus, in part because machine-learning algorithms traditionally require datasets containing millions of examples for training — a volume of structural data that doesn’t exist.

Physics-based modeling software such as Schrödinger’s can take imperfect structural information and give scientists a better idea of which molecules may fit into proteins of interest. The software takes into account both the shape and the atomic makeup of the proteins, and examines physical forces, such as electrostatics, that affect protein-protein or protein-ligand binding. Medicinal chemists can then synthesize the promising molecules with greater confidence that one of them will be closer to what they’re looking for.
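As a rough illustration of the kind of physical term such models evaluate, the sketch below sums Coulomb interactions between point charges in a protein pocket and on a ligand. It is a toy under stated assumptions, not Schrödinger's method: the function name, the charges, the coordinates and the constant dielectric are all hypothetical, and real software combines many more terms (solvation, van der Waals forces, flexibility).

```python
import math

# Approximate Coulomb constant in kcal*angstrom/(mol*e^2),
# a value commonly used in molecular mechanics.
COULOMB_KCAL = 332.06

def electrostatic_energy(protein_atoms, ligand_atoms, dielectric=1.0):
    """Sum pairwise Coulomb energies (kcal/mol) between two sets of point charges.

    Each atom is (charge_in_e, (x, y, z)), with coordinates in angstroms.
    """
    total = 0.0
    for q_p, pos_p in protein_atoms:
        for q_l, pos_l in ligand_atoms:
            r = math.dist(pos_p, pos_l)  # distance between the two atoms
            total += COULOMB_KCAL * q_p * q_l / (dielectric * r)
    return total

# Hypothetical charges and positions, chosen only to show the sign convention.
pocket = [(-1.0, (0.0, 0.0, 0.0))]
attracted = [(+1.0, (3.0, 0.0, 0.0))]
repelled = [(-1.0, (3.0, 0.0, 0.0))]

print(electrostatic_energy(pocket, attracted))  # negative: favourable
print(electrostatic_energy(pocket, repelled))   # positive: unfavourable
```

Opposite charges give a negative (favourable) energy and like charges a positive one, which is one reason a compound with the right shape can still bind poorly.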

Once these ‘hits’ are identified and narrowed into a few lead compounds, medicinal chemists typically synthesize variants of those compounds, adding a methyl group here or a hydroxyl there. They then screen them in laboratory experiments to find those with a mix of properties that make them drug-like, such as potency, safety, specificity, stability and solubility in the gut. Advanced physics-based modeling can speed this lead optimization process. “If I can throw out 1,800 molecules [out of] the 2,000 that we model, we can make 200 much faster than we can make 2,000,” Rogers says.
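The triage Rogers describes can be sketched in a few lines: rank modeled compounds by a predicted score and synthesize only the best. The compound names, the made-up scores and the `select_for_synthesis` helper below are assumptions for illustration; the sketch also assumes that a lower score means a better predicted fit.

```python
def select_for_synthesis(predicted_scores, keep):
    """Return the names of the `keep` best-scoring compounds (lower is better)."""
    ranked = sorted(predicted_scores, key=predicted_scores.get)
    return ranked[:keep]

# Made-up scores for 2,000 hypothetical compounds (not real data).
scores = {f"cmpd_{i:04d}": ((i * 37) % 2000) / 100.0 - 12.0 for i in range(2000)}

shortlist = select_for_synthesis(scores, keep=200)
discarded = len(scores) - len(shortlist)
print(len(shortlist), discarded)  # 200 1800
```

Only the shortlist goes to the bench, which is where the time saving in the quote comes from.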

Sweeping chemical space

The increased power of physics-based modeling comes down in large part to raw computing power, and progress on two fronts has made this possible. The first front consists of advances in graphics processing units — chips that allow parallel processing. These have dramatically sped up computation, performing in minutes calculations that used to take days on a dedicated supercomputer. “GPUs were game-changing in enabling large-scale computation to drive drug discovery,” says Robert Abel, chief computational scientist at Schrödinger.

The second front is cloud computing. GPUs can be called upon as needed in the cloud, making such computing more affordable and accessible. The increased power from GPUs and cloud computing allows researchers to do a broad sweep of chemical space that was once impossible, Abel says. That can lead to non-intuitive molecules, and fewer, better-targeted drug candidates.

Despite the combined power of real-world structures and in silico modeling, drug discovery researchers still need the old-fashioned synthetic organic chemistry expertise that medicinal chemists have always provided. But those chemists can be a lot more focused and creative in their efforts, exploring areas of chemistry that would previously have been too risky to pursue.

And as a result of these advances, drug discovery programmes that may once have had to synthesize thousands of compounds for testing may now need to make only a few hundred. This reduces the laborious early stages of drug discovery from about six years to fewer than three, and better computational methods could eventually cut that down to just a year, Malito says.

“If we have structures, we can reduce the costs and we can reduce the timelines because we optimize what we design,” Malito says. “Without structures, that programme might take forever.”

Learn how physics-based modeling and machine learning can accelerate drug discovery at this dedicated resource from Schrödinger.

