

Exploring biological space

Sophie Petit-Zeman

The mind-boggling number of compounds with characteristics of drug-like molecules that can be synthesized means that many tools are needed to select those compounds of greatest biological benefit.

Given that the theoretical number of chemicals with drug-like characteristics that could be synthesized is around 10^60 (BOX 1), it is unsurprising that pharmaceutical companies expend a great deal of effort exploring uncharted chemical space to find new compounds. In the 1990s, companies embraced techniques such as combinatorial chemistry and became fixated on generating as many compounds as possible through such methods. Many researchers in the field argued that creating more compounds meant a better chance of finding leads that could be developed into drugs.

Yet despite the huge increase of chemicals in companies' libraries that are being screened - hundreds of thousands in the case of most big pharmaceutical companies - the number of drugs reaching the market has not increased accordingly.

As Carl Decicco, head of discovery chemistry at Bristol-Myers Squibb, told the Wall Street Journal earlier this year, "You end up making things you can make, rather than what you should make." And churning out as many chemical combinations as possible and testing them in automated assays in the hope that some might do something useful seems to have been pursued without enough attention to small but crucial details.

Reinventing the wheel

Companies soon realized that they had to rethink their approach. They needed to adopt smarter ways to create and test compounds and reduce the number of compounds lying redundant on laboratory shelves. What seems to be happening now is a marriage, or at least a sensible engagement, between the technology that can create and test vast numbers of possible drug molecules, and more tailored approaches to drug discovery.

Bringing these two together is the seminal work of Christopher Lipinski, who recently retired from Pfizer after 32 years with the company and is set to receive, in July, the 2004 Division of Medicinal Chemistry Award of the American Chemical Society. His now-famous 'rule-of-five' analysis was developed to help chemists identify compounds that are likely to have poor oral availability (BOX 2).

Lipinski's rules were almost immediately embraced by the pharmaceutical industry, and screening methods extending the principles of Lipinski's rules, such as Rapid Elimination Of Swill, have been developed by companies to further assess the drug-like properties of their compounds.

Content is king

As important, if not more so, is finding assays that can distinguish theoretically promising molecules from a whole host of compounds. In the past, companies used high-throughput screening methods to identify compounds of interest, but these were mainly directed towards affinity and stability of compounds and not their drug-like characteristics. What was needed was so-called 'high-content screening', in which information from several assays could be used to determine the therapeutic potential of any compound.

Brent Stockwell, who recently moved from the Whitehead Institute to the Department of Biological Sciences at Columbia University, is at the forefront of developing these assays. He is developing and using screening assays for compounds that are useful to basic scientists probing cellular processes as well as to drug developers. "High-throughput screens have traditionally relied on relatively simple protein-based assays, such as those whose end product is a spectroscopic change - fluorescence, luminescence, absorption," says Stockwell. "More recent work has focused on finding ways to capture more information."

Examples of this include his work on whole-cell assays to investigate enhancers and suppressors of neurotoxic agents - of relevance to basic science as well as to drug development. One of Stockwell's favoured assays involves using 'viability dyes' that selectively stain live versus dead cells, enabling high-throughput detection of toxic compounds or, in this post-genomic age, alleles, their enhancers and suppressors. For example, a mutant allele of the huntingtin gene, which is linked to Huntington's disease, kills differentiated ST14A neuronal cells. Mutants of the free-radical scavenging enzyme superoxide dismutase (SOD), which is implicated in familial amyotrophic lateral sclerosis (ALS; motor neuron or Lou Gehrig's disease), kill oxidatively stressed N2A cells.

Viability assays can also be used when the fatal allele's mechanism of action is indirect, such as when the mutant SOD1 allele acts together with glutamate, which results in motor neuron loss. But here, viability assays meet a hurdle: how can you be sure that SOD1 and glutamate are acting synergistically, rather than simply additively? As Stockwell says, "It's important to check that this apparent action is not replicated when, for example, glutamate is in the soup with other weakly lethal stimuli, be they toxic alleles or small molecules. If these 'counter screens' reveal that glutamate only causes increased toxicity when mutant SOD is present, there is likely to be a functional connection between the two." Other phenotypic assays include measuring levels of a specific gene product to better understand its activity, from regulation to degradation.
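One way to put a number on the "synergistic rather than additive" question is to compare the observed kill fraction with what would be expected if the two stimuli acted independently. The sketch below uses the Bliss independence model for that expectation; this model is not mentioned in the article and is swapped in here purely as an illustration. All kill fractions are hypothetical values invented for the example, and the function names are not taken from any published tool.

```python
def bliss_expected(fraction_a, fraction_b):
    """Expected combined kill fraction if two stimuli act independently.

    Bliss independence: E = a + b - a * b, with a and b the fractions
    of cells killed by each stimulus alone (both between 0 and 1).
    """
    for fraction in (fraction_a, fraction_b):
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("kill fractions must lie between 0 and 1")
    return fraction_a + fraction_b - fraction_a * fraction_b


def bliss_excess(observed, fraction_a, fraction_b):
    """Observed minus expected kill fraction; well above 0 hints at synergy."""
    return observed - bliss_expected(fraction_a, fraction_b)


# Hypothetical single-stimulus kill fractions (invented for illustration).
glutamate_alone = 0.20
mutant_sod_alone = 0.10
other_weak_stimulus_alone = 0.10  # stand-in for a counter-screen stimulus

# Hypothetical observed kill fractions for each combination.
screen = bliss_excess(0.60, glutamate_alone, mutant_sod_alone)
counter_screen = bliss_excess(0.29, glutamate_alone, other_weak_stimulus_alone)

print(f"glutamate + mutant SOD excess:    {screen:.2f}")
print(f"glutamate + other stimulus excess: {counter_screen:.2f}")
```

In this made-up data set the excess is large only when mutant SOD is present, which is the pattern the counter screens described above are designed to reveal. Real screens would also need replicates and an error estimate before reading anything into the excess.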

Microscopy-based screens are an example of an emerging and even more easily visualized phenotypic assay, which can detect changes in the subcellular localization of proteins or in the morphology of cells. Although slower than plate-based screens, they can be automated for increased reliability and greater throughput, and they can reveal more complex phenotypes, such as neurite outgrowth, and even detect specific proteins within growth cones.

With the creation of all these assays, it is important to ensure that the quality of assays can be compared directly, so that promising compounds can be selected with confidence. The 'Z-factor' developed by Zhang and colleagues at DuPont is a dimensionless, simple statistical parameter that provides the means for comparison and evaluation of assay quality. The Z-factor reflects the assay signal dynamic range and the variation associated with the signal measurements; a value between 0.5 and 1 indicates an excellent assay.
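The statistic can be sketched in a few lines. The formula below is the one given by Zhang and colleagues (see Further Reading): one minus three times the summed standard deviations of the two signal groups, divided by the absolute difference of their means. The function name and the plate-reader values are hypothetical, chosen only to show the calculation.

```python
from statistics import mean, stdev


def z_factor(group_a, group_b):
    """Z-factor for two groups of signal readings.

    Z = 1 - 3 * (sd_a + sd_b) / |mean_a - mean_b|
    Uses the sample standard deviation of each group.
    """
    separation = abs(mean(group_a) - mean(group_b))
    if separation == 0:
        raise ValueError("group means are identical; Z-factor is undefined")
    return 1 - 3 * (stdev(group_a) + stdev(group_b)) / separation


# Hypothetical plate-reader signals, invented for illustration only.
high_signal_wells = [98.0, 101.0, 100.0, 102.0, 99.0]
low_signal_wells = [10.0, 12.0, 9.0, 11.0, 8.0]

z = z_factor(high_signal_wells, low_signal_wells)
print(f"Z-factor: {z:.2f}")
print("excellent assay" if 0.5 <= z <= 1 else "needs optimization")
```

For these invented readings the value comes out at about 0.89, inside the 0.5 to 1 band. When both groups are positive and negative controls rather than test samples, the same expression is usually called Z'.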

Character studies

Although whole-cell assays do provide useful information, such phenotypic assays often don't tell you how 'compound X' actually works, or reveal its protein targets and downstream effectors. A step up from cell assays is to use model organisms. If model organisms, such as yeast, worms and flies, show the phenotype of interest, they can be used to screen compounds for activity. But such approaches come with a caveat: they might not be reliable models for how the compound will act in humans, and therefore do not yield entirely useful results. For example, using yeast cells as human drug screens has come under question for a number of reasons: yeast membranes are considerably less permeable to numerous compounds than those of human cells; only 10-20% of human genes have yeast orthologues; and the extent to which phenotypic characteristics of yeast really mimic many of those exhibited by, for example, neurons, is open to question.

Other emerging tools for studying how small molecules work and whether they're worth pursuing include transcriptional and proteomic profiling, which reveal the global molecular changes that the small molecules initiate. RNA interference, in which small fragments of RNA specifically and potently silence the expression of genes, can tell you still more about the effects of small molecules, as specific target transcripts (such as those identified in assays of global molecular change) can be altered to further reveal the nature of activity of test compounds. As yet, Stockwell says, "There aren't any drugs that have come from these assays, because they are so new. It takes 10-15 years to develop a drug from a hit in a screen, but we do have confirmed (unpublished) hits in all of these assays, some of which are with already approved drugs."

A structured approach

For compounds that are successful 'hits' in assays, how can this success be linked with chemical structure? Stockwell says this is a very interesting question. "There is no good informatics system for these types of screens. After trying to use some of the existing commercial software products, my lab became quite frustrated and we decided to create our own system, which allows us to track which chemicals are active in which assays and do chemoinformatic and bioinformatic analyses of the resulting data. It's taken us a couple of difficult years to develop this system, which we're writing up now, but I think it will greatly benefit those doing this kind of work."

It will be interesting to see whether anarchy reigns as emerging technologies are applied in the post-genomic era, which itself has been predicted to yield several hundred new drug targets. And as Stockwell points out, "One of the reasons that chemical biology is such a difficult field for new people to get into is that it requires the complete integration of chemistry and biology knowledge. Most biologists don't have the savvy to pick the right kinds of compounds to test in their assays, and most chemists can't pick the right assays to test their compounds in. To succeed in this area, you need to become both a chemist and a biologist, and few people have been able to be both."

Box 1 | Exploring chemical space

'Chemical space', the set of all possible molecular structures, is a familiar phrase to chemists but one that is difficult to accurately explain to many other scientists. Scientists are familiar with the three dimensions that a protein fills and the fourth dimension of time, but chemical space is much more complex in that it is multidimensional. The chemical space that a molecule fills depends on which set of 'dimensions', or 'descriptors', one chooses to define the molecule, such as surface area, charge and number of hydrogen bond donors or acceptors.

Estimates of how many molecular structures with the characteristics of drug-like compounds could be made vary widely - between 10^18 and 10^200 - depending on the type of descriptors chosen for the calculation. But in one of the more highly cited estimates, Regine Bohacek considers creating a linear compound from scratch, choosing carbon, oxygen or sulphur atoms to form a backbone of 30 members. Adding any stable chemical group onto the free bonds, and considering aspects that would produce greater chemical diversity, such as branching, recyclization and stereochemistry, gives an estimate in excess of 10^60 possible molecules.

For drug discovery companies, this is a tremendously enticing figure, as the number of molecules that have been synthesized up until now is a mere drop in the ocean compared with the total number of possibilities - for example, the Beilstein database, which covers organic chemistry from 1779 to the present, contains 10^7 molecules. However, only a small proportion of these ~10^60 molecules will be therapeutically useful - most will be biologically inert or have a poor pharmacokinetic profile, usually defined by an ADME-Tox profile (the absorption, distribution, metabolism, excretion and toxicity of a compound).

If the intended target is well characterized (such as G-protein-coupled receptors (GPCRs) or kinases), potential compounds can be compared with compounds that have been developed successfully into drugs, or those with known activity against those targets. Such a comparison is shown in the figure below. Chemical structures from different target spaces appear to occupy certain areas. Biological space occupies discrete 'pockets' within chemical space, each pocket having statistically definable physico-chemical property limits. But importantly, in terms of drug design and development, not all biological space is coherent with ADME-Tox space. Figure kindly provided by Andrew Hopkins, Pfizer.



Box 2 | Lipinski's rule-of-five analysis
Christopher Lipinski's rule-of-five analysis helped to raise awareness about properties and structural features that make molecules more or less drug-like. The guidelines were quickly adopted by the pharmaceutical industry, as they helped to apply ADME considerations early in preclinical development and could help to avoid costly late-stage preclinical and clinical failures. The guidelines predict that poor absorption or permeation of an orally administered compound is more likely if the compound meets the following criteria:
  • Molecular mass greater than 500 Da
  • High lipophilicity (expressed as cLogP greater than 5)
  • More than 5 hydrogen bond donors
  • More than 10 hydrogen bond acceptors
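
The four criteria above translate directly into a simple filter. The following is a minimal sketch, assuming the descriptor values have already been calculated elsewhere; the class name, function name and both example compounds are hypothetical and do not describe real molecules.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Pre-computed properties of one compound (values supplied by the caller)."""
    molecular_mass_da: float
    clogp: float
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_flags(d):
    """Return the rule-of-five criteria that a compound meets."""
    flags = []
    if d.molecular_mass_da > 500:
        flags.append("molecular mass greater than 500 Da")
    if d.clogp > 5:
        flags.append("cLogP greater than 5")
    if d.h_bond_donors > 5:
        flags.append("more than 5 hydrogen bond donors")
    if d.h_bond_acceptors > 10:
        flags.append("more than 10 hydrogen bond acceptors")
    return flags


# Hypothetical descriptor values, invented for illustration only.
compound_a = Descriptors(350.4, 2.1, 2, 5)
compound_b = Descriptors(712.9, 6.3, 6, 12)

for name, compound in (("A", compound_a), ("B", compound_b)):
    flags = rule_of_five_flags(compound)
    print(f"compound {name}: {len(flags)} criteria met", flags)
```

Returning the list of criteria met, rather than a single pass or fail, leaves the decision about how many flags are tolerable to the chemist using the filter.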

 

 
 

Further Reading

Bleicher, K. H., Bohm, H. J., Muller, K. & Alanine, A. I. Hit and lead generation: beyond high-throughput screening. Nature Rev. Drug Discov. 2, 369-378 (2003).

Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Del. Rev. 23, 3-25 (1997).

Root, D. E., Kelley, B. P. & Stockwell, B. R. Global analysis of large-scale chemical and biological experiments. Curr. Opin. Drug Discov. Devel. 5, 355-360 (2002).

Root, D. E. et al. Biological mechanism profiling using an annotated compound library. Chem. Biol. 10, 881-892 (2003).

Zhang, J. H., Chung, T. D. & Oldenburg, K. R. A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J. Biomol. Screen. 4, 67-73 (1999).

 
 
 
   