Exploring biological space
Sophie Petit-Zeman
The mind-boggling number of compounds with characteristics of drug-like molecules that can be synthesized means that many tools are needed to select those compounds of greatest biological benefit.
Given that the theoretical number of chemicals
with drug-like characteristics that could be synthesized
is around 10^60 (BOX 1), it is unsurprising
that pharmaceutical companies expend a great deal of effort
exploring uncharted chemical space to find new compounds.
In the 1990s, companies embraced techniques such as combinatorial
chemistry and became fixated on generating as many compounds
as possible through such methods. More compounds created
meant more chance of finding leads that could be developed
into drugs, said many researchers in the field.
Yet despite the huge increase in the number of chemicals in the libraries that companies screen - hundreds of thousands in the case of most big pharmaceutical companies - the number of drugs reaching the market has not increased accordingly.
As Carl Decicco, head of discovery chemistry at Bristol-Myers
Squibb, told the Wall Street Journal earlier this
year, "You end up making things you can make, rather than
what you should make." And churning out as many chemical
combinations as possible and testing them in automated assays
in the hope that some might do something useful seems to
have been pursued without enough attention to small but
crucial details.
Reinventing the wheel
Companies soon realized that they had to rethink their approach. They needed to adopt smarter ways to create and test compounds and reduce the number of compounds lying redundant on laboratory shelves. What seems to be happening now is a marriage, or at least a sensible engagement, between the technology that can create and test vast numbers of possible drug molecules, and more tailored approaches to drug discovery.
Bringing these two together is the seminal work of Christopher
Lipinski, who recently retired from Pfizer after 32 years
with the company and is set to receive, in July, the 2004
Division of Medicinal Chemistry Award of the American Chemical
Society. His now-famous 'rule-of-five' analysis was developed
to help chemists identify compounds that are likely to have
poor oral availability (BOX 2).
Lipinski's rules were almost immediately embraced by the pharmaceutical industry, and screening methods extending the principles of Lipinski's rules, such as Rapid Elimination Of Swill, have been developed by companies to further assess the drug-like properties of their compounds.
Content is king
As important, if not more so, is finding assays that can distinguish theoretically promising molecules from a whole host of compounds. In the past, companies used high-throughput screening methods to identify compounds of interest, but these were mainly directed towards affinity and stability of compounds and not their drug-like characteristics. What was needed was so-called 'high-content screening', in which information from several assays could be used to determine the therapeutic potential of any compound.
Brent Stockwell, who recently moved from the Whitehead Institute to the Department of Biological Sciences at Columbia University, is at the forefront of developing these assays. He is developing and using screening assays for compounds that are useful to basic scientists probing cellular processes as well as to drug developers. "High-throughput screens have traditionally relied on relatively simple protein-based assays, such as those whose end product is a spectroscopic change - fluorescence, luminescence, absorption," says Stockwell. "More recent work has focused on finding ways to capture more information."
Examples of this include his work on whole-cell assays to investigate
enhancers and suppressors of neurotoxic agents - of relevance
to basic science as well as to drug development. One of
Stockwell's favoured assays involves using 'viability dyes'
that selectively stain live versus dead cells, enabling
high-throughput detection of toxic compounds or, in this
post-genomic age, alleles, their enhancers and suppressors.
For example, a mutant allele of the huntingtin gene,
which is linked to Huntington's disease, kills differentiated
ST14A neuronal cells. Mutants of the free-radical scavenging
enzyme superoxide dismutase (SOD), which is implicated in
familial amyotrophic lateral sclerosis (ALS; motor neuron
or Lou Gehrig's disease), kill oxidatively stressed N2A
cells.
Viability assays can also be used when the fatal allele's mechanism
of action is indirect, such as when the mutant SOD1
allele acts together with glutamate, which results in motor
neuron loss. But here, viability assays meet a hurdle: how
can you be sure that SOD1 and glutamate are acting synergistically,
rather than simply additively? As Stockwell says, "It's
important to check that this apparent action is not replicated
when, for example, glutamate is in the soup with other weakly
lethal stimuli, be they toxic alleles or small molecules.
If these 'counter screens' reveal that glutamate only causes
increased toxicity when mutant SOD is present, there is
likely to be a functional connection between the two." Other
phenotypic assays include measuring levels of a specific
gene product to better understand its activity, from regulation
to degradation.
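The counter-screen logic that Stockwell describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stimulus names, the viability values and the 0.2 cut-off are all hypothetical, and the sketch assumes that each viability is a fraction of live cells already normalized to the matching control well, so that glutamate's own toxicity has been divided out.

```python
def glutamate_sensitized(viability_alone, viability_with_glutamate, min_drop=0.2):
    """Return the stimuli whose viability falls by at least min_drop
    when glutamate is added (min_drop is an arbitrary, hypothetical cut-off)."""
    return sorted(
        name
        for name, alone in viability_alone.items()
        if alone - viability_with_glutamate[name] >= min_drop
    )


def looks_specific(hits, candidate):
    """True only if the candidate is the sole stimulus sensitized by glutamate."""
    return hits == [candidate]


# Hypothetical normalized viabilities (fraction of live cells).
alone = {"mutant SOD1": 0.90, "weak toxin A": 0.85, "toxic allele B": 0.88}
with_glutamate = {"mutant SOD1": 0.40, "weak toxin A": 0.82, "toxic allele B": 0.85}

hits = glutamate_sensitized(alone, with_glutamate)
print(hits, looks_specific(hits, "mutant SOD1"))
```

With these made-up numbers, only the mutant SOD1 well loses viability when glutamate is added, which is the pattern that would point to a functional connection rather than a general additive effect.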
Microscopy-based screens are an example of an emerging and even more easily visualized phenotypic assay, which can detect changes in the subcellular localization of proteins or in the morphology of cells. Although slower than plate-based screens, they can be automated for increased reliability and greater throughput, and can reveal more complex phenotypes, such as neurite outgrowth, or even detect specific proteins within growth cones.
With the creation of all these assays, it is important to ensure that the quality of assays can be compared directly, so that promising compounds can be selected with confidence. The 'Z-factor' developed by Zhang and colleagues at DuPont is a dimensionless, simple statistical parameter that provides the means for comparison and evaluation of assay quality. The Z-factor reflects the assay signal dynamic range and the variation associated with the signal measurements; a value between 0.5 and 1 indicates an excellent assay.
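As a rough illustration, the Z-factor defined by Zhang et al. (1999) is Z = 1 - 3(SD_sample + SD_control) / |mean_sample - mean_control|. The Python sketch below computes it from two lists of plate readings. The function name and the readings are invented for illustration, and the use of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def z_factor(sample_readings, control_readings):
    """Z = 1 - 3 * (SD_sample + SD_control) / |mean_sample - mean_control|."""
    separation = abs(mean(sample_readings) - mean(control_readings))
    if separation == 0:
        raise ValueError("sample and control means are identical")
    spread = stdev(sample_readings) + stdev(control_readings)
    return 1.0 - 3.0 * spread / separation


# Hypothetical raw readings from one plate.
sample = [100, 102, 98, 101, 99]
control = [10, 11, 9, 10, 10]

z = z_factor(sample, control)
print(round(z, 2))  # prints 0.92
```

A wide gap between the two means and tight replicate readings push the value towards 1; noisy or overlapping signals pull it towards or below 0.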
Character studies
Although whole-cell assays do provide useful information, such phenotypic assays often don't tell you how 'compound X' actually works, or reveal its protein targets and downstream effectors. A step up from cell assays is to use model organisms. If model organisms, such as yeast, worms and flies, show the phenotype of interest, they can be used to screen compounds for activity. But such approaches come with a caveat: they might not be reliable models for how the compound will act in humans, and so might not yield entirely useful results. For example, the use of yeast cells to screen for human drugs has come under question for a number of reasons: yeast membranes are considerably less permeable to numerous compounds than those of human cells; only 10-20% of human genes have yeast orthologues; and the extent to which phenotypic characteristics of yeast really mimic many of those exhibited by, for example, neurons, is open to question.
Other emerging tools for studying how small molecules work and whether they're worth pursuing include transcriptional and proteomic profiling, which reveal the global molecular changes that the small molecules initiate. RNA interference, in which small fragments of RNA specifically and potently silence the expression of genes, can tell you still more about the effects of small molecules, as specific target transcripts (such as those identified in assays of global molecular change) can be altered to further reveal the nature of activity of test compounds. As yet, Stockwell says, "There aren't any drugs that have come from these assays, because they are so new. It takes 10-15 years to develop a drug from a hit in a screen, but we do have confirmed (unpublished) hits in all of these assays, some of which are with already approved drugs."
A structured approach
For compounds that are successful 'hits' in assays, how can this success be linked with chemical structure? Stockwell says this is a very interesting question. "There is no good informatics system for these types of screens. After trying to use some of the existing commercial software products, my lab became quite frustrated and we decided to create our own system, which allows us to track which chemicals are active in which assays and do chemoinformatic and bioinformatic analyses of the resulting data. It's taken us a couple of difficult years to develop this system, which we're writing up now, but I think it will greatly benefit those doing this kind of work."
It will be interesting to see whether anarchy reigns as emerging
technologies are applied in the post-genomic era, which
itself has been predicted to yield several hundred new drug
targets. And as Stockwell points out, "One of the reasons
that chemical biology is such a difficult field for new
people to get into is that it requires the complete integration
of chemistry and biology knowledge. Most biologists don't
have the savvy to pick the right kinds of compounds to test
in their assays, and most chemists can't pick the right
assays to test their compounds in. To succeed in this area,
you need to become both a chemist and a biologist, and few
people have been able to be both."
Box 1 | Exploring chemical space.
'Chemical space', the set of all possible molecular structures, is a familiar phrase to chemists but one that is difficult to accurately explain to many other scientists. Scientists are familiar with the three dimensions that a protein fills and the fourth dimension of time, but chemical space is much more complex in that it is multidimensional. The chemical space that a molecule fills depends on which set of 'dimensions', or 'descriptors', one chooses to define the molecule, such as surface area, charge and number of hydrogen bond donors or acceptors.
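One way to picture this is to treat each molecule as a point whose coordinates are its descriptor values. The Python sketch below is a minimal illustration only: the four descriptors, the two molecules and the scaling factors are all hypothetical, and the scaled Euclidean distance is just one of many possible ways to compare positions in such a space.

```python
from math import sqrt

# Hypothetical choice of descriptor axes; a different choice gives a different space.
DESCRIPTORS = ("polar_surface_area", "formal_charge", "h_bond_donors", "h_bond_acceptors")


def descriptor_distance(mol_a, mol_b, scales):
    """Scaled Euclidean distance between two molecules in descriptor space."""
    return sqrt(
        sum(((mol_a[d] - mol_b[d]) / scales[d]) ** 2 for d in DESCRIPTORS)
    )


# Two made-up molecules and made-up per-axis scales (so no single axis dominates).
mol_a = {"polar_surface_area": 60.0, "formal_charge": 0, "h_bond_donors": 1, "h_bond_acceptors": 4}
mol_b = {"polar_surface_area": 120.0, "formal_charge": 0, "h_bond_donors": 3, "h_bond_acceptors": 8}
scales = {"polar_surface_area": 60.0, "formal_charge": 1.0, "h_bond_donors": 2.0, "h_bond_acceptors": 4.0}

print(descriptor_distance(mol_a, mol_b, scales))
```

Changing the descriptor list or the scales moves every molecule, which is why estimates of the size and shape of chemical space depend so heavily on the descriptors chosen.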
Estimates for how many molecular
structures can be made that have characteristics of
drug-like compounds vary widely, between 10^18
and 10^200, depending on the type of descriptors
chosen for the calculation. But in one of the more
highly cited estimates, Regine Bohacek considers creating
a linear compound from scratch, choosing a carbon,
oxygen or sulphur atom to form the backbone to the
molecule of 30 members. Adding any stable chemical
group onto the free bonds, and considering aspects
that would produce greater chemical diversity, such
as branching, recyclization and stereochemistry, gives
an estimate in excess of 10^60 possible
molecules.
For drug discovery companies, this is a tremendously enticing figure, as the number of molecules that has been synthesized up until now is a mere drop in the ocean compared with the total number of possibilities - for example, the Beilstein database, which covers organic chemistry from 1779 to the present, contains 10^7 molecules. However, only a small proportion of these ~10^60 molecules will be therapeutically useful - most will be biologically inert or have a poor pharmacokinetic profile, usually defined by an ADME-Tox profile (the absorption, distribution, metabolism, excretion and toxicity of a compound).
If the intended target is well characterized (such as G-protein-coupled receptors (GPCRs) or kinases), potential compounds can be compared with compounds that have been developed successfully into drugs, or those with known activity against those targets. Such a comparison is shown in the figure below. Chemical structures from different target spaces appear to occupy certain areas. Biological space occupies discrete 'pockets' within chemical space, each pocket having statistically definable physico-chemical property limits. But importantly, in terms of drug design and development, not all biological space is coherent with ADME-Tox space. Figure kindly provided by Andrew Hopkins, Pfizer.
Box 2 | Lipinski's rule-of-five analysis
Christopher Lipinski's rule-of-five analysis helped
to raise awareness about properties and structural features
that make molecules more or less drug-like. The guidelines
were quickly adopted by the pharmaceutical industry
as they helped apply ADME considerations early in preclinical
development and could help avoid costly late-stage preclinical
and clinical failures. The guidelines predict that poor
absorption or permeation of an orally administered compound
is more likely if the compound meets the following
criteria:
- Molecular mass greater than 500 Da
- High lipophilicity (expressed as cLogP greater than 5)
- More than 5 hydrogen bond donors
- More than 10 hydrogen bond acceptors
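The four criteria above translate directly into a simple filter. The Python sketch below is a minimal illustration, assuming that the descriptor values have already been calculated elsewhere; the function name and the example compound are hypothetical, and the sketch reports which criteria are met rather than deciding how many flags should rule a compound out.

```python
def rule_of_five_flags(mass_da, clogp, h_bond_donors, h_bond_acceptors):
    """Return the Box 2 criteria that a compound meets (empty list if none)."""
    flags = []
    if mass_da > 500:
        flags.append("molecular mass greater than 500 Da")
    if clogp > 5:
        flags.append("cLogP greater than 5")
    if h_bond_donors > 5:
        flags.append("more than 5 hydrogen bond donors")
    if h_bond_acceptors > 10:
        flags.append("more than 10 hydrogen bond acceptors")
    return flags


# Hypothetical compound: heavy and lipophilic, but with few hydrogen-bonding groups.
flags = rule_of_five_flags(mass_da=612.7, clogp=6.1, h_bond_donors=2, h_bond_acceptors=7)
for flag in flags:
    print(flag)
```

Note that values exactly at a limit (for example, a mass of exactly 500 Da) are not flagged, which follows the 'greater than' and 'more than' wording of the criteria.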
Further Reading
Bleicher, K. H., Bohm, H. J., Muller, K. & Alanine, A. I. Hit and lead generation: beyond high-throughput screening. Nature Rev. Drug Discov. 2, 369-378 (2003).
Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Del. Rev. 23, 3-25 (1997).
Root, D. E., Kelley, B. P. & Stockwell, B. R. Global analysis of large-scale chemical and biological experiments. Curr. Opin. Drug Discov. Devel. 5, 355-360 (2002).
Root, D. E. et al. Biological mechanism profiling using an annotated compound library. Chem. Biol. 10, 881-892 (2003).
Zhang, J. H., Chung, T. D. & Oldenburg, K. R. A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J. Biomol. Screen. 4, 67-73 (1999).