Cellzome uses reverse phase microarrays to study subproteomes. (Image courtesy of Cellzome.)

For years, scientists in academic labs and small biotechnology companies have suffered from throughput envy. Peering over the wall at their colleagues in big pharmaceutical firms, they saw massive banks of multiwell plates, vast libraries of novel chemicals and armies of robots processing thousands of experiments daily. The concept of high-throughput screening was appealing, but the budgets and staffs of these operations remained far out of reach for most basic scientists.

Now, declining equipment prices, the opening of a major national screening center and new high-throughput core facilities at some universities are feeding the development of a new field, variously called chemical genomics, chemical genetics or chemical proteomics. Practitioners disagree on the terminology (most just apply their favorite term to their own work), but the common theme is the use of high-throughput small-molecule assays to glean fundamental biological insights rather than direct pharmaceutical leads.

Papers over pills

In industry, high-throughput chemical screens typically entail purifying a specific protein target, then testing the company's compound library to identify hits that change that protein's activity. The protein targets themselves are chosen on the basis of biological activity, perceived 'druggability', and the company's existing intellectual property covering particular drugs and targets.

Although basic scientists may envy the pharmaceutical companies' gear, they have approached high-throughput screening with a different set of priorities. “In academia, you have the luxury to get rewarded for your work even if it doesn't bring you to solid [intellectual property],” says Andrei Gudkov, Scientific Director of the Small Molecule Screening Core at the Cleveland Clinic in Cleveland, Ohio, USA.

Chemical 'omics, which occurs primarily in places like Gudkov's academic core facility, begins instead with a general biological phenomenon, such as cancer cell division or DNA replication. Researchers develop a suitable high-throughput assay for the process, often using whole cells or cell fractions rather than purified proteins, then screen libraries of compounds to see which ones affect it. These hits then serve as laboratory tools to probe the biology.

The approach is less direct than testing a single protein target, but Gudkov argues that it is less biased. “Sometimes your favorite molecule was picked from a plethora of others ... not based on objective reasons but simply reflecting the history of the subject,” he says. Testing the effects of novel chemicals on whole cells removes that bias, potentially revealing entirely new targets.

Indeed, Gudkov now has the data to back that claim: in 2005, he and his colleagues used chemical genomics to uncover a new tumor suppression mechanism used by the much-studied protein p53 (ref. 1). That work, along with other successes in the field, has spurred more academic researchers to dip their toes into the high-throughput pool.

Although several major research centers now have screening core facilities, they do not generally offer turnkey solutions; one does not simply drop off an assay at the core facility and pick up useful hits a few months later. “It never works like this for a screening facility,” says Gudkov.

Instead, researchers should expect to spend some time collaborating with the screening core to optimize the assay. Once the test is suitable for high-throughput techniques, the postdoc or graduate student leading the project will generally spend a few months working directly in the screening core, first with a small pilot set of compounds and then with a more comprehensive library.

Newcomers to chemical genomics are not the only ones facing a steep learning curve. With the field still in its infancy, granting agencies have not yet figured out how to assess proposals for academic high-throughput screening. Chemical genomics study sections at the US National Institutes of Health (NIH) include members familiar with the technique, but the standards are still a moving target.

At one recent study section meeting he attended, Gudkov says, “the most shocking thing for me ... was the complete lack of solid criteria for selection or evaluation of qualified proposals.” As the field becomes more established, experts expect that clearer standards will emerge.

A game of concentration

The NIH, for its part, is keen to see chemical genomics take off. As part of the recent Roadmap initiative, the Institutes have launched the NIH Chemical Genomics Center (NCGC), which is both a national-level core facility and a collection of 'centers of excellence' scattered around the country.

Mass spectrometry goes high-throughput. (Image courtesy of Cellzome.)

The main NCGC facility is the brainchild of two experienced high-throughput screeners who learned the field at Merck: Christopher Austin, NCGC Director, and James Inglese, who runs the NCGC Biomolecular Screening and Profiling Division. “Coming to the NIH, I sat around thinking ... 'what would I do differently if I were starting from scratch?',” says Inglese. Among other things, he decided that the new facility should pay more attention to pharmacology than pharmaceutical screeners do.

In a typical pharmaceutical high-throughput screen, researchers test millions of compounds in an assay, all at the same concentration. When screeners find hits this way, they must test the compounds' dose-response characteristics and toxicities separately, and the data are difficult to compare between screens. “There was a compromise along the way, and the decision was basically let's just forget about pharmacology ... just see what's active,” says Inglese.

At the NCGC, Inglese and his colleagues developed an alternative approach, which they call quantitative high-throughput screening (ref. 2). Testing each compound at seven or more concentrations, they obtain dose-response data and hits simultaneously. That septuples the size of the job, of course, but because the team was starting from scratch, they were able to use the latest miniaturization techniques, including 1,536-well plates and high-precision liquid handlers.
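To make concrete what a seven-point titration yields for a single compound, here is a minimal Python sketch that estimates a half-maximal activity concentration by interpolating on a log-concentration scale. The concentrations, the responses and the function name `estimate_ac50` are illustrative assumptions. The article does not describe the NCGC's actual curve-fitting method, and a real analysis would more likely fit a full sigmoidal model.

```python
import math

def estimate_ac50(concentrations, responses, half_max=50.0):
    """Estimate where the response crosses half_max (percent activity).

    Interpolates linearly in log10(concentration) between the two tested
    points that bracket half_max. Concentrations must be positive.
    Returns None if the titration never crosses half_max.
    """
    points = sorted(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        crosses = (r_lo - half_max) * (r_hi - half_max) <= 0
        if crosses and r_lo != r_hi:
            frac = (half_max - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical seven-point, five-fold dilution series (micromolar)
# and made-up percent-activity readings for one compound.
doses = [0.0032, 0.016, 0.08, 0.4, 2.0, 10.0, 50.0]
active = [2, 5, 12, 30, 70, 92, 98]
inactive = [1, 2, 1, 3, 2, 4, 3]

ac50_active = estimate_ac50(doses, active)      # about 0.894 (micromolar)
ac50_inactive = estimate_ac50(doses, inactive)  # None: no crossing in range
```

A single-concentration screen would record only one of these seven readings per compound, which is why potency has to be measured in a separate follow-up step.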

Driving the screening process is a robot built by GNF Systems and Kalypsys, and now marketed separately by both companies. Besides handling smaller volumes and denser plates than previous systems, the GNF-Kalypsys robot doubles as a compound-storage system. “We basically prepare the library and load it onto the robot once every six months,” says Inglese.

The system is also more reliable than older high-throughput screening tools because of a combination of the heavy-duty robot and an assay design that places different compound concentrations on separate plates, which makes it easier to keep the titrations accurate. “We've gotten wind of this from other people who've tried [quantitative high-throughput screening], and people who've had robotic systems that are less robust generate noisy data,” says NCGC director Christopher Austin.
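The layout described above, with each concentration on its own plates, can be sketched as a simple mapping from compound and concentration to plate and well. This is only an illustration: the library size, the number of control wells per plate and the function names are assumptions, not NCGC specifications.

```python
import math

def plates_per_concentration(n_compounds, wells_per_plate=1536, control_wells=128):
    """Plates needed to hold every compound once, at a single concentration."""
    usable = wells_per_plate - control_wells
    if usable <= 0:
        raise ValueError("no wells left for compounds")
    return math.ceil(n_compounds / usable)

def well_location(compound_index, concentration_index, n_compounds,
                  wells_per_plate=1536, control_wells=128):
    """Return (plate number, compound-well index) for one titration point.

    Each concentration gets its own series of plates, so a compound sits
    in the same well position on every plate of its titration.
    """
    usable = wells_per_plate - control_wells
    series_size = plates_per_concentration(n_compounds, wells_per_plate, control_wells)
    plate_in_series, well = divmod(compound_index, usable)
    return concentration_index * series_size + plate_in_series, well

N_COMPOUNDS = 100_000   # hypothetical library size
N_CONCENTRATIONS = 7    # seven-point titration, as in the article

total_plates_1536 = N_CONCENTRATIONS * plates_per_concentration(N_COMPOUNDS)
# Same job on 384-well plates, assuming 32 control wells per plate
total_plates_384 = N_CONCENTRATIONS * plates_per_concentration(N_COMPOUNDS, 384, 32)

lowest_dose = well_location(1500, 0, N_COMPOUNDS)   # (1, 92)
fourth_dose = well_location(1500, 3, N_COMPOUNDS)   # (217, 92)
```

Under these assumptions, the seven-concentration screen needs 504 plates in the 1,536-well format and 1,995 in the 384-well format, which gives a feel for why the denser plates make the larger job practical.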

A high-precision robot manipulates 1,536-well plates at the NIH Chemical Genomics Center. (Image courtesy James Inglese, NCGC.)

As an NIH facility, the NCGC is available to researchers worldwide. “We're always looking for good assays,” says Austin, adding that when it comes to targets, “anything which is encoded by any genome in any species is fair game.” Researchers can submit proposals to the center, which reviews them to ensure that the assays are suitable for high-throughput screening. Before sending a proposal, though, Austin recommends making a phone call. “Unless people have a lot of experience developing assays, which most academics don't, ... we really much prefer that they just contact us,” he says.

Once researchers have found a few interesting hits, they might want to turn to a related NIH effort, called PubChem. Run by the National Library of Medicine, PubChem is a rapidly expanding database of small-molecule structures, with information on their bioactivities. Links to GenBank, Medline and similar resources allow users to browse through all of the information available on specific compounds. At this writing, PubChem contained data on at least 10 million compounds, and the database is growing daily.

“What one can do then from PubChem is to go seamlessly from a gene to a protein to a small molecule that binds to that protein, and see the crystal structure and the protein bound to it, and you can go to papers on what that small molecule and that protein and that gene do,” says Austin. Combining free databases like PubChem with new screening software (see Box 1) might eventually provide academic scientists with high-throughput screening capabilities that rival those in industry.

Under the subproteome

Industry, of course, is not sitting still. While major pharmaceutical companies are still heavily invested in screening specific protein targets, some are starting to explore chemical genomics approaches as well. Meanwhile, a few small biotechnology companies hope to use chemical genomics to find targets the big players missed.

“We're looking at drug-proteome interactions,” says Gitte Neubauer, vice president of research operations at Cellzome, who describes the company's approach as “chemical proteomics”. As one of the early adopters of the strategy, Cellzome has established collaborations with pharmaceutical giants Johnson & Johnson and Ortho-McNeil, but the company is also developing its own drug leads.

Rather than screen drugs against individual proteins, Cellzome screens 'subproteomes', collections of proteins purified by their affinity to particular compounds. “We have a set of small molecules that defines a tractable part of the proteome, and these small molecules can be immobilized on a matrix,” says Gerard Drewes, the company's vice president for discovery research. Drewes adds that this part of the technique is “quite similar to affinity chromatography.”

Having selected an appropriate subproteome—for example, all kinases and their associated protein complexes—the company's researchers then search for compounds that will elute the kinases or their partners from the matrix. The scientists can identify the eluted proteins, which are the test compound's targets, by mass spectrometry. “We test our components against many different kinds of things in one assay,” says Neubauer.

To run these complex assays, the researchers use a combination of off-the-shelf robotic systems from PerkinElmer and custom-built or modified hardware developed in-house.

Cellzome's general approach has also proven useful for basic research. Using a similar affinity purification and mass spectrometry technique, the company recently performed a genome-wide screen for protein complexes in Saccharomyces cerevisiae, isolating a total of 491 putative complexes, 257 of which were newly identified (ref. 3). Cellzome is now collaborating with academic researchers on other 'interactome' projects, but since completing the yeast work, Drewes says the company's main focus has shifted to subproteomes involved in specific human diseases.

Russian scientists lead in pharmacophore space

Regardless of the assay, anyone hoping to do high-throughput screening needs to pick an appropriate library of compounds to test. For big pharmaceutical companies, a team of staff chemists caters to that need, but academic researchers lack that luxury. Fortunately, several types of compound libraries are now available for purchase, covering a wide range of pharmacological complexity and price.

Standard chemical suppliers offer libraries that encompass nearly all known drugs in clinical use—about 4,000 unique compounds. Although quite small by high-throughput standards, that type of library has some distinct advantages. “It can be [screened] very fast, and people are usually very happy if something already approved for use in humans can affect their target,” says the Cleveland Clinic's Gudkov.

ChemBridge provides a variety of chemical libraries. (Image courtesy of ChemBridge.)

If this quick, cheap approach yields no useful hits, researchers can turn to larger libraries available from specialty suppliers. In the past decade, these suppliers have dipped into a trove of compounds produced by chemistry students at Russian universities. The students must synthesize new chemical structures as part of their advanced training, and since the collapse of the Soviet Union, selling these compounds to high-throughput library suppliers has kept many of the country's scientific centers afloat.

“Because there is no well-established Russian pharmaceutical industry, those compounds would do nothing if companies like ChemBridge and others didn't buy them,” says Reg Richardson, European marketing manager for ChemBridge. With one of the largest collections of these compounds (now approaching 500,000), ChemBridge is a common source for the libraries used in chemical genomics.

Although the known-drug library is quite small, the complete ChemBridge collection is enormous, and most scientists prefer to screen a mid-size collection instead. To address that need, ChemBridge offers a subset of the library, called DIVERset. “DIVERset, which is a subset of about 50,000 compounds, covers something like 60–65% of the pharmacophore space,” says Richardson.

The pharmacophore space is a theoretical construct used widely in high-throughput screening. Chemists attach descriptors to particular chemical moieties, such as hydrogen bond acceptor, hydrogen bond donor and aromatic group, then classify compounds based on the number and types of descriptors in them. Two compounds with radically different structures might have very similar pharmacophore descriptors, so screeners would only need to include one of the two in a library to cover that portion of the pharmacophore space.
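The idea can be sketched in a few lines of Python. This is a deliberately simplified illustration: the compound names and feature annotations are hypothetical, and each compound is reduced to counts of descriptor types, whereas real library-design tools generally use richer descriptors.

```python
from collections import Counter

# Hypothetical compounds, each annotated with its pharmacophore features.
compounds = {
    "cmpd_A": ["hbond_acceptor", "hbond_donor", "aromatic"],
    "cmpd_B": ["aromatic", "hbond_donor", "hbond_acceptor"],  # different structure, same features
    "cmpd_C": ["hbond_acceptor", "hbond_acceptor", "aromatic"],
    "cmpd_D": ["hbond_donor"],
}

def descriptor_key(features):
    """Order-independent summary: how many of each descriptor type."""
    return tuple(sorted(Counter(features).items()))

def pick_representatives(annotated):
    """Keep the first compound seen for each distinct descriptor summary."""
    chosen = {}
    for name, features in annotated.items():
        chosen.setdefault(descriptor_key(features), name)
    return sorted(chosen.values())

library = pick_representatives(compounds)
# library == ["cmpd_A", "cmpd_C", "cmpd_D"]; cmpd_B is redundant with cmpd_A
```

Here `cmpd_A` and `cmpd_B` occupy the same point in this toy descriptor space, so only one of them is kept, while `cmpd_C` and `cmpd_D` each cover a region that nothing else does.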

ChemBridge also offers subsets of DIVERset, structured so that a group of 10,000 compounds covers nearly the same pharmacophore space as the 50,000-compound set, though at lower resolution. “People can buy the whole 50,000, and we'd be pleased if they did, but few can afford to buy that, so they tend to buy 5,000–10,000,” says Richardson. These smaller subsets, with enough material for two dozen assays, cost about $10,000. Investigators who prefer to pick their own subset of the overall library can also browse ChemBridge's entire database, which is available for free downloading.

Whichever approach they choose, chemical genomics researchers are clearly learning that it is not the size of the assay that matters, but how you probe it. See Table 1.

Table 1 Suppliers guide: companies offering chemical biology reagents and equipment