Chemical genetics — the use of small molecules to mimic the cellular effects of genetic mutations — has emerged as an important approach for unravelling biological pathways and also for providing chemical starting points for the development of potential drugs that modulate these pathways. However, this strategy typically requires tens of thousands of molecules to be screened in order to identify a few active molecules, and then considerable further effort to establish their underlying mode of action.

Turning this problem on its head, Brent Stockwell and colleagues have now assembled a library of about 2,000 compounds with known and well-characterized biological activities, and developed an annotation system that captures all of the available published information on these activities. As described in their paper in Chemistry & Biology, having this knowledge associated with each compound can greatly aid the identification of the mechanisms underlying interesting effects (for example, antiproliferative activity) in cellular screens.

The 2,036 biologically active compounds, which include 514 US FDA-approved drugs, represent 169 broad, primary biological mechanisms, such as antihypertensive, anti-inflammatory and antifungal. Each compound was annotated with a score for each of 12,755 biological mechanisms — comprising the 169 primary mechanism descriptors, 200 Medline terms related to pharmacology and more than 12,000 human gene names — by counting the number of abstracts in Medline that contain both the compound name and a given biological mechanism using automated algorithms.
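The co-occurrence scoring described above can be illustrated with a minimal sketch. It assumes the abstracts are already available as plain strings and uses simple case-insensitive substring matching; the authors' actual algorithms are not described here, and the function name and toy abstracts are hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence_scores(abstracts, compounds, mechanisms):
    """Count abstracts that mention both a compound name and a mechanism term.

    Assumption: naive case-insensitive substring matching stands in for
    whatever text matching the published annotation system used.
    """
    scores = Counter()
    for text in abstracts:
        lowered = text.lower()
        found_compounds = [c for c in compounds if c.lower() in lowered]
        found_mechanisms = [m for m in mechanisms if m.lower() in lowered]
        # Each abstract contributes at most one count per (compound, term) pair.
        for compound, mechanism in product(found_compounds, found_mechanisms):
            scores[(compound, mechanism)] += 1
    return scores


# Toy abstracts, invented for illustration only.
abstracts = [
    "Ketoconazole is an antifungal agent that inhibits ergosterol synthesis.",
    "Antifungal activity of ketoconazole in vitro.",
    "Captopril is an antihypertensive drug.",
]
scores = cooccurrence_scores(
    abstracts,
    compounds=["ketoconazole", "captopril"],
    mechanisms=["antifungal", "antihypertensive"],
)
```

In this toy example, the pair (ketoconazole, antifungal) scores 2 and pairs that never co-occur score 0. A real pipeline would also need to handle synonyms and ambiguous gene names, which this sketch ignores.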

A comparison of the annotated compound library (ACL) with a commercial library typical of those used in high-throughput chemical-genetic screens revealed that the ACL is significantly more structurally diverse. But would it yield more hits in a biological screen? The authors hypothesized that compounds with known biological activity would be more likely than random compounds to be active in new cellular assays, because their molecular mechanism might be operative in a new context. To test this, they evaluated the ability of the two libraries to selectively inhibit the proliferation of engineered human tumour cells, an assay in which none of the ACL compounds had previously been tested. Indeed, 1% of the ACL compounds were at least fourfold selective for killing tumour cells over normal cells, compared with only 0.01% of the compounds from the commercial library.
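To put the two hit rates side by side, the arithmetic can be written out as below. The percentages and the ACL size of 2,036 are taken from the text; the size of the commercial library is not given, so no absolute count is computed for it. The variable names are illustrative.

```python
# Hit rates quoted in the text, expressed in percent.
acl_hit_rate_pct = 1.0
commercial_hit_rate_pct = 0.01

# Fold difference in hit rate between the two libraries.
fold_enrichment = acl_hit_rate_pct / commercial_hit_rate_pct

# Approximate number of selective ACL compounds implied by a 1% hit rate.
acl_size = 2036
approx_acl_hits = round(acl_size * acl_hit_rate_pct / 100)

print(f"Fold enrichment: about {fold_enrichment:.0f}x")
print(f"Approximate selective ACL compounds: {approx_acl_hits}")
```

On these figures, the annotated library yields selective compounds at roughly 100 times the rate of the commercial library, which corresponds to about 20 compounds in the ACL.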

Next, the authors tested the ability of the ACL to uncover mechanisms associated with cellular processes. Lung tumour cells were treated with each compound in the library, and 85 compounds had an antiproliferative effect. In a conventional chemical-genetic screen, these compounds would have been the starting point for 85 separate time-consuming target identification projects. However, using the information associated with each compound in the ACL, the authors were able to rapidly identify 28 biological mechanisms that were statistically over-represented in the 85 active compounds. These included both known anticancer mechanisms, confirming the utility of the approach, and also several mechanisms with no previously recognized relationship with cell death, highlighting its potential to identify novel associations. Follow-up experiments with one such novel mechanism showed that several compounds with this mechanism, which would not have been selected a priori as antitumour agents, could selectively kill tumour cells.
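One common way to ask whether a mechanism is over-represented among active compounds is a one-sided hypergeometric test, sketched below. This is an assumption for illustration: the text does not say which statistical test the authors used. The library size (2,036) and number of actives (85) come from the text, whereas the annotation counts in the example are hypothetical.

```python
from math import comb


def hypergeom_tail(library_size, annotated, actives, annotated_actives):
    """Return P(X >= annotated_actives) under the hypergeometric distribution.

    X is the number of annotated compounds among `actives` compounds drawn
    without replacement from a library of `library_size` compounds, of which
    `annotated` carry the mechanism annotation.
    """
    upper = min(annotated, actives)
    total = comb(library_size, actives)
    favourable = sum(
        comb(annotated, i) * comb(library_size - annotated, actives - i)
        for i in range(annotated_actives, upper + 1)
    )
    return favourable / total


# Library size and active count from the text; the annotation counts
# (40 annotated compounds, 10 of them active) are hypothetical.
p_value = hypergeom_tail(
    library_size=2036, annotated=40, actives=85, annotated_actives=10
)
print(f"One-sided p-value: {p_value:.3g}")
```

Because thousands of mechanisms are tested at once, any such p-values would normally need a multiple-testing correction before a mechanism is called over-represented.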

This method can therefore considerably accelerate the evaluation of the numerous mechanisms that might underlie the cellular effects of the compounds in this ACL, information on which is publicly available on the Stockwell lab website. Expanding such ACLs, and adding further information for each compound, such as effects on gene expression from microarray experiments, has the potential to allow the mechanisms regulating cellular processes to be defined with ever-increasing precision.