Quantifying biogenic bias in screening libraries

Abstract

In lead discovery, libraries of 10^6 molecules are screened for biological activity. Given the over 10^60 drug-like molecules thought possible, such screens might never succeed. The fact that they do, even occasionally, implies a biased selection of library molecules. We have developed a method to quantify the bias in screening libraries toward biogenic molecules. With this approach, we consider what is missing from screening libraries and how they can be optimized.
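The scale argument in the abstract can be made concrete with a short calculation. The sketch below is only an illustration, not the authors' method: it takes the library size and chemical-space estimate from the abstract and computes the fraction of space a library samples. The active fraction of one molecule in 10^12 is a purely hypothetical value, chosen to show how few hits an unbiased, uniformly random library would be expected to contain.

```python
from fractions import Fraction

LIBRARY_SIZE = 10**6      # molecules screened (figure from the abstract)
CHEMICAL_SPACE = 10**60   # drug-like molecules thought possible (from the abstract)


def coverage(library_size: int, space_size: int) -> Fraction:
    """Fraction of chemical space sampled by a library of the given size."""
    return Fraction(library_size, space_size)


def expected_hits(library_size: int, active_fraction: Fraction) -> Fraction:
    """Expected number of actives in a library drawn uniformly at random."""
    return library_size * active_fraction


frac = coverage(LIBRARY_SIZE, CHEMICAL_SPACE)

# Hypothetical assumption, not from the article: one molecule in 10**12
# is active against a given target.
hypothetical_active_fraction = Fraction(1, 10**12)
hits = expected_hits(LIBRARY_SIZE, hypothetical_active_fraction)

print(f"fraction of chemical space screened: {float(frac):.0e}")
print(f"expected hits in an unbiased library: {float(hits):.0e}")
```

Under these assumptions an unbiased screen would almost never find a hit, which is the reasoning behind the abstract's inference that libraries that do succeed must be biased.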

Figure 1
Figure 2: Compounds in screening libraries are biased toward biogenic molecules.
Figure 3: Biogenic bias increases with molecular size.
Figure 4: Core ring structures common among drugs and related molecules.

Acknowledgements

This work was supported by US National Institutes of Health grant GM59957 to B.K.S. J.H. was supported by a Marie Curie fellowship from the 6th Framework Program of the European Commission; M.J.K. was supported by a US National Science Foundation graduate fellowship; C.L. was supported by a fellowship from the Max Kade Foundation.

Author information

Contributions

The project was conceived of by J.H. and B.K.S. J.H. undertook most of the calculations, with molecular proof checking by J.J.I. and C.L. and algorithmic assistance from M.J.K. J.H. and B.K.S. wrote the manuscript, which was read and commented on by the other authors.

Corresponding author

Correspondence to Brian K Shoichet.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–4 and Supplementary Tables 1–3 (PDF 306 kb)

About this article

Cite this article

Hert, J., Irwin, J., Laggner, C. et al. Quantifying biogenic bias in screening libraries. Nat Chem Biol 5, 479–483 (2009). https://doi.org/10.1038/nchembio.180

  • DOI: https://doi.org/10.1038/nchembio.180
