  • Review Article

A guide to drug discovery

Hit and lead generation: beyond high-throughput screening

Key Points

  • Due to the escalating downstream costs in the development phase, objective quality assessment of lead series long before entering clinical trials is an increasing necessity within pharmaceutical research.

  • Moving away from the linear process of compound optimization towards a parallel strategy in which the profile of chemical entities is shaped in a multidimensional manner allows the properties of a molecule to be appropriately balanced in a rapid, iterative fashion.

  • The initiating point in a medicinal chemistry programme can arise from a variety of sources. Depending on the target and the further information available, these sources range from brute-force, serendipity-based search methods to information-rich design approaches for identifying novel chemical entities for further optimization.

  • High-throughput screening campaigns currently provide the main source for chemistry initiation in pharmaceutical research. Assay development time, logistical hurdles and issues concerning compound acquisition increasingly demand alternative approaches to complement this lead discovery pathway.

  • The design of combinatorial compound libraries on the basis of predicted molecular properties is now widely applied, increasing the quality of the product compounds generated. In addition, focused libraries can be generated on the basis of ligand or biostructural information, most effectively when supported by modern integrated computational and synthetic methods.

  • Computational algorithms allow the annotation and grouping of biological targets as well as chemical structures. 'Chemogenomics' is the interface between disciplines where chemical topology space is married with biological target space. Chemogenomics databases will in future allow existing target-ligand information to be used prospectively to identify druggable targets and design tailored new ligand motifs, thus creating valuable knowledge.


The identification of small-molecule modulators of protein function, and the process of transforming these into high-content lead series, are key activities in modern drug discovery. The decisions taken during this process have far-reaching consequences for success later in lead optimization and, even more crucially, in clinical development. Recently, there has been an increased focus on these activities due to escalating downstream costs resulting from high clinical failure rates. In addition, the vast emerging opportunities from efforts in functional genomics and proteomics demand a departure from the linear process of identification, evaluation and refinement activities towards a more integrated parallel process. This calls for flexible, fast and cost-effective strategies to meet the demands of producing high-content lead series with improved prospects for clinical success.

Figure 1: Therapeutic target classes.
Figure 2: Don't panic...
Figure 3: Stage-by-stage quality assessment to reduce costly late-stage attrition.
Figure 4: Hit-identification strategies.
Figure 5: Where there's a will, there's a way...

Dr. Simona Ceccarelli is cordially thanked for providing the cartoons 'Don't panic....' and 'Where there's a will, there's a way...'.

Author information

Corresponding author

Correspondence to Alexander I. Alanine.

Related links

Society for Biomolecular Screening



The feasibility of a target to be effectively modulated by a small molecule ligand that has appropriate bio-physicochemical and absorption, distribution, metabolism and excretion properties to be developed into a drug candidate with appropriate properties for the desired therapeutic use.


A primary active compound(s), with non-promiscuous binding behaviour, exceeding a certain threshold value in a given assay(s). The 'active' is followed up with an identity and purity evaluation; an authentic sample is then obtained or re-synthesized, and activity is confirmed in a multi-point activity determination to establish the validity of the hit (validated hit).


A prototypical chemical structure or series of structures that demonstrate activity and selectivity in a pharmacological or biochemically relevant screen. This forms the basis for a focused medicinal chemistry effort for lead optimization and development with the goal of identifying a clinical candidate. A distinct lead series has a unique core structure and the ability to be patented separately.


Screening (of a compound collection) to identify hits in an in vitro assay, usually performed robotically in 384-well microtitre plates.


A lead series in which representatives have been extensively refined in not only their structure–activity relationship and selectivity, but also in their physicochemical and early absorption, distribution, metabolism and excretion properties, and safety measures, such as metabolic stability, permeation and hERG liabilities. Correlations have been elucidated and all crucial parameters have shown themselves to be modulated in the series.


The consistent correlation of structural features or groups with the biological activity of compounds in a given biological assay.


Physical molecular properties of a compound. Typical properties are solubility, acidity, lipophilicity, polar surface area, shape, flexibility and so on.


A set of hits clustered into sub-structurally related families, representatives of which have been evaluated for their specificity, selectivity, physicochemical and in vitro ADME properties to characterize the series.


A peer-reviewed milestone whose requirements are closely linked to the clinical candidate profile. Initial criteria are defined when hits are first identified; they include activity, selectivity and pertinent physicochemical properties, plus an evaluation of ADME and certain safety attributes. In vivo activity is not a mandatory requirement, provided the obstacles are appreciated and considered to be surmountable based on evidence.


A family of promiscuous iron-haem-containing enzymes involved in oxidative metabolism of a broad variety of xenobiotics and drug compounds.


The spatial orientation of various functional groups or features necessary for activity at a biomolecular target.


The process of parallel optimization of several relevant drug-property parameters in concert with activity, to produce a drug candidate with balanced property profiles suitable for clinical development.


A spectroscopy tool used for the assignment and confirmation of chemical structure of a compound or biological macromolecule. Sophisticated multi-dimensional methods are used to characterize larger and more complex biomolecules.


Synthesis technologies to generate compound libraries rather than single products. Robotic instruments for solid- and solution-phase chemistry, as well as high-throughput purification equipment, are applied.


A scoring metric (computational) for the similarity of a given structure to a representative reference set of marketed drugs.


A property–distance metric reflecting the dissimilarity of objects (molecules). Various molecular descriptors (indices) are used to define compounds in a numerical fashion so that they can be readily compared. Such measures must be considered within an appropriate context to be meaningful.
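A property-distance metric of this kind can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the compound names and descriptor values (molecular weight, logP, polar surface area) are hypothetical, and min-max scaling followed by Euclidean distance is one simple choice among many, not a method prescribed by the article.

```python
import math

def scale_columns(rows):
    """Min-max scale each descriptor column to [0, 1] so that no single
    property (e.g. molecular weight) dominates the distance."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = hi - lo
        # A constant column carries no information; map it to 0.0.
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(rows):
    """Pairwise dissimilarity between all compounds after scaling."""
    scaled = scale_columns(rows)
    return [[euclidean(a, b) for b in scaled] for a in scaled]

# Hypothetical descriptor vectors: (molecular weight, logP, polar surface area).
compounds = {
    "cpd_A": (300.0, 2.0, 60.0),
    "cpd_B": (310.0, 2.2, 65.0),
    "cpd_C": (520.0, 5.5, 140.0),
}
names = list(compounds)
dist = distance_matrix([compounds[n] for n in names])

for i, name_i in enumerate(names):
    for j in range(i + 1, len(names)):
        print(f"{name_i} vs {names[j]}: {dist[i][j]:.3f}")
```

Because the scaling depends on the compound set being compared, the same pair of molecules can score differently in a different collection, which is one concrete sense in which such measures are only meaningful within an appropriate context.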


An empirically derived metric by which compounds are assigned a probability to produce (false) positive results (hits) frequently in diverse screening assays.


A graph-based method of describing molecular structure using atom connectivity through the molecular framework and assigning atoms or substructural domains with various property types: lipophilic, H-bond acceptor/donor, positively/negatively charged and so on.


Metrics used to numerically describe a structure or certain molecular attributes of a compound (for example, Tanimoto, Ghose and Crippen, BCUT and so on).


The process by which a set of individual compounds is made simultaneously using common chemical building blocks and homologous reagents.


A specific core or scaffolding structure that imparts a generic activity towards a protein family or limited set of its members independently of the specific substituents attached to it.


Distinct three-dimensional forms of a molecular structure of a given atomic connectivity, which results from internal rotations about single bonds between atoms.


The process of computationally placing a virtual molecular structure into a binding site of a biological macromolecule (docking) and flexibly or rigidly relaxing the respective structures then ranking (scoring) the complementarity of fit.


A character-based line notation for chemical structures.


A contiguous set of characters that consists entirely of 1s and 0s, which can be used to encode, for example, the presence or not of structural elements in a compound.
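As a minimal sketch of how such a bit string can be built and compared, the Python below encodes the presence or absence of structural elements and scores two compounds with the Tanimoto coefficient (shared set bits divided by bits set in either string). The key list and the two feature sets are hypothetical and far smaller than any real fingerprint, and returning 0.0 when neither string has a set bit is an assumed convention.

```python
# Hypothetical structural keys; real key sets are much larger.
STRUCTURAL_KEYS = [
    "aromatic_ring",
    "carboxylic_acid",
    "amide",
    "halogen",
    "basic_amine",
    "sulfonamide",
]

def to_bit_string(features):
    """Encode presence (1) or absence (0) of each structural key, in key order."""
    return "".join("1" if key in features else "0" for key in STRUCTURAL_KEYS)

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: bits set in both / bits set in either."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit strings must have the same length")
    both = sum(1 for a, b in zip(bits_a, bits_b) if a == "1" and b == "1")
    either = sum(1 for a, b in zip(bits_a, bits_b) if a == "1" or b == "1")
    # Assumed convention: two empty fingerprints score 0.0.
    return both / either if either else 0.0

# Hypothetical feature sets for two compounds.
mol_1 = {"aromatic_ring", "amide", "halogen"}
mol_2 = {"aromatic_ring", "amide", "basic_amine"}

bits_1 = to_bit_string(mol_1)
bits_2 = to_bit_string(mol_2)
print(bits_1, bits_2, tanimoto(bits_1, bits_2))
```

Here the two strings share two set bits out of four set in either, giving a similarity of 0.5.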


Rapid feedback provided by assaying small compound sets (<1,000) through a medium-throughput assay to guide the SAR for rapid iterative design and synthesis cycles.

About this article

Cite this article

Bleicher, K., Böhm, HJ., Müller, K. et al. Hit and lead generation: beyond high-throughput screening. Nat Rev Drug Discov 2, 369–378 (2003).
