Introduction

Organic small molecules with novel mechanisms of action (nMoAs) are needed to help address today's most challenging biomedical problems. In particular, potential disease-related molecular targets indicated by the study of human genetics are often poorly characterized and cannot be studied or modulated with existing chemical tools1. Drugs with nMoAs are also urgently needed to address resistance to existing drugs for cancer and infectious diseases2,3,4.

The search for new chemical probes and drugs can begin in many places. Natural products and their derivatives, for example, have played important roles in the study and treatment of disease for thousands of years5,6,7,8,9. Alternatively, when the therapeutic target is known, advances in synthetic organic chemistry, structural biology, computational modelling and screening have enabled researchers to construct, identify and optimize bioactive compounds via structure-based design10,11. While these techniques and others have long histories of success in drug discovery12,13,14, this article primarily discusses high-throughput screening of small-molecule libraries, in which researchers sift through collections of compounds (10^3 to >10^6 members) to find 'hits' that exhibit a desired biological activity.

The use of small-molecule screening in early-stage drug discovery expanded dramatically in the 1990s with the introduction of new screening technologies and more efficient methods to synthesize large numbers of compounds with combinatorial chemistry. Given the vastness of 'chemical space', all the molecules synthesized to date represent only a tiny fraction of all possible compounds that have properties similar to those of existing chemical probes and drugs (Box 1). This limitation has spurred chemists to populate the compound libraries used in screening experiments with a better-chosen set of molecules. Early insights into the valuable features of compound libraries were offered by Paul Bartlett and colleagues, who suggested that libraries can be classified in one of two ways: focused libraries or prospecting libraries15. Focused libraries — also known as targeted or biased libraries — typically comprise analogues of a known bioactive compound to generate structure–activity relationships (SARs) that can inform optimization efforts. Conversely, prospecting libraries eschew a specific molecular architecture in favour of combinations of available starting materials that maximize the structural novelty and diversity of their products. This strategy is particularly appealing when no known bioactive compounds exist or an nMoA compound is desired.

In practice, however, initial library construction efforts were dominated by the acquisition of increasingly large collections from vendors, because larger collections are generally thought to increase the probability of finding good starting points for drug discovery16. Although this approach has been a productive source of drugs and probes, commercial vendors tend to increase library size by prioritizing ease of synthesis and availability of starting materials (for example, by using reactions that link compounds containing planar, aromatic rings). These reaction products typically have a higher content of sp2-hybridized carbons and a lower content of sp3-hybridized carbons, leading to an unequal distribution of structural features in small-molecule libraries. Therefore, most areas of synthetically accessible chemical space remain underexplored17.
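
The balance between sp2-hybridized and sp3-hybridized carbons is commonly summarized as the fraction of sp3 carbons (Fsp3), that is, the number of sp3 carbons divided by the total number of carbons. The following is a minimal sketch of that arithmetic only; the function name and the choice of example molecules are illustrative assumptions and are not taken from the work discussed here.

```python
def fraction_sp3(sp3_carbons: int, total_carbons: int) -> float:
    """Return Fsp3 = (number of sp3 carbons) / (total number of carbons)."""
    if total_carbons <= 0:
        raise ValueError("total_carbons must be positive")
    if not 0 <= sp3_carbons <= total_carbons:
        raise ValueError("sp3_carbons must lie between 0 and total_carbons")
    return sp3_carbons / total_carbons


# Illustrative carbon counts (sp3 carbons, total carbons) for three simple molecules.
# These are examples chosen for this sketch, not library members from the article.
examples = {
    "biphenyl": (0, 12),  # two linked aromatic rings, no sp3 carbons
    "toluene": (1, 7),    # one methyl carbon on an aromatic ring
    "menthol": (10, 10),  # fully saturated carbon framework
}

for name, (sp3, total) in examples.items():
    print(f"{name}: Fsp3 = {fraction_sp3(sp3, total):.2f}")
```

Planar, aromatic-rich products of the kind described above sit near the low end of this scale, whereas natural-product-like compounds tend towards higher values.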

Not all compounds containing under-represented structures, however, are equally worthy of attention in bioactive compound discovery. Taking a cheminformatics-based approach, Shelat and Guy18 have proposed that compounds that are structurally distinct from the contents of commercial screening collections are more likely to act via nMoAs. Natural products are one such class of compounds19,20, and they have been a productive source of chemical tools and drugs, as noted above. However, billions of years of selection pressure commonly endow natural products with exquisite specificity for their targets, which are most often essential proteins (for example, tubulin and actin) whose inhibition will typically lead to cell death and in some cases overt toxicity21. In one study, the molecular targets of natural products were enriched in proteins constituting the most connected nodes of protein–protein interaction networks, consistent with natural selection yielding toxins used by organisms to gain advantage over competitors22. Furthermore, given their often high level of structural complexity, it can be challenging to synthesize many analogues of natural products to explore SARs.

Intuitively, the high structural complexity of natural products supports the principle that the 3D topography of binding sites on macromolecular targets would benefit from similarly 3D compounds. Developments in synthetic methodology over the past three decades have facilitated the construction of molecules that exhibit the intricate structural architectures and other chemical features that are common in natural products but under-represented in commercial screening collections. Some noteworthy examples include the establishment of transition-metal-mediated coupling and metathesis reactions, asymmetric catalysis and organocatalysis, all of which have been reviewed extensively23,24,25. These transformations have enabled efficient syntheses of compounds that contain synthetically challenging structural features, such as medium-sized rings26, non-peptidic macrocycles27,28 and spirocyclic, fused or bridged bicyclic and polycyclic ring systems29, among many others30,31,32.

The concept underlying diversity-oriented synthesis (DOS), which was introduced in 2000 (Ref. 33), is to harness such advances to synthesize libraries of compounds that incorporate chemical features common to natural products, including sp3-hybridized basic nitrogen atoms, stereogenic elements and novel skeletons. The resulting compounds are not analogues of specific classes of natural products, yet they have the overall appearance of natural products. In order to overcome the tendency of natural selection to select for the limited number of essential targets, this process purposefully breaks the link between natural selection and the generation of natural-product-like compounds. (We use the term 'natural-product-like' to refer to compounds that have chemical features found in ensembles of natural products, not necessarily just in specific ones.)

After a decade spent developing these synthetic pathways, using them to make libraries of compounds that have under-represented chemical features and testing those compounds in biological assays, we focus in this article on the remarkable activities of the resulting probes and drug leads. These studies begin to address the question of whether novel chemistry is a productive method to generate compounds that act via novel mechanisms. It is not experimentally practical, or arguably feasible, to determine whether this approach is more or less effective than other means to identify nMoA compounds, such as fragment-based drug discovery or other small-molecule synthesis and screening techniques7,12,34. And given the vast numbers of accessible chemical structures, a diversity of approaches to find nMoA compounds could be valuable. Our aim here is simply to illustrate the value of compounds that have been identified using DOS strategies and related approaches to produce compounds that are structurally dissimilar from those in commercially available screening collections.

After briefly summarizing the strategic considerations behind the use of modern asymmetric synthesis to construct small-molecule screening libraries, we focus on the performance of the resulting chemical probes and drug leads by discussing examples from a wide range of therapeutic areas.

Strategies for library synthesis

The strategy and implementation of DOS have been reviewed previously33,35,36, but briefly, short reaction pathways are designed such that libraries of skeletally and stereochemically diverse compounds can be synthesized efficiently. Initial efforts relied upon sequences of complexity-generating reactions (for example, multicomponent reactions, cycloadditions and ring-opening or ring-closing metathesis reactions) to construct intricate scaffolds from readily available precursors rapidly37. Pathways were designed such that diversity could stem from large arrays of building blocks (appendage diversity)38, enumeration of stereoisomers (stereochemical diversity)39 and late-stage branching points that alter scaffold topography (skeletal diversity)40,41.

An early DOS pathway that furnished a selective histone deacetylase 6 (HDAC6) inhibitor provides an illustrative case study. Inspired by the metal-binding features of the natural products trichostatin A and trapoxin A, a library of 7,200 1,3-dioxanes was synthesized42. Metal-chelating functional groups were installed to bias the library towards HDAC inhibition, and the novel 1,3-dioxane core and its appendages were conceived to impart specificity. One library member, tubacin, was found to be a domain-specific inhibitor of HDAC6 (effective concentration for half-maximum response (EC50) = 2.9 μM)43,44. Tubacin proved to be a useful tool compound in cancer and neurodegenerative disease research45,46, but the impracticality of its synthesis hindered its optimization and clinical potential. A more efficient synthetic route would have aided the elaboration of SARs and the optimization of pharmacokinetic properties for in vivo studies. Recognition of this deficiency prompted a rethinking of synthetic strategies, which in turn led to novel compound collections.

Modular pathways that facilitate analogue synthesis can enable chemists to meet the demands required of probe and drug molecules. These syntheses can be realized by mimicking natural product biosynthesis, in which building blocks are coupled intermolecularly and then cyclized intramolecularly — from penicillins to terpenes47. This process introduces both topographic complexity and rigidifying structural elements into the reaction products48, the latter of which are important for decreasing the entropic cost of binding macromolecules. In the context of DOS, this concept is known as the 'build–couple–pair' strategy (discussed further in examples below)49.

Target-oriented syntheses of complex molecules, especially natural products, have also benefited tremendously from the pairing of convergent, modular synthesis with asymmetric chemistry. For example, the development of the oncology drug eribulin was made possible by the Kishi group's efficient synthesis of the marine natural product halichondrin B and subsequent analysis of structural analogues50,51. This strategy has been extended to the synthesis of focused libraries of natural product derivatives, which can be screened to find a molecule with improved physicochemical properties or activity against drug-resistant cells or pathogens52. Along these lines, the Myers group has developed platforms for fully synthetic tetracycline and macrolide antibiotics53,54. The construction of such libraries would have been impractical without advances in reaction methodology and synthetic planning.

As suggested above, however, the search for nMoA compounds is probably better served by prospecting libraries synthesized via DOS. But because nothing is known a priori about the activities of library members, the corresponding pathways should permit facile alterations throughout the molecule; ideally, each position should be amenable to functionalization. Strategic implementation of the build–couple–pair concept can fulfil this criterion.

The remainder of this article discusses the synthesis, optimization and biomedical impact of compounds containing under-represented chemical features. The broad range of their biological activities shows that such compounds can act as useful probes and potential drugs throughout medicine.

Probes for heritable disease targets

Advances in genotyping, sequencing and analysis of genetic data have facilitated the large-scale study of disease-related genetic variations55,56. Despite the substantial challenges involved in translating genomic information into therapeutic insights and new medicines57, genetics-based 'experiments of nature' can reveal a type of dose–response of gene activity that suggests the physiological consequences of modulating a target with a therapeutic agent58. Analyses of human genomic data have revealed variants associated with type 2 diabetes59, inflammatory bowel disease60 and psychiatric disorders61,62, and these variants have pointed to disease-relevant pathways not previously recognized63.

Alongside genetic approaches, chemical probes capable of potently and selectively modulating protein activity are valuable tools for exploring therapeutic hypotheses before they are tested in the clinic64,65,66. If the protein target of interest is known, various in vitro binding and functional assays can enable target-based screening campaigns to identify a corresponding small-molecule modulator. Ensuing optimization efforts are then driven by secondary assays and cell-based disease models. Alternatively, phenotypic screening can identify compounds that elicit a notable effect on disease-relevant physiology, most often in human cells. This technique is particularly useful when the phenotype of interest is not strongly associated with a causal protein target or when searching for compounds with nMoAs. Although an in-depth discussion of the relative benefits and challenges of these two strategies is beyond the scope of this article, the strategies can be complementary in some circumstances67,68,69,70.

The examples shown in Fig. 1 illustrate how the interrogation of compound libraries synthesized using DOS and related strategies has afforded small-molecule bioactive compounds across a wide range of heritable diseases (defined here as diseases that may be influenced by heritable genetic characteristics)71,72,73,74,75,76,77,78,79,80,81,82; notably absent are compounds for the study of genetic targets in cancer, which will be discussed in a separate section. Here, we highlight three vignettes that provide greater insight into the underlying synthetic pathways, compound libraries and chemical biology.

Figure 1: A selection of compounds generated by diversity-oriented synthesis and related strategies as probes for a wide range of heritable diseases.

Studies that use δ-lactone 9, an activator of neurite growth in multiple classes of murine neurons82, may lead to insights into neurological disorders. The pyrimidodiazepine 7 disrupts the leucyl-tRNA synthetase–RAS-related GTP-binding protein D (RAGD) protein–protein interaction (PPI), which may improve our understanding of autophagy, ageing and immunosuppression via control of the mechanistic target of rapamycin complex 1 (mTORC1) signalling pathway80,260. The benzannulated sultam 4 modulates lysosome acidification and thus serves as a probe for V-ATPase function75, which has been associated with loss of bone density and decreased kidney function, among other conditions261,262. Other compounds interact with targets implicated in Alzheimer disease (1)71, holoprosencephaly (2)72, schizophrenia (3)74, inflammatory diseases (5)76, lipid metabolism (6)79 and coronary artery disease (8)81. The remaining compounds shown are described in more detail in the vignettes in the main text. Properties relevant to their value as probes are discussed in Box 2. GSK3β, glycogen synthase kinase 3β; SHH, sonic hedgehog; SRB1, scavenger receptor class B member 1; TRIB1, tribbles homologue 1.

Robotnikinin — developmental disorders. The Hedgehog signalling pathway plays a role in the organization of developing vertebrates and insects83,84. While also studied in the context of oncogenesis and tumour maintenance85, disruptions in Hedgehog signalling in humans have been linked to developmental disorders such as holoprosencephaly86. Complementing genetic studies in model organisms, small-molecule tool compounds have greatly improved our understanding of both individual ligands and the signalling pathway as a whole87.

Before 2009, however, no direct modulators of sonic hedgehog (SHH), one of the three mammalian Hedgehog orthologues, had been discovered. To address this gap, a target-based screen was performed against ShhN (the active portion of SHH) using small-molecule microarrays77. The microarrays consisted of ~10,000 natural products, known bioactive compounds and products of various diversity synthesis pathways attached to an isocyanate-functionalized glass microscope slide88.

Efforts to optimize the most promising initial hits resulted in the discovery of robotnikinin (Fig. 1), a 12-membered macrolactone that binds to purified ShhN (dissociation constant (Kd) = 3.1 μM) and exhibits concentration-dependent inhibition of ShhN-mediated signalling in multiple cell lines, such as Shh-LIGHT2 and C3H10T1/2 cells (1.9–62.5 μM)77. No inhibition was observed in cell lines that lacked its transmembrane receptor, Patched. These data suggest that robotnikinin disrupts the interaction between ShhN and Patched, although the exact mechanism by which this occurs is not well understood. With regard to selectivity, later studies that revealed novel macrocyclic peptide binders also showed that robotnikinin elicits a 3–4-fold greater response in an enzyme-linked immunosorbent assay (ELISA)-based competition assay relative to the other two Hedgehog orthologues89.

Robotnikinin was derived from a library of 12-membered to 14-membered chiral macrocycles90. After an Evans asymmetric alkylation was used to set the first stereocentre, additional stereogenic elements were introduced via intermolecular couplings with chiral 1,2-aminoalcohols (Fig. 2a). The final step, a Ru-catalysed ring-closing metathesis, effected cyclization of the linear precursor 11. The modular synthetic pathway, in addition to providing an efficient route to under-represented non-peptidic macrocycles, facilitated systematic optimization of ring size, substituents and stereochemistry.

Figure 2: Key transformations along the synthetic routes to selected compounds.

The general strategy of building (or purchasing) chiral building blocks, coupling them together and cyclizing key linear intermediates via intramolecular transformations has generated a diverse array of chemical structures and biological activities in heritable disease and other areas of medicine. a | Chiral oxazolidinone 10 was functionalized strategically to afford intermediate 11, which subsequently underwent esterification and ring-closing metathesis to afford the sonic hedgehog (SHH) probe robotnikinin. b | Similarly, intermediate 13 was prepared from chiral ester 12. SNAr-based cyclization afforded the benzannulated 8-membered ring (14) at the core of BRD0476, a probe for type 1 diabetes and Janus kinase–signal transducer and activator of transcription (JAK–STAT) signalling. c | Intermediate 16, which was prepared from chiral acid 15 and closely resembles 13, was subjected to macrolactamization conditions to afford 12-membered macrocycle 17 en route to the autophagy probe BRD5631.

As the first direct binder of SHH, robotnikinin has been used to study the Hedgehog signalling pathway91. On the molecular level, it is capable of distinguishing between closely related proteins and disrupting a critical protein–protein interaction required for proper signalling.

BRD0476 — diabetes. A therapeutic agent that reverses or prevents the autoimmune destruction of the insulin-producing β-cells of the pancreas in patients with type 1 diabetes would address a substantial unmet medical need. With this goal in mind, a cell-based phenotypic screen was performed to identify small molecules that inhibit cytokine-mediated β-cell apoptosis73. The screening library comprised 6,488 chiral [6,8]-fused bicyclic lactams generated via DOS.

One of the most promising suppressors of apoptosis was optimized for potency via a medicinal chemistry campaign to afford BRD0476 (Ref. 73) (Fig. 1). In validation experiments, BRD0476 reduced caspase 3 activity and nitrite production in a concentration-dependent manner, and it restored several aspects of normal β-cell function73. Subsequent research involved the synthesis of analogues with improved aqueous solubility and the discovery that BRD0476 promotes β-cell survival by selectively inhibiting ubiquitin-specific peptidase 9X (USP9X)-dependent Janus kinase–signal transducer and activator of transcription (JAK–STAT) signalling (significant activity at 2–5 μM in independent primary samples)92,93. Biochemical experiments revealed no significant inhibition of 96 human kinases, including JAK1–3 (<40% inhibition at 10 μM) or 11 other members of the deubiquitinating enzyme (DUB) superfamily92. Not only do these results provide the first example of non-kinase-mediated inhibition of JAK–STAT signalling but they also suggest USP9X as a new potential therapeutic target for type 1 diabetes.

The synthesis of BRD0476 (Fig. 2b) begins with an auxiliary-mediated asymmetric aldol reaction and intermolecular coupling of a chiral 1,2-aminoalcohol94. This key linear intermediate was then cyclized via SNAr cycloetherification to construct the central 8-membered ring, a structural motif that is under-represented in current screening collections26. With the core in place, appendage elaboration afforded BRD0476. The inclusion of all possible stereoisomers in the parent library aided the triage of screening data owing to the existence of built-in stereochemistry-based SARs73.

As the development of BRD0476 has shown, the value of a probe is linked to the detailed molecular understanding of its MoA. These types of study can afford biological insights with implications for new therapeutics in the future.

BRD5631 — autophagy. Autophagy is the catabolic mechanism by which cells shuttle macromolecules, organelles and pathogens to the lysosome for recycling or degradation95. Alterations in this critical maintenance process have been implicated in human genetic studies of Crohn's disease, and additional studies have connected it to nonalcoholic fatty liver disease, Huntington's disease and other disorders96,97. The development of both inhibitors and enhancers of autophagy would improve our understanding of multiple diseases and facilitate the discovery of new medicines.

One effort to search for small-molecule modulators of autophagy used a cell-based high-throughput screen of nearly 60,000 compounds from multiple DOS pathways prepared using asymmetric synthesis78. Briefly, compound-treated HeLa cells were imaged via fluorescence microscopy to detect and quantify the presence of autophagosomes, the characteristic cellular structures of autophagy. BRD5631 (Fig. 1) was shown to increase autophagosome formation in a concentration-dependent manner (EC50 = 3.1 μM). Follow-up studies revealed that BRD5631 enhances autophagy via a mechanistic target of rapamycin (mTOR)-independent mechanism. While mTOR inhibitors are known to induce autophagy98,99, their use in the clinic is limited by unwanted side effects100. Therefore, BRD5631 represents an additional tool for studying autophagy.

The first few steps of the BRD5631 synthesis (Fig. 2c) closely mirror those of BRD0476 — an asymmetric aldol reaction followed by the intermolecular coupling of a chiral 1,2-aminoalcohol — but SNAr chemistry was incapable of closing the 12-membered ring. Instead, a head-to-tail macrolactamization successfully generated all eight stereoisomers of the core scaffold, which were then functionalized to afford a 7,936-member library that included BRD5631 (Ref. 101).
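
The count of eight stereoisomers is what one would expect from three independent stereogenic centres (2^3 = 8). The sketch below enumerates such configurations as R/S label combinations. It assumes three independent centres with no meso forms or other symmetry-related redundancy, which is an assumption of this illustration rather than a statement taken from the source; the function name is hypothetical.

```python
from itertools import product


def enumerate_stereoisomers(n_centres):
    """Return every R/S assignment for n_centres independent stereocentres.

    Assumes each centre can be set independently and that no two
    assignments describe the same compound (no meso forms).
    """
    if n_centres < 0:
        raise ValueError("n_centres must be non-negative")
    return list(product("RS", repeat=n_centres))


isomers = enumerate_stereoisomers(3)
for labels in isomers:
    print("".join(labels))  # e.g. RRR, RRS, ...
print(f"{len(isomers)} stereoisomers")
```

Including every configuration in the library is what makes the stereochemistry-based SARs mentioned earlier possible: activity that tracks with one configuration but not its neighbours points to a specific binding interaction.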

The structural similarity of intermediates 13 and 16 highlights the power of strategically designed compounds whose functional groups can be connected in different ways to generate different molecular skeletons. This example illustrates the build–couple–pair concept that mimics the way in which biosynthetic pathways construct natural products33,35,36,48,102. As a key tenet of DOS, the ability to access multiple scaffolds from common precursors promotes skeletal diversity31,35,103. Skeletally diverse libraries are becoming more valuable because they are thought to be more performance diverse104, which would aid the search for nMoA compounds.

Probes for cancer targets

Historically, cancer drug discovery was based in part on phenotypic assays that identified compounds that killed cancer cells more effectively than they killed normal cells. Thus, many traditional (and still widely used) anticancer drugs affect processes important in all dividing cells, such as DNA synthesis, with their therapeutic window thought to be derived from the greater importance of such processes in rapidly dividing cancer cells105,106,107. However, starting in the early 1980s, studies of the molecular basis of cancer have revealed that many cancers are driven by mutations in particular signalling proteins108; targeting these mutated proteins — which are not present in normal cells — may enable selective cancer cell killing. Consequently, target-based screening has become an increasingly popular strategy in oncology over the past two decades109. This era of molecularly targeted cancer therapy — enabled by technologies for investigating genetic differences between cancer cells and normal cells110,111 — has substantially improved the treatment of cancer. Many indications, however, remain without a targeted therapy, and resistance to new drugs continues to arise.

Cancer drug discovery has also been advanced by the development of potent and specific small-molecule probes that help characterize the protein targets that emerge from cancer genetics research. However, these proteins are often challenging to study with either small molecules or other modalities such as monoclonal antibodies. Therefore, if we are to exploit the findings of contemporary cancer research fully, the next generation of chemical probes should be capable of modulating the activities of historically challenging targets such as oncogenic transcription factors and GTPases112,113,114. The screening of compounds that contain under-represented structural features, generated via DOS or other methods, appears to be a promising strategy, and it has already produced several probes for cancer research115,116,117,118,119,120,121,122,123,124,125,126,127 (Fig. 3). The following vignettes discuss the discovery, optimization and use in cancer research of four notable tool compounds.

Figure 3: Chemical probes for cancer targets.

These probes include benzothiophene 18 and indoloquinolizine 20 as tools to study the cytoskeleton, a common cancer target115,118; compounds such as hydrazide 22 (Ref. 122) and benzannulated lactam 26 (Ref. 127) that are highly specific and can discriminate between proteins within the same sub-family; probes discovered via target-based screening that exhibit significant stereochemistry-based structure–activity relationships, such as 19 (Ref. 117) and 25 (Ref. 125); dienone 24, which sensitizes p53-deficient cells to DNA-damaging agents124; and probes of ostensibly 'undruggable' transcription factors and protein–protein interactions, such as lactam 21 (Ref. 121) and dihydropyran 23 (Ref. 123). Other compounds shown are described in more detail in the vignettes in the main text. Properties relevant to their value as probes are discussed in Box 2. BRD7/9, bromodomain-containing protein 7/9; HDAC8, histone deacetylase 8; HOXA13, homeobox A13; HSF1, heat shock factor 1; IDH1, isocitrate dehydrogenase 1; MCL1, myeloid cell leukaemia 1; PME1, phosphatase methylesterase 1.

BRD7880 — aurora kinases B/C. Our first example does not discuss the targeting of a novel protein or protein class, but it does illustrate an unusual degree of selectivity in targeting members of a well-known class — protein kinases — with compounds resulting from DOS. Misregulation of the cell cycle, especially mitosis, is one of many hallmarks of cancer. Some of the key regulators of this process are the aurora kinases, a family of serine/threonine protein kinases that coordinate the complex cytoskeletal manoeuvres that occur during cell division128. Alterations in aurora kinase activity have been linked to oncogenesis, and aurora kinases are often overexpressed in colorectal, breast and ovarian tumours, among others129,130. Several aurora kinase inhibitors have since entered clinical trials131,132,133, but none of them has won US Food and Drug Administration (FDA) approval. Achieving selective kinase inhibition can be challenging134, so the lack of clinical success may be a result of insufficient specificity.

An unbiased phenotypic screen of 8,000 compounds from multiple DOS pathways identified BRD7880 (Fig. 3) as a highly selective inhibitor of aurora kinases B and C119. In this study, pools of DNA-barcoded cancer cell lines were used in a series of small-molecule screens. Cell lines were barcoded via viral transduction of DNA identifiers, pooled, treated with compounds and assessed for viability via amplification and sequencing of the DNA barcodes119. One of the hits from the screen that exhibited selective and concentration-dependent activity was BRD7880. After comparing its sensitivity profile — which cell lines were killed and which were not — to those of over 400 compounds with known MoAs in the Cancer Therapeutics Response Portal (CTRP), its close similarity to the sensitivity profile of the pan-aurora kinase inhibitor tozasertib suggested that BRD7880 is an aurora kinase inhibitor. Follow-up biochemical assays revealed not only that this hypothesis was correct but also that BRD7880 — obtained directly from the screening collection with no further optimization — was extremely potent and considerably more selective than tozasertib (Fig. 4), which had previously been tested in clinical trials133. The only 2 kinases (out of 308) with >50% inhibition when treated with 30 nM BRD7880 were the aurora kinases B (half-maximal inhibitory concentration (IC50) = 7 nM) and C (IC50 = 12 nM)119.
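
The profile-matching step described above can be illustrated with a simple correlation-based ranking: the sensitivity profile of an uncharacterized compound is compared with the profiles of reference compounds of known MoA, and the closest match suggests a mechanistic hypothesis. The following is a minimal sketch under stated assumptions. Pearson correlation is used here as a stand-in similarity measure (the source does not specify the metric), and all names and viability values are invented toy data, not results from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("profiles must not be constant")
    return cov / sqrt(var_x * var_y)


def rank_references(query, references):
    """Rank reference compounds by similarity to the query, best match first."""
    scores = {name: pearson(query, profile) for name, profile in references.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: fraction of cells surviving in five hypothetical cell lines.
query_profile = [0.10, 0.85, 0.20, 0.90, 0.15]
reference_profiles = {
    "reference_A": [0.12, 0.80, 0.25, 0.95, 0.10],  # kills the same lines
    "reference_B": [0.90, 0.15, 0.80, 0.20, 0.85],  # opposite pattern
    "reference_C": [0.50, 0.55, 0.45, 0.50, 0.52],  # nearly flat
}

ranked = rank_references(query_profile, reference_profiles)
for name, score in ranked:
    print(f"{name}: r = {score:.2f}")
```

A high-ranking match is only a hypothesis about the MoA. As in the study described here, it needs to be confirmed by direct biochemical assays.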

Figure 4: Kinase affinity profiling reveals that BRD7880 is significantly more selective than tozasertib.

The KinomeScan assay reveals kinases for which a given small molecule shows significant affinity (compound decreases binding of control by >75%). Aurora kinases A, B and C are shown in blue; other kinase binding partners are shown in red. CAMK, Ca2+/calmodulin-dependent protein kinase; CK1, casein kinase 1; STE, serine/threonine kinase; TK, tyrosine kinase; TKL, tyrosine kinase-like. Reproduced from Ref. 119, Macmillan Publishers Limited.

The synthesis of BRD7880 (Fig. 5a) closely resembles that of BRD0476, a structurally similar benzannulated lactam (Fig. 2b). However, an important difference is the relative stereochemistry of the methyl group and the silyl ether in linear intermediate 28. An Evans asymmetric aldol reaction was performed to install the desired 1,2-syn stereodiad94. Cyclization via SNAr followed by functionalization of orthogonally protected lactam 29 furnished BRD7880.

Figure 5: Key transformations along the synthetic routes to selected compounds.

a | In a synthesis that resembles that of BRD0476, chiral oxazolidinone 27 was functionalized strategically to afford intermediate 28, which underwent SNAr-based cyclization and further elaboration to provide BRD7880, a remarkably selective aurora kinase inhibitor. b | Fully synthetic analogues of rocaglate natural products originated from hydroxyflavone 30. Rohinitib, a probe for heat shock factor 1 (HSF1) activity, and other rocaglate derivatives were synthesized via a biomimetic approach, proceeding through key intermediate 31. c | Starting from imine 32, linear intermediate 33 was generated via a diastereoselective and enantioselective nitro-Mannich reaction. After cyclization to afford lactam 34, appendage decoration furnished the bromodomain-containing protein 7/9 (BRD7/9) inhibitor LP99 as a single stereoisomer. d | The asymmetric synthesis of ABL127, a potent and selective probe of phosphatase methylesterase 1 (PME1), was achieved via generation of ketene 36 from carboxylic acid 35, followed by a [2 + 2] cycloaddition that was rendered enantioselective via a chiral nucleophilic catalyst.


BRD7880 is a particularly striking example of the impact of screening compounds that are accessible only via modern stereoselective synthesis. Because the aurora kinases had already been the subject of multiple drug-discovery campaigns, it is remarkable that the most specific inhibitor to date was identified directly from a screen of 8,000 compounds. This result highlights the potential efficiency of exploring new regions of chemical space made accessible by DOS strategies and advances in synthetic methodology. The rapid determination of the MoA of BRD7880 via high-dimensional assay data also provides a valuable precedent for future phenotypic screening campaigns. But the most direct impact of BRD7880 may stem from its use in studying the therapeutic hypothesis of aurora kinase inhibition more rigorously than had been possible before with chemical tools.

Rohinitib — HSF1. Heat shock factor 1 (HSF1), a transcription factor that coordinates the heat shock response in human cells135, is a regulator of ribosome biosynthesis that has also been associated with oncogenesis and poor outcomes in cancer patients136,137,138. Exploring the connections between HSF1 activity, translational flux and maintenance of the tumorigenic cell state could reveal new therapeutic targets and strategies for cancer.

While studying the effects of varying ribosomal activity on transcription, Santagata and colleagues discovered a potent and selective inhibitor of HSF1 (Ref. 116). They first observed that inhibition of translation elongation significantly decreased HSF1 activity, suggesting that HSF1 is involved in ribosome-mediated regulation of transcription. A screen of more than 300,000 compounds from the National Institutes of Health Molecular Libraries Probe Production Centers Network (NIH MLPCN) collection was then performed to identify small-molecule inhibitors of HSF1 activity. The resulting ~2,500 hits were then subjected to a dual reporter-based secondary screen to eliminate nonspecific compounds; remarkably, several ostensibly selective HSF1 inhibitors suppressed both reporters116,139, which underscores the challenges associated with modulating transcription factor activity140. The most potent and selective (>20-fold) hit, a natural product named rocaglamide A, was optimized to afford rohinitib (IC50 = ~20 nM; Fig. 3). This unique chemical probe was then used to characterize extensively the link between protein translation and HSF1 activity in cancer cells. In addition to yielding a number of fundamental insights into cancer biology, rohinitib suppressed tumour growth both in vitro and in vivo116.

The optimization of rocaglamide A involved both fully synthetic analogues and additional rocaglate natural products. All synthetic analogues, including rohinitib, were derived from the same 3-hydroxyflavone derivative, 30. The fused tricyclic rocaglate core was constructed via a biomimetic photocycloaddition–α-ketol rearrangement–reduction sequence141,142 (Fig. 5b). The use of a chiral Brønsted acid can render the [3 + 2] cycloaddition enantioselective143; alternatively, the desired enantiomer can be accessed via chiral resolution144. This synthetic pathway demonstrates the utility of oft-underutilized photochemical transformations in generating natural-product-like scaffolds from flat, sp2-rich precursors145.

LP99 — BRD7/9. Switch/sucrose non-fermentable (SWI/SNF) chromatin remodelling complexes are multi-protein complexes that regulate transcriptional processes involved in proliferation, DNA repair and other critical cellular functions. Loss of overall SWI/SNF function via inactivation of specific subunits has been linked to tumorigenesis, and SWI/SNF subunit mutations are found in roughly 20% of human cancers146,147. One such subunit is bromodomain-containing protein 7 (BRD7), which is an essential cofactor for p53-mediated tumour suppression148. Conversely, a subunit known as BRD9 is upregulated in some cancers and promotes the growth of acute myeloid leukaemia cell lines149. Because the human proteome contains dozens of bromodomain-containing proteins, selective inhibitors are needed to assess the viability of BRD7 and BRD9 as therapeutic targets.

Looking to develop the first selective BRD7/9 inhibitor, Clark and colleagues126 found that 1-methylquinolone could be elaborated into a potent and selective probe via asymmetric synthesis. 1-Methylquinolone had previously been identified via a crystallographic fragment screen against a BRD9-related protein called ATPase family AAA domain-containing protein 2 (ATAD2) (Ref. 150). From this promising starting point, additional structure-based and biophysical data informed the design of analogues that bind BRD9 with sub-micromolar affinity. The most potent compound, LP99 (KD = 99 nM; Fig. 3), stabilized only 2 of the 48 expressible human bromodomain-containing proteins at 10 μM as measured via differential scanning fluorimetry (BRD7/9, change in transition midpoint for thermal unfolding (ΔTm) > 4 °C; all others, ΔTm < 1 °C)126. LP99 was shown to disrupt the BRD7/9–chromatin interaction in cells without causing indiscriminate cell death. Preliminary experiments using LP99 as a tool compound found that BRD7 and BRD9 regulate inflammatory cytokine production, suggesting a novel anti-inflammatory therapeutic strategy.

While early analogues were synthesized as racemates and resolved via preparative-scale chiral chromatography, the active enantiomers of the most potent compounds, including LP99, were prepared using asymmetric synthesis. The key transformation in the pathway (Fig. 5c) is achieved via an organocatalyzed nitro-Mannich reaction that furnishes the desired intermediate 33 with high diastereoselectivity and enantioselectivity (d.r. = 7:1, e.r. = 19:1)126. Furthermore, this reaction can be performed on a gram scale, which facilitates the generation of additional analogues and the stockpiling of material for biological studies. Installation and decoration of the lactam core enabled the full synthesis of LP99 to be realized in just eight steps from commercially available reagents.
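As a rough aid for reading the selectivity figures quoted for intermediate 33, the ratios can be converted into percent excesses and an approximate share of the desired stereoisomer. The following is a minimal sketch: the function names are illustrative, and the final estimate assumes that the reported e.r. applies to the major diastereomer, which the text does not state.

```python
def excess_from_ratio(major: float, minor: float) -> float:
    """Convert a major:minor ratio (e.g. 19:1) into a percent excess."""
    return 100.0 * (major - minor) / (major + minor)


def major_fraction(major: float, minor: float) -> float:
    """Fraction of the mixture made up by the major component."""
    return major / (major + minor)


# Values reported for intermediate 33: d.r. = 7:1, e.r. = 19:1
ee = excess_from_ratio(19, 1)  # enantiomeric excess, in percent
de = excess_from_ratio(7, 1)   # diastereomeric excess, in percent

# Assumption (not stated in the text): the e.r. refers to the major
# diastereomer only, so the two fractions can simply be multiplied.
desired = major_fraction(7, 1) * major_fraction(19, 1)

print(f"ee = {ee:.0f}%, de = {de:.0f}%, desired isomer ~ {desired:.1%}")
# prints: ee = 90%, de = 75%, desired isomer ~ 83.1%
```

Under that assumption, roughly five-sixths of the crude nitro-Mannich product would be the desired stereoisomer before any purification.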

LP99 is a potent and selective chemical probe that has already uncovered previously unknown — and therapeutically promising — biology. Its development leveraged the interplay of many techniques, such as asymmetric catalysis and X-ray crystallography, and demonstrates that fragment-based drug design12 is also capable of generating compounds with under-represented chemical features.

ABL127 — PME1. The widespread development of kinase inhibitors to study and treat cancer demonstrates the importance of the phosphoproteome in oncology134. Various protein phosphatases regulate cellular dynamics by catalysing the dephosphorylation of serine, threonine and tyrosine residues. But despite the well-characterized link between aberrant phosphatase activity and cancer151,152, many fewer chemical probes exist for protein phosphatases than for their kinase counterparts, contributing to the belief that phosphatases are largely 'undruggable' (Refs 153, 154). Therefore, compounds that selectively modulate protein phosphatase activity, either directly or indirectly, would be useful small-molecule probes.

Protein phosphatase 2A (PP2A), a potent tumour suppressor that is responsible for the majority of serine/threonine dephosphorylation in eukaryotic cells, is inactivated in a variety of cancers155,156. Its negative regulation is mediated in part by protein phosphatase methylesterase 1 (PME1)157, a serine hydrolase whose overexpression has been linked to increased cellular proliferation and other disease phenotypes158. PME1 inhibition, therefore, is a potential anticancer therapeutic strategy.

A collaboration between the Cravatt and Fu laboratories afforded the first small-molecule inhibitor of PME1 (Ref. 120). After a high-throughput fluorescence polarization-activity-based protein profiling assay of PME1 activity was developed, a screen of more than 315,000 members of the NIH MLPCN collection, a small fraction of which included compounds submitted by academic laboratories practising asymmetric synthesis (for example, BRD7880 above), was conducted. This experiment revealed a set of four aza-β-lactams, including ABL127 (IC50 = 4.2 nM; Fig. 3), that inhibited PME1. Twenty-two other aza-β-lactams (including the enantiomers of three of the four hits) were significantly less active, which suggested that the observed PME1 inhibition resulted from a specific interaction. Follow-up studies showed that ABL127 inhibits PME1 in both cells (IC50 = 6.4–11.1 nM) and mice (50 mg/kg) and that this inhibition decreases demethylated PP2A levels. Additional activity-based protein profiling and 'click' chemistry-based experiments revealed that ABL127 exhibits selectivity across the entire proteome120, further increasing its value as an nMoA chemical tool to study both PME1 and PP2A.

Enantioselective nucleophilic catalysis enables a three-step asymmetric synthesis of ABL127 (Fig. 5d). After generation of ketene 36, a [2 + 2] cycloaddition with dimethyl azodicarboxylate affords ABL127. The cyclization is rendered enantioselective via the inclusion of a planar-chiral 4-pyrrolidinopyridine catalyst, and the transformation is compatible with a broad range of ketene and azo coupling partners159. An extension of this methodology was used to synthesize oxo-β-lactams and develop a general platform for crafting specific serine hydrolase inhibitors160.

Probes for infectious disease targets

Because agents of infectious disease are distinct from the host organism, the search for antimicrobials presents unique sets of opportunities and challenges. For the most common strategy of targeting the pathogen directly — rather than host-based mechanisms — on-target toxicity can be less of a concern because the drug target in the pathogen may not be found or may be significantly altered in the human patient. On the other hand, many pathogens can rapidly acquire drug resistance due to high rates of reproduction and DNA mutation161,162. Accordingly, high-throughput phenotypic screening for compounds that kill pathogens but not human cells is commonly used to find potential drug leads for infectious diseases163. Once a lead has emerged from screening, initial clues regarding the MoA of the compound can be gained via genetic methods. One approach involves incubating pathogens with sub-lethal concentrations of the compound to select for resistant strains that have altered sequences or copy numbers of the compound's target (as revealed by genome sequencing)164,165. Alternatively, systematically treating panels of pathogens in which a single gene is either deleted or overexpressed in each clone can point to relevant biology166.

Reflecting their often evolutionary origin in competition between microorganisms, natural products have been a prolific source of leads and drugs against infectious diseases, but the emergence of resistance and the exhaustion of 'low-hanging fruit' have led this strategy to become less productive in recent decades7,8,167,168. For some diseases, such as HIV/AIDS and hepatitis C, target-based drug discovery (often aided by knowledge of pathogen genetics) has made major contributions to the development of new medicines169,170. But in others, particularly in the search for novel antibiotics, such strategies have been much less fruitful. In these project areas, the compounds typically available in both commercial libraries and industry collections could be a major limiting factor in anti-infective drug discovery. For example, 'rule of five' compliance continues to shape the contents of many screening collections even though it has little bearing on the success of antimicrobials168,171.

One strategy to address these issues is to synthesize derivatives of known natural-product-based antimicrobials53,54,172,173. Alternatively, advances in synthetic methodology may be leveraged to access scaffolds that harbour chemical features frequently found in natural products without directly mimicking any specific class. Many promising next-generation antimicrobial agents contain such structural features174,175,176,177,178,179,180,181,182,183,184,185 (Fig. 6). The following four vignettes provide more detailed examples of the value of this approach in generating compounds that can be used to study and possibly treat infectious diseases.

Figure 6: Chemical probes and clinical candidates for infectious disease targets.

A library of chiral azetidines originally designed to mimic central nervous system (CNS)-active compounds206 has afforded promising compounds against parasites such as Leishmania donovani (the causative agent of visceral leishmaniasis; 45) and Plasmodium falciparum (the causative agent of malaria; 38, 39 and BRD7929)175,176,178,184. Macrocycle 37 (malaria) and benzannulated lactam 42 (Chagas disease) also exhibit antiparasitic activity174,181. Target-based and phenotypic screening strategies were used to identify oxazocane 41 and bridged bicyclic 43, which have bactericidal properties180,182. The Pictet–Spengler-derived spirocycles 40 and 44 show antimalarial and antiviral activity, respectively178,183. Other compounds shown are described in more detail in the vignettes in the main text. Properties relevant to their value as probes are discussed in Box 2. CDI, Clostridium difficile infection; MRSA, methicillin-resistant Staphylococcus aureus; RSV, respiratory syncytial virus.


NITD609 — malaria. Malaria is caused by several species of mosquito-borne Plasmodium parasites, with Plasmodium falciparum being responsible for the majority of fatalities186. But most antimalarials treat only the asexual blood stage of Plasmodium187, sparing the liver-stage and sexual blood-stage parasites that can sit dormant in the body and cause relapse at a later date188. Furthermore, the risk of drug resistance189,190,191 necessitates easy-to-follow dosing regimens that maximize patient compliance — ideally a single-dose cure with prophylaxis and transmission-blocking activities187. nMoA compounds that inhibit Plasmodium growth at multiple life stages are needed to address these challenges.

Prompted by the emergence of malaria-causing parasites resistant to artemisinin, an endoperoxide natural product192,193, scientists at the Novartis Institute for Tropical Diseases and their collaborators performed phenotypic screens with Plasmodium to find nMoA antimalarial leads177. After a screen of ~12,000 natural products and natural-product-like synthetic compounds, subsequent optimization yielded NITD609 (Fig. 6), a chiral spiroindolone that exhibits potent in vitro activity (IC50 = ~0.5–10 nM) against several strains of P. falciparum and Plasmodium vivax via an nMoA. Its molecular target, as first suggested by genomic sequencing of resistant parasites, was confirmed to be PfATP4, a cation-transporting P-type adenosine triphosphatase that maintains Plasmodium sodium homeostasis177,187,194,195. Studies measuring toxicity (50% cytotoxicity concentration (CC50)/IC50 > 10⁴; human ether-a-go-go-related gene potassium channel (hERG) IC50 > 30 μM), pharmacokinetics (100% bioavailability) and in vivo efficacy (single-dose cure in a Plasmodium berghei mouse model at 100 mg/kg) suggested that NITD609 would be compatible with once-daily oral dosing177, which was critical for its advancement into human trials as the first nMoA antimalarial clinical candidate in two decades187. Since entering the clinic, not only has NITD609 exhibited impressive efficacy196 but it has also been shown to prevent P. falciparum transmission from humans back to mosquitoes197. Its success has prompted the development of additional antimalarials that target PfATP4 (Refs 198, 199, 200).

Advances in asymmetric catalysis have made the synthesis of NITD609 efficient. Starting from 6-chloro-5-fluoroindole, the original synthetic route afforded racemic NITD609 in eight steps; chiral chromatography was needed to isolate the active enantiomer177. Due to the high cost of preparative-scale chiral chromatography, asymmetric routes would likely be more practical. The last step of the original synthesis, a Pictet–Spengler reaction, is highly diastereoselective but necessarily produces a racemate because the reaction lacks a chiral influence. Use of enantioenriched tryptamine derivatives or inclusion of a chiral phosphoric acid catalyst have been shown to furnish the appropriate enantiomer201,202. Alternative routes that do not rely upon the Pictet–Spengler reaction have also been devised. One such pathway incorporates a Ni-catalysed enantioselective aza-Diels–Alder reaction to construct NITD609's key spirocyclic ring system (Fig. 7a), affording the final product in just three steps from indole 46 and ketimine 47 (Ref. 203).

Figure 7: Key transformations along the synthetic routes to selected compounds.

a | One synthetic route to NITD609, a potent antimalarial with a novel mechanism of action (nMoA), involved an enantioselective aza-Diels–Alder reaction between indole 46 and ketimine 47 to afford spirocycle 48, which is one step from NITD609. b | The synthesis of antimalarial BRD7929 contained two key cyclizations. First, treatment of chloride 50 with strong base supplied azetidine 51. Subsequent functionalization furnished the ring-closing metathesis substrate 52, which was cyclized to diazocine 53 en route to BRD7929. c | From aniline 54, imine 55 was an ideal substrate for a diastereoselective and enantioselective Povarov reaction. The resulting tetrahydroquinoline 56 was readily converted to BRD0761, which shows nMoA activity against Clostridium difficile. d | Similar to BRD7929, the tuberculosis probe BRD4592 was accessed via a chiral azetidine (59), which in turn was generated from chloride 58 and amino alcohol 57.


BRD7929 — malaria. Buoyed by the success of earlier pilot experiments174,204,205, another effort to develop multistage antimalarial compounds began with a high-throughput screen of ~100,000 DOS-derived small molecules against a multidrug-resistant (MDR) strain of blood-stage P. falciparum178. These compounds represent hundreds of scaffolds, several of which were described above, that are not currently found outside of the Broad Institute. Hits from the primary screen were subjected to a panel of counter-screens to identify nMoA compounds that exhibited multistage activity. This analysis revealed clusters of structurally related bioactive compounds, most notably a series of bicyclic azetidines that showed nanomolar activity against all three Plasmodium life stages. For this series, genome sequencing of intentionally evolved resistant parasites showed mutations in the genetic locus predicted to encode phenylalanyl-tRNA synthetase (PfPheRS), a novel antimalarial target. Follow-up experiments confirmed that an optimized analogue of the original hit, BRD7929 (Fig. 6), was not only a potent inhibitor of PfPheRS activity in vitro (IC50 = 23 nM) but also a single-dose cure, prophylactic and inhibitor of disease transmission in mouse models of malaria when dosed at 25 mg/kg (Ref. 178).

The unusual bicyclic core of BRD7929 was constructed via two non-successive cyclizations (Fig. 7b). First, the azetidine was installed via treatment of chloride 50 with strong base to effect intramolecular ring closure, affording a 6:5 mixture of epimers that were separated by chromatography206. Then, after functionalization of azetidine 51, ring-closing metathesis constructed the 8-membered diazocine. Subsequent Sonogashira coupling, isocyanate condensation and Mitsunobu-mediated amination afforded BRD7929 (Ref. 178). While not currently applicable to BRD7929, recent advances in stereospecific C–H arylation methodology have improved the efficiency of synthetic pathways to bicyclic azetidines that lack functionalization at the 4-position207.

The development of NITD609 and BRD7929 highlights the power of combining asymmetric synthesis with phenotypic screening to discover new antimalarials. Multiple chemotypes showed potent activity against multidrug-resistant Plasmodium directly out of the screening deck; data surrounding many of these are freely available at the Malaria Therapeutics Response Portal (MTRP). Once optimized, both NITD609 and BRD7929 showed remarkable in vivo results that build upon an expanding pipeline of antimalarial compounds making their way through preclinical and clinical trials208,209.

BRD0761 — Clostridium difficile infection. Narrow-spectrum nMoA antibiotics could have an important role in addressing the growing threat posed by the emergence of drug resistance in the Gram-positive bacterium Clostridium difficile, an opportunistic pathogen that can cause severe diarrhoea, sepsis and death210,211. A high-throughput screen of the same ~100,000-member DOS library described above against eight bacterial organisms revealed potent and selective inhibitors of C. difficile growth185. Two structurally distinct series of compounds, one of which is represented by BRD0761 (Fig. 6), were explored further. Compounds from both series were more selective than fidaxomicin, an FDA-approved antibiotic that was expressly approved for its selective activity against C. difficile212. Furthermore, BRD0761 exhibited greater potency against several C. difficile isolates (minimum inhibitory concentration (MIC) = 0.06–1 μg/mL) than did vancomycin, which is used to treat severe cases of C. difficile infection210,213. Genomic analysis of resistant mutants and in silico modelling suggested that the target of BRD0761 is glutamate racemase, an essential protein involved in cell wall biosynthesis. While this enzyme is a known target in Helicobacter pylori, Mycobacterium tuberculosis (Mtb) and other Gram-positive bacteria214, it has not yet been validated in C. difficile185.

The preparation of BRD0761 (Fig. 7c) demonstrates the value of combining complexity-generating transformations with asymmetric catalysis. An organocatalyzed Povarov reaction developed in the Jacobsen laboratory was identified as an efficient method to generate tetrahydroquinoline-containing compounds with three contiguous stereocentres. This reaction was used to synthesize tetrahydroquinoline 56 with high diastereoselectivity and enantioselectivity215. Prepared in only two steps from commercially available starting materials, the resulting scaffold was readily elaborated into a 2,328-member library that contained BRD0761 (Ref. 185).

Despite their structural and biological differences, as described above, BRD7929 and BRD0761 were derived from the same screening collection. These results highlight the ability of prospecting libraries to generate nMoA leads with diverse chemotypes across different diseases.

BRD4592 — tuberculosis. Recent years have seen the alarming rise of MDR, extensively drug-resistant (XDR) and totally drug-resistant (TDR) strains of Mtb216,217,218, the causative agent of tuberculosis (TB). So, there is a substantial need for nMoA anti-TB compounds. Hoping to discover such compounds, a collaboration among researchers at the Broad Institute, the University of Chicago and Argonne National Laboratory screened roughly 83,000 compounds of the Broad DOS collection for activity against green fluorescent protein (GFP)-expressing Mtb179. This screen identified BRD4592 (Fig. 6), a chiral azetidine with three contiguous stereocentres, as an inhibitor of Mtb growth (MIC90 = 3 μM). Notably, its other seven stereoisomers were not scored as active, which suggests that its bactericidal activity is due to specific target engagement. Genetic analysis of resistant mutants followed by extensive kinetic, thermodynamic and biochemical characterization revealed the molecular target to be both subunits of tryptophan synthase (IC50α = 71 nM; IC50β = 23 nM), a previously untargeted essential metabolic enzyme179,219. BRD4592 stabilizes multiple enzyme conformations via allosteric binding (Fig. 8), which probably contributes to its high specificity; by contrast, substrate mimetics that target enzyme active sites are likely to exhibit substantial off-target activity220. BRD4592 was then shown to inhibit Mycobacterium marinum growth at 15 μM in zebrafish embryos, a common in vivo model of TB221.
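The stereoisomer count behind this comparison follows from simple enumeration: three stereocentres, each with two possible configurations, give 2^3 = 8 stereoisomers, of which seven are not BRD4592. A minimal sketch is shown below; the function name and the R/S strings are illustrative placeholders (the actual configuration of BRD4592 is not given here), and the count assumes no internal symmetry that would make some configurations identical.

```python
from itertools import product


def stereoisomers(n_centres: int) -> list[str]:
    """Enumerate every R/S assignment for n independent stereocentres."""
    return ["".join(cfg) for cfg in product("RS", repeat=n_centres)]


isomers = stereoisomers(3)

# Placeholder: pick an arbitrary entry to stand in for the active isomer.
# This is NOT the real configuration of BRD4592.
active = isomers[0]
inactive = [s for s in isomers if s != active]

print(len(isomers), len(inactive))
# prints: 8 7
```

Testing all eight members of such a set is what allows a screen to distinguish specific target engagement from nonspecific effects that would be shared across stereoisomers.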

Figure 8: Stereo view of BRD4592 (cyan) bound at the interface of the α and β subunits of tryptophan synthase.

X-ray crystallography was used to solve the co-crystal structure of BRD4592 and tryptophan synthase. Hydrogen bonds and water molecules are shown as dashed lines and red spheres, respectively. Reproduced from Ref. 179, Macmillan Publishers Limited.


The versatile azetidine-based synthetic pathway that afforded BRD7929 can be redirected to generate BRD4592 (Fig. 7d). As before, cyclization of chloride 58 produced a near-equal mixture of epimers at the 2-position that were easily separated via chromatography206. Suzuki cross-coupling and functional group deprotections then furnished BRD4592. BRD7929 and BRD4592 highlight just two of the many distinct skeletons that can be synthesized from azetidines 51 and 59 and their stereoisomers; the full complement of bridged, fused and spirocyclic ring systems that can be accessed via this chemistry is discussed elsewhere206.

The discovery of BRD4592 offers lessons for antimicrobial drug discovery. The use of phenotypic screening both yielded membrane-permeable hits (an important issue in finding inhibitors of Mtb164,222,223) and enabled the discovery of a new therapeutic target outside of previously drugged pathways179. In addition, although the similarities between the syntheses and chemical structures of BRD4592 and BRD7929 are striking, we hesitate to call the azetidine a 'privileged scaffold'; for example, the corresponding pyrrolidines were not included in the library and are therefore not available for comparison. Rather, we propose that these two vignettes provide evidence that compounds that are structurally dissimilar from those in screening collections are likely to lead to novel biological MoAs and therapeutic insights — different chemistry yields different outcomes.

Conclusions and outlook

The ability to modulate biological systems reversibly, selectively and with temporal control is one of the most powerful weapons in the chemical biologist's arsenal. By advancing our understanding of disease biology, high-quality small-molecule probes have catalysed new research efforts to test therapeutic hypotheses and develop new medicines66. However, we must be wary of compounds whose activities have not been sufficiently or accurately defined65,224 (Box 2). Furthermore, much of our currently available chemical matter is both structure-redundant and performance-redundant225, which hampers our ability to develop nMoA compounds that can exploit insights from human genetics or combat drug-resistant pathogens.

As this article has attempted to illustrate, advances in synthesis have enabled the construction of libraries of natural-product-like compounds. These novel chemistries have afforded nMoA compounds with potential therapeutic value across a broad range of diseases. While novel does not necessarily mean better, probe discovery and drug discovery campaigns are profoundly influenced by the quality of their starting points ('front loading')226. Earlier analyses, such as those conducted by Michael Hann and colleagues227,228, have suggested that libraries enriched in these complex compounds are more likely to be inefficient in high-throughput screening experiments. However, the probes and drug leads discussed in this article were all derived from screens of 8,000 to ~315,000 compounds, suggesting that such libraries can generate useful hits and leads at least as efficiently as the large libraries used in the pharmaceutical industry229.

Biomedical research would also benefit from general, high-throughput methods for discovering small-molecule binders to the novel therapeutic targets uncovered by human and microbial genetics, molecular biology, cellular biology and chemical biology. Even if they do not alter protein activity, binders can be used to study how biological systems respond to various perturbations230,231. For example, they can be applied to discover compounds that promote new protein–protein interactions or stabilize or destabilize their cellular targets. In addition, the binders can be equipped with additional chemical motifs that promote the cellular mis-localization or degradation of targeted proteins232,233. Unfortunately, methods for discovering binders are often time consuming and resource intensive234,235. Affinity enrichment-based screening of libraries of DNA-barcoded compounds, however, is a promising technique that is gaining in popularity due to advances in DNA sequencing, among other factors236. Current DNA-encoded libraries are largely populated by peptidomimetics and sp2-rich compounds237, similar to commercial libraries of 'conventional' small molecules. Perhaps, then, DNA-encoded libraries could also benefit from the inclusion of compounds that contain under-represented structural features. This concept poses a future challenge for synthetic organic chemistry because modern chemistries are needed that are compatible with the conditions required for concomitant DNA barcoding.

Looking ahead, the building of performance-diverse small-molecule libraries would be more efficient if a rapid feedback mechanism were to exist between synthetic organic chemistry and biology. The current multi-year retrospective and time-consuming method for assessing small-molecule bioactivity precludes biological performance from influencing library design225, and structural diversity alone has been shown to be a poor predictor of biological performance diversity238. On the other hand, high-dimensional (multiplexed) biological assays that track changes in gene expression239, cell morphology240 and other cellular features241 can provide chemists with near-immediate feedback on their synthetic decisions. By associating each compound with thousands of biological changes, these assays constitute the backbone of a means for real-time biological annotation of reaction products242,243. Such a feedback loop would indicate during library construction, rather than years later, the reactions, scaffolds and synthetic pathways worth pursuing for maximal performance diversity. These assays also enable the hand-picking of small clusters of compounds, with each cluster having either known or novel MoAs, and thus collectively provide an empirical means for the construction of performance-diverse compound collections for probe and drug development. Chemists have neither the time nor the resources to synthesize every possible compound, so a prudent approach to bioactive compound discovery would incorporate the principles of the scientific method (hypotheses, experiments, observations and conclusions) whenever possible.
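One way to picture how multiplexed assay profiles could guide the selection of a performance-diverse subset is a greedy "max-min" pick over profile distances. The sketch below is purely illustrative and is not a method described in this article: the compound names, the three-feature toy profiles, the choice of cosine distance and the function names are all assumptions, and real profiles would contain hundreds or thousands of features.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity; assumes neither profile is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def pick_diverse(profiles: dict[str, list[float]], k: int) -> list[str]:
    """Greedy max-min selection: repeatedly add the compound whose
    nearest already-chosen neighbour is farthest away."""
    names = sorted(profiles)
    chosen = [names[0]]  # arbitrary but deterministic starting point
    while len(chosen) < min(k, len(names)):
        best = max(
            (n for n in names if n not in chosen),
            key=lambda n: min(
                cosine_distance(profiles[n], profiles[c]) for c in chosen
            ),
        )
        chosen.append(best)
    return chosen


# Hypothetical assay profiles (e.g. normalized feature changes).
profiles = {
    "cpd_A": [1.0, 0.0, 0.0],
    "cpd_B": [0.9, 0.1, 0.0],  # nearly redundant with cpd_A
    "cpd_C": [0.0, 1.0, 0.0],
    "cpd_D": [0.0, 0.0, 1.0],
}

selected = pick_diverse(profiles, 3)
print(selected)
# prints: ['cpd_A', 'cpd_C', 'cpd_D']
```

In this toy case the performance-redundant compound (cpd_B) is the one left out, which mirrors the idea of choosing compounds by biological behaviour rather than by structure alone.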

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.