Ultra-large library docking for discovering new chemotypes

Lyu, Jiankun; Wang, Sheng; Balius, Trent E.; Singh, Isha; Levit, Anat; Moroz, Yurii S.; O’Meara, Matthew J.; Che, Tao; Algaa, Enkhjargal; Tolmachova, Kateryna; Tolmachev, Andrey A.; Shoichet, Brian K.; Roth, Bryan L.; Irwin, John J.

doi:10.1038/s41586-019-0917-9

Article
Published: 06 February 2019

Nature volume 566, pages 224–229 (2019)

Subjects

Chemical libraries

Abstract

Despite intense interest in expanding chemical space, libraries containing hundreds-of-millions to billions of diverse molecules have remained inaccessible. Here we investigate structure-based docking of 170 million make-on-demand compounds from 130 well-characterized reactions. The resulting library is diverse, representing over 10.7 million scaffolds that are otherwise unavailable. For each compound in the library, docking against AmpC β-lactamase (AmpC) and the D₄ dopamine receptor was simulated. From the top-ranking molecules, 44 and 549 compounds were synthesized and tested for interactions with AmpC and the D₄ dopamine receptor, respectively. We found a phenolate inhibitor of AmpC, which revealed a group of inhibitors without known precedent. This molecule was optimized to 77 nM, which places it among the most potent non-covalent AmpC inhibitors known. Crystal structures of this and other AmpC inhibitors confirmed the docking predictions. Against the D₄ dopamine receptor, hit rates fell almost monotonically with docking score, and a hit-rate versus score curve predicted that the library contained 453,000 ligands for the D₄ dopamine receptor. Of 81 new chemotypes discovered, 30 showed submicromolar activity, including a 180-pM subtype-selective agonist of the D₄ dopamine receptor.

**Fig. 1: Make-on-demand compounds are diverse and have increased exponentially.**

**Fig. 2: Structural fidelity between docked-predicted and crystallographically determined poses of the new β-lactamase inhibitors.**

**Fig. 3: Testing 549 molecules at different docking ranks against the D₄ receptor.**

**Fig. 4: Estimating the number of active D₄ receptor ligands in the 138 million compound library.**

Data availability

Active molecules reported here are available from B.K.S. or directly from Enamine. The four structures of AmpC determined with the new docking hits are available from the PDB with accession numbers 6DPZ, 6DPY, 6DPX and 6DPT. The compounds docked in this study are freely available from our ZINC lead-like make-on-demand library (http://zinc15.docking.org). All active compounds are available either from the authors or may be purchased from Enamine. Figures with associated raw data include: Fig. 2, for which electron density and reflection files are deposited with the PDB; Figs. 3, 4 and Extended Data Fig. 5, for which Source Data are available in the online version of the paper; Extended Data Fig. 1, for which the data are included in Supplementary Table 1; Extended Data Fig. 6, for which raw clustering or no-clustering rank numbers are included in Supplementary Tables 8, 9. Further data are provided in Supplementary Tables 3, 5 (aggregation assays for AmpC inhibitors and D₄ ligands); Extended Data Table 1 (crystallographic data collection and refinement); Supplementary Tables 9, 10 and Supplementary Data 12–15 (chemical purity of active ligands and their spectra); Supplementary Data 11 and 14 (synthetic routes to compounds). All other data are available from the authors on request.

Acknowledgements

This research was supported by GM71896 (to J.J.I.); R35 GM122481 and a UCSF PBBR New Frontier Award (to B.K.S.); R01 MH112205, U24DK1169195 and the NIMH Psychoactive Drug Screening Contract (to B.L.R.); Strategic Priority Research Program of the Chinese Academy of Sciences, grant number XDB19000000 (to S.W.). We thank R. Stein and I. Fish for help with AmpC preparation, H. Torosyan for aggregation assays, R. H. J. Olsen for developing the D₄ receptor BRET assay, B. Wong and C. Dandarchuluun for computer support, and M. Korczynska and J. Pottel for reading this manuscript; ChemAxon for a license to JChem, OpenEye Scientific Software for a license to OEChem and Omega2, Molecular Networks for a license to Corina, and Molinspiration for a license to Mitools.

Reviewer information

Nature thanks M. M. Babu, D. E. Gloriam and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Author information

These authors contributed equally: Jiankun Lyu, Sheng Wang, Trent E. Balius, Isha Singh

Authors and Affiliations

Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, USA
Jiankun Lyu, Trent E. Balius, Isha Singh, Anat Levit, Matthew J. O’Meara, Enkhjargal Algaa, Brian K. Shoichet & John J. Irwin
State Key Laboratory of Bioreactor Engineering, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai, China
Jiankun Lyu
State Key Laboratory of Molecular Biology, CAS Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, China
Sheng Wang
Department of Pharmacology, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
Sheng Wang, Tao Che & Bryan L. Roth
National Taras Shevchenko University of Kiev, Kiev, Ukraine
Yurii S. Moroz
Chemspace, Riga, Latvia
Yurii S. Moroz
Enamine, Kiev, Ukraine
Kateryna Tolmachova & Andrey A. Tolmachev
Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Bryan L. Roth
National Institute of Mental Health Psychoactive Drug Screening Program (NIMH PDSP), School of Medicine, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, NC, USA
Bryan L. Roth

Contributions

J.J.I. and B.K.S. conceived the study. J.J.I. created the enlarged docking libraries. J.L., T.E.B. and A.L. performed the docking. S.W., T.C. and B.L.R. performed D₄ receptor assays and analysis. I.S. performed all crystallography and assays for AmpC. Y.S.M., K.T. and A.A.T. directed the compound synthesis, purification and characterization. M.J.O. performed Bayesian modelling. E.A., T.E.B. and J.L. contributed enabling computer code. B.K.S., J.L., J.J.I., B.L.R., T.E.B., I.S., A.L., S.W., T.C. and M.J.O. wrote the paper.

Corresponding authors

Correspondence to Brian K. Shoichet, Bryan L. Roth or John J. Irwin.

Ethics declarations

Competing interests

B.K.S. and J.J.I. are founders of a company, BlueDolphin LLC, that works in the area of molecular docking. All other authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Simulating the effect of library size on ligand enrichment among the top 1,000 docked molecules.

a, b, The energy distribution of ligands (a) and decoys (b) from docking enrichment calculations against AmpC. The skew-normal fitted curves are plotted as red lines. The fitting parameters (shape (α), location (loc) and scale values) are shown. c, Heat maps of the number of active molecules in the top 1,000 docked molecules for 6 targets. The number of ligands in the top 1,000 docked molecules for a given library size and the ratio between ligands and decoys is coloured using a log₁₀(number of ligands) scale ranging from 1 (blue) to 1,000 (red). Cells with zero ligands are shown in white. d, Large-library docking screens of AmpC (top, n = 99 million molecules) and D₄ (bottom, n = 138 million molecules). Molecules that are known to bind to AmpC and D₄, as well as close analogues, are treated as ligands and the rest of the molecules are treated as decoys. Left, the energy distributions of decoys (grey), ligands defined by ECFP4 Tc similarity ≥0.5 (blue), 0.6 (green) and 0.7 (orange) to ligands from ChEMBL. Middle, heat maps of the number of ligands in the top 1,000 docked molecules based on fit to full-library docking with the ligands (AmpC, Tc ≥ 0.5, green; D₄, Tc ≥ 0.6, orange) and decoys (grey) distributions. Right, number of ligands in the top 1,000 docked molecules as the library grows based on actual distributions plotted in the left panel. Data are mean ± s.d. of 20 samples. See Supplementary Table 1 for retrospective performance on three more targets.
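
The library-size simulation summarized in this caption can be illustrated with a minimal Python sketch. It assumes, as the caption states, that ligand and decoy docking energies each follow a skew-normal distribution, and it counts how many ligands land among the best-scoring 1,000 molecules. The function names, the `ligand_fraction` parametrization and every numeric parameter below are hypothetical illustrations, not fitted values from the paper, and the sampler uses the standard two-Gaussian construction of a skew-normal variate rather than the authors' own software.

```python
import heapq
import math
import random


def sample_skew_normal(rng, alpha, loc, scale):
    """Draw one skew-normal variate with shape alpha, location loc and scale."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    z0 = abs(rng.gauss(0.0, 1.0))  # half-normal component carries the skew
    z1 = rng.gauss(0.0, 1.0)
    return loc + scale * (delta * z0 + math.sqrt(1.0 - delta * delta) * z1)


def ligands_in_top(rng, n_library, ligand_fraction, ligand_params,
                   decoy_params, top_n=1000):
    """Count ligands among the top_n lowest (best) simulated docking energies.

    ligand_params and decoy_params are (alpha, loc, scale) tuples.
    """
    n_ligands = round(n_library * ligand_fraction)
    scored = [(sample_skew_normal(rng, *ligand_params), True)
              for _ in range(n_ligands)]
    scored += [(sample_skew_normal(rng, *decoy_params), False)
               for _ in range(n_library - n_ligands)]
    # More negative energy means a better docking score.
    best = heapq.nsmallest(top_n, scored, key=lambda pair: pair[0])
    return sum(1 for _, is_ligand in best if is_ligand)


# Hypothetical parameters: ligands score slightly better than decoys on average.
rng = random.Random(0)
for size in (10_000, 100_000):
    count = ligands_in_top(rng, size, 0.001, (-2.0, -40.0, 8.0), (-1.0, -30.0, 8.0))
    print(size, count)
```

Repeating the call over a grid of library sizes and ligand fractions would give the kind of table that the heat maps in panel c visualize.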

Extended Data Fig. 2 Initial hits and selected analogues against AmpC.

The five initial hits are shown in the left column. Under each compound, the first row includes the ZINC identifier; the second row is the cluster rank (position in cluster head list sorted by DOCK score) with global rank (position in unclustered hit list sorted by DOCK score) shown in brackets; the third row is the Tc value (Tanimoto coefficient to known AmpC inhibitors in ChEMBL); the fourth row is the K_i value. Five selected analogues for the corresponding hits are shown in the right column. Under each compound, the first row includes the ZINC identifier; the second row is the Tc value; and the third row is the K_i value.
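
The Tc value reported under each compound is a Tanimoto coefficient. As a minimal sketch, assuming each fingerprint is represented as a Python set of on-bit indices (a simplification; the representation and the function names here are hypothetical, and no cheminformatics toolkit is used), the coefficient and the maximum similarity to a set of known ligands could be computed as follows.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def max_tc_to_known(candidate, known_ligands):
    """Highest Tc between a candidate and any known ligand fingerprint."""
    return max((tanimoto(candidate, known) for known in known_ligands), default=0.0)


# Toy fingerprints, not real molecules.
candidate = {1, 4, 9, 16}
known = [{1, 4, 25}, {2, 3, 5}]
print(max_tc_to_known(candidate, known))
```

A low maximum Tc to known inhibitors is what marks a hit as a new chemotype in this kind of comparison.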

Extended Data Fig. 3 Lineweaver–Burk plot and K_i analysis for analogues of each of the five series of AmpC inhibitors.

a–f, Lineweaver–Burk plots for ZINC776666294 (a), 275579920 (b), ZINC548592534 (c), ZINC1187516987 (d), 339204163 (e) and 549719643 (f), indicating competitive inhibition. IC₅₀ values were determined by nonlinear regression fit in GraphPad Prism, and K_i values calculated by a replot of the slope of each Lineweaver–Burk plot versus the corresponding inhibitor concentration.
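
The slope replot mentioned in this caption can be sketched in Python under the textbook competitive-inhibition model, in which the Lineweaver–Burk slope equals (K_m/V_max)(1 + [I]/K_i). That slope is linear in inhibitor concentration, so K_i is the intercept of the replot divided by its slope. The helper names and the synthetic kinetic constants below are hypothetical; the authors report using GraphPad Prism, not this code.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk_slope(substrate_concs, rates):
    """Slope of 1/v versus 1/[S] at one inhibitor concentration."""
    slope, _ = fit_line([1.0 / s for s in substrate_concs],
                        [1.0 / v for v in rates])
    return slope


def ki_from_replot(inhibitor_concs, lb_slopes):
    """K_i from the replot of Lineweaver-Burk slope versus [I] (competitive model)."""
    slope, intercept = fit_line(inhibitor_concs, lb_slopes)
    return intercept / slope


def competitive_rate(s, i, vmax, km, ki):
    """Michaelis-Menten rate with a competitive inhibitor (used for synthetic data)."""
    return vmax * s / (km * (1.0 + i / ki) + s)


# Synthetic, noise-free data with made-up constants.
substrate = [10.0, 25.0, 50.0, 100.0, 200.0]
inhibitor = [0.0, 1.0, 2.0, 4.0]
slopes = [
    lineweaver_burk_slope(
        substrate, [competitive_rate(s, i, 1.0, 50.0, 2.0) for s in substrate])
    for i in inhibitor
]
print(ki_from_replot(inhibitor, slopes))
```

With real, noisy measurements the replot would not be exactly linear, and a nonlinear global fit is often preferred over double-reciprocal plots.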

Extended Data Fig. 4 Electron density maps for AmpC–inhibitor complexes.

The initial F_o − F_c electron density map contoured at 2.5σ around the inhibitor (density in cyan) with refined 2F_o − F_c electron density contoured at 1σ for enzyme residues for the complexes with the following compounds. a, 547933290. b, 275579920. c, 339204163. d, 549719643. Inhibitor carbons are shown in cyan and enzyme carbons are shown in grey, oxygens in red, nitrogens in blue, sulfurs in yellow and chlorides in green.

Extended Data Fig. 5 Selected D₄ hits from docking 138 million make-on-demand molecules.

Six ligands with docked poses (first column), cAMP Gα_i/o activities (second column), Tango β-arrestin activities (third column) and ³H-N-methylspiperone displacement and chemical drawing (fourth column) are shown. The receptor structure is in grey and ligand carbons are in teal. Ballesteros–Weinstein residue numbers are included as superscripts. Functional assays represent normalized concentration–response curves of the ligands in cloned human D₄-mediated activation of Gα_i/o and β-arrestin translocation. Data are mean ± s.e.m. of three assays. The first row shows an example of an antagonist identified among the D₄ hits. Both agonist (teal curve) and antagonist (purple curve) modes are shown for ZINC130532671 in the third panel; the concentration of quinpirole in the antagonist mode was 100 nM.

Extended Data Fig. 6 Pre-clustering the docking library yields much worse scores of scaffold representatives compared to full library docking.

a, b, Comparison of energy distributions of scaffold representatives between full library docking (orange) and pre-clustered library docking for D₄ (a) and AmpC (b) using four strategies: the closest member to the centroid of molecular masses and clogP (blue), the closest member to the centroid of molecular masses (pink), the member with the largest molecular masses (magenta) and the member with the smallest molecular masses (green). The inset shows the ratio of the number of molecules at a given docking score for full library docking divided by the number at that score when only cluster representatives are docked (coloured by clustering method). For each target, two examples illustrate the effect on our experimentally active scaffold families. c, D₄. d, AmpC. The scaffold for each molecule is highlighted in red. The ZINC identifier, post-cluster rank and pre-cluster rank are labelled for each pair. The arrow colour is as for the pre-clustering methods in a and b.

Extended Data Fig. 7 Comparison of hit rates achieved by combined docking score and human prioritization compared to the rates achieved by docking score alone.

a, The hit rates for selecting compounds at different scoring ranges by each strategy: human prioritization and docking score (orange), or docking score alone (blue). Hit rate is the ratio of active compounds/tested compounds; the raw numbers appear at the top of each bar. b, Distribution of the binding affinity level among the hits from a. There are 32 hits from human prioritization and docking score, and 26 hits from the docking score alone. These are divided into three affinity ranges: <100 nM (pale blue); 100 nM–1 μM (blue); 1–10 μM (dark blue). c, Functional activity distribution among the hits from b. There are 22 molecules from human prioritization and docking score, and 7 molecules from the docking score alone. These are divided into five activity ranges: <10 nM (pale green); 10 nM–1 μM (light green); 1–10 μM (olive); 10–50 μM (forest green); and not determined (dark green).

Extended Data Fig. 8 Bayesian prior modelling for balancing information gain and ligand discovery in molecule-selection design and error estimation.

a, Sigmoid functional form for the hit-rate model. b–d, Marginal Bayesian prior (teal) and posterior (red) distributions (n = 200,000) for each model parameter. b, Top. c, Dock₅₀. d, Slope. e, Estimated hit rate based on evaluation by the authors of the docked poses before any molecules were tested. Brown, mean ± s.d.; n = 200, 220, 230, 230, 285, 235, 210, 230 and 200 compounds; n = 5, 4, 4, 4, 4, 4, 4, 4 and 4 experts. The prior mean (green) and samples (n = 200) from the prior (blue) are shown. f, Candidate (blue) and chosen (orange) experimental designs (inset, designs 1–6), with expected number of hits and information gain for each design. g, Expected number of active scaffolds (orange, mean; grey, posterior draws n = 200,000) superimposed on the total number of scaffold cluster heads (black). h, i, Marginal distribution of the number of active compounds (h) and scaffolds (i) over the posterior distributions (n = 200,000).
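
A sigmoid hit-rate model with the three parameters named in this caption (Top, Dock₅₀ and Slope) can be sketched as below. The exact functional form is not given on this page, so the logistic expression here is an assumption, as are the function names, the parameter values and the score histogram. The sketch shows how such a curve, multiplied by the number of molecules at each docking score, yields an expected count of active molecules.

```python
import math


def hit_rate(score, top, dock50, slope):
    """Assumed logistic hit-rate curve.

    top    : plateau hit rate at very favourable (very negative) scores
    dock50 : docking score at which the hit rate is half of top
    slope  : steepness of the transition (positive)
    """
    return top / (1.0 + math.exp(slope * (score - dock50)))


def expected_actives(score_bins, top, dock50, slope):
    """Sum hit_rate x molecule count over (bin_centre_score, count) pairs."""
    return sum(count * hit_rate(centre, top, dock50, slope)
               for centre, count in score_bins)


# Hypothetical histogram of docking scores: (bin centre, number of molecules).
bins = [(-70.0, 1_000), (-60.0, 50_000), (-50.0, 2_000_000), (-40.0, 30_000_000)]
print(expected_actives(bins, top=0.3, dock50=-55.0, slope=0.4))
```

In a Bayesian treatment such as the one the caption describes, the three parameters would be drawn from a posterior distribution and the sum repeated per draw, which gives a distribution over the number of actives instead of a single value.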

Extended Data Table 1 Data collection and refinement statistics of AmpC inhibitors

Extended Data Table 2 The highest-affinity direct docking hits for the D₄ receptor

Supplementary information

Supplementary Information

This file contains Supplementary Tables and Data 1-15, except Supplementary Tables 2, 4, 7 and 8, which are provided as separate files.

Reporting Summary

Supplementary Table

This file contains Supplementary Table 2: Molecules tested against β-lactamase AmpC. We report the ZINC ID, SMILES and an indicator of binding or non-binding (supplied as a separate file; see Extended Data Fig. 2 for affinities and chemical drawings of potent binders).

Supplementary Table

This file contains Supplementary Table 4: All molecules tested against D4.

Supplementary Table

This file contains Supplementary Table 7: Full-library vs pre-clustering library docking for D4.

Supplementary Table

This file contains Supplementary Table 8: Full-library vs pre-clustering library docking for AmpC.

Source data

Source Data Fig. 3

Source Data Fig. 4

Source Data Extended Data Fig. 5

About this article

Cite this article

Lyu, J., Wang, S., Balius, T.E. et al. Ultra-large library docking for discovering new chemotypes. Nature 566, 224–229 (2019). https://doi.org/10.1038/s41586-019-0917-9

Received: 26 June 2018
Accepted: 04 January 2019
Published: 06 February 2019
Issue Date: 14 February 2019
DOI: https://doi.org/10.1038/s41586-019-0917-9
