Chemical creativity in the design of new synthetic chemical entities (NCEs) with drug-like properties has been the domain of medicinal chemists. Here, we explore the capability of a chemistry-savvy machine intelligence to generate synthetically accessible molecules. DINGOS (design of innovative NCEs generated by optimization strategies) is a virtual assembly method that combines a rule-based approach with a machine learning model trained on successful synthetic routes described in chemical patent literature. This unique combination enables a balance between ligand-similarity-based generation of innovative compounds by scaffold hopping and the forward-synthetic feasibility of the designs. In a prospective proof-of-concept application, DINGOS successfully produced sets of de novo designs for four approved drugs that were in agreement with the desired structural and physicochemical properties. Target prediction indicated more than 50% of the designs to be biologically active. Four selected computer-generated compounds were successfully synthesized in accordance with the synthetic route proposed by DINGOS. The results of this study demonstrate the capability of machine learning models to capture implicit chemical knowledge from chemical reaction data and suggest feasible syntheses of new chemical matter.
The trained machine learning model, CAS numbers of the training data and reaction SMARTS used in this Article are provided in the Code Ocean capsule https://doi.org/10.24433/CO.6930970.v1 (ref. 32). All molecules were preprocessed in accordance with the procedure stated in the Methods (see ‘Molecular building blocks’ section).
Shih, H.-P., Zhang, X. & Aronov, A. M. Drug discovery effectiveness from the standpoint of therapeutic mechanisms and indications. Nat. Rev. Drug Discov. 17, 19–33 (2017).
Hartenfeller, M. & Schneider, G. Enabling future drug discovery by de novo design. Wiley Interdiscip. Rev. Comput. Mol. Sci. 1, 742–759 (2011).
Blakemore, D. C. et al. Organic synthesis provides opportunities to transform drug discovery. Nat. Chem. 10, 383–394 (2018).
Schneider, P. & Schneider, G. De novo design at the edge of chaos. J. Med. Chem. 59, 4077–4086 (2016).
Sliwoski, G., Kothiwale, S., Meiler, J. & Lowe, E. W. Computational methods in drug discovery. Pharmacol. Rev. 66, 334–395 (2013).
Chen, H., Engkvist, O., Wang, Y., Olivecrona, M. & Blaschke, T. The rise of deep learning in drug discovery. Drug Discov. Today 23, 1241–1250 (2018).
Merk, D., Friedrich, L., Grisoni, F. & Schneider, G. De novo design of bioactive small molecules by artificial intelligence. Mol. Inform. 37, 1700153 (2018).
Gupta, A. et al. Generative recurrent networks for de novo drug design. Mol. Inform. 37, 1700111 (2018).
Merk, D., Grisoni, F., Friedrich, L. & Schneider, G. Tuning artificial intelligence on the de novo design of natural-product-inspired retinoid X receptor modulators. Commun. Chem. 1, 68 (2018).
Lowe, D. M. Chemical reactions from US patents (1976–Sep2016) (2017); https://figshare.com/articles/Chemical_reactions_from_US_patents_1976-Sep2016_/5104873
Coley, C. W., Rogers, L., Green, W. H. & Jensen, K. F. Computer-assisted retrosynthesis based on molecular similarity. ACS Cent. Sci. 3, 1237–1245 (2017).
Feng, F., Lai, L. & Pei, J. Computational chemical synthesis analysis and pathway design. Front. Chem. 6, 199 (2018).
Szymkuć, S. et al. Computer-assisted synthetic planning: the end of the beginning. Angew. Chem. Int. Ed. 55, 5904–5937 (2016).
Segler, M. H. S., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 604–610 (2018).
Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28, 31–36 (1988).
Grisoni, F. et al. Scaffold hopping from natural products to synthetic mimetics by holistic molecular similarity. Commun. Chem. 1, 44 (2018).
Merk, D., Grisoni, F., Friedrich, L., Gelzinyte, E. & Schneider, G. Scaffold hopping from synthetic RXR modulators by virtual screening and de novo design. Med. Chem. Commun. 9, 1289–1292 (2018).
Grisoni, F., Merk, D., Byrne, R. & Schneider, G. Scaffold-hopping from synthetic drugs by holistic molecular representation. Sci. Rep. 8, 16469 (2018).
MACCS-II (MDL Information Systems, 1987).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In Proceedings of 3rd International Conference on Learning Representations, ICLR 2015, 1–13 (2015).
Gaulton, A. et al. The ChEMBL database in 2017. Nucleic Acids Res. 45, D945–D954 (2017).
ChEMBL Database (EBI, 2017); https://www.ebi.ac.uk/chembl/
Johnson, M. A. & Maggiora, G. M. Concepts and Applications of Molecular Similarity (Wiley, 1990).
Lipinski, C. A., Lombardo, F., Dominy, B. W. & Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 23, 3–25 (1997).
Reker, D., Rodrigues, T., Schneider, P. & Schneider, G. Identifying the macromolecular targets of de novo-designed chemical entities through self-organizing map consensus. Proc. Natl Acad. Sci. USA 111, 4067–4072 (2014).
Reutlinger, M. et al. Chemically advanced template search (CATS) for scaffold-hopping and prospective target prediction for ‘orphan’ molecules. Mol. Inform. 32, 133–138 (2013).
Molecular Operating Environment (MOE) (Chemical Computing Group, 2017).
O’Boyle, N. M. & Sayle, R. A. Comparing structural fingerprints using a literature-based similarity benchmark. J. Cheminform. 8, 1–14 (2016).
RDKit: Open-source Cheminformatics (RDKit); www.rdkit.org
Wolber, G. & Langer, T. LigandScout: 3D pharmacophores derived from protein-bound ligands and their use as virtual screening filters. J. Chem. Inf. Model. 45, 160–169 (2005).
Button, A., Merk, D., Hiss, J. A. & Schneider, G. Automated de novo molecular design by hybrid machine intelligence and rule-driven chemical synthesis. Code Ocean (2019); https://doi.org/10.24433/CO.6930970.v1
The authors thank L. Friedrich, C. Brunner, B. Huisman, X. Zhang and R. Byrne for stimulating discussions and technical support. D.M. was financially supported by an ETH Zurich Postdoctoral Fellowship (grant no. 16–2 FEL-07). This research was financially supported by the Swiss National Science Foundation (grant no. 205321_182176 to G.S.).
G.S. declares a potential conflict of interest in his role as life-science industry consultant and cofounder of inSili.com GmbH, Zurich. No other competing interests are declared.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Button, A., Merk, D., Hiss, J.A. et al. Automated de novo molecular design by hybrid machine intelligence and rule-driven chemical synthesis. Nat Mach Intell 1, 307–315 (2019). https://doi.org/10.1038/s42256-019-0067-7