Rethinking drug design in the artificial intelligence era


Artificial intelligence (AI) tools are increasingly being applied in drug discovery. While some protagonists point to the vast opportunities that such tools could offer, others remain sceptical, waiting for a clear impact to be shown in drug discovery projects. The reality is probably somewhere in between these extremes, yet it is clear that AI is posing new challenges not only for the scientists involved but also for the biopharma industry and its established processes for discovering and developing new medicines. This article presents the views of a diverse group of international experts on the ‘grand challenges’ in small-molecule drug discovery with AI and the approaches to address them.


Fig. 1: Integrating mind and machine in drug discovery.




Acknowledgements

This article is based on a workshop held in San Francisco in December 2018 and organized by the RETHINK think-and-do tank of ETH Zurich, at which a group of international experts from diverse scientific backgrounds and institutions met to rethink drug design with artificial intelligence. Figure 1 was created and contributed by Jack Burgess, who also acted as a visual scribe during the workshop. Jürg Brunnschweiler and the ETH Global team are thanked for excellent organizational support. This research was financially supported by the RETHINK initiative of ETH Zurich.

Author information

All authors contributed equally to the content and approved the final version of the manuscript.

Correspondence to Gisbert Schneider.

Ethics declarations

Competing interests

G.S. and P.S. declare a potential financial conflict of interest in their role as life science industry consultants and cofounders of GmbH, Zurich. The remaining authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links


The ATOM Consortium

The Innovative Medicines Initiative

The SALT Knowledge Share Consortium


Adaptive algorithm

An adaptive algorithm implements a problem-solving heuristic that changes its behaviour at the time it is run, based on information available and a reward mechanism.
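
As a minimal sketch of this idea, and not a method taken from the article, the following Python snippet runs a simple (1+1) evolution strategy whose step size grows after a rewarded (improving) move and shrinks after an unrewarded one. The function names, the toy objective and the adaptation factors are all assumptions chosen for illustration.

```python
import random


def adaptive_search(objective, x0, sigma=1.0, steps=200, seed=0):
    """Minimise `objective` with a (1+1) evolution strategy.

    The step size `sigma` is adapted while the search runs: it is enlarged
    after an improvement (the reward) and reduced after a failure.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x, fx = x0, objective(x0)
    for _ in range(steps):
        candidate = x + rng.gauss(0.0, sigma)
        f_candidate = objective(candidate)
        if f_candidate < fx:
            # Reward: keep the better point and search more boldly.
            x, fx = candidate, f_candidate
            sigma *= 1.5
        else:
            # No reward: stay put and search more cautiously.
            sigma *= 1.5 ** -0.25
    return x, fx, sigma


def toy_objective(x):
    """Hypothetical one-dimensional objective with its minimum at x = 3."""
    return (x - 3.0) ** 2


best_x, best_f, final_sigma = adaptive_search(toy_objective, x0=0.0)
```

Because only improving candidates are accepted, `best_f` can never be worse than the objective value at the starting point; how close it gets to the minimum depends on the seed and on the assumed adaptation factors.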

Artificial intelligence

(AI). The various definitions and interpretations of this term agree on three essential capabilities of an AI (most often referring to a computer or machine): (i) problem solving, (ii) learning from experience (memory and adaptation) and (iii) coping with new challenges (generalization).

Deep learning

A set of machine learning techniques that use neural networks (see below) with many layers to derive relationships from data. Such networks are called ‘deep neural networks’; each additional layer corresponds to a further function composition. Typically, the deeper the layer, the more abstract the semantics of its ‘feature space’ (that is, the implicit representation created by the neural network at that layer).


Hypothesis

A supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation, without any assumption of its truth. In the context of drug design, a molecular structure can serve as a hypothesis.

Machine learning

The science (and art) of programming computers so that they can learn from data; also a branch of artificial intelligence focused on a small number of tasks, nearly all of which amount to function approximation. The most common task is the construction and training of classifier models, followed by regression models. Both are forms of ‘supervised learning’, wherein pairs of ‘inputs’ and ‘labels’ are used to train a model that then predicts labels for cases where only the inputs are observed. Also common in machine learning is ‘unsupervised learning’, wherein only ‘inputs’ are used (for example, a list of molecules encoded as SMILES strings) and the model learns general properties of these inputs; it can then estimate how likely a new input is to belong to the same set of objects, or be used to generate ‘new’ objects of that kind. Tasks can also be mixed and matched, as in ‘semi-supervised learning’, which combines labelled and unlabelled inputs.
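
The distinction between supervised and unsupervised learning can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the two-number descriptors and their ‘active’/‘inactive’ labels are invented, the nearest-neighbour rule stands in for a trained classifier, and the character-frequency score is a deliberately crude stand-in for a generative model of SMILES strings.

```python
import math
from collections import Counter


def nearest_neighbour_label(train_inputs, train_labels, query):
    """Supervised: predict the label of the closest labelled training input."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    closest = min(range(len(train_inputs)),
                  key=lambda i: squared_distance(train_inputs[i], query))
    return train_labels[closest]


def fit_character_model(smiles_list):
    """Unsupervised: learn character frequencies from unlabelled SMILES."""
    counts = Counter(ch for smiles in smiles_list for ch in smiles)
    total = sum(counts.values())
    slots = len(counts) + 1  # one extra smoothing slot for unseen characters

    def mean_log_score(smiles):
        # Add-one smoothed log frequency, averaged over the characters.
        return sum(math.log((counts[ch] + 1) / (total + slots))
                   for ch in smiles) / len(smiles)

    return mean_log_score


# Supervised example: invented descriptors paired with invented labels.
inputs = [(0.2, 0.1), (0.3, 0.2), (0.8, 0.9), (0.9, 0.7)]
labels = ["inactive", "inactive", "active", "active"]
predicted = nearest_neighbour_label(inputs, labels, (0.8, 0.85))

# Unsupervised example: inputs only, no labels.
score = fit_character_model(["CCO", "CCN", "CCC", "CC(=O)O"])
familiar = score("CCCO")    # built from characters seen in the training set
unfamiliar = score("BrBr")  # built from characters never seen in training
```

The supervised function needs the labels to make a prediction, whereas the unsupervised model sees only the strings and can merely say that `"CCCO"` resembles its training set more than `"BrBr"` does.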

Natural language processing

(NLP). NLP is concerned with the interactions between computers and human (natural) languages, in particular how to process and analyse large amounts of natural language data, for example, scientific literature. Deep statistical machine learning models achieve state-of-the-art results in many natural language tasks, for example, in language modelling and parsing. NLP can also be used for chemical language analysis and de novo design.

Neural networks

A particular type of function approximator in which functions that predict discrete classes (classifiers) or real values (regression models) do so by composing a series of (typically nonlinear) functions, each one converting the previous layer’s outputs into a new ‘space’. These models have been around for decades but came to prominence in the 2010s, when access to large datasets, the ability to train ‘deep’ models (see Deep learning) and more powerful computers together enabled them to break benchmarks in computational audio and vision tasks.
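
The composition of layers can be made concrete with a very small Python sketch. The weights below are set by hand rather than learned, and the function names are assumptions for illustration; the example only shows how one nonlinear hidden layer followed by a linear output layer can represent the exclusive-or function, which no single linear function of the two inputs can.

```python
def relu(values):
    """Nonlinear activation: negative entries are set to zero."""
    return [max(0.0, v) for v in values]


def dense(weights, biases, inputs):
    """One linear layer: a weighted sum of the inputs plus a bias per output."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(inputs):
    """Compose two layers; the hidden layer maps the inputs to a new 'space'."""
    hidden = relu(dense([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], inputs))
    return dense([[1.0, -2.0]], [0.0], hidden)[0]


# Evaluate the hand-built network on all four binary input pairs.
xor_table = {(a, b): forward([float(a), float(b)])
             for a in (0, 1) for b in (0, 1)}
```

In a trained network the weights would be fitted to data instead of being chosen by hand, and deep networks stack many such layers.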

Research culture

A community sharing certain practices or using a common method or exemplar, that is, speaking a common language (including formalisms and algorithms) or sharing typical instances, illustrations or exemplifications (including molecular structures).


Cite this article

Schneider, P., Walters, W.P., Plowright, A.T. et al. Rethinking drug design in the artificial intelligence era. Nat Rev Drug Discov (2019).
