Data availability
The chemical structures and labels used for training and validation of the supervised and unsupervised models, with the exception of 684 proprietary molecules, are available at https://github.com/learningmatter-mit/Deep-Drug-Coder (ref. 20).
Code availability
The code used in this paper is available at https://github.com/learningmatter-mit/Deep-Drug-Coder.
References
Schwalbe-Koda, D. & Gómez-Bombarelli, R. in Lecture Notes in Physics Vol. 968 (eds Schütt, K. T. et al.) 445–467 (Springer, 2020).
Kotsias, P.-C. et al. Direct steering of de novo molecular generation using descriptor conditional recurrent neural networks (cRNNs). Nat. Mach. Intell. 2, 254–265 (2020).
Morgan, H. L. The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. J. Chem. Doc. 5, 107–113 (1965).
Segler, M. H. S., Kogej, T., Tyrchan, C. & Waller, M. P. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 4, 120–131 (2018).
Arús-Pous, J. et al. Exploring the GDB-13 chemical space using deep generative models. J. Cheminform. 11, 20 (2019).
Popova, M., Isayev, O. & Tropsha, A. Deep reinforcement learning for de novo drug design. Sci. Adv. 4, eaap7885 (2018).
Gómez-Bombarelli, R. et al. Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach. Nat. Mater. 15, 1120–1127 (2016).
Hachmann, J. et al. Lead candidates for high-performance organic photovoltaics from high-throughput quantum chemistry – the Harvard Clean Energy Project. Energy Environ. Sci. 7, 698–704 (2014).
Kotsias, P. & Bjerrum, E. J. Deep-Drug-Coder v1.0.0 https://doi.org/10.5281/zenodo.3739063 (accessed 15 May 2020).
Gueymard, C. A. The sun’s total and spectral irradiance for solar energy applications and solar radiation models. Sol. Energy 76, 423–453 (2004).
Jensen, J. H. A graph-based genetic algorithm and generative model/Monte Carlo tree search for the exploration of chemical space. Chem. Sci. 10, 3567–3572 (2019).
Jin, W., Barzilay, R. & Jaakkola, T. Domain extrapolation via regret minimization. Preprint at https://arxiv.org/abs/2006.03908 (2020).
Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).
Krenn, M., Häse, F., Nigam, A., Friederich, P. & Aspuru-Guzik, A. Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation. Mach. Learn. Sci. Technol. 1, 045024 (2020).
Kusner, M. J., Paige, B. & Hernández-Lobato, J. M. Grammar variational autoencoder. In Proc. 34th International Conference on Machine Learning (eds Precup, D. & Teh, Y. W.) 1945–1954 (2017).
Dai, H., Tian, Y., Dai, B., Skiena, S. & Song, L. Syntax-directed variational autoencoder for molecule generation. In Proc. International Conference on Learning Representations (ICLR, 2018).
Joulin, A. & Mikolov, T. Inferring algorithmic patterns with stack-augmented recurrent nets. In Advances in Neural Information Processing Systems (2015).
Moniz, J. R. A. & Krueger, D. Nested LSTMs. In Proc. Asian Conference on Machine Learning (PMLR, 2017).
Maziarka, Ł. et al. Molecule attention transformer. Preprint at https://arxiv.org/abs/2002.08264 (2020).
Mohapatra, S., Yang, T. & Gómez-Bombarelli, R. OPM-cRNN v0.1-OPM https://doi.org/10.5281/zenodo.4073289 (2020).
Landrum, G. RDKit: Open-source cheminformatics v2018.09.1 https://www.rdkit.org/docs/index.html (2006).
Acknowledgements
We acknowledge Sumitomo Chemical for providing financial support for this work.
Author information
Contributions
R.G.-B. supervised the research, and planned the project with contributions from S.M. S.M. trained and analysed the machine learning models. T.Y. ran the DFT calculations with contributions from R.G.-B. All authors contributed to the writing of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Machine Intelligence thanks Olexandr Isayev, Connor Coley and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Discussion Sections 1–8, Figs. 1–3 and Tables 1–5.
About this article
Cite this article
Mohapatra, S., Yang, T. & Gómez-Bombarelli, R. Reusability report: Designing organic photoelectronic molecules with descriptor conditional recurrent neural networks. Nat Mach Intell 2, 749–752 (2020). https://doi.org/10.1038/s42256-020-00268-w
DOI: https://doi.org/10.1038/s42256-020-00268-w