Controlling an organic synthesis robot with machine learning to search for new reactivity

An Author Correction to this article was published on 26 June 2019

Matters Arising to this article was published on 26 June 2019

A Publisher Correction to this article was published on 24 July 2018

This article has been updated

Abstract

The discovery of chemical reactions is an inherently unpredictable and time-consuming process [1]. An attractive alternative is to predict reactivity, although relevant approaches, such as computer-aided reaction design, are still in their infancy [2]. Reaction prediction based on high-level quantum chemical methods is complex [3], even for simple molecules. Although machine learning is powerful for data analysis [4,5], its applications in chemistry are still being developed [6]. Inspired by strategies based on chemists’ intuition [7], we propose that a reaction system controlled by a machine learning algorithm may be able to explore the space of chemical reactions quickly, especially if trained by an expert [8]. Here we present an organic synthesis robot that can perform chemical reactions and analysis faster than they can be performed manually, as well as predict the reactivity of possible reagent combinations after conducting a small number of experiments, thus effectively navigating chemical reaction space. By using machine learning for decision making, enabled by binary encoding of the chemical inputs, the reactions can be assessed in real time using nuclear magnetic resonance and infrared spectroscopy. The machine learning system was able to predict the reactivity of about 1,000 reaction combinations with accuracy greater than 80 per cent after considering the outcomes of slightly over 10 per cent of the dataset. This approach was also used to calculate the reactivity of published datasets. Further, by using real-time data from our robot, these predictions were followed up manually by a chemist, leading to the discovery of four reactions.
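The binary encoding of chemical inputs mentioned in the abstract can be illustrated with a minimal sketch. It assumes 18 starting materials combined two or three at a time, which is one way to arrive at the 969 reaction mixtures reported in the supplementary tables; the reagent numbering, the function names and the exact vector layout are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Assumption: 18 starting materials, labelled 1..18 for illustration only.
N_REAGENTS = 18


def enumerate_mixtures(n_reagents=N_REAGENTS, sizes=(2, 3)):
    """Return every unordered reagent combination of the given sizes."""
    reagents = range(1, n_reagents + 1)
    return [combo for k in sizes for combo in combinations(reagents, k)]


def encode(combo, n_reagents=N_REAGENTS):
    """Binary vector: 1 where a reagent is present in the mixture, else 0."""
    present = set(combo)
    return [1 if r in present else 0 for r in range(1, n_reagents + 1)]


mixtures = enumerate_mixtures()
print(len(mixtures))         # 153 two-component + 816 three-component = 969
print(encode((2, 5, 17)))    # 18-element vector with ones at positions 2, 5 and 17
```

Each mixture becomes a fixed-length vector, so any standard classifier can consume it without further chemical descriptors.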

Fig. 1: Automatic reaction detection with machine learning.
Fig. 2: Overview of the artificial intelligence algorithm used for the exploration of chemical space with the liquid-handling robot.
Fig. 3: Simulations exploring the chemical space and predictive power of the model.
Fig. 4: Exploring the Suzuki–Miyaura reaction using machine learning.
Fig. 5: Reactivity discovered with the machine-learning-driven robot.

Change history

  • 26 June 2019

    Owing to the misidentification of compound 22 in the original Letter, changes have been made to Fig. 5, Extended Data Fig. 2 and the main text; see accompanying Amendment.

  • 24 July 2018

    The chemical structure formatting in Fig. 5 has been corrected online.

References

  1. Collins, K. D., Gensch, T. & Glorius, F. Contemporary screening approaches to reaction discovery and development. Nat. Chem. 6, 859–871 (2014).

  2. Warr, W. A. A short review of chemical reaction database systems, computer-aided synthesis design, reaction prediction and synthetic feasibility. Mol. Inform. 33, 469–476 (2014).

  3. Plata, R. E. & Singleton, D. A. A case study of the mechanism of alcohol-mediated Morita Baylis-Hillman reactions. The importance of experimental observations. J. Am. Chem. Soc. 137, 3811–3826 (2015).

  4. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  5. Jordan, M. I. & Mitchell, T. M. Machine learning: trends, perspectives, and prospects. Science 349, 255–260 (2015).

  6. Raccuglia, P. et al. Machine-learning-assisted materials discovery using failed experiments. Nature 533, 73–76 (2016).

  7. Graulich, N., Hopf, H. & Schreiner, P. R. Heuristic thinking makes a chemist smart. Chem. Soc. Rev. 39, 1503–1512 (2010).

  8. Gil, Y., Greaves, M., Hendler, J. & Hirsh, H. Amplify scientific discovery with artificial intelligence. Science 346, 171–172 (2014).

  9. Trobe, M. & Burke, M. D. The molecular industrial revolution: automated synthesis of small molecules. Angew. Chem. Int. Ed. 57, 4192–4214 (2018).

  10. Ley, S. V., Fitzpatrick, D. E., Ingham, R. J. & Myers, R. M. Organic synthesis: march of the machines. Angew. Chem. Int. Ed. 54, 3449–3464 (2015).

  11. Sans, V. & Cronin, L. Towards dial-a-molecule by integrating continuous flow, analytics and self-optimisation. Chem. Soc. Rev. 45, 2032–2043 (2016).

  12. Houben, C. & Lapkin, A. A. Automatic discovery and optimization of chemical processes. Curr. Opin. Chem. Eng. 9, 1–7 (2015).

  13. Sans, V., Porwol, L., Dragone, V. & Cronin, L. A self optimizing synthetic organic reactor system using real-time in-line NMR spectroscopy. Chem. Sci. 6, 1258–1264 (2015).

  14. Dragone, V., Sans, V., Henson, A. B., Granda, J. M. & Cronin, L. An autonomous organic reaction search engine for chemical reactivity. Nat. Commun. 8, 15733 (2017).

  15. Cortes, C. & Vapnik, V. Support vector networks. Mach. Learn. 20, 273–297 (1995).

  16. Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268–276 (2018).

  17. Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. 35, 1798–1828 (2013).

  18. Coomans, D., Jonckheer, M., Massart, D. L., Broeckaert, I. & Blockx, P. Application of linear discriminant analysis in the diagnosis of thyroid diseases. Anal. Chim. Acta 103, 409–415 (1978).

  19. Perera, D. et al. A platform for automated nanomole-scale reaction screening and micromole-scale synthesis in flow. Science 359, 429–434 (2018).

  20. Ahneman, D. T., Estrada, J. G., Lin, S., Dreher, S. D. & Doyle, A. G. Predicting reaction performance in C-N cross-coupling using machine learning. Science 360, 186–190 (2018).

  21. Nielsen, M. K., Ahneman, D. T., Riera, O. & Doyle, A. G. Deoxyfluorination with sulfonyl fluorides: navigating reaction space with machine learning. J. Am. Chem. Soc. 140, 5004–5008 (2018).

  22. Bajusz, D., Racz, A. & Heberger, K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J. Cheminform. 7, 20 (2015).

  23. Palazzolo, A. M. E., Simons, C. L. W. & Burke, M. D. The natural productome. Proc. Natl Acad. Sci. 114, 5564–5566 (2017).

  24. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

Acknowledgements

We acknowledge financial support from the EPSRC (grant numbers EP/H024107/1, EP/I033459/1, EP/J00135X/1, EP/J015156/1, EP/K021966/1, EP/K023004/1, EP/K038885/1, EP/L015668/1 and EP/L023652/1) and the ERC (project 670467 SMART-POM). J.M.G. acknowledges financial support from the Polish Ministry of Science and Higher Education (grant number 1295/MOB/IV/2015/0). We thank A. Henson for help with the Tanimoto analysis.

Author information

Contributions

L.C. conceived the idea, developed the initial algorithms, designed the project and coordinated the efforts of the research team. J.M.G. developed the machine learning algorithms, devised the LDA, and built and programmed the chemical robot. J.M.G. conducted experiments, and isolated and characterized new compounds with input from L.D. and V.D. J.M.G. and L.C. co-wrote the paper with input from all authors.

Corresponding author

Correspondence to Leroy Cronin.

Ethics declarations

Competing interests

L.C. is the founder and director of DeepMatterGroup PLC and is listed as an inventor on a patent application filed by The University of Glasgow (GB 1810944.7).

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Reaction space explored.

The chemical inputs (1–18) used in the platform to search for new transformations and to evaluate the performance of the algorithm.

Extended Data Fig. 2 Suggested mechanisms for observed transformations and small library of compounds synthesized.

a, Suggested mechanism for the synthesis of compound 19. b, Small library of compounds synthesized. c, Suggested mechanism for the synthesis of compound 22. d, Suggested mechanism for the synthesis of compound 21.

Supplementary information

Supplementary Information

This file contains Supplementary Tables 1–6, Supplementary Figures 1–73, hardware specification, machine learning details, characterization of new compounds, structural assignments and copies of NMR spectra.

Supplementary Data

This zipped file contains the X-ray structure of compound 20.

Supplementary Data

This zipped file contains the X-ray structure of compound 21.

Supplementary Table

This table shows an exemplary run of the LDA algorithm exploring the chemical space. The first ninety experiments were chosen randomly and all subsequent experiments were chosen using the LDA classifier. The name column contains the identity of the reaction, composed from the names of the starting materials. The reactivity column contains the reactivity assignment from the SVM classifier for a given reaction mixture.
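The two-phase selection described for this table (random seeding, then model-guided choices) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a simple additive per-reagent hit rate stands in for the LDA classifier, `toy_oracle` is a hypothetical stand-in for the robot experiment plus SVM reactivity assignment, and the 18-reagent space with two- and three-component mixtures is assumed.

```python
import random
from itertools import combinations


def all_vectors(n_reagents=18, sizes=(2, 3)):
    """Binary vectors for every 2- and 3-component mixture (assumed space)."""
    vectors = []
    for k in sizes:
        for combo in combinations(range(n_reagents), k):
            vectors.append([1 if i in combo else 0 for i in range(n_reagents)])
    return vectors


def additive_scores(tested, n_reagents):
    """Stand-in for LDA: per-reagent fraction of reactive tested mixtures."""
    hits = [0] * n_reagents
    seen = [0] * n_reagents
    for vec, reactive in tested:
        for i, bit in enumerate(vec):
            if bit:
                seen[i] += 1
                hits[i] += reactive
    return [hits[i] / seen[i] if seen[i] else 0.0 for i in range(n_reagents)]


def explore(candidates, run_experiment, n_random=90, n_guided=10, seed=0):
    """Run n_random random experiments, then repeatedly test the
    highest-scoring untested mixture, refitting after each result."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    tested = [(vec, run_experiment(vec)) for vec in pool[:n_random]]
    untested = pool[n_random:]
    n_reagents = len(pool[0])
    for _ in range(min(n_guided, len(untested))):
        weights = additive_scores(tested, n_reagents)
        best = max(untested,
                   key=lambda v: sum(w * b for w, b in zip(weights, v)))
        untested.remove(best)
        tested.append((best, run_experiment(best)))
    return tested


def toy_oracle(vec):
    """Hypothetical ground truth: reactive only if reagents 0 and 1 are both present."""
    return int(vec[0] and vec[1])


history = explore(all_vectors(), toy_oracle)
print(len(history))  # 90 random + 10 guided = 100 experiments
```

Refitting after every result keeps the ranking of untested mixtures current, at the cost of one model update per experiment.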

Supplementary Table

This table contains the LDA scores for all 969 reactions formed from the chemical space shown in Extended Data Fig. 1. The LDA_reactivity column contains the scores from the LDA, and the reactivity column contains the reactivity assignment from the SVM classifier for a given reaction mixture.

About this article

Cite this article

Granda, J.M., Donina, L., Dragone, V. et al. Controlling an organic synthesis robot with machine learning to search for new reactivity. Nature 559, 377–381 (2018). https://doi.org/10.1038/s41586-018-0307-8
